Standard POSIX shell scripting — A quick cheatsheet by examples

Because Ubuntu changed its default system shell from bash to dash, many people run into syntax errors when their scripts use bash-specific syntax. This article is a quick-reference cheatsheet for standard POSIX shell syntax, and it also shows how to make a bash script work in dash.

Why standard POSIX shell syntax

As the Ubuntu wiki page Dash as bin/sh explains, in Ubuntu 6.10 the default system shell, /bin/sh, was changed to dash (the Debian Almquist Shell), mainly for efficiency; previously it had been bash (the GNU Bourne-Again Shell).

Therefore scripts that used to run when /bin/sh was bash may no longer run after this change. If you write scripts with bash syntax, you will probably run into this kind of compatibility issue.

A good philosophy is to stick to standard POSIX shell syntax so that your scripts work on a variety of UNIX systems.

Check which shell you are using

$ echo $SHELL
/bin/bash

Positional parameters

Positional parameters are initially assigned when the shell is invoked, temporarily replaced when a shell function is invoked (see Function Definition Command), and can be reassigned with the set special built-in command.

# $?
# The exit status of the most recent pipeline.

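# $@
# All positional parameters. Inside double quotes, "$@" expands to one word per parameter.

# $*
# All positional parameters. Inside double quotes, "$*" expands to a single word
# joined by the first character of IFS.

# Differences between $@ and $*
# Loop over "$@"
for a in "$@"; do
    echo "Arg: $a" # Runs once per argument
done

# Loop over "$*"
for a in "$*"; do
    echo "Args: $a" # Runs only once
done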

# $#
# The count of positional parameters.
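
As a small sketch of reassigning the positional parameters with the set special built-in and consuming them with shift (the sample values and the variable names count, first, n_at and n_star are only illustrative):

```shell
# Reassign the positional parameters; "--" guards against values starting with "-".
set -- apple "banana split" cherry
count=$#     # number of positional parameters
first=$1

# "$@" yields one word per parameter.
n_at=0
for a in "$@"; do n_at=$((n_at + 1)); done

# "$*" yields a single word containing all parameters.
n_star=0
for a in "$*"; do n_star=$((n_star + 1)); done

shift        # drop $1; the old $2 becomes the new $1
echo "count=$count first=$first n_at=$n_at n_star=$n_star now_first=$1 left=$#"
```

Quoting matters here: without the double quotes, both $@ and $* would be split again on IFS, and "banana split" would become two words.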

Variables

Quoted variables

# Single-quotes preserve the literal value of each character.
str1='$'
echo $str1 # $

foo=10
x=foo
y='$'$x
echo $y # $foo

# Double-quotes preserve the literal value of all characters with the exception
# of the dollar-sign ($), backquote (`) and backslash (\).
str2="today is $(date)"
echo $str2 # today is Wed Oct 28 13:25:17 UTC 2020

str3=abc
echo $str3 # abc

# Characters that must be quoted to represent themselves:
# |  &  ;  <  >  (  )  $  `  \  "  '  <space>  <tab>  <newline>
# Characters that may need to be quoted to represent themselves in certain circumstances:
# *   ?   [   #   ~   =   %
str4="*?[~=%"

Empty string, null

# Set value with empty string explicitly
str4=''
# Set value with empty string
str5=

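# POSIX shell has no null literal; "null" simply means the empty string.
# This assigns the four-character string "null":
b=null
echo $b # null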

Unset variables

x=123
y=456 z=789

# Once a variable is set, it can only be unset by using the unset special built-in command.
unset x
unset y z
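
An unset variable and a variable set to the empty string both expand to nothing, so they are easy to confuse. The following sketch shows one way to tell them apart with the ${var+word} expansion; the variable names are made up for the example:

```shell
empty=''
unset missing

# ${var+x} expands to "x" only when var is set, even if it is empty.
if [ "${empty+x}" = x ]; then empty_state=set; else empty_state=unset; fi
if [ "${missing+x}" = x ]; then missing_state=set; else missing_state=unset; fi

# test -z cannot tell them apart: both expand to a zero-length string.
if [ -z "$empty" ] && [ -z "$missing" ]; then
    both_zero_length=yes
else
    both_zero_length=no
fi

echo "empty is $empty_state, missing is $missing_state, both zero length: $both_zero_length"
```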

Shell environment variables

  • HOME

    The pathname of the user’s home directory.

  • PWD

    Set by the shell and by the cd utility.

  • IFS

    IFS stands for “Input Field Separators”. A string treated as a list of characters that is used for field splitting and to split lines into fields with the read command. If IFS is not set, the shell shall behave as if the value of IFS is <space>, <tab>, and <newline>; see Field Splitting. Implementations may ignore the value of IFS in the environment at the time the shell is invoked, treating IFS as if it were not set.

  • PATH

    The colon-separated list of directories that the shell searches for commands.

# Update environment variables with export
export HOME=/

# or
HOME=/
PATH=/
export HOME PATH
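
To illustrate how IFS drives the read command, here is a minimal sketch that splits a colon-separated record into fields. The record and the variable names user, pass, uid and home are invented for the example:

```shell
record='alice:x:1000:/home/alice'

# Prefixing the assignment to read changes IFS for that one command only.
IFS=: read -r user pass uid home <<EOF
$record
EOF

echo "user=$user uid=$uid home=$home"
```

Setting IFS only for the single read command avoids having to save and restore the shell-wide value.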

Expansions

Parameter expansions

s=abc
echo ${s}abc # abcabc
echo $sabc   # Nothing is output because there is no variable named sabc

# Use default values
echo ${x:-abc} # abc

# Assign default values
echo ${x}      # Nothing is output
echo ${x:=abc} # abc
echo ${x}      # abc

# String length
echo ${#s} # 3

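# Error if null or unset (a non-interactive shell exits at this point;
# the exact wording of the message varies between shells)
unset x
echo ${x:?}  # x: parameter null or not set
echo ${x:?unset parameter} # x: unset parameter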
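
POSIX also defines prefix and suffix removal, which often replaces bash-only substitutions when taking paths apart. A short sketch, with a made-up path and variable names:

```shell
path=/home/user/archive.tar.gz

file=${path##*/}   # remove the longest prefix matching */  -> archive.tar.gz
dir=${path%/*}     # remove the shortest suffix matching /* -> /home/user
base=${file%%.*}   # remove the longest suffix matching .*  -> archive
ext=${file#*.}     # remove the shortest prefix matching *. -> tar.gz

echo "$dir | $file | $base | $ext"
```

A single # or % removes the shortest match, while the doubled form removes the longest.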

Arithmetic expansions

i=1
i=$(($i+1)) # Now i=2

For arithmetic, besides $((...)) you can also use the expr command (see the utilities part at the bottom).
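
The two approaches can be compared side by side in a small sketch; the values of a and b are arbitrary:

```shell
a=7
b=3

sum=$((a + b))     # 10
quot=$((a / b))    # 2, integer division truncates
rem=$((a % b))     # 1

# expr is an external utility: operands are separate arguments,
# and * must be escaped so the shell does not expand it as a glob.
prod=$(expr "$a" \* "$b")

echo "sum=$sum quot=$quot rem=$rem prod=$prod"
```

Arithmetic expansion is evaluated by the shell itself, so it avoids starting a new process for every calculation.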

Commands

Redirection

$ cat <<EOF > file_name
line1
line2
EOF
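
Whether the body of a here-document is expanded depends on how the delimiter is written. The sketch below captures both variants in variables (the names are illustrative) so the difference is easy to see:

```shell
name=world

# Unquoted delimiter: $name and $(...) are expanded inside the body.
expanded=$(cat <<EOF
hello $name
EOF
)

# Quoted delimiter: the body is taken literally.
literal=$(cat <<'EOF'
hello $name
EOF
)

echo "$expanded"
echo "$literal"
```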

Pipelines

A sequence of commands can be connected by the control operator |; the shell shall connect the standard output of each command to the standard input of the next command as if by creating a pipe.

# Get count of files
ls -a | wc -l

# Page through a long listing one screen at a time
ls -alh * | less

Lists (multiple commands)

And lists (&&)

# And list
# The second command is executed only if the first command exits with zero
git checkout topic && git status

Or list (||)

# Or list
# If the first command exits with non-zero, the second command shall be executed,
# otherwise only the first command is executed
cd .. || ls -a
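
When && and || are mixed in one list, they have equal precedence and are evaluated from left to right. The following sketch, using only true, false and echo, shows how such a chain is grouped:

```shell
# Parsed as (false && echo first) || echo second
r1=$(false && echo first || echo second)

# Parsed as (true || echo first) && echo second
r2=$(true || echo first && echo second)

echo "r1=$r1 r2=$r2"
```

Both variables end up holding "second", which is why a chain like a && b || c is not a full replacement for if/else: c also runs when b fails.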

Sequential lists (;)

# Sequential list
# Multiple commands executed sequentially in one line.
cd ..; ls -a

Asynchronous lists (&)

# Asynchronous list
# If a command is terminated by the control operator <ampersand> ( '&' ), the shell shall execute the command asynchronously in a subshell. This means that the shell shall not wait for the command to finish before executing the next command.
# Format: command1 & [command2 & ... ]

mv ./build/ ../ &
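
A background command is usually paired with the wait utility so the script can collect its exit status later. This is a minimal sketch in which a subshell stands in for real work:

```shell
# Start a job in the background; the subshell here is only a placeholder.
( sleep 1; exit 3 ) &
pid=$!            # process ID of the most recent background command

# ... the script can do other things here ...

status=0
wait "$pid" || status=$?   # wait returns the exit status of the job
echo "job $pid exited with status $status"
```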

Command substitution

Command substitution allows the output of a command to be substituted in place of the command name itself.

echo "Hello $(date -u +'%Y-%m-%d')" # Hello 2020-10-28
# or
echo "Hello `date -u +'%Y-%m-%d'`"  # Hello 2020-10-28

# Nested command
echo "next year is $(expr $(date +%Y) + 1)"
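
One detail worth knowing is that command substitution removes all trailing newlines from the captured output. The sketch below shows the effect and a common workaround with a sentinel character; the variable names are illustrative:

```shell
raw=$(printf 'line\n\n\n')
length=${#raw}          # 4: only "line" is left

# Append a sentinel character and strip it afterwards to keep the newlines.
kept=$(printf 'line\n\n\n'; printf x)
kept=${kept%x}
kept_length=${#kept}    # 7: "line" plus three newlines

echo "length=$length kept_length=$kept_length"
```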

Conditions

If statement

# If there are fewer than 2 parameters
if [ $# -lt 2 ]; then exit 1; fi
# Or use test explicitly instead of []
if test $# -lt 2; then exit 1; fi

# Put variable inside double-quotes to prevent expanding its value into different words
if [ "$commit_type" = "feat" ]
then
    echo "New feature added"
elif [ "$commit_type" = "fix" ]
then
    echo "A bug is fixed"
else
    echo "Something else"
fi


# Multiple conditions
if [ "$refname" = "refs/heads/master" ] || [ "$refname" = "refs/heads/main" ]; then
    echo "This is the main branch"
fi

Case statement

case $file in
    *.txt)
        type="txt"
        echo "This is a txt file"
        ;;
    *.jpg|*.png|*.gif)
        type="image"
        echo "This is an image"
        ;;
    *)   # Match anything
        type="other"
        echo "This is other type"
        ;;
esac
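
Because case matches against shell patterns, it is also a portable substitute for bash's [[ string == pattern ]] test. A sketch with a hypothetical helper named starts_with:

```shell
# starts_with STRING PREFIX: succeed if STRING begins with PREFIX.
starts_with() {
    case $1 in
        "$2"*) return 0 ;;   # quoted prefix is matched literally
        *)     return 1 ;;
    esac
}

if starts_with "refs/heads/main" "refs/heads/"; then
    kind=branch
else
    kind=other
fi
echo "$kind"
```

Quoting "$2" inside the pattern keeps any glob characters in the prefix from being interpreted.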

Loops

For statement

for i in 1 2
do
    if test -d "$i"
    then break
    fi
done

While statement

i=1
while [ $i -lt 3 ]
do
    echo "i=$i"
    i=$(($i+1))
done

# output:
# i=1
# i=2

Until statement

i=1
until [ $i -gt 3 ]
do
    echo "i=$i"
    i=$(($i+1))
done

# output:
# i=1
# i=2
# i=3

# Infinite loop
until false
do
    ...
done

Loop over the output of a command

# The output of a shell command is split on $IFS; the default value is space, tab, newline

# Loop over results of "ls"
ls | while read -r f ; do echo ${#f}; done
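
In dash, and in bash by default, each part of a pipeline runs in a subshell, so variables changed inside cmd | while read ... are lost when the loop ends. Feeding the loop through a redirection keeps it in the current shell. A sketch using a here-document as input (the sample lines are made up):

```shell
count=0
longest=''

# IFS= keeps leading and trailing blanks; -r keeps backslashes literal.
while IFS= read -r line; do
    count=$((count + 1))
    if [ "${#line}" -gt "${#longest}" ]; then longest=$line; fi
done <<EOF
short
  a longer line with leading blanks
mid-size
EOF

echo "$count lines, longest: '$longest'"
```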

Functions

Definition

total_files () {
    find "$1" -type f | wc -l
}
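
A function receives its own positional parameters and reports success or failure through its exit status. The helpers max and is_even below are invented for this sketch:

```shell
# max A B: print the larger of two integers.
max() {
    if [ "$1" -gt "$2" ]; then echo "$1"; else echo "$2"; fi
}

# is_even N: the exit status of the last command becomes the function's status.
is_even() {
    [ $(($1 % 2)) -eq 0 ]
}

larger=$(max 3 8)
if is_even "$larger"; then parity=even; else parity=odd; fi
echo "$larger is $parity"
```

Output is returned by printing it and capturing it with command substitution, while the exit status is used directly in conditions.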

Redirection

Redirect input, output

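# Redirect input format: [n]<word
# Redirect output format: [n]>word
# n represents the file descriptor number. If the number is omitted,
#    input redirection refers to standard input (file descriptor 0) and
#    output redirection refers to standard output (file descriptor 1).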

Appending redirected output

# Appending redirected output format: [n]>>word

# Appending "hello" to a.txt
echo "hello" >> a.txt
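
The optional descriptor number is what makes it possible to handle standard error separately. In this sketch, report is a made-up function that writes one line to each stream:

```shell
report() {
    echo "to stdout"
    echo "to stderr" >&2      # send this line to standard error
}

only_out=$(report 2>/dev/null)       # discard standard error
both=$(report 2>&1)                  # merge standard error into standard output
only_err=$(report 2>&1 >/dev/null)   # order matters: redirections apply left to right

echo "only_out: $only_out"
echo "only_err: $only_err"
```

In the last line, descriptor 2 is first pointed at the captured output, and only then is descriptor 1 sent to /dev/null, so just the error line is captured.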

Here-document (<<)

| >  | Relational | Greater Than |
| >= | Relational | Greater Than or Equal to |
| <  | Relational | Less Than |

Note:

The XSI extensions specifying the -a and -o binary primaries and the '(' and ')' operators have been marked obsolescent.

test "$1" -a "$2"

should be written as:

test "$1" && test "$2"

Examples

# Exit if there are not two or three arguments (two variations):
if [ $# -ne 2 ] && [ $# -ne 3 ]; then exit 1; fi
if [ $# -lt 2 ] || [ $# -gt 3 ]; then exit 1; fi

# Perform a mkdir if a directory does not exist:
test ! -d tempdir && mkdir tempdir

# Wait for a file to become non-readable:
while test -r thefile
do
    sleep 30
done
echo '"thefile" is no longer readable'

# Perform a command if the argument is one of three strings (two variations):
if [ "$1" = "pear" ] || [ "$1" = "grape" ] || [ "$1" = "apple" ]
then
    command
fi


case "$1" in
    pear|grape|apple) command ;;
esac

Best practice

# The two commands:

test "$1"
test ! "$1"

# could not be used reliably on some historical systems. Unexpected results would occur if such a string expression were used and $1 expanded to '!', '(', or a known unary primary. Better constructs are:

test -n "$1"
test -z "$1"

true

Return true value.

The true utility shall return with exit code zero.

while true
do
    <command>
done

false

Return false value.

alias

Define or display aliases.

# Format:  alias [alias-name[=string] ...]

# Change ls to give a columnated, more annotated output:
alias ls="ls -CF"

# Create a simple "redo" command to repeat previous entries in the command history file:
alias r='fc -s'

# Use 1K units for du:
alias du=du\ -k

# Set up nohup so that it can deal with an argument that is itself an alias name:
alias nohup="nohup "

Troubleshooting

Common bashism errors

A bashism is a shell construct specific to the Bash interpreter. Scripts that use bashisms (bash extensions) fail with syntax errors when they are run by another shell such as dash.

There are several general solutions for these errors; they come from Dash as bin/sh.

  • Solution 1. Change #! /bin/sh to #! /bin/bash to use bash as the interpreter

    If only a few files produce such errors, you can adopt this solution; it is the simplest one.

  • Solution 2. Stick to standard POSIX shell syntax to avoid these issues

    Sticking to standard POSIX shell syntax avoids bashism errors altogether. It makes your scripts more portable across a variety of Unix systems and brings other benefits, such as better maintainability.

  • Solution 3. Change the default system shell to bash

    If such problems are widespread, you can instruct the package management system to stop installing dash as /bin/sh:

    sudo dpkg-reconfigure dash
    

    Be careful with this solution: it is a more invasive change and may cause other problems.

bashism error examples

  • function : not found

  • [[ : not found

  • Bad substitution
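
Each of these errors has a portable rewrite. The sketch below pairs a typical bash construct (shown in the comments) with a POSIX alternative; the function and variable names are made up, and tr and sed are just one possible replacement for the bash-only substitutions:

```shell
# bash: function greet { ... }        -> "function: not found" in dash
greet() {
    echo "hello $1"
}

# bash: [[ $name == a* ]]             -> "[[: not found" in dash
name=alice
case $name in
    a*) starts_with_a=yes ;;
    *)  starts_with_a=no ;;
esac

# bash: ${name^^} and ${name/a/b}     -> "Bad substitution" in dash
upper=$(printf '%s\n' "$name" | tr '[:lower:]' '[:upper:]')
replaced=$(printf '%s\n' "$name" | sed 's/a/b/')

greet "$upper"
```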

Resources

About bashism, bash vs dash

  • Dash as bin/sh. https://wiki.ubuntu.com/DashAsBinSh

    It lists some of the more common bash extensions that are not supported by dash. This list is not complete, but we believe that it covers most of the common extensions found in the wild. You can use dash -n to check that a script will run under dash without actually running it; this is not a perfect test (particularly not if eval is used), but is good enough for most purposes. The checkbashisms command in the devscripts package may also be helpful (it will output a warning "possible bashism in <file>" for every occurrence).

    ```shell
    [
    [[
    ((
    $((n++)), $((--n))
    {
    $'…'
    $"…"
    ${…}
    ${parm/?/pat[/str]}
    ${foo:3[:1]}
    $LINENO
    $PIPESTATUS
    $RANDOM
    function
    echo options
    let
    local
    <<
    ```

A sub-session or sub-process is a little different from the sub-shell we have seen before. Any code inside parentheses () runs in a sub-shell. A sub-shell is also a separate process (with an exception in KSh, https://docstore.mik.ua/orelly/unix3/korn/ch08_06.htm#FOOTNOTE-125) started by the main shell process, but in contrast with a sub-session, it is an identical copy of the main shell process. This article explains the difference between sub-shell and sub-process: https://docstore.mik.ua/orelly/unix3/korn/ch08_06.htm

It is possible to pass an environment variable to a process directly from the command that starts it, using the syntax below. This way, our environment variables stay portable and we avoid writing unnecessary boilerplate.

```shell
# main.sh
MY_IP='192.168.1.7' bash ./child.sh

# ~$ bash main.sh
# ⥤ MY_IP inside child.sh : 192.168.1.7
```
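
The same effect can be checked without a separate child script. This sketch assumes sh is available on the PATH and uses sh -c as a stand-in for child.sh:

```shell
unset MY_IP

# The assignment applies only to the environment of this one command.
child_sees=$(MY_IP='192.168.1.7' sh -c 'echo "$MY_IP"')

# Back in the calling shell the variable is still unset.
parent_sees=${MY_IP-unset}

echo "child: $child_sees, parent: $parent_sees"
```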

If you need to set an environment variable for all the processes started by the current terminal session, you can run the export command directly. A newly opened terminal, however, won't have that environment variable. To make environment variables available across all terminal sessions, export them from .bash_profile or another startup script.

Code block

If we need to execute some code as a block, we can put it inside {} curly braces. A code block ({} block) executes the code in the same shell, and hence in the same shell process, while a () block executes the code inside it in a sub-shell.

If we want to write a code block on a single line, we need to terminate every statement, including the last one, with a ; character (unlike a sub-shell).

```shell
MAIN_VAR="main"
{ sleep 1; echo "code-block: $MAIN_VAR"; }
( sleep 1; echo "sub-shell: $MAIN_VAR" )

# Output:
# code-block: main
# sub-shell: main
```

In this example, the code block and the sub-shell both have access to MAIN_VAR: the code block runs in the same environment as the main shell, while the sub-shell may run in a different process but starts as an identical copy of the main process, including its variables.

The difference comes where we try to set or update a variable in the main process from a sub-shell. Here is a demonstration of that.

```shell
VAR_CODE_BLOCK="INIT"
VAR_SUB_SHELL="INIT"
{ VAR_CODE_BLOCK="MODIFIED"; echo "code-block: $VAR_CODE_BLOCK"; }
( VAR_SUB_SHELL="MODIFIED"; echo "sub-shell: $VAR_SUB_SHELL" )
echo "main/code-block: $VAR_CODE_BLOCK"
echo "main/sub-shell: $VAR_SUB_SHELL"

# Output:
# code-block: MODIFIED
# sub-shell: MODIFIED
# main/code-block: MODIFIED
# main/sub-shell: INIT
```

We can see that when we try to update or set a variable in the sub-shell, it will set a new variable in its own environment.

A good use case for a code block is to pipe (|) or redirect (>) the output of several statements as a whole.

```shell
{ echo -n "Hello"; sleep 1; echo " World!"; } > hello.txt

# ~$ bash main.sh && cat hello.txt

# Output:
# Hello World!
```
