Undesired behaviour - an error, can manifest in various ways. In most high-level programming languages errors are represented as dedicated objects, allowing to handle them without interrupting program execution. They usually require developer to explicitly mark bad program flows by throwing or raising problem description. Bash don't provide any special syntax for this. Instead it allows to hook custom actions when non-zero exit code from command occurs. It's not universally treated as an error, since some applications use more than one exit code to signal success. Yet Bash contains some builtin commands and options named after error for handling such cases.
So called unofficial strict mode consist of the following options:
set -o errexit
set -o nounset
set -o pipefail
and is usually shortened to set -euo pipefail
. It's probably the easiest for
starter, but it doesn't provide much information about error when it actually
happens.
NOTE: Even with above flags Bash will ignore some problems like non-zero exit code from subshell invoked in builtin command invocation. It's the best to always check scripts additionally with tools like Shellcheck and review Bash Pitfalls wiki.
When everything else fails, script can be always launched with bash -x
or set -o xtrace
, but it can be overkill for bigger solutions and might reveal
confidential data from variables. Another approach requires to define trap
functions. They act as hooks and run only on POSIX signals or when some specific
condition is met. See example with ERR
trap signal below:
#!/usr/bin/env bash
set -euo pipefail
on_error() {
declare exit_code=$?
echo "Something went wrong!" >&2
exit "$exit_code"
}
trap on_error ERR
ls /nonexistent
Upon execution it prints:
ls: cannot access '/nonexistent': No such file or directory
Something went wrong!
ls
might not be the best example here, since it describes what went wrong by
itself, but imagine checking for TCP port availability with OpenBSD netcat
:
# ...
nc -zw 3 127.0.0.1 22
We wouldn't know that there was error, unless we printed last exit code echo $?
. With ERR
trap useful message appears without any further interaction and
program stops, if we also use errexit
.
There's edge case in above solution when we use functions. They don't inherit
ERR
trap, unless we use errtrace
option. Now strict mode looks like this
set -eEuo pipefail
.
It's usually enough for small scripts, but we have to guess why program
stopped, if error can appear in multiple places. BASH_COMMAND
can be used to
show exactly which command caused script to fail:
#!/usr/bin/env bash
set -eEuo pipefail
on_error() {
declare \
cmd=$BASH_COMMAND \
exit_code=$?
echo "Failing with exit code ${exit_code} at ${*} in command: ${cmd}" >&2
exit "$exit_code"
}
trap on_error ERR
function_a() {
nc -zw 3 127.0.0.1 22
}
function_b() {
function_a
}
function_b
It save us some time when error occurs without xtrace
:
Failing with exit code 1 at in command: nc -zw 3 127.0.0.1 22
Additionally Bash provides some diagnostic variables defined as arrays, which can be used to locate where error happened:
BASH_SOURCE
- Source file of currently executed function. It's not always
available, since scripts can be sourced from arbitrary file descriptors.FUNCNAME
- Currently executed function name.BASH_LINENO
- Line number from source file.They can be iterated to generate a call trace:
#!/usr/bin/env bash
set -eEuo pipefail
on_error() {
declare \
cmd=$BASH_COMMAND \
exit_code=$? \
i=0 \
end="${#FUNCNAME[@]}" \
next
end=$((end - 1))
echo "Failing with exit code ${exit_code} in command: ${cmd}" >&2
while [ "$i" != "$end" ]; do
next=$((i + 1))
echo " ${BASH_SOURCE["$next"]:-}:${BASH_LINENO["$i"]}:${FUNCNAME["$next"]}" >&2
i=$next
done
exit "$exit_code"
}
trap on_error ERR
# Source wrapped_ls function
source ./t2.sh
function_a() {
nc -zw 3 127.0.0.1 22
}
function_b() {
wrapped_ls /asdf
}
function_b
Running prints:
$ ./t.sh
ls: cannot access '/asdf': No such file or directory
Failing with exit code 2 in command: ls "$1"
./t2.sh:2:wrapped_ls
./t.sh:31:function_b
./t.sh:34:main
There's one problem with this approach. Bash allows to throw custom error
message, when variable is null or not set with syntax ${varname:?"message"}
.
It does end program, but it doesn't trigger ERR
trap. To handle this properly
we need to replace ERR
trap with EXIT
. It runs unconditionally when script
ends, so exit code needs to be additionally checked. We no longer need
errtrace
(-E
) option in this case.
#!/usr/bin/env bash
set -euo pipefail
on_exit() {
declare \
cmd=$BASH_COMMAND \
exit_code=$? \
i=0 \
end="${#FUNCNAME[@]}" \
next
if [ "$exit_code" = 0 ]; then
return 0
fi
end=$((end - 1))
echo "Failing with exit code ${exit_code} in command: ${cmd}" >&2
while [ "$i" != "$end" ]; do
next=$((i + 1))
echo " ${BASH_SOURCE["$next"]:-}:${BASH_LINENO["$i"]}:${FUNCNAME["$next"]}" >&2
i=$next
done
exit "$exit_code"
}
trap on_exit EXIT
# Source wrapped_ls function
source ./t2.sh
function_a() {
nc -zw 3 127.0.0.1 22
}
function_b() {
# Force exit with undefined variable.
: "${dgfdfg:?}"
wrapped_ls /asdf
}
function_b
It's usually not the issue, since commands used in conditionals are excluded
from errexit
and ERR
trap. Difficult cases are when we want to use functions
or pipe commands.
It's tempting to use function directly in conditional. Both full syntax and
&&
/||
turn off errexit
and ERR
trap. They turn them off globally! It
means that even nested function calls won't properly terminate on error. This is
almost never a desired behaviour, so instead we have to wrap function call in
subshell. It's highly suboptimal solution, since a new process gets spawned, but
it's still better than not handling errors at all.
#!/usr/bin/env bash
set -euo pipefail
on_exit() {
# ...
}
trap on_exit EXIT
get_exit_code() {
set +e
(
set -e
"$@"
)
echo "$?"
set -e
}
function_a() {
nc -zw 1 127.0.0.1 80
return 0
}
function_b() {
function_a
return 0
}
if function_b; then
echo "Expected non-zero!"
fi
declare i
i=$(get_exit_code function_b)
if [ "$i" = 0 ]; then
echo "Expected non-zero, but got: ${i}"
fi
Running gives:
$ bash t.sh
Expected non-zero!
assuming port 80 is not listening.
One way to handle non-standard error codes in pipelines is to wrap invocations
in functions and use builtin return
to signal actual error. Defining a new
function can be skipped, if command list is used with syntax { }
or if short
conditional is used.
# ...
nc -zw 1 127.0.0.1 80 || [ "$?" = 1 ] \
| cat
{
declare i=0
nc -zw 1 127.0.0.1 80 || i=$?
echo "Returned ${i}"
if [ "$i" != 1 ]; then
exit 1
fi
} \
| cat
Running above gives:
# bash t.sh
Returned 1