# 2023-01-03 Error handling in Bash

Undesired behaviour, i.e. an error, can manifest in various ways. In most high-level programming languages errors are represented as dedicated objects, which allows handling them without interrupting program execution. They usually require the developer to explicitly mark bad program flows by throwing or raising a problem description. Bash doesn't provide any special syntax for this. Instead it allows hooking custom actions whenever a command returns a non-zero exit code. A non-zero exit code is not universally treated as an error, since some applications use more than one exit code to signal success. Still, Bash contains some builtin commands and options named after _error_ for handling such cases.

## Unofficial strict mode

The so-called _unofficial strict mode_ consists of the following options:

```
set -o errexit
set -o nounset
set -o pipefail
```

and is usually shortened to `set -euo pipefail`. It's probably the easiest way to start, but it doesn't provide much information about an error when it actually happens.

> **NOTE**: Even with the above flags Bash will ignore some problems, like a
> non-zero exit code from a subshell invoked inside a builtin command
> invocation. It's best to additionally check scripts with tools like
> [Shellcheck](https://www.shellcheck.net/) and review the [Bash Pitfalls
> wiki](https://mywiki.wooledge.org/BashPitfalls).

## Using traps

When everything else fails, a script can always be launched with `bash -x` or `set -o xtrace`, but that can be overkill for bigger solutions and might reveal confidential data from variables. Another approach is to define trap functions. They act as hooks and run only on POSIX signals or when some specific condition is met. See the example with the `ERR` trap condition below:

```shell
#!/usr/bin/env bash
set -euo pipefail

on_error() {
  declare exit_code=$?
  echo "Something went wrong!" >&2
  exit "$exit_code"
}
trap on_error ERR

ls /nonexistent
```

Upon execution it prints:

```
ls: cannot access '/nonexistent': No such file or directory
Something went wrong!
```

`ls` might not be the best example here, since it describes what went wrong by itself, but imagine checking for TCP port availability with OpenBSD `netcat`:

```
# ...
nc -zw 3 127.0.0.1 22
```

We wouldn't know that there was an error unless we printed the last exit code with `echo $?`. With the `ERR` trap a useful message appears without any further interaction and, if we also use `errexit`, the program stops.

There's an edge case in the above solution when we use functions. They don't inherit the `ERR` trap unless we use the `errtrace` option. Strict mode now looks like this: `set -eEuo pipefail`. It's usually enough for small scripts, but we have to guess why the program stopped if an error can appear in multiple places. `BASH_COMMAND` can be used to show exactly which command caused the script to fail:

```
#!/usr/bin/env bash
set -eEuo pipefail

on_error() {
  declare \
    cmd=$BASH_COMMAND \
    exit_code=$?
  echo "Failing with exit code ${exit_code} at ${1} in command: ${cmd}" >&2
  exit "$exit_code"
}
trap 'on_error ${BASH_SOURCE[0]}:${BASH_LINENO[0]}' ERR

function_a() {
  nc -zw 3 127.0.0.1 22
}

function_b() {
  function_a
}

function_b
```

It saves us some time when an error occurs without `xtrace`:

```
Failing with exit code 1 at /home/jakub/d/kb/other/t.sh:20 in command: nc -zw 3 127.0.0.1 22
```
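
Note that the `ERR` trap shares the blind spots of `errexit` mentioned in the note above. Below is a minimal sketch (my own illustration, not from the examples above) of one such case: a failing command substitution used inside a `declare` invocation, whose status is masked by `declare` itself returning zero.

```shell
#!/usr/bin/env bash
set -eEuo pipefail

trap 'echo "ERR trap fired for: ${BASH_COMMAND}" >&2' ERR

# The command substitution fails, but the exit status of `declare` itself
# is 0, so neither errexit nor the ERR trap notices the problem.
declare masked="$(false)"
echo "Still running, masked='${masked}'"

# Splitting declaration and assignment keeps the failing status visible:
# this line fires the trap and stops the script.
declare unmasked
unmasked="$(false)"
echo "Never reached"
```

Shellcheck reports the masking pattern as SC2155, which is one more reason to keep it in the toolbox.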
## Printing call traces

Additionally Bash provides some diagnostic variables defined as arrays, which can be used to locate where an error happened:

- `BASH_SOURCE` - Source file of the currently executed function.
  It's not always available, since scripts can be sourced from arbitrary
  file descriptors.
- `FUNCNAME` - Currently executed function name.
- `BASH_LINENO` - Line number in the source file.

They can be iterated to generate a call trace:

```shell
#!/usr/bin/env bash
set -eEuo pipefail

on_error() {
  declare \
    cmd=$BASH_COMMAND \
    exit_code=$? \
    i=0 \
    end="${#FUNCNAME[@]}" \
    next
  end=$((end - 1))
  echo "Failing with exit code ${exit_code} in command: ${cmd}" >&2
  while [ "$i" != "$end" ]; do
    next=$((i + 1))
    echo "  ${BASH_SOURCE["$next"]:-}:${BASH_LINENO["$i"]}:${FUNCNAME["$next"]}" >&2
    i=$next
  done
  exit "$exit_code"
}
trap on_error ERR

# Source wrapped_ls function
source ./t2.sh

function_a() {
  nc -zw 3 127.0.0.1 22
}

function_b() {
  wrapped_ls /asdf
}

function_b
```

Running prints:

```
$ ./t.sh
ls: cannot access '/asdf': No such file or directory
Failing with exit code 2 in command: ls "$1"
  ./t2.sh:2:wrapped_ls
  ./t.sh:31:function_b
  ./t.sh:34:main
```

There's one problem with this approach. Bash allows throwing a custom error message when a variable is null or not set, with the syntax `${varname:?"message"}`. It does end the program, but it doesn't trigger the `ERR` trap. To handle this properly we need to replace the `ERR` trap with `EXIT`. It runs unconditionally when the script ends, so the exit code needs to be checked additionally. We no longer need the `errtrace` (`-E`) option in this case.

```shell
#!/usr/bin/env bash
set -euo pipefail

on_exit() {
  declare \
    cmd=$BASH_COMMAND \
    exit_code=$? \
    i=0 \
    end="${#FUNCNAME[@]}" \
    next
  if [ "$exit_code" = 0 ]; then
    return 0
  fi
  end=$((end - 1))
  echo "Failing with exit code ${exit_code} in command: ${cmd}" >&2
  while [ "$i" != "$end" ]; do
    next=$((i + 1))
    echo "  ${BASH_SOURCE["$next"]:-}:${BASH_LINENO["$i"]}:${FUNCNAME["$next"]}" >&2
    i=$next
  done
  exit "$exit_code"
}
trap on_exit EXIT

# Source wrapped_ls function
source ./t2.sh

function_a() {
  nc -zw 3 127.0.0.1 22
}

function_b() {
  # Force exit with undefined variable.
  : "${dgfdfg:?}"
  wrapped_ls /asdf
}

function_b
```

## Handling non-zero exit codes

It's usually not an issue, since commands used in conditionals are excluded from `errexit` and the `ERR` trap. The difficult cases are when we want to use functions or pipe commands.

### In functions

It's tempting to use a function directly in a conditional. Both the full `if` syntax and the short `&&`/`||` form turn off `errexit` and the `ERR` trap, and they turn them off for the *whole call tree*! It means that even nested function calls won't properly terminate on error. This is almost never the desired behaviour, so instead we have to wrap the function call in a subshell. It's a highly suboptimal solution, since a new process gets spawned, but it's still better than not handling errors at all.

```shell
#!/usr/bin/env bash
set -euo pipefail

on_exit() {
  # ... same as in the previous example
  :
}
trap on_exit EXIT

get_exit_code() {
  set +e
  (
    set -e
    "$@"
  )
  echo "$?"
  set -e
}

function_a() {
  nc -zw 1 127.0.0.1 80
  return 0
}

function_b() {
  function_a
  return 0
}

if function_b; then
  echo "Expected non-zero!"
fi

declare i
i=$(get_exit_code function_b)
if [ "$i" = 0 ]; then
  echo "Expected non-zero, but got: ${i}"
fi
```

Running gives:

```
$ bash t.sh
Expected non-zero!
```

assuming port 80 is not listening.

### In pipelines

One way to handle non-standard error codes in pipelines is to wrap invocations in functions and use the builtin `return` to signal an actual error. Defining a new function can be skipped if a `{ }` command list or a short conditional is used instead.

```shell
# ...
nc -zw 1 127.0.0.1 80 || [ "$?" = 1 ] \
  | cat

{
  declare i=0
  nc -zw 1 127.0.0.1 80 || i=$?
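  # "i" now holds nc's exit status; the "|| i=$?" assignment captures it
  # without letting errexit abort the command group.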
echo "Returned ${i}" if [ "$i" != 1 ]; then exit 1 fi } \ | cat ``` Running above gives: ``` # bash t.sh Returned 1 ```