The robust way to read lines in a loop

while IFS= read -r line || [[ $line ]]; do ...; done < /path/to/input

September 28, 2023 — bashoneliners

Explanation

The one-liner above is a more robust version of this simpler, naive loop:

while read line; do ...; done < /path/to/input

For each line in /path/to/input, store the content of the line in the variable line, and do something with it.

What can go wrong? Why does the robust one-liner need each of its extra pieces to process text line by line reliably?

Spaces at the start and at the end of lines

Without the IFS= prefix (which sets IFS to the empty string only for that read command), whitespace at the beginning and at the end of lines will not be stored in line. Take for example this input:

  hello  
^^     ^^ spaces!

Without the IFS= prefix, the value of line would be "hello" instead of "  hello  " (with two spaces on each side). If you want to preserve these spaces, make sure to use the IFS= prefix.
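The difference is easy to check with a quick experiment. This is a minimal sketch: the variable names input, without_ifs and with_ifs are made up for illustration, and the input comes from a here-string rather than a file.

```bash
#!/usr/bin/env bash
# Sample input: "hello" with two spaces on each side.
input='  hello  '

# Default IFS: read strips leading and trailing whitespace.
read -r without_ifs <<< "$input"

# IFS emptied for this one read only: whitespace is preserved.
IFS= read -r with_ifs <<< "$input"

printf 'without IFS=: [%s]\n' "$without_ifs"   # [hello]
printf 'with IFS=:    [%s]\n' "$with_ifs"      # [  hello  ]
```

The square brackets in the output are only there to make the surrounding spaces visible.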

Backslashes in the input

Without the -r flag, read would interpret backslash \ characters as escape symbols. Take for example this input:

first line\
second line

Without the -r flag, the trailing \ at the end of the first line would be interpreted as escaping the line break, so read would continue onto the next line of input. The value of line would become first linesecond line.
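The effect of -r can be sketched the same way. The variable names input, raw_off and raw_on are hypothetical, and a here-string stands in for the input file.

```bash
#!/usr/bin/env bash
# Two physical lines; the first one ends with a backslash.
input='first line\
second line'

# Without -r: backslash-newline acts as a line continuation,
# so a single read consumes both physical lines.
read raw_off <<< "$input"

# With -r: the backslash is kept literally and read stops at the first newline.
IFS= read -r raw_on <<< "$input"

printf 'without -r: [%s]\n' "$raw_off"   # [first linesecond line]
printf 'with -r:    [%s]\n' "$raw_on"    # [first line\]
```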

Input ending without EOL (end-of-line)

Without the || [[ $line ]], if the last line of the input doesn't end with an EOL (end-of-line) character, the loop body will not run for that last line. This is because read exits with failure when it reaches EOF (end-of-file) before finding an EOL, which ends the loop, even though read has already stored the partial last line in line.

The || [[ $line ]] takes care of this corner case:

- When the left side of || fails, its right side is executed.
- [[ $line ]] checks if $line actually has some content.
  - When the last line ends with EOL, then $line will be empty after the failed read, and we want to stop the loop.
  - When the last line doesn't end with EOL, then $line will not be empty, and we want to run the loop body one more time. On the following iteration the loop will really end, because we're still at the end of the file, and $line will be empty then.
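To see the corner case in action, the sketch below counts loop iterations over input whose last line has no EOL. The function names count_naive and count_robust are hypothetical helpers, and printf piped into the function stands in for /path/to/input.

```bash
#!/usr/bin/env bash
# Count iterations using only the exit status of read.
count_naive() {
  local line n=0
  while IFS= read -r line; do
    n=$((n + 1))
  done
  echo "$n"
}

# Count iterations with the extra || [[ $line ]] check.
count_robust() {
  local line n=0
  while IFS= read -r line || [[ $line ]]; do
    n=$((n + 1))
  done
  echo "$n"
}

# Input with two lines, the second one lacking a trailing EOL.
naive=$(printf 'one\ntwo' | count_naive)
robust=$(printf 'one\ntwo' | count_robust)

# Input that ends with EOL: the extra check adds no spurious iteration.
with_eol=$(printf 'one\ntwo\n' | count_robust)

echo "naive=$naive robust=$robust with_eol=$with_eol"   # naive=1 robust=2 with_eol=2
```

The naive loop misses the unterminated last line, while the robust loop processes it and still stops cleanly when the input does end with EOL.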

Conclusion

To read the lines in the input exactly as they are, the full one-liner idiom at the top is needed.
