Chapter 2. Looping Lingo

It’s not just C-style for loops—bash includes other syntax and styles; some are more familiar to Python programmers, but each has its place. There is a for loop with no apparent arguments, useful in both scripts and inside functions. There is an iterator-like for loop with explicit values and values that can come from other commands.

Looping constructs

Looping constructs are common in programming languages. Since the invention of the C language, many programming languages have adopted the C-style “for” loop. It’s such a powerful, readable construct because it groups the initialization code, the termination condition, and the iteration code all into one place. For example, in C (or Java, or …):

/* NOT bash */
for (i=0; i<10; i++) {
    printf("%d\n", i);
}

With just a few minor syntax differences, bash follows much the same approach:

for ((i=0; i<10; i++)); do
    printf '%d\n' "$i"
done

Note, especially, the use of double parentheses. Rather than braces, bash uses “do” and “done” to enclose the statements of the loop. As with C/C++, an idiomatic use of the “for” loop is the empty “for” loop, giving a deliberate infinite loop (you’ll also see while true; do):

for ((;;)); do
    printf 'forever'
done
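
An infinite loop is only useful if something inside it eventually breaks out. The following is a minimal sketch of that pattern; the function name count_to and its stopping rule are assumptions made up for illustration:

```shell
#!/usr/bin/env bash
# count_to is a hypothetical helper, used only for illustration.
# It loops "forever" and relies on break to leave the loop.
count_to() {
    local limit=$1 n=0
    for ((;;)); do
        (( n >= limit )) && break    # the only exit from the loop
        n=$((n + 1))
        printf '%d\n' "$n"
    done
}

count_to 3
```

Calling count_to 3 prints the numbers 1 through 3, one per line, and then returns.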

But that’s not the only kind of “for” loop in bash. Here’s a common idiom in shell scripts:

for value; do
    echo "$value"
    # do more stuff with $value...
done

This looks like something is missing, doesn’t it? Where does value get its values? This won’t do anything for you on the command line, but if you use this in a shell script then the for loop will iterate over the parameters to the script. That is, it will use $1 then $2, then $3, and so on, as the values for value.

Put that “for” loop in a file called myloop.sh, and then you can run it like this and see the three arguments (“-c”, “17” and “core”) printed out:

$ bash myloop.sh -c 17 core
-c
17
core
$

This abbreviated for loop is also very often found in function definitions:

function listem {
    for arg; do
        echo "arg to func: '$arg'"
    done
    echo "Inside func: \$0 is still: '$0'"
}

Inside a function definition the parameters $1, $2, etc. are the parameters to the function and not parameters to the enclosing shell script. Therefore, inside the function definition the for loop will iterate over the parameters passed to the function.

This minimalist “for” loop iterates over an implied list of values - the parameters passed either to the script or the function. When used in the main body of a script, it iterates over the parameters that were passed to the script; when used inside a shell function, it iterates over the parameters that were passed to that function.
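
To see both behaviors side by side, here is a small sketch. The function name show_args is hypothetical, and set -- is used only as a stand-in for arguments that would normally come from the command line:

```shell
#!/usr/bin/env bash
# Hypothetical demo: the same "for arg" idiom sees a different list
# depending on where it appears.
function show_args {
    local arg
    for arg; do                  # iterates over the function's parameters
        echo "func arg: '$arg'"
    done
}

set -- one two                   # stand-in for the script's own arguments
for arg; do                      # iterates over the script's parameters
    echo "script arg: '$arg'"
done
show_args "alpha beta" gamma     # the function gets its own $1 and $2
```

The top-level loop reports one and two, while the loop inside show_args reports only 'alpha beta' and 'gamma'.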

This is definitely one of the obscure bash idioms. You need to know how to read it, but we’ll circle back to debate how to write it a few sections below.

We might like a similarly simple loop but one with explicit values of our own choosing not limited to the parameters. Bash has just the thing.

Explicit Values

In bash, the for loop can be given a list of values to loop over, like this:

for num in 1 2 3 4 5; do
    echo "$num"
done

Since bash is dealing with strings we don’t have to use just numbers:

for person in Sue Neil Pat Harry; do
    echo $person
done

Of course the list of values can include variables as well as literals:

for person in $ME $3 Pat ${RA[2]} Sue; do
    echo $person
done

Quoting

You may wonder about quotes in all of these commands. You should; quoting in bash can be quite tricky. Just keep reading. We’ll get to that before the end of this chapter.

Another source of values for the for loop can come from other commands, either a single command or a pipeline of commands:

for arg in $( some cmd or other | sort -u)

Examples of this kind are:

for arg in $(cat /some/file)
for pic in $(find . -name '*.jpg')
for val in $(find . -type d | LC_ALL=C sort)
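
As a runnable sketch of the same pattern, the loop below iterates over the output of a small pipeline. The input words are made up, and printf merely stands in for "some cmd or other":

```shell
#!/usr/bin/env bash
# Made-up input; printf stands in for "some cmd or other".
unique=()
for word in $(printf '%s\n' pear apple pear fig | LC_ALL=C sort -u); do
    unique+=("$word")            # one iteration per word of output
done
echo "${#unique[@]} unique words: ${unique[*]}"
```

The pipeline runs to completion first; the loop then sees its output as a plain list of words.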

A common use, especially in older scripts, is something like this:

for i in $(seq 1 10)

because the seq command will generate a sequence of numbers. This case could be considered equivalent to:

for ((i = 1; i <= 10; i++))

This latter for is more efficient and probably more readable. (Note that after the loop terminates, however, the value of i will differ between those two forms (10 vs. 11), though generally one doesn’t use the value outside of the loop.)
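
The difference in the final value of i is easy to check. This short sketch assumes the seq command is installed; the variable names after_seq and after_cstyle are invented for the demonstration:

```shell
#!/usr/bin/env bash
for i in $(seq 1 10); do :; done
after_seq=$i         # the last value taken from the list

for ((i = 1; i <= 10; i++)); do :; done
after_cstyle=$i      # incremented once more before the test failed

echo "seq form: $after_seq, C-style form: $after_cstyle"
```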

There’s also this variation, but it has bash version portability issues because this sequence form of brace expansion was introduced in v3.0 and zero-padding of expanded numeric values was introduced in v4.0:

for i in {01..10}

Leading Zeros

When either of the first two terms starts with a zero in a {start..end..inc} brace expansion, it will force each output value to be the same width, using zeros to pad them on the left side, in bash v4.0 or newer. So {098..100} will result in: 098 099 100 whereas {98..0100} will pad to four characters, resulting in: 0098 0099 0100.
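
A quick way to try the padding rules is to capture each expansion in a list and print it. This sketch assumes bash 4.0 or newer, and the variable names are arbitrary:

```shell
#!/usr/bin/env bash
# Assumes bash 4.0 or newer; older versions do not zero-pad.
three_wide=( {098..100} )      # first term has a leading zero
four_wide=( {98..0100} )       # second term has a leading zero
stepped=( {01..10..3} )        # padding combined with an increment

echo "${three_wide[*]}"
echo "${four_wide[*]}"
echo "${stepped[*]}"
```

The third line of output shows that padding also applies when an increment is given: every value is printed two characters wide.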

This brace expansion construct can be especially useful when you want the numbers that are being generated to be part of a larger string. You simply put the brace construct as part of the string. For example, if you wanted to generate 5 filenames like log01.txt through log05.txt you could write:

for FN in log{01..5}.txt ; do
    # do something with the filenames here
    echo $FN
done

Braces vs. printf

You could also do this with a numeric for loop and then a printf -v to construct the filename from the numbers but the brace expansion seems a bit simpler. Use the numeric for loop and printf when you need the numeric values for something else in addition to the filenames.
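
For comparison, here is a sketch of the numeric loop with printf -v. The running total is an invented example of "something else" that needs the number itself:

```shell
#!/usr/bin/env bash
# Hypothetical example: build log01.txt through log05.txt and also use
# the number itself (here, to keep a running total).
total=0
names=()
for ((n = 1; n <= 5; n++)); do
    printf -v FN 'log%02d.txt' "$n"    # store the formatted name in FN
    names+=("$FN")
    total=$((total + n))
done
printf '%s\n' "${names[@]}"
echo "sum of indices: $total"
```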

The seq command, though, can still be very useful for generating a sequence of floating-point style numbers. You specify an increment between the starting and ending values.

for VAL in $(seq 2.1 0.3 3.2); do
    echo $VAL
done

would yield:

2.1
2.4
2.7
3.0

Just remember that bash doesn’t do floating point arithmetic. You may be wanting to generate these values to pass to some other program from within your script.
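
As one possible illustration of handing such values to another program, the sketch below passes each value to awk, which does understand floating point. The choice of awk and the squaring calculation are assumptions made for the example, not something the loop itself requires:

```shell
#!/usr/bin/env bash
# Assumption: awk stands in for "some other program" that understands
# floating point; LC_ALL=C keeps the decimal separator a period.
squares=()
for VAL in $(LC_ALL=C seq 2.1 0.3 3.2); do
    squares+=("$(LC_ALL=C awk -v x="$VAL" 'BEGIN { printf "%.2f", x * x }')")
done
printf '%s\n' "${squares[@]}"
```

Bash only shuttles the strings around; all of the arithmetic happens inside awk.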

Similar to Python

Here’s another common phrase seen in for loops in bash:

for person in ${namelist[@]}; do
    echo $person
done

which might produce output like this:

Arthur
Ann
Henry
John

Looking at that example, you might be tempted to think that this bash “for” loop is like Python where it can iterate over values returned by an iterator object. Well, bash is iterating over a series of values in this example, but those values come not from an iterator object. Instead the names are all spelled out before the looping begins.

The construct ${namelist[@]} is bash syntax for listing all the values of a bash array (henceforth called a list; see the name discussion in Chapter 4). In this example, the list is called namelist. A substitution is made by bash as it prepares the command to be run. So the “for” loop doesn’t see the list syntax; the substitution happens first. What the “for” loop gets looks just as if we typed the values explicitly:

for person in Arthur Ann Henry John
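
One way to convince yourself that the values are fixed before the loop starts is to change the list while looping over it. In this sketch the list contents match the example above, and the seen list is an invented bookkeeping variable:

```shell
#!/usr/bin/env bash
namelist=(Arthur Ann Henry John)
seen=()
# The expansion is quoted here as a precaution; with one-word names it
# behaves the same as the unquoted form.
for person in "${namelist[@]}"; do
    namelist+=("Extra")        # changing the list now does not affect this loop
    seen+=("$person")
done
echo "iterations: ${#seen[@]}, list length now: ${#namelist[@]}"
```

The loop still runs four times even though the list has grown to eight entries by the time it finishes.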

What about dictionaries? What Python calls “dictionaries”, bash refers to as “associative arrays”, and others call “key/value pairs” or “hashes” (again, see the name discussion in Chapter 4). The construct: ${hash[@]} works fine for the values of the key/value pairs. To loop over the keys (i.e., indices) of the hash, add an exclamation point. The construct ${!hash[@]} can be used, as shown in this code snippet:

# we want a hash (i.e., key/value pairs)
declare -A hash
# read in our data
while read key value; do
    hash[$key]="$value"
done
# show us what we've got, though they won't
# likely be in the same order as read
for key in "${!hash[@]}"; do
    echo "key $key ==> value ${hash[$key]}"
done

An alternate example:

# we want a hash (i.e., key/value pairs)
declare -A hash
# read in our data: word and # of occurrences
while read word count; do
    let hash[$word]+="$count"
done
# show us what we've got, though the order
# is based on the hash, i.e. we don't control it
for key in "${!hash[@]}"; do
    echo "word $key count = ${hash[$key]}"
done
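
Both snippets read from standard input. To try the idea without a data file, here is a self-contained sketch that feeds made-up data through a here-document and sorts the keys so the output order is predictable. It assumes bash 4.0 or newer and keys that contain no blanks:

```shell
#!/usr/bin/env bash
# Assumes bash 4.0 or newer (declare -A). The sample data is made up.
declare -A hash
while read -r word count; do
    hash[$word]=$(( ${hash[$word]:-0} + count ))   # accumulate per word
done <<'EOF'
apple 2
pear 1
apple 3
EOF

# Sort the keys, since the hash itself has no useful order
for key in $(printf '%s\n' "${!hash[@]}" | LC_ALL=C sort); do
    echo "word $key count = ${hash[$key]}"
done
```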

This chapter is more about looping constructs like for, but if you want more details and examples about lists and hashes, see Chapter 4.

Quotes and Blanks

There is one more important aspect to consider about this “for” loop. If the values in the list have blanks in them (for example, if each entry had a first and last name), then our example “for” loop:

for person in ${namelist[@]}; do
    echo $person
done

might produce output like this:

Art
Smith
Ann
Arundel
Hank
Till
John
Jakes

The for loop prints out eight different values for the four names in our list. Why? How? The answer lies in the substitution that bash makes for ${namelist[@]}. It just puts those names in place of the variable expression. That leaves eight words in the list, like this:

for person in Art Smith Ann Arundel Hank Till John Jakes

The for loop is just given a list of words. It doesn’t know from where they came.

There is bash syntax to solve this dilemma: put quotes around the list expression and each value will be quoted.

for person in "${namelist[@]}"

will be translated to:

for person in "Art Smith" "Ann Arundel" "Hank Till" "John Jakes"

and that will yield the desired result:

Art Smith
Ann Arundel
Hank Till
John Jakes

If your for loop is going to iterate over a list of filenames then you should be sure to use the quotation marks, since filenames might have a blank in them.
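
Counting iterations makes the difference concrete. This sketch uses the same four names and assumes the default value of IFS; the counter names are arbitrary:

```shell
#!/usr/bin/env bash
namelist=("Art Smith" "Ann Arundel" "Hank Till" "John Jakes")

unquoted=0
for person in ${namelist[@]}; do       # split again on blanks
    unquoted=$((unquoted + 1))
done

quoted=0
for person in "${namelist[@]}"; do     # one word per list element
    quoted=$((quoted + 1))
done

echo "unquoted: $unquoted iterations, quoted: $quoted iterations"
```

The same reasoning applies to a list of filenames: a name such as "my notes.txt" stays in one piece only in the quoted form.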

There is one last twist to all this. The list syntax can use either "*" or "@" to list all the elements of the list: ${namelist[*]} works just as well… EXCEPT when put inside quotes. The expression:

"${namelist[*]}"

will be evaluated with all of the values inside of a single string. In this example:

for person in "${namelist[*]}"; do
    echo $person
done

that would result in a single line of output, like this:

Art Smith Ann Arundel Hank Till John Jakes

Though a single string might be useful in some contexts it is especially pointless in a for loop - there would be only one iteration.
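
One context where the single string is handy is joining the elements with a separator, because the quoted [*] form puts the first character of IFS between the values. A small sketch, with join_names as a hypothetical helper name:

```shell
#!/usr/bin/env bash
namelist=("Art Smith" "Ann Arundel" "Hank Till" "John Jakes")

# join_names is a hypothetical helper; IFS is changed only inside it.
join_names() {
    local IFS=','              # first character of IFS becomes the separator
    echo "${namelist[*]}"
}

join_names
```

This prints the four names on one line separated by commas, which would be awkward to build with the [@] form.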

Style and Readability

In this chapter you saw that ${namelist[@]} and ${namelist[*]} both list all the values of the list, but if they are enclosed in quotation marks the result is different: separate strings vs. one large string. The same is true for the special shell variables $@ and $*. They both represent the list of all the arguments to the script (i.e., $1, $2, etc.). When enclosed in quotation marks, they also result in either multiple strings or a single string. Why bring that up now? Only to circle back to our simplest for loop:

for param

and say that this is equivalent to:

for param in "$@"

One might argue that the second form is better because it shows more explicitly which values are being iterated over. However, we’re not convinced. The variable name itself and the necessity of the quotes are both specialized knowledge that is no more obvious to the naive reader than just the first, simple form. If readability is a driving factor, simply add a comment:

for param       # iterate over all the parameters
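
The equivalence of the bare form and the quoted "$@" form can be checked with a few throwaway functions. The function names here are hypothetical, and the third one is included only to show what goes wrong without the quotes:

```shell
#!/usr/bin/env bash
# implicit, explicit, and unquoted are hypothetical names for illustration.
implicit() { local p; for p; do printf '<%s>' "$p"; done; echo; }
explicit() { local p; for p in "$@"; do printf '<%s>' "$p"; done; echo; }
unquoted() { local p; for p in $@; do printf '<%s>' "$p"; done; echo; }

implicit "a b" c    # prints <a b><c>
explicit "a b" c    # prints <a b><c>
unquoted "a b" c    # prints <a><b><c>
```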

When looping over a sequence of integer values, the C-style for loop with double parentheses is probably the most readable as well as the most efficient. (If efficiency is a big concern be sure to use "declare -i i" early in your script to make your variable "i" an explicit integer, avoiding conversion to/from a string.)

Summary

In this chapter we first took a quick look at the C/C++ style numerical for loop. Then we went further. Bash is very string oriented and has some other styles of for loops worth knowing. Its minimalist loop “for variable” provides implicit (and arguably obscure) iteration over the arguments to a script or a function. An explicit list of values, string or otherwise, provided to the for loop also provides the perfect mechanism for iterating over all the elements of a list or over all keys in a hash.

Knowing that you have all these values readily available, what might you do with them? What goes on inside the loop, making use of these values? Decisions must be made about the values encountered, and decision making provides another showpiece for bash: its supercharged and superflexible case statement, the topic of the next chapter.
