Chapter 3. Expressions and Arithmetic

Bash offers many different ways to do the same thing - and some almost-identical syntax to do very different things. Often, it’s just the difference of a few special characters. We’ve already seen ${VAR} and ${#VAR} where the first expression returns the value of the variable but the second returns its string length. Other bash idioms might make you wonder: when should you use two or just one set of square brackets? Or even none? What, if any, is the difference between (( ... )) and $(( ... )) ? Usually there is some common meaning in the symbols across their various uses that hints at some semblance of reason behind the syntax. Sometimes the choice of expression was more for historical reasons. Let’s take a look and see if we can explain some of these idiomatic patterns and arithmetic expressions.

Arithmetic

Although bash is largely a string-oriented language, whenever you see double parentheses in bash, it means that arithmetic evaluation is going on - arithmetic with numbers, not strings. This is familiar to you from the for-loop variation that uses double parens:

for((i=0; i<size; i++))
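As a minimal sketch, here is that loop header completed into something you can run. The value given to size and the echo in the loop body are assumptions made purely for illustration.

```shell
size=3    # assumed value, for illustration only

for (( i = 0; i < size; i++ )); do
    echo "pass $i of $size"
done
```

The loop runs three times, with i taking the values 0, 1, and 2.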

Notice how we don’t have to use the $ in front of variable names inside the double parens. That is true whenever we use double parens in bash. So where else do we find double parens in use?

First, we can use a dollar sign and double parens to do an arithmetic calculation to create a value for a shell variable, like these:

max=$((intro + body + outro - 1))
median_loc=$((len / 2))

Again, notice that the variables don’t need the dollar sign reference in front of them when they are used inside of double parens.
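To see those two assignments produce actual numbers, here is a small sketch. The starting values are assumptions, chosen only to make the arithmetic easy to check by hand.

```shell
# Assumed sample values, for illustration only.
intro=3 body=10 outro=2 len=9

max=$((intro + body + outro - 1))    # 3 + 10 + 2 - 1
median_loc=$((len / 2))              # integer division: the remainder is dropped

echo "max = $max"                    # max = 14
echo "median_loc = $median_loc"      # median_loc = 4
```

Note that 9 / 2 yields 4, not 4.5; bash arithmetic works on integers only.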

Second, consider this use of double parens:

if (( max - 3 > x * 4 )) ; then
    # do something here
fi

This time we are using double parens without a leading dollar sign. Why? What’s different?

In the first case, for variable assignments, we want the value of the expression, so just like with variables, the dollar sign indicates that we want the value. In the second case, an if statement, we don’t use the dollar sign because we only need the true/false binary to make our decision. If the expression inside the double parens (without a dollar sign) is a non-zero value, then the return status of the parenthesised expression is 0 — which is considered “true” in bash. Otherwise the return status is 1 (which, in bash, is “false”).
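The following sketch shows both situations: a comparison, and a bare value with no comparison operator at all. The values of max and x are assumptions for illustration.

```shell
max=20 x=4    # assumed sample values

if (( max - 3 > x * 4 )); then    # 17 > 16, so the comparison yields 1
    cmp="true"
else
    cmp="false"
fi

if (( max - 20 )); then           # a bare value of 0, no comparison at all
    val="true"
else
    val="false"
fi

echo "comparison: $cmp, bare value: $val"    # comparison: true, bare value: false
```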

Notice how we said “return status”? That’s because the double parens with no dollar sign is used, syntactically, as if you were executing one or more commands. It doesn’t return a value that you could use to assign to a variable. However, you can use it to assign a new value to a variable in certain cases since bash supports some C language-style assignment operators. Here are a few examples. Remember, these are complete bash statements, one per line.

(( step++ ))
(( median_loc = len / 2 ))
(( dist *= 4 ))

Each statement is performing an arithmetic evaluation, but in each case there is an assignment of a value that also occurs as part of that evaluation. No value is returned from the expression, only the return status — which you could examine in the $? variable after each statement executes.
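Here is a sketch of those same three statements with their return status made visible. The starting values are assumptions. One detail worth noticing: step++ yields the old value of step, so when step starts at 0 the statement reports status 1 even though the increment took place.

```shell
step=0 len=9 dist=5    # assumed starting values

(( step++ )) || echo "step++ gave status $?"                       # old value 0, so status 1
(( median_loc = len / 2 )) && echo "median_loc is $median_loc"     # value 4, so status 0
(( dist *= 4 )) && echo "dist is $dist"                            # value 20, so status 0

echo "step is now $step"                                           # step is now 1
```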

Could you write those three calculations from the above example using the dollar-sign-double-paren syntax? It may look more familiar to write:

step=$(( step + 1 ))
median_loc=$(( len / 2 ))
dist=$(( dist * 4 ))

We don’t want to write $((step++)) on a line by itself because that expression will return a numeric value — which the shell will then take as the name of a command to be executed. If step++ evaluated to 3 then the shell would look for a command named 3.
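You can watch that mistake happen in the sketch below. It assumes that no command named 3 exists on your PATH, and it discards the error message so that only the return status is shown.

```shell
step=3    # assumed starting value

# The expansion yields "3", which bash then tries to run as a command.
# Assumption: there is no command named 3 on your PATH.
$(( step++ )) 2>/dev/null || status=$?

echo "status = $status"    # 127 is the shell's "command not found" status
echo "step = $step"        # the increment still happened during expansion
```

So the variable does get updated, but only as a side effect of a failed command.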

A Reminder about spaces

In a bash variable assignment, no spaces are allowed around the equal sign. For variable assignment, syntactically, it all must be one “word” of text. However, inside the parentheses spaces are OK since the parens define the boundary for that “word”.

Now there is just one more arithmetic variation - probably for historical reasons. You can use the shell builtin let to act like the double parens without the dollar sign. So compare these equivalent statements:

(( step++ ))   # is the same as:
let "step++"

(( median_loc = len / 2 ))  # is the same as:
let "median_loc = len / 2"

(( dist *= 4 ))   # is the same as:
let "dist*=4"

But be careful: if you don’t use quotes around the let expression, then you had better not have any spaces in that expression at all. (The first let in our example doesn’t need the quotes, but it’s a good habit to always use them.) Spaces will divide your command into separate words, and let evaluates each word as a separate expression, so an expression split across several words will usually produce a syntax error.
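A short sketch of how let treats its words. The variable names x, y, and z are hypothetical, and the error message from the failing call is discarded so that only the outcome is shown.

```shell
# Each quoted argument is one complete expression; let evaluates them in order.
let "x = 2 + 3" "y = x * 2"
echo "x = $x, y = $y"          # x = 5, y = 10

# Unquoted with spaces: let receives z, =, and 7 as three separate
# expressions, and "=" on its own is not a valid expression.
if ! let z = 7 2>/dev/null; then
    echo "let z = 7 failed"
fi
```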

No Parentheses Needed

We said that bash is a string-oriented language, but there is a way to make an exception. You can declare a variable as an integer like this: declare -i MYVAR. Having done so, you can do arithmetic to assign it a value without using double parentheses and without the $ in front of variable names. Here’s an example, a script seesaw.sh:

declare -i SEE
X=9
Y=3
SEE=X+Y            # only this one will be arithmetic
SAW=X+Y            # this is just a literal string
SUM=$X+$Y          # this is string concatenation
echo "SEE = $SEE"
echo "SAW = $SAW"
echo "SUM = $SUM"

What you get if you run these statements shows how bash is mostly string oriented. The values of SAW and SUM are formed by string operations. Only SEE is given its value by doing arithmetic.

$ bash seesaw.sh
SEE = 12
SAW = X+Y
SUM = 9+3
$

This shows that you can do arithmetic without the need for double parentheses - but we usually avoid this, as it requires that you declare the variable assigned to as an integer. If you forget the declare statement or if you assign such an expression to a variable not so declared, you won’t get any error message - just an unwanted result.

Compound Commands

You are probably very familiar with seeing a single command on a line by itself in a script. You may also be familiar with using a single command in an if statement’s condition to see if the command succeeded and taking action depending on the result. If you’ve read Chapter 1, you’ve seen the “no-if” if statement idiom, too. Now let’s take a look at the simple one-command if statement, one that looks like this:

if cd $DIR ; then # do something ...

but what about these:

if [ $DIR ]; then # do something ...
if [[ $DIR ]]; then # do something ...

Why the brackets in this second example and not in the first? Is there a difference? What about one versus two brackets; which should you use and when/why?

Without any brackets, what is happening is the execution of a command (cd in our example). The success or failure of that command is returned as, in effect, a true or false for the if to use in its decision branching between the then or the else (should there be one). In bash you can put an entire pipeline of commands (e.g. cmd | sort | wc ) in an if statement. It is the return status of the last command in the pipeline that determines whether the if statement is true or false.
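A minimal sketch of that last point, using the true and false commands as stand-ins for real work. It assumes the default shell settings, that is, the pipefail option has not been turned on.

```shell
# false fails, but true is last in the pipeline, so the if sees success.
if false | true; then
    first="then-branch"
else
    first="else-branch"
fi

# Same commands, opposite order: now the failing command is last.
if true | false; then
    second="then-branch"
else
    second="else-branch"
fi

echo "false | true -> $first"     # false | true -> then-branch
echo "true | false -> $second"    # true | false -> else-branch
```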

The single bracket syntax is actually also running a command, the shell builtin test command. The single left bracket is a shell builtin for the same thing, the test command, but with one difference: a required final argument of ]. The double bracket syntax is, technically, a bash keyword, one that indicates a compound command, whose behavior is very similar, though not identical, to the single bracket and test command.
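You can ask bash itself how it classifies each of these. The sketch below uses the type builtin with its -t option, which prints a single word describing what a name is.

```shell
# Quote the brackets so they are passed to type as ordinary words.
for name in test '[' '[['; do
    echo "$name: $(type -t "$name")"
done
```

In bash this reports builtin for test and for [, and keyword for [[.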

We use either single or double bracket syntax to do some logic and comparisons, that is, conditional expressions. We use them for checking the state of things, such as whether a file exists, or has certain permissions, or whether a shell variable has a value or not. See the bash man page under “CONDITIONAL EXPRESSIONS” for a full list of the tests and checks you can make.

Our example, above, is checking to see if the DIR variable has a non-null value. Another way to write this would be:

if [[ -n $DIR ]]; then ...

to see if the value is not null, that is, has a non-zero length. Conversely, to see if the variable’s value is zero length, i.e., unset or null, use:

if [[ -z $DIR ]]; then ...
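The three cases (unset, empty, and set) can be tried out with a sketch like this one. The value /tmp is just an assumed sample, and the script assumes the nounset option is off so that referring to an unset variable is not an error.

```shell
unset DIR
[[ -z $DIR ]] && echo "unset: zero length"

DIR=""
[[ -z $DIR ]] && echo "empty: zero length"

DIR="/tmp"    # assumed sample value
[[ -n $DIR ]] && echo "set: non-zero length"
[[ $DIR ]]    && echo "set: the bare test agrees"
```

All four messages are printed, showing that -z cannot tell an unset variable from an empty one.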

So are there differences between the single and double bracket tests? Just a few, but they can be significant.

Perhaps the biggest difference is that the double bracket syntax supports an additional comparison operator, =~, which allows the use of regular expressions.

if [[ $FN =~ .*xyzz*.*jpg ]]; then ...

Regular Expressions

This is the one and only place in bash where you will find regular expressions! And remember: do not put your regular expression in quotes or you will be matching those characters verbatim and not as a regular expression.
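Here is a sketch of the quoting effect. The filename and the pattern are assumed samples; the pattern is kept in a variable (a hypothetical name, re) so that the only difference between the two tests is the quotes.

```shell
FN="holiday_xyz_2019.jpg"    # assumed sample filename
re='xyz.*jpg'                # assumed sample pattern

if [[ $FN =~ $re ]]; then    # unquoted: treated as a regular expression
    unquoted="match"
else
    unquoted="no match"
fi

if [[ $FN =~ "$re" ]]; then  # quoted: looks for the literal text xyz.*jpg
    quoted="match"
else
    quoted="no match"
fi

echo "unquoted: $unquoted"   # unquoted: match
echo "quoted:   $quoted"     # quoted:   no match
```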

Another difference between single and double brackets is more stylistic, but one that will affect portability. These two forms do the same thing.

if [[ $VAR == "literal" ]]; then ...

if [ "$VAR" = "literal" ]; then ...

The use of the single equal sign for comparison may seem unnatural for C and Java programmers, but when used in bash conditional expressions, both = and == mean the same thing. The single equal sign is preferred in the single bracket syntax for POSIX compliance (so says the bash man page).

A subtle sort of difference

Within the double square brackets, the < and > operators compare “lexicographically using the current locale” whereas test (and [ ) compare using simple ASCII ordering.

You will also likely need to escape these operators (like this: if [ $x \> $y ] ) when using single brackets; otherwise they will be taken to mean redirection. Why? Because the single bracket (and the test command) are builtin commands, not keywords, so bash sees it as running a command - and you can redirect I/O when running a command. However, when bash sees the double brackets, a keyword, it knows to expect such operators and doesn’t treat them as redirection. Therefore, of the two syntax forms, we much prefer the double-bracket syntax.
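The redirection problem is easy to reproduce. This sketch works in a scratch directory created with mktemp so that the stray file does no harm; the strings apple and banana are assumed sample values.

```shell
cd "$(mktemp -d)" || exit 1    # scratch directory, so stray files are harmless

# Unescaped: bash runs "[ apple ]" (true, because the string is non-empty)
# and redirects its output into a new file named banana.
[ apple > banana ] && echo "unescaped: true"
[ -e banana ] && echo "oops: created a file named banana"

# Escaped: a real string comparison, and apple sorts before banana.
[ apple \> banana ] || echo "escaped: false"

# Double brackets need no escaping.
[[ apple > banana ]] || echo "double brackets: false"
```

The unescaped form is doubly wrong: it gives the opposite answer and leaves a file behind.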

Both single and double bracket expressions can use an older, more FORTRAN-like syntax for their numeric comparisons. For example, they use -le for less-than-or-equal-to comparison. Here’s where another difference between the two arises. The arguments to either side of this operator must be simple integers in the single bracket expression. Using double brackets, each operand can be a larger arithmetic expression, though without spaces unless quoted. For example:

if [[ $OTHERVAL*10 -le $VAL/5 ]] ; then ...

A better choice if you’re doing arithmetic expressions and comparisons is to use the double-parentheses syntax. That gives you the more familiar C/Java/Python-like comparison operators and more freedom regarding spacing:

if (( OTHERVAL * 10 <= VAL / 5 )) ; then ...
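To compare the three forms side by side, here is a sketch with assumed sample values. The error message from the single-bracket test is discarded so that only its failure is reported.

```shell
OTHERVAL=3 VAL=200    # assumed sample values

# Double brackets evaluate each side of -le as arithmetic: 30 <= 40.
[[ $OTHERVAL*10 -le $VAL/5 ]] && echo "[[ ]]: true"

# Double parentheses: same comparison, C-style operator, free spacing.
(( OTHERVAL * 10 <= VAL / 5 )) && echo "(( )): true"

# Single brackets want plain integers, so the same operands are rejected.
[ "$OTHERVAL*10" -le "$VAL/5" ] 2>/dev/null || echo "[ ]: failed with status $?"
```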

Style and Readability

With so many variations to choose from, which if statement style do you choose? We choose the style that best fits the calculation under consideration. When it is a very mathematical expression we use the double parentheses. Since the $ isn’t needed on variables to get their values inside double parentheses we try to omit them consistently. For text-heavy comparisons we use the double brackets, especially because that lets us use regular expressions.

For arithmetic expressions some people may prefer the double parentheses around the expression, consistent with the if statements. However, for others the simple let builtin command reads cleanly and simply. Though bash will allow arithmetic operations without double parentheses or let, provided the variable being assigned is declared as an integer, we avoid that style. It is too easy to mix and match variables, some of which may not have been declared as integers. Confusion ensues. Putting the expression in double parentheses (or using let) guarantees that it will remain an arithmetic evaluation.

Summary

As a rule, in bash, double parens indicate arithmetic is going on. The dollar sign indicates that you want the value of the expression returned, otherwise you just get a success/fail result status. But operator-rich bash makes it possible to do similar things using either the double paren syntax or the let builtin. You can live dangerously and skip the double parens by declaring your variables as integers, but we cannot recommend that.

For conditionals, the newer syntax of [[ is much to be preferred over [. However, if your conditional consists of arithmetic comparisons, an even better choice is the (( syntax.
