This chapter deals with a gotcha that I came across while porting a script from ksh to bash. It was a gotcha only because at the time I wasn't aware of a fairly crucial difference in the behavior of the two shells. In both pdksh and bash, the last command of a pipeline is performed in a subshell. This means that a variable assigned within the subshell is not available to the parent shell. In ksh, the last command of a pipeline is executed in the original shell.
This isn't an issue when using the pipe to set a variable, but if the result of a pipe is sent to a loop structure that then populates variables you will use later, that is more of a problem. Once the loop completes, the variables you were going to rely on don't exist.
Included here are a few examples of code that you might expect to work but that actually don't. I also include some workarounds that perform the intended tasks.
The following is the part of the code that had problems when I ported it. It was used to process a file of extended output one line at a time. To perform this task in ksh, I would use the following:
cat somefile | while read line
do
  # Process the $line variable in some form.
  if [ "`echo $line | awk '{print $3}'`" = "somevalue" ]
  then
    all="$all $line"
  fi
done
If everything within the loop is self-contained and none of the variables in it are accessed outside the loop, this will work fine. However, my code parsed each line in the output of the piped command and populated some variables based on that output. Once the loop completed, I wanted to access those values ($all in this example) for other purposes, and under bash I found that they were undefined.
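The effect is easy to reproduce without the original script. The following is a minimal sketch, assuming three made-up input lines in place of the real command output; the sample data and the helper variable third are illustrative only.

```shell
# Sample input stands in for the real piped command (assumption).
all=""
printf '%s\n' 'u1 10 alpha' 'u2 20 somevalue' 'u3 30 somevalue' | while read -r line
do
  # Keep only the lines whose third field matches.
  third=`echo "$line" | awk '{print $3}'`
  if [ "$third" = "somevalue" ]
  then
    all="$all $line"
  fi
done

# In a shell that runs the loop in a subshell, $all is still empty here.
# In a shell that runs the loop in the original shell, it holds the matches.
echo "all after the loop: [$all]"
```

Running the same file under each shell you care about shows at a glance whether the value survives the loop.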
The following code is the first workaround that I found to overcome the problem. Unfortunately it isn't quite as elegant or intuitive as the original code because it uses a temporary file. To keep the code clean I try to avoid using temporary files, but in this case I had no choice.
while read line
do
  # Process the $line variable in some form.
  if [ "`echo $line | awk '{print $3}'`" = "somevalue" ]
  then
    all="$all $line"
  fi
done < somefile
First the data originally piped to the while read loop is sent to a temporary file. The file is then redirected into the back end of the loop. This functions the same way as the original code, but allows the variables populated within the loop to remain usable once the loop completes. Chapter 8 offers another example of this technique.
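Both steps of the technique can be put together in one place. This is only a sketch under stated assumptions: the temporary file name, the sample data, and the cleanup trap are not part of the original script, and a POSIX-style shell is assumed.

```shell
# Hypothetical temporary file name; $$ (the shell's PID) keeps it unique per run.
tmpfile=/tmp/somefile.$$
# Remove the temporary file when the script exits.
trap 'rm -f "$tmpfile"' EXIT

# Step 1: send the data to the temporary file instead of piping it.
printf '%s\n' 'u1 10 alpha' 'u2 20 somevalue' 'u3 30 somevalue' > "$tmpfile"

# Step 2: redirect the file into the back end of the loop.
all=""
while read -r line
do
  third=`echo "$line" | awk '{print $3}'`
  if [ "$third" = "somevalue" ]
  then
    all="$all $line"
  fi
done < "$tmpfile"

# The loop ran in the current shell, so $all is still set here.
echo "all after the loop: [$all]"
```

The trap is a design choice rather than a requirement; it simply keeps the temporary file from lingering if the script stops early.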
The following is a modified form of the previous example:
THE_INPUT=`ps -ef`
while read line
do
  # Process the $line variable in some form.
  if [ "`echo $line | awk '{print $3}'`" = "somevalue" ]
  then
    all="$all $line"
  fi
done <<EOF
$THE_INPUT
EOF
This slight modification of the earlier example eliminates the need for a temporary file. Instead of redirecting a file into the back of the loop, we start a here-document and feed it the data we want to process through the loop. A here-document is where the shell reads input from the current source until it reaches the matching tag alone on a single line, in this case EOF. This solution works in the same way as a real file with both bash and pdksh.
The following sections show four methods for reading input one line at a time. With each method, I explain what variables are available within the code for each of the four shells (bash, ksh, pdksh, and Bourne sh).
The original method of piping input to a read loop looks like this:
ps -ef | while read firstvar
do
  echo firstvar within the loop: $firstvar
  secondvar=$firstvar
  echo secondvar within the loop: $secondvar
done
echo firstvar outside the loop: $firstvar
echo secondvar outside the loop: $secondvar
KornShell (ksh): Both firstvar and secondvar are available within the loop. Only secondvar is available outside the loop. This is useful because even though you can't use the original read variable, you can assign it to some other variable, which is then available when the loop completes.
Bash (bash): Both firstvar and secondvar are available within the loop. Neither firstvar nor secondvar is available after the loop completes.
Public Domain Korn Shell (pdksh): Both firstvar and secondvar are available within the loop. Neither firstvar nor secondvar is available after the loop completes.
Bourne (sh): Both firstvar and secondvar are available within the loop. Neither firstvar nor secondvar is available after the loop completes.
This is the workaround option I discussed originally. The input will be sent to a temporary file and then redirected to the back of the loop.
ps -ef > /tmp/testfile
while read firstvar
do
  echo firstvar within the loop: $firstvar
  secondvar=$firstvar
  echo secondvar within the loop: $secondvar
done < /tmp/testfile
echo firstvar outside the loop: $firstvar
echo secondvar outside the loop: $secondvar
KornShell (ksh): Both firstvar and secondvar are available within the loop. Only secondvar is available outside the loop.
Bash (bash): Both firstvar and secondvar are available within the loop. Only secondvar is available outside the loop. This version now performs in the same manner as the ksh version.
Public Domain Korn Shell (pdksh): Both firstvar and secondvar are available within the loop. Only secondvar is available outside the loop.
Bourne (sh): Both firstvar and secondvar are available within the loop. Neither variable is available outside the loop.
This is the here-document workaround option where we remove the need for a temporary file. This functions in the same way as Option 2.
the_input=`ps -ef`
while read firstvar
do
  echo firstvar within the loop: $firstvar
  secondvar=$firstvar
  echo secondvar within the loop: $secondvar
done <<EOF
$the_input
EOF
echo firstvar outside the loop: $firstvar
echo secondvar outside the loop: $secondvar
KornShell (ksh): Both firstvar and secondvar are available within the loop. Only secondvar is available outside the loop.
Bash (bash): Both firstvar and secondvar are available within the loop. Only secondvar is available outside the loop. This version now performs in the same manner as the ksh version.
Public Domain Korn Shell (pdksh): Both firstvar and secondvar are available within the loop. Only secondvar is available outside the loop.
Bourne (sh): Both firstvar and secondvar are available within the loop. Neither variable is available outside the loop.
This last option removes the pipe (|) from the loop and processes an input file manually. If you have only the Bourne shell at your disposal, this is your only option, and the script will be somewhat slower because it rereads the file on every iteration. With this option, all variables set inside the loop will be available following loop completion. This option is valid for all the shells I've mentioned.
ps -ef > /tmp/testfile
# Redirect the file into wc so that it prints only the count, not the file name.
filecount=`wc -l < /tmp/testfile`
# Lines are numbered from 1.
count=1
while [ $count -le $filecount ]
do
  # Extract line number $count from the file.
  firstvar=`tail -n +$count /tmp/testfile | head -n 1`
  echo firstvar within the loop: $firstvar
  secondvar=$firstvar
  echo secondvar within the loop: $secondvar
  count=`echo $count+1 | bc`
# No redirection on the loop, so the Bourne shell does not run it in a subshell.
done
echo firstvar outside the loop: $firstvar
echo secondvar outside the loop: $secondvar
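The same manual iteration can be tried on a small, predictable input. The sketch below is a variation rather than the original method: it assumes a made-up three-line file, swaps in sed -n to pick out a line by number, and uses expr for the arithmetic. The file name is hypothetical.

```shell
# Hypothetical sample file; $$ keeps the name unique per run.
testfile=/tmp/manualloop.$$
printf '%s\n' 'line one' 'line two' 'line three' > "$testfile"

# Count the lines; reading from stdin keeps the file name out of the output.
filecount=`wc -l < "$testfile"`
count=1
while [ "$count" -le $filecount ]
do
  # Print only line number $count of the file.
  firstvar=`sed -n "${count}p" "$testfile"`
  secondvar=$firstvar
  count=`expr $count + 1`
done
rm -f "$testfile"

# Nothing was piped or redirected into the loop, so both values survive it.
echo "firstvar outside the loop: $firstvar"
echo "secondvar outside the loop: $secondvar"
```

Because the file is scanned once per line, this approach is best kept for small inputs.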
The following tables summarize all of the previously discussed scenarios. Table 10-1 displays the availability of the variable that is initially set in the loop (firstvar). Note that with every method except the manual loop (Option 4), this variable is unavailable following the loop's completion, regardless of the shell.
Table 10-1. Availability of Variables That Are Initially Set in a Loop, After Loop Completion
Method | ksh | bash | pdksh | Bourne
Opt. 1: Pipe to while read | No | No | No | No
Opt. 2: Redirected file to back of loop | No | No | No | No
Opt. 3: Redirected here-document to back of loop | No | No | No | No
Opt. 4: Manual iteration through loop | Yes | Yes | Yes | Yes
Table 10-2 displays the availability of variables that are assigned within the loop (secondvar) once the loop has completed. These variables can be assigned from the initial variable (firstvar), since that variable is accessible within the loop, or from any other assignment inside the loop.
Table 10-2. Availability of Variables That Are Set Within a Loop, After Loop Completion
Method | ksh | bash | pdksh | Bourne
Opt. 1: Pipe to while read | Yes | No | No | No
Opt. 2: Redirected file to back of loop | Yes | Yes | Yes | No
Opt. 3: Redirected here-document to back of loop | Yes | Yes | Yes | No
Opt. 4: Manual iteration through loop | Yes | Yes | Yes | Yes
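To see which column of Table 10-2 a particular shell falls into, a small probe can exercise the first three options and report whether secondvar survives each loop. This is a rough sketch, not part of the original scripts: the one-line input, the temporary file name, and the opt1, opt2, and opt3 variables are all assumptions made for illustration.

```shell
# Option 1: pipe to while read.
secondvar=""
printf 'x\n' | while read firstvar
do
  secondvar=$firstvar
done
[ -n "$secondvar" ] && opt1=Yes || opt1=No

# Option 2: file redirected to the back of the loop.
tmpfile=/tmp/probe.$$    # hypothetical temporary file name
printf 'x\n' > "$tmpfile"
secondvar=""
while read firstvar
do
  secondvar=$firstvar
done < "$tmpfile"
rm -f "$tmpfile"
[ -n "$secondvar" ] && opt2=Yes || opt2=No

# Option 3: here-document redirected to the back of the loop.
secondvar=""
while read firstvar
do
  secondvar=$firstvar
done <<EOF
x
EOF
[ -n "$secondvar" ] && opt3=Yes || opt3=No

echo "Opt. 1: $opt1 | Opt. 2: $opt2 | Opt. 3: $opt3"
```

Run the file with each interpreter in turn (for example, by passing it as an argument to the shell) and compare the printed row with the table.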
The next example represents a scenario in which the script does not pipe to a loop, but instead pipes input to a read statement. This method works well in ksh. Within both pdksh and bash, once the following command is executed, both foo and bar variables are undefined:
echo a b | read foo bar
The workaround removes the use of the read command altogether. This modified version has the same functionality, but it uses two separate commands instead of a pipeline of two commands.
set `echo a b` ; foo=$1 bar=$2
Calling set with no options, only arguments, assigns each word of the echo output to a positional parameter, which can then be reused. This works fine in most instances. However, if the first word begins with a - sign (a negative number, for example), the set command interprets it as a switch and complains that the switch is not valid.
The workaround for this is to use the double-dash switch for set. This tells set to stop option processing, so that any following arguments are treated as positional parameters even if they begin with + or -.
set -- `echo a b` ; foo=$1 bar=$2
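The case that motivates the double dash is a first word that starts with a minus sign. Here is a brief sketch, assuming a made-up input of -5 10 and using printf in place of echo so that the leading dash is printed literally:

```shell
# The first word is negative; without -- the set command would try to
# parse it as an option.
set -- `printf '%s\n' '-5 10'` ; foo=$1 bar=$2
echo "foo=$foo bar=$bar"
```

With the double dash in place, both words land in the positional parameters unchanged.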
One other workaround for this is somewhat of a brute-force tactic but may be necessary depending on the age of the system or shell you're working with. You prepend some arbitrary character (not a - sign) to the beginning of the echo output to protect against switch evaluation. Once the variables are set, you strip off the first character of the first variable using cut so you are left with the original value.
set "@"`echo a b` ; foo=$1 bar=$2
# Keep everything from the second character on, dropping the "@" prefix.
foo=`echo $foo | cut -c2-`
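The prefix trick can be checked against the input it is meant to protect. The following sketch assumes the same made-up -5 10 input and relies on cut -c2- to keep every character from the second one onward:

```shell
# The "@" prefix keeps set from seeing a leading dash on the first word.
set "@"`printf '%s\n' '-5 10'` ; foo=$1 bar=$2
# At this point foo holds the prefixed value; strip the first character.
foo=`echo "$foo" | cut -c2-`
echo "foo=$foo bar=$bar"
```

Only the first variable needs the cleanup, since the prefix is attached to the first word alone.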
The last example enables you to parse through each word of some input string, consuming one word per iteration. Words are assumed to be separated by spaces. This once again uses the set command to assign each word to a positional parameter. The same code as we used previously is implemented, but the core function is now surrounded by a loop that continues until the first positional parameter is null.
#!/bin/ksh
set `echo a b c d e`
while [ "$1" != "" ]
do
  foo=$1 bar=$2
  echo $*
  shift
  echo foo $foo
  echo bar $bar
done
The loop assigns the first two positional parameters to foo and bar. It then outputs the value of all positional variables. The shift command drops the $1 value and promotes $2 and all other variables by one position. It then outputs the values of foo and bar for each iteration. The $* variable that is echoed holds all of the current positional parameters. Thus, after each iteration through the loop, the output of the line is shortened by one element. Note that this script is written in ksh, but it should work in all previously mentioned shells.