I have created a directory in my home directory called shellprogs to serve as a repository for my shell programs. In order to execute these programs without having to specify an absolute path, I have updated my path in .profile to include the following line:
export PATH=${PATH}:~/shellprogs
This adds /home/martyp/shellprogs to my path assuming colon is the delimiter, which is usually the case. After updating .profile you can log out and log in to update your path. You can issue the command below to confirm that your path has indeed been updated to include the directory in which you keep your shell programs:
martyp $ echo $PATH
/usr/bin:/usr/ucb:/etc:.:/home/martyp/shellprogs
martyp $
Let's now go to the shellprogs directory and type a simple command file called today:
#!/bin/ksh
# This is a simple command file to display today's date.
echo "Today's date is"
date +%x
Before we look at each line in the program, let's run it to see the output:
martyp $ today
ksh: today: cannot execute
martyp $ ls -al
total 6
drwxrwxr-x 2 martyp staff 512 May 21 09:53 .
drwxrwx--- 4 martyp staff 512 May 21 09:25 ..
-rw-rw-r-- 1 martyp staff 100 May 21 09:54 today
martyp $ chmod +x today
martyp $ ls -al
total 6
drwxrwxr-x 2 martyp staff 512 May 21 09:53 .
drwxrwx--- 4 martyp staff 512 May 21 09:25 ..
-rwxrwxr-x 1 martyp staff 100 May 21 09:54 today
martyp $ today
Today's date is
05/21/99
martyp $
We could not execute today because the file was created without execute permissions, which we confirm by performing a long listing. We then add execute permission to the file with chmod +x, which is confirmed with the next long listing. We are then able to execute today and view its results.
The umask discussion in each of the earlier shell chapters describes the defaults for new files. Almost invariably, new files are created without execute permission; therefore, you will have to update the permissions on the file to include execute.
Let's now walk through this simple shell program and analyze each line.
#!/bin/ksh
The first line specifies that the ksh will be used. If you are using Bash, C shell, or any other shell, you would specify its location in this manner. Some systems have multiple versions of shells running on them. It may be that a shell has been updated since the last release and some users may want to maintain the old version of the shell. For this reason, you want to be sure that the absolute path you specify is indeed that of the shell you wish to use. Note that the #! must be the very first two characters in the file.
Normally, when you run a shell program, the system tries to execute commands using the same shell that you are using for your interactive command lines. If we don't include this line, someone running a shell other than the ksh might have unexpected results when trying to run one of our programs.
As a good practice, you should include #! followed by the absolute path of the shell (for example, #!/bin/ksh) as the first line of every shell program you write.
Let's now view the next line of our program:
# This is a simple command file to display today's date.
These are comments. Everything after a # in a command line is considered a comment (#! on the first line is the one very big exception). Keep in mind my early remarks about including comments liberally. It is a great pleasure to share a well commented shell program with a friend, knowing that you have adequately documented the program with comments.
Here is the next command:
echo "Today's date is"
The echo command generates prompts and messages in shell programs. See the echo manual entry to see all the options available with echo for formatting your output. We commonly enclose the string to be displayed in double quotes. In this case, we did because we needed to let the shell know that the apostrophe was part of the string and not a single quote that needs a match.
Next is the last command in our program:
date +%x
This executes the date command. There are indeed many options to the date command, some of which are not intuitive. In this case, we use one of the simplest forms of the command, which simply produces today's date.
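To give a feel for the other forms of date, here is a minimal sketch with a few format strings I am confident are widely supported. It is only an illustration, and because the results depend on the current date and locale, no output is shown.

```sh
#!/bin/sh
# Sketch: a few date format strings. The output depends on the current
# date and locale, so none is shown here.
date +%x            # the locale's date representation, as used in today
date +%Y-%m-%d      # four-digit year, month, day
date +%H:%M:%S      # 24-hour time
date "+%A, %B %d"   # weekday name, month name, day of month
```

The last format is quoted because it contains spaces; without the quotes, date would see several arguments instead of one format string.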
Let's cover one more example of a command file that you may find useful. This program informs you of the present working directory and then produces a long listing of the files in the directory. The following is a listing of the shell program myll:
#!/bin/ksh
# This is a simple shell program that displays the current
# directory name before a long file listing (ll) of that
# directory.
# The script name is myll

echo "Long listing of directory:"
pwd
echo
ll -l
This program uses ll; you may need to use ls -al. The following is what myll looks like when it runs:
martyp $ myll
Long listing of directory:
/home/martyp/shellprogs
total 4
-rwxrwxr-x 1 martyp staff 220 May 24 05:28 myll
-rwxrwxr-x 1 martyp staff 100 May 21 09:54 today
martyp $
The name of the present working directory is /home/martyp/shellprogs. A long listing of the contents of this directory shows the two programs we have covered so far in this chapter.
Before we can produce more complex shell programs, we need to learn more about some of the programming features built into the shell. Let's start with shell variables.
A shell variable is similar to a variable in any programming language. A variable is simply a name you give to a storage location. Unlike most languages, however, you never have to declare or initialize your variables; you just use them.
Shell variables can have just about any name that starts with a letter (uppercase or lowercase). To avoid confusion with special shell characters (like file name generation characters), keep the names simple and use just letters, numbers, and underscore (_).
To assign values to shell variables, you simply type the following:
name=value
Note that there are no spaces before and after the = character.
Here are some examples of setting shell variables from the command line. These examples work correctly:
$ myname=ralph
$ HerName=mary
This one does not work because of the space after "his":
$ his name=norton
his: not found
The shell assumes that "his" is a command and tries to execute it, with name=norton as its argument. Because no such command exists, no assignment takes place.
This example contains an illegal character (+) in the name:
$ one+one=two
one+one=two: not found
A variable must start with a letter. A common mistake is to give a variable a name that makes perfect sense to you but does not start with a letter. The following example uses a variable that starts with a number:
$ 3word=hi
3word=hi: not found
The "3" causes a "not found" to be produced when we attempt to assign this variable.
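The naming rules above can be checked mechanically. The following is a minimal sketch, not something from the examples so far: the function name is_valid_name is hypothetical, and it relies on the case statement and the for loop, both of which are covered later in this chapter.

```sh
#!/bin/sh
# Sketch: check a candidate variable name against the rules above.
# The function name is_valid_name is hypothetical.
is_valid_name() {
    case $1 in
        "")               return 1 ;;   # empty name
        [!A-Za-z]*)       return 1 ;;   # does not start with a letter
        *[!A-Za-z0-9_]*)  return 1 ;;   # contains some other character
        *)                return 0 ;;
    esac
}

for candidate in myname HerName 3word one+one user_2
do
    if is_valid_name "$candidate"
    then
        echo "$candidate: ok"
    else
        echo "$candidate: not a valid name"
    fi
done
```

With these sample words, myname, HerName, and user_2 are reported as ok, while 3word and one+one are rejected for the same reasons the shell rejected them above.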
Now that we can store values in our variables, we need to know how to use those values. The dollar sign ($) is used to get the value of a variable. Any time the shell sees a $ in the command line, it assumes that the characters immediately following it are a variable name. It replaces the $variable with its value. Here are some simple examples using variables at the command line:
$ myname=ralph
$ echo myname
myname
$ echo $myname
ralph
$ echo $abc123
In the first echo command, there is no $, so the shell ignores myname, and echo gets myname as an argument to be echoed. In the second echo, however, the shell sees the $, looks up the value of myname, and puts it on the command line. Now echo sees ralph as its argument (not myname or $myname). The final echo statement is similar, except that we have not given a value to abc123 so the shell assumes that it has no value and replaces $abc123 with nothing. Therefore, echo has no arguments and echoes a blank line.
There may be times when you want to concatenate variables and strings. This is very easy to do in the shell:
$ myname=ralph
$ echo "My name is $myname"
My name is ralph
There may be times when the shell can become confused if the variable name is not easily identified in the command line:
$ string=dobeedobee
$ echo "$stringdoo"
We wanted to display "dobeedobeedoo," but the shell thought the variable name was stringdoo, which had no value, so only a blank line was displayed. To get the result we want, we can use curly braces around the variable name to separate it from surrounding characters:
$ echo "${string}doo"
dobeedobeedoo
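The same technique is handy when building file names. Here is a minimal sketch; the variable names base and backup and their values are made up for illustration.

```sh
#!/bin/sh
# Sketch: braces mark where a variable name ends. The variable names
# and values here are made up for illustration.
base=report
echo "$base_old"        # looks up a variable named base_old: prints an empty line
echo "${base}_old"      # prints: report_old
backup="${base}.bak"    # braces are optional before a "." but harmless
echo "$backup"          # prints: report.bak
```

The underscore is a legal character in a variable name, which is why the first echo goes wrong; a period is not, so the braces in the last assignment are there only for readability.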
You can set variables in shell programs in the same way, but you might also like to do things such as save the output of a command in a variable so that you can use it later. You may also want to ask users a question and read their response into a variable so that you can examine it.
Command substitution allows us to save the output from a command (stdout) into a shell variable. To demonstrate this, let's take another look at how our “today” example can be done using command substitution.
#!/bin/ksh
d=`date +%x`
echo "Today's date is $d"
The back quotes (`) around the date command tell the shell to execute date and place its output on the command line. The output will then be assigned to the variable d. We'll name this updated script today1 and run it:
$ today1
Today's date is 05/24/00
We could also have done this task without using the variable d. We could have just included the date command in the echo string, as shown in the today2 script shown in the following example:
#!/bin/ksh
echo "Today's date is `date +%x`"
When we run this program, we see exactly the same output as we did with today1:
$ today2
Today's date is 05/24/00
We'll use shell variables and command substitution extensively in some of the upcoming examples. Let's now cover reading user input.
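Besides back quotes, the Korn shell and other POSIX shells accept a second spelling of command substitution, $(command). The following is a minimal sketch comparing the two; the variable names are hypothetical, and the path is simply the example directory used earlier in this chapter. The original Bourne shell understands only the back quote form.

```sh
#!/bin/sh
# Sketch: two spellings of command substitution. The variable names
# are hypothetical.
words_old=`echo "one two three" | wc -w`      # back quotes
words_new=$(echo "one two three" | wc -w)     # $( ) form

# wc may pad its output with spaces; an unquoted expansion drops them.
echo "Back quotes counted" $words_old "words"
echo "Dollar-paren counted" $words_new "words"

# The $( ) form nests without extra escaping.
parent=$(basename $(dirname /home/martyp/shellprogs/today))
echo "Parent directory: $parent"     # prints: Parent directory: shellprogs
```

Both forms give the same result here. The nested example is where the $( ) form is easier to read, because nested back quotes would each need a backslash in front of them.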
The most common way to get information from the user is to prompt him or her and then read the response. The echo command is most commonly used to display the prompt; then the read command is used to read a line of input from the user (standard input). Words from the input line can be assigned to one or several shell variables.
Here is an example with comments to show you how read can be used:
#!/bin/ksh
# program: readtest

echo "Please enter your name: \c"    # the \c leaves cursor on
                                     # this line.
read name       # I have no $ because we are doing an assignment
                # of whatever the user enters into name.
echo "Hello, $name"
echo "Please enter your two favorite colors: \c"
read color1 color2    # first word entered goes into color1
                      # remainder of line goes into color2
echo "You entered $color2 and $color1"
If we ran this program, it would look something like this:
$ readtest
Please enter your name: gerry
Hello, gerry
Please enter your two favorite colors: blue green
You entered green and blue
$
Notice how the read command assigned the two words entered for colors into the two respective color variables. If the user enters fewer words than the read command is expecting, the remaining variables are set to null. If the user enters too many words, all extra words entered are assigned into the last variable. This technique is how you can get a whole line of input into one variable. Here's an example of what happens when you enter more than two colors:
$ readtest
Please enter your name: gerry
Hello, gerry
Please enter your two favorite colors: chartreuse orchid blue
You entered orchid blue and chartreuse
$
The program took the last two colors entered and assigned them to color2. For this reason, you have to be careful of what a user may enter with the read command and how you map that information to variables.
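You can watch this word splitting without typing anything by feeding read from a here-document. This is a minimal sketch under that assumption; the variable names first, second, and third are made up.

```sh
#!/bin/sh
# Sketch: feed read from a here-document instead of the keyboard so the
# word splitting can be seen without typing anything.
read color1 color2 <<EOF
chartreuse orchid blue
EOF
echo "color1 is: $color1"     # prints: color1 is: chartreuse
echo "color2 is: $color2"     # prints: color2 is: orchid blue

# With fewer words than variables, the extra variables are left empty.
read first second third <<EOF
red
EOF
echo "first=$first second=$second third=$third"   # prints: first=red second= third=
```

If you need each color separately no matter how many the user types, one approach is to read the whole line into a single variable and then loop over its words with the for loop described later.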
The read command also works with a built-in variable called REPLY. If you run read without naming a variable, the line of input is assigned to REPLY. I tend not to use this often because I don't like to leave anything to chance. I normally explicitly name a variable to which input is assigned. The following example shows the listing and running of readtest1, which includes a line that contains only the read command, the response of which is assigned to REPLY:
martyp $ cat readtest1
#!/bin/ksh
# program: readtest

echo "Please enter your name: \c"    # the \c leaves cursor on
                                     # this line.
read name       # I have no $ because we are doing an assignment
                # of whatever the user enters into name.
echo "Hello, $name"
echo "Please enter your two favorite colors: \c"
read color1 color2    # first word entered goes into color1
                      # remainder of line goes into color2
echo "You entered $color2 and $color1"
echo "Where are you from?"
read                  # read response into $REPLY
echo "I'm sure $REPLY is great"
martyp $ readtest1
Please enter your name: MARTY
Hello, MARTY
Please enter your two favorite colors: RED BLUE
You entered BLUE and RED
Where are you from?
MANY DIFFERENT PLACES
I'm sure MANY DIFFERENT PLACES is great
martyp $
You can see in this example that the response I typed of "MANY DIFFERENT PLACES" was indeed read into REPLY.
Although echo is used throughout this chapter, you may also see print used to display lines on the screen.
Shell programs can have command-line arguments just like any regular command. Command-line arguments that you use when you invoke your shell program are stored in a special set of variables. These are called the positional parameters.
The first ten words on the command line are directly accessible in the shell program using the special variables $0-$9. This is how they work:
$0 | The command name |
$1 | The first argument |
$2 | The second argument |
$3 through $8 | The third through eighth arguments |
$9 | The ninth argument |
If you are not sure how many command line arguments you may get when your program is run, there are two other variables that can help:
$# | The number of command-line arguments |
$* | A space-separated list of all the command-line arguments (which does not include the command name) |
The variable $* is commonly used with the for loop (soon to be explained) to process shell script command lines with any number of arguments.
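To experiment with these variables without writing a separate program, you can use set --, which replaces the positional parameters of the current shell. The following is a minimal sketch; the three words are made up, and set -- merely stands in for a real command line.

```sh
#!/bin/sh
# Sketch: set -- replaces the positional parameters, which lets us
# imitate a command line inside a script. The words are made up.
set -- alpha beta gamma

echo "Program name: $0"
echo "Number of arguments: $#"     # prints: Number of arguments: 3
echo "First argument: $1"          # prints: First argument: alpha
echo "Third argument: $3"          # prints: Third argument: gamma
echo "All arguments: $*"           # prints: All arguments: alpha beta gamma
```

The value of $0 depends on how the script was started, so its output is not shown.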
Let's now take the myll we worked with earlier and modify it to produce a long listing of the directory that we specify when we run the program. Figure 28-1 shows the modified myll:
#!/bin/ksh
# This is a simple shell program that takes one command line
# argument (a directory name) and then displays the full pathname
# of that directory before doing a long file listing (ll) on
# it.
#
# The script name is myll

cd $1
echo "Long listing of the `pwd` directory:"
echo
ls -l
If we run myll with a directory name, the script changes directory, echoes the message containing the full path name (notice the command substitution), and then executes the ls -l command.
Note that the cd in the myll program will change only the working directory of the script; it does not affect the working directory of the shell from which we run myll.
martyp $ myll /tmp
Long listing of the /tmp directory:
total 2384
-rw------- 1 root sys 265228 Feb 22 15:21 dtdbcache_:0
-rw-r--r-- 1 root sys 70829 Feb 23 10:44 g
-rw-r--r-- 1 root sys 13642 Feb 23 10:48 i
-rw-rw-rw- 1 root root 14071 May 24 06:10 license_log
-rwxr-xr-x 1 chrisb users 317 Apr 20 17:38 ls
-rw-rw-r-- 1 root sys 4441 Mar 25 14:37 mwaps7454
-rw-rw-rw- 1 anne users 4341 May 20 13:56 ps23974
-rw-r--r-- 1 rodt users 4218 Apr 14 11:17 ps3358
-rw-r--r-- 1 rodt users 4763 Feb 24 07:23 ps6465
-rw-rw-r-- 1 root sys 4446 Mar 25 14:31 ps7036
-rw-rw-r-- 1 root sys 4442 Mar 25 14:35 ps7138
-rw-rw-r-- 1 root sys 4446 Mar 25 14:35 ps7215
-rw-rw-r-- 1 root sys 4498 Mar 25 14:36 ps7342
-rw-rw-r-- 1 root sys 4446 Mar 25 14:38 ps7622
-rw-rw-r-- 1 root sys 4615 Mar 25 15:30 ps7812
-rw-rw-r-- 1 root sys 5728 Feb 18 11:09 ps_data
-rw-r--r-- 1 root sys 0 Apr 26 10:50 sh20890.1
-rw-r--r-- 1 root sys 0 Apr 26 10:50 sh20891.1
-rw-r--r-- 1 root sys 0 Apr 26 10:51 sh20978.1
-rw-r--r-- 1 root sys 0 Apr 26 10:51 sh20979.1
-rw-r--r-- 1 chrisb users 5325 Mar 26 13:42 sman_9664
-rw-rw-r-- 1 root sys 295996 Mar 1 10:15 ups_data
drwx------ 2 root other 69 Mar 9 11:37 whatis.11526
drwx------ 2 root other 69 Mar 9 11:37 whatis.11686
drwx------ 2 root other 69 Mar 9 11:38 whatis.11853
drwx------ 2 root other 69 Mar 9 11:38 whatis.12014
-rw-r--r-- 1 root sys 354221 Feb 23 10:49 x
-rw-r--r-- 1 chrisb users 0 Feb 23 14:39 xx
martyp $
In this case, we could give myll no argument and it would still work properly. If we don't provide any command-line arguments, then $1 will be null, so nothing goes on the command line after cd. This will make cd take us to our home directory and perform the ll there.
If we provide more than one argument, only the first is used and any others are ignored.
If we use a command-line argument, it must be a directory name; otherwise, the cd command fails and the script terminates with a “bad directory” error message. Later I will show how to test for valid directory names and file names so that you can work around potential errors.
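Two points about cd in myll can be tried out directly: a cd made in a child shell does not move the parent, and cd reports failure through its exit status. This is a minimal sketch; it assumes /tmp exists, the directory /no/such/directory is hypothetical and assumed not to exist, and it borrows the if command, which is covered later in this chapter.

```sh
#!/bin/sh
# Sketch: a cd made inside a child shell does not change the directory
# of the shell that started it. Command substitution runs its commands
# in a child shell, just as a script run by name gets its own process.
cd /
start=`pwd`
inside=`cd /tmp && pwd`      # the cd happens in a child process
after=`pwd`
echo "Started in: $start"
echo "Child moved to: $inside"
echo "Parent is still in: $after"

# cd returns a nonzero status for a bad name; the directory name below
# is hypothetical and assumed not to exist.
if cd /no/such/directory 2>/dev/null
then
    echo "cd worked"
else
    echo "cd failed, staying in `pwd`"
fi
```

Checking the exit status of cd in this way is one simple guard a script like myll could use before it goes on to produce a listing.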
A more complex example can be used to build new versions of the ps command. Below are two examples that use command line arguments and command substitution to help you with your process management.
The psg shell program in Figure 28-2 is handy for searching through what is typically a long process status listing to find only certain commands or user processes. These examples use grep. grep finds all lines that contain the pattern for which you are searching.
#!/usr/bin/sh
# Program name: psg
# Usage: psg some_pattern
#
# This program searches through a process status (ps -ef)
# listing for a pattern given as the first command-line
# argument.

procs=`ps -ef`                  # Get the process listing
head=`echo "$procs" | line`     # Take off the first line (the
                                # headings)
echo "$head"                    # Write out the headings
echo "$procs" | grep -i $1 | grep -v $0    # Write out lines
                # containing $1 but not this program's command line

# Note that $procs MUST be quoted or the newlines in the ps
# -ef listing will be turned into spaces when echoed. $head
# must also be quoted to preserve any extra white space.
Here's what psg looks like when it runs. In this example, we want to look at all the Korn shells running on the system.
martyp $ psg ksh
UID PID PPID C STIME TTY TIME CMD
root 2954 2936 0 Feb 22 ? 0:01 /bin/ksh /usr/dt/bin/Xsession
root 3002 2999 0 Feb 22 pts/2 0:01 -ksh -c unset DT; DISPLg
root 3067 1 0 Feb 22 ? 0:00 /bin/ksh /usr/dt/bin/sdtvolcheckm
jnola 11516 11514 0 May 11 pts/3 0:01 -ksh
martyp 29291 29289 0 09:30:04 pts/4 0:02 -ksh
This program also works to find the terminal, process ID, parent process ID, start date, and any other information from ps.
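The comment in psg about quoting $procs is worth seeing in action. Here is a minimal sketch; the three-line listing is made-up sample data, not real ps output, and the variable names are hypothetical.

```sh
#!/bin/sh
# Sketch: why psg quotes $procs. The three-line listing below is
# made-up sample data, not real ps output.
listing="UID PID CMD
root 100 /bin/ksh
martyp 200 -ksh"

quoted=`echo "$listing" | wc -l`      # newlines kept: three lines
unquoted=`echo $listing | wc -l`      # newlines become spaces: one line

echo "Quoted:" $quoted "line(s)"
echo "Unquoted:" $unquoted "line(s)"

# With the quotes in place, grep can pick out whole lines.
echo "$listing" | grep -i ksh
```

Without the quotes, the whole listing collapses into a single line, so grep would either print everything or nothing.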
As a user, you may start processes that you wish to stop. You may, for instance, start an application that does not come up on your display. You can identify the process with psg and then use the next program to stop the process, provided that you have the rights to stop the process.
The gkill shell program in Figure 28-3 searches through a ps -ef listing for a pattern (just like psg); then it kills all listed processes. The examples use the cut command, which allows you to specify a range of columns to retain.
#!/usr/bin/sh
# Program name: gkill
# Usage: gkill some_pattern
# This program will find all processes that contain the
# pattern specified as the first command line argument then
# kills those processes.

# get the process listing
procs=`ps -ef`

echo "The following processes will be killed:"

# Here we list the processes to kill. We don't kill this
# process
echo "$procs" | grep -i $1 | grep -v $0

# Allow the user a chance to cancel.
echo " Press Return to continue Ctrl-C to exit"

# If the user presses Ctrl-C the program will exit.
# Otherwise this read waits for the next return character and
# continues.
read junk

# find the pattern and cut out the pid field
pids=`echo "$procs" | grep -i $1 | grep -v $0 | cut -c9-15`

# kill the processes
kill $pids
If we don't provide any command-line arguments, grep issues an error and the program continues. In the next section, we will learn how to check if $1 is set and how to gracefully clean up if it's not.
Let's now start a process in the background, use the psg program to identify the process number, and then use gkill to stop the process:
martyp $ find / -name .c > cprogs 2>&1 &
[1] 29683
martyp $ psg find
UID PID PPID C STIME TTY TIME CMD
martyp 29683 29579 7 13:54:19 pts/4 0:02 find / -name .c
martyp $ gkill 29683
The following processes will be killed:
martyp 29683 29579 10 13:54:19 pts/4 0:03 find / -name .c
Press Return to continue Ctrl-C to exit
martyp $
Although the shell programs in this section were simple, they employed many important shell programming techniques. Let's now move to testing and branching, which are some of the most powerful aspects of shell programming.
Decision-making is one of the shell's most powerful features. You have two ways to check conditions and branch to a piece of code that can handle that condition: the if command and the case statement.
For example, you may want to ask the user a question and then check whether the answer was yes or no. You may also want to check whether a file exists before you operate on it. In either case, you can use the if command to accomplish the task. Here are a few shell script segments that explain each part of the if command:
echo "Continue? \c"
read ans
if [ "$ans" = "n" ]
then
    echo "Goodbye"
    exit
fi
The echo and read provide a prompt and response, as usual. The if statement executes the next command, and if it succeeds, it executes any commands between the then and the fi (if spelled backwards).
Note that the \c in the echo command suppresses the new line that echo normally generates. This action leaves the cursor on the same line, immediately after the "Continue? " prompt. This is commonly used when prompting for user input.
The test command is the most common command to use with the if command. The [ "$ans" = "n" ] is the test command. It performs many types of file, string, and numeric logical tests, and if the condition is true, the test succeeds.
The syntax of the test command requires spaces around the [ ] or you will get a syntax error when the program runs. Also notice the double quotes around the response variable $ans. These work around a strange anomaly with the test command. If the user presses only Return at the prompt without typing any other character, the value of $ans will be null. If we didn't have the quote marks around $ans in the test command, it would look like this when the value of $ans was substituted into the test command:
[ = "n" ]
This generates a “test: argument expected” error when you run the program. This is a very common mistake, and if you ever get this error, you should look for variables in your test commands with null values.
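Here is a minimal sketch of the difference the quotes make, using an empty answer. The variable quoted_result is hypothetical, and the exact wording of the error from the unquoted test varies from shell to shell, so it is discarded rather than shown.

```sh
#!/bin/sh
# Sketch: an empty answer with and without quotes in the test command.
ans=""

# Quoted: the test sees an empty string on the left and simply fails.
if [ "$ans" = "n" ]
then
    quoted_result="matched n"
else
    quoted_result="did not match n"
fi
echo "Quoted test: $quoted_result"     # prints: Quoted test: did not match n

# Unquoted: $ans vanishes, leaving [ = "n" ]. Depending on the shell
# this is reported as an error or evaluated in some other way, so its
# status is not something a script should rely on.
[ $ans = "n" ] 2>/dev/null
echo "Unquoted test exit status: $?"
```

The quoted form behaves the same way whether the answer is empty or not, which is why it is the safer habit.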
There is another form of the if command that is very common. It allows you to do one thing if a condition is met or do something else if not:
if [ ]    # if some condition is true
then
    # do something
else
    # otherwise do this
fi
There are many conditions that the test command can test. Table 28-1 shows some of the more common conditions for which you can test.
String tests: | |
[ "$a" = "string" ] | True if $a is equal to "string" |
[ "$a" != "string" ] | True if $a is NOT equal to "string" |
[ -z "$a" ] | True if $a is null (zero characters) |
[ -n "$a" ] | True if $a is NOT null |
Numeric tests: | |
[ $x -eq 1 ] | True if $x is equal to 1 |
[ $x -ne 1 ] | True if $x is NOT equal to 1 |
[ $x -lt 1 ] | True if $x is less than 1 |
[ $x -gt 1 ] | True if $x is greater than 1 |
[ $x -le 1 ] | True if $x is less than or equal to 1 |
[ $x -ge 1 ] | True if $x is greater than or equal to 1 |
File tests: | |
[ -d $file ] | True if $file is a directory |
[ -f $file ] | True if $file is a file |
[ -s $file ] | True if $file is a file with > 0 bytes |
[ -r $file ] | True if $file is readable |
[ -w $file ] | True if $file is writable |
[ -x $file ] | True if $file is executable |
Tests can be combined using -a to logically "AND" the tests together, -o to logically "OR" two tests, and ! to "negate" a test. For example, this test statement is true if the $interactive variable is set to TRUE or $file is a directory:
[ "$interactive" = "TRUE" -o -d $file ]
This will be used in some of the upcoming example programs.
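The three operators can be tried out together. The following is a minimal sketch; the variable values are made up, and / is used for $file because it is always a directory.

```sh
#!/bin/sh
# Sketch: combining tests with -o, -a, and !. The variable values are
# made up; / is used because it is always a directory.
interactive="FALSE"
file=/

if [ "$interactive" = "TRUE" -o -d "$file" ]
then
    echo "Either interactive is TRUE or $file is a directory"
fi

if [ -d "$file" -a -r "$file" ]
then
    echo "$file is a directory and is readable"
fi

if [ ! -f "$file" ]
then
    echo "$file is not a regular file"
fi
```

The first test succeeds even though interactive is FALSE, because only one side of an -o needs to be true.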
Here is a useful extension to the gkill program shown earlier. It checks to see that we have exactly one command-line argument before the program will attempt to do the processing. It uses a numeric test and the $# variable, which represents the number of command-line arguments. It should be inserted before any other lines of code in the gkill example given above:
# If we don't have exactly one command-line argument,
# write an error and exit.

if [ $# -ne 1 ]
then
    echo "Usage: $0 pattern"
    echo "Some pattern matching the processes to kill"
    echo "must be specified"
    exit 1      # Exit 1 terminates the program and tells the
                # calling shell that we had an error.
fi
Some other possible extensions to the gkill program might be to:
Allow the user to specify a signal to use with the kill command. For example:
gkill -9 ralph
would find all of ralph's processes and then kill them with kill -9.
Make sure that a valid message is printed if we can't find any processes to kill using the specified pattern.
This same type of command-line check is easily applied to the psg program to make sure that you have exactly one argument representing the pattern for which to search.
When you are reading user input, you may want to check if the user entered a value at all. If not, you would provide a reasonable default value. This is easily done with a variable modifier.
This example reads an answer (“ans”) from the user and then checks its value using an if command:
echo "Do you really want to remove all of your files? \c"
read ans
if [ "${ans:-n}" = y ]
then
    rm -rf *
fi
The ${ans:-n} statement checks the value of $ans. If there is a value in $ans, use it in the command line. If the user simply pressed Return at the prompt, $ans will be null. In this case, ${ans:-n} will evaluate to n when we do the comparison. Basically, in one small statement it says, "if the user did not provide an answer, assume they meant n."
There is another modifier that is often used:
${var:=default}
It returns the value of var if it is set; it returns the default if var is not set, and it will also assign the default as the value of var for future use.
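The difference between the two modifiers shows up when you look at the variable afterward. This is a minimal sketch; the variable editor and its default value vi are made up for illustration.

```sh
#!/bin/sh
# Sketch: ${var:-default} versus ${var:=default}. The variable names
# and values are hypothetical.
ans=""
echo "Using :- gives: ${ans:-n}"      # prints: Using :- gives: n
echo "ans is still: '$ans'"           # prints: ans is still: ''

editor=""
echo "Using := gives: ${editor:=vi}"  # prints: Using := gives: vi
echo "editor is now: '$editor'"       # prints: editor is now: 'vi'

ans=y
echo "With a value set: ${ans:-n}"    # prints: With a value set: y
```

Use the :- form when you only need a fallback for one command, and the := form when the rest of the script should see the default as well.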
The case statement is another way to make decisions and test conditions in shell programs. It is most commonly used to check for certain patterns in command-line arguments. For example, if you wanted to determine whether the first command-line argument is an option (starts with a -), the case statement is the easiest way to do so. The case statement is also used to respond to different user input (such as asking the user to select a choice from a menu).
The case statement is probably one of the most complicated shell commands because of its syntax:
case pattern_to_match in
    pattern1) cmdA
              cmdB
              ;;
    pattern2) cmdC
              ;;
    ...
    *)        cmdZ
              ;;
esac
pattern_to_match is usually a shell variable that you are testing (like a command-line argument or a user response). If pattern_to_match matches pattern1, then commands cmdA and cmdB are executed. The ;; separates this pattern's command list from the next pattern. In all cases, when ;; is reached, the program jumps to the esac (case spelled backwards).
If pattern_to_match matches pattern2, then cmdC is executed and we jump to esac, the end of the case statement.
The * is provided so that if pattern_to_match did not match anything else, it will execute cmdZ. It's important to have a default action to handle the case where the user types an invalid entry.
For more robust pattern matching, any file name generation characters ( *, [ ], ? ) can be used to do special pattern matches. There is also a very useful way to check for multiple patterns in one line using the | symbol, which means logical “OR”. Here's an example:
echo "Do you want to continue? (y/n) \c"
read ans
case $ans in
    y|Y) echo "Continuing"
         ...
         ;;
    n|N) echo "Done, Goodbye"
         exit
         ;;
    *)   echo "Invalid input"
         ;;
esac
Here is another example where we are testing to see whether $1 (the first command-line argument) is a valid option (a character we recognize that begins with a -):
case $1 in
    -l | -d) # Perform a listing
             echo "All files in $HOME: "
             ll -R $HOME | more
             ;;
    -i)      # -i means set the interactive flag to true
             interactive="TRUE"
             ;;
    *)       # Invalid input
             echo "$0: $1 is an invalid option"
             exit 1
             ;;
esac
A case statement is used in another example later in this chapter.
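To see how the file name generation characters and the | symbol behave inside case patterns, here is a minimal sketch. The function name classify and the sample words are hypothetical; the patterns are tried in order, and the first one that matches wins.

```sh
#!/bin/sh
# Sketch: classify a word with case patterns. The function name
# classify and the sample words are hypothetical.
classify() {
    case $1 in
        -*)           echo "option: $1" ;;
        [0-9]*)       echo "starts with a digit: $1" ;;
        y|Y|yes|YES)  echo "yes: $1" ;;
        ?)            echo "single character: $1" ;;
        *)            echo "other: $1" ;;
    esac
}

classify -l        # prints: option: -l
classify 42        # prints: starts with a digit: 42
classify YES       # prints: yes: YES
classify q         # prints: single character: q
classify hello     # prints: other: hello
```

Because the order matters, a lone y is reported as a yes answer rather than as a single character; swapping those two patterns would change the result.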
There are many times when you want to perform an action repeatedly. In the shell, there are two ways to do this: the for loop and the while loop.
The basic format of the for loop is:
for var in list_of_items
do
    cmdA
    cmdB
    cmdC
done
When the loop starts, the variable var has its value set to the first word in the list_of_items through which to loop. Then the three commands between the do and the done statements are executed. After the program reaches the done statement, it goes back to the top of the loop and assigns var to the next item in the list, executes the commands, and so on. After the last time through the loop, the program continues with the next executable statement after the done statement.
The list_of_items can be any list of words separated by white space. You can type the words or use variables or command substitution to build the list. For example, let's say that we want to copy a new .kshrc file into the home directory of several users. A for loop is the easiest way to do so:
for name in ralph norton alice edith archie
do
    echo $name
    cp /tmp/.kshrc.new /users/$name/.kshrc
done
This example can be extended to copy certain files to several machines using the rcp command and verify that they got there using the remsh command:
for host in neptune jupiter mars earth sun
do
    echo $host
    rcp /etc/passwd /etc/hosts $host:/etc
    rcp /.profile $host:/.profile
    remsh $host ll /etc/passwd /etc/hosts /.profile
done
You can also process lists of files in the current directory using command substitution to generate the list_of_items:
for file in `ls`
do
    if [ -r $file ]
    then
        echo "$file is readable"
    fi
done
Note that for file in * would have done the same thing.
If you have a large list of things you would like to loop through and you don't want to type them on the command line, you can enter them in a file instead. Then, using the cat command and command substitution, you can generate the list_of_items:
for i in `cat important_files`
do
    # do something with each of the files listed in the
    # important_files file.
    :    # placeholder command so that the loop body is not empty
done
The for loop, however, is most commonly used to process the list of command-line arguments ($*):
for name in $*
do
    if [ ! -f $name -a ! -d $name ]
    then
        echo "$name is not a valid file or directory name"
    else
        :    # do something with the file or directory
    fi
done
The upcoming trash program contains a for loop that processes command-line arguments in a similar way.
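One caution when looping over $*: an argument that contains a space is split into separate words. The shell also provides "$@", which is not used in this chapter's examples but keeps each argument intact. The following is a minimal sketch of the difference; set -- stands in for a real command line, and the file names are made up.

```sh
#!/bin/sh
# Sketch: $* versus "$@" when an argument contains a space. set --
# stands in for a real command line; the file names are made up.
set -- "my file" notes.txt

count_star=0
for name in $*
do
    count_star=`expr $count_star + 1`
done

count_at=0
for name in "$@"
do
    count_at=`expr $count_at + 1`
done

echo "Looping over \$* saw $count_star words"          # 3 words
echo "Looping over \"\$@\" saw $count_at arguments"    # 2 arguments
```

For arguments without spaces, the two loops behave identically.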
The while loop has the following format:
while cmd1
do
    cmdA
    cmdB
    cmdC
done
cmd1 is executed first. If it executes successfully, then the commands between the do and the done statements are executed. cmd1 is then executed again; if successful, the commands in the loop are executed again; and so on. When cmd1 fails, the program jumps past the done statement and resumes execution with the next executable statement.
Most of the time, the command executed in place of cmd1 is the test command. You can then perform logical tests as described in the if section. If the test succeeds (is true), the commands in the loop are executed and the script tests the condition again. The while loop is useful if you have a fixed number of times you want the loop to run or if you want something to happen until some condition is met.
This program uses netstat to display the statistics for the primary LAN interface ten times, once every 30 seconds. The interface is le0 here; it may be lan0 or some other name depending on the UNIX variant you are using:
#!/bin/ksh
i=1
while [ $i -le 10 ]
do
    print $i
    netstat -i | grep le0
    sleep 30
    let i=i+1
done
We increment i in a very simple fashion by adding one to it each time through the loop. We evaluate the value of i every time through the loop to determine whether it is less than or equal to (le) ten. If so, we run netstat -i.
Before we run this program, called net1, let's run the netstat -i command to view its output, and then run net1. Please keep in mind that the netstat command varies greatly among UNIX variants, so your output may look very different.
martyp $ netstat -i
Name Mtu  Net/Dest Address   Ipkts     Ierrs Opkts    Oerrs Collis  Queue
lo0  8232 loopback localhost 18927030  0     18927030 0     0       0
le0  1500 sunsys   sunsys    310417964 0     17193381 52064 7573173
sunsys:/home/martyp/shellprogs
martyp $ net1
1
le0  1500 sunsys   sunsys    310418018 0     17193388 52064 7573173
2
le0  1500 sunsys   sunsys    310418738 0     17193395 52064 7573173
3
le0  1500 sunsys   sunsys    310419579 0     17193401 52064 7573173
4
le0  1500 sunsys   sunsys    310420295 0     17193405 52064 7573173
5
le0  1500 sunsys   sunsys    310421099 0     17193446 52064 7573173
6
le0  1500 sunsys   sunsys    310421786 0     17193455 52064 7573173
7
le0  1500 sunsys   sunsys    310422425 0     17193462 52064 7573173
8
le0  1500 sunsys   sunsys    310423089 0     17193467 52064 7573173
9
le0  1500 sunsys   sunsys    310423749 0     17193471 52064 7573173
10
le0  1500 sunsys   sunsys    310424507 0     17193478 52064 7573173
sunsys:/home/martyp/shellprogs
martyp $
net1 produces just the output we want for le0 every 30 seconds on our system, called sunsys. We also print out the value of i each time through the loop just so that we can see that we are incrementing it properly.
On HP-UX 11i systems the netstat command was modified to provide less information. The lanadmin command can be used to supply detailed information on collisions and errors. The following script runs lanscan to automatically produce a list of network interface cards on an 11i system and then produces a lanadmin output for each card.
#!/bin/sh
#@(#) $Header: $
#@(#) Description: Script to create network stats output with lanadmin
#@(#) $Source: $
#@(#) $Locker: $
#@(#) $Log: $
#@(#)
# INPUT:
# OUTPUT:
# FILES ACCESSED:
# NOTES:
# AUTHOR:
# REVISION HISTORY:

# Setup path
PATH=/bin:/usr/bin:/usr/sbin/:/usr/local/bin/;export PATH
#
# Determine OS Revision level
# run lanscan to determine network cards
#
OS_VERSION=$(uname -r | awk -F. '{print $2}')
if [ $OS_VERSION -lt 10 ]
then
    COMMAND1=/bin/landiag
else
    COMMAND1=/usr/sbin/lanadmin
fi

if [ -x $COMMAND1 ]
then
    COMMAND2=/usr/sbin/lanscan
    if [ -x $COMMAND2 ]
    then
        if [ $OS_VERSION -lt 10 ]
        then
            for CARD in `ls -1 /dev/lan* | awk -F "/" '{print $3}'`
            do
                echo ""
                $COMMAND1 -t <<- EOF 2> /dev/null | egrep "="
lan
name $CARD
display
quit
EOF
            done
        #
        # use field 3 of lanscan output to obtain card instance used for ppa below
        #
        else
            for NMID in `$COMMAND2 | grep 0x | awk '{print $3}'`
            do
                echo ""
                $COMMAND1 -t <<- EOF 2> /dev/null | egrep "="
lan
ppa $NMID
display
quit
EOF
            done
        fi
    else
        print
        print
        print "The command "$COMMAND2" is not executable"
        print "No data from "$COMMAND1" will be collected"
        print
        print
    fi
else
    print
    print
    print "The command "$COMMAND1" is not executable"
    print
    print
fi
Field three of the lanscan output supplies the ppa, which is the card instance for which lanadmin output will be produced. The following is the lanscan output and the result of running the script on a system with five network cards that are ppa 0 through 4:
# lanscan
Hardware    Station        Crd Hdw   Net-Interface NM MAC   HP-DLPI DLPI
Path        Address        In# State NamePPA       ID Type  Support Mjr#
0/0/0/0     0x001083FEDCB7 0   UP    lan0 snap0    1  ETHER Yes     119
0/4/0/0/4/0 0x001083F72ED0 1   UP    lan1 snap1    2  ETHER Yes     119
0/4/0/0/5/0 0x001083F72E9B 2   UP    lan2 snap2    3  ETHER Yes     119
0/4/0/0/6/0 0x001083F77E22 3   UP    lan3 snap3    4  ETHER Yes     119
0/4/0/0/7/0 0x001083F71E59 4   UP    lan4 snap4    5  ETHER Yes     119
# networkscript.sh

PPA Number = 0
Description = lan0 Hewlett-Packard 10/100 TX Half-Duplex TT = 1500
Type (value) = ethernet-csmacd(6)
MTU Size = 1500
Speed = 10000000
Station Address = 0x1083fedcb7
Administration Status (value) = up(1)
Operation Status (value) = down(2)
Last Change = 10440
Inbound Octets = 0
Inbound Unicast Packets = 0
Inbound Non-Unicast Packets = 0
Inbound Discards = 0
Inbound Errors = 0
Inbound Unknown Protocols = 0
Outbound Octets = 820
Outbound Unicast Packets = 20
Outbound Non-Unicast Packets = 0
Outbound Discards = 0
Outbound Errors = 0
Outbound Queue Length = 0
Specific = 655367
Index = 1
Alignment Errors = 0
FCS Errors = 0
Single Collision Frames = 0
Multiple Collision Frames = 0
Deferred Transmissions = 0
Late Collisions = 0
Excessive Collisions = 0
Internal MAC Transmit Errors = 0
Carrier Sense Errors = 0
Frames Too Long = 0
Internal MAC Receive Errors = 0

PPA Number = 1
Description = lan1 Hewlett-Packard 10/100 TX Full-Duplex TT = 1500
Type (value) = ethernet-csmacd(6)
MTU Size = 1500
Speed = 100000000
Station Address = 0x1083f72ed0
Administration Status (value) = up(1)
Operation Status (value) = up(1)
Last Change = 11018
Inbound Octets = 2778542151
Inbound Unicast Packets = 10008640
Inbound Non-Unicast Packets = 14480929
Inbound Discards = 0
Inbound Errors = 0
Inbound Unknown Protocols = 12443000
Outbound Octets = 3811379313
Outbound Unicast Packets = 18378160
Outbound Non-Unicast Packets = 50019
Outbound Discards = 0
Outbound Errors = 0
Outbound Queue Length = 0
Specific = 655367
Index = 2
Alignment Errors = 0
FCS Errors = 0
Single Collision Frames = 0
Multiple Collision Frames = 0
Deferred Transmissions = 0
Late Collisions = 0
Excessive Collisions = 0
Internal MAC Transmit Errors = 0
Carrier Sense Errors = 0
Frames Too Long = 0
Internal MAC Receive Errors = 0

PPA Number = 2
Description = lan2 Hewlett-Packard 10/100 TX Full-Duplex TT = 1500
Type (value) = ethernet-csmacd(6)
MTU Size = 1500
Speed = 100000000
Station Address = 0x1083f72e9b
Administration Status (value) = up(1)
Operation Status (value) = up(1)
Last Change = 12223
Inbound Octets = 1053052283
Inbound Unicast Packets = 660
Inbound Non-Unicast Packets = 14478087
Inbound Discards = 0
Inbound Errors = 0
Inbound Unknown Protocols = 12442909
Outbound Octets = 5802138
Outbound Unicast Packets = 43576
Outbound Non-Unicast Packets = 43065
Outbound Discards = 0
Outbound Errors = 0
Outbound Queue Length = 0
Specific = 655367
Index = 3
Alignment Errors = 0
FCS Errors = 0
Single Collision Frames = 0
Multiple Collision Frames = 0
Deferred Transmissions = 0
Late Collisions = 0
Excessive Collisions = 0
Internal MAC Transmit Errors = 0
Carrier Sense Errors = 0
Frames Too Long = 0
Internal MAC Receive Errors = 0

PPA Number = 3
Description = lan3 Hewlett-Packard 10/100 TX Full-Duplex TT = 1500
Type (value) = ethernet-csmacd(6)
MTU Size = 1500
Speed = 100000000
Station Address = 0x1083f77e22
Administration Status (value) = up(1)
Operation Status (value) = up(1)
Last Change = 13428
Inbound Octets = 943616591
Inbound Unicast Packets = 9064639
Inbound Non-Unicast Packets = 765175
Inbound Discards = 0
Inbound Errors = 0
Inbound Unknown Protocols = 39
Outbound Octets = 6454687
Outbound Unicast Packets = 58769
Outbound Non-Unicast Packets = 43040
Outbound Discards = 0
Outbound Errors = 0
Outbound Queue Length = 0
Specific = 655367
Index = 4
Alignment Errors = 0
FCS Errors = 0
Single Collision Frames = 0
Multiple Collision Frames = 0
Deferred Transmissions = 0
Late Collisions = 0
Excessive Collisions = 0
Internal MAC Transmit Errors = 0
Carrier Sense Errors = 0
Frames Too Long = 0
Internal MAC Receive Errors = 0

PPA Number = 4
Description = lan4 Hewlett-Packard 10/100 TX Full-Duplex TT = 1500
Type (value) = ethernet-csmacd(6)
MTU Size = 1500
Speed = 100000000
Station Address = 0x1083f71e59
Administration Status (value) = up(1)
Operation Status (value) = up(1)
Last Change = 14633
Inbound Octets = 249984023
Inbound Unicast Packets = 2628160
Inbound Non-Unicast Packets = 765196
Inbound Discards = 0
Inbound Errors = 0
Inbound Unknown Protocols = 49
Outbound Octets = 3886863362
Outbound Unicast Packets = 10894938
Outbound Non-Unicast Packets = 425625
Outbound Discards = 0
Outbound Errors = 0
Outbound Queue Length = 0
Specific = 655367
Index = 5
Alignment Errors = 0
FCS Errors = 0
Single Collision Frames = 0
Multiple Collision Frames = 0
Deferred Transmissions = 0
Late Collisions = 0
Excessive Collisions = 0
Internal MAC Transmit Errors = 0
Carrier Sense Errors = 0
Frames Too Long = 0
Internal MAC Receive Errors = 0
Immediately before each lanadmin output, the ppa number is printed so that we know the network interface card for which the lanadmin output is produced.
For all five network cards in our example, there are no collisions or errors, as shown near the end of each lanadmin output. The point, however, is that lanadmin does report these fields.
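To see how the script turns lanscan output into a list of card instances, here is a sketch that runs the same grep and awk pipeline against two rows copied from the listing above. The functions sample_lanscan and card_instances are stand-ins I made up so that the example runs on a system without lanscan.

```shell
#!/bin/sh
# sample_lanscan is a hypothetical stand-in for the real lanscan command.
# It prints one header line and two card rows taken from the listing above.
sample_lanscan() {
    cat <<'EOF'
Path Address In# State NamePPA ID Type Support Mjr#
0/0/0/0 0x001083FEDCB7 0 UP lan0 snap0 1 ETHER Yes 119
0/4/0/0/4/0 0x001083F72ED0 1 UP lan1 snap1 2 ETHER Yes 119
EOF
}

# grep 0x keeps only the card rows (each has a 0x station address);
# awk then prints field three, the card instance.
card_instances() {
    sample_lanscan | grep 0x | awk '{print $3}'
}

for NMID in `card_instances`
do
    echo "card instance (ppa): $NMID"
done
```

The header line contains no 0x string, so grep drops it before awk ever sees it.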
The while loop can also be used to process command-line arguments one at a time, using the number of command-line arguments and the shift command:
while [ $# -ne 0 ]
do
    case $1 in
    -*) # $1 must be an option because it starts with -
        # Add it to the list of options:
        opts="$opts $1"
        ;;
    *)  # $1 must be an argument. Add it to the list of
        # command-line arguments:
        args="$args $1"
        ;;
    esac
    shift
done
The shift command shifts the remaining arguments in $* to the left by one position. What was the first argument ($1) is now gone forever; what was in $2 is now in $1, and so on. In the process, $# is decremented to accurately reflect the number of arguments left in $*.
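A quick way to watch shift at work is to print $# and $1 before each shift. This is only a sketch; the function name show_shift is hypothetical.

```shell
#!/bin/sh
# show_shift is a hypothetical helper that prints the argument count and
# the first argument before every shift.
show_shift() {
    while [ $# -ne 0 ]
    do
        echo "$# left, first is $1"
        shift
    done
    echo "$# left"
}

show_shift -i junk core
```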
You may want some commands to run until the user stops the program or until some stop condition is met. An infinite while loop is the best way to do so. For example, let's say that we are prompting users for some input and we will continue to prompt them until they give us valid input:
while true
do
    # prompt users and get their response
    echo "Enter yes or no: \c"
    read ans
    # Check whether the response is valid
    if [ "$ans" = "yes" -o "$ans" = "no" ]
    then
        # If it is valid, stop the looping
        break
    else
        # Otherwise print an error message and try it again
        # from the top of the loop
        echo "Invalid input, try again!\n"
    fi
done
# Now that we have valid input, we can process the user's
# request
.
.
.
true is a special command that always executes successfully. The loop does not terminate until the user stops the program by killing it or a break command is executed inside the loop.
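The same pattern can be tried without typing at a terminal by piping the answers in. The sketch below is a variation on the loop above, with two assumptions of my own: prompts go to standard error so that only the accepted answer appears on standard output, and the function gives up if the input runs out. The name ask_yes_no is hypothetical.

```shell
#!/bin/sh
# ask_yes_no is a hypothetical helper. It loops until it reads yes or no,
# then prints the accepted answer on standard output.
ask_yes_no() {
    while true
    do
        echo "Enter yes or no:" >&2
        # Stop if the input runs out; otherwise the loop would never end.
        read -r ans || return 1
        if [ "$ans" = "yes" ] || [ "$ans" = "no" ]
        then
            break
        fi
        echo "Invalid input, try again!" >&2
    done
    echo "$ans"
}

# Simulate a user who gives two bad answers before a valid one.
printf 'maybe\nY\nno\n' | ask_yes_no
```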
As you write shell programs, you will notice that certain sets of commands appear in many places within a program. For example, several times in a script, you may check user input and issue an appropriate message if the input is invalid. It can be tedious to type the same lines of code in your program numerous times, and it can be a nuisance if you later want to change these lines.
Instead, you can put these commands into a shell function. Functions look and act like a new command that can be used inside the script. Here's an example of a basic shell function:
# This is a function that may be called from anywhere within
# the program. It displays a standard usage error message
# and then exits the program.
print_usage()
{
    echo "Usage:"
    echo "To trash files: $0 [-i] files_to_trash..."
    echo "Display trashed files: $0 -d"
    echo "Remove all trashed files: $0 -rm"
    echo "Print this message: $0 -help"
    exit 1
}
print_usage is now a new command in your shell program. You can use it anywhere in this script.
Shell functions also have their own set of positional parameters ($1-$9, $#, and $*), so you can pass them arguments just like any other command. The only nuance is that $0 represents the name of the shell program, not the name of the function.
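Here is a small sketch of a function with its own positional parameters. The function name describe_args and the sample arguments are made up for illustration; set -- is used only to give the script some arguments of its own without having to type them on the command line.

```shell
#!/bin/sh
# describe_args is a hypothetical function used only for illustration.
describe_args() {
    echo "function received $# argument(s); first is: $1"
}

# Give the script itself three positional parameters...
set -- alpha beta gamma
echo "script has $# argument(s); first is: $1"

# ...then call the function with two different ones.
describe_args one two

# The script's own parameters are unchanged after the call.
echo "script still has $# argument(s); first is: $1"
```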
Earlier, we talked about arguments. When you type the name of the shell script, you can supply arguments that are saved in the variables $1 through $9. The first ten words on the command line are directly accessible in the shell program using the special variables $0 - $9. The following shows how they work:
$0 | The command name |
$1 | The first argument |
$2 | The second argument |
$3 | . |
. | |
. | |
$9 | The ninth argument |
If you are not sure how many command-line arguments you may get when your program is run, there are two other variables that can help:
$# | The number of command-line arguments |
$* | A space-separated list of all of the command-line arguments (which does not include the command name). |
The variable $* is commonly used with a for loop to process shell script command lines with any number of arguments.
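As a minimal sketch of $# and $* working together, the hypothetical function below reports how many arguments it received and then walks through them with a for loop. A shell script sees its command-line arguments the same way.

```shell
#!/bin/sh
# list_args is a hypothetical helper that shows $# and $* in action.
list_args() {
    echo "number of arguments: $#"
    echo "all arguments: $*"
    for arg in $*
    do
        echo "processing $arg"
    done
}

list_args red green blue
```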
Figure 28-4 is a fairly complex program that exercises all the concepts we have covered so far. It is a trash program that removes files from their original locations. Instead of removing them permanently, it places them in a trash can in your home directory. This is a fairly robust program, but I'm sure that you can think of many extensions as you read through it.
#!/bin/ksh
# for Bourne use /bin/sh

# Program name: trash
# Usage:
#   To trash files: trash [-i] file_names_to_trash ...
#   Display trashed files: trash -d
#   Remove all trashed files: trash -rm
#   Print a help message: trash -help

# This program takes any number of directory or file name
# arguments. If the argument is a file it will be removed
# from its current place in the file system and placed in the
# user's trash directory ($HOME/.trash). If the argument is a
# directory name the program will ask if the user really
# wants to trash the whole directory.
#
# This program also takes an -i (interactive) option. Like
# the rm command, if the -i is the first argument on the
# command line, the program stops and asks if each file
# named in the remaining arguments should be trashed.
#
# The -d (display) option shows the contents of the
# user's trashed files.
#
# The -help option displays a usage message for the user.

# The -rm (remove) option permanently removes every file
# and directory in the trash directory.
#
# The -help, -d and -rm options may not be used with
# any other command line arguments.

# Possible extensions:
# - Enhance the -rm option to remove a list of files
#   from the trash directory from the command line.
# - Create a program to be run by cron once nightly to empty
#   everyone's trash directory.

# This is a function that may be called from anywhere within
# the program. It displays a standard usage error message
# then exits the program.
print_usage()
{
    echo "Usage:"
    echo "To trash files: $0 [-i] file_names_to_trash ..."
    echo "Display trashed files: $0 -d"
    echo "Remove all trashed files: $0 -rm"
    echo "Print this message: $0 -help"
    exit 1
}

# Make sure we have at least one command-line argument before
# we start.
if [ $# -lt 1 ]
then
    print_usage
fi

# If this flag is true then we need to do interactive
# processing.
interactive="FALSE"

# This is the name of the trash can.
trash_dir="$HOME/.trash"

# Make sure the trash directory exists before we go any
# further.
if [ ! -d $trash_dir ]
then
    mkdir $trash_dir
fi

# Sort out the command-line arguments.
case $1 in
-help) # Print a help message.
    print_usage
    ;;
-d | -rm) # a -d or -rm were given
    # If it was not the only command-line argument
    # then display a usage message and then exit.
    if [ $# -ne 1 ]
    then
        print_usage
    fi
    # Otherwise do the task requested.
    if [ "$1" = "-d" ]
    then
        echo "The contents of $trash_dir: "
        ls -l -R $trash_dir | more
    else
        # remove all files from the trash directory
        rm -rf $trash_dir/*
        # get any dotfiles too
        rm -rf $trash_dir/.[!.]*
    fi
    # Now we can exit successfully.
    exit 0
    ;;
-i) # If the first argument is -i ask about each file as it
    # is processed.
    interactive="TRUE"
    # Take -i off the command line so we know that the
    # rest of the arguments are file or directory names.
    shift
    ;;
-*) # Check for an option we don't understand.
    echo "$1 is not a recognized option."
    print_usage
    ;;
esac

# Just for fun we'll keep a count of the files that were
# trashed.
count=0

for file in $*
do
    # First make sure the file or directory to be renamed exists.
    # If it doesn't, add it to a list of bad files to be written
    # out later. Otherwise process it.
    if [ ! -f $file -a ! -d $file ]
    then
        bad_files="$bad_files $file"
    else
        # If we are in interactive mode ask for confirmation
        # on each file. Otherwise ask about directories.
        if [ "$interactive" = "TRUE" -o -d $file ]
        then
            # Ask the user for confirmation (default answer is no).
            if [ -d $file ]
            then
                echo "Do you want to trash the dir $file ? (y/n) n\b\c"
            else
                echo "Do you really want to trash $file ? (y/n) n\b\c"
            fi
            read doit
            # If the user answered y then do the move.
            # Otherwise print a message that the file was not touched.
            if [ "${doit:-n}" = y ]
            then
                mv -i $file $trash_dir
                echo "$file was trashed to $trash_dir"
                let count=count+1
                # for Bourne use: count=`expr $count + 1`
            else
                echo "$file was not trashed"
            fi
        else
            # We are not in interactive mode, so just do it.
            mv -i $file $trash_dir
            let count=count+1
            # for Bourne use: count=`expr $count + 1`
        fi
    fi
done

echo "$0: trashed $count item(s)"

if [ -n "$bad_files" ]
then
    echo "The following name(s) do not exist and \c"
    echo "could not be trashed:"
    echo "$bad_files"
fi
exit 0
Let's now run the trash program. The first example below issues just the program name, trash; the next issues trash -help; the next issues trash -i junk to interactively remove the file junk; and the last issues trash -d to display the files that have been removed with trash and are in the /home/martyp/.trash directory:
martyp $ trash
Usage:
To trash files: trash [-i] file_names_to_trash ...
Display trashed files: trash -d
Remove all trashed files: trash -rm
Print this message: trash -help
martyp $ trash -help
Usage:
To trash files: trash [-i] file_names_to_trash ...
Display trashed files: trash -d
Remove all trashed files: trash -rm
Print this message: trash -help
martyp $ trash -i junk
Do you really want to trash junk ? (y/n) y
mv: overwrite /home/martyp/.trash/junk (yes/no)? yes
junk was trashed to /home/martyp/.trash
trash: trashed 1 item(s)
martyp $ trash -d
The contents of /home/martyp/.trash:
/home/martyp/.trash:
total 1364
-rw-------   1 martyp   staff   684808 May 30 05:31 core
-rwxrwxr-x   1 martyp   staff      631 May 30 06:45 file1
-rwxrwxr-x   1 martyp   staff       45 May 31 06:04 junk
martyp $
Notice that when we removed the file junk, the mv -i command inside trash asked us whether we wanted to overwrite a file by the same name that had earlier been removed with trash and placed in the /home/martyp/.trash directory.
This program employs every concept we have covered in this shell programming chapter so far. You may want to take a close look at this program so that you can use these techniques in the programs you craft. I have also included comments for the lines that need to be changed to make this program work in the Bourne shell. The KornShell and Bourne shell are very similar, so you can use most of the same techniques when writing programs for these two shells.
awk is a very powerful symbolic programming language and data manipulation tool.
Data manipulation tasks that would be very complex with combinations of grep, cut, and paste are very easily done with awk. Because awk is a programming language, it can also perform mathematical operations or check the input very easily, a task that is normally difficult with shells. It can even perform floating-point math.
The basic form of an awk program looks like this:
awk '/pattern_to_match/ { program to run }' input_file_names
Notice that the entire program is enclosed in single quotes. If no input file names are specified, awk reads from standard input (as from a pipe).
The pattern_to_match must appear between the / (slash) characters. The pattern is actually called a regular expression. Some common regular expression examples are shown shortly.
The program to execute is written in awk code, which looks something like C. The program is executed whenever a line of input matches the pattern_to_match. If /pattern_to_match/ does not precede the program in { }, then the program is executed for every line of input.
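To make the difference concrete, here is a small sketch that runs the same awk program with and without a pattern. The two sample lines are modeled on the netstat -i listing earlier in this section, and the function name sample_lines is hypothetical.

```shell
#!/bin/sh
# sample_lines is a hypothetical helper that prints two lines of input.
sample_lines() {
    printf 'lo0 8232 loopback\nle0 1500 sunsys\n'
}

# With a pattern, the program runs only for lines that match le0.
echo "with /le0/:"
sample_lines | awk '/le0/ { print }'

# Without a pattern, the program runs for every line of input.
echo "with no pattern:"
sample_lines | awk '{ print }'
```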
awk works with fields of the input lines. Fields are words separated by white space. The fields in awk patterns and programs are referenced with $, followed by the field number. For example, the second field of an input line is $2. If you are using an awk command in your shell programs, the fields ($1, $2, etc.) are not confused with the shell script's positional parameters, because the awk variables are enclosed in single quotes and the shell ignores them.
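The following sketch shows the two meanings of $2 side by side. Inside the hypothetical function second_field, the shell's $2 is the second argument passed to the function, while the $2 inside the single-quoted awk program is the second field of each input line. The sample data reuses file names and sizes from the trash -d listing.

```shell
#!/bin/sh
# second_field is a hypothetical function used only for illustration.
second_field() {
    # Here the shell expands $2: the function's second argument.
    echo "shell \$2 is: $2"
    # Inside single quotes the shell leaves $2 alone, so awk sees it
    # and treats it as the second field of each input line.
    printf 'core 684808\nfile1 631\njunk 45\n' |
        awk '{ print "awk $2 is: " $2 }'
}

second_field first second
```

Note that the $2 inside the awk string literal is printed as is, because awk does not expand variables inside double-quoted strings.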
But I don't want to just talk about it. Let's take a look at some examples.
This simple example lists just the terminals that are active on your system. The terminal name is the second field of a who listing:
who | awk '{ print $2 }'
Here is an example of running who and then running this one-line command:
martyp $ who
thomfsu    console      Feb 22 15:21    (:0)
martyp     pts/3        May 31 06:03    (atlm0216.atl.hp.com)
martyp $ who | awk '{print $2}'
console
pts/3
martyp $
This output shows only the active terminals on the system.
Note that cut could have done this also, but you would have had to know exactly which columns the terminal name occupied in the who output, as shown below:
martyp $ who
thomfsu    console      Feb 22 15:21    (:0)
martyp     pts/3        May 31 06:03    (atlm0216.atl.hp.com)
martyp $ who | cut -c12-20
console
pts/3
martyp $
If the user or terminal name is longer than normal in any line, this command does not work. The awk example will work because it looks at fields, not columns.
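You can see the difference with a sketch like the one below. The two who-style lines are made-up sample data, and the second one has an unusually long user name that pushes the terminal name out of columns 12 through 20. The function name sample_who is hypothetical.

```shell
#!/bin/sh
# sample_who is a hypothetical stand-in for who. The second line has a
# user name long enough to shift the terminal name to the right.
sample_who() {
    printf '%s\n' \
        'martyp     pts/3        May 31 06:03' \
        'averylongusername pts/4        May 31 06:10'
}

# cut picks fixed columns, so the second line comes out wrong.
echo "cut by column:"
sample_who | cut -c12-20

# awk picks the second field wherever it happens to start.
echo "awk by field:"
sample_who | awk '{ print $2 }'
```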
Some trivia to wow your friends with at your next cocktail party: awk takes its name from the first letters of the last names of its authors, Alfred Aho, Peter Weinberger, and Brian Kernighan.