Chapter 10. Handling Multiple Processes

In this chapter, I will describe how to build scripts that communicate with multiple processes. Using multiple processes, you can build scripts that do much more than simple automation. For instance, you can connect programs together or borrow the facilities of one to enhance those of another. You can also do it transparently so that it seems like a single program to anyone running the script.

The spawn_id Variable

In the following script, two processes are spawned. The first is bc, an arbitrary precision arithmetic interpreter. The second is a shell. By default, send and expect communicate with the most recently spawned process. In this case, the following expect reads from the shell because it was spawned after bc.

spawn bc

spawn /bin/sh
expect $prompt           ;# communicate with /bin/sh

Why is this? When a spawn command is executed, the variable spawn_id is set to an identifier that refers to the process. The spawn_id variable is examined each time send and expect are called. send and expect know how to access the process by using the value in spawn_id.

If another process is spawned, spawn_id is automatically set to an identifier referring to the new process. From that point on, send and expect communicate with the new process. In this example, "spawn bc" stored an identifier in spawn_id, but "spawn /bin/sh" replaced it with an identifier referring to the shell. The expect command that follows therefore communicates with the shell.

It is possible to communicate with the old process by setting spawn_id back to the identifier for that process. spawn_id is not special in this regard. It is read or written using the same commands that access other variables. For example:

spawn bc
set bc_spawn_id $spawn_id      ;# save bc's spawn id

spawn /bin/sh
set shell_spawn_id $spawn_id   ;# save shell's spawn id

set spawn_id $bc_spawn_id      ;# talk to bc
send "scale=50\r"

Clearly, the value of spawn_id is very important. Indeed, the process whose identifier is stored in spawn_id is known as the currently spawned process. In the script above, bc is initially the currently spawned process, and then /bin/sh becomes the currently spawned process. When spawn_id is reset by an explicit set command, bc once again becomes the currently spawned process.

The UNIX program bc and the related program dc are not the only candidates, but they are very useful to have spawned while other programs are running. Both are capable of arbitrary precision arithmetic. For example, suppose you are interacting with a process that requires you to multiply some very large numbers together but does not itself support doing so. Just change to the spawn id for bc and get the answer through an interaction like this:

send "1234567897293847923*234229384318401298334234874\r"
expect -re "\n(.*)\r\n"

Here is an interaction with dc to change a decimal number to the oddball base of 6.

send "1928379182379871\r6op\r"
expect -re "\n.*\n(.*)\r\n"

Both of these leave the result in expect_out(1,string).

Example—chess Versus chess

Very useful results can be produced by communicating with multiple processes. A simple but amusing example is the problem of having one chess process play a second chess process. In order to accomplish this, the standard output of one process must be connected to the standard input of another, and vice versa.

Figure 10-1. 

As an Expect script, the basic idea might be implemented this way:

set timeout -1

spawn chess                  ;# start player one
set chess1 $spawn_id

spawn chess                  ;# start player two
set chess2 $spawn_id

while 1 {
    expect -re "(.*)\n"        ;# read move
    set spawn_id $chess1
    send $expect_out(1,string) ;# send move to other player

    expect -re "(.*)\n"        ;# read response
    set spawn_id $chess2
    send $expect_out(1,string) ;# send back
}

The first four lines start two chess processes and save the respective spawn ids. Then the script loops. The loop starts by reading a move from the first process. spawn_id is changed to the second process, and the move is sent there. The response is collected, spawn_id is set back to the original chess process, and the response is sent back to the first process. The loop repeats, allowing moves to go back and forth.

Alas, the UNIX chess program was not intended to read its own output, so the output has to be massaged a little before being used as input.[40] Oddly, the program prints out moves differently depending on if it goes first or second. If the program goes first, its own moves look like this:

1. n/kn1-kb3

But if the program goes second, its own moves have an extra "..." in them and look like this:

1. ... n/qn1-qb3

Pairs of moves are numbered from 1 on up. The "1." is the move number and has to be ignored. The program also echoes the opponent’s moves. Indeed, they are echoed twice—once when they are entered, and then once again prefixed by a move number. Here is what this looks like to the person who moves first:

p/k2-k4                            echo as first player types move
1. p/k2-k4                         chess program reprints it
1. ... p/qb2-qb3                   chess program prints new move

Following is a command that matches the new move, leaving it in expect_out(1,string). Notice that the literal periods are prefaced with backslashes since they would otherwise match any character:

expect -re "\\.\\.\\. (.*)\n"

To the person who moves second, the interaction looks like this:

p/k2-k4                            echo as second player types move
1. ... p/k2-k4                     chess process reprints it
2. p/q2-q4                         chess process prints new move

In this case, the new move is matched slightly differently:

expect -re "\\.\\.\\. .*\\. (.*)\n"

The patterns are straightforward; however, the chess processes must be started differently so that one moves first while the other waits to hear a move. The script sends the string "first" to one of the processes to make it move first. Before doing this, of course, the script waits until the process acknowledges that it is listening, which the process does by printing "Chess" followed by a newline. Here is what that looks like:

Chess                                chess process says it is ready
first                                type this to get process to move first
1. p/k2-k4                                chess process prints first move

Once the first move has been read, it is possible to loop, handling moves the same way each time. Here is the code to start both processes. The first process is told to move first. The second process moves second.

set timeout -1

spawn chess                 ;# start player one
set id1 $spawn_id
expect "Chess\r\n"
send "first\r"              ;# force it to go first
expect -re "1\\. (.*)\n"    ;# read first move

spawn chess                 ;# start player two
set id2 $spawn_id
expect "Chess\r\n"

Now the loop can be expressed more parametrically:

while 1 {
    send $expect_out(1,string)
    expect -re "\\.\\. (.*)\n"
    set spawn_id $id1

    send $expect_out(1,string)
    expect -re "\\.\\. .*\\. (.*)\n"
    set spawn_id $id2
}

One tiny simplification has been made that deserves elaboration. In the patterns, it is only necessary to match two periods even though three are printed, since nothing else in the output could possibly match the two periods. One period would not be sufficient—that could match the period in the move number. The space following the two periods serves to enforce that they are the second and third periods rather than the first and second.

The script could use one other improvement. Currently there is no check for the end of the game. The game ends by either player resigning. Resignation is actually trivial to check. The program prints a little message and then exits. Since the program does not print a new move, Expect will read an eof. Adding "eof exit" to the two expect commands in the loop will thus allow the script to cleanly exit.
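The change just described might look like the following sketch. It repeats the setup code so that it stands alone, and it assumes, as the rest of this example does, that a program named chess with the output format shown earlier is available. The empty braces give each move pattern an action so that eof and exit are read as a separate pattern and action.

```tcl
set timeout -1

spawn chess                 ;# start player one
set id1 $spawn_id
expect "Chess\r\n"
send "first\r"              ;# force it to go first
expect -re "1\\. (.*)\n"    ;# read first move

spawn chess                 ;# start player two
set id2 $spawn_id
expect "Chess\r\n"

while 1 {
    # pass the move to player two; leave the script if player two has exited
    send $expect_out(1,string)
    expect {
        -re "\\.\\. (.*)\n" {}
        eof exit
    }
    set spawn_id $id1

    # pass the reply to player one; leave the script if player one has exited
    send $expect_out(1,string)
    expect {
        -re "\\.\\. .*\\. (.*)\n" {}
        eof exit
    }
    set spawn_id $id2
}
```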

Example—Automating The write Command

Scripts are not limited to interactions with two processes. Large numbers of processes can be spawned from a single script. As an example, imagine a script that runs several write processes simultaneously. Why would this be useful? The UNIX write program allows a person to type messages on one other person’s terminal. The wall program allows messages to be typed on everyone’s terminal, but there is nothing in between—a program that types to a subset of terminals.

Using Expect, it is possible to write a script that writes messages to any set of users simultaneously. Here is the first half of such a script.

#!/usr/local/bin/expect --
set ulist {}
foreach user $argv {
    spawn write $user
    lappend ulist $spawn_id
}

The script reads the user names from the argument list. Each spawn id is appended to the list ulist. ulist is not a special variable. It could have been called anything. Notice that ulist is initialized to an empty list and then lappend is used to append to it. This is a common idiom for adding elements to lists.

Once all the spawn ids have been created, text can be sent to each process. In the second half of the script, text is read from the user via expect_user. Each time the user presses return, the line is sent to each spawned process.

set timeout -1
while 1 {
    expect_user {
        -re "\n" {}
        eof break
    }

    foreach spawn_id $ulist {
        send $expect_out(buffer)
    }
}

Each time through the foreach loop, spawn_id is assigned an element from ulist. This conveniently changes the currently spawned process so that the send command sends the text to each spawned process.

If the user presses ^D, expect_user reads an eof, the loop breaks, and the script exits. The connection to each write process is closed, and each process exits.

How exp_continue Affects spawn_id

Earlier I noted that the expect command decides which process to communicate with based on the value of spawn_id. The expect command checks the value of spawn_id at two times: when it starts and after every exp_continue command. This means that with an appropriate action in an expect command, you can change the currently spawned process while the expect command is running.

The Value Of spawn_id Affects Many Commands

The chess and write scripts are good examples of how spawn_id affects both the send and expect commands. To recap, send and expect communicate with the currently spawned process—that is, the process whose spawn id is stored in the variable spawn_id. Other commands that are affected by spawn_id include interact, close, wait, match_max, parity, and remove_nulls. In later chapters, I will describe still more commands that are affected by spawn_id.

As an example, here is a code fragment to close and wait on a list of spawn ids.

foreach spawn_id $spawn_ids {
    close
    wait
}

This loop could have been added to the earlier write script—except that the script effectively does the close and wait upon exit anyway. However, remember from Chapter 4 (p. 103) that programs that run in raw mode (such as telnet) often need explicit code to force them to exit. That code might be appropriate in such a loop.

Imagine writing a script that telnets to several hosts and simultaneously sends the same keystrokes to each of them. This script could be used, for example, to reboot a set of machines, change passwords, test functionality, or any number of things that have to be performed directly on each machine.

Symbolic Spawn Ids

For efficiency, Expect uses integers to represent spawn ids. For instance, if you examine the value of spawn_id, you will find it is an integer. However, you should avoid relying on this knowledge—it could change in the future.

One thing you can rely on is that a spawn id can be used as an array index. You can use this fact to associate information with the spawn ids. For example, if you have spawned several telnet sessions, you can retrieve the original hostname if you save it immediately after the spawn.

spawn telnet potpie
set hostname($spawn_id) potpie

Once saved, the hostname can be retrieved just from the raw spawn id alone. This technique can be used inside a procedure. With only the spawn id passed as an argument, the hostname is available to the procedure.

proc wrapup {who} {
    global hostname
    set spawn_id $who
    send "exit\r"
    puts "sent exit command to $hostname($spawn_id)"
}

Similar associations can be made in the reverse direction. It is also possible to associate several pieces of information with a spawn id. Consider these assignments.

spawn $cmdname $cmdarg
set proc($spawn_id,cmdname) $cmdname
set proc($spawn_id,cmdarg) $cmdarg
set proc($cmdname,spawn_id) $spawn_id

These assignments could be wrapped up in a procedure so that they occur every time you spawn a process.
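One way such a wrapper might look is sketched below. The name spawn_and_record is hypothetical, echo merely stands in for a real command, and the global command it relies on is explained later in this chapter. The array is still called proc, as in the assignments above; Tcl keeps variable names and command names separate, so this does not interfere with the proc command.

```tcl
# Hypothetical helper (not part of Expect): spawn a command and record
# the associations described above in the global array "proc".
proc spawn_and_record {cmdname cmdarg} {
    # Declaring spawn_id global lets the caller keep using the new process;
    # declaring proc global makes the table visible everywhere.
    global spawn_id proc

    spawn $cmdname $cmdarg
    set proc($spawn_id,cmdname) $cmdname
    set proc($spawn_id,cmdarg)  $cmdarg
    set proc($cmdname,spawn_id) $spawn_id
    return $spawn_id
}

set id [spawn_and_record echo hello]
send_user "spawn id $id runs: $proc($id,cmdname) $proc($id,cmdarg)\n"

expect eof      ;# let echo finish
wait            ;# and collect its exit status
```

Because the helper returns the spawn id as well, a caller that prefers not to depend on the global spawn_id can save the return value and set spawn_id itself.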

Job Control

Changing spawn_id can be viewed as job control, similar to that performed by a user in the shell when pressing ^Z and using fg and bg. In each case, the user chooses which of several processes to interact with. After the choice is made, the process appears to be the only one present, until the user is ready to switch to another process.

Shell-style job control, however, cannot be automated in a shell script. It is tied to the idea of a controlling terminal, and without one, job control makes no sense. You cannot embed commands such as fg or bg in a shell script. Shell-style job control is oriented towards keyboard convenience. Jobs are switched with a minimum of keystrokes. Expect’s job control—spawn_id—is not intended for interactive use. By comparison with the shell, Expect’s job control is verbose. But it is quite appropriate for a programming language. In later chapters, I will show an alternative form of job control that is less verbose, plus I will demonstrate how to imitate C-shell job control. For now, though, I will stick with this verbose form.

In a programming language, you can embed the repetitive things inside of procedures. This is the right way to use Expect as well. If you find yourself frequently writing "set spawn_id . . .", consider defining a procedure to automate these commands.

For example, suppose you have a script that automates an ftp process. As the ftp process runs, it writes status messages to a user via write. In this case, you need two spawn ids, one for write and one for ftp.

spawn ftp  ; set ftp   $spawn_id
spawn write; set write $spawn_id

To send a status message, spawn_id is changed from $ftp to $write and then the send command is called. Finally, spawn_id is set back so that the ftp interaction can continue.

send "get $file1\r"; expect "220*ftp> "

set spawn_id $write
send "successfully retrieved file\r"

set spawn_id $ftp
send "get $file2\r"; expect "220*ftp> "

This example can be simplified by writing a procedure called, say, report.

proc report {message} {
    global write

    set spawn_id $write
    send $message
}

In the report procedure, the message is passed as an argument. It is called as:

report "successfully retrieved file\r"

The spawn id of the write process is retrieved from the global environment by declaring write as a global variable. spawn_id is then set to this value. As before, send uses spawn_id to determine which process to communicate with.

This spawn_id variable is local to the procedure. It is only visible to commands (such as send) inside the procedure, and it goes away when the procedure returns. This greatly simplifies the caller. It is no longer necessary to reset spawn_id to ftp because it is done implicitly by the procedure return. Here is what the caller code would now look like:

send "get $file1\r"; expect "220*ftp> "
report "successfully retrieved file\r"
send "get $file2\r"; expect "220*ftp> "

This is much cleaner than without the procedure call. Using procedures in this fashion greatly simplifies code.

Procedures Introduce New Scopes

A procedure introduces a new scope. This normally hides variables unless the global command (or upvar or uplevel) is used. Because Expect depends so much on implicit variables (spawn_id, timeout, etc.), Expect commands have a special behavior when it comes to reading variables.

  • When reading a variable, if a global command has declared the variable, the variable is looked up in the global scope. If undeclared, the variable is first looked up in the current scope, and if not found, it is then looked up in the global scope.

The final clause, the fallback to the global scope, is where Expect differs from the usual Tcl scoping mechanism. To say this a different way, when reading variables, Expect commands search the global scope for variables that are not found in the local scope.

In the report procedure defined above, spawn_id was defined locally. By the rule just stated, spawn_id would be found in the local scope. Without the set command in report, spawn_id would be found in the global scope.

This rule can be used to simplify scripts. In the ftp example on page 237, each time a command was sent to ftp, it was immediately followed by an expect to check that the command succeeded.

send "get $file2\r"; expect "220*ftp> "

You can wrap this sequence into a procedure so that each time a command is sent, the response is checked:

proc ftpcmd {cmd} {
    send "$cmd\r"
    expect "220*ftp> "
}

In this procedure, again, spawn_id is not defined locally, nor is it mentioned in a global command. Thus, both the send and expect commands look it up from the global scope.

The expect command also does the same thing with the timeout variable. Because none is defined in the procedure, the global timeout is used. Compare this new definition of ftpcmd:

proc ftpcmd {cmd} {
    set timeout 20
    send "$cmd\r"
    expect "220*ftp> "
}

Here, the set command explicitly sets timeout to 20. This instance of timeout is local to the procedure scope. It is used by expect, but when ftpcmd returns, this local timeout disappears.

Here is yet another definition of ftpcmd. In it, a global command makes timeout refer to the global version. The set command changes the global timeout, and send and expect refer to the global timeout.

proc ftpcmd {cmd} {
    global timeout
    set timeout 20
    send "$cmd\r"
    expect "220*ftp> "
}

Before leaving this example, it is worth noting that tiny procedures like this one can be very helpful. They simplify the calling code—in this example, you no longer have to remember to write the expect command after every send command. If sophisticated actions are required in expect commands to handle error checking, then you need edit only a single procedure. Without a procedure, you need to add the error checking to every expect command. And if the expect command ever changes, by isolating the code in one place, it only has to be changed once.

How Expect Writes Variables In Different Scopes

Although Expect commands look in two scopes when reading variables, only one scope is used when writing variables.

  • When writing a variable, the variable is written in the current scope unless a global command has declared the variable, in which case, the variable is written in the global scope.

This is the usual Tcl behavior, but since it differs from the previous rule, I will describe it in more detail.

In the previous definition of ftpcmd, the expect command looks for ftp to return "220*ftp> ". The expect command, as usual, writes what it finds into expect_out(buffer). However, expect writes the variable into the local scope. That means that the caller does not see the updated expect_out. In the following code, the caller assumes expect_out is not overwritten by ftpcmd.

expect $shellprompt
ftpcmd "get file"
send_user "found shell prompt: $expect_out(buffer)\n"

If you need a procedure to write into the global version of expect_out, then a global command must be used in the procedure. Here is a definition for ftpcmd which does that.

proc ftpcmd {cmd} {
    global expect_out

    send "$cmd\r"
    expect "220*ftp> "
}

The rules just described for expect_out hold for spawn_id as well. You need a global command if you want to write the value of spawn_id outside the current procedure. Without a global command, the spawn command writes spawn_id into the local scope. As soon as the procedure returns, spawn_id reverts back to its old definition. In Chapter 4 (p. 100), I suggested that you should not invoke spawn from a procedure—until after reading this chapter. Now you can see the reason why: Without knowing about the spawn_id variable and how it is scoped, it is impossible to use spawn from a procedure and be able to interact with the spawned process after the procedure returns.
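The local-scope behavior can be seen in a small sketch. The procedure name spawn_locally and the use of cat as a stand-in program are assumptions made only for illustration.

```tcl
spawn cat                    ;# spawned at the top level: sets the global spawn_id
set outer $spawn_id

# Hypothetical procedure: spawns a process without any global command.
proc spawn_locally {} {
    spawn cat                ;# spawn_id is written in this procedure's scope only
    set inner $spawn_id
    close                    ;# close and wait read the local spawn_id,
    wait                     ;# so they act on the process spawned just above
    return $inner
}

set inner [spawn_locally]

# Back at the top level, spawn_id still refers to the first process.
if {$spawn_id eq $outer} {
    send_user "global spawn_id is unchanged\n"
}

close                        ;# tidy up the first process
wait
```

Had the procedure not closed its process or returned the spawn id, the script would have had no remaining way to refer to that process.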

A procedure that spawns a process to be used later should provide some means for returning the spawn id. One way is to use a global command.

proc spawn_ftp {host} {
    global spawn_id

    spawn ftp $host
}

It is possible to return the information in other ways, such as by explicitly returning it or by writing it into some other variable in another scope. Here is the same procedure written to return the spawn id. Notice that it does not use a global command.

proc spawn_ftp {host} {
    spawn ftp $host
    return $spawn_id
}

And here is a procedure that returns it to the caller by using the upvar command. If the caller is another procedure, spawn_id will be local to that procedure—unless, of course, one of the techniques illustrated here is used.

proc spawn_ftp {host} {
    upvar spawn_id spawn_id
    spawn ftp $host
}

The upvar command requires spawn_id to be mentioned twice. The first mention is the name in the calling scope. The second is the name in the current scope. After the upvar, every use of spawn_id in spawn_ftp references the spawn id in the caller. For example, in the earlier script the variable ftp was set to the spawn id of an ftp process. To do this in a script, the upvar command would be:

upvar ftp spawn_id

The upvar command is commonly used when passing parameters by reference. For example, it is possible to have the caller decide the name of the variable in which to save the spawn id. The name of the variable is passed as an additional argument and then dereferenced inside of spawn_ftp.

proc spawn_ftp {host spawn_id_var} {
    upvar $spawn_id_var spawn_id
    spawn ftp $host
}

proc work {} {
    spawn_ftp uunet.uu.net uunet_id
    # uunet_id is valid in here
    . . .
}

work
# uunet_id is no longer valid out here

After execution of spawn_ftp in the procedure work, the variable uunet_id will have the spawn id of an ftp process to uunet.uu.net. After work returns, uunet_id will no longer be set (presuming it was not set to begin with).

Predefined Spawn Ids

Three variables contain spawn ids predefined by Expect. These do not correspond to actual processes, but can be logically used as if they do. They are:

user_spawn_id     standard input and standard output
error_spawn_id    standard error
tty_spawn_id      controlling terminal (i.e., /dev/tty)

user_spawn_id contains a spawn id associated with the standard input and standard output. When spawn_id is set to the value of user_spawn_id, expect reads from the standard input, and send writes to the standard output. This is exactly what happens when Expect is started, before any processes have been spawned.

set spawn_id $user_spawn_id
expect -re "(.*)\n"      ;# read from standard input

tty_spawn_id contains a spawn id associated with the controlling terminal. Even if the standard input, standard output, or standard error is redirected, the spawn id in tty_spawn_id still refers to the terminal.

set spawn_id $tty_spawn_id
expect -re "(.*)\n"      ;# read from /dev/tty

With these spawn ids, you can view the user running the Expect script as a process. The user can be sent input and provides output to the Expect script, just like a process. While users are less reliable (usually), they can effectively be treated just like a process when it comes to interacting with them from Expect. Viewing processes as users and vice versa works well and can be quite handy. Because of this, algorithms do not have to be rewritten depending on where input comes from or where output goes.

In the case that input and output are always associated with a human, scripts can use send_user, send_tty, expect_user, and expect_tty. These produce the same result as setting spawn_id to the values in user_spawn_id or tty_spawn_id and then calling send or expect.
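As a rough illustration of treating users and processes alike, the following sketch defines one procedure that reads a line from whatever spawn id it is given. The name read_line is hypothetical, echo stands in for a real program, and the sketch does no timeout or eof handling, which a real script would want.

```tcl
# Hypothetical helper: read one line from the given spawn id, which may
# refer to a spawned process or to the user.
proc read_line {id} {
    set spawn_id $id                  ;# local; the caller's spawn_id is untouched
    expect -re "(\[^\r\n]*)\r?\n"     ;# capture one line, without its line ending
    return $expect_out(1,string)
}

# Read a line from a process ...
spawn echo hello
set from_process [read_line $spawn_id]
send_user "process said: $from_process\n"

# ... and the same procedure could read from the user instead:
# set from_user [read_line $user_spawn_id]
```

The procedure never needs to know which kind of source it was handed; only the caller's choice of spawn id differs.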

Exercises

  1. Write a procedure called bc which evaluates an arbitrary precision arithmetic expression (see page 229). The procedure should pass it to a bc process that has already been spawned and return the result so that it can be used with other Tcl commands. For example:

    set foo [bc 9487294387234/sqrt(394872394879847293847)]

  2. Modify the chess script so that it keeps track of the time spent by each player and optionally halts the game if either player exceeds a time limit.

  3. Named pipes allow unrelated processes to communicate. Write a script that creates a named pipe and writes a chess move to it. Write another script that opens the other end of the pipe and reads a chess move from it. Create another pipe so the scripts can communicate in the other direction as well.



[40] Ken Thompson wrote this chess program which continues to be distributed with most versions of UNIX.
