Chapter 11. How Do Subroutines Function?


In computer programming, a subroutine is a sequence of program instructions that perform a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed. Subprograms may be defined within programs, or separately in libraries that can be used by multiple programs. In different programming languages, a subroutine may be called a procedure, a function, a routine, a method, or a subprogram. The generic term callable unit is also sometimes used.1

1. Wikipedia, “Subroutine,” http://en.wikipedia.org/wiki/Subroutine.

Perl allows you to collect a sequence of statements, give the collection a name, and use the collection just as you use the built-in functions, such as print and localtime. This collection can be called a subroutine or a function. It doesn’t matter.

By the end of this chapter, you should be able to explain each line in the following program.

use feature 'say';

print "What is the current Fahrenheit temperature? ";
chomp( $fahr = <STDIN> );
say "$fahr Fahrenheit converted to Celsius is ", converter($fahr),".";
say "program continues here";

sub converter{
   my ($ftemp) = @_;
   if ($ftemp < -459 ){ return "too cold"; }
   my $celsius = ($ftemp - 32 ) * 5/9;
   return int $celsius;
}

11.1 Subroutines/Functions

We have been using a number of Perl’s built-in functions since the beginning of this book. In addition to the large number of Perl functions already available, you can create your own. Some languages distinguish between the terms function and subroutine. Perl doesn’t. Technically, a function is a block of code that returns a value, whereas a subroutine is a block of code that performs some task, but doesn’t return anything. Perl subroutines and functions can do both, so we’ll use the two terms interchangeably in this text. For now, we’ll use the term “subroutine” when referring to user-defined functions.

Let’s expand on the Wikipedia definition of a subroutine. Subroutines are self-contained units of a program designed to accomplish a specified task, such as calculating a mortgage payment, retrieving data from a database, or checking for valid input. When a subroutine is called in a program, it is like taking a detour from the main part of the program. Perl starts executing the instructions in the subroutine and, when finished, returns to the main program and picks up where it left off. You can use a subroutine over and over again, which saves you from repetitious programming. Subroutines are also used to break up a program into smaller units to keep it better organized and easier to maintain. If the subroutine proves to be useful in other programs, you can store it in a library as a module (discussed in Chapter 13, “Modularize It, Package It, and Send It to the Library!”).

The subroutine definition consists of one or more statements enclosed in a block, independent of your program and not executed until it is called. It is often referred to as a black box. Information goes into the black box as input (like the calculator or remote control when you push buttons), and the action or value returned from the box is its output (such as a calculation or a different channel, continuing the analogy). What goes on inside the box is hidden from the user. The programmer who writes the subroutine is the only one who cares about those details. When you use Perl’s built-in functions, such as print or rand, you send a string of text or a number to the function, and it sends something back. You don’t care how it does its job; you just expect it to work. If you send bad input, you get back bad output or maybe nothing; hence the expression “garbage in, garbage out.”

The scope of a subroutine is where it is visible in the program. Up to this point, all scripts have been in the namespace main, the main package. Subroutines are global in that they are visible or available to the entire script where they are defined. And you can place them anywhere in the script. You can define them in another file, and when coming from another file, they are loaded into the script with the require or use functions. All variables created within a subroutine or accessed by it are also global, unless specifically made local with either the local or my operators.

The subroutine is called, or invoked, by appending a set of parentheses to the subroutine name; the parentheses are empty if no arguments are passed. (Rarely, you will see a function called by prepending its name with an ampersand, but that style is outdated in modern Perl.) If you use a forward declaration, neither ampersands nor parentheses are needed to call the subroutine. You can send scalars, arrays, hashes, references, and the like to a subroutine in an argument list, where the subroutine receives them. This is all covered in the following pages.

If a nonexistent subroutine is called, the program quits with an error message such as: Undefined subroutine &main::prog called at .... If you want to check whether the subroutine has been defined, you can do so with the built-in defined function.
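The check with defined can be sketched as follows. This is only an illustration: greet is a hypothetical subroutine, and no_such_sub is a name deliberately left undefined.

```perl
use strict;
use warnings;

# greet is a hypothetical subroutine used only for illustration.
sub greet { return "Hello, $_[0]!"; }

# defined(&name) tests whether a subroutine exists without calling it.
my $have_greet = defined(&greet)       ? "yes" : "no";
my $have_other = defined(&no_such_sub) ? "yes" : "no";

# Call greet only if it exists; otherwise fall back to a message.
my $msg = defined(&greet) ? greet("Ellie") : "greet is missing";

print "greet defined?       $have_greet\n";
print "no_such_sub defined? $have_other\n";
print "$msg\n";
```

Testing first lets the program choose a fallback instead of quitting with the Undefined subroutine error.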

The return value of a subroutine is the value of the last expression evaluated (either a scalar or a list). You can use the return function explicitly to return a value or to exit from the subroutine early, based on the result of testing some condition.

If you make the call to the subroutine part of an expression, you can assign the returned value to a variable, thus emulating a function call.

11.1.1 Defining and Calling a Subroutine

A declaration simply announces to the Perl compiler that a subroutine is going to be defined in the program and may take specified arguments. Declarations are global in scope for a package (we have been working in package main). In other words, they are visible no matter where you put them in the program, although it is customary to put declarations at the beginning or end of the program, or in another file. (For now, we will define the subroutines in one file, package main.) We will discuss packages in Chapter 13, “Modularize It, Package It, and Send It to the Library!” but it is important to note that when we use the term global variable, it is technically called a package variable. A subroutine definition is a block of statements that follows the subroutine name. A subroutine you do not explicitly declare is declared at the same time it is defined.

Declaration: sub name;

Definition: sub name { statement; statement; }

You can define a subroutine anywhere in your program (or even in another file). The subroutine consists of the keyword sub followed by the subroutine name, an opening curly brace, a set of statements, and a closing curly brace.

The subroutine and its statements are not executed until called. You can call a subroutine by attaching a set of empty parentheses to its name (called the null parameter list), or by calling it like a built-in function, without parentheses. If you call the subroutine without parentheses, then you must declare it first.2

2. The sigil for a subroutine is the ampersand (&), and in older programs you may see this used to call a subroutine, as in &greetme. Today, the & is used only in special cases, such as with references.
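A minimal sketch of defining and calling a subroutine follows. The @log array and the messages are assumptions added only to make the order of execution visible.

```perl
use strict;
use warnings;

my @log;                      # records the order in which things happen

sub greetme {                 # definition: nothing runs until the call
    push @log, "in greetme";
    return "Welcome!";
}

push @log, "before call";
my $greeting = greetme();     # call with the null parameter list
push @log, "after call";

print join(" -> ", @log), "\n";   # before call -> in greetme -> after call
print "$greeting\n";              # Welcome!
```

Although the definition appears first in the file, its statements run only at the point of the call, and the program then continues on the next line.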

Forward Declaration

A forward declaration announces to the compiler that the subroutine has been defined somewhere in the program. If no arguments are being passed to the subroutine, the empty parens are not needed to call a subroutine if it has been declared.
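As a small sketch of a forward declaration, assume a hypothetical subroutine named bye whose definition comes at the end of the file:

```perl
use strict;
use warnings;

sub bye;                  # forward declaration: the name is now known

my $msg = bye;            # no ampersand and no parentheses are needed
print "$msg\n";

sub bye { return "Bye, adios, adieu."; }   # the definition comes later
```

Without the declaration on the first line, the strict pragma would reject the bareword bye at compile time.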

Scope of Variables

Scope describes where a variable is visible in your program. Perl variables are global in scope. They are visible throughout the entire program, even in subroutines. If you declare a variable in a subroutine, it is visible to the entire program. If you change the value of an existing variable from within a subroutine, it will be changed when you exit the subroutine. A local variable is private to the block, subroutine, or file where it is declared. You use the my operator to create local, lexically scoped variables, since by default, Perl variables are global in scope.
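The difference can be sketched with two hypothetical subroutines, one that changes a global variable and one that works on a private my variable of the same name:

```perl
use strict;
use warnings;

our $name = "Ellie";          # global (package) variable

# Hypothetical subroutines for illustration.
sub change_global { $name = "Tom"; }                   # changes the global
sub use_private   { my $name = "Dan"; return $name; }  # private variable only

change_global();
my $private = use_private();

print "global: $name, private: $private\n";   # global: Tom, private: Dan
```

The assignment inside use_private never touches the global $name, because my creates a separate variable that exists only inside that block.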

11.2 Passing Arguments and the @_ Array

If you want to send values to a subroutine, you call it with a comma-separated list of arguments enclosed in parentheses.

The following feed_me() function takes three arguments when called:

my @fruit=qw(apples pears peaches plums);  # Declare variables
my $veggie="corn";
feed_me( @fruit, $veggie, "milk" );  # Call subroutine with arguments

The arguments can be a combination of numbers, strings, references, lists, hashes, variables, and so forth. They are received by the function in a special Perl array, called the @_ array, as a list of corresponding values called parameters. No matter how many arguments are passed, they will be flattened out into a single list in the @_ array. In this example, @fruit will be sent as four values, followed by $veggie and the string "milk". That means that six values will be stored in the @_ array. The @_ is populated when the subroutine is entered and cleared when it is exited.

sub feed_me{ print join(", ", @_),"\n"; }  # Subroutine gets arguments
                                           # in @_ array

Output:  apples, pears, peaches, plums, corn, milk

11.2.1 Call-by-Reference and the @_ Array

Arguments, whether scalar values or lists, are passed into the subroutine and stored in the @_ array, whose elements are implicit references, or aliases, to the actual arguments. (If you modify an element of the @_ array, such as $_[0], you modify the corresponding variable in the caller.)

The elements of the @_ array are $_[0], $_[1], $_[2], and so on. If a scalar variable is passed, its value is the first element of the @_ array, $_[0]. If you pass two or more arrays or hashes or any combination of values to the function, they will be flattened out into the @_ as one big array. Perl doesn’t care if you don’t use all the parameters passed or if you have an insufficient number of parameters. If you shift or pop the @_ array, you merely lose your reference to the actual arguments.
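The aliasing can be seen in a short sketch. Both subroutine names are hypothetical: double_it changes its argument through $_[0], while double_copy works on a copy.

```perl
use strict;
use warnings;

# Hypothetical subroutines for illustration.
sub double_it   { $_[0] *= 2; }                          # $_[0] aliases the caller's variable
sub double_copy { my ($n) = @_; $n *= 2; return $n; }    # $n is a private copy

my $x = 5;
double_it($x);                  # $x itself is changed

my $y = 5;
my $z = double_copy($y);        # $y is untouched

print "x=$x y=$y z=$z\n";       # x=10 y=5 z=10
```

Copying @_ into my variables first is the usual way to avoid changing the caller's data by accident.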

11.2.2 Assigning Values from @_

When retrieving the values in @_, you may fall into common pitfalls when copying its values into a variable (see Table 11.1).


Table 11.1 Retrieving Values
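The following sketch is not a reproduction of Table 11.1; it only illustrates one common pitfall, assuming three hypothetical subroutines. Assigning @_ to a scalar without parentheses gives the number of arguments, not the first argument.

```perl
use strict;
use warnings;

# Hypothetical subroutines for illustration.
sub count_args  { my $n       = @_;    return $n; }      # scalar context: number of arguments
sub first_arg   { my ($first) = @_;    return $first; }  # list context: first argument
sub shifted_arg { my $first   = shift; return $first; }  # shift works on @_ inside a subroutine

my $count   = count_args("a", "b", "c");     # 3
my $first   = first_arg("a", "b", "c");      # a
my $shifted = shifted_arg("a", "b", "c");    # a

print "$count $first $shifted\n";
```

The parentheses around $first put the assignment in list context, which is what copies the first element rather than the count.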

Passing a Hash to a Subroutine

When you pass a hash to a subroutine, it is also flattened onto the @_ as a single list. When copied from the @_ into another hash, the hash is recreated with key/value pairs. It is more efficient to send a reference (address). In this way, you would send only the address of the hash, rather than the entire hash. (See Chapter 12, “Does This Job Require a Reference?”)
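A minimal sketch of passing a hash by value follows; show_hash and the %colors data are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical subroutine for illustration.
sub show_hash {
    my %copy = @_;      # the flat list in @_ is rebuilt into key/value pairs
    return join ", ", map { "$_=$copy{$_}" } sort keys %copy;
}

my %colors = ( sky => "blue", grass => "green" );
my $line   = show_hash(%colors);    # %colors is flattened into four values

print "$line\n";                    # grass=green, sky=blue
```

The keys are sorted here because a hash has no guaranteed order once it has been flattened and rebuilt.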

11.2.3 Returning a Value

When a subroutine returns a value to the caller, it behaves as a function. For example, you may call a subroutine from the right-hand side of an assignment statement. The subroutine can then send back a value to the caller which can be assigned to a variable, either scalar or array.

      $average  =  ave(3, 5, 6, 20);
returned value     call to subroutine

The value returned is really the value of the last expression evaluated within the subroutine.

You can also use the return function to return early from the subroutine based on some condition. Your main program will pick up in the line right after where the subroutine was called. If used outside a subroutine, the return function causes a fatal error. You could say that the return is to a subroutine what an exit is to a program. If you use the exit function in a subroutine, you will exit the entire program and return to the command line.
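Here is one way the ave subroutine called above might look. The body is an assumption, written only to show an early return followed by a normal return.

```perl
use strict;
use warnings;

# A possible body for ave; the original definition is not shown in the text.
sub ave {
    return "no grades" unless @_;   # early return when no arguments are passed
    my $total = 0;
    $total += $_ for @_;            # add up all the arguments
    return $total / @_;             # @_ in numeric context is the argument count
}

my $average = ave(3, 5, 6, 20);
my $empty   = ave();

print "$average\n";   # 8.5
print "$empty\n";     # no grades
```

After either return, the main program picks up on the line following the call.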

11.2.4 Scoping Operators: local, my, our, and state

Most programming languages provide a way for you to pass arguments by using call-by-value, where a copy of the value of the argument is received by the subroutine. If the copy is modified, the original value is untouched. To make copies of values in Perl, the arguments are copied from the @_ array and assigned to local variables. As discussed in Chapter 5, “What’s in a Name?” Perl provides two operators (also called keywords or functions) to create local copies, local and my. The state keyword is similar to the my operator, but it creates the variable and initializes it only once (similar to a static variable in the C language) and is only available for versions of Perl starting with 5.10. The our function, to put it simply, allows you to create a global variable even when strict is turned on.

The local Operator

The local operator was used to turn on call-by-value in Perl programs prior to the Perl 5 release. Although you can still use local with special variables and filehandles, the my operator is normally used, which further ensures the privacy of variables within a function block. With strict turned on, local can be applied only to a global variable that has been declared with our or is written with its package name, as in $main::x.

The local operator creates local variables from its list. Any variable declared with local is said to be dynamically scoped, which means it is visible from within the block where it was created and visible to any functions called from within this block or any blocks (or subroutines) nested within the block where it is defined. If a local variable has the same name as a global variable, the value of the global one is saved and a new local variable is temporarily created. When the local variable goes out of scope, the global variable becomes visible again with its original value(s) restored. After the last statement in a subroutine is executed, its local variables are discarded. For an interesting Web page on when and how to use the local operator, see: http://perl.plover.com/local.html, particularly “Coping with Scoping.”
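A short sketch of local applied to a special variable follows. It assumes a hypothetical subroutine named dashed and uses $", the separator Perl places between array elements interpolated in a double-quoted string.

```perl
use strict;
use warnings;

my @fruit = qw(apples pears peaches);

# Hypothetical subroutine for illustration.
sub dashed {
    local $" = "-";       # temporarily replace the list separator
    return "@fruit";      # interpolated while the local value is in effect
}

my $inside  = dashed();   # apples-pears-peaches
my $outside = "@fruit";   # apples pears peaches (original separator restored)

print "$inside\n$outside\n";
```

When dashed exits, the saved value of the special variable is restored automatically, so the rest of the program is unaffected.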

The my Operator

The my operator is also used to turn on call-by-value, and the variables it creates are said to be lexically scoped. Although we have already used the my operator to declare variables, it bears more discussion here. Lexically scoped means that variables declared as my variables are visible from the point of declaration to the end of the innermost enclosing block. That block could be a simple block enclosed in curly braces, a subroutine, eval, or a file. A variable declared with the my operator is created on a special scratch pad that is private to the block where it was created.3 Example 11.8 reviews the scope of my variables within a block.

3. See Chapter 12, “Does This Job Require a Reference?” for more on my variables.

Unlike the variables declared with the local operator, any variables declared as my variables are visible only within the block or subroutine in which they are declared, not in any subroutines called from this subroutine. Now let’s take a look at the next example, which shows the scope of my variables within a subroutine.

In the next example, we will examine the difference between my and local variables in a subroutine.
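The difference between the two operators can be sketched with three hypothetical subroutines. Both try_my and try_local create a variable named $x and then call peek, which reads the global $x.

```perl
use strict;
use warnings;

our $x = "global";

# Hypothetical subroutines for illustration.
sub peek      { return $x; }                              # reads the global $x
sub try_my    { my    $x = "my";    return peek(); }      # my is invisible to peek
sub try_local { local $x = "local"; return peek(); }      # local is visible to peek

my $from_my    = try_my();      # global
my $from_local = try_local();   # local

print "$from_my $from_local $x\n";   # global local global
```

The my variable never leaves try_my, whereas the local value is seen by any subroutine called while try_local is still running.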

11.2.5 Using the strict Pragma (my and our)

Although we touched on pragmas, particularly the warnings and strict pragmas, they are topics that bear repeating when discussing subroutines. You may recall, a pragma is a module that triggers a compiler to behave in a certain way. The strict module, strict.pm, is part of the standard Perl distribution. If the compiler detects something in your program it considers “unsafe,” your program will be aborted. You can use the strict pragma with an import list to give specific restrictions, such as:

use strict 'vars';   # Must use my, our, state, or use vars.
use strict 'refs';   # Symbolic references not allowed.
use strict 'subs';   # Bareword (identifier without quotes) not allowed
                     # with the exception of subroutines.

Without the import list, all restrictions are in effect. Check the full documentation. At your command-line prompt, type perldoc strict.

You can use the strict pragma to prevent the use of global variables in a program. When you use a global variable, even a variable declared with local, the compiler will complain if strict has been declared. Only lexically scoped variables are allowed. They are variables that are declared with either the my or our built-in functions. The our built-in (Perl 5.6+) is used when you need a global variable but still want to use the strict pragma to protect against accidentally using global variables elsewhere in the program. (For more information about strict and packages, see the section, “The strict Pragma,” in Chapter 12, “Does This Job Require a Reference?”)
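The effect of strict can be sketched as follows. The variable names are hypothetical, and the string eval is used only so that the sketch can capture the compile-time error instead of aborting.

```perl
use strict;
use warnings;

our $city = "Paris";     # declared global variable: allowed under strict
my  $zip  = "75001";     # lexical variable: allowed under strict

# $undeclared is never declared, so strict 'vars' refuses to compile it.
my $ok    = eval q{ $undeclared = 1; 1 };
my $error = $@;

print "city=$city zip=$zip\n";
print "compiled? ", ( $ok ? "yes" : "no" ), "\n";
print "error: $error";
```

In a normal script the same mistake would stop the whole program before any statement ran.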

The state Feature

The state feature, like the my operator, creates a lexically scoped variable, but once created, it is not reinitialized when the subroutine is called again; that is, the variable is persistent from one call to the next. This feature was not implemented before Perl 5.10 was released. In order to avoid backward-compatibility problems, you must enable state with the use feature 'state' pragma.
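A minimal sketch of a state variable, assuming a hypothetical subroutine named counter:

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical subroutine for illustration.
sub counter {
    state $count = 0;     # initialized only on the first call
    return ++$count;      # the value persists between calls
}

my @calls = ( counter(), counter(), counter() );

print "@calls\n";         # 1 2 3
```

Had $count been declared with my, it would be reset to 0 on every call and each call would return 1.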

11.2.6 Putting It All Together

Example 11.5 was a bare bones sample of how to pass arguments (two arrays) to subroutines. The strict pragma was not used. There was no return value. This final version summarizes the steps for defining and invoking a subroutine with a return value.

11.2.7 Prototypes

A prototype is like a template: it tells the compiler how many and what types of arguments the subroutine should get when it is called. It lets you treat your subroutine just like a Perl built-in function. Note that prototypes are often misused and should be used only to produce special behavior in your subroutine, so be wary.

The prototype is made part of a declaration and is handled at compile time.

Prototype:
   sub subroutine_name($$);
   Takes two scalar arguments.

   sub subroutine_name(\@);
   Argument must be an array, preceded with an @ symbol.

   sub subroutine_name($$;@);
   Requires two scalar arguments and an optional list.
   Anything after the semicolon is optional.
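The following sketch shows a ($$) prototype at work, assuming a hypothetical subroutine named add_two. It also shows why prototypes can surprise you: each $ puts its argument in scalar context, so an array passed in that position contributes its element count. The string eval is used only to capture the compile-time error.

```perl
use strict;
use warnings;

# Hypothetical subroutine for illustration; it must be defined before the calls.
sub add_two($$) { my ($x, $y) = @_; return $x + $y; }

my @nums = (10, 20, 30);

my $sum = add_two(3, 4);         # 7
my $odd = add_two(@nums, 4);     # @nums in scalar context is 3, so 3 + 4 = 7

# Too few arguments are rejected when the call is compiled.
my $ok    = eval q{ add_two(1); 1 };
my $error = $@;

print "$sum $odd\n";
print "error: $error";
```

Because the check happens at compile time, the prototype must be seen by the compiler before the call appears in the file.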

11.2.8 Context and Subroutines

We introduced “context” when discussing variables and operators. Now we will see how context applies to subroutines. There are two main contexts: scalar and list. When mixing data types, results differ when an expression is evaluated in one or the other context. When a subroutine is called and its return value is not used, the context is called void context.

A good example of context is in array or scalar assignment. Consider the following statements:

@list = qw( apples pears peaches plums );  # List context
$number = @list;   # Scalar context
print scalar @list, "\n";  # Use the scalar function

In list context, @list is assigned the list of elements, but in scalar context, $number is assigned the number of items in the array @list.

We have also seen context when using built-in Perl functions. Consider the localtime function. If the return value is assigned to a scalar, the date and time are returned as a string, but if the return value is assigned to an array, each element of the array represents a numeric value for the second, minute, hour, and so forth. The print function, on the other hand, expects to receive a list of arguments, in list context. You can use the built-in scalar function to explicitly evaluate an expression in a scalar context, as shown in Example 11.17.
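The two behaviors of localtime can be compared in a short sketch. The variable names are assumptions, and the printed values depend on when the script is run.

```perl
use strict;
use warnings;

my $stamp = localtime;               # scalar context: one date and time string
my @parts = localtime;               # list context: a list of numeric fields
my ( $sec, $min, $hour ) = @parts;   # the first three fields
my $count = scalar @parts;           # scalar forces a count of the fields

print "scalar context: $stamp\n";
print "list context:   $count fields, starting with $sec, $min, $hour\n";
```

The same function call produces a different kind of value depending only on what the left-hand side of the assignment asks for.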

The wantarray Function and User-Defined Subroutines

“He took that totally out of context,” is something you might say after hearing an argument based on a news story, the Bible, or a political speech. In Chapter 5, “What’s in a Name?” we discussed context, in Perl, which refers to how a variable and values are evaluated. For example, is the context list or scalar? There may be times when you want a subroutine to behave in a certain way based on the context in which it was called. This is where you can use the built-in wantarray function. You can use this function to determine whether the subroutine should be returning a list or a scalar. If your subroutine is called in list context (that is, the return value will be assigned to an array), then wantarray will return true; otherwise, it will return false. If the context is to return no value (void context), wantarray returns the undefined value. (Use this function sparingly; it is not recommended for general use due to unexpected behavior. See http://en.wikipedia.org/wiki/Principle_of_least_astonishment.)
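The three results of wantarray can be sketched with a hypothetical subroutine named context that records how it was called:

```perl
use strict;
use warnings;

my @seen;    # records the context of each call

# Hypothetical subroutine for illustration.
sub context {
    my $ctx = wantarray          ? "list"      # true: list context
            : defined(wantarray) ? "scalar"    # false but defined: scalar context
            :                      "void";     # undefined: void context
    push @seen, $ctx;
    return $ctx;
}

my @list   = context();    # list context
my $scalar = context();    # scalar context
context();                 # void context: the return value is ignored

print "@seen\n";           # list scalar void
```

A subroutine could use the same test to return a full list to one caller and a single summary value to another.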

11.2.9 Autoloading

The Perl AUTOLOAD function is called whenever Perl is told to call a subroutine and the subroutine can’t be found. The special variable $AUTOLOAD is assigned the fully qualified name of the undefined subroutine, such as main::name.

You can also use the AUTOLOAD function with objects to provide an implementation for calling undefined methods. (A method is a subroutine called on an object.)
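A minimal sketch of AUTOLOAD in package main follows. The subroutine name missing is hypothetical and is deliberately never defined.

```perl
use strict;
use warnings;

our $AUTOLOAD;     # declared so that strict allows the special variable

sub AUTOLOAD {
    my $name = $AUTOLOAD;                 # name of the subroutine that was not found
    return "no subroutine named $name (args: @_)";
}

my $result = missing("a", "b");           # missing is not defined anywhere

print "$result\n";   # no subroutine named main::missing (args: a b)
```

The arguments of the failed call arrive in @_ as usual, so AUTOLOAD can report them or pass them on.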

11.2.10 BEGIN and END Blocks (Startup and Finish)

The BEGIN and END special code blocks may remind UNIX programmers of the special BEGIN and END patterns used in the awk programming language.

A BEGIN block is executed immediately, before the rest of the file is even parsed. If you have multiple BEGINs, they will be executed in the order they were defined.

The END block is executed when all is done; that is, when the program is exiting, even if the die function caused the termination. Multiple END blocks are executed in reverse order.
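The execution order can be sketched by having each block record its name in a shared array. The array @order and the labels are assumptions made for illustration.

```perl
use strict;
use warnings;

our @order;    # filled in by each block as it runs

END   { push @order, "END 1"; print join(", ", @order), "\n"; }  # defined first, runs last
END   { push @order, "END 2"; }                                   # defined last, runs first
BEGIN { push @order, "BEGIN 1"; }
BEGIN { push @order, "BEGIN 2"; }

push @order, "main program";

# Printed as the program exits:
# BEGIN 1, BEGIN 2, main program, END 2, END 1
```

Even though the END blocks appear at the top of the file, they run only after the last statement of the main program.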

11.2.11 The subs Pragma

The subs pragma allows you to predeclare subroutine names. Its arguments are a list of subroutine names. This allows you to call a subroutine without the ampersand or parentheses and to override built-in Perl functions.
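Both uses of subs can be sketched as follows. The greet subroutine and the replacement for the built-in hex are hypothetical; CORE::hex names the original built-in.

```perl
use strict;
use warnings;
use subs qw(greet hex);        # predeclare greet and a replacement for hex

my $msg  = greet "world";      # no ampersand and no parentheses
my $mine = hex "ff";           # calls the user-defined hex below
my $core = CORE::hex("ff");    # the built-in hex is still available: 255

# Hypothetical definitions for illustration.
sub greet { return "Hello, $_[0]"; }
sub hex   { return "my hex got $_[0]"; }

print "$msg\n$mine\n$core\n";
```

Overriding a built-in changes its meaning for the whole package, so it is best reserved for cases where the replacement is clearly intended.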

11.3 What You Should Know

1. How do you define and call a subroutine?

2. What is the difference between a function and a subroutine?

3. Where do you put a subroutine definition in your Perl script?

4. How do you pass arguments to a subroutine?

5. How does Perl retrieve its parameter list?

6. What is the difference between local and global variables?

7. What is the difference between my and our?

8. How do you pass a hash to a function?

9. What is a state variable?

10. What is the significance of the return statement?

11. What is prototyping?

12. What is autoloading?

11.4 What’s Next?

In the next chapter, you will learn about references and why you need them. A Perl reference is a variable that refers to another one. In short, it contains the address of another variable. Generally, there are three good reasons to use references: to pass arguments by reference to subroutines; to create complex data structures, such as a hash of hashes, an array of arrays, a hash consisting of nested hashes, arrays, subroutines, and so forth; and to create Perl objects.

Exercise 11: I Can’t Seem to Function Without Subroutines

1. Write a program called tripper that will ask the user the number of miles he has driven and the amount of gas he used.

a. In the tripper script, write a subroutine called mileage that will calculate and return the user’s mileage (miles per gallon). The number of miles driven and the amount of gas used will be passed as arguments. All variables should be my variables. The program should test to make sure the user doesn’t enter 0 for the amount of gas. (Division by zero is illegal.)

b. Print the results.

c. Prototype tripper.

2. Hotels are often rated using stars to represent their score. A five-star hotel may have a king-size bed, a kitchen, and two TVs; a one-star hotel may have cockroaches and a leaky roof.

a. Write a subroutine called printstar that will produce a histogram to show the star rating for hotels shown in the following hash. The printstar function will be given two parameters: the name of the hotel and the number of its star rating. (Hint: sort the hash keys into an array. Use a loop to iterate through the keys, calling the printstar function for each iteration.)

%hotels=("Pillowmint Lodge" => "5",
         "Buxton Suites"    => "5",
         "The Middletonian" => "3",
         "Notchbelow"       => "4",
         "Rancho El Cheapo" => "1",
         "Pile Inn"         => "2",
        );

(OUTPUT)
Hotel                   Category
------------------------------------------
Notchbelow          |****         |
The Middletonian    |***          |
Pillowmint Lodge    |*****        |
Pile Inn            |**           |
Rancho El Cheapo    |*            |
Buxton Suites       |*****        |
------------------------------------------

b. Sort the hotels by stars, five stars first, one star last. Can you sort the hash by values so that the five-star hotels are printed first, then four, and so forth? (See http://alvinalexander.com/perl/edu/qanda/plqa00016.)

Hotel             Category
-------------------------
Buxton Suites    |*****    |
Pillowmint Lodge |*****    |
Notchbelow       |****     |
The Middletonian |***      |
Pile Inn         |**       |
Rancho El Cheapo |*        |
-------------------------

3. Write a grades program to take the course number and the name of a student as command-line arguments. The course numbers are CS101, CS202, and CS303. The program will include three subroutines:

• Subroutine ave to calculate the overall average for a set of grades.

• Subroutine highest to get the highest grade in the set.

• Subroutine lowest to get the lowest grade in the set.

a. Print the average, the highest score, and the lowest score.

b. If there were any failures (average below 60), print the name, course number, and a warning to STDERR such as: Be advised: Joe Blow failed CS202.

c. Send the name of the failing student and the course number to a file called failures. Sort the file by course number.

d. Use the AUTOLOAD function to test that each subroutine has been defined.

4. Write a function to calculate and return the monthly payment on a loan where:

P = principal, the initial amount of the loan

I = the annual interest rate (from 1 to 100 percent)

L = length, the length (in years) of the loan, or at least the length over which the loan is amortized

The following assumes a typical conventional loan where the interest is compounded monthly. (See http://www.hughchou.org/calc/formula.html for tips on how to calculate mortgage loan payments.)

a. First, define two more variables to make the calculations easier:

J = monthly interest in decimal form = I / (12 × 100)

N = number of months over which loan is amortized = L × 12

b. Create a hash with the values of P, I, L, and pass the hash to the function. Return the monthly payment using the following formula (you must convert the formula to Perl):

M = P * ( J / (1 - (1 + J) ** -N))
