When you complete this chapter, you should be able to explain each of the following statements (each statement is an entity unto itself):
printf "%-10s \$%d%10.2f %b %x %o\n", "Jack", 15,15,15,15,15;
print 'She cried, "I can\'t help you!"',"\n";
$str = sprintf "\$%.2f", $sal;
print qq!\u$name, the local time is !, scalar localtime, "\n";
use feature qw(say);
say "The sum is ", 5 + 4;
say "No more! "
Before we get started, please take note that each line of code, in most of the examples throughout this book, is numbered. The output and explanations are also numbered to match the numbers in the code. These numbers are provided to help you understand important lines of each program. When copying examples into your text editor, don’t include these numbers, or you will generate many unwanted errors!
In Chapter 3, “Perl Scripts,” we briefly introduced standard I/O. By convention, whenever your program starts execution, the parent process (normally a shell program) opens three predefined streams called stdin, stdout, and stderr. All three of these streams are connected to your terminal by default.
stdin is the place where input comes from, the terminal keyboard; stdout is where output normally goes, the screen; and stderr is where errors from your program are printed, also the screen.
Perl inherits stdin, stdout, and stderr from the shell. Perl does not access these streams directly but gives them names called filehandles. Perl accesses the streams via the filehandle. The filehandle for stdin is called STDIN; the filehandle for stdout is called STDOUT; and the filehandle for stderr is called STDERR. STDERR is a separate stream that sends its output to the screen and allows you to redirect those errors to, for example, an error log in order to find out what went wrong. Later, we’ll use Perl techniques to deal with errors and error messages. (See Figure 4.1.)
In Chapter 10, “Getting a Handle on Files,” we’ll see how you can create your own filehandles, but for now we’ll stick with those that are predefined.
The print and printf functions, by default, send their output to the STDOUT filehandle, your screen. For example:
print "Give me your name";
chomp($name = <STDIN>);
# User is prompted to enter something at the keyboard,
# until he presses the return key. The chomp function removes the newline.
if ( $name eq "") { print STDERR "You didn't enter anything.\n\$name is empty.\n";
exit 1; }
print STDOUT "Hello, $name\n"; # A string of output is sent to screen.
print "Hello back to you.\n"; # STDOUT is the default for the print
# function.
When printing, it is helpful to understand how Perl views words. A word is a sequence of characters with a unit of meaning, much like words in English. Perl words are not restricted to just alpha characters, but they cannot contain whitespace unless quoted. A string is a word or words enclosed in matching quotes (for example, “This is the life!”). You can use an unquoted word to identify filehandles, functions, labels, and other reserved words; for example, with print STDERR “Error\n”, print is a function, STDERR is a filehandle, and “Error\n” is a string. If the word has no special meaning to Perl, it will be treated as if surrounded by single quotes and is called a bareword.
You will probably use the built-in print function more than any of the printing options provided by Perl because it is efficient and easy to use. The print function prints a string or a list of comma-separated words to the Perl filehandle STDOUT. If successful, the print function returns 1.
The string literal \n adds a newline to the string. You can embed it in the string or treat it as a separate string. (Perl requires that escape sequences like \n be enclosed in double quotes.) The say function (available since Perl version 5.10) is just like print but appends a newline for you. (See Section 4.4.2, “The No Newline say Function,” later in this chapter.)
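The following minimal sketch illustrates the points above: the return value of print, the two ways of supplying a newline, and the say function. The variable names $ok and $literal are introduced here only for illustration.

```perl
use strict;
use warnings;
use feature qw(say);    # say must be enabled; available since Perl 5.10

# print takes a comma-separated list and returns 1 when it succeeds.
my $ok = print "Hello, ", "world", "\n";

# The newline can be embedded in the string or passed as a separate string.
print "Embedded newline\n";
print "Separate newline", "\n";

# Inside single quotes, \n is two literal characters, not a newline.
my $literal = 'No newline here\n';
print $literal, "\n";

# say behaves like print but appends the newline for you.
say "The return value of the first print was $ok";
```

Note that the single-quoted string keeps its backslash and n as ordinary characters, which is why a separate "\n" is needed to end that line.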
Since quoting affects the way in which variables are interpreted, this is a good time to review Perl’s quoting rules. It is often difficult to determine which quotes to use, where to use them, and how to find the culprit if they are misused; in other words, it can be a real debugging nightmare.1 To lighten things up a little, Perl offers an alternative method of quoting, but you still have to fully understand quoting rules before the alternative is useful.2 You can use the backslash (\) to quote a special character such as $ or @, and it behaves as a set of single quotes, as ‘$’ or ‘@’.
1. Barry Rosenberg, in his book KornShell Programming Tutorial, has a chapter titled “The Quotes From Hell.”
2. Larry Wall, creator of Perl, calls his alternative quoting method “syntactic sugar.”
Perl has three types of quotes and all three types have a different function. They are single quotes, double quotes, and backquotes. Quotes come in pairs and must be matched. For example:
print "This is a quoted string. Some characters are special;
e.g., $var, @list and \n are interpreted within double quotes.\n";
print 'This is also a quoted string. All characters are literal within
single quotes; i.e., $, @, and backslash characters are not special';
print "This is an operating system shell command enclosed in back quotes."
, `pwd`;
print "The backslash quotes a single character as in \$5.00. It protects
the $ from interpretation";
A pair of single or double quotes may delimit a string of characters. Quotes will either allow the interpretation of special characters or protect special characters from interpretation, depending on the kind of quotes you use.
Single quotes are the “democratic” quotes. All characters enclosed within them are treated equally; in other words, there are no special characters. But the double quotes discriminate. They treat some of the characters in the string as special characters. The special characters include the $ sign, the @ symbol, and escape sequences such as \t and \n.
When backquotes surround an operating system command, the command will be executed by the shell, a process often called command substitution. The output of the command will be returned as a string that can be used in a print statement, assigned to a variable, and so forth. Whether you are using Windows, Linux, or UNIX, the commands enclosed within backquotes must be supported by the particular operating system and will vary from system to system. (If your program is going to be used on several operating systems, using backquotes will affect its portability.)
No matter what kind of quotes you are using, they must be matched. Because the quotes mark the beginning and end of a string, Perl will issue a message such as “Might be a runaway multi-line string,” “Execution of quotes aborted...,” or “Can’t find string terminator anywhere before EOF...” and fail to compile if you forget one of the quotes.
Double quotes must be matched, unless embedded within single quotes. When a string is enclosed in double quotes, scalar variables (preceded with a $) and arrays (preceded by the @ symbol) are interpolated (that is, the value of the variable replaces the variable name in the string). Hashes (preceded by the % sign) are not interpolated within the string enclosed in double quotes.
Strings that contain string literals (such as \t, \n) must be enclosed in double quotes for backslash interpretation.
A single quote may be enclosed in double quotes, as in “I don’t care!”
If a string is enclosed in single quotes, it is printed literally (what you see is what you get).
If a single quote is needed within a string, then it can be embedded within double quotes or backslashed. If double quotes are to be treated literally, they can be embedded within single quotes.
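As a small illustration of the single-quote and double-quote rules just listed, consider the sketch below. The variables and their values ($name, @pets, %ages) are hypothetical sample data, not part of the original examples.

```perl
use strict;
use warnings;

my $name = "Jack";                  # hypothetical sample data
my @pets = ("cat", "dog");
my %ages = (Jack => 30);

# Double quotes: scalars and arrays are interpolated, hashes are not.
my $double = "Hi $name, pets: @pets, hash: %ages";

# Single quotes: everything is literal.
my $single = 'Hi $name, pets: @pets';

# A single quote can sit inside double quotes, or be backslashed
# inside single quotes; both produce the same string.
my $in_double = "I don't care!";
my $escaped   = 'I don\'t care!';

# Double quotes are literal inside single quotes.
my $dq_in_sq = 'She said, "Hello"';

print "$double\n$single\n$in_double\n$escaped\n$dq_in_sq\n";
```

The array elements are joined with a space when interpolated, while %ages is printed exactly as typed.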
UNIX/Windows3 commands placed within backquotes are executed by the shell, and the output is returned to the Perl program as a string, usually assigned to a variable or made part of a print string. When the output of a command is assigned to a variable, the context is scalar (that is, a single value is assigned).4 For command substitution to take place, the backquotes cannot be enclosed in either double or single quotes. (Make note, UNIX shell programmers: backquotes cannot be enclosed in double quotes as in shell programs.)
3. If using other operating systems, such as Microsoft or Mac OS 9.1 and below, the OS commands available for your system will differ.
4. If output of a command is assigned to an array, the first line of output becomes the first element of the array, the second line of output becomes the next element of the array, and so on.
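The sketch below shows command substitution in scalar and list context. It assumes an echo command is available to the shell, which is usually true on UNIX-like systems and Windows but is still an assumption about your environment.

```perl
use strict;
use warnings;

# Assumption: the shell provides an "echo" command.
my $line  = `echo hello`;    # scalar context: all output as one string
my @lines = `echo hello`;    # list context: one element per line of output
chomp $line;                 # remove the trailing newline from the output

# Inside double quotes the backquotes are ordinary characters,
# so no command is executed here.
my $not_run = "`echo hello`";

print "Scalar context: $line\n";
print "Lines in list context: ", scalar(@lines), "\n";
print "Not executed: $not_run\n";
```

Because the command itself comes from the operating system, a script that relies on backquotes is only as portable as the commands it calls.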
Perl provides an alternative form of quoting—the q, qq, qx, and qw constructs.
• The q represents single quotes.
• The qq represents double quotes.
• The qx represents backquotes.
• The qw represents a quoted list of words. (See Table 4.1.)
The string to be quoted is conventionally enclosed in forward slashes, but you can use alternative delimiters for all four of the q constructs. The delimiter can be a single nonalphanumeric character, such as a # sign or an exclamation point (!), or a pair of matching characters, such as parentheses, square brackets, or curly braces. For example:
q/Hello/
q#Hello#
q{Hello}
q[Hello]
q(Hello)
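To see how the four constructs relate to ordinary quotes, here is a minimal sketch. The variable names are hypothetical, and the qx line assumes an echo command is available to the shell.

```perl
use strict;
use warnings;

my $name = "Jack";                       # hypothetical sample data

# q behaves like single quotes: no interpolation, and the
# apostrophe needs no backslash because / is the delimiter.
my $single = q/It's $name/;

# qq behaves like double quotes: $name is interpolated, and the
# embedded double quotes need no backslash.
my $double = qq{"Hello," said $name};

# qw builds a list of words split on whitespace; no commas or quotes.
my @words = qw(apple banana cherry);

# qx behaves like backquotes (assumes "echo" exists in the shell).
my $output = qx(echo hi);

print "$single\n$double\n@words\n$output";
```

Choosing a delimiter that does not occur in the string is the main benefit: none of the strings above needs a backslash.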
Quoting rules affect almost everything you do in Perl, especially when printing a string of words. Strings are normally delimited by a matched pair of either double or single quotes. When a string is enclosed in single quotes, all characters are treated as literals. When a string is enclosed in double quotes, however, almost all characters are treated as literals, with the exception of those characters that are used for variable substitution and special escape sequences. We will look at the special escape sequences in this chapter and discuss quoting and variables in Chapter 5, “What’s in a Name?”
Perl uses some characters for special purposes, such as the dollar sign ($) and the at (@) sign. If these special characters are to be treated as literal characters, they may be preceded by a backslash (\) or enclosed within single quotes (' '). Use the backslash to quote a single character rather than a string of characters.
It is so common to make mistakes with quoting that we will introduce here the most common error messages you will receive resulting from mismatched quotes and bare words.
Think of quotes as being the “clothes” for Perl strings. If you take them off, you may get a “Bareword” message such as:
Bareword “there” not allowed while “strict subs” in use at try.pl line 3. Execution of program.pl aborted due to compilation errors.
Also think of quotes as being mates. A double quote is mated with a matching double quote, and a single quote with a matching single quote. If you don’t match the quotes, if one is missing, the missing quote has “run away.” Where did the mate go? You may receive an error like this:
(Might be a runaway multi-line “” string starting on line 3)
When assigning literal values5 to variables or printing literals, you can represent the literals numerically as integers in decimal, octal, or hexadecimal or as floats in floating-point or scientific notation.
5. Literals may also be called “constants,” but the Perl experts prefer the term “literal,” so in deference to them, we’ll use the term “literal.”
Strings enclosed in double quotes may contain string literals, such as \n for the newline character, \t for a tab character, or \e for an escape character. String literals are alphanumeric (and only alphanumeric) characters preceded by a backslash. They may be represented in decimal, octal, or hexadecimal, or as control characters.
Perl also supports special literals for representing the current script name, the line number of the current script, and the logical end of the current script.
Since you will be using literals with the print and printf functions, let’s see what these literals look like.
You can represent literal numbers as positive or negative integers in decimal, octal, or hexadecimal (see Table 4.2). You can represent floats in floating-point notation or scientific notation. Octal numbers contain a leading 0 (zero), hex numbers a leading 0x (zero and x), and numbers represented in scientific notation contain a trailing E, followed by a negative or positive number representing the exponent.
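The numeric notations described above can be tried out with a short sketch such as the following; the variable names and values are chosen here purely for illustration.

```perl
use strict;
use warnings;

my $decimal = 255;       # decimal integer
my $octal   = 0377;      # leading 0 means octal (255 in decimal)
my $hex     = 0xff;      # leading 0x means hexadecimal (255 in decimal)
my $float   = 1.5;       # floating-point notation
my $sci     = 1.5E3;     # scientific notation: 1.5 times 10 to the 3rd
my $small   = 2.5E-2;    # negative exponent: 2.5 times 10 to the -2nd

# Whatever notation is used in the source, Perl stores a number,
# so all three integers print the same decimal value.
printf "%d %d %d\n", $decimal, $octal, $hex;
printf "%.1f %.1f %.3f\n", $float, $sci, $small;
```

A common slip is writing a decimal number with a leading zero, which Perl will read as octal.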
Like shell strings, Perl strings are normally one or more characters delimited by either single or double quotes; for example, “This is a literal string” and ‘so is this a literal string’. Escape sequences (single characters that, when preceded by a backslash, don’t represent themselves) are interpreted only if enclosed in double quotes (see Table 4.3); for example, “This is a literal string with an escape sequence\n”.
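A brief sketch of how escape sequences behave in double versus single quotes follows. The variable names are hypothetical, and the hexadecimal and octal examples assume an ASCII-based character set.

```perl
use strict;
use warnings;

my $tabbed = "Name\tJack";    # \t becomes a single tab character
my $raw    = 'Name\tJack';    # backslash and t stay as two characters
my $hex    = "\x41";          # hexadecimal escape for the letter A (ASCII)
my $oct    = "\101";          # octal escape for the same letter (ASCII)

print "Double quotes: $tabbed (length ", length($tabbed), ")\n";
print 'Single quotes: ', $raw, ' (length ', length($raw), ")\n";
print "Hex and octal escapes: $hex $oct\n";
```

The difference in length between the two strings shows that the escape sequence collapses to one character only inside double quotes.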
Perl’s special literals __LINE__ and __FILE__ are used as separate words and will not be interpreted if enclosed in quotes, single or double. They represent the current line number of your script and the name of the script, respectively. These special literals are equivalent to the predefined special macros used in the C language.
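As a rough illustration, the sketch below stores the two special literals in variables (names chosen here for the example) and shows that quoting one of them prevents its interpretation.

```perl
use strict;
use warnings;

# Used as separate words, the special literals are replaced by Perl.
my $line = __LINE__;    # the line number where this statement appears
my $file = __FILE__;    # the name of the current script

# Inside quotes the same text is just a string of characters.
my $in_quotes = "__LINE__";

print "This is line $line of $file\n";
print "Quoted, it stays as typed: $in_quotes\n";
```

Copying the literal into a variable first is one simple way to get its value into a double-quoted string.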
The __END__ special literal is used in scripts to represent the logical end of the file. Any trailing text following the __END__ literal will be ignored, just as if it had been commented. The control sequences for end of input in UNIX are <CTRL>+D (