Practical Extraction and Reporting Shortcuts

Ruby was influenced by the scripting language Perl, whose name is often expanded as Practical Extraction and Report Language. Because of this, Ruby includes a number of global functions that make it easy to write programs that extract information from files and generate reports. In the object-oriented paradigm, input and output functions are methods of IO, and string manipulation functions are methods of String. For pragmatic reasons, however, it is useful to have global functions that read from and write to predefined input and output streams. In addition to providing these global functions, Ruby follows Perl further and defines special behavior for them: many operate implicitly on the special method-local variable $_. This variable holds the last line read from the input stream. The underscore character is mnemonic: it looks like a line. (Most of Ruby’s global variables that use punctuation characters are inherited from Perl.) In addition to the global input and output functions, there are several global string processing functions that work like the String methods but operate implicitly on $_.

These global functions and variables are intended as shortcuts for short and simple Ruby scripts. It is generally considered bad form to rely on them in larger programs.
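The method-local nature of $_ can be seen in a minimal sketch. It assumes a StringIO standing in for a real input stream, and both helper names are hypothetical, chosen only for illustration:

```ruby
require "stringio"

# Hypothetical helper: reads one line and returns whatever $_ now holds.
def first_line_via_underscore(text)
  StringIO.new(text).gets   # gets stores the line it read in $_
  $_                        # same method, so the line is visible here
end

# Hypothetical helper: a method that has read nothing of its own.
def underscore_in_fresh_method
  $_                        # each method invocation starts with its own empty $_
end

p first_line_via_underscore("alpha\nbeta\n")
p underscore_in_fresh_method
```

The second call returns nil even though a line was read just before it, because the value stored by gets belongs to the method that did the reading.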

Input Functions

The global functions gets, readline, and readlines are just like the IO methods by the same names (see Reading lines), but they operate implicitly on the $< stream (which is also available as the constant ARGF). Like the corresponding methods of IO, gets and readline implicitly set $_.

$< behaves like an IO object, but it is not an IO object. (In Ruby 1.8, its class method returns Object; later versions report a dedicated class instead. Its to_s method returns “ARGF”.) The precise behavior of this stream is complicated. If the ARGV array is empty, then $< is the same as STDIN: the standard input stream. If ARGV is not empty, then Ruby assumes that it is a list of filenames. In this case, $< behaves as if it were reading from the concatenation of those files. That description is only an approximation, however. When the first read request for $< occurs, Ruby uses ARGV.shift to remove the first filename from ARGV. It opens and reads from that file. When the end of that file is reached, Ruby repeats the process, shifting the next filename out of ARGV and opening that file. $< does not report end-of-file until there are no more filenames in ARGV.

What this means is that your Ruby scripts can alter ARGV (to process command-line options, for example) before beginning to read from $<. Your script can also add files to ARGV as it runs, and $< will use these files.
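The interaction between ARGV and $< can be sketched as follows. The helper name and the temporary files are assumptions made for illustration; a real script would normally receive its filenames on the command line rather than placing them in ARGV itself:

```ruby
require "tempfile"

# Hypothetical helper: writes each string to its own temporary file, points
# ARGV at those files, and reads them back with the global input functions.
def read_through_argv(*contents)
  files = contents.map do |text|
    file = Tempfile.new("argf-demo")
    file.write(text)
    file.close                        # flush to disk; the file remains until unlink
    file
  end

  ARGV.replace(files.map(&:path))     # as if the paths had been command-line arguments
  first_line = gets                   # first read: Ruby shifts one filename off ARGV
  last_read  = $_                     # gets also stored that line in $_
  pending    = ARGV.size              # filenames that have not been opened yet
  rest       = readlines              # continues through the remaining files
  { first_line: first_line, last_read: last_read, pending: pending, rest: rest }
ensure
  files.each(&:unlink) if files       # remove the temporary files
end

p read_through_argv("a1\na2\n", "b1\n")
```

After the first call to gets, only the second filename is left in ARGV, and readlines carries on across the file boundary without any further work by the script.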

Deprecated Extraction Functions

In Ruby 1.8 and before, the global functions chomp, chomp!, chop, chop!, gsub, gsub!, scan, split, sub, and sub! work like the same-named methods of String, but operate implicitly on $_. Furthermore, chomp, chop, gsub, and sub assign their result back into $_, which means that they are effectively synonyms for their exclamation-mark versions.

These global functions have been removed in Ruby 1.9, so they should not be used in new code.
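Code that once relied on these functions can call the String methods on $_ explicitly instead. The following is only a sketch: the helper name, the comma-separated input, and the trailing-comment convention are all invented for the example.

```ruby
require "stringio"

# Hypothetical helper: reads one line and splits it into fields, spelling out
# the operations that the 1.8 global functions applied to $_ implicitly.
def fields_from_next_line(io)
  return nil unless io.gets          # gets sets $_; nil signals end of input
  $_.chomp!                          # was: chomp!
  $_ = $_.sub(/\s+#.*\z/, "")        # was: sub(/\s+#.*\z/, "")
  $_.split(",")                      # was: split(",")
end

p fields_from_next_line(StringIO.new("a,b,c # note\n"))   # prints ["a", "b", "c"]
```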

Reporting Functions

Kernel defines a number of global functions for sending output to $stdout. (This global variable initially refers to the standard output stream, STDOUT, of the Ruby process, but you can alter its value and change the behavior of the functions described here.)

puts, print, printf, and putc are equivalent to the same-named methods of STDOUT (see Writing to a Stream). Recall that puts appends a newline to its output if there is not one there already. print, on the other hand, does not automatically append a newline, but it does append the output record separator $\ if that global variable has been set.

The global function p is one with no analog in the IO class. It is intended for debugging, and its short name makes it very easy to type. It calls the inspect method of each of its arguments and passes the resulting strings to puts. Recall that inspect is equivalent to to_s by default, but that some classes redefine it to provide more developer-friendly output suitable for debugging. If you require the pp library, you can use the pp function in place of p to “pretty print” your debugging output. (This is useful for printing large arrays and hashes.)
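The differences among these functions are easiest to see when their output is captured. The sketch below assumes a StringIO temporarily assigned to $stdout. Both helper names are hypothetical, and the second helper sets the output record separator, which in Ruby is the global variable $\:

```ruby
require "stringio"

# Hypothetical helper: points $stdout at a StringIO while the block runs
# and returns everything the block wrote.
def capture_output
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original                 # always restore the real stream
end

# Hypothetical helper: prints each item with the output record separator set.
def print_with_record_separator(separator, *items)
  saved = $\
  $\ = separator
  capture_output { items.each { |item| print item } }
ensure
  $\ = saved
end

STDOUT.puts capture_output { puts "one"; puts "two\n" }.inspect   # "one\ntwo\n"
STDOUT.puts capture_output { print "one"; print "two" }.inspect   # "onetwo"
STDOUT.puts print_with_record_separator("\n", "one", "two").inspect
STDOUT.puts capture_output { p "a\tb", [1, "two"] }.inspect
```

The last line shows that p writes the inspect form of each argument on its own line, so the string keeps its quotation marks and its tab appears as an escape sequence.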

The printf function mentioned earlier expects a format string as its first argument and substitutes the values of its remaining arguments into that string before outputting the result. You can also format into a string without sending the result to $stdout with the global function sprintf or its synonym format. These work like the % operator of String.
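As a brief sketch of the equivalence among sprintf, format, and the % operator, the hypothetical helper below builds the same string all three ways. The format string itself is an arbitrary example:

```ruby
# Hypothetical helper: formats a label and a count in three equivalent ways.
def formatted_three_ways(label, count)
  template = "%-5s|%3d"              # left-justified label, right-justified count
  [
    sprintf(template, label, count),
    format(template, label, count),
    template % [label, count]        # % takes an array when there are several values
  ]
end

p formatted_three_ways("ab", 7)
p sprintf("%05.1f", 3.14159)         # prints "003.1"
```

All three elements of the returned array are the same string; none of these calls writes anything to $stdout on its own.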

One-Line Script Shortcuts

Earlier in this chapter, we described the -e option to the interpreter for executing single-line Ruby scripts (often used in conjunction with the -n and -p looping options). There is one special shortcut inherited from Perl that is allowed only in scripts specified with -e.

If a script is specified with -e, and a regular expression literal appears by itself in a conditional expression (part of an if, unless, while, or until statement or modifier), then the regular expression is implicitly compared to $_. If you want to print all lines in a file that begin with the letter A, for example, you can write:

ruby -n -e 'print if /^A/' datafile

If this same script were stored in a file and run without the -e option, it would still work, but it would print a warning (even without -w). To avoid the warning, you’d have to make the comparison explicit instead:

print if $_ =~ /^A/
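For comparison, the whole one-liner can be written out in full. The -n option wraps the script in a loop of the form while gets ... end, so a longhand version looks roughly like the sketch below. The helper name is hypothetical, and it collects the matching lines from a StringIO instead of printing them:

```ruby
require "stringio"

# Hypothetical helper: the explicit equivalent of `ruby -n -e 'print if /^A/'`,
# returning the matching lines rather than printing them.
def lines_starting_with_a(io)
  matches = []
  while io.gets                      # -n supplies a loop like this one
    matches << $_ if $_ =~ /^A/      # explicit comparison against $_
  end
  matches
end

p lines_starting_with_a(StringIO.new("Apple\nbanana\nAvocado\n"))
```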