Appendix A. Ruby Quick Reference

Basic Ruby Syntax

The Ruby language is made up of expressions. Each expression returns a value. Even elements that are just statements in other languages, such as if and for, are expressions in Ruby that return values.

Ruby is a pure object-oriented language. Every variable or constant in Ruby is an object, and there are no basic non-object types. Every variable or literal responds to the basic method call syntax.

A simple Ruby expression is one of the following:

  • A literal. Ruby has literal syntax for arrays, hashes, numbers, ranges, regular expressions, strings, and symbols.

  • The name of an existing variable or constant.

  • A method call, which combines the name of an existing variable or constant with a method name. The basic form of a method call is <receiver>.<method>(<arguments>). Variants on this form will be discussed later.

  • One of several special expressions invoked by the use of a keyword such as if, case, or while.

Complex expressions can be built using Ruby operators. Variable assignment using = is considered to be a type of operator. Most expressions can also have arbitrarily complex expressions within them — for example, the arguments of a method call are all themselves expressions.

A Ruby expression ends with a line break unless the Ruby interpreter has a reason to believe the expression is intended to continue. The expression continues if there is an open delimiter such as a quotation mark, parenthesis, bracket, or brace. The expression also continues if the last character in the line is a comma or operator. A complex expression such as an if statement is usually expected to cross over multiple lines. An expression can be forced to continue to the next line by ending a line with a backslash character.

Multiple expressions can be placed on the same line by separating the expressions with a semicolon.
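
The rules above can be seen in a short sketch. The variable names and sample values here are hypothetical and chosen only for illustration:

```ruby
# An if expression returns a value, so it can sit on the right of an assignment.
temperature = 30 # hypothetical sample value
label = if temperature > 25 then "warm" else "cool" end

# A trailing operator tells the interpreter that the expression continues.
total = 1 +
        2 +
        3

# A backslash forces continuation where the line could otherwise have ended.
message = "total is " \
          "#{total}"

# Semicolons separate several expressions on one line.
a = 1; b = 2; sum = a + b
```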

Literal expressions

Ruby has several different mechanisms to create literal objects. In addition to the expected literals for numbers and strings, Ruby literals can also create arrays, Booleans, hashes, ranges, regular expressions, and symbols.

Arrays

A literal array is created by enclosing the elements in brackets. Inside the brackets, the elements are separated by commas. Each element can be an arbitrarily complex Ruby expression. The result is an instance of the class Array. For example, look at the array defined here:

x = [1, "hello", fred]

It contains three elements: a number, a string, and a variable.

If the elements of the array are all strings, Ruby provides a shortcut syntax, using the notation %w followed by an arbitrary delimiter. Elements in the array are separated by a space, and the strings are interpreted as literals. A backslash character can be used to insert a space inside an element, rather than treat the space as a delimiter.

>> %w{zot jenny max butch peabody}
=> ["zot", "jenny", "max", "butch", "peabody"]
>> %w(zot jenny max butch peabody arthur\ dekker)
=> ["zot", "jenny", "max", "butch", "peabody", "arthur dekker"]

A second form, with a capital %W, allows for string interpolation rules to be obeyed inside the array — it's the equivalent of a double-quoted string. See the Strings section for full details.

>> %W(a b #{1 + 1} c)
=> ["a", "b", "2", "c"]

Adding elements to the end of an array is managed with the push method or the << operator:

[1, 2, 3].push(4)
[1, 2, 3] << 4

Removing the last element from the array is managed with the method pop. The methods shift and unshift provide similar functionality for the beginning of the array.
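
As a brief sketch of how these four methods and the << operator behave together (the variable names are hypothetical):

```ruby
stack = [1, 2, 3]
stack.push(4)        # append at the end
stack << 5           # same effect using the operator
last = stack.pop     # removes and returns the last element
first = stack.shift  # removes and returns the first element
stack.unshift(0)     # inserts at the beginning
```

Note that pop and shift return the removed element, while push, << and unshift return the array itself.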

An arbitrary element in the array can be accessed using index lookup. The first element of the array is at index 0. A positive integer indicates the index from the start of the array, returning nil if the index is past the end of the array. The index -1 is the last element of the array; other negative integers are counted from the end of the array.

The expression inside the brackets can be a Ruby range, in which case the sub-array corresponding to the indexes in the range is returned. Less commonly, the index expression can be two integers separated by a comma [index, length], which returns a sub-array starting at the index for the given length.

>> [1, 2, 3][0]    => 1
>> [1, 2, 3][-1]   => 3
>> [1, 2, 3][0..1] => [1, 2]
>> [1, 2, 3][0, 1] => [1]

Arrays provide a number of different methods that allow enumeration over the contents of the array. Many of these are provided by the Enumerable module. The methods include each, which iterates over the contents of the array, map, which applies a block to each element of the array and returns the resulting list, and select, which applies a block to each element of the array and returns those elements for which the result is true.
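
A minimal sketch of the three methods just described, using a hypothetical array of integers:

```ruby
numbers = [1, 2, 3, 4, 5]

# each iterates for its side effects.
sum = 0
numbers.each { |n| sum += n }

# map returns a new array holding the block's result for every element.
squares = numbers.map { |n| n * n }

# select keeps only the elements for which the block returns a true value.
evens = numbers.select { |n| n.even? }
```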

Boolean literals

Ruby defines the special variables true and false, which correspond to the expected Boolean values. They are the sole instances of the classes TrueClass and FalseClass. Ruby also defines the special value nil, which is the sole instance of the class NilClass.

The value nil also evaluates to false when evaluated in a Boolean expression. Unlike other scripting languages, no other values in Ruby are treated as false. All values other than false and nil are considered to be logically true.
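
The following sketch illustrates the rule with values that are false in some other languages. The helper method truthy? is hypothetical and exists only for this example:

```ruby
# Hypothetical helper: reports how Ruby treats a value in a Boolean context.
def truthy?(value)
  value ? true : false
end

# Zero, the empty string, and the empty array are all logically true in Ruby.
results = [0, "", [], nil, false].map { |value| truthy?(value) }
```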

Hashes

A hash literal contains a series of key/value pairs inside braces. The key and value are separated by => and each pair is separated by a comma. The key and value can be arbitrary Ruby expressions.

>> {:a => 1, "b" => "fred"}
=> {:a=>1, "b"=>"fred"}

In Ruby 1.8, the => sequence can be replaced by a comma; this should only be done if you don't want anybody to read your code, and the form was removed in Ruby 1.9. The created object is an instance of the class Hash. Ideally, a hash key is an immutable object, such as a symbol or number.

Hash elements are accessed and set through bracket index lookup: hash[key]. The methods keys and values return a list of the appropriate elements, and the method has_key? returns true if the key is in the hash.
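
A short sketch of lookup and assignment, using a hypothetical hash of names and ages:

```ruby
ages = { :fred => 42, :barney => 39 }

ages[:wilma] = 40        # set a value through bracket assignment
fred_age = ages[:fred]   # look up an existing key
missing = ages[:dino]    # a key that is not present yields nil

names = ages.keys        # list of keys
numbers = ages.values    # list of values
known = ages.has_key?(:wilma)
```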

Numbers

Ruby's number literals are straightforward. An integer is any sequence of digits. A sign character can be the first character. Integer literals can be written to other bases by starting the numbers with a leading 0 for octal, a 0x for hexadecimal, and a 0b for binary. Decimal numbers can also be indicated with 0d. Underscore characters are ignored in integer literals and therefore are often used as group separators. Integer literals create instances of the class Fixnum, unless the literal is outside the range of Fixnum, in which case the literal is of type Bignum. (Ruby 2.4 and later unify both classes into Integer.)

>> 100     => 100
>> -100    => -100
>> 0100    => 64
>> 0x100   => 256
>> 0b100   => 4
>> 987_123 => 987123

Floating point literals are a sequence of digits containing a decimal point. There must be at least one digit on either side of the decimal point, or else you get a syntax error. A floating point literal is converted to an instance of the class Float.

>> 1.3   => 1.3
>> 1.3e2 => 130.0

Ranges

A Ruby range literal can be indicated in one of two ways. The range consists of two expressions separated by either two or three dots. The two-dot version creates a range that includes the value in the second expression, while the three-dot version excludes the final value.

Ranges are normally used as compact storage for a long sequence of consecutive values, which can be iterated over and converted to arrays. The following example shows the difference between the two- and three-dot versions by showing the difference in the array that is created from the range.

>> x = 1..10  => 1..10
>> x.to_a     => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> y = 1...10 => 1...10
>> y.to_a     => [1, 2, 3, 4, 5, 6, 7, 8, 9]

The expressions that make up the two ends of the range can be arbitrarily complex. The range ends are most commonly integers, but any class that implements the method succ can act as a range boundary.

Note

Dates and strings are also often used as range boundaries.
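
As a small sketch of non-integer boundaries and of the two-dot and three-dot forms (the variable names are hypothetical):

```ruby
letters = ("a".."e").to_a      # strings implement succ, so they can bound a range
exclusive = ("a"..."e").to_a   # the three-dot form leaves out the final value

includes_end = (1..10).include?(10)
excludes_end = (1...10).include?(10)
```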

Regular expressions

A regular expression literal can be created by placing a regular expression pattern inside a pair of forward slashes, as follows:

/ab*d/

Ruby also offers an arbitrary delimiter marker for regular expressions, %r, which is often used if the expression pattern itself contains a lot of slashes.

%r{ab*d}

Regular expression literals are converted to instances of the class Regexp. Regular expressions are used to perform complex pattern matches against strings. In Ruby, these matches are performed using either the operator =~ or the method Regexp.match. Most characters in the regular expression match against the same character in the test string; however, several special character forms augment the basic behavior.

After the ending delimiter of the literal, one or more characters can be used to indicate optional regular expression behavior. The three most common options are as follows:

  1. i The pattern match is case insensitive.

  2. m The pattern match is assumed to encompass multiple lines. Practically, this means that the dot special character matches newline characters.

  3. x Literal white space in the pattern is ignored, allowing you to use spacing to make the expression easier to read. Spaces in the pattern can be included by using a special pattern such as \s.
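
The three options can be compared side by side in a short sketch. The sample strings are hypothetical; each variable holds the match position returned by =~, or nil when there is no match:

```ruby
# i: letter case is ignored.
insensitive = /ruby/i =~ "I like RUBY"

# m: the dot also matches a newline character.
with_m = /a.b/m =~ "a\nb"
without_m = /a.b/ =~ "a\nb"

# x: literal white space in the pattern is ignored,
# so the spaces around the hyphen below are not part of the match.
extended = /\d{3} - \d{4}/x =~ "555-1234"
explicit_space = /\d{3} \s \d{4}/x =~ "555 1234"
```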

Within a regular expression literal, the following 14 characters have special meaning:

( ) [ ] { } . * ? | + \ $ ^

To include one of those characters literally in the pattern, it must be escaped using a backslash, as in \* or \\. Each of the delimiter pairs indicates something different within a regular expression.

Parentheses have their normal function of grouping elements to indicate the scope of operators. For example, /face*/ matches the string faceeee, while /(face)*/ matches the string facefaceface. In addition, parentheses cause a portion of the matched string to be saved for use either within the regular expression pattern or after the match is complete. The groups can be referred to using special variables of the form \1 and \2 during the pattern, and $1 and $2 after the match, as in the following example:

>> /(.*)c(.*)/ =~ "abcde" => 0
>> $1 => "ab"
>> $2 => "de"

Variables are numbered based on the position of the opening parenthesis — in the pattern /((.*)c)(.*)/, the variable $1 includes c.

Brackets are used to mark a set of characters that can match the string at that point, so /[abcd]/ matches a, b, c, or d. That pattern could also be written /[a-d]/; the hyphen indicates an inclusive sequence of characters.

Multiple sequences can be in one set, and so /[a-zA-Z]/ matches any upper- or lowercase letter. If the set starts with a ^, then the pattern matches only characters that are not in the set, so /[^a-zA-Z]/ matches non-alphabetic characters. Also, within brackets, all special characters, except the right bracket and hyphen, can appear without being escaped.

Braces are used to indicate the number of times a sub-pattern must exist in the match. The basic form is /[a-z]{3}/, which would indicate that exactly three characters must be in the matching string. A range can be indicated with {3,5}, and a minimum value can be indicated with the form {3,}. Braces are also used as part of the #{expr} interpolation, which is the same in regular expressions as it is in double-quoted strings.

There are three shortcuts for indicating commonly used ranges for a match. The character * means "zero or more," and so the regex /a[a-z]*/ matches any string that contains an a followed by zero or more lowercase letters. The character + means "one or more" and the character ? means "zero or one."

By default, the * and + characters are "greedy," meaning that they match as much of the string as they can. This can be an issue if there is more than one potential stopping place for the sub-pattern. You can change the default behavior by putting a ? after the * or +. In the first example following, the [a-z]* pattern matches past the first b and stops at the last possible point, before the second b. In the second, non-greedy example, the pattern stops at the first point, before the first b.

>> /a([a-z]*)b/ =~ "aabceb" => 0
>> $1 => "abce"
>> /a([a-z]*?)b/ =~ "aabceb" => 0
>> $1 => "a"

There are several shortcuts to denote common character sequences. The special character ^ matches the beginning of a line, while $ matches the end of a line; for a string without embedded newlines, these are the beginning and end of the string. To anchor a pattern to the whole string regardless of newlines, use \A and \z. The variant \Z matches at the end of the string or just before a trailing newline character. The sequence \b matches a word boundary; \d matches any digit and is the same as [0-9].

The sequence \s matches the white space characters space, tab, newline, carriage return, and form feed; it is the same as [ \t\n\r\f]. The sequence \w matches "word" characters, meaning [a-zA-Z0-9_]. All four of these sequences are negated by replacing the lowercase character with an uppercase one, and so \D matches any non-digit, or [^0-9].

Finally, the pipe character indicates a logical or, and so /a|b/ matches a or b.
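
A few of the quantifiers and shortcut sequences described in this section can be tried out as follows. This is only a sketch; the sample strings and variable names are hypothetical:

```ruby
sample = "Order 66 shipped"

digits = sample[/\d+/]                # first run of one or more digits
first_word = sample[/\A\w+/]          # word characters anchored at the start
words = sample.split(/\s+/)           # split on runs of white space
only_digits = "a1b2".gsub(/\D/, "")   # remove every non-digit
bounded = "aaaa"[/a{2,3}/]            # between two and three repetitions
either = "cab" =~ /a|b/               # position of the first a or b
```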

Strings

There are several different ways to write literal strings in Ruby. The simplest is to use single quotation marks. Within a single-quoted string literal, only two escape sequences are recognized: \' inserts a literal single quote, and \\ inserts a literal backslash.

>> 'hello' => "hello"
>> 'isn\'t' => "isn't"
>> 'nip\tuck' => "nip\\tuck"
>> 'nip\\tuck' => "nip\\tuck"
>> 'nip\\tuck'.size => 8

In a single-quoted string, backslashes that are not used as an escape are treated as literals but are shown in escaped form in the irb output, so the third and fourth examples evaluate to the same result. In the final example, the size method shows that the resulting string has only one backslash, even though the irb output displays it as an escape sequence.

With a double-quoted string literal, the full complement of escape sequences, such as \n for newline and \t for tab, is substituted (backslashes that are not part of an escape sequence are ignored). In addition, the sequence #{expr} is replaced in the string by the value of the expression; the value is converted to a string if needed. For example,

>> "ab#{1 + 1}c" => "ab2c"

There are generic delimiter forms for both string forms, %q for single-quote and %Q for double-quote. The double-quote form can also be written as just %, as in the following example:

>> %q(ab#{1 + 1}c) => "ab#{1 + 1}c"
>> %Q(ab#{1 + 1}c) => "ab2c"
>> %(ab#{1 + 1}c) => "ab2c"

Ruby also supports Perl-style here docs. A here doc starts with << and an identifier, and continues over multiple lines until the identifier is reached.

<<DOC
all this is
in the here
doc string
DOC

If there is a hyphen before the identifier, then the closing identifier can be indented:

<<-DOC
          this can be indented
          DOC

Otherwise, it must start at the beginning of a line. By default, the string is interpreted according to double-quote rules; however, if the identifier is encased in quotation marks, then the here doc is interpreted according to the style of quotation marks used.
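
The three here doc variants can be compared in a short sketch. The variable names and the DOC identifier are arbitrary choices for this example:

```ruby
name = "Ruby"

# Default form: interpreted with double-quote rules, so #{...} is expanded.
interpolated = <<DOC
Hello, #{name}
DOC

# Single-quoted identifier: interpreted with single-quote rules.
literal = <<'DOC'
Hello, #{name}
DOC

# Hyphen form: the closing identifier may be indented.
# The leading spaces of the body lines are still part of the string.
indented = <<-DOC
  closing marker may be indented
  DOC
```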

Symbols

A symbol in Ruby is an immutable string, similar to an interned string in other languages. In addition, all instances of a symbol with the same value are guaranteed to point to the same internal object. Symbols are used by Ruby as the internal representation of method and variable names. Because they are immutable, they are commonly used as hash keys.

A symbol is formed by a colon, followed by a name or string literal. The string literal is interpreted according to the normal string rules. It does not matter how the symbol is constructed in determining its value; all three of the following are the same symbol:

:person2person
:'person2person'
:"person#{1+1}person"

One important note about symbols: in Ruby 1.8, the Symbol class does not implement the <=> operator, meaning that a list of symbols cannot be sorted using the sort method alone. Ruby 1.9 and later define <=> for symbols, so sort works on them directly.
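
A minimal sketch of symbol identity and of one way to sort symbols that does not depend on Symbol implementing <=>, namely sorting by the string form. The sample symbols are hypothetical:

```ruby
# However a symbol is written, equal names refer to the same object.
same_object = :person2person.equal?(:"person#{1 + 1}person")

# sort_by compares the strings returned by the block, not the symbols.
sorted = [:pear, :apple, :fig].sort_by { |sym| sym.to_s }
```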

Variable and method names

Ruby variable names are typical of identifiers and consist of lowercase letters, uppercase letters, digits, and the underscore character, with the following restrictions:

  • A local variable name must begin with either a lowercase letter or (much more rarely) an underscore. By convention, local variables use underscores to separate words (as in this_is_a_name), rather than interCaps.

  • An instance variable for an object starts with an @ sign, as in @thingy. By convention, instance variables are lowercase and use underscores to separate words.

  • Class variables for objects start with @@, and are otherwise identical to instance variables.

  • Constant values must start with a capital letter. By convention, constants that have normal object values are in all capitals with underscores to separate words. The capitalized initial letter is a marker to Ruby that the value is constant.

  • Class and module names are a special case of constant values. Class and module names must also begin with a capital letter. By convention, class and module names are mixed case.

  • Ruby has global variables, which begin with $. Normally, global variables are rarely used within a Ruby program; however, there are several standard global variables, of which perhaps the most commonly used are $1, $2, and so on, which contain regular expression matches.

Ruby method names come in two forms. The main form is similar to a local variable, starting with a lowercase letter or underscore, and conventionally having underscores separating words. Unlike a local variable, a method name may end with a question mark (?) or exclamation point (!).

By convention, a method ending with a question mark, such as nil?, returns a Boolean value. An exclamation point is usually used to indicate a variant of a method that changes the receiving object in place, rather than returning a changed copy, for example sort and sort!. An exclamation point is sometimes used more generally to indicate any method that makes a destructive change on the receiving object.

Method names can also end with an equals sign (=). This is interpreted to mean that the method is a setter method and is called as the left side of an assignment statement, rather than through a normal method call. So, a method defined as def full_name= would be called in a line of code as follows:

obj.full_name = "Scott McCloud"

It is possible for a method and a local variable with the same name to co-exist in the same scope. Under normal circumstances, Ruby does a fine job of resolving ambiguity. The most common confusing case is something similar to the previous example, but without an explicit receiver, as follows:

full_name = "Perry Mason"

Ruby interprets this as an assignment to a new local variable called full_name. However, you might want it to call the setter method full_name= for the current value of self. In this one case, you must include the self value explicitly to invoke the setter, as follows:

self.full_name = "Perry Mason"
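
The difference can be demonstrated with a small sketch. The Person class and its rename methods are hypothetical and exist only to show the two cases:

```ruby
class Person
  attr_reader :full_name

  # Setter method: invoked by an assignment with an explicit receiver.
  def full_name=(value)
    @full_name = value.strip
  end

  def rename_without_self(value)
    full_name = value   # creates a local variable; the setter is not called
    full_name
  end

  def rename_with_self(value)
    self.full_name = value   # explicit self invokes the setter
  end
end

person = Person.new
person.full_name = "  Scott McCloud "
person.rename_without_self("Perry Mason")
after_local = person.full_name
person.rename_with_self("Perry Mason")
after_setter = person.full_name
```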

Method names can also be the symbols of one of the operators that is listed in the following section as capable of being overridden. The method declaration is just the symbol for the operator, as follows:

def +(other)

Therefore, the following line

a + b

calls the + method for object a if it exists (if the method doesn't exist, you get an error). Many of the operators are defined at the Object level, and are valid for all Ruby objects.
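
As an illustration, here is a sketch of a class that defines its own + method. The Point class is hypothetical:

```ruby
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Called for the expression point + other.
  def +(other)
    Point.new(x + other.x, y + other.y)
  end
end

sum = Point.new(1, 2) + Point.new(3, 4)

# An object whose class does not define + raises NoMethodError.
error_raised = begin
  Object.new + 1
  false
rescue NoMethodError
  true
end
```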

Operators

Ruby operators include the typical set, as well as a few Ruby-specific ones. Following is the list, from highest to lowest priority. However, if you are depending on the details of the priority list in your code, you're probably writing hard-to-read code. Throw in a couple of parentheses.

Table A.1 shows the Ruby operators and their meanings in commonly used classes. All elements in the same table row have the same priority. Unless otherwise indicated, the operators can be overridden as Ruby methods using the same symbol.

Table A.1. Ruby Operators and Their Meanings

Operator    Definition
::          Module and class scope resolution. This operator cannot be overridden.
[]          Array or hash element lookup. Overridden by Dir to indicate file globbing, as in Dir["file.*"].
[]=         Array or hash element assignment.
**          Raising to a power. For example, 2 ** 3 = 8.
~           Bitwise complement for numbers. Pattern negation for regular expressions and strings.
!           Logical negation.
+           Unary plus. When overriding this method, use the method name +@, as in def +@.
-           Unary minus. When overriding this method, use the method name -@, as in def -@.
*           Multiplication. Overridden for arrays and strings to indicate repetition; for example, [a, b] * 3 = [a, b, a, b, a, b] and "fred" * 2 = "fredfred".
/           Division. If both operands are integers, then so is the result.
%           Modulus. Overridden by strings to provide sprintf formatting.
+           Binary addition. For arrays and strings indicates concatenation.
-           Binary subtraction. For arrays, implements set difference.
<<          Bitwise left shift. Left shift is overridden by Array to implement push, by String to implement append, and by IO objects to indicate writing to the output.
>>          Bitwise right shift.
&           Logical and for Booleans; bitwise and for integers. For arrays, overridden to mean set intersection.
^           Exclusive logical or for Booleans; exclusive bitwise or for integers.
|           Inclusive logical or for Booleans; inclusive bitwise or for integers. For arrays indicates set union.
>           Greater than. For most objects, all the comparison methods are defined in terms of <=> by mixing in the Comparable module.
>=          Greater than or equal to. For most objects, all the comparison methods are defined in terms of <=> by mixing in the Comparable module.
<           Less than. For most objects, all the comparison methods are defined in terms of <=> by mixing in the Comparable module.
<=          Less than or equal to. For most objects, all the comparison methods are defined in terms of <=> by mixing in the Comparable module.
<=>         Comparison operator. Returns -1, 0, or 1 depending on the relationship between the operands.
==          Equal.
!=          Not equal. Implemented in terms of ==. May not be overridden as a method.
===         Equal for purposes of case statement.
=~          Pattern match.
!~          Not pattern match. Implemented in terms of =~; may not be overridden.
&&          Logical and. May not be overridden as a method.
||          Logical or. May not be overridden.
..          Inclusive range operator. Cannot be overridden.
...         Exclusive range operator. Cannot be overridden.
?:          Ternary operator. May not be overridden.
=           Assignment. Variant forms are /=, *=, %=, +=, -=, |=, &=, ||=, &&=, **=, >>=, <<=. None of these may be overridden as such, but the variant forms are defined in terms of their other operators.
defined?    Returns a true value (a string describing the expression) if the expression is defined, and nil otherwise, as in defined? x.
not         Logical not. May not be overridden.
and, or     Logical operators. May not be overridden.

Method calls

The basic form of a Ruby method call is as follows:

receiver.method

This indicates that the method is called on the receiver. See the "Objects and Classes" section for a discussion of how the class structure is searched to find the method.

There are a number of optional elements in the method call. The receiver can be omitted, in which case self is assumed for the current context. As a matter of convention, self is only explicitly used as a receiver where it is necessary to avoid ambiguity.

Arguments can be passed to the method; they are placed in a comma-delimited list.

receiver.method(arg1, arg2)

The parentheses may be omitted if doing so does not introduce ambiguity. By convention, empty pairs of parentheses are always omitted.

There are a few special argument forms; the calling forms are discussed here. The "Defining Methods" section will discuss their meaning when responding to a call with special arguments. After the explicit arguments, a series of key/value pairs can be added:

receiver.method(arg1, arg2, key1 => val1, key2 => val2)

The key/value pairs are merged into a single hash object before being passed to the receiver.

An array can also be added with the * syntax.

x = [3, 4]
receiver.method(arg1, arg2, *x)

In this case, the array is unrolled and the elements of the array are passed to the receiver as individual arguments — the receiver in this case would get four arguments. Technically, the array rollup can only appear after the key/value pairs; however, it's almost unheard of for a method to have both.
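
Both calling forms can be observed from the receiving side with a short sketch. The two methods here are hypothetical helpers that simply report what they were given:

```ruby
# Hypothetical method: returns the trailing argument so it can be inspected.
def last_argument(first, second, options = {})
  options
end

# Hypothetical method: reports how many individual arguments arrived.
def count_arguments(*args)
  args.size
end

# Trailing key/value pairs arrive as a single Hash.
merged = last_argument(1, 2, :color => "red", :size => 3)

# The * prefix unrolls the array into separate arguments.
x = [3, 4]
count = count_arguments(1, 2, *x)
```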

The final optional argument to the method is a block, which can be defined in three different ways. First, the block can be an instance of the class Proc, in which case the argument must be preceded with an ampersand:

receiver.method(arg1, arg2, &proc)

The second and third ways define the block outside the argument list using either of the following syntax forms:

receiver.method(arg1, arg2) {|block_arg| ...}
receiver.method(arg1, arg2) do |block_arg| ... end

The technical difference between the braces and the do/end syntax is that the braces have higher priority. However, that is unlikely to be an issue in typical Ruby code. By convention, the do/end form is used for multiline blocks, and the braces are used for single-line blocks.

Note

You will occasionally see the convention that braces are used in any case where you intend to chain the return value of the method call with the block, regardless of how many lines the block takes.

See the "Defining Methods" section for information on how these argument types are defined and used within objects.
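
The three ways of supplying a block can be compared in a brief sketch, using map as the receiving method. The doubler name is hypothetical:

```ruby
# A Proc instance passed with an ampersand.
doubler = Proc.new { |n| n * 2 }
with_proc = [1, 2, 3].map(&doubler)

# Braces, conventionally used for single-line blocks.
with_braces = [1, 2, 3].map { |n| n * 2 }

# do/end, conventionally used for multiline blocks.
with_do_end = [1, 2, 3].map do |n|
  n * 2
end
```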

Special keyword expressions

Much of Ruby's control flow is managed by special expressions based on keywords. This section will discuss them.

The if expression

An if expression allows for conditional evaluation. The most basic form of the expression is as follows:

if <boolean expression>
  <body...>
end

If the Boolean expression evaluates to true, then the body expressions are evaluated.

An if expression can be written on a single line, in which case the keyword then must separate the Boolean expression and the body.

if <boolean expression> then <body...> end

The if expression takes an optional else clause, which is evaluated if the Boolean expression evaluates to false.

if <boolean expression>
  <body...>
else
  <else body...>
end

If there are multiple else clauses, then the keyword for separating them is elsif. The body corresponding to the first Boolean expression to return true is executed. If none of the Boolean expressions return true, then the else clause is evaluated if one has been included.

if <boolean expression>
  <body...>
elsif <another boolean>
  <another body>
elsif <yet again>
  <yet again body>
else
  <else body>
end

The if expression as a whole returns the value of the final expression of the evaluated block. This means that an if expression is commonly used as a more-readable replacement for the ternary operator:

result = if <boolean> then <true exp> else <false exp> end

An if expression can also be used after another expression, in which case the expression is only evaluated when the if expression is true. In the following line of code, the expression is completely skipped if the Boolean is false.

<expression> if <boolean>

This version is commonly used as a guard clause at the top of a method, as follows:

return if foo.nil?
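
Several of these forms are combined in the following sketch. The classify method is hypothetical; it relies on the if expression returning the value of the branch that was evaluated:

```ruby
def classify(number)
  return "nothing" if number.nil?   # guard clause in modifier form

  # The value of the if expression becomes the method's return value.
  if number < 0
    "negative"
  elsif number == 0
    "zero"
  else
    "positive"
  end
end

# The one-line form with then, used on the right side of an assignment.
parity = if 7.odd? then "odd" else "even" end
```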

The unless expression

Ruby provides an unless expression as a shortcut for if not. The unless expression is the exact opposite of an if; the body is only evaluated if the Boolean expression is false.

unless <boolean expression>
  <body>
end

In normal usage, the unless expression is preferred to a simple if not. The unless expression allows for an optional else statement, but that usage is not recommended. There is no unless equivalent to an elsif clause.

The unless expression also has a modifier form, which evaluates its attached expression only when the Boolean is false.

<expression> unless <boolean>

The case expression

Ruby has a very flexible case statement. The basic form is as follows:

case <value expression>
when <expression1>
  <expression1 body>
when <expression2>
  <expression2 body>
end

The value expression is evaluated and compared to each when expression. The first when expression to be case-equal to the value expression has its body executed. No other clauses are executed. (Note the indentation style — the when clauses are at the same indent level as the case line.) The when clause and the body can be on the same line, but they must be separated by the keyword then.

The equality test for a case expression is unusual. The special operator === is used to evaluate whether two clauses are equivalent for the purposes of a case statement. For most objects, the === operator is equivalent to the == operator, but there are three standard classes that override === in an interesting way.

The Range class overrides === to match any number in the range, allowing you to write the following:

case bowling_score
when 0..100 then "Not very good"
when 101..200 then "Good"
when 201..299 then "Very Good"
when 300 then "Perfect"
end

The Regexp class overrides === to perform a string match, allowing you to write the following:

case name
when /\A[A-M]/ then "In first half of alphabet"
when /\A[N-Z]/ then "In last half of alphabet"
end

Somewhat less interestingly, Module and Class override the === operator to indicate that the value class is either equal to or a subclass of the class in the when clause.

As with the if and unless expressions, the value of the case expression is the value of the last expression evaluated in the chosen clause.

Ruby allows you to place an else clause at the end of the case expression, which is evaluated if none of the other clauses match.

case name
when "fred" then "Hi Fred"
when "barney" then "Hi Barney"
else "Hi"
end

You can also place multiple matching expressions in a single when clause, separated by a comma. The associated expressions are evaluated if any of the expressions in the list match the initial value:

when "fred", "barney" then "Hi Guy"
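
The different kinds of when clause described so far can be mixed in one case expression, as in this sketch. The describe method and its sample categories are hypothetical:

```ruby
def describe(value)
  case value
  when Integer then "integer"           # a class matches instances of it
  when 0.0..1.0 then "unit fraction"    # a range matches values inside it
  when /\A[A-M]/ then "early name"      # a regexp matches strings
  when "zed", "zoe" then "late name"    # any expression in the list may match
  else "unknown"
  end
end
```

Because the clauses are tried in order and only the first match is used, more specific tests should come before more general ones.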

Finally, you can also have a case statement without an initial value in the case clause. In this situation, each when clause contains one or more complete Boolean expressions; if any of them are true, then the associated expressions are evaluated.

case
when obj.nil?, obj.size > 3 then "do something"
when obj.size == 5 then "do something else"
else "shrug your shoulders"
end

The for expression

Ruby has a basic for loop expression.

for <loop variable> in <enumerable exp>
  <body expressions>
end

The loop variable is any valid local variable name. The expression must evaluate to an object that responds to the method each, which includes an array or any Ruby Enumerable. The body of the loop is evaluated once for each element in the enumerable expression, with the loop variable being set to each element in turn. This is almost exactly equivalent to the each method with a block (except for one minor detail: local variables created inside the body of the for expression are available outside it, while local variables created in an each block are not). In normal Ruby practice, calling the each method is preferred.
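
The scoping difference between for and each can be checked with defined?, as in this sketch. The variable names are hypothetical:

```ruby
for item in [1, 2, 3]
  seen_in_for = item
end
# Variables created inside the for body are still visible here.
for_visible = defined?(seen_in_for) ? true : false

[1, 2, 3].each do |element|
  seen_in_each = element
end
# Variables created inside the block are local to the block.
each_visible = defined?(seen_in_each) ? true : false
```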

If the elements of the enumerable expression are themselves enumerable, they can all be assigned separately in the declaration of the for expression:

for x, y in [[1, 2], [3, 4]]
  p "(#{x}, #{y})"
end

The entire for expression can be placed on a single line, in which case the list is separated from the body by the keyword do.

for x in [1, 2, 3] do p x ** 2 end

The while expression

Ruby's while expression is extremely simple.

while <boolean expression>
  <body>
end

The Boolean expression is evaluated first; if it is true, then the body is evaluated. The body continues to be evaluated until the Boolean expression is false at the end of a loop (or until the loop is exited through a loop control keyword).

The entire expression can be placed on a single line, in which case the Boolean expression and the body must be separated by the keyword do.

while obj.incomplete? do obj.task end

The while expression also has a modifier version that comes at the end of an expression. The preceding expression is evaluated repeatedly until the Boolean expression is false.

obj.task while obj.incomplete?

If the preceding expression is a block denoted by a begin/end pair, then the block is always evaluated at least once, regardless of the value of the Boolean expression:

begin
  obj.task1
  obj.task2
end while obj.incomplete?

The return value for a normally exited while expression is nil.
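The difference between the plain modifier form and the begin/end form can be seen with a condition that is false from the start. This is a small sketch with invented method names.

```ruby
# Hypothetical helper names, used only for this illustration.
def plain_modifier_count
  count = 0
  count += 1 while false   # condition is checked first, so the body never runs
  count
end

def begin_end_modifier_count
  count = 0
  begin
    count += 1
  end while false          # body runs once before the condition is checked
  count
end

p plain_modifier_count       # prints 0
p begin_end_modifier_count   # prints 1
```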

The until expression

Ruby offers the until expression, which is the exact opposite of the while expression, looping over the body of the loop as long as the Boolean expression is false:

until <boolean-expression>
  <body>
end

The single line and modifier versions of the until expression have the same syntax as the while versions.

Loop control keywords

Any Ruby loop can be controlled from within the loop with one of the following four keywords: break, next, redo, and retry. These keywords work within for loops, while loops, and until loops, as well as Enumerable each loops and their variants.

The keyword break ends the loop at the point it is evaluated. The break keyword may take an optional argument. The value of that argument is returned as the value of the loop (for a block-based loop such as each, as the value of the each call itself), allowing you to distinguish an exit that was the result of a break from a normal exit.
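As a sketch of how a break value distinguishes the two kinds of exit, the hypothetical method below returns the value of its while loop: the break argument if a match is found, or nil if the loop ends normally.

```ruby
# first_multiple_of_seven is a hypothetical name for this illustration.
def first_multiple_of_seven(limit)
  n = 1
  # The while expression is the last expression in the method,
  # so its value becomes the method's return value.
  while n <= limit
    break n if n % 7 == 0
    n += 1
  end
end

p first_multiple_of_seven(20)   # prints 7
p first_multiple_of_seven(5)    # prints nil
```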

The keyword next causes the next iteration of the loop to start immediately. The keyword redo causes the current iteration of the loop to start again from the top of the loop. In Ruby 1.8, the keyword retry starts the entire loop over, returning the Boolean or list expression to its initial value; from Ruby 1.9 onward, retry is only valid inside a rescue clause.

Assignment

There are a couple of nuances to Ruby assignment that you should know. The most basic form of assignment has a variable name on the left and an expression on the right.

score = 27

From that point, the name takes the value of the right-hand expression. Technically, it takes a reference to the object that is the result of the right-hand expression.

You can make multiple assignments in the same line:

score, location = 27, "Soldier Field"

The right side can also be an array with the same meaning — in fact, the previous form is converted internally into the following form:

score, location = [27, "Soldier Field"]

This can also include variable swapping:

x, y = y, x

If the two sides are unbalanced, extra names on the left side are set to nil, and extra values on the right side are ignored. The last name on the left can have an asterisk preceding it, in which case it behaves like it would in a method argument list and takes any and all extra values on the right side as an array. The last value on the right side can also be prefixed with an asterisk, in which case it behaves like a method call and unrolls its values out of the array to be assigned one by one.
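The unbalanced cases can be sketched as follows. The method name assignment_demo and the hash keys are invented for illustration.

```ruby
# assignment_demo is a hypothetical name for this illustration.
def assignment_demo
  a, b, c = 1, 2            # c has no matching value, so it is nil
  d, e = 1, 2, 3            # the extra value 3 is ignored
  first, *rest = 1, 2, 3    # rest collects the leftovers as an array
  x, y, z = 0, *[1, 2]      # the array is unrolled into y and z
  {
    :padded      => [a, b, c],
    :truncated   => [d, e],
    :splat_left  => [first, rest],
    :splat_right => [x, y, z]
  }
end

result = assignment_demo
p result[:padded]        # prints [1, 2, nil]
p result[:truncated]     # prints [1, 2]
p result[:splat_left]    # prints [1, [2, 3]]
p result[:splat_right]   # prints [0, 1, 2]
```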

If the left-hand value is an object attribute, then Ruby looks for an appropriate setter method for that object. In the following example, Ruby calls the method score= on the instance obj with the argument 27.

obj.score = 27

Similarly, a bracket reference on the left side triggers a call to the appropriate []= method. The arguments to that method are, in order, any value that appears inside the brackets and then the value on the right side of the assignment.
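A short sketch of both forms, using a hypothetical Scoreboard class that defines its own setter and []= method:

```ruby
# Scoreboard is a hypothetical class invented for this sketch.
class Scoreboard
  attr_reader :score

  def initialize
    @cells = {}
  end

  # Called for: board.score = 27
  def score=(value)
    @score = value
  end

  # Called for: board[row, col] = mark
  # The bracket contents arrive first, the right-hand side last.
  def []=(row, col, mark)
    @cells[[row, col]] = mark
  end

  def [](row, col)
    @cells[[row, col]]
  end
end

board = Scoreboard.new
board.score = 27
board[1, 2] = "X"
p board.score    # prints 27
p board[1, 2]    # prints "X"
```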

File input and output

A file can be opened by calling the method File.new.

File.new(filename, mode)

The method returns an instance of class File. The mode is a short string that indicates what operations can be performed on the file. If the mode is r, then the file is read-only. This is the default mode value if none is specified. If the mode is w, then the file is opened for writing. A non-existent file is created and an existing file is emptied. Somewhat less common is a, which opens an existing file for writing at the end; a non-existent file is still created.

To write string data to a file, you use the method write, or the shortcut operator <<. Note that you have to explicitly include the newline characters. If the expression being written is not a string, then it is converted before being written.

f.write("log file\n")
f << "the next thing\n"

Files implement the method each, which enumerates over the file line-by-line. The use of each allows all the methods of Enumerable to be used for files.

To read the data, you can use the method readline, which returns a whole line; the method readchar, which reads a single character; or the method read(int), which reads the given number of bytes.

When you are done with the file, close it with the method close. The "open, then do something, then close" structure is so common that Ruby provides a shortcut.

File.open(filename, mode) do |f|
  ## f is the File object, do things with it
end

The open method takes a block. The requested file is opened before the block is executed and closed when the block is complete.
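Putting the modes, the write operators, and the block form together, a minimal sketch might look like the following. The method name and the file name example.log are assumptions made for the example, and a temporary directory is used so nothing is left behind.

```ruby
require "tmpdir"

# write_and_read_log and example.log are hypothetical names for this sketch.
def write_and_read_log(dir)
  path = File.join(dir, "example.log")

  # "w" creates the file (or empties an existing one).
  File.open(path, "w") do |f|
    f.write("log file\n")
    f << "the next thing\n"
  end

  # "a" appends; the integer is converted to a string before writing.
  File.open(path, "a") { |f| f << 42 << "\n" }

  # The default mode is "r". Files are Enumerable, so map works
  # line by line; the block form closes the file automatically.
  File.open(path) { |f| f.map { |line| line.chomp } }
end

Dir.mktmpdir do |dir|
  p write_and_read_log(dir)   # prints ["log file", "the next thing", "42"]
end
```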

Exceptions

Raising an exception in Ruby is done by calling the method raise, which is defined in the module Kernel and is thus available anywhere in a Ruby program. The common usage of the method takes one of the following as its first argument:

  • The class Exception or one of its subclasses

  • An instance of the class Exception or one of its subclasses

An optional second argument is a string message for the exception; an optional third argument is a stack trace.
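The argument forms can be sketched as follows. ValidationError and check_age are hypothetical names, not part of Ruby.

```ruby
# ValidationError and check_age are hypothetical names for this sketch.
class ValidationError < StandardError; end

def check_age(age)
  # Passing a class: Ruby creates the exception instance itself.
  raise ArgumentError unless age.is_a?(Integer)
  # Passing a class plus a message string as the second argument.
  raise ValidationError, "age must be positive" if age <= 0
  # Passing an already-built instance.
  raise ValidationError.new("age is implausible") if age > 150
  age
end

p check_age(30)   # prints 30
```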

Handling an exception can take place inside any method without explicitly entering an exception-aware block. The keywords begin and end also denote the borders of a block that can handle exceptions. To actually handle an exception, use the keyword rescue to start a block that will be invoked when an exception is raised:

def this_method_could_break
  <normal method body>
rescue
  <exceptional code>
end

Note the indentation — the rescue clause is outdented to the same level as the def or begin statement.

The rescue keyword takes an optional list of Exception classes that are handled by the rescue clause. If no classes are specified, then StandardError is the default class, which catches nearly all errors in typical usage. Multiple rescue clauses may be specified, each with its own response block of code. The rescue keyword and the associated code may be on the same line, in which case they must be separated with the keyword then.

If specific exception lists are specified, then the list of exceptions can be ended with the phrase => varname, in which case the variable name is assigned the value of the exception. Even if a variable is not specified, the current exception is always available in the global variable $!.

After all the rescue clauses, there are two optional clauses that may be added. The keyword else is used to introduce code that is invoked if no exceptions are raised in the main body of the code. The keyword ensure marks code that is always executed at the end of the method or block, regardless of whether or not an exception was raised.

def method
  <body>
rescue
  <exception>
else
  <no exception>
ensure
  <always>
end
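A concrete version of this skeleton, as a sketch: the hypothetical method safe_divide records which clauses ran in a log array supplied by the caller.

```ruby
# safe_divide is a hypothetical name for this illustration.
def safe_divide(a, b, log)
  result = a / b
rescue ZeroDivisionError => e
  log << "rescued: #{e.message}"
  nil                       # value of the method when the rescue clause runs
else
  log << "no exception"
  result                    # value of the method when nothing was raised
ensure
  log << "always"           # runs either way; its value is discarded
end

log = []
p safe_divide(6, 3, log)    # prints 2
p safe_divide(1, 0, log)    # prints nil
p log
```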

Objects and Classes

Every value in Ruby is an object, including classes and methods. In this section, you will see how methods, classes, and modules are defined and how they relate to one another.

Defining methods

Methods are defined using the keyword def. The most basic form is as follows:

def <methodname>
  <body>
end

The limitations on the method name are described in the previous section on variable names. A method defined in this way inside a class or module definition creates an instance method for that class or module. A method defined outside a class or module definition is effectively global to the Ruby program. Technically, it's a method of the class Object.

The return value of a method is the value of the last expression evaluated. You can exit the method at any time by using the keyword return with an optional value. If no value is specified, then the return value is nil. Most Ruby programmers do not explicitly use return in places where it would be redundant. In the following example, the method explicitly returns 0 if the argument is nil; otherwise, it implicitly returns two times the argument.

def example(argument)
  return 0 if argument.nil?
  argument * 2
end

The first variant to the structure involves placing a constant or expression before the method name, separated by a dot. This form is most often used to create class methods:

class User
  def self.total_count
    <method body>
  end
end

With the preceding definition, you can then call the method User.total_count. Generically, this structure binds the method to the object preceding the method name, and to that object alone: only that one object can invoke the method. In the previous case, within the class definition, self is set to the class User, and so the method total_count is uniquely associated with the class (the declaration def User.total_count would have the same effect). However, you can also use the form to define a method that is specific to a single instance of a class.

ted = User.new
robin = User.new
def ted.go_to_work
  <body>
end

After this definition, the method call ted.go_to_work is successful, while the method call robin.go_to_work is not.
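This behavior can be verified with respond_to?. The sketch below uses a hypothetical Person class and helper method so that it runs on its own.

```ruby
# Person and make_people are hypothetical names for this sketch.
class Person; end

def make_people
  ted = Person.new
  robin = Person.new
  # Bind go_to_work to the ted object only.
  def ted.go_to_work
    "commuting"
  end
  [ted, robin]
end

ted, robin = make_people
p ted.go_to_work                   # prints "commuting"
p robin.respond_to?(:go_to_work)   # prints false
```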

Method arguments are normally defined in a comma-delimited list:

def method(arg1, arg2, arg3)

Any argument can have an optional default value, which can be a constant or expression. The default value is used if the calling argument list doesn't contain all the arguments. Normally, arguments with default values are placed after all the arguments that don't have default values. The following method can be called with one, two, or three arguments.

def method(arg1, arg2 = 7, arg3 = 2)
  p "#{arg1} #{arg2} #{arg3}"
end
method(1)       ==> "1 7 2"
method(1, 2)    ==> "1 2 2"
method(1, 2, 3) ==> "1 2 3"

The default value expression can use any argument name defined earlier in the list, meaning that, in the previous example, the default for arg3 could be defined as, say, arg2 * 3.
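A minimal sketch of a default that refers to an earlier argument; the method name scaled is invented for illustration.

```ruby
# scaled is a hypothetical name for this illustration.
def scaled(arg1, arg2 = 7, arg3 = arg2 * 3)
  "#{arg1} #{arg2} #{arg3}"
end

p scaled(1)         # prints "1 7 21"
p scaled(1, 2)      # prints "1 2 6"
p scaled(1, 2, 3)   # prints "1 2 3"
```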

The final argument can optionally be an array argument, denoted by putting an asterisk before the argument name. This argument absorbs any remaining values from the method call into an array. An array argument cannot be given a default value; if there are no remaining values, it is simply an empty array:

def method(arg1, *arg2)
method(1, 2, 3, 4) ## arg1 = 1, arg2 = [2, 3, 4]

If you expect callers of the method to use the key/value feature to roll up arguments into a hash, it's customary to signal that by giving the last argument the default value of an empty hash:

def method(arg1, arg2 = {})
method(1, :a => 3, :b => 4) ## arg1 = 1, arg2 = {:a => 3, :b => 4}

The final optional argument to a method is a block, which is the subject of the next section.

Blocks

A Ruby block is a sequence of executable code that can be defined in one place and executed later on. As mentioned earlier, a block can be defined after a method call using one of two possible syntaxes. In both cases, arguments to the block are placed between pipe characters. In Ruby 1.8, block arguments cannot have default values; Ruby 1.9 and later allow them, along with splat (array) arguments.

receiver.method(arg1, arg2) {|block_arg| ...}
receiver.method(arg1, arg2) do |block_arg| ... end

Within a block, you can place any arbitrary Ruby code. The block retains its context when it's called, and so any local variables that are visible from where the block is defined are still available when the block is called. This includes the values of self and super. Take the following example:

def outer_method
  alpha = 1
  beta = 2
  [:a, :b, :c].each {|e| p alpha * beta }
end

The values alpha and beta are defined outside the block, but can still be used inside the block even though the block is eventually executed inside the Array#each method.

Like other Ruby constructs, the value of the block when executed is the value of the last expression inside the block.

The block argument does not need to be specified in the argument list of the method being called. Instead, the block is just invoked using the keyword yield, which causes the block to be executed at that point. Any arguments that come after the yield are passed directly to the block.

def block_thingy
  block_value = yield(1, 2)
  p block_value
end
block_thingy {|a, b| a + b}
==> 3

In this example, the existence of the block is only important in the yield statement, which passes the values 1 and 2 to the block, which adds them.

To determine whether a block has been passed to a method, call block_given? at any point; it returns true if the current method is called with a block argument.
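A common use of block_given? is to provide a fallback when the caller supplies no block. This is a sketch with an invented method name.

```ruby
# greet is a hypothetical name for this illustration.
def greet(name)
  if block_given?
    yield(name)          # let the caller decide how to format the greeting
  else
    "Hello, #{name}"     # fallback when no block was passed
  end
end

p greet("Fred")                               # prints "Hello, Fred"
p greet("Fred") {|n| "Yo, #{n.upcase}!" }     # prints "Yo, FRED!"
```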

If the final argument of a method is preceded by an ampersand, then any block passed to the method is captured in that argument. The block is converted to an instance of the class Proc and can be invoked using the method Proc#call. The following example is functionally equivalent to the block_thingy example.

def proc_thingy(&a_proc)
  proc_value = a_proc.call(1, 2)
  p proc_value
end
proc_thingy {|a, b| a + b}

The conversion works both ways; a Proc value can be passed inside an argument list with an ampersand preceding it and is treated like an ordinary block by the receiving method. In this example, the block_thingy method is called with an explicit Proc object, which it treats exactly as though a block has been declared:

def proc_thingy(&a_proc)
  block_thingy(&a_proc)
end

Internally, the method to_proc is called on the value after the ampersand, which leads to some interesting possibilities. For example, Rails ActiveSupport extends the class Symbol to override to_proc, such that the following two declarations are equivalent:

[1, 2, 3].map {|i| i.to_s }
[1, 2, 3].map(&:to_s)

In the first example, the normal syntax is used to declare a block. In the second example, the ampersand triggers a to_proc call on the symbol :to_s, which converts the symbol into a one-argument block where the argument's method, named to_s, is called, identical to the first version. When used judiciously, the symbol-to-proc trick makes for some nicely readable code. This extension has been added to the core for Ruby 1.9.
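Because the ampersand simply calls to_proc, any object that defines that method can stand in for a block. The Doubler class below is a hypothetical example of this, not something provided by Ruby or Rails.

```ruby
# Doubler is a hypothetical class invented for this sketch.
class Doubler
  # Called automatically when an instance is passed with an ampersand.
  def to_proc
    lambda {|n| n * 2 }
  end
end

p [1, 2, 3].map(&Doubler.new)   # prints [2, 4, 6]
p [1, 2, 3].map(&:to_s)         # prints ["1", "2", "3"]
```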

Defining classes and modules

In Ruby, classes and modules are related concepts. A module has all the abilities of a class except for the ability to create instances. Instead, modules can be included inside classes to provide additional functionality.

Defining modules

A module is defined using the following syntax:

module <ModuleName>
  <all kinds of goodness>
end

The module name is a constant, and must begin with a capital letter.

All expressions inside the module are executed when the module is loaded. Specifically, the module can include classes, other modules, instance methods defined with def, and module methods defined with def self.

Constants defined within the module, including nested modules and classes, are accessible through the :: scope resolution operator, as in Module::InnerModule::Class. Module-level methods are available through normal method syntax — Module.method. Instance methods in the module are only accessible from an instance of a class that includes the module.

Defining classes

The class definition syntax starts similar to the module syntax:

class <ClassName>
  <class things>
end

Again, the code inside the class definition is actually executed. Constants and class-level methods are defined as in modules, and instance methods are accessible to any instance of the class. In particular, many lines of code inside a class definition that look like declarations, such as attribute listings or scope descriptions, are actually calls to methods defined in either Class or Module.

The biggest difference between a class and a module is that a class can create instances of itself using the method new. Under normal circumstances, you would not override the new method. If you want your class to perform initialization when a new instance is created, override the instance method initialize.

class Animal
  def initialize(name)
    @name = name
  end
end
scooby = Animal.new("Scooby-Doo")

At the end of this code snippet, scooby is an instance of Animal, and its name attribute is set to Scooby-Doo. At this point, the attribute is still private.

By default, the class is placed inside the current module at the location where the class is defined. However, you can explicitly place the class inside a specific module by including the module as part of the class name with the scope resolution operator:

class OuterModule::InnerModule::NewClass

To explicitly place the class at the top-level module, prefix the class name with just ::.

An alternate syntax allows access to the singleton classes referred to earlier for binding methods to a specific instance. The form discussed previously,

def obj.method
  <method body>
end

is equivalent to the following:

class << obj
  def method
    <method body>
  end
end

Within the class << obj block, any code that is legal within a class can be placed. Methods defined within the block are bound to the specific object mentioned in the class declaration. This mechanism is frequently used with a class as the object:

class Animal
  class << self
    def this_is_a_class_method
      @count = (@count || 0) + 1
    end
  end
end

The method defined in the inner class block is accessible as a class method Animal.this_is_a_class_method. The reason why this mechanism is sometimes used to define class methods is that, by using this method, each subclass of Animal gets its own copy of the instance variables, and can maintain separate values.
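The per-class behavior of these instance variables can be sketched as follows. Creature, Dog, register, and count are hypothetical names chosen for the example.

```ruby
# Creature and Dog are hypothetical classes invented for this sketch.
class Creature
  class << self
    # @count here is an instance variable of whichever class object
    # receives the call, so each class keeps its own tally.
    def register
      @count = (@count || 0) + 1
    end

    def count
      @count || 0
    end
  end
end

class Dog < Creature; end

Creature.register
Dog.register
Dog.register
p Creature.count   # prints 1
p Dog.count        # prints 2
```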

Superclasses and self

A class can have a special relationship with a class known as its superclass. The class inherits behavior from the superclass, meaning that instances of the subclass can access any methods defined in the superclass. To define this relationship, use the following form:

class Subclass < Superclass

The superclass is usually the constant name of the class in question, although it technically can be any expression that returns a class object. If no superclass is specified, the class is assumed to have Object as its superclass. In Ruby 1.8, Object's superclass is nil; in Ruby 1.9 and later, Object's superclass is BasicObject, whose superclass is nil.

The special variable self always refers to the current object whose code is being executed. Within an instance method, self is the receiver of that method call. Within the parts of a class definition that are outside instance methods, self is the class object itself.

The special method super is available inside any method definition and causes the same method name to be called in the superclass. If super is called with no arguments and no parentheses, then the original arguments to the subclass method are automatically passed to the superclass method. Writing super() with empty parentheses passes no arguments at all. If explicit arguments are used for the super call, then those arguments are passed to the superclass method, allowing for the case where the superclass method may have a different argument signature than the subclass method.
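Both styles of super call can be sketched with a small hypothetical hierarchy:

```ruby
# Greeter, LoudGreeter, and FixedGreeter are hypothetical classes.
class Greeter
  def greet(name)
    "Hello, #{name}"
  end
end

class LoudGreeter < Greeter
  def greet(name)
    inherited = super      # bare super: name is passed along automatically
    inherited.upcase
  end
end

class FixedGreeter < Greeter
  # Different signature from the superclass, so arguments are explicit.
  def greet
    super("world")
  end
end

p LoudGreeter.new.greet("ted")   # prints "HELLO, TED"
p FixedGreeter.new.greet         # prints "Hello, world"
```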

Including and extending with modules

Previously, you saw that instance methods declared within a module can only be called by an instance; however, modules cannot create instances of their own. In order for instance methods in a module to be accessible, the module needs to be mixed into a class.

The most common way to mix in a module is with the include method. Suppose the following module is defined in any_module.rb:

module AnyModule
  def self.total_modulate
    p "calling the class method"
  end
  def modulate
    p "modulating"
  end
end

Then you can write the following in a different file:

require 'any_module'
class AnyClass
  include AnyModule
end
x = AnyClass.new
x.modulate
AnyModule.total_modulate   ### NOT AnyClass.total_modulate

By including the module, instances of the class can respond to instance methods in the module. The instance of AnyClass can call the modulate method, even though it is defined in AnyModule.

The require method takes the filename where the module is defined, minus the .rb extension. You only need to call require if the module's file has not already been loaded — if the module was loaded at startup, then require may not be needed.

Rails, for example, provides special functionality to look for unknown modules, such that modules on the Rails load path can be found and loaded without needing require.

However, while using include adds instance methods to the including class, it does not add class methods, and so the total_modulate method is still only accessible through the AnyModule module. In order to add a module's methods as class methods, use the extend method. If a module is extended into a class, then instance methods of the module become class methods of the class. Class methods of the module still do not become class methods of the class:

class AnyClass
  extend AnyModule
end
AnyClass.modulate
AnyModule.total_modulate   ### NOT AnyClass.total_modulate

Attributes

An instance variable can be declared for the current class at any time by prefixing the variable name with an @. This is usually done in the initialize method, but it can be done inside any method:

def doing_things
  @name = "fred"
end

The new instance variable is available wherever the object is used. Using an instance variable never raises an exception in Ruby; if the variable has not been explicitly created, its value is nil.

Ruby instance variables are never accessible outside the class they are a part of. In order for other classes to see the value, they must call a method that returns or changes the value. The standard getters and setters in Ruby look like this:

def name
  @name
end
def name=(val)
  @name = val
end

It's tedious to have to write all that for each instance variable, and so Ruby gives you a shortcut:

attr_accessor :name, :date, :score

The attr_accessor method takes one or more symbols as arguments, and creates standard getters and setters for each symbol for the instance variable of the same name. If you only want a getter, or only a setter, you can use the variants attr_reader and attr_writer.
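The three variants can be compared side by side. Game and its attribute names are hypothetical, chosen only for this sketch.

```ruby
# Game is a hypothetical class invented for this sketch.
class Game
  attr_accessor :score   # defines score and score=
  attr_reader :date      # defines date only
  attr_writer :notes     # defines notes= only

  def initialize(date)
    @date = date
  end
end

game = Game.new("2009-01-04")
game.score = 27
p game.score                  # prints 27
p game.date                   # prints "2009-01-04"
p game.respond_to?(:date=)    # prints false
p game.respond_to?(:notes)    # prints false
```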

Access control

Ruby objects have access control modifiers that can prevent outside objects from calling methods in the class. The default access is public, which means that any other object can call the method. The next level of strictness is protected, which means that the method can only be called inside the body of the class or one of its subclasses.

However, the protected method can be called on any instance of the class that happens to be available in the method body. A private method can only be called by implicit lookup where the method has no receiver, meaning that the private method can only be called on the self object in the current context.

The difference between protected and private may seem odd. Take a look at the following example of a comparison operator (assume that outside is a different instance of the same class):

def <=>(outside)
  key <=> outside.key
end

The comparison checks how the local value of the key attribute compares to the value of the outside object. The local value, key, is accessible no matter what the access of the key method is. If key is declared to be public, then outside.key is legal because all public access is legal. If key is protected, then outside.key is still legal, because it is being called inside the class, even though it is not the instance that is currently self.

However, if key is private, then outside.key is not legal, because a private call can only be made with an implicit self. The initial call to just key is still legal, because that call is an implicit self.
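The contrast can be sketched with two hypothetical classes that differ only in the access level of key:

```ruby
# Ranked and Secretive are hypothetical classes invented for this sketch.
class Ranked
  def initialize(key)
    @key = key
  end

  def <=>(outside)
    key <=> outside.key    # legal: key is protected and outside is a Ranked
  end

  protected

  def key
    @key
  end
end

class Secretive
  def initialize(key)
    @key = key
  end

  def <=>(outside)
    key <=> outside.key    # outside.key fails: private methods reject an explicit receiver
  end

  private

  def key
    @key
  end
end

p Ranked.new(1) <=> Ranked.new(2)   # prints -1
begin
  Secretive.new(1) <=> Secretive.new(2)
rescue NoMethodError => e
  puts "private call rejected: #{e.class}"
end
```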

There are two ways to define access control. The most common way is to just include the method call public, protected, or private, with no arguments anywhere in the class. From that point, all methods have the newly declared access level, until the next no-argument access control method is called. If the access control method is called with arguments, then those arguments are the symbols of methods to be given the access. In the following example, protected_method is, well, protected because of its location, and thing is private because of the explicit call in the last line.

class Example
  def initialize
  end
  def thing
  end
  protected
  def protected_method
  end
  private :thing
end

Method lookup

The following is a complete list of the steps Ruby takes to find the definition of a method.

For an instance method, the receiving object is either the object explicitly designated as the recipient of the method, or implicitly, the current value of self. All method matches are on the name only, not the number or type of arguments. The search path is as follows for an instance method:

  1. The singleton class of the receiving object, if it exists.

  2. The class of the receiving object, looking for an instance method.

  3. Any module included in the class of the receiving object, looking for the instance method. If more than one module is included, they are searched in reverse order of inclusion, so the most recently included module is checked first.

  4. The superclass of the receiving object's class, looking for an instance method.

  5. Steps 3 and 4 are repeated for modules included in the superclass, and then the superclass of the superclass. This continues until the lookup reaches the class Object.

  6. If the class Object doesn't have the method, the module Kernel, which Object includes, is checked.

  7. If the method doesn't exist there, then the special method method_missing is called, starting at the receiving object, and continuing along the same lookup mechanism until an implementation is found. (Object is guaranteed to have one.) Inside method_missing, the program gets a last chance to do something, based on the method name and arguments rather than throwing an exception.

  8. If nothing is found, an exception is thrown.
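The early steps of this search path can be observed with the ancestors method, which lists a class and its included modules in lookup order. The following sketch uses hypothetical modules and a hypothetical class; the method_missing rule (names ending in !) is an arbitrary choice for the example.

```ruby
# Walker, Swimmer, and Amphibian are hypothetical names for this sketch.
module Walker
  def move
    "walk"
  end
end

module Swimmer
  def move
    "swim"
  end
end

class Amphibian
  include Walker
  include Swimmer

  # Last chance: handle unknown names ending in "!", otherwise
  # fall through to the default behavior, which raises NoMethodError.
  def method_missing(name, *args)
    return "no idea how to #{name}" if name.to_s.end_with?("!")
    super
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.end_with?("!") || super
  end
end

p Amphibian.ancestors.first(3)   # prints [Amphibian, Swimmer, Walker]
p Amphibian.new.move             # prints "swim"

frog = Amphibian.new
def frog.move                    # singleton method: found before the class
  "hop"
end
p frog.move                      # prints "hop"
p frog.fly!                      # prints "no idea how to fly!"
```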

The lookup path for a class method is slightly different. In this case, the recipient class is the class being sent the message, as in Employee.total_count.

  1. The class is searched for a class method of the same name.

  2. All superclasses are searched for a class method. Notice that included modules are not searched in this path. An extended module technically adds its methods directly into the namespace of the extending class, and thus would be found just by walking up the regular class hierarchy.

  3. Eventually, the superclasses reach Object. If Object does not define the class method, then the next step is to search for instance methods of the class Class. (Remember, classes are objects, too.)

  4. The path from there is instance methods of Class, instance methods of Module, instance methods of Object, and instance methods of Kernel, in that order.

  5. If nothing is found, then a class version of method_missing is searched for, starting at the original recipient class.

Module methods are similar to class methods, except that modules don't have superclasses, and so the search path is simply module method of the module, instance methods of Module, instance methods of Object, and instance methods of Kernel, then method_missing.
