Ruby is a very pure object-oriented language: all values are objects, and there is no distinction between primitive types and object types as there is in many other languages. In Ruby, all objects inherit from a class named Object and share the methods defined by that class. This section explains the common features of all objects in Ruby. It is dense in parts, but it’s required reading; the information here is fundamental.
When we work with objects in Ruby, we are really working with object references. It is not the object itself we manipulate but a reference to it.[*] When we assign a value to a variable, we are not copying an object “into” that variable; we are merely storing a reference to an object into that variable. Some code makes this clear:
s = "Ruby"    # Create a String object. Store a reference to it in s.
t = s         # Copy the reference to t. s and t both refer to the same object.
t[-1] = ""    # Modify the object through the reference in t.
print s       # Access the modified object through s. Prints "Rub".
t = "Java"    # t now refers to a different object.
print s,t     # Prints "RubJava".
When you pass an object to a method in Ruby, it is an object reference that is passed to the method. It is not the object itself, and it is not a reference to the reference to the object. Another way to say this is that method arguments are passed by value rather than by reference, but that the values passed are object references.
Because object references are passed to methods, methods can use those references to modify the underlying object. These modifications are then visible when the method returns.
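As a minimal sketch of this behavior, the following compares mutating an argument with reassigning it. The method names append_bang! and rebind are hypothetical and exist only for illustration:

```ruby
# Hypothetical helper: mutates the object that the parameter refers to.
def append_bang!(str)
  str << "!"          # Modifies the caller's object in place
end

# Hypothetical helper: rebinds the local parameter only.
def rebind(str)
  str = "replaced"    # The caller's variable still refers to the old object
  str
end

s = String.new("Ruby")
append_bang!(s)       # The change is visible through s
rebind(s)             # No effect on s
puts s                # Prints "Ruby!"
```

The reference is copied into the parameter, so the method can change the object but cannot change which object the caller's variable refers to.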
We’ve said that all values in Ruby are objects and all objects are manipulated by reference. In the reference implementation, however, Fixnum and Symbol objects are actually “immediate values” rather than references. Neither of these classes has mutator methods, so Fixnum and Symbol objects are immutable, which means that there is really no way to tell that they are manipulated by value rather than by reference.
The existence of immediate values should be considered an implementation detail. The only practical difference between immediate values and reference values is that immediate values cannot have singleton methods defined on them. (Singleton methods are explained in Defining Singleton Methods.)
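The one practical difference can be observed with a small experiment. This is only a sketch: the helper singleton_allowed? is hypothetical, and it uses define_singleton_method, which is available in Ruby 1.9 and later:

```ruby
# Hypothetical helper: reports whether a singleton method can be defined on obj.
def singleton_allowed?(obj)
  obj.define_singleton_method(:shout) { to_s.upcase }
  true
rescue TypeError
  false               # Immediate values raise TypeError here
end

singleton_allowed?(String.new("ruby"))   # true: an ordinary object
singleton_allowed?(42)                   # false: small integers are immediate values
singleton_allowed?(:ruby)                # false: so are symbols
```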
The built-in Ruby classes described in this chapter have literal syntaxes, and instances of these classes are created simply by including their values literally in your code. Objects of other classes need to be explicitly created, and this is most often done with a method named new:
myObject = myClass.new
new is a method of the Class class. It allocates memory to hold the new object, then it initializes the state of that newly allocated “empty” object by invoking its initialize method. The arguments to new are passed directly on to initialize. Most classes define an initialize method to perform whatever initialization is necessary for instances.
The new and initialize methods provide the default technique for creating new objects, but classes may also define other methods, known as “factory methods,” that return instances. We’ll learn more about new, initialize, and factory methods in Object Creation and Initialization.
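A short sketch may help tie these pieces together. The Point class and its origin factory method are hypothetical names used only for illustration:

```ruby
# Hypothetical class illustrating new, initialize, and a factory method.
class Point
  attr_reader :x, :y

  # Invoked by Point.new; receives the arguments that were passed to new.
  def initialize(x, y)
    @x, @y = x, y
  end

  # A factory method: a class method that returns a new instance.
  def self.origin
    new(0, 0)
  end
end

p1 = Point.new(3, 4)   # new allocates the object, then calls initialize(3, 4)
p2 = Point.origin      # The factory method calls new on our behalf
puts p1.x + p1.y       # Prints 7
puts p2.x              # Prints 0
```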
Ruby objects never need to be explicitly deallocated, as they do in languages like C and C++. Ruby uses a technique called garbage collection to automatically destroy objects that are no longer needed. An object becomes a candidate for garbage collection when it is unreachable—when there are no remaining references to the object except from other unreachable objects.
The fact that Ruby uses garbage collection means that Ruby programs are less susceptible to memory leaks than programs written in languages that require objects and memory to be explicitly deallocated and freed. But garbage collection does not mean that memory leaks are impossible: any code that creates long-lived references to objects that would otherwise be short-lived can be a source of memory leaks. Consider a hash used as a cache. If the cache is not pruned using some kind of least-recently-used algorithm, then cached objects will remain reachable as long as the hash itself is reachable. If the hash is referenced through a global variable, then it will be reachable as long as the Ruby interpreter is running.
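One way to keep such a cache from growing without bound is to evict the least-recently-used entry when it is full. The BoundedCache class below is a hypothetical sketch, not a production cache; it assumes that Hash preserves insertion order, which holds in Ruby 1.9 and later:

```ruby
# Hypothetical cache that evicts its least-recently-used entry once it is full.
# Relies on Hash preserving insertion order (Ruby 1.9 and later).
class BoundedCache
  def initialize(max_size)
    @max_size = max_size
    @store = {}
  end

  def [](key)
    return nil unless @store.key?(key)
    value = @store.delete(key)   # Remove and reinsert to mark as recently used
    @store[key] = value
  end

  def []=(key, value)
    @store.delete(key)
    @store[key] = value
    @store.shift if @store.size > @max_size   # Drop the oldest entry
  end

  def size
    @store.size
  end
end
```

Entries evicted from the hash become unreachable through the cache and are then candidates for garbage collection.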
Every object has an object identifier, a Fixnum, that you can obtain with the object_id method. The value returned by this method is constant and unique for the lifetime of the object. While the object is accessible, it will always have the same ID, and no other object will share that ID.
The method id is a deprecated synonym for object_id. Ruby 1.8 issues a warning if you use it, and it has been removed in Ruby 1.9.
__id__ is a valid synonym for object_id. It exists as a fallback, so you can access an object’s ID even if the object_id method has been undefined or overridden.
The Object class implements the hash method to simply return an object’s ID.
There are several ways to determine the class of an object in Ruby. The simplest is simply to ask for it:
o = "test"   # This is a value
o.class      # Returns an object representing the String class
If you are interested in the class hierarchy of an object, you can ask any class what its superclass is:
o.class                         # String: o is a String object
o.class.superclass              # Object: superclass of String is Object
o.class.superclass.superclass   # nil: Object has no superclass
In Ruby 1.9, Object is no longer the true root of the class hierarchy:
# Ruby 1.9 only
Object.superclass        # BasicObject: Object has a superclass in 1.9
BasicObject.superclass   # nil: BasicObject has no superclass
See Subclassing and Inheritance for more on BasicObject.
So a particularly straightforward way to check the class of an object is by direct comparison:
o.class == String # true if o is a String
The instance_of? method does the same thing and is a little more elegant:
o.instance_of? String   # true if o is a String
Usually when we test the class of an object, we would also like to know if the object is an instance of any subclass of that class. To test this, use the is_a? method, or its synonym kind_of?:
x = 1                    # This is the value we're working with
x.instance_of? Fixnum    # true: is an instance of Fixnum
x.instance_of? Numeric   # false: instance_of? doesn't check inheritance
x.is_a? Fixnum           # true: x is a Fixnum
x.is_a? Integer          # true: x is an Integer
x.is_a? Numeric          # true: x is a Numeric
x.is_a? Comparable       # true: works with mixin modules, too
x.is_a? Object           # true for any value of x
The Class class defines the === operator in such a way that it can be used in place of is_a?:
Numeric === x   # true: x is_a Numeric
This idiom is unique to Ruby and is probably less readable than using the more traditional is_a? method.
Every object has a well-defined class in Ruby, and that class never changes during the lifetime of the object. An object’s type, on the other hand, is more fluid. The type of an object is related to its class, but the class is only part of an object’s type. When we talk about the type of an object, we mean the set of behaviors that characterize the object. Another way to put it is that the type of an object is the set of methods it can respond to. (This definition becomes recursive because it is not just the names of the methods that matter, but also the types of arguments that those methods can accept.)
In Ruby programming, we often don’t care about the class of an object; we just want to know whether we can invoke some method on it. Consider, for example, the << operator. Arrays, strings, files, and other I/O-related classes define this as an append operator. If we are writing a method that produces textual output, we might write it generically to use this operator. Then our method can be invoked with any argument that implements <<. We don’t care about the class of the argument, just that we can append to it. We can test for this with the respond_to? method:
o.respond_to? :"<<" # true if o has an << operator
The shortcoming of this approach is that it only checks the name of a method, not the arguments for that method. For example, Fixnum and Bignum implement << as a left-shift operator and expect the argument to be a number instead of a string. Integer objects appear to be “appendable” when we use a respond_to? test, but they produce an error when our code appends a string. There is no general solution to this problem, but an ad hoc remedy, in this case, is to explicitly rule out Numeric objects with the is_a? method:
o.respond_to? :"<<" and not o.is_a? Numeric
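Put together, a generic output method might look like the following sketch. The method name write_greeting is hypothetical:

```ruby
# Hypothetical method: appends a greeting to anything that supports <<,
# explicitly ruling out numbers, whose << is a bit shift.
def write_greeting(out)
  unless out.respond_to?(:<<) && !out.is_a?(Numeric)
    raise ArgumentError, "argument is not appendable"
  end
  out << "hello"
end

write_greeting(String.new)   # Returns "hello"
write_greeting([])           # Returns ["hello"]
# write_greeting(1) would raise ArgumentError instead of shifting bits
```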
Another example of the type-versus-class distinction is the StringIO class (from Ruby’s standard library). StringIO enables reading from and writing to string objects as if they were IO objects. StringIO mimics the IO API: StringIO objects define the same methods that IO objects do. But StringIO is not a subclass of IO. If you write a method that expects a stream argument, and test the class of the argument with is_a? IO, then your method won’t work with StringIO arguments.
Focusing on types rather than classes leads to a programming style known in Ruby as “duck typing.” We’ll see duck typing examples in Chapter 7.
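The StringIO case can be sketched in a few lines. The method log_line is a hypothetical name; it relies only on its argument responding to puts:

```ruby
require 'stringio'

# Hypothetical method that relies on behavior (a puts method), not on class.
def log_line(stream, message)
  stream.puts(message)
end

buffer = StringIO.new
buffer.is_a?(IO)          # false: StringIO is not a subclass of IO
log_line(buffer, "ok")    # Works anyway, because StringIO responds to puts
log_line($stdout, "ok")   # Also works with a real IO object
buffer.string             # "ok\n"
```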
Ruby has a surprising number of ways to compare objects for equality, and it is important to understand how they work, so you know when to use each method.
The equal? method is defined by Object to test whether two values refer to exactly the same object. For any two distinct objects, this method always returns false:
a = "Ruby"       # One reference to one String object
b = c = "Ruby"   # Two references to another String object
a.equal?(b)      # false: a and b are different objects
b.equal?(c)      # true: b and c refer to the same object
By convention, subclasses never override the equal? method.
Another way to determine if two objects are, in fact, the same object is to check their object_id:
a.object_id == b.object_id # Works like a.equal?(b)
The == operator is the most common way to test for equality. In the Object class, it is simply a synonym for equal?, and it tests whether two object references are identical. Most classes redefine this operator to allow distinct instances to be tested for equality:
a = "Ruby"    # One String object
b = "Ruby"    # A different String object with the same content
a.equal?(b)   # false: a and b do not refer to the same object
a == b        # true: but these two distinct objects have equal values
Note that the single equals sign in this code is the assignment operator. It takes two equals signs to test for equality in Ruby (this is a convention that Ruby shares with many other programming languages).
Most standard Ruby classes define the == operator to implement a reasonable definition of equality. This includes the Array and Hash classes. Two arrays are equal according to == if they have the same number of elements, and if their corresponding elements are all equal according to ==. Two hashes are == if they contain the same number of key/value pairs, and if the keys and values are themselves equal. (Values are compared with the == operator, but hash keys are compared with the eql? method, described later in this chapter.)
The Numeric classes perform simple type conversions in their == operators, so that (for example) the Fixnum 1 and the Float 1.0 compare as equal. The == operator of classes such as String and Array normally requires both operands to be of the same class. If the righthand operand defines a to_str or to_ary conversion function (see Object Conversion), then these operators invoke the == operator defined by the righthand operand, and let that object decide whether it is equal to the lefthand string or array. Thus, it is possible (though not common) to define classes with string-like or array-like comparison behavior.
!= (“not-equal”) is used in Ruby to test for inequality. When Ruby sees !=, it simply uses the == operator and then inverts the result. This means that a class only needs to define the == operator to define its own notion of equality. Ruby gives you the != operator for free. In Ruby 1.9, however, classes can explicitly define their own != operators.
The eql? method is defined by Object as a synonym for equal?. Classes that override it typically use it as a strict version of == that does no type conversion. For example:
1 == 1.0      # true: Fixnum and Float objects can be ==
1.eql?(1.0)   # false: but they are never eql!
The Hash class uses eql? to check whether two hash keys are equal. If two objects are eql?, their hash methods must also return the same value. Typically, if you create a class and define the == operator, you can simply write a hash method and define eql? to use ==.
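As a minimal sketch of that recipe, the hypothetical Coord class below defines ==, aliases eql? to it, and derives hash from the same fields so that equal objects work as interchangeable hash keys:

```ruby
# Hypothetical value class showing ==, eql?, and hash defined consistently.
class Coord
  attr_reader :x, :y

  def initialize(x, y)
    @x, @y = x, y
  end

  def ==(other)
    other.is_a?(Coord) && x == other.x && y == other.y
  end

  alias eql? ==          # Hash lookups call eql?, so make it agree with ==

  def hash
    [x, y].hash          # Equal objects must produce equal hash codes
  end
end

h = { Coord.new(1, 2) => "found" }
h[Coord.new(1, 2)]       # "found": a distinct but eql? key locates the entry
```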
The === operator is commonly called the “case equality” operator and is used to test whether the target value of a case statement matches any of the when clauses of that statement. (The case statement is a multiway branch and is explained in Chapter 5.)
Object defines a default === operator so that it invokes the == operator. For many classes, therefore, case equality is the same as == equality. But certain key classes define === differently, and in these cases it is more of a membership or matching operator. Range defines === to test whether a value falls within the range. Regexp defines === to test whether a string matches the regular expression. And Class defines === to test whether an object is an instance of that class. In Ruby 1.9, Symbol defines === to return true if the righthand operand is the same symbol as the left or if it is a string holding the same text. Examples:
(1..10) === 5     # true: 5 is in the range 1..10
/\d+/ === "123"   # true: the string matches the regular expression
String === "s"    # true: "s" is an instance of the class String
:s === "s"        # true in Ruby 1.9
It is uncommon to see the === operator used explicitly like this. More commonly, its use is simply implicit in a case statement.
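The implicit use looks like this in practice. The describe method is a hypothetical example; each when clause is tested against the target value with ===:

```ruby
# Hypothetical method: each when clause is tested with ===.
def describe(value)
  case value
  when 1..10      then "small number"    # Range#===
  when Numeric    then "number"          # Class#===
  when /\A\d+\z/  then "digit string"    # Regexp#===
  when String     then "string"          # Class#===
  else "something else"
  end
end

describe(5)       # "small number"
describe(42.5)    # "number"
describe("123")   # "digit string"
describe("abc")   # "string"
describe(nil)     # "something else"
```

The clauses are tried in order, so the more specific tests are listed before the more general ones.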
The =~ operator is defined by String and Regexp (and Symbol in Ruby 1.9) to perform pattern matching, and it isn’t really an equality operator at all. But it does have an equals sign in it, so it is listed here for completeness. Object defines a no-op version of =~ that always returns false. You can define this operator in your own class, if that class defines some kind of pattern-matching operation or has a notion of approximate equality, for example. !~ is defined as the inverse of =~. It is definable in Ruby 1.9 but not in Ruby 1.8.
Practically every class can define a useful == method for testing its instances for equality. Some classes can also define an ordering. That is: for any two instances of such a class, the two instances must be equal, or one instance must be “less than” the other. Numbers are the most obvious classes for which such an ordering is defined. Strings are also ordered, according to the numeric ordering of the character codes that comprise the strings. (With ASCII text, this is a rough kind of case-sensitive alphabetical order.) If a class defines an ordering, then instances of the class can be compared and sorted.
In Ruby, classes define an ordering by implementing the <=> operator. This operator should return -1 if its left operand is less than its right operand, 0 if the two operands are equal, and 1 if the left operand is greater than the right operand. If the two operands cannot be meaningfully compared (if the right operand is of a different class, for example), then the operator should return nil:
1 <=> 5     # -1
5 <=> 5     # 0
9 <=> 5     # 1
"1" <=> 5   # nil: integers and strings are not comparable
The <=> operator is all that is needed to compare values. But it isn’t particularly intuitive. So classes that define this operator typically also include the Comparable module as a mixin. (Modules and mixins are covered in Modules As Mixins.) The Comparable mixin defines the following operators in terms of <=>: <, <=, ==, >=, and >.
Comparable does not define the != operator; Ruby automatically defines that operator as the negation of the == operator. In addition to these comparison operators, Comparable also defines a useful comparison method named between?:
1.between?(0,10)   # true: 0 <= 1 <= 10
If the <=> operator returns nil, all the comparison operators derived from it return false. The special Float value NaN is an example:
nan = 0.0/0.0     # zero divided by zero is not-a-number
nan < 0           # false: it is not less than zero
nan > 0           # false: it is not greater than zero
nan == 0          # false: it is not equal to zero
nan == nan        # false: it is not even equal to itself!
nan.equal?(nan)   # this is true, of course
Note that defining <=> and including the Comparable module defines a == operator for your class. Some classes define their own == operator, typically when they can implement this more efficiently than an equality test based on <=>. It is possible to define classes that implement different notions of equality in their == and <=> operators. A class might do case-sensitive string comparisons for the == operator, for example, but then do case-insensitive comparisons for <=>, so that instances of the class would sort more naturally. In general, though, it is best if <=> returns 0 if and only if == returns true.
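To make the mechanics concrete, here is a sketch of a class that defines only <=> and obtains the remaining comparison operators from Comparable. The Version class is hypothetical:

```ruby
# Hypothetical class that gains <, <=, ==, >=, >, and between? from Comparable.
class Version
  include Comparable
  attr_reader :major, :minor

  def initialize(major, minor)
    @major, @minor = major, minor
  end

  def <=>(other)
    return nil unless other.is_a?(Version)   # Not comparable with other classes
    [major, minor] <=> [other.major, other.minor]
  end
end

a = Version.new(1, 9)
b = Version.new(1, 8)
a > b                      # true
a == Version.new(1, 9)     # true: == is derived from <=>
b.between?(b, a)           # true
[a, b].min.equal?(b)       # true: min and sort use <=> as well
```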
Many Ruby classes define methods that return a representation of the object as a value of a different class. The to_s method, for obtaining a String representation of an object, is probably the most commonly implemented and best known of these methods. The subsections that follow describe various categories of conversions.
Classes define explicit conversion methods for use by application code that needs to convert a value to another representation. The most common methods in this category are to_s, to_i, to_f, and to_a to convert to String, Integer, Float, and Array, respectively. Ruby 1.9 adds to_c and to_r methods to convert to Complex and Rational.
Built-in methods do not typically invoke these methods for you. If you invoke a method that expects a String and pass an object of some other kind, that method is not expected to convert the argument with to_s. (Values interpolated into double-quoted strings, however, are automatically converted with to_s.)
to_s is easily the most important of the conversion methods because string representations of objects are so commonly used in user interfaces. An important alternative to to_s is the inspect method. to_s is generally intended to return a human-readable representation of the object, suitable for end users. inspect, on the other hand, is intended for debugging use, and should return a representation that is helpful to Ruby developers. The default inspect method, inherited from Object, simply calls to_s.
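The difference between the two methods is easiest to see in a class that defines both. The Temperature class here is a hypothetical illustration:

```ruby
# Hypothetical class with separate end-user and developer representations.
class Temperature
  def initialize(degrees)
    @degrees = degrees
  end

  def to_s
    "#{@degrees} degrees"                  # For end users
  end

  def inspect
    "#<Temperature degrees=#{@degrees}>"   # For debugging
  end
end

t = Temperature.new(21)
puts t          # Prints "21 degrees" (puts calls to_s)
p t             # Prints "#<Temperature degrees=21>" (p calls inspect)
"It is #{t}"    # "It is 21 degrees": interpolation calls to_s
```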
Sometimes a class has strong characteristics of some other class. The Ruby Exception class represents an error or unexpected condition in a program and encapsulates an error message. In Ruby 1.8, Exception objects are not merely convertible to strings; they are string-like objects and can be treated as if they were strings in many contexts.[*] For example:
# Ruby 1.8 only
e = Exception.new("not really an exception")
msg = "Error: " + e   # String concatenation with an Exception
Because Exception objects are string-like, they can be used with the string concatenation operator. This does not work with most other Ruby classes. The reason that Exception objects can behave like String objects is that, in Ruby 1.8, Exception implements the implicit conversion method to_str, and the + operator defined by String invokes this method on its righthand operand.
Other implicit conversion methods are to_int for objects that want to be integer-like, to_ary for objects that want to be array-like, and to_hash for objects that want to be hash-like. Unfortunately, the circumstances under which these implicit conversion methods are called are not well documented. Among the built-in classes, these implicit conversion methods are not commonly implemented, either.
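The same mechanism is available to your own classes. In this sketch, the hypothetical Label class opts in to string-like behavior by defining to_str:

```ruby
# Hypothetical string-like class: defining to_str opts in to implicit conversion.
class Label
  def initialize(text)
    @text = text
  end

  def to_str
    @text
  end
end

"Name: " + Label.new("Ruby")   # "Name: Ruby": String#+ calls to_str
# "Name: " + Object.new        # TypeError: Object does not define to_str
```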
We noted earlier in passing that the == operator can perform a weak kind of type conversion when testing for equality. The == operators defined by String, Array, and Hash check to see if the righthand operand is of the same class as the lefthand operand. If so, they compare them. If not, they check to see if the righthand operand has a to_str, to_ary, or to_hash method. They don’t invoke this method, but if it exists, they invoke the == method of the righthand operand and allow it to decide whether it is equal to the lefthand operand.
In Ruby 1.9, the built-in classes String, Array, Hash, Regexp, and IO all define a class method named try_convert. These methods convert their argument if it defines an appropriate implicit conversion method, or they return nil otherwise. Array.try_convert(o) returns o.to_ary if o defines that method; otherwise, it returns nil. These try_convert methods are convenient if you want to write methods that allow implicit conversions on their arguments.
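A method that accepts either a real array or an array-like object might use try_convert as follows. The total method and the Pair class are hypothetical names for illustration:

```ruby
# Hypothetical method that accepts an array or anything array-like (to_ary).
def total(values)
  array = Array.try_convert(values)
  raise TypeError, "expected an array-like argument" if array.nil?
  array.inject(0) { |sum, n| sum + n }
end

# Hypothetical array-like class.
class Pair
  def initialize(a, b)
    @a, @b = a, b
  end

  def to_ary
    [@a, @b]
  end
end

total([1, 2, 3])         # 6
total(Pair.new(4, 5))    # 9
Array.try_convert("x")   # nil: String does not define to_ary
```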
The Kernel module defines four conversion methods that behave as global conversion functions. These functions (Array, Float, Integer, and String) have the same names as the classes that they convert to, and they are unusual in that they begin with a capital letter.
The Array function attempts to convert its argument to an array by calling to_ary. If that method is not defined or returns nil, it tries the to_a method. If to_a is not defined or returns nil, the Array function simply returns a new array containing the argument as its single element.
The Float function converts Numeric arguments to Float objects directly. For any non-Numeric value, it calls the to_f method.
The Integer function converts its argument to a Fixnum or Bignum. If the argument is a Numeric value, it is converted directly. Floating-point values are truncated rather than rounded. If the argument is a string, it looks for a radix indicator (a leading 0 for octal, 0x for hexadecimal, or 0b for binary) and converts the string accordingly. Unlike String.to_i, it does not allow nonnumeric trailing characters. For any other kind of argument, the Integer function first attempts conversion with to_int and then with to_i.
Finally, the String function converts its argument to a string simply by calling its to_s method.
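A few calls illustrate these rules. This is a sketch of typical results rather than an exhaustive specification:

```ruby
Array(nil)         # []: nil.to_a returns an empty array
Array([1, 2])      # [1, 2]: an array is returned unchanged
Array("a")         # ["a"]: no to_ary or to_a, so the value is wrapped
Array(1..3)        # [1, 2, 3]: Range defines to_a
Float("3.5")       # 3.5
Integer(3.99)      # 3: truncated, not rounded
Integer("0x1A")    # 26: the 0x prefix selects hexadecimal
Integer("0b101")   # 5: the 0b prefix selects binary
"12abc".to_i       # 12: to_i ignores trailing characters
# Integer("12abc") raises ArgumentError: trailing characters are not allowed
String(42)         # "42"
```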
Numeric types define a conversion method named coerce. The intent of this method is to convert the argument to the same type as the numeric object on which the method is invoked, or to convert both objects to some more general compatible type. The coerce method always returns an array that holds two numeric values of the same type. The first element of the array is the converted value of the argument to coerce. The second element of the returned array is the value (converted, if necessary) on which coerce was invoked:
1.1.coerce(1)        # [1.0, 1.1]: coerce Fixnum to Float
require "rational"   # Use Rational numbers
r = Rational(1,3)    # One third as a Rational number
r.coerce(2)          # [Rational(2,1), Rational(1,3)]: Fixnum to Rational
The coerce method is used by the arithmetic operators. The + operator defined by Fixnum doesn’t know about Rational numbers, for example, and if its righthand operand is a Rational value, it doesn’t know how to add it. coerce provides the solution. Numeric operators are written so that if they don’t know the type of the righthand operand, they invoke the coerce method of the righthand operand, passing the lefthand operand as an argument. Returning to our example of adding a Fixnum and a Rational, the coerce method of Rational returns an array of two Rational values. Now the + operator defined by Fixnum can simply invoke + on the values in the array.
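The same protocol works for user-defined numeric types. The Meters class below is a hypothetical sketch showing how defining coerce lets a built-in number appear on the left of the + operator:

```ruby
# Hypothetical numeric wrapper that lets built-in numbers appear on the left.
class Meters
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def +(other)
    other = Meters.new(other) unless other.is_a?(Meters)
    Meters.new(value + other.value)
  end

  # Called by the built-in + operator when its right operand is a Meters.
  # Returns [converted argument, self], both of the same type.
  def coerce(number)
    [Meters.new(number), self]
  end
end

(Meters.new(2) + 3).value   # 5: Meters#+ handles the number directly
(3 + Meters.new(2)).value   # 5: the integer's + calls coerce, then adds the pair
```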
Boolean values deserve a special mention in the context of type conversion. Ruby is very strict with its Boolean values: true and false have to_s methods, which return “true” and “false” but define no other conversion methods. And there is no to_b method to convert other values to Booleans.
In some languages, false is the same thing as 0, or can be converted to and from 0. In Ruby, the values true and false are their own distinct objects, and there are no implicit conversions that convert other values to true or false. This is only half the story, however. Ruby’s Boolean operators and its conditional and looping constructs that use Boolean expressions can work with values other than true and false. The rule is simple: in Boolean expressions, any value other than false or nil behaves like (but is not converted to) true. nil, on the other hand, behaves like false.
Suppose you want to test whether the variable x is nil or not. In some languages, you must explicitly write a comparison expression that evaluates to true or false:
if x != nil   # Expression "x != nil" returns true or false to the if
  puts x      # Print x if it is defined
end
This code works in Ruby, but it is more common simply to take advantage of the fact that all values other than nil and false behave like true:
if x          # If x is non-nil
  puts x      # Then print it
end
It is important to remember that values like 0, 0.0, and the empty string "" behave like true in Ruby, which is surprising if you are used to languages like C or JavaScript.
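The rule is easy to check directly. The helper truthy? is a hypothetical name; it simply reports which branch of a conditional a value selects:

```ruby
# Hypothetical helper that reports how a value behaves in a condition.
def truthy?(value)
  value ? true : false
end

truthy?(nil)     # false
truthy?(false)   # false
truthy?(0)       # true: zero is not false in Ruby
truthy?(0.0)     # true
truthy?("")      # true: so is the empty string
truthy?([])      # true
```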
The Object class defines two closely related methods for copying objects. Both clone and dup return a shallow copy of the object on which they are invoked. If the copied object includes internal state that refers to other objects, only the object references are copied, not the referenced objects themselves.
If the object being copied defines an initialize_copy method, then clone and dup simply allocate a new, empty instance of the class and invoke the initialize_copy method on this empty instance. The object to be copied is passed as an argument, and this “copy constructor” can initialize the copy however it desires. For example, the initialize_copy method could recursively copy the internal data of an object so that the resulting object is not a simple shallow copy of the original.
Classes can also override the clone and dup methods directly to produce any kind of copy they desire.
There are two important differences between the clone and dup methods defined by Object. First, clone copies both the frozen and tainted state (defined shortly) of an object, whereas dup only copies the tainted state; calling dup on a frozen object returns an unfrozen copy. Second, clone copies any singleton methods of the object, whereas dup does not.
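The following sketch exercises these points. The Playlist class is hypothetical; its initialize_copy gives each copy its own internal array, and the lines after it contrast clone with dup:

```ruby
# Hypothetical class whose copies get their own copy of the internal array.
class Playlist
  attr_reader :songs

  def initialize(*songs)
    @songs = songs
  end

  # Invoked on the new copy by both clone and dup; original is the source.
  def initialize_copy(original)
    super
    @songs = original.songs.dup
  end
end

a = Playlist.new("one", "two")
b = a.dup
b.songs << "three"
a.songs.size                  # 2: the copy no longer shares the array

s = "text".freeze
s.clone.frozen?               # true: clone preserves the frozen state
s.dup.frozen?                 # false: dup does not

o = Object.new
def o.hello; "hi"; end        # A singleton method on o
o.clone.respond_to?(:hello)   # true: clone copies singleton methods
o.dup.respond_to?(:hello)     # false: dup does not
```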
You can save the state of an object by passing it to the class method Marshal.dump.[*] If you pass an I/O stream object as the second argument, Marshal.dump writes the state of the object (and, recursively, any objects it references) to that stream. Otherwise, it simply returns the encoded state as a binary string.
To restore a marshaled object, pass a string or an I/O stream containing the object to Marshal.load.
Marshaling an object is a very simple way to save its state for later use, and these methods can be used to provide an automatic file format for Ruby programs. Note, however, that the binary format used by Marshal.dump and Marshal.load is version-dependent, and newer versions of Ruby are not guaranteed to be able to read marshaled objects written by older versions of Ruby.
Another use for Marshal.dump and Marshal.load is to create deep copies of objects:
def deepcopy(o)
  Marshal.load(Marshal.dump(o))
end
Note that files and I/O streams, as well as Method and Binding objects, are too dynamic to be marshaled; there would be no reliable way to restore their state.
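To see what “deep” means here, this sketch compares the deepcopy method from the text with the shallow copy made by dup. The sample data is made up for illustration:

```ruby
# The deepcopy method from the text, compared with the shallow copy made by dup.
def deepcopy(o)
  Marshal.load(Marshal.dump(o))
end

original = { "langs" => ["Ruby", "Java"] }
shallow  = original.dup
deep     = deepcopy(original)

shallow["langs"] << "C"     # The nested array is shared with original
original["langs"].size      # 3
deep["langs"].size          # 2: the deep copy has its own nested array
```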
YAML (“YAML Ain’t Markup Language”) is a commonly used alternative to the Marshal module that dumps objects to (and loads objects from) a human-readable text format. It is in the standard library, and you must require 'yaml' to use it.
Any object may be frozen by calling its freeze method. A frozen object becomes immutable: none of its internal state may be changed, and an attempt to call any of its mutator methods fails:
s = "ice"     # Strings are mutable objects
s.freeze      # Make this string immutable
s.frozen?     # true: it has been frozen
s.upcase!     # TypeError: can't modify frozen string
s[0] = "ni"   # TypeError: can't modify frozen string
Freezing a class object prevents the addition of any methods to that class.
You can check whether an object is frozen with the frozen? method. Once frozen, there is no way to “thaw” an object. If you copy a frozen object with clone, the copy will also be frozen. If you copy a frozen object with dup, however, the copy will not be frozen.
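The sketch below shows one way to work with frozen objects defensively. The helper try_append is hypothetical. It rescues StandardError because the exception class raised on modification differs between Ruby versions (TypeError in Ruby 1.8, as shown above, but RuntimeError or its subclass FrozenError in later versions):

```ruby
# Hypothetical helper: returns true if the append succeeds, false if it fails.
def try_append(str, suffix)
  str << suffix
  true
rescue StandardError    # The exact exception class depends on the Ruby version
  false
end

s = String.new("ice").freeze
try_append(s, " cream")      # false: s is frozen
copy = s.dup                 # dup returns an unfrozen copy
try_append(copy, " cream")   # true
copy                         # "ice cream"
s                            # "ice": the original is unchanged
```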
Web applications must often keep track of data derived from untrusted user input to avoid SQL injection attacks and similar security risks. Ruby provides a simple solution to this problem: any object may be marked as tainted by calling its taint method. Once an object is tainted, any objects derived from it will also be tainted. The taint of an object can be tested with the tainted? method:
s = "untrusted"     # Objects are normally untainted
s.taint             # Mark this untrusted object as tainted
s.tainted?          # true: it is tainted
s.upcase.tainted?   # true: derived objects are tainted
s[3,4].tainted?     # true: substrings are tainted
User input, such as command-line arguments, environment variables, and strings read with gets, is automatically tainted. When the global variable $SAFE is set to a value greater than zero, Ruby restricts various built-in methods so that they will not work with tainted data. Copies of tainted objects made with clone and dup remain tainted. A tainted object may be untainted with the untaint method. You should only do this, of course, if you have examined the object and are convinced that it presents no security risks.
In Ruby 1.9, objects can be untrusted in addition to being tainted. The methods untrusted?, untrust, and trust check and set the trustedness of an object. Untrusted code creates untrusted, tainted objects and is not allowed to modify trusted objects. See Security for details on taint, trust, and $SAFE.
[*] If you are familiar with C or C++, you can think of a reference as a pointer: the address of the object in memory. Ruby does not use pointers, however. References in Ruby are opaque and internal to the implementation. There is no way to take the address of a value, dereference a value, or do pointer arithmetic.
[*] Doing so is discouraged, however, and Ruby 1.9 no longer allows the implicit conversion of Exception to String.
[*] The word “marshal” and its variants are sometimes spelled with two ls: marshall, marshalled, etc. If you spell the word this way, you’ll need to remember that the name of the Ruby class has only a single l.