A hash is a data structure that maintains a set of objects known as keys, and associates a value with each key. Hashes are also known as maps because they map keys to values. They are sometimes called associative arrays because they associate values with each of the keys, and can be thought of as arrays in which the array index can be any object instead of an integer. An example makes this clearer:
# This hash will map the names of digits to the digits themselves
numbers = Hash.new                     # Create a new, empty, hash object
numbers["one"] = 1                     # Map the String "one" to the Fixnum 1
numbers["two"] = 2                     # Note that we are using array notation here
numbers["three"] = 3
sum = numbers["one"] + numbers["two"]  # Retrieve values like this
This introduction to hashes documents Ruby’s hash literal syntax and explains the requirements for an object to be used as a hash key. More information on the API defined by the Hash class is provided in Hashes.
A hash literal is written as a comma-separated list of key/value pairs, enclosed within curly braces. Keys and values are separated with a two-character “arrow”: =>. The Hash object created earlier could also be created with the following literal:
numbers = { "one" => 1, "two" => 2, "three" => 3 }
In general, Symbol objects work more efficiently as hash keys than strings do:
numbers = { :one => 1, :two => 2, :three => 3 }
Symbols are immutable interned strings, written as colon-prefixed identifiers; they are explained in more detail in Symbols later in this chapter.
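As a small illustration of the difference between the two kinds of key, the sketch below (variable names are chosen here for illustration) shows that a string key and a symbol key with the same spelling are unrelated keys, and that every occurrence of a given symbol is the same object:

```ruby
by_string = { "one" => 1 }
by_symbol = { :one => 1 }

by_string["one"]    # => 1
by_string[:one]     # => nil: :one and "one" are different keys
by_symbol[:one]     # => 1

# A symbol is interned: every mention of :one refers to the same object.
:one.equal?(:one)   # => true
```

Mixing the two styles in one hash is a common source of lookups that unexpectedly return nil.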
Ruby 1.8 allows commas in place of arrows, but this deprecated syntax is no longer supported in Ruby 1.9:
numbers = { :one, 1, :two, 2, :three, 3 } # Same, but harder to read
Both Ruby 1.8 and Ruby 1.9 allow a single trailing comma at the end of the key/value list:
numbers = { :one => 1, :two => 2, } # Extra comma ignored
Ruby 1.9 supports a very useful and succinct hash literal syntax when the keys are symbols. In this case, the colon moves to the end of the hash key and replaces the arrow:[*]
numbers = { one: 1, two: 2, three: 3 }
Note that there may not be any space between the hash key identifier and the colon.
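The following sketch, which assumes Ruby 1.9 or later, checks that the two spellings build the same hash with Symbol keys:

```ruby
arrow_style = { :one => 1, :two => 2, :three => 3 }
colon_style = { one: 1, two: 2, three: 3 }

arrow_style == colon_style   # => true
colon_style.keys             # => [:one, :two, :three]
colon_style[:two]            # => 2
```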
Ruby’s hashes are implemented, unsurprisingly, with a data structure known as a hash table. Objects used as keys in a hash must have a method named hash that returns a Fixnum hashcode for the key. If two keys are equal, they must have the same hashcode. Unequal keys may also have the same hashcode, but hash tables are most efficient when duplicate hashcodes are rare.
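The hashcode contract can be observed directly. This is a minimal sketch; note that in Ruby 2.4 and later Fixnum was unified into Integer, so the check below uses Integer, and that the actual hashcode values differ from one run of the interpreter to the next:

```ruby
a = String.new("ruby")
b = String.new("ruby")

a.equal?(b)           # => false: two distinct String objects
a.eql?(b)             # => true:  equal content
a.hash == b.hash      # => true:  equal keys must share a hashcode
a.hash.is_a?(Integer) # => true
```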
The Hash class compares keys for equality with the eql? method. For most Ruby classes, eql? works like the == operator (see Object Equality for details). If you define a new class that overrides the eql? method, you must also override the hash method, or else instances of your class will not work as keys in a hash. (We’ll see examples of writing a hash method in Chapter 7.)
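As a rough sketch of what overriding both methods can look like, consider a hypothetical Point class (not taken from this book). Delegating to Array#hash is one common way to combine the fields into a hashcode, and is an assumption here rather than the approach Chapter 7 necessarily takes:

```ruby
# Hypothetical value class whose instances compare by content.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Two points are the same key when both coordinates are eql?.
  def eql?(other)
    other.is_a?(Point) && x.eql?(other.x) && y.eql?(other.y)
  end

  # Equal points must return equal hashcodes, so derive it from the same fields.
  def hash
    [x, y].hash
  end
end

places = { Point.new(1, 2) => "home" }
places[Point.new(1, 2)]   # => "home": a different object, but an equal key
places[Point.new(3, 4)]   # => nil

# eql? is stricter than == for numbers, which matters for lookups:
1 == 1.0                  # => true
1.eql?(1.0)               # => false
{ 1 => "int" }[1.0]       # => nil
```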
If you define a class and do not override eql?, then instances of that class are compared for object identity when used as hash keys. Two distinct instances of your class are distinct hash keys even if they represent the same content. In this case, the default hash method is appropriate: it returns a hashcode based on the object’s identity (its unique object_id).
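The default, identity-based behavior can be seen with a hypothetical Token class that overrides neither eql? nor hash:

```ruby
# Hypothetical class that inherits eql? and hash from Object.
class Token
  def initialize(text)
    @text = text
  end
end

a = Token.new("x")
b = Token.new("x")       # same content, different object

seen = { a => "first" }
seen[a]                  # => "first"
seen[b]                  # => nil: b is a distinct key
a.eql?(b)                # => false
```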
Note that mutable objects are problematic as hash keys. Changing the content of an object typically changes its hashcode. If you use an object as a key and then alter that object, the internal hash table becomes corrupted, and the hash no longer works correctly.
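A minimal sketch of the problem, using an Array as the key (the variable names are illustrative). The entry is still in the hash after the mutation, but it is filed under a stale hashcode:

```ruby
key = [1, 2]
lookup = { key => "pair" }
lookup[key]        # => "pair"

key << 3           # mutate the key in place; its hashcode changes

lookup[[1, 2]]     # => nil: the stored key no longer has this content
lookup[key]        # typically nil as well: the entry sits under the old hashcode
lookup.size        # => 1: the entry exists but is effectively unreachable
```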
Because strings are mutable but commonly used hash keys, Ruby treats them as a special case and makes private copies of all strings used as keys. This is the only special case, however; you must be very cautious when using any other mutable object as a hash key. Consider making a private copy or calling the freeze method.
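The sketch below illustrates both points under the assumption of a reasonably recent Ruby: an unfrozen string key is stored as a frozen private copy, and freezing another kind of key turns an accidental mutation into an error (FrozenError in Ruby 2.5 and later, which is a subclass of RuntimeError):

```ruby
name = String.new("alice")   # an unfrozen, mutable string
ages = {}
ages[name] = 30

stored = ages.keys.first
stored.frozen?               # => true:  the hash holds a frozen copy
stored.equal?(name)          # => false: it is not the original object

name << "!"                  # mutating the original does not disturb the hash
ages["alice"]                # => 30

# For other mutable keys, freezing guards against accidental changes.
coords = [1, 2].freeze
grid = { coords => "marker" }
mutation_blocked = false
begin
  coords << 3
rescue RuntimeError
  mutation_blocked = true    # the frozen key could not be altered
end
grid[[1, 2]]                 # => "marker"
```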
If you must use mutable hash keys, call the rehash method of the Hash every time you mutate a key.
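A short sketch of rehash repairing a hash whose Array key has been changed in place (the names are illustrative):

```ruby
tags = ["ruby"]
notes = { tags => "intro chapter" }

tags << "hashes"     # the key now has different content and a different hashcode
notes.rehash         # rebuild the table from the keys' current hashcodes

notes[tags]                  # => "intro chapter"
notes[["ruby", "hashes"]]    # => "intro chapter"
notes[["ruby"]]              # => nil: the old content is no longer a key
```

Rehashing rebuilds the whole table, so a hash whose keys change often may be better served by immutable keys.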