6.3 Hashes

Unlike arrays, which strictly use integer indices, hashes can use any data type as their index. What Ruby calls a “hash” is really a clever way of using a string data type to map quickly to a specific element inside an array.
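As a small illustration of the claim that a hash can use any data type as its index, the following sketch stores values under a string, an integer, a symbol, and an array. The variable name lookup and the stored values are made up for this illustration.

```ruby
# Hypothetical hash used only to show different kinds of keys.
lookup = Hash.new
lookup["Geraldo"] = "string key"
lookup[42] = "integer key"
lookup[:math] = "symbol key"
lookup[[1, 2]] = "array key"

# Each value is retrieved with the same kind of key that stored it.
puts lookup["Geraldo"]
puts lookup[42]
puts lookup[:math]
puts lookup[[1, 2]]
```

In practice, strings (and symbols) are by far the most common choice for keys.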

The string is referred to as a hash key. Some kind of function must exist to map a string to a number. For example, a simple hash function could add up the ASCII codes for each letter and take the sum modulo the number of slots in the underlying array. A hash collision occurs when our hash function returns the same number for two different keys; collisions can be handled with various collision resolution algorithms. A simple collision resolution algorithm places all keys that collide into a bucket, and the bucket is scanned sequentially for the requested key. A detailed discussion of hashing is beyond the scope of this book, but we wanted to illustrate the differences between a hash table and an array.
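The idea can be sketched in a few lines of Ruby. This is only a teaching sketch of the simple hash function and bucket scheme described above, not how Ruby's built-in Hash is implemented. The names NUM_BUCKETS, simple_hash, store, and fetch, as well as the table size of 8, are assumptions made for this illustration.

```ruby
# Illustrative only: Ruby's built-in Hash uses a different, more robust scheme.
NUM_BUCKETS = 8  # assumed number of slots in the underlying array

# Add up the character codes of the key, then take the sum modulo the table size.
def simple_hash(key)
  key.bytes.sum % NUM_BUCKETS
end

# Place the key and value in the bucket selected by the hash function.
def store(buckets, key, value)
  bucket = buckets[simple_hash(key)]
  pair = bucket.find { |k, _| k == key }
  if pair
    pair[1] = value          # key already present: overwrite its value
  else
    bucket << [key, value]   # new key: append it to the bucket
  end
end

# Scan the selected bucket sequentially for the requested key.
def fetch(buckets, key)
  pair = buckets[simple_hash(key)].find { |k, _| k == key }
  pair && pair[1]
end

# Each slot holds a bucket, that is, an array of [key, value] pairs.
buckets = Array.new(NUM_BUCKETS) { [] }

store(buckets, "abc", 1)
store(buckets, "cab", 2)   # same letters, same sum: collides with "abc"

puts simple_hash("abc") == simple_hash("cab")
puts fetch(buckets, "abc")
puts fetch(buckets, "cab")
```

Because "abc" and "cab" contain the same letters, they produce the same sum and land in the same bucket; the sequential scan inside fetch is what tells them apart.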

In most cases, strings are used as the keys that are associated with values. For example, instead of using a two-dimensional array, we can use a hash to store student test scores by name, as seen in Example 6-8. As with arrays, line 1 creates a new hash structure. Likewise, element assignment, in lines 2–4, follows the same process used for arrays.

Example 6-8. Example hash usage
    1 scores = Hash.new
    2 scores["Geraldo"] = [98, 95, 93, 96]
    3 scores["Brittany"] = [74, 90, 84, 92]
    4 scores["Michael"] = [72, 87, 68, 54, 10]

Example: Hash

To access Brittany’s scores, we could simply use scores["Brittany"]. Of course, the string "Brittany" can also be replaced by a variable that holds that string.

Gem of Wisdom

Arrays are accessed with a numerical index, as in array[5]. Hashes are typically accessed with a string as the key, as in scores["Brittany"].

Example: Accessing a Hash

Example 6-9. Example hash accessor usage
    1 scores = Hash.new
    2 scores["Geraldo"] = [98, 95, 93, 96]
    3 scores["Brittany"] = [74, 90, 84, 92]
    4 scores["Michael"] = [72, 87, 68, 54, 10]
    5 name = "Brittany"
    6 puts name + " first score is: " + scores[name][0].to_s

In line 5 of Example 6-9, we assigned “Brittany” to the variable name; so, assuming that the code of Example 6-9 is stored in file hash_2.rb, executing the code should display Brittany’s first score on the screen:

$ ruby hash_2.rb
Brittany first score is: 74

It is possible to get an array of all the keys by calling on scores.keys. We can then go through each key by using a for loop. We can now rewrite the maximum score example to work for any number of students, no matter what their names are or how many scores each student has.

Note that in our example, the number of individual scores varies among the students. That is, in Example 6-9, both “Geraldo” and “Brittany” have four scores each, while “Michael” has five. The ability to have varying numbers of entries provides great flexibility.
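As a brief sketch of these two points, the following code obtains the array of keys with scores.keys, walks through it with a for loop, and prints how many scores each student has. It reuses the data from Example 6-9; the variable names is introduced only for this illustration.

```ruby
scores = Hash.new
scores["Geraldo"] = [98, 95, 93, 96]
scores["Brittany"] = [74, 90, 84, 92]
scores["Michael"] = [72, 87, 68, 54, 10]

# scores.keys returns an array containing every key in the hash.
names = scores.keys

# Visit each key and report how many scores are stored under it.
for name in names
  puts name + " has " + scores[name].size.to_s + " scores"
end
```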

Example: Find the Max—Hash

Example 6-10. Find the max—hash
     1 scores = Hash.new
     2 
     3 scores["Geraldo"] = [98, 95, 93, 96]
     4 scores["Brittany"] = [74, 90, 84, 92]
     5 scores["Michael"] = [72, 87, 68, 54, 10]
     6 
     7 maxscore = 0
     8 for name in scores.keys
     9 	column = 0
    10 	while (column < scores[name].size)
    11 
    12 		if (scores[name][column] > maxscore)
    13 			maxname = name
    14 			maxscore = scores[name][column]
    15 		end
    16 		column = column + 1
    17 	end
    18 end
    19 
    20 puts maxname + " has the highest score."
    21 puts "The highest score is: " + maxscore.to_s

We see that running the code from Example 6-10, stored in file find_max_hash.rb, will output the following result:

$ ruby find_max_hash.rb
Geraldo has the highest score.
The highest score is: 98

Note that the entries in this hash differ from the entries used in the array example.
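For comparison, the same search can be written more compactly with the each method of a hash, which supplies the key and the value together, and the max method of an array. This is one possible alternative sketch, not a replacement for Example 6-10; it assumes every student has at least one score. The variable names student_scores and best are introduced only for this illustration.

```ruby
scores = Hash.new
scores["Geraldo"] = [98, 95, 93, 96]
scores["Brittany"] = [74, 90, 84, 92]
scores["Michael"] = [72, 87, 68, 54, 10]

maxname = nil
maxscore = nil

# each hands the block one key and its value at a time.
scores.each do |name, student_scores|
  best = student_scores.max   # highest score for this student
  if maxscore.nil? || best > maxscore
    maxname = name
    maxscore = best
  end
end

puts maxname + " has the highest score."
puts "The highest score is: " + maxscore.to_s
```

Starting maxscore at nil rather than 0 means the search would still work if every score were zero or negative.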

Hashes cannot replace arrays outright. Because their keys are not positions, the elements of a hash have no inherent numerical sequence; Ruby remembers the order in which entries were inserted, but a key says nothing about where an entry sits. Hashes and arrays serve separate but similar roles. Hashes excel at lookup. A hash keyed on name with a phone number as a value is much easier to work with than a multidimensional array of names and phone numbers.
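The phone number comparison can be sketched as follows. The names phone_array and phone_hash and all of the phone numbers are made up for this illustration.

```ruby
# Multidimensional array: each entry is a [name, number] pair.
phone_array = [["Geraldo", "555-0101"],
               ["Brittany", "555-0102"],
               ["Michael", "555-0103"]]

# Finding a number means scanning the entries for a matching name.
number = nil
for entry in phone_array
  if entry[0] == "Brittany"
    number = entry[1]
  end
end
puts number

# Hash keyed on name: the lookup is a single expression.
phone_hash = Hash.new
phone_hash["Geraldo"] = "555-0101"
phone_hash["Brittany"] = "555-0102"
phone_hash["Michael"] = "555-0103"
puts phone_hash["Brittany"]
```

Both versions print the same number, but the hash version needs no loop and no knowledge of which column holds the name.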

Arrays refer to a sequence of variables where each variable does not have a name; instead, it is referenced by an integer index. That is, arr[i] refers to the element at index i in the sequence, remembering that indices start at 0. In contrast, a hash table uses a key-value pairing to identify the particular entry. In the earlier example, we wished to access test scores based on a person’s name. That is, scores["Geraldo"] identifies Geraldo’s test scores even though "Geraldo" is not an integer. Such referencing supports both efficient access and logical correlations.
