8. References: Crossed Signals

Chapter 8. References: Crossed Signals

Ever sent an email to the wrong contact? You probably had a hard time sorting out the confusion that ensued. Well, Ruby objects are just like those contacts in your address book, and calling methods on them is like sending messages to them. If your address book gets mixed up, it’s possible to send messages to the wrong object. This chapter will help you recognize the signs that this is happening, and help you get your programs running smoothly again.

Some confusing bugs

The word continues to spread—if someone has a Ruby problem, your company can solve it. And so people are showing up at your door with some unusual dilemmas...

This astronomer thinks he has a clever way to save some coding. Instead of typing my_star = CelestialBody.new and my_star.type = 'star' for every star he wants to create, he wants to just copy the original star and set a new name for it.

But the plan seems to be backfiring. All three of his CelestialBody instances are reporting that they have the same name!
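The astronomer’s listing is not reproduced in this text. A minimal sketch of what the description implies might look like the following. The class body and the variable names (altair, polaris, vega) are assumptions, based on the attribute names and star names mentioned later in this chapter.

```ruby
# Assumed shape of the class: two plain attributes, name and type.
class CelestialBody
  attr_accessor :type, :name
end

altair = CelestialBody.new
altair.name = 'Altair'
altair.type = 'star'

# The "shortcut": copy the variable, then set a new name.
polaris = altair
polaris.name = 'Polaris'
vega = polaris
vega.name = 'Vega'

puts altair.name, polaris.name, vega.name
# Prints "Vega" three times.
```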

The heap

The bug in the star catalog program stems from an underlying problem: the developer thinks he’s working with multiple objects, when actually he’s operating on the same object over and over.

To understand how that can be, we’re going to need to learn about where objects really live, and how your programs communicate with them.

Rubyists often talk about “placing objects in variables,” “storing objects in arrays,” “storing an object in a hash value,” and so forth. But that’s just a simplification of what actually happens, because you can’t actually put an object in a variable, array, or hash.

Instead, all Ruby objects live on the heap, an area of your computer’s memory allocated for object storage.

When a new object is created, Ruby allocates space on the heap where it can live.

Generally, you don’t need to concern yourself with the heap; Ruby manages it for you. The heap grows in size if more space is needed, and objects that are no longer referenced get cleared off the heap by Ruby’s garbage collector.

But we do need to be able to retrieve items that are stored on the heap. And we do that with references. Read on to learn more about them.

References

When you want to send a letter to a particular person, how do you get it to them? Each residence in a city has an address that mail can be sent to. You simply write the address on an envelope. A postal worker then uses that address to find the residence and deliver the letter.

When a friend of yours moves into a new residence, they give you their address, which you then write down in an address book or other convenient place. This allows you to communicate with them in the future.

Ruby uses references to locate objects on the heap, like you might use an address to locate a house. When a new object is created, it returns a reference to itself. You store that reference in a variable, array, or other convenient place. Similar to a house address, the reference tells Ruby where the object “lives” on the heap.

Later, you can use that reference to call methods on the object (which, you might recall, is similar to sending them a message).

We want to stress this: variables, arrays, hashes, and so on never hold objects. They hold references to objects. Objects live on the heap, and they are accessed through the references held in variables.

When references go wrong

Andy met not one, but two, gorgeous women last week: Betty and Candace. Better yet, they both live on his street.

Andy intended to write down both their addresses in his address book. Unfortunately for him, he accidentally wrote down the same address (Betty’s) for both women.

Later that week, Betty received two letters from Andy:

Now, Betty is angry at Andy, and Candace (who never received a letter) thinks Andy is ignoring her.

What does any of this have to do with fixing our Ruby programs? You’re about to find out...

Aliasing

Andy’s dilemma can be simulated in Ruby with this simple class, called LoveInterest. A LoveInterest has an instance method, request_date, which will print an affirmative response just once. If the method is called again after that, the LoveInterest will report that it’s busy.
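The class listing itself is missing from this text. A minimal sketch consistent with the description could look like this. The @busy instance variable and the exact response strings are assumptions made for illustration.

```ruby
class LoveInterest
  def initialize
    @busy = false   # assumed name for the "already has plans" flag
  end

  # Prints an affirmative response once, then reports being busy.
  def request_date
    if @busy
      puts "Sorry, I'm busy."
    else
      puts "Sure, let's go!"
      @busy = true
    end
  end
end

someone = LoveInterest.new
someone.request_date   # prints "Sure, let's go!"
someone.request_date   # prints "Sorry, I'm busy."
```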

Normally, when using this class, we would create two separate objects and store references to them in two separate variables:

betty = LoveInterest.new
candace = LoveInterest.new

When we use the two separate references to call request_date on the two separate objects, we get two affirmative answers, as we expect.

We can confirm that we’re working with two different objects by using the object_id instance method, which almost all Ruby objects have. It returns a unique identifier for each object.

But if we copy the reference instead, we wind up with two references to the same object, under two different names (the variables betty and candace).

This sort of thing is known as aliasing, because you have multiple names for a single thing. This can be dangerous if you’re not expecting it!

In this case, the calls to request_date both go to the same object. The first time, it responds that it’s available, but the second request is rejected.
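Here is a sketch of the aliased version, so you can watch both calls land on one object. The LoveInterest body is our assumed implementation (the original listing is not included here); the variable names betty and candace come from the text.

```ruby
class LoveInterest
  def initialize
    @busy = false
  end

  def request_date
    if @busy
      puts "Sorry, I'm busy."
    else
      puts "Sure, let's go!"
      @busy = true
    end
  end
end

betty = LoveInterest.new
candace = betty            # copies the reference, not the object

puts betty.object_id == candace.object_id   # prints "true"
betty.request_date         # prints "Sure, let's go!"
candace.request_date       # prints "Sorry, I'm busy."
```

Replacing the second assignment with candace = LoveInterest.new gives two different object IDs and two affirmative answers.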

This aliasing behavior seems awfully familiar... Remember the malfunctioning star catalog program? Let’s go back and take another look at that next.

Here is a Ruby class:

class Counter

  def initialize
    @count = 0
  end

  def increment
    @count += 1
    puts @count
  end

end

And here is some code that uses that class:

a = Counter.new
b = Counter.new
c = b
d = c

a.increment
b.increment
c.increment
d.increment

Guess what the code will output, and write your answer in the blanks.

Note

(We’ve filled in the first one for you.)
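Once you have made your guess, one way to check it is to trace which object each variable refers to. The sketch below repeats the exercise code and adds comments with that trace; only the comments are new.

```ruby
class Counter
  def initialize
    @count = 0
  end

  def increment
    @count += 1
    puts @count
  end
end

a = Counter.new   # first Counter object
b = Counter.new   # second Counter object
c = b             # alias for the second object
d = c             # another alias for the second object

a.increment   # prints 1 (first object's count)
b.increment   # prints 1 (second object's count)
c.increment   # prints 2 (same object as b)
d.increment   # prints 3 (same object as b and c)
```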

Fixing the astronomer’s program

Now that we’ve learned about aliasing, let’s take another look at the astronomer’s malfunctioning star catalog, and see if we can figure out the problem this time...

If we try calling object_id on the objects in the three variables, we’ll see that all three variables refer to the same object. The same object under three different names...sounds like another case of aliasing!

By copying the contents of the variables, the astronomer did not get three distinct CelestialBody instances as he thought. Instead, he’s a victim of unintentional aliasing — he got one CelestialBody with three references to it!

To this poor, bewildered object, the sequence of instructions looked like this:

“Set your name attribute to 'Altair', and your type attribute is now 'star'.”
“Now set your name to 'Polaris'.”
“Now your name is 'Vega'.”
“Give us your name attribute 3 times.”

The CelestialBody dutifully complied, and told us three times that its name was now Vega.

Fortunately, a fix will be easy. We just need to skip the shortcuts and actually create three CelestialBody instances.

And as we can see from the output, the problem is fixed!
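The corrected listing is not reproduced in this text, so here is a sketch of what the fix might look like. The class body and variable names are the same assumptions as before; the point is simply one call to CelestialBody.new per star.

```ruby
class CelestialBody
  attr_accessor :type, :name
end

altair = CelestialBody.new
altair.name = 'Altair'
altair.type = 'star'

polaris = CelestialBody.new   # a new object, not a copied reference
polaris.name = 'Polaris'
polaris.type = 'star'

vega = CelestialBody.new
vega.name = 'Vega'
vega.type = 'star'

puts altair.name, polaris.name, vega.name
# Prints Altair, Polaris, and Vega, one per line.
```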

It’s definitely good policy to avoid copying references from variable to variable. But there are other circumstances where you need to be aware of how aliasing works, as we’ll see shortly.

Quickly identifying objects with “inspect”

Before we move on, we should mention a shortcut for identifying objects. We’ve already shown you how to use the object_id instance method. If it outputs the same value for the object in two variables, you know they both point to the same object.

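The string returned by the default inspect instance method also includes a hexadecimal value that identifies the object (hexadecimal numbers consist of the digits 0 through 9 and the letters a through f). In current versions of the standard Ruby interpreter this value is derived from where the object lives in memory, so it need not match the number object_id returns, but it serves the same purpose here. You don’t need to know the details of how hexadecimal works; just know that if you see the same value for the object referenced by two variables, you have two aliases for the same object. A different value means a different object.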
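A quick sketch of that check, using our assumed CelestialBody class. The hexadecimal values will differ from run to run, so no exact output is shown.

```ruby
class CelestialBody
  attr_accessor :type, :name
end

altair = CelestialBody.new
altair.name = 'Altair'
polaris = altair              # alias: same object
vega = CelestialBody.new      # a genuinely separate object
vega.name = 'Vega'

puts altair.inspect
puts polaris.inspect          # same hexadecimal value as the line above
puts vega.inspect             # a different hexadecimal value
```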

Problems with a hash default object

The astronomer is back, with more problematic code...

He needs his hash to be a mix of planets and moons. Since most of his objects will be planets, he set the hash default object to a CelestialBody with a type attribute of "planet". (We saw hash default objects last chapter; they let you set an object the hash will return any time you access a key that hasn’t been assigned to.)

He believes that will let him add planets to the hash simply by assigning names to them. And it seems to work:

When the astronomer needs to add a moon to the hash, he can do that, too. He just has to set the type attribute in addition to the name.

But then, as he continues adding new CelestialBody objects to the hash, it starts behaving strangely...

The problems with using a CelestialBody as a hash default object become apparent as the astronomer tries to add more objects to the hash. When he adds another planet after adding a moon, the planet’s type attribute is set to "moon" as well!

If he goes back and gets the value for the keys he added previously, those objects appear to have been modified as well!
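The astronomer’s code is not included in this text. A sketch that reproduces the behavior described might look like this; the class body, the default_body variable, and the keys 'Europa' and 'Venus' are assumptions ('Mars' appears later in the chapter).

```ruby
class CelestialBody
  attr_accessor :type, :name
end

default_body = CelestialBody.new
default_body.type = 'planet'
bodies = Hash.new(default_body)

bodies['Mars'].name = 'Mars'        # looks like adding a planet
bodies['Europa'].name = 'Europa'    # looks like adding a moon...
bodies['Europa'].type = 'moon'
bodies['Venus'].name = 'Venus'      # ...and then another planet

p bodies['Venus']   # type is "moon", not "planet"
p bodies['Mars']    # the same object again: name "Venus", type "moon"
```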

Remember we said that the string returned by inspect includes a hexadecimal value identifying the object? And as you know, the p method calls inspect on each object before printing it. Using the p method shows us that every key we look up gives us back the same object!

Looks like we’ve got a problem with aliasing again! On the next few pages, we’ll see how to fix it.

We’re actually modifying the hash default object!

The central problem with this code is that we’re not actually modifying hash values. Instead, we’re modifying the hash default object.

We can confirm this using the default instance method, which is available on all hashes. It lets us look at the default object after we create the hash.

Let’s inspect the default object both before and after we attempt to add a planet to the hash.

So why is a name being added to the default object? Shouldn’t it be getting added to the hash value for bodies['Mars']?

If we look at the object IDs for both bodies['Mars'] and the hash default object, we’ll have our answer:

p bodies['Mars']
p bodies.default

When we access bodies['Mars'], we’re still getting a reference to the hash default object! But why?
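The same check can be made explicit with the equal? method, which is true only when both references point to one object. This is a sketch using our assumed CelestialBody class and a hypothetical default_body variable.

```ruby
class CelestialBody
  attr_accessor :type, :name
end

default_body = CelestialBody.new
default_body.type = 'planet'
bodies = Hash.new(default_body)

p bodies.default                 # no name set yet
bodies['Mars'].name = 'Mars'
p bodies.default                 # the default object now has the name "Mars"

puts bodies['Mars'].equal?(bodies.default)   # prints "true"
```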

A more detailed look at hash default objects

When we introduced the hash default object in the last chapter, we said that you get the default object any time you access a key that hasn’t been assigned to yet. Let’s take a closer look at that last detail.

Let’s suppose we’ve created a hash that will hold student names as the keys, and their grades as the corresponding values. We want the default to be a grade of 'A'.

grades = Hash.new('A')

At first, the hash is completely empty. Any student name that we request a grade for will come back with the hash default object, 'A'.

When we assign a value to a hash key, we’ll get that value back instead of the hash default the next time we try to access it.

Even when some keys have had values assigned, we’ll still get the default object for any key that hasn’t been assigned previously.

But accessing a hash value is not the same as assigning to it. If you access a hash value once and then access it again without making an assignment, you’ll still be getting the default object.

Only when a value is assigned to the hash (not just retrieved from it) will anything other than the default object be returned.
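The steps above can be sketched in a few lines. The student names are made up for illustration.

```ruby
grades = Hash.new('A')

p grades['Regina']        # "A"  (nothing assigned yet, so we get the default)
grades['Regina'] = 'B'
p grades['Regina']        # "B"  (an assigned value wins over the default)
p grades['Carl']          # "A"  (still unassigned)
p grades['Carl']          # "A"  (accessing is not assigning)
p grades.length           # 1    (only 'Regina' was ever stored)
```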

Back to the hash of planets and moons

And that is why, when we try to set the type and name attributes of objects in the hash of planets and moons, we wind up altering the default object instead. We’re not actually assigning any values to the hash. In fact, if we inspect the hash itself, we’ll see that it’s totally empty!

Statements like bodies['Mars'].name = 'Mars' are actually calls to the name= and type= attribute writer methods on the hash default object. Don’t mistake them for assignment to the hash.

When we access a key for which no value has been assigned, we get the default object back.

So bodies['Mars'].name = 'Mars' is not an assignment to the hash. It attempts to access a value for the key 'Mars' from the hash (which is still empty). Since there is no value for 'Mars', it gets a reference to the default object, which it then modifies.

And since there’s still nothing assigned to the hash, the next access gets a reference to the default object as well, and so on.
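To see the difference between calling a method on a looked-up value and assigning to the hash, consider this sketch (same assumed class and default_body variable as before):

```ruby
class CelestialBody
  attr_accessor :type, :name
end

default_body = CelestialBody.new
default_body.type = 'planet'
bodies = Hash.new(default_body)

bodies['Mars'].name = 'Mars'   # reads the default, then calls name= on it
p bodies                       # {}  (nothing was stored)
p bodies.key?('Mars')          # false

bodies['Mars'] = CelestialBody.new   # this IS an assignment to the hash
p bodies.key?('Mars')                # true
```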

Fortunately, we have a solution...

Our wish list for hash defaults

We’ve determined that code like bodies['Mars'].name = 'Mars' doesn’t assign a value to the hash, it just accesses a value. It gets a reference to the default object, which it then (unintentionally) modifies.

Right now, when we access a hash key for which no value has been assigned, we just get a reference to the hash default object.

What we really want is to get an entirely new object for each unassigned hash key.

Of course, if we did that without assigning to the hash, then later accesses would just keep generating new objects over and over...

So it would also be nice if the new object were assigned to the hash for us, so that later accesses would get the same object again (instead of generating new objects over and over).

Hashes have a feature that can do all of this for us!

Hash default blocks

Instead of passing an argument to Hash.new to be used as a hash default object, you can pass a block to Hash.new to be used as the hash default block. When a key is accessed for which no value has been assigned:

The block is called.
The block receives references to the hash and the current key as block parameters. These can be used to assign a value to the hash.
The block return value is returned as the current value of the hash key.

Those rules are a bit complex, so we’ll go over them in more detail in the next few pages. But for now, let’s take a look at your first hash default block:
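The original listing is not reproduced here; a sketch of such a block for the astronomer’s hash, under the same assumptions about CelestialBody as before, might be:

```ruby
class CelestialBody
  attr_accessor :type, :name
end

bodies = Hash.new do |hash, key|
  body = CelestialBody.new
  body.type = 'planet'
  hash[key] = body   # store the new object under the requested key
  body               # return it as the value for this access
end

bodies['Mars'].name = 'Mars'
bodies['Europa'].name = 'Europa'
bodies['Europa'].type = 'moon'
bodies['Venus'].name = 'Venus'

p bodies['Mars'].type     # "planet"
p bodies['Europa'].type   # "moon"
p bodies['Venus'].type    # "planet"
p bodies.length           # 3
```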

If we access keys on this hash, we get separate objects for each key, just like we always intended.

Better yet, the first time we access any key, a value is automatically assigned to the hash for us!

Now that we know it will work, let’s take a closer look at the components of that block...

Hash default blocks: Assigning to the hash

In most cases, you’ll want to assign the value created by your hash default block to the hash. A reference to the hash and the current key are passed to the block, in order to allow you to do so.

When we assign values to the hash in the block body, things work like we’ve been expecting all along. A new object is generated for each new key you access. On subsequent accesses, we get the same object back again, with any changes we’ve made intact.

Watch it!

Don’t forget to assign a value to the hash!

If you forget, the generated value will just be thrown away. The hash key still won’t have a value, and the hash will just keep calling the block over and over to generate new defaults.
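Here is a sketch of that mistake, so you can recognize it. The block is deliberately broken: it builds a new object but never stores it.

```ruby
class CelestialBody
  attr_accessor :type, :name
end

# Bug on purpose: the block never assigns to the hash.
bodies = Hash.new do |hash, key|
  body = CelestialBody.new
  body.type = 'planet'
  body
end

bodies['Mars'].name = 'Mars'   # sets the name on a throwaway object
p bodies['Mars'].name          # nil  (a brand-new object each time)
p bodies                       # {}   (nothing was ever stored)
```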

Hash default blocks: Block return value

When you access an unassigned hash key for the first time, the hash default block’s return value is returned as the value for the key.

As long as you assign a value to the key within the block body, the hash default block won’t be invoked for subsequent accesses of that key; instead, you’ll get whatever value was assigned.

Watch it!

Make sure the block return value matches what you’re assigning to the hash!

Otherwise, you’ll get one value when you first access the key, and a completely different value on subsequent accesses.
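A small sketch of that mismatch, using made-up strings so the two values are easy to tell apart:

```ruby
# Bug on purpose: the block stores one value but returns another.
greetings = Hash.new do |hash, key|
  hash[key] = "stored for #{key}"
  "returned for #{key}"
end

p greetings['x']   # "returned for x"  (first access: the block's return value)
p greetings['x']   # "stored for x"    (later accesses: the assigned value)
```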

Generally speaking, you won’t need to work very hard to remember this rule. As we’ll see on the next page, setting up an appropriate return value for your hash default block happens quite naturally...

Hash default blocks: A shortcut

Thus far, we’ve been returning a value from the hash default block on a separate line:

But Ruby offers a shortcut that can reduce the amount of code in your default block a bit...

You’ve already learned that the value of the last expression in a block is treated as the block’s return value... What we haven’t mentioned is that in Ruby, the value of an assignment expression is the same as the value being assigned.

So we can use an assignment statement by itself in a hash default block, and it will return the assigned value.

And, of course, it will add the value to the hash as well.
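As a quick illustration (the numbers hash and its doubling rule are made up for this sketch):

```ruby
p(x = 3)   # 3  (the assignment expression has the value being assigned)

numbers = Hash.new do |hash, key|
  hash[key] = key * 2   # the only expression, so its value is the block's return value
end

p numbers[4]        # 8
p numbers.key?(4)   # true  (the value was stored, too)
```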

So, in the astronomer’s hash, instead of adding a separate line with a return value, we can just let the value of the assignment expression provide the return value for the block.

The astronomer’s hash: Our final code

Here’s our final code for the hash default block:
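The listing did not survive in this text. Based on the description, a sketch of it could look like the following; the CelestialBody class body and the keys used are assumptions.

```ruby
class CelestialBody
  attr_accessor :type, :name
end

bodies = Hash.new do |hash, key|
  body = CelestialBody.new
  body.type = 'planet'
  hash[key] = body   # the assignment's value doubles as the block's return value
end

bodies['Mars'].name = 'Mars'
bodies['Europa'].name = 'Europa'
bodies['Europa'].type = 'moon'
bodies['Venus'].name = 'Venus'

p bodies['Mars'].type     # "planet"
p bodies['Europa'].type   # "moon"
p bodies['Venus'].name    # "Venus"
```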

Here’s how the program works now:

We use a hash default block to create a unique object for each hash key. (This is unlike a hash default object, which gives references to one object as the default for all keys.)
Within the block, we assign the new object to the current hash key.
The new object becomes the value of the assignment expression, which also becomes the block’s return value. So the first time a given hash key is accessed, a new object is returned as the corresponding value.

Using hash default objects safely

Hash default objects work very well if you use a number as the default.

Okay, it’s a little more complicated than that. Hash default objects work very well if you don’t change the default, and if you assign values back to the hash. It’s just that numbers make it easy to follow these rules.

Take this example, which counts the number of times letters occur in an array. (It works just like the vote counting code from last chapter.)
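The example itself is missing from this text; a sketch of letter counting with a default of 0 might look like this (the letters array is made up):

```ruby
letters = ['a', 'c', 'a', 'b', 'c', 'a']

counts = Hash.new(0)
letters.each do |letter|
  # Reads the current count (or the default 0), adds 1, and assigns the sum back.
  counts[letter] += 1
end

p counts['a']   # 3
p counts['b']   # 1
p counts['c']   # 2
p counts['z']   # 0  (never seen, so we get the untouched default)
```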

Using a hash default object here works because we follow the above two rules...

Hash default object rule #1: Don’t modify the default object

If you’re going to use a hash default object, it’s important not to modify that object. Otherwise, you’ll get unexpected results the next time you access the default. We saw this happen when we used a default object (instead of a default block) for the astronomer’s hash, and it caused havoc:

In Ruby, doing math operations on a numeric object doesn’t modify that object; it returns an entirely new object. We can see this if we look at object IDs before and after an operation.
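A sketch of that comparison, with hypothetical variable names. The exact ID values are left out because they are an implementation detail.

```ruby
number = 3
before = number.object_id
number = number + 1          # number now refers to a different object, 4
after = number.object_id

puts before == after         # prints "false"
puts 3.object_id == before   # prints "true": the object for 3 was never modified
```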

In fact, numeric objects are immutable: they don’t have any methods that modify the object’s state. Any operation that might change the number gives you back an entirely new object.

That’s what makes numbers safe to use as hash default objects; you can be certain that the default number won’t be changed accidentally.

Numbers make good hash default objects because they are immutable.

Hash default object rule #2: Assign values to the hash

If you’re going to use a hash default object, it’s also important to ensure that you’re actually assigning values to the hash. As we saw with the astronomer’s hash, sometimes it can look like you’re assigning to the hash when you’re not...

When we use a number as a default object, though, it’s much more natural to actually assign values to the hash. (Because numbers are immutable, we can’t store the incremented values unless we assign them to the hash!)

The rule of thumb for hash defaults

That’s a lot to keep in mind. So we have a rule of thumb that will keep you out of trouble...

If your default is a number, you can use a hash default object.
If your default is anything else, you should use a hash default block.

As you gain more experience with references, all of this will become second nature, and you can break this rule of thumb when the time is right. Until then, this should prevent most problems you’ll encounter.

Understanding Ruby references and the issue of aliasing won’t help you write more powerful Ruby programs. It will help you quickly find and fix problems when they arise, however. Hopefully this chapter has helped you form a basic understanding of how references work, and will let you avoid trouble in the first place.

Your Ruby Toolbox

That’s it for Chapter 8! You’ve added references to your toolbox.

If you need to store more objects, Ruby will increase the size of the heap for you. If you’re no longer using an object, Ruby will delete it from the heap for you.
Aliasing is the copying of a reference to an object, and it can cause bugs if you do it unintentionally.
Most Ruby objects have an object_id instance method, which returns a unique identifier for the object. You can use it to determine whether you have multiple references to a single object.
The string returned by the inspect method also includes a hexadecimal value that identifies the object, so you can use it for the same check.
If you set a default object for a hash, all unassigned hash keys will return references to that single default object.
For this reason, it’s best to only use immutable objects (objects that can’t be modified), such as numbers, as hash default objects.
If you need any other kind of object as a hash default, it’s better to use a hash default block, so that a unique object is created for each key.
Hash default blocks receive a reference to the hash and the current key as block parameters. In most cases, you’ll want to use these parameters to assign a new object as a value for the given hash key.
The hash default block’s return value is treated as the initial default value for the given key.
The value of a Ruby assignment expression is the same as the value being assigned. So if an assignment expression is the last expression in a block, the value assigned becomes the block’s return value.

Up Next...

In the next chapter, we’re going to get back to the topic of organizing your code. You’ve already learned how to share methods between classes with inheritance. But even in situations where inheritance isn’t appropriate, Ruby offers a way to share behavior across classes: mixins. We’ll learn about those next!
