Chapter 8. References: Crossed Signals

image with no caption

Ever sent an email to the wrong contact? You probably had a hard time sorting out the confusion that ensued. Well, Ruby objects are just like those contacts in your address book, and calling methods on them is like sending messages to them. If your address book gets mixed up, it’s possible to send messages to the wrong object. This chapter will help you recognize the signs that this is happening, and help you get your programs running smoothly again.

Some confusing bugs

The word continues to spread—if someone has a Ruby problem, your company can solve it. And so people are showing up at your door with some unusual dilemmas...

image with no caption

This astronomer thinks he has a clever way to save some coding. Instead of typing my_star = CelestialBody.new and my_star.type = 'star' for every star he wants to create, he wants to just copy the original star and set a new name for it.

image with no caption

But the plan seems to be backfiring. All three of his CelestialBody instances are reporting that they have the same name!

The heap

The bug in the star catalog program stems from an underlying problem: the developer thinks he’s working with multiple objects, when actually he’s operating on the same object over and over.

To understand how that can be, we’re going to need to learn about where objects really live, and how your programs communicate with them.

Rubyists often talk about “placing objects in variables,” “storing objects in arrays,” “storing an object in a hash value,” and so forth. But that’s just a simplification of what actually happens, because you can’t actually put an object in a variable, array, or hash.

Instead, all Ruby objects live on the heap, an area of your computer’s memory allocated for object storage.

image with no caption

When a new object is created, Ruby allocates space on the heap where it can live.

image with no caption

Generally, you don’t need to concern yourself with the heap, because Ruby manages it for you. The heap grows in size if more space is needed, and objects that are no longer used get cleared off the heap.

But we do need to be able to retrieve items that are stored on the heap. And we do that with references. Read on to learn more about them.

References

When you want to send a letter to a particular person, how do you get it to them? Each residence in a city has an address that mail can be sent to. You simply write the address on an envelope. A postal worker then uses that address to find the residence and deliver the letter.

image with no caption

When a friend of yours moves into a new residence, they give you their address, which you then write down in an address book or other convenient place. This allows you to communicate with them in the future.

image with no caption

Ruby uses references to locate objects on the heap, like you might use an address to locate a house. When a new object is created, it returns a reference to itself. You store that reference in a variable, array, or other convenient place. Similar to a house address, the reference tells Ruby where the object “lives” on the heap.

image with no caption

Later, you can use that reference to call methods on the object (which, you might recall, is similar to sending it a message).

image with no caption

We want to stress this: variables, arrays, hashes, and so on never hold objects. They hold references to objects. Objects live on the heap, and they are accessed through the references held in variables.
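Here is a minimal sketch of that idea, using a plain string as the object. The variable names (greeting, list, lookup) are our own assumptions, chosen just for illustration. It relies on equal?, a standard Ruby method that checks whether two references point to the very same object.

```ruby
# One String object is created on the heap; greeting holds a reference to it.
greeting = String.new("hello")

# "Storing the object" in an array or a hash really stores another reference.
list = [greeting]
lookup = { first: greeting }

# Modify the object through one reference...
greeting << " world"

# ...and the change is visible through every other reference.
puts list[0]                    # hello world
puts lookup[:first]             # hello world
puts list[0].equal?(greeting)   # true: both references lead to one object
```

There is only ever one string here. The variable, the array, and the hash each hold their own reference to it.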

When references go wrong

Andy met not one, but two, gorgeous women last week: Betty and Candace. Better yet, they both live on his street.

image with no caption

Andy intended to write down both their addresses in his address book. Unfortunately for him, he accidentally wrote down the same address (Betty’s) for both women.

image with no caption

Later that week, Betty received two letters from Andy:

image with no caption

Now, Betty is angry at Andy, and Candace (who never received a letter) thinks Andy is ignoring her.

What does any of this have to do with fixing our Ruby programs? You’re about to find out...

Aliasing

Andy’s dilemma can be simulated in Ruby with this simple class, called LoveInterest. A LoveInterest has an instance method, request_date, which will print an affirmative response just once. If the method is called again after that, the LoveInterest will report that it’s busy.
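A class matching that description might look something like the sketch below. The response messages and the @busy instance variable are assumptions on our part; only the class name and the request_date method come from the description above.

```ruby
# A sketch of the LoveInterest class described above.
# The exact messages and the @busy variable name are assumptions.
class LoveInterest
  def initialize
    @busy = false   # nobody has asked for a date yet
  end

  def request_date
    if @busy
      puts "Sorry, I'm busy."
    else
      puts "Sure, let's go!"
      @busy = true  # every later request will be turned down
    end
  end
end
```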

image with no caption

Normally, when using this class, we would create two separate objects and store references to them in two separate variables:

betty = LoveInterest.new
candace = LoveInterest.new

image with no caption

When we use the two separate references to call request_date on the two separate objects, we get two affirmative answers, as we expect.

image with no caption

We can confirm that we’re working with two different objects by using the object_id instance method, which almost all Ruby objects have. It returns a unique identifier for each object.
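As a quick sketch of how object_id can be used this way, here are two plain objects created with Object.new (the variable names are assumptions). The actual ID numbers vary from run to run, so we compare them instead of printing them.

```ruby
first = Object.new
second = Object.new

# Two separate objects never share an ID.
puts first.object_id == second.object_id   # false

# Asking the same object twice always gives the same ID.
puts first.object_id == first.object_id    # true
```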

image with no caption

But if we copy the reference instead, we wind up with two references to the same object, under two different names (the variables betty and candace).

This sort of thing is known as aliasing, because you have multiple names for a single thing. This can be dangerous if you’re not expecting it!

image with no caption

In this case, the calls to request_date both go to the same object. The first time, it responds that it’s available, but the second request is rejected.
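The aliasing scenario can be sketched as follows. The LoveInterest class here is our own guess at an implementation that fits the description (the messages and the @busy variable are assumptions); the important line is the one that copies the reference.

```ruby
# Assumed implementation: answers yes once, then reports being busy.
class LoveInterest
  def initialize
    @busy = false
  end

  def request_date
    if @busy
      puts "Sorry, I'm busy."
    else
      puts "Sure, let's go!"
      @busy = true
    end
  end
end

betty = LoveInterest.new
candace = betty        # copies the reference, NOT the object

betty.request_date     # first request to the object: accepted
candace.request_date   # same object again, so this one is rejected

puts betty.object_id == candace.object_id   # true: one object, two names
```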

image with no caption

This aliasing behavior seems awfully familiar... Remember the malfunctioning star catalog program? Let’s go back and take another look at that next.

Fixing the astronomer’s program

Now that we’ve learned about aliasing, let’s take another look at the astronomer’s malfunctioning star catalog, and see if we can figure out the problem this time...

image with no caption

If we try calling object_id on the objects in the three variables, we’ll see that all three variables refer to the same object. The same object under three different names...sounds like another case of aliasing!

image with no caption

By copying the contents of the variables, the astronomer did not get three distinct CelestialBody instances as he thought. Instead, he’s a victim of unintentional aliasing — he got one CelestialBody with three references to it!

image with no caption

To this poor, bewildered object, the sequence of instructions looked like this:

  1. “Set your name attribute to 'Altair', and your type attribute is now 'star'.”

  2. “Now set your name to 'Polaris'.”

  3. “Now your name is 'Vega'.”

  4. “Give us your name attribute 3 times.”

image with no caption

The CelestialBody dutifully complied, and told us three times that its name was now Vega.
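Code that produces this sequence of instructions might look like the sketch below. We are assuming that CelestialBody simply has name and type attribute accessors, and the variable names are ours.

```ruby
# Assumed definition: just two attributes.
class CelestialBody
  attr_accessor :type, :name
end

altair = CelestialBody.new
altair.name = 'Altair'
altair.type = 'star'

polaris = altair            # copies the reference only
polaris.name = 'Polaris'    # renames the one and only object

vega = polaris              # a third name for the same object
vega.name = 'Vega'          # renames it again

puts altair.name, polaris.name, vega.name   # prints Vega three times
```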

image with no caption

Fortunately, a fix will be easy. We just need to skip the shortcuts and actually create three CelestialBody instances.
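A sketch of the corrected version, again assuming a CelestialBody class with name and type accessors:

```ruby
# Assumed definition: just two attributes.
class CelestialBody
  attr_accessor :type, :name
end

altair = CelestialBody.new    # each call to new creates a separate object
altair.name = 'Altair'
altair.type = 'star'

polaris = CelestialBody.new
polaris.name = 'Polaris'
polaris.type = 'star'

vega = CelestialBody.new
vega.name = 'Vega'
vega.type = 'star'

puts altair.name, polaris.name, vega.name   # Altair, Polaris, Vega
```

It takes a few more lines, but each variable now holds a reference to its own object.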

image with no caption

And as we can see from the output, the problem is fixed!

image with no caption

It’s definitely good policy to avoid copying references from variable to variable. But there are other circumstances where you need to be aware of how aliasing works, as we’ll see shortly.

Quickly identifying objects with “inspect”

Before we move on, we should mention a shortcut for identifying objects. We’ve already shown you how to use the object_id instance method. If it outputs the same value for the object in two variables, you know they both point to the same object.

image with no caption

The string returned by the inspect instance method also includes an identifier for the object, in hexadecimal (consisting of the numbers 0 through 9 and the letters a through f). In older versions of Ruby this was a representation of the object ID; in Ruby 2.7 and later it is based on the object’s location in memory instead. You don’t need to know the details of how hexadecimal works; just know that if you see the same value for the object referenced by two variables, you have two aliases for the same object. A different value means a different object.
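Here is a small sketch of that comparison. The CelestialBody class and the variable names are assumptions, and the hexadecimal digits you see will differ from run to run, so we only check whether the strings match.

```ruby
# Assumed definition: just two attributes.
class CelestialBody
  attr_accessor :type, :name
end

altair = CelestialBody.new
altair.name = 'Altair'

same_star = altair                 # an alias for the same object
other_star = CelestialBody.new     # a different object with the same name
other_star.name = 'Altair'

puts altair.inspect
puts same_star.inspect             # identical to the line above
puts other_star.inspect            # same attributes, different hex value

puts altair.inspect == same_star.inspect    # true
puts altair.inspect == other_star.inspect   # false
```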

image with no caption

Problems with a hash default object

The astronomer is back, with more problematic code...

image with no caption

He needs his hash to be a mix of planets and moons. Since most of his objects will be planets, he set the hash default object to a CelestialBody with a type attribute of "planet". (We saw hash default objects last chapter; they let you set an object the hash will return any time you access a key that hasn’t been assigned to.)

image with no caption

He believes that will let him add planets to the hash simply by assigning names to them. And it seems to work:

image with no caption

When the astronomer needs to add a moon to the hash, he can do that, too. He just has to set the type attribute in addition to the name.

image with no caption

But then, as he continues adding new CelestialBody objects to the hash, it starts behaving strangely...

The problems with using a CelestialBody as a hash default object become apparent as the astronomer tries to add more objects to the hash. When he adds another planet after adding a moon, the planet’s type attribute is set to "moon" as well!

image with no caption

If he goes back and gets the value for the keys he added previously, those objects appear to have been modified as well!

image with no caption
image with no caption

Good observation! Remember we said that the string returned by the inspect method includes a hexadecimal identifier for the object? And as you know, the p method calls inspect on each object before printing it. Using the p method shows us that every key we look up in the hash gives us the same object!
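The whole problem can be reproduced with a sketch like the one below. The CelestialBody class and the particular planet and moon names are assumptions; the pattern of passing a single object to Hash.new is what matters.

```ruby
# Assumed definition: just two attributes.
class CelestialBody
  attr_accessor :type, :name
end

default_body = CelestialBody.new
default_body.type = 'planet'
bodies = Hash.new(default_body)   # one object serves as the default for every key

bodies['Mars'].name = 'Mars'
bodies['Europa'].name = 'Europa'
bodies['Europa'].type = 'moon'
bodies['Venus'].name = 'Venus'

# All three lines show the same hex value, the name "Venus", and the type "moon".
p bodies['Mars']
p bodies['Europa']
p bodies['Venus']
```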

image with no caption

Looks like we’ve got a problem with aliasing again! On the next few pages, we’ll see how to fix it.

We’re actually modifying the hash default object!

The central problem with this code is that we’re not actually modifying hash values. Instead, we’re modifying the hash default object.

We can confirm this using the default instance method, which is available on all hashes. It lets us look at the default object after we create the hash.

Let’s inspect the default object both before and after we attempt to add a planet to the hash.

image with no caption

So why is a name being added to the default object? Shouldn’t it be getting added to the hash value for bodies['Mars']?

If we look at the object IDs for both bodies['Mars'] and the hash default object, we’ll have our answer:

p bodies['Mars']
p bodies.default

image with no caption

When we access bodies['Mars'], we’re still getting a reference to the hash default object! But why?
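A sketch of that investigation, assuming the same kind of CelestialBody class with name and type accessors:

```ruby
# Assumed definition: just two attributes.
class CelestialBody
  attr_accessor :type, :name
end

default_body = CelestialBody.new
default_body.type = 'planet'
bodies = Hash.new(default_body)

p bodies.default               # type is "planet"; no name yet

bodies['Mars'].name = 'Mars'   # looks like it adds Mars to the hash...

p bodies.default               # ...but the default object now has the name "Mars"

# equal? is true only when both references point to the very same object.
puts bodies['Mars'].equal?(bodies.default)   # true
```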

A more detailed look at hash default objects

When we introduced the hash default object in the last chapter, we said that you get the default object any time you access a key that hasn’t been assigned to yet. Let’s take a closer look at that last detail.

image with no caption

Let’s suppose we’ve created a hash that will hold student names as the keys, and their grades as the corresponding values. We want the default to be a grade of 'A'.

grades = Hash.new('A')

At first, the hash is completely empty. Any student name that we request a grade for will come back with the hash default object, 'A'.

image with no caption

When we assign a value to a hash key, we’ll get that value back instead of the hash default the next time we try to access it.

image with no caption

Even when some keys have had values assigned, we’ll still get the default object for any key that hasn’t been assigned previously.

image with no caption

But accessing a hash value is not the same as assigning to it. If you access a hash value once and then access it again without making an assignment, you’ll still be getting the default object.

image with no caption

Only when a value is assigned to the hash (not just retrieved from it) will anything other than the default object be returned.
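The steps above can be sketched as follows. The student names are made up for illustration.

```ruby
grades = Hash.new('A')

p grades['Regina']       # "A": nothing assigned yet, so we get the default

grades['Regina'] = 'B'   # an actual assignment to the hash
p grades['Regina']       # "B": the assigned value wins from now on

p grades['Carl']         # "A": Carl has no value, so we get the default
p grades['Carl']         # "A": accessing a key does not assign anything

p grades.keys            # ["Regina"]: only the assigned key is stored
```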

image with no caption

Back to the hash of planets and moons

And that is why, when we try to set the type and name attributes of objects in the hash of planets and moons, we wind up altering the default object instead. We’re not actually assigning any values to the hash. In fact, if we inspect the hash itself, we’ll see that it’s totally empty!

image with no caption
image with no caption
image with no caption

Actually, those are calls to the name= and type= attribute writer methods on the hash default object. Don’t mistake them for assignment to the hash.

When we access a key for which no value has been assigned, we get the default object back.

image with no caption

The statement below is not an assignment to the hash. It attempts to access a value for the key 'Mars' from the hash (which is still empty). Since there is no value for 'Mars', it gets a reference to the default object, which it then modifies.

image with no caption

And since there’s still nothing assigned to the hash, the next access gets a reference to the default object as well, and so on.
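To make the difference between accessing and assigning easier to see, here is a sketch that splits the statement into its two steps. The CelestialBody class and the variable names are assumptions.

```ruby
# Assumed definition: just two attributes.
class CelestialBody
  attr_accessor :type, :name
end

default_body = CelestialBody.new
default_body.type = 'planet'
bodies = Hash.new(default_body)

# bodies['Mars'].name = 'Mars' does the same thing as these two steps:
body = bodies['Mars']   # a lookup; nothing gets stored under 'Mars'
body.name = 'Mars'      # a call to the name= method on the default object

p bodies                     # {}: the hash itself is still empty
puts bodies.key?('Mars')     # false
puts bodies['Venus'].name    # Mars: another unassigned key, same default object
```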

Fortunately, we have a solution...

image with no caption

Our wish list for hash defaults

We’ve determined that this code doesn’t assign a value to the hash, it just accesses a value. It gets a reference to the default object, which it then (unintentionally) modifies.

image with no caption

Right now, when we access a hash key for which no value has been assigned, we just get a reference to the hash default object.

image with no caption

What we really want is to get an entirely new object for each unassigned hash key.

image with no caption

Of course, if we did that without assigning to the hash, then later accesses would just keep generating new objects over and over...

image with no caption

So it would also be nice if the new object were assigned to the hash for us, so that later accesses would get the same object again (instead of generating new objects over and over).

image with no caption

Hashes have a feature that can do all of this for us!

Hash default blocks

Instead of passing an argument to Hash.new to be used as a hash default object, you can pass a block to Hash.new to be used as the hash default block. When a key is accessed for which no value has been assigned:

  • The block is called.

  • The block receives references to the hash and the current key as block parameters. These can be used to assign a value to the hash.

  • The block return value is returned as the current value of the hash key.

Those rules are a bit complex, so we’ll go over them in more detail in the next few pages. But for now, let’s take a look at your first hash default block:

image with no caption

If we access keys on this hash, we get separate objects for each key, just like we always intended.

image with no caption
image with no caption

Better yet, the first time we access any key, a value is automatically assigned to the hash for us!
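A hash default block along these lines might look like the sketch below. The CelestialBody class, the block parameter names (hash and key), and the sample names are assumptions; passing a block to Hash.new is standard Ruby.

```ruby
# Assumed definition: just two attributes.
class CelestialBody
  attr_accessor :type, :name
end

bodies = Hash.new do |hash, key|
  body = CelestialBody.new   # a brand-new object for each unassigned key
  body.type = 'planet'
  hash[key] = body           # store it in the hash under the requested key
  body                       # return it as the value for this access
end

bodies['Mars'].name = 'Mars'
bodies['Europa'].name = 'Europa'
bodies['Europa'].type = 'moon'
bodies['Venus'].name = 'Venus'

p bodies['Mars'], bodies['Europa'], bodies['Venus']   # three different objects
p bodies.keys                                         # ["Mars", "Europa", "Venus"]
```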

image with no caption

Now that we know it will work, let’s take a closer look at the components of that block...

Hash default blocks: Assigning to the hash

In most cases, you’ll want to assign the value created by your hash default block to the hash. A reference to the hash and the current key are passed to the block, in order to allow you to do so.

image with no caption

When we assign values to the hash in the block body, things work like we’ve been expecting all along. A new object is generated for each new key you access. On subsequent accesses, we get the same object back again, with any changes we’ve made intact.

image with no caption

Watch it!

Don’t forget to assign a value to the hash!

If you forget, the generated value will be returned once and then thrown away. The hash key still won’t have a value, and the hash will just keep calling the block over and over to generate new defaults.
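Here is a small sketch of that mistake. The calls counter and the variable names are assumptions we added so the repeated block calls are easy to see.

```ruby
calls = 0

forgetful = Hash.new do |hash, key|
  calls += 1
  "default for #{key}"   # returned to the caller, but never stored in the hash
end

first = forgetful['a']
second = forgetful['a']

puts calls                  # 2: the block ran once per access
puts first.equal?(second)   # false: a new string was generated each time
p forgetful                 # {}: nothing was ever assigned
```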

image with no caption

Hash default blocks: Block return value

When you access an unassigned hash key for the first time, the hash default block’s return value is returned as the value for the key.

image with no caption

As long as you assign a value to the key within the block body, the hash default block won’t be invoked for subsequent accesses of that key; instead, you’ll get whatever value was assigned.

Watch it!

Make sure the block return value matches what you’re assigning to the hash!

Otherwise, you’ll get one value when you first access the key, and a completely different value on subsequent accesses.
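A deliberately broken sketch shows what goes wrong. The strings 'stored' and 'returned' are made up so the two values are easy to tell apart.

```ruby
mismatched = Hash.new do |hash, key|
  hash[key] = 'stored'   # this is what ends up in the hash...
  'returned'             # ...but this is what the first access gets back
end

p mismatched['x']   # "returned": the block's return value
p mismatched['x']   # "stored": the block is skipped; the assigned value is used
```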

image with no caption

Generally speaking, you won’t need to work very hard to remember this rule. As we’ll see on the next page, setting up an appropriate return value for your hash default block happens quite naturally...

Hash default blocks: A shortcut

Thus far, we’ve been returning a value from the hash default block on a separate line:

image with no caption

But Ruby offers a shortcut that can reduce the amount of code in your default block a bit...

You’ve already learned that the value of the last expression in a block is treated as the block’s return value... What we haven’t mentioned is that in Ruby, the value of an assignment expression is the same as the value being assigned.

image with no caption

So we can use an assignment statement by itself in a hash default block, and it will return the assigned value.
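As a quick sketch of both ideas, here is an assignment used as an expression, followed by a one-line hash default block. The greetings hash and its contents are assumptions made up for this example.

```ruby
# An assignment expression has the same value as what was assigned.
p(x = 3)   # prints 3

# So a single assignment can both store the value and be the block's return value.
greetings = Hash.new { |hash, key| hash[key] = "Hi, #{key}" }

p greetings['Kayla']   # "Hi, Kayla"
p greetings.keys       # ["Kayla"]: the value was stored as well
```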

image with no caption

And, of course, it will add the value to the hash as well.

image with no caption

So, in the astronomer’s hash, instead of adding a separate line with a return value, we can just let the value of the assignment expression provide the return value for the block.

image with no caption

The astronomer’s hash: Our final code

image with no caption

Here’s our final code for the hash default block:

image with no caption
image with no caption

Here’s how the program works now:

  • We use a hash default block to create a unique object for each hash key. (This is unlike a hash default object, which gives references to one object as the default for all keys.)

  • Within the block, we assign the new object to the current hash key.

  • The new object becomes the value of the assignment expression, which also becomes the block’s return value. So the first time a given hash key is accessed, a new object is returned as the corresponding value.
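Putting those three points together, the finished program could be sketched like this. As before, the CelestialBody definition and the sample names are assumptions.

```ruby
# Assumed definition: just two attributes.
class CelestialBody
  attr_accessor :type, :name
end

bodies = Hash.new do |hash, key|
  body = CelestialBody.new
  body.type = 'planet'
  hash[key] = body   # the assignment's value is also the block's return value
end

bodies['Mars'].name = 'Mars'
bodies['Europa'].name = 'Europa'
bodies['Europa'].type = 'moon'
bodies['Venus'].name = 'Venus'

# Later accesses get the same stored object back, changes intact.
puts bodies['Europa'].type   # moon
puts bodies['Venus'].type    # planet
puts bodies['Mars'].equal?(bodies['Mars'])   # true
```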

Using hash default objects safely

image with no caption

Hash default objects work very well if you use a number as the default.

image with no caption

Okay, it’s a little more complicated than that. Hash default objects work very well if you don’t change the default, and if you assign values back to the hash. It’s just that numbers make it easy to follow these rules.

Take this example, which counts the number of times letters occur in an array. (It works just like the vote counting code from last chapter.)

image with no caption

Using a hash default object here works because we follow the above two rules...
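A letter-counting program of this kind might be sketched as follows. The sample letters and variable names are assumptions.

```ruby
letters = ['a', 'c', 'a', 'b', 'c', 'a']

counts = Hash.new(0)   # 0 is the hash default object

letters.each do |letter|
  # Shorthand for: counts[letter] = counts[letter] + 1
  # The default 0 is read, 1 is added, and the NEW number is assigned to the hash.
  counts[letter] += 1
end

puts counts['a']   # 3
puts counts['b']   # 1
puts counts['c']   # 2
puts counts['z']   # 0: the default itself was never changed
```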

Hash default object rule #1: Don’t modify the default object

If you’re going to use a hash default object, it’s important not to modify that object. Otherwise, you’ll get unexpected results the next time you access the default. We saw this happen when we used a default object (instead of a default block) for the astronomer’s hash, and it caused havoc:

image with no caption
image with no caption
image with no caption

In Ruby, doing math operations on a numeric object doesn’t modify that object; it returns an entirely new object. We can see this if we look at object IDs before and after an operation.

image with no caption

In fact, numeric objects are immutable: they don’t have any methods that modify the object’s state. Any operation that might change the number gives you back an entirely new object.
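A short sketch makes this visible. The variable names are assumptions, and the actual ID values are not important, only whether they match.

```ruby
number = 3
before = number.object_id

number += 1                 # builds a new number and points the variable at it
after = number.object_id

puts before == after        # false: the variable now refers to a different object
puts 3.object_id == before  # true: the original 3 was never changed
puts 3.frozen?              # true: Ruby treats integers as frozen (unmodifiable)
```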

That’s what makes numbers safe to use as hash default objects; you can be certain that the default number won’t be changed accidentally.

Numbers make good hash default objects because they are immutable.

Hash default object rule #2: Assign values to the hash

If you’re going to use a hash default object, it’s also important to ensure that you’re actually assigning values to the hash. As we saw with the astronomer’s hash, sometimes it can look like you’re assigning to the hash when you’re not...

image with no caption

When we use a number as a default object, though, it’s much more natural to actually assign values to the hash. (Because numbers are immutable, we can’t store the incremented values unless we assign them to the hash!)

image with no caption

The rule of thumb for hash defaults

image with no caption

That’s true. So we have a rule of thumb that will keep you out of trouble...

If your default is a number, you can use a hash default object.

If your default is anything else, you should use a hash default block.

As you gain more experience with references, all of this will become second nature, and you can break this rule of thumb when the time is right. Until then, this should prevent most problems you’ll encounter.

Understanding Ruby references and the issue of aliasing won’t help you write more powerful Ruby programs, but it will help you quickly find and fix problems when they arise. Hopefully this chapter has helped you form a basic understanding of how references work, and will let you avoid trouble in the first place.

Your Ruby Toolbox

That’s it for Chapter 8! You’ve added references to your toolbox.

image with no caption

Up Next...

In the next chapter, we’re going to get back to the topic of organizing your code. You’ve already learned how to share methods between classes with inheritance. But even in situations where inheritance isn’t appropriate, Ruby offers a way to share behavior across classes: mixins. We’ll learn about those next!
