Chapter 7. ActiveRecord Associations

 

Any time you can rarefy something, you can create something that embodies a concept, it gives you leverage to work with it more powerfully. That’s exactly what’s going on with has_many :through.

 
 --Josh Susser

Active Record associations let you declaratively express relationships between model classes. The power and readability of the Associations API is an important part of what makes working with Rails so special.

This chapter covers the different kinds of ActiveRecord associations available while highlighting use cases and available customizations for each of them. We also take a look at the classes that give us access to relationships themselves.

The Association Hierarchy

Associations typically appear as methods on ActiveRecord model objects. For example, the method timesheets might represent the timesheets associated with a given user.

>> user.timesheets

However, people are often confused about the type of object these association methods return, because the result masquerades as a plain old Ruby object or array (depending on the type of association we’re considering). In the snippet, the timesheets method appears to return an array of timesheet objects.

The console will even confirm our thoughts. Ask any association collection what its return type is and it will tell you that it is an Array:

>> obie.timesheets.class
=> Array

It’s actually lying to you, albeit very innocently. The objects returned by has_many association methods are actually instances of HasManyAssociation, shown within its class hierarchy in Figure 7.1.


Figure 7.1. The Association proxies in their class hierarchy

The parent class of all associations is AssociationProxy. It contains the basic structure and functionality of all association proxies. If you look near the top of its source code excerpted in Listing 7.1, you’ll notice that it undefines a bunch of methods.

Example 7.1. Excerpt from lib/active_record/associations/association_proxy.rb

instance_methods.each { |m|
  undef_method m unless m =~ /(^__|^nil\?$|^send$|proxy_)/ }

As a result, most normal instance methods aren’t actually defined on the proxy anymore, but are instead delegated to the target of the proxy via method_missing. That means that a call to timesheets.class returns the class of the underlying array rather than the proxy. You can prove that timesheets is actually a proxy by asking it if it responds to one of AssociationProxy’s public methods, such as proxy_owner:

>> obie.timesheets.respond_to? :proxy_owner
=> true
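
The masquerade can be reproduced in a few lines of plain Ruby. The following is only a sketch of the technique, not the Rails source: ToyProxy is a hypothetical class, and it inherits from BasicObject (which starts out with almost no methods) instead of undefining methods one by one.

```ruby
# Toy stand-in for an association proxy; this is NOT the Rails source.
# BasicObject has almost no instance methods, which gives much the same
# effect as undefining them in a loop.
class ToyProxy < BasicObject
  def initialize(owner, target)
    @owner  = owner
    @target = target
  end

  # One of the few methods the proxy answers for itself.
  def proxy_owner
    @owner
  end

  def respond_to?(name, include_all = false)
    name.to_sym == :proxy_owner || @target.respond_to?(name, include_all)
  end

  private

  # Everything else, including #class, is forwarded to the target.
  def method_missing(name, *args, &block)
    @target.__send__(name, *args, &block)
  end
end

timesheets = ToyProxy.new(:obie, [])
puts timesheets.class                      # Array
puts timesheets.respond_to?(:proxy_owner)  # true
```

Because class is not defined on the proxy, the call falls through to method_missing and is answered by the wrapped array.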

Fortunately, it’s not the Ruby way to care about the actual class of an object. What messages an object responds to is a lot more significant. That’s why I think it would be a mistake to make your code depend on working with an array instead of an association proxy. If absolutely necessary, you can always call to_a to get an actual Array object:

>> obie.timesheets.to_a # make absolutely sure we're working with an Array
=> []

The parent class of all has_many associations is AssociationCollection and most of the methods that it defines work similarly regardless of the options declared for the relationship. Before we get much further into the details of the association proxies, let’s delve into the most fundamental type of association that is commonly used in Rails applications: the has_many / belongs_to pair.

One-to-Many Relationships

In our recurring sample application, an example of a one-to-many relationship is the association between the User, Timesheet, and ExpenseReport classes:

class User < ActiveRecord::Base
  has_many :timesheets
  has_many :expense_reports
end

Timesheets and expense reports should be linked in the opposite direction as well, so that it is possible to reference the user to which a timesheet or expense report belongs.

class Timesheet < ActiveRecord::Base
  belongs_to :user
end

class ExpenseReport < ActiveRecord::Base
  belongs_to :user
end

When these relationship declarations are executed, Rails uses some metaprogramming magic to dynamically add code to your models. In particular, proxy collection objects are created that let you manipulate the relationship easily.

To demonstrate, let’s play with these relationships in the console. First, I’ll create a user.

>> obie = User.create :login => 'obie', :password => '1234',
:password_confirmation => '1234', :email => '[email protected]'
=> #<User:0x2995278 ...}>

Now I’ll verify that I have collections for timesheets and expense reports.

>> obie.timesheets
ActiveRecord::StatementInvalid:
SQLite3::SQLException: no such column: timesheets.user_id:
SELECT * FROM timesheets WHERE (timesheets.user_id = 1)
from /.../connection_adapters/abstract_adapter.rb:128:in `log'

As David might say, “Whoops!” I forgot to add the foreign key columns to the timesheets and expense_reports tables, so in order to go forward I’ll generate a migration for the changes:

$ script/generate migration add_user_foreign_keys
      exists  db/migrate
      create  db/migrate/004_add_user_foreign_keys.rb

Then I’ll open db/migrate/004_add_user_foreign_keys.rb and add the missing columns.

class AddUserForeignKeys < ActiveRecord::Migration
  def self.up
    add_column :timesheets, :user_id, :integer
    add_column :expense_reports, :user_id, :integer
  end

  def self.down
    remove_column :timesheets, :user_id
    remove_column :expense_reports, :user_id
  end
end

Running rake db:migrate applies the changes:

$ rake db:migrate
(in /Users/obie/prorails/time_and_expenses)
== AddUserForeignKeys: migrating ==============================================
-- add_column(:timesheets, :user_id, :integer)
   -> 0.0253s
-- add_column(:expense_reports, :user_id, :integer)
   -> 0.0101s
== AddUserForeignKeys: migrated (0.0357s) ==============================================

Now I should be able to add a new blank timesheet to my user and check timesheets again to make sure it’s there:

>> obie = User.find(1)
=> #<User:0x29cc91c ... >
>> obie.timesheets << Timesheet.new
=> [#<Timesheet:0x2147524 @new_record=true, @attributes={}>]
>> obie.timesheets
=> [#<Timesheet:0x2147524 @new_record=true, @attributes={}>]

Adding Associated Objects to a Collection

According to the Rails documentation, adding an object to a has_many collection automatically saves that object, unless the parent object (the owner of the collection) is not yet stored in the database. Let’s make sure that’s the case using ActiveRecord’s reload method, which re-fetches the attributes of an object from the database:

>> obie.timesheets.reload
=> [#<Timesheet:0x29b3804 @attributes={"id"=>"1", "user_id"=>"1"}>]

There it is. The foreign key, user_id, was automatically set by the << method.

The << method takes one or more association objects to add to the collection, and since it flattens its argument list and inserts each record, push and concat behave identically.

In the blank timesheet example, I could have used the create method on the association proxy, and it would have worked essentially the same way:

>> obie.timesheets.create
=> #<Timesheet:0x248d378 @new_record=false ... >

However, be careful when deciding between << and create. Even though at first glance the two methods do the same thing, there are several very important differences in how they’re implemented, and you need to be aware of them (see the next subsection, “AssociationCollection Methods,” for more information).

AssociationCollection Methods

As illustrated in Figure 7.1, AssociationCollection has the following subclasses: HasManyAssociation and HasAndBelongsToManyAssociation. The following methods are inherited by and available to both of these subclasses. (HasManyThroughAssociation defines its own very similar set of methods, covered later in the chapter.)

<<(*records) and create(attributes = {})

In Rails 1.2.3 and earlier versions, the first thing that the << method did was to load the entire contents of the collection from the database, an operation that could be very expensive! On the other hand, create simply invoked its counterpart on the association’s model class, passing along the value of the foreign key, so that the link is established in the database. Thankfully, Rails 2.0 corrects the behavior of << in that it doesn’t load the entire collection, making it similar in function to create.

However, this is an area of Rails where you can really hurt yourself if you’re not careful. For instance, both methods will add either a single associated object or many, depending on whether you pass them an array or not. However, << is transactional, and create is not.

Yet another difference has to do with association callbacks (covered in this chapter’s options section for has_many). The << method triggers the :before_add and :after_add callbacks, but the create method does not.

Finally, the return value behavior of both methods varies wildly. The create method returns the new instance created, which is what you’d expect given its counterpart in ActiveRecord::Base. The << method returns the association proxy (ever masquerading as an array), which allows chaining and is also natural behavior for a Ruby array.

However, << will return false and not itself if any of the records being added causes the operation to fail. Therefore you shouldn’t really depend on its return value being an array that you can continue operating on in a chained fashion.
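
The differences in return value and callback behavior described above can be modeled in plain Ruby. This is a sketch of the behavior as this section describes it, not Rails code: ToyRecord and ToyCollection are hypothetical names, and the added_log array merely stands in for the :before_add and :after_add callbacks.

```ruby
ToyRecord = Struct.new(:owner_id, :saved)

# Toy collection; a sketch of the described behavior, not Rails code.
class ToyCollection
  attr_reader :records, :added_log

  def initialize(owner_id)
    @owner_id  = owner_id
    @records   = []
    @added_log = []   # stands in for the add callbacks
  end

  # Accepts one record or an array of them; returns the collection.
  def <<(*new_records)
    new_records.flatten.each do |record|
      @added_log << record          # the "callback" fires here
      record.owner_id = @owner_id   # set the foreign key
      record.saved    = true
      @records << record
    end
    self
  end

  # Builds and saves one record; returns the record, fires no "callback".
  def create
    record = ToyRecord.new(@owner_id, true)
    @records << record
    record
  end
end

sheets  = ToyCollection.new(1)
chained = sheets << ToyRecord.new << [ToyRecord.new, ToyRecord.new]
created = sheets.create

puts chained.equal?(sheets)   # true: << returned the collection itself
puts created.class            # ToyRecord: create returned the new record
puts sheets.added_log.size    # 3: create did not trigger the "callback"
```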

clear

Removes all records from this association by clearing the foreign key field (see delete). If the association is configured with the :dependent option set to :delete_all, then clear iterates over all the associated objects and invokes destroy on each one.

The clear method is transactional.

delete(*records) and delete_all

The delete and delete_all methods are used to sever specified associations, or all of them, respectively. Both methods operate transactionally.

It’s worth noting, for performance reasons, that calling delete_all first loads the entire collection of associated objects into memory in order to grab their ids. Then it executes a SQL UPDATE that sets foreign keys for all currently associated objects to nil, effectively disassociating them from their parent. Since it loads the entire association into memory, it would be ill-advised to use this method with an extremely large collection of associated objects.

Note

The names of the delete and delete_all methods can be misleading. By default, they don’t delete anything from the database—they only sever associations by clearing the foreign key field of the associated record. This behavior is related to the :dependent option, which defaults to :nullify. If the association is configured with the :dependent option set to :delete_all or :destroy, then the associated records will actually be deleted from the database.

destroy_all

The destroy_all method takes no parameters; it’s an all or nothing affair. When called, it begins a transaction and invokes destroy on each object in the association, causing them all to be deleted from the database with individual DELETE SQL statements. Again, there are load issues to consider if you plan to use this method with large association collections, since many objects will be loaded into memory at once.

length

Returns the size of the collection by loading it and calling size on the array. If you want to use this method to check whether the association collection is empty, use length.zero? instead of just empty?. It’s more efficient.

replace(other_array)

Replaces the collection with other_array. It works by deleting the objects that exist in the current collection but not in other_array, and inserting (using concat) the objects that exist in other_array but not in the current collection.
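
The bookkeeping behind replace amounts to two array differences. Here is a minimal plain-Ruby sketch of that idea; replace_plan is a hypothetical helper, and symbols stand in for records.

```ruby
# Returns the records to delete and the records to insert when the
# current collection is replaced by other_array.
def replace_plan(current, other_array)
  to_delete = current - other_array   # only in the current collection
  to_insert = other_array - current   # only in other_array
  [to_delete, to_insert]
end

to_delete, to_insert = replace_plan([:a, :b, :c], [:b, :c, :d])
p to_delete   # [:a]
p to_insert   # [:d]
```

Records present in both arrays are left untouched, so no unnecessary deletes or inserts are issued for them.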

size

If the collection has already been loaded, the size method returns its size. Otherwise a SELECT COUNT(*) query is executed to get the size of the associated collection without having to load any objects.

When starting from an unloaded state where it’s likely that the collection is not actually empty and you will need to load the collection no matter what, it’ll take one less SELECT query if you use length.
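
To make the query counting concrete, here is a plain-Ruby sketch. LazyCollection is a hypothetical class that logs the SQL it would have issued; it models only the loaded and unloaded states described above.

```ruby
# Toy lazily loaded collection that records the queries it "runs".
class LazyCollection
  attr_reader :queries

  def initialize(rows)
    @rows    = rows   # pretend database table
    @target  = nil    # nil means "not loaded yet"
    @queries = []
  end

  def load_target
    unless @target
      @queries << "SELECT *"
      @target = @rows.dup
    end
    @target
  end

  # Always loads the collection, then counts in Ruby.
  def length
    load_target.size
  end

  # Counts in the database unless the collection is already loaded.
  def size
    return @target.size if @target
    @queries << "SELECT COUNT(*)"
    @rows.size
  end
end

with_size = LazyCollection.new([1, 2, 3])
with_size.size
with_size.load_target       # we needed the records anyway
p with_size.queries         # ["SELECT COUNT(*)", "SELECT *"]

with_length = LazyCollection.new([1, 2, 3])
with_length.length
with_length.load_target     # already loaded, no further query
p with_length.queries       # ["SELECT *"]
```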

The :uniq setting, which removes duplicates from association collections, comes into play when calculating size—basically it will force all objects to be loaded from the database so that Rails can remove duplicates in Ruby code.

sum(column, *options)

Calculates a summed value in the database using SQL. The first parameter should be a symbol identifying the column to be summed. You have to provide a :group option, so that a summation actually takes place.

total = person.accounts.sum(:credit_limit, :group => 'accounts.firm_id')

Depending on the way your association is structured, you may need to disambiguate the query by prefixing the name of the table to the value you pass to :group.

uniq

Iterates over the target collection and populates a Set with the unique values present. Keep in mind that equality of ActiveRecord objects is determined by identity, meaning that the value of the id attribute is the same for both objects being compared.
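
The idea of de-duplicating by id can be sketched in plain Ruby. ToyRow and unique_by_id are hypothetical names; the sketch tracks ids in a Set and keeps the first record seen for each id.

```ruby
require "set"

ToyRow = Struct.new(:id, :name)

# Keeps the first record seen for each id.
def unique_by_id(rows)
  seen = Set.new
  rows.select { |row| seen.add?(row.id) }   # add? returns nil for repeats
end

rows = [ToyRow.new(1, "first"), ToyRow.new(2, "second"), ToyRow.new(1, "first again")]
p unique_by_id(rows).map(&:id)   # [1, 2]
```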

The belongs_to Association

The belongs_to class method expresses a relationship from one ActiveRecord object to a single associated object for which it has a foreign key attribute. The trick to remembering whether a class “belongs to” another one is determining where the foreign key column will reside.

Assigning an object to a belongs_to association sets the foreign key attribute to the id of the assigned object, but does not save the record to the database automatically, as in the following example:

>> timesheet = Timesheet.create
=> #<Timesheet:0x248f18c ... @attributes={"id"=>1409, "user_id"=>nil,
"submitted"=>nil} ...>
>> timesheet.user = obie
=> #<User:0x24f96a4 ...>
>> timesheet.user.login
=> "obie"
>> timesheet.reload
=> #<Timesheet:0x248f18c @billable_weeks=nil, @new_record=false,
@user=nil...>

Defining a belongs_to relationship on a class establishes an attribute of the same name on instances of that class. As mentioned earlier, the attribute is actually a proxy to the related ActiveRecord object and adds capabilities useful for manipulating the relationship.

Reloading the Association

Just invoking the attribute will query the database (if necessary) and return an instance of the related object. The accessor method actually takes a force_reload parameter that tells ActiveRecord whether to reload the related object, if it happens to have been cached already by a previous access.

In the following capture from my console, I look up a timesheet and take a peek at the object_id of its related user object. Notice that the second time I invoke the association via user, the object_id remains the same. The related object has been cached. However, passing true to the accessor reloads the relationship and I get a new instance.

>> ts = Timesheet.find :first
=> #<Timesheet:0x3454554 @attributes={"updated_at"=>"2006-11-21
05:44:09", "id"=>"3", "user_id"=>"1", "submitted"=>nil,
"created_at"=>"2006-11-21 05:44:09"}>
>> ts.user.object_id
=> 27421330
>> ts.user.object_id
=> 27421330
>> ts.user(true).object_id
=> 27396270
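
The caching behavior seen in the console can be imitated with a memoized accessor. The following is a plain-Ruby sketch, not how ActiveRecord implements it; ToyTimesheet is a hypothetical class whose loader block stands in for the database query.

```ruby
class ToyTimesheet
  # The block plays the role of the query that fetches the user.
  def initialize(&loader)
    @loader = loader
    @user   = nil
  end

  # Returns the cached object unless force_reload is true.
  def user(force_reload = false)
    @user = nil if force_reload
    @user ||= @loader.call
  end
end

ts    = ToyTimesheet.new { Object.new }   # every "query" builds a new object
first = ts.user

puts ts.user.equal?(first)         # true: the cached instance is returned
puts ts.user(true).equal?(first)   # false: a fresh instance was loaded
```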

Building and Creating Related Objects via the Association

The belongs_to method does some metaprogramming and adds factory methods for creating new instances of the related class and attaching them via the foreign key automatically.

The build_association method does not save the new object, but the create_association method does. Both methods take an optional hash of attribute parameters with which to initialize the newly instantiated objects. Both are essentially one-line conveniences, which I don’t find particularly useful, because it just doesn’t usually make sense to create instances in that direction!

To illustrate, I’ll simply show the code for building a User from a Timesheet or creating a Client from a BillingCode, neither of which would ever happen in real code because it just doesn’t make sense to do so:

>> ts = Timesheet.find :first
=> #<Timesheet:0x3437260 @attributes={"updated_at"=>"2006-11-21
05:44:09", "id"=>"3", "user_id"=>"1", "submitted"=>nil, "created_at"
=>"2006-11-21 05:44:09"}>

>> ts.build_user
=> #<User:0x3435578 @attributes={"salt"=>nil, "updated_at"=>nil,
"crypted_password"=>nil, "remember_token_expires_at"=>nil,
"remember_token"=>nil, "login"=>nil, "created_at"=>nil, "email"=>nil},
@new_record=true>

>> bc = BillingCode.find :first
=> #<BillingCode:0x33b65e8 @attributes={"code"=>"TRAVEL", "client_id"
=>nil, "id"=>"1", "description"=>"Travel expenses of all sorts"}>

>> bc.create_client
=> #<Client:0x33a3074 @new_record_before_save=true,
@errors=#<ActiveRecord::Errors:0x339f3e8 @errors={},
@base=#<Client:0x33a3074 ...>>, @attributes={"name"=>nil, "code"=>nil,
"id"=>1}, @new_record=false>

You’ll find yourself creating instances of belonging objects from the has_many side of the relationship much more often.

belongs_to Options

The following options can be passed in a hash to the belongs_to method.

:class_name

Assume for a moment that we wanted to establish another belongs_to relationship from the Timesheet class to User, this time modeling the relationship to the approver of the timesheet. You might start by adding an approver_id column to the timesheets table and an authorized_approver column to the users table:

class AddApproverInfo < ActiveRecord::Migration

  def self.up
    add_column :timesheets, :approver_id, :integer
    add_column :users, :authorized_approver, :boolean
  end

  def self.down
    remove_column :timesheets, :approver_id
    remove_column :users, :authorized_approver
  end

end

Then you would add a belongs_to that looks like the following:

class Timesheet < ActiveRecord::Base
  belongs_to :approver
  ...

The thing is that Rails can’t figure out what class you’re trying to connect to with just the information provided, because you’ve (legitimately) acted against the Rails convention of naming a relationship according to the related class. It’s time for a :class_name parameter.

class Timesheet < ActiveRecord::Base
  belongs_to :approver, :class_name => 'User'
  ...

:conditions

What about adding conditions to the belongs_to association? The :conditions option lets you specify conditions that a related record must satisfy in order for the relationship to be valid, using the same syntax that you use when adding conditions to a find invocation.

In the last migration, I added an authorized_approver column to the users table and we’ll make use of it here:

class Timesheet < ActiveRecord::Base
  belongs_to :approver,
             :class_name => 'User',
             :conditions => ['authorized_approver = ?', true]
  ...
end

Now in order for the assignment of a user to the approver field to work, that user must be authorized. I’ll go ahead and add a test that both indicates the intention of my code and shows it in action.

First I need to ensure that my users fixture (users.yml) makes an authorized approver available to my test methods. For good measure, I go ahead and add a non-authorized user too. The following markup appears at the bottom of test/fixtures/users.yml:

approver:
  id: 4
  login: "manager"
  authorized_approver: true
joe:
  id: 5
  login: "joe"
  authorized_approver: false

Then I turn my attention to test/unit/timesheet_test.rb, where I add a test to make sure that my application code works and is correct:

require File.dirname(__FILE__) + '/../test_helper'

class TimesheetTest < Test::Unit::TestCase

  fixtures :users

  def test_only_authorized_user_may_be_associated_as_approver
    sheet = Timesheet.create
    sheet.approver = users(:approver)
    assert_not_nil sheet.approver, "approver assignment failed"
  end

end

It’s a good start, but I also want to make sure something happens to prevent the system from assigning a nonauthorized user to the approver field, so I add another test:

def test_non_authorized_user_cannot_be_associated_as_approver
  sheet = Timesheet.create
  sheet.approver = users(:joe)
  assert sheet.approver.nil?, "approver assignment should have failed"
end

I have my suspicions about the validity of that test, though, and as I half-expected, it doesn’t really work the way I want it to work:

1) Failure:
test_non_authorized_user_cannot_be_associated_as_approver(TimesheetTest)
[./test/unit/timesheet_test.rb:16]:
approver assignment should have failed.
<false> is not true.

The problem is that ActiveRecord (for better or worse, probably worse) allows me to make the invalid assignment. The :conditions option only applies during the query to get the association back from the database. I’ll have some more work ahead of me to achieve the desired behavior, but I’ll go ahead and prove out Rails’ actual behavior by fixing my tests:

def test_only_authorized_user_may_be_associated_as_approver
  sheet = Timesheet.create
  sheet.approver = users(:approver)
  assert sheet.save
  assert_not_nil sheet.approver(true), "approver assignment failed"
end

def test_non_authorized_user_cannot_be_associated_as_approver
  sheet = Timesheet.create
  sheet.approver = users(:joe)
  assert sheet.save
  assert sheet.approver(true).nil?, "approver assignment should fail"
end

Those two tests do pass. I went ahead and made sure to save the sheet, since just assigning a value to it will not save the record. Then I took advantage of the force_reload parameter to make Rails reload approver from the database, and not just simply give me the same instance I originally assigned to it.

The lesson to learn is that :conditions on relationships never affect the assignment of associated objects, only how they’re read back from the database. To enforce the rule that a timesheet approver must be authorized, you’d need to add a before_save callback to the Timesheet class itself. Callbacks are covered in detail at the beginning of Chapter 9, “Advanced ActiveRecord,” and since I’ve gotten us a little bit off on a tangent, we’ll go back to the list of options available for the belongs_to association.
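
The read-only nature of :conditions can be modeled without Rails. In this sketch ToyUser, TOY_USERS, and ToySheet are hypothetical stand-ins for the users table and the Timesheet model; the point is only that the condition is consulted on the way back from storage, never on assignment.

```ruby
ToyUser = Struct.new(:id, :authorized_approver)

# Pretend users table, keyed by id (mirrors the two fixtures above).
TOY_USERS = {
  4 => ToyUser.new(4, true),    # "manager"
  5 => ToyUser.new(5, false)    # "joe"
}

class ToySheet
  # Assignment only records the foreign key; no condition is checked.
  def approver=(user)
    @approver_id = user.id
    @approver    = user
  end

  # A forced reload goes back to the "database", and only there is the
  # authorized_approver condition applied.
  def approver(force_reload = false)
    if force_reload
      found = TOY_USERS[@approver_id]
      @approver = (found && found.authorized_approver) ? found : nil
    end
    @approver
  end
end

sheet = ToySheet.new
sheet.approver = TOY_USERS[5]
puts sheet.approver.nil?         # false: the assignment was accepted
puts sheet.approver(true).nil?   # true: the condition filtered it on reload
```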

:foreign_key

Specifies the name of the foreign key column that should be used to find the associated object. Rails will normally infer this setting from the name of the association by appending _id to it. You can override the inferred foreign key name with this option if necessary.

# without the explicit option, Rails would guess administrator_id
belongs_to :administrator, :foreign_key => 'admin_user_id'

:counter_cache

Use this option to make Rails automatically update a counter field on the associated object with the number of belonging objects. The option value can be true, in which case the pluralized name of the belonging class plus _count is used, or you can supply your own column name to be used:

:counter_cache => true
:counter_cache => 'number_of_children'

If a significant percentage of your association collections will be empty at any given moment, you can optimize performance at the cost of some extra database storage by using counter caches liberally. The reason is that when the counter cache attribute is at zero, Rails won’t even try to query the database for the associated records!

Note

The value of the counter cache column must be set to zero by default in the database! Otherwise the counter caching won’t work at all. It’s because the way that Rails implements the counter caching behavior is by adding a simple callback that goes directly to the database with an UPDATE command and increments the value of the counter.

If you’re not careful, and neglect to set a default value of 0 for the counter cache column on the database, or misspell the column name, the counter cache will still seem to work! There is a magic method on all classes with has_many associations called collection_count, just like the counter cache. It will return a correct count value if you don’t have a counter cache option set or the counter cache column value is null!
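
Why a NULL counter never starts counting can be seen in a tiny sketch. It assumes the callback issues a plain column = column + 1 UPDATE, as the note describes, and models the SQL rule that NULL + 1 is NULL; sql_increment is a hypothetical helper, not a Rails method.

```ruby
# Models "counter = counter + 1" under SQL semantics,
# where any arithmetic on NULL yields NULL (nil here).
def sql_increment(value)
  value.nil? ? nil : value + 1
end

p sql_increment(0)     # 1
p sql_increment(nil)   # nil: the counter stays NULL forever
```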

:include

Takes a list of second-order association names that should be eager-loaded when this object is loaded. A SELECT statement with the necessary LEFT OUTER JOINS will be constructed on the fly so that all the data needed to construct a whole object graph is queried in one database request.

With judicious use of :include and careful benchmarking, you can sometimes improve the performance of your application dramatically, mostly by eliminating N+1 queries. On the other hand, since doing huge multijoin queries and instantiating large object trees can also get very costly, certain usages of :include can actually make your application perform much more slowly. As they say, your mileage may vary.

:polymorphic => true

Use the :polymorphic option to specify that an object is related to its association in a polymorphic way, which is the Rails way of saying that the type of the related object is stored in the database along with its foreign key. By making a belongs_to relationship polymorphic, you abstract out the association so that any other model in the system can fill it.

Polymorphic associations let you trade some measure of relational integrity for the convenience of implementation in child relationships that are reused across your application. Common examples are models such as photo attachments, comments, notes, line items, and so on.

Let’s illustrate by writing a Comment class that attaches to its subjects polymorphically. We’ll associate it to both expense reports and timesheets. Listing 7.2 has the schema information in migration code, followed by the code for the classes involved. Notice the :subject_type column, which stores the class name of the associated class.

Example 7.2. Comment Class Using Polymorphic belongs_to Relationship

create_table :comments do |t|
  t.column :subject,      :string
  t.column :body,         :text
  t.column :subject_id,   :integer
  t.column :subject_type, :string
  t.column :created_at,   :datetime
end

class Comment < ActiveRecord::Base
  belongs_to :subject, :polymorphic => true
end

class ExpenseReport < ActiveRecord::Base
  belongs_to :user
  has_many :comments, :as => :subject
end

class Timesheet < ActiveRecord::Base
  belongs_to :user
  has_many :comments, :as => :subject
end

As you can see in the ExpenseReport and Timesheet classes of Listing 7.2, there is a corresponding syntax where you give ActiveRecord a clue that the relationship is polymorphic by specifying :as => :subject. We haven’t even covered has_many relationships yet, and polymorphic relationships have their own section in Chapter 9. So before we get any further ahead of ourselves, let’s take a look at has_many relationships.
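
To see what storing the type next to the foreign key buys you, here is a plain-Ruby sketch of the lookup. All names are hypothetical (ToySheetRow, ToyReportRow, TOY_STORE, ToyComment), and a hash of hashes stands in for the database tables; ActiveRecord's real implementation differs.

```ruby
ToySheetRow  = Struct.new(:id)
ToyReportRow = Struct.new(:id)

# Pretend tables: class name => { id => record }.
TOY_STORE = {
  "ToySheetRow"  => { 1 => ToySheetRow.new(1) },
  "ToyReportRow" => { 1 => ToyReportRow.new(1) }
}

ToyComment = Struct.new(:body, :subject_type, :subject_id) do
  # Remember both the class name and the id of the subject.
  def subject=(record)
    self.subject_type = record.class.name
    self.subject_id   = record.id
  end

  # The stored type selects the table; the stored id selects the row.
  def subject
    TOY_STORE.fetch(subject_type).fetch(subject_id)
  end
end

comment = ToyComment.new("Looks fine to me")
comment.subject = ToyReportRow.new(1)

puts comment.subject_type    # ToyReportRow
puts comment.subject.class   # ToyReportRow
```

Both toy tables contain a row with id 1, so the id alone would be ambiguous; the type column is what makes the lookup unambiguous.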

The has_many Association

Just like it sounds, the has_many association allows you to define a relationship in which one model has many other models that belong to it. The sheer readability of code constructs such as has_many is a major reason that people fall in love with Rails.

The has_many class method is often used without additional options. If Rails can guess the type of class in the relationship from the name of the association, no additional configuration is necessary. This bit of code should look familiar by now:

class User < ActiveRecord::Base
  has_many :timesheets
  has_many :expense_reports
end

The names of the associations can be singularized and match the names of models in the application, so everything works as expected.

has_many Options

Despite the ease of use of has_many, there is a surprising amount of power and customization possible for those who know and understand the options available.

:after_add

Called after a record is added to the collection via the << method. Is not triggered by the collection’s create method, so careful consideration is needed when relying on association callbacks.

Add callback method options to a has_many by passing one or more symbols corresponding to method names, or Proc objects. See Listing 7.3 in the :before_add option for an example.

:after_remove

Called after a record has been removed from the collection with the delete method. Add callback method options to a has_many by passing one or more symbols corresponding to method names, or Proc objects. See Listing 7.3 in the :before_add option for an example.

:as

Specifies the polymorphic belongs_to association to use on the related class. (See Chapter 9 for more about polymorphic relationships.)

:before_add

Triggered when a record is added to the collection via the << method. (Remember that concat and push are aliases of <<.) Raising an exception in the callback will stop the object from getting added to the collection. (Basically, because the callback is triggered right after the type mismatch check, and there is no rescue clause to be found inside <<.)

Add callback method options to a has_many by passing one or more symbols corresponding to method names, or Proc objects. You can set the option to either a single callback (as a Symbol or Proc) or to an array of them.

Example 7.3. A Simple Example of :before_add Callback Usage

has_many :unchangable_posts,
         :class_name => "Post",
         :before_add => :raise_exception

private

  def raise_exception(object)
    raise "You can't add a post"
  end

Of course, that would have been a lot shorter code using a Proc since it’s a one liner. The owner parameter is the object with the association. The record parameter is the object being added.

has_many :unchangable_posts,
  :class_name => "Post",
  :before_add => Proc.new {|owner, record| raise "Can't do it!"}

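One more time, with a lambda. Unlike Proc.new, a lambda checks the number of arguments it is called with (strictly so in Ruby 1.9 and later), so it declares both block parameters:

has_many :unchangable_posts,
  :class_name => "Post",
  :before_add => lambda {|owner, record| raise "You can't add a post"}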
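
How Proc.new and lambda differ in argument handling can be checked in plain Ruby (1.9 or later). The constant names LOOSE and STRICT are only for illustration.

```ruby
LOOSE  = Proc.new { |owner, record| [owner, record] }
STRICT = lambda   { |owner, record| [owner, record] }

# A plain proc pads missing arguments with nil and drops extras.
p LOOSE.call(:user)                  # [:user, nil]
p LOOSE.call(:user, :post, :extra)   # [:user, :post]

# A lambda insists on exactly the declared number of arguments.
begin
  STRICT.call(:user)
rescue ArgumentError => e
  puts "lambda refused: #{e.class}"
end
```

For a callback that is always invoked with both the owner and the record, either form works as long as a lambda declares two parameters.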

:before_remove

Called before a record is removed from a collection with the delete method. See before_add for more information.

:class_name

The :class_name option is common to all of the associations. It allows you to specify, as a string, the name of the class of the association, and is needed when the class name cannot be inferred from the name of the association itself.

:conditions

The :conditions option is common to all of the associations. It allows you to add extra conditions to the ActiveRecord-generated SQL query that bring back the objects in the association.

You can apply extra :conditions to an association for a variety of reasons. How about approval?

has_many :comments, :conditions => ['approved = ?', true]

Plus, there’s no rule that you can’t have more than one has_many association exposing the same two related tables in different ways. Just remember that you’ll probably have to specify the class name too.

has_many :pending_comments, :conditions => ['approved = ?', false],
                            :class_name => 'Comment'

:counter_sql

Overrides the ActiveRecord-generated SQL query that would be used to count the number of records belonging to this association. Not necessarily needed in conjunction with the :finder_sql option, since ActiveRecord will automatically generate counter SQL code based on the custom finder SQL statement.

As with all custom SQL specifications in ActiveRecord, you must use single-quotes around the entire string to prevent premature interpolation. (That is, you don’t want the string to get interpolated in the context of the class where you’re declaring the association. You want it to get interpolated at runtime.)

has_many :things, :finder_sql => 'select * from t where id = #{id}'
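
The effect of the single quotes can be demonstrated in plain Ruby. This sketch only illustrates deferred interpolation; ToyOwner and interpolate_for are hypothetical, and evaluating the string with instance_eval is an assumption made for the demonstration rather than a description of ActiveRecord's code.

```ruby
class ToyOwner
  attr_reader :id

  def initialize(id)
    @id = id
  end
end

# Interpolates a stored SQL template in the context of one object.
# Only suitable for trusted templates without double quotes.
def interpolate_for(owner, sql_template)
  owner.instance_eval('"' + sql_template + '"')
end

template = 'select * from t where id = #{id}'   # single quotes: kept verbatim
puts template                                    # still contains #{id}
puts interpolate_for(ToyOwner.new(42), template) # select * from t where id = 42
```

Had the template been written with double quotes, Ruby would have tried to evaluate id immediately, in the context of the class body where the association is declared.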

:delete_sql

Overrides the ActiveRecord-generated SQL statement that would be used to break associations. Access to the associated model is provided via the record method.

:dependent => :delete_all

All associated objects are deleted in one fell swoop using a single SQL command. Note: While this option is much faster than :destroy, it doesn’t trigger any destroy callbacks on the associated objects, so you should use this option very carefully. It should only be used on associations that depend solely on the parent object.

:dependent => :destroy

All associated objects are destroyed along with the parent object, by iteratively calling their destroy methods.
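
The practical difference between deleting in bulk and destroying one by one is whether callbacks get a chance to run. Here is a small plain-Ruby sketch of that difference. It does not use ActiveRecord; FakeComment, destroy_each, delete_in_bulk, and run_cleanup are hypothetical names invented for the illustration.

```ruby
# Hypothetical in-memory stand-in for an associated record.
class FakeComment
  attr_reader :id

  def initialize(id, audit_log)
    @id = id
    @audit_log = audit_log
  end

  # Plays the role of a destroy callback: it leaves a trace in the log.
  def destroy
    @audit_log << "destroyed comment #{id}"
  end
end

# Destroy-style cleanup: visit every object so that each callback runs.
def destroy_each(comments)
  comments.each { |comment| comment.destroy }
  comments.clear
end

# Bulk-delete-style cleanup: drop everything in one step, skipping callbacks.
def delete_in_bulk(comments)
  comments.clear
end

# Runs one strategy against two fresh comments and reports what happened.
def run_cleanup(strategy)
  log = []
  comments = [FakeComment.new(1, log), FakeComment.new(2, log)]
  send(strategy, comments)
  { :remaining => comments.size, :log => log }
end

p run_cleanup(:destroy_each)
p run_cleanup(:delete_in_bulk)
```

Both strategies leave the collection empty, but only the first one produces log entries. If your associated models clean up files, counters, or child rows in their destroy callbacks, that is the part a bulk delete silently skips.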

:dependent => :nullify

The default behavior for deleting associated records is to nullify, or clear, the foreign key that joins them to the parent record. You should never have to specify this option explicitly; it is only here for reference.

:exclusively_dependent

Deprecated; equivalent to :dependent => :delete_all.

:extend => ExtensionModule

Specifies a module with methods that will extend the association collection proxy. Used as an alternative to defining additional methods in a block passed to the has_many method itself. Discussed in the section “Association Extensions”.

:finder_sql

Specifies a complete SQL statement to fetch the association. This is a good way to load complex associations that depend on multiple tables for their data. It’s also quite rare to need to go this route.

Count operations are done with a SQL statement based on the query supplied via the :finder_sql option. If ActiveRecord botches the transformation, it might be necessary to supply an explicit :counter_sql value also.

:foreign_key

Overrides the convention-based foreign key name that would normally be used in the SQL statement that loads the association.

:group

An attribute name by which the result should be grouped. Uses the GROUP BY SQL clause.

:include

Takes an array of second-order association names that should be eager-loaded when this collection is loaded. As with the :include option on belongs_to associations, with judicious use of :include and careful benchmarking you can sometimes improve the performance of your application dramatically.

To illustrate, let’s analyze how :include affects the SQL generated while navigating relationships. We’ll use the following simplified versions of Timesheet, BillableWeek, and BillingCode:

class Timesheet < ActiveRecord::Base
  has_many :billable_weeks
end

class BillableWeek < ActiveRecord::Base
  belongs_to :timesheet
  belongs_to :billing_code
end

class BillingCode < ActiveRecord::Base
  belongs_to :client
  has_many :billable_weeks
end

First, I need to set up my test data, so I create a timesheet instance and add a couple of billable weeks to it. Then I assign a billing code to each billable week, which results in an object graph (with four objects linked together via associations).

Next I do a fancy one-line collect, which gives me an array of the billing codes associated with the timesheet:

>> Timesheet.find(3).billable_weeks.collect{ |w| w.billing_code.code }
=> ["TRAVEL", "DEVELOPMENT"]

Without the :include option set on the billable_weeks association of Timesheet, that operation cost me the following four database hits (copied from log/development.log, and prettied up a little):

Timesheet Load (0.000656)    SELECT * FROM timesheets
                             WHERE (timesheets.id = 3)

BillableWeek Load (0.001156) SELECT * FROM billable_weeks
                             WHERE (billable_weeks.timesheet_id = 3)

BillingCode Load (0.000485)  SELECT * FROM billing_codes
                             WHERE (billing_codes.id = 1)

BillingCode Load (0.000439)  SELECT * FROM billing_codes
                             WHERE (billing_codes.id = 2)

This demonstrates the so-called “N+1 select” problem that inadvertently plagues many systems. Loading a collection of N billable weeks costs one select for the collection itself, plus N more select statements to retrieve the billing code associated with each week.

Now let’s add :include to the billable_weeks association, after which the Timesheet class looks as follows:

class Timesheet < ActiveRecord::Base
  has_many :billable_weeks, :include => [:billing_code]
end

Simple! Rerunning our test statement yields the same results in the console:

>> Timesheet.find(3).billable_weeks.collect{ |w| w.billing_code.code }
=> ["TRAVEL", "DEVELOPMENT"]

But look at how different the generated SQL is:

Timesheet Load (0.002926)   SELECT * FROM timesheets LIMIT 1

BillableWeek Load Including Associations (0.001168)   SELECT
billable_weeks."id" AS t0_r0, billable_weeks."timesheet_id" AS t0_r1,
billable_weeks."client_id" AS t0_r2, billable_weeks."start_date" AS t0_r3,
billable_weeks."billing_code_id" AS t0_r4,
billable_weeks."monday_hours" AS t0_r5,
billable_weeks."tuesday_hours" AS t0_r6,
billable_weeks."wednesday_hours" AS t0_r7,
billable_weeks."thursday_hours" AS t0_r8,
billable_weeks."friday_hours" AS t0_r9,
billable_weeks."saturday_hours" AS t0_r10,
billable_weeks."sunday_hours" AS t0_r11, billing_codes."id" AS t1_r0,
billing_codes."client_id" AS t1_r1, billing_codes."code" AS t1_r2,
billing_codes."description" AS t1_r3
FROM billable_weeks
LEFT OUTER JOIN billing_codes
  ON billing_codes.id = billable_weeks.billing_code_id
WHERE (billable_weeks.timesheet_id = 3)

Rails has added a LEFT OUTER JOIN clause so that billing code data is loaded along with billable weeks. For larger datasets, the performance improvement can be quite dramatic!
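
To make the query arithmetic concrete, here is a plain-Ruby simulation of the two loading strategies. It is only a sketch: FakeDatabase and its methods are hypothetical, the data mirrors the two billable weeks used above, and the "queries" are just method calls that bump a counter. It is not how ActiveRecord is implemented.

```ruby
# Hypothetical in-memory "database" that counts how many queries it serves.
class FakeDatabase
  attr_reader :query_count

  BILLABLE_WEEKS = [
    { :id => 1, :timesheet_id => 3, :billing_code_id => 1 },
    { :id => 2, :timesheet_id => 3, :billing_code_id => 2 }
  ]
  BILLING_CODES = { 1 => 'TRAVEL', 2 => 'DEVELOPMENT' }

  def initialize
    @query_count = 0
  end

  def find_timesheet(id)
    @query_count += 1
    { :id => id }
  end

  def weeks_for(timesheet_id)
    @query_count += 1
    BILLABLE_WEEKS.select { |w| w[:timesheet_id] == timesheet_id }
  end

  def billing_code(id)
    @query_count += 1
    BILLING_CODES[id]
  end

  # One joined query returning each week together with its billing code.
  def weeks_with_codes_for(timesheet_id)
    @query_count += 1
    weeks = BILLABLE_WEEKS.select { |w| w[:timesheet_id] == timesheet_id }
    weeks.map { |w| w.merge(:code => BILLING_CODES[w[:billing_code_id]]) }
  end
end

# Lazy: 1 timesheet query + 1 weeks query + one extra query per week.
def codes_lazily(db)
  timesheet = db.find_timesheet(3)
  db.weeks_for(timesheet[:id]).map { |w| db.billing_code(w[:billing_code_id]) }
end

# Eager: 1 timesheet query + 1 joined query, however many weeks there are.
def codes_eagerly(db)
  timesheet = db.find_timesheet(3)
  db.weeks_with_codes_for(timesheet[:id]).map { |w| w[:code] }
end

lazy_db = FakeDatabase.new
p codes_lazily(lazy_db), lazy_db.query_count

eager_db = FakeDatabase.new
p codes_eagerly(eager_db), eager_db.query_count
```

With two billable weeks the lazy version issues four queries and the eager one issues two. The gap grows linearly with the size of the collection, which is why the problem tends to stay hidden until you test with realistic data.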

It’s generally easy to find N+1 select issues just by watching the log scroll by while clicking through the different screens of your application. (Of course, make sure that you’re looking at realistic data or the exercise will be pointless.) Screens that might benefit from eager loading will cause a flurry of single-row SELECT statements, one for each record in a given association being used.

If you’re feeling particularly daring (perhaps masochistic is a better term) you can try including a deep hierarchy of associations, by mixing hashes into your eager :include array:

Post.find(:all, :include=>[:author, {:comments=>{:author=>:gravatar }}])

That example snippet will grab not only all the comments for a Post, but all the authors and gravatar pictures as well. You can mix and match symbols, arrays and hashes in any combination to describe the associations you want to load.
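
If the nesting rules feel opaque, it can help to flatten a spec into the chains of associations it names. The helper below is a plain-Ruby sketch written for this explanation; association_paths is a hypothetical name, and the code only walks the symbols, arrays, and hashes. It says nothing about how Rails itself processes :include.

```ruby
# Flatten a mixed :include-style spec into the association chains it names.
# Symbols name one association, arrays list siblings, and hashes nest
# further associations underneath their keys.
def association_paths(spec, prefix = [])
  case spec
  when Symbol
    [prefix + [spec]]
  when Array
    spec.flat_map { |entry| association_paths(entry, prefix) }
  when Hash
    spec.flat_map do |parent, children|
      [prefix + [parent]] + association_paths(children, prefix + [parent])
    end
  else
    raise ArgumentError, "unsupported include entry: #{spec.inspect}"
  end
end

INCLUDE_SPEC = [:author, { :comments => { :author => :gravatar } }]

association_paths(INCLUDE_SPEC).each { |path| puts path.join(' -> ') }
```

For the Post example this lists author, comments, comments -> author, and comments -> author -> gravatar, which is a quick way to see just how much data a deep :include is asking for.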

Frankly, deep :includes are not well-documented functionality and are probably more trouble than they’re worth. The biggest problem is that pulling too much data in one query can really kill your performance. You should always start out with the simplest solution that will work, then use benchmarking and analysis to figure out if optimizations such as eager-loading help improve your performance.

:insert_sql

Overrides the ActiveRecord-generated SQL statement that would be used to create associations. Access the associated model via the record method.

:limit

Appends a LIMIT clause to the SQL generated for loading this association.

:offset

An integer determining the offset from where the rows should be fetched.

:order

Specifies the order in which the associated objects are returned via an “ORDER BY” SQL fragment, such as "last_name, first_name DESC".

:select

By default, this is * as in SELECT * FROM, but it can be changed if, for example, you want to add calculated columns or “piggyback” additional columns from joins onto the associated object as it’s loaded.

:source and :source_type

Used exclusively as additional options to assist in using has_many :through associations with polymorphic belongs_to; they are covered in detail later in the chapter.

:table_name

The :table_name option lets you override the table names (FROM clause) that will be used in SQL statements generated for loading the association.

:through

Creates an association collection via another association. See the section in this chapter entitled “has_many :through” for more information.

:uniq => true

Strips duplicate objects from the collection. Useful in conjunction with has_many :through.

Proxy Methods

The has_many class method creates an association collection proxy, with all the methods provided by AssociationCollection and a few more methods defined in HasManyAssociation.

build(attributes = {})

Instantiates a new object in the associated collection, and links it to the owner by specifying the value of the foreign key. Does not save the new object in the database and the new object is not added to the association collection. As you can see in the following example, unless you capture the return value of build, the new object will be lost:

>> obie.timesheets
=> <timesheets not loaded yet>
>> obie.timesheets.build
=> #<Timesheet:0x24c6b8c @new_record=true, @attributes={"user_id"=>1,
"submitted"=>nil}>
>> obie.timesheets
=> <timesheets not loaded yet>

As the online API documents point out, the build method is exactly the same as constructing a new object and passing in the foreign key value as an attribute:

>> Timesheet.new(:user_id => 1)
=> #<Timesheet:0x24a52fc @new_record=true, @attributes={"user_id"=>1,
"submitted"=>nil}>

count(*args)

Counts the number of associated records in the database using SQL.

find(*args)

Not much different here than the normal ActiveRecord find method, other than that the scope is constrained to associated records and any additional conditions specified in the declaration of the relationship.

Remember the has_one example shown earlier in the chapter? It was somewhat contrived, since it would have been easier to look up the last modified timesheet using find.

Many-to-Many Relationships

Associating persistent objects via a join table can be one of the trickier aspects of object-relational mapping to implement correctly in a framework. Rails has a couple of techniques that let you represent many-to-many relationships in your model. We’ll start with the older and simpler has_and_belongs_to_many and then cover the newer has_many :through.

has_and_belongs_to_many

The has_and_belongs_to_many method establishes a link between two associated ActiveRecord models via an intermediate join table. Unless the join table is explicitly specified as an option, Rails guesses its name by concatenating the table names of the joined classes, in alphabetical order and separated with an underscore.
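
The naming rule is simple enough to express in a line of plain Ruby. The helper below is a sketch of the rule as just described, not code taken from Rails; guess_join_table_name is a hypothetical name.

```ruby
# Guess a join table name from two table names, following the rule described
# above: alphabetical order, joined with an underscore.
def guess_join_table_name(first_table, second_table)
  [first_table.to_s, second_table.to_s].sort.join('_')
end

# The order in which the tables are given does not matter.
puts guess_join_table_name(:timesheets, :billing_codes)
puts guess_join_table_name(:billing_codes, :timesheets)
```

Because the result does not depend on which side you start from, both models in the relationship arrive at the same join table without any extra configuration.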

For example, if I were using has_and_belongs_to_many (or habtm for short) to establish a relationship between Timesheet and BillingCode, the join table would be named billing_codes_timesheets and the relationship would be defined in the models. Both the migration class and models are listed:

class CreateBillingCodesTimesheets < ActiveRecord::Migration
  def self.up
    create_table :billing_codes_timesheets, :id => false do |t|
      t.column :billing_code_id, :integer, :null => false
      t.column :timesheet_id, :integer, :null => false
    end
  end

  def self.down
    drop_table :billing_codes_timesheets
  end
end

class Timesheet < ActiveRecord::Base
  has_and_belongs_to_many :billing_codes
end

class BillingCode < ActiveRecord::Base
  has_and_belongs_to_many :timesheets
end

Note that an id primary key is not needed, hence the :id => false option was passed to the create_table method. Also, since the foreign key columns are both needed, we pass them a :null => false option. (In real code, you would also want to make sure both of the foreign key columns were indexed properly.)

Self-Referential Relationship

What about self-referential many-to-many relationships? Linking a model to itself via a habtm relationship is easy—you just have to provide explicit options.

In Listing 7.4, I’ve created a join table and established a link between related BillingCode objects. Again, both the migration and model class are listed:

Example 7.4. Related Billing Codes

class CreateRelatedBillingCodes < ActiveRecord::Migration
  def self.up
    create_table :related_billing_codes, :id => false do |t|
      t.column :first_billing_code_id, :integer, :null => false
      t.column :second_billing_code_id, :integer, :null => false
    end
  end

  def self.down
    drop_table :related_billing_codes
  end
end

class BillingCode < ActiveRecord::Base
  has_and_belongs_to_many :related,
    :join_table => 'related_billing_codes',
    :foreign_key => 'first_billing_code_id',
    :association_foreign_key => 'second_billing_code_id',
    :class_name => 'BillingCode'
end

Bidirectional Relationships

It’s worth noting that the related relationship of the BillingCode in Listing 7.4 is not bidirectional. Just because you associate two objects in one direction does not mean they’ll be associated in the other direction. But what if you need to automatically establish a bidirectional relationship?

First let’s write a test for the BillingCode class to prove our solution. We’ll start by writing a couple of sample records to work with in test/fixtures/billing_codes.yml:

travel:
  code: TRAVEL
  client_id:
  id: 1
  description: Travel expenses of all sorts
development:
  code: DEVELOPMENT
  client_id:
  id: 2
  description: Coding, etc

When we add bidirectional behavior, we don’t want to break the normal behavior, so at first my test method establishes that the normal habtm relationship works:

require File.dirname(__FILE__) + '/../test_helper'

class BillingCodeTest < Test::Unit::TestCase
  fixtures :billing_codes

  def test_self_referential_habtm_association
    billing_codes(:travel).related << billing_codes(:development)
    assert BillingCode.find(1).related.include?(BillingCode.find(2))
  end
end

I run the test and it passes. Now I can modify the test method to add proof that the bidirectional behavior that we’re going to add works. It ends up looking very similar to the original method. (Normally I would lean toward only having one assertion per test method, but in this case it makes more sense to keep them together.) The second assert statement checks to see that the newly associated object also has its related BillingCode in its related collection.

require File.dirname(__FILE__) + '/../test_helper'

class BillingCodeTest < Test::Unit::TestCase

  fixtures :billing_codes

  def setup
    @travel = billing_codes(:travel)
    @development = billing_codes(:development)
  end

  def test_self_referential_bidirectional_habtm_association
    @travel.related << @development
    assert @travel.related.include?(@development)
    assert @development.related.include?(@travel)
  end
end

Of course, the new test fails, since we haven’t added the new behavior yet. I’m not entirely happy with this approach, since it involves bringing hand-coded SQL into my otherwise beautiful Ruby code. However, the Rails way is to use SQL when it makes sense to do so, and this is one of those cases.

Custom SQL Options

To get our bidirectional behavior, we’ll be using the :insert_sql option of has_and_belongs_to_many to override the normal INSERT statement that Rails would use to associate objects with each other.

Here’s a neat trick so that you don’t have to figure out the syntax of the INSERT statement from memory. Just copy and paste the normal INSERT statement that Rails uses. It’s not too hard to find in log/test.log if you tail the file while running the unit test we wrote in the previous section:

INSERT INTO related_billing_codes (`first_billing_code_id`,
`second_billing_code_id`) VALUES (1, 2)

Now we just have to tweak that INSERT statement so that it adds two rows instead of just one. You might be tempted to just add a semicolon and a second, full INSERT statement. That won’t work, because it is invalid to stuff two statements into one using a semicolon. Try it and see what happens if you’re curious.

After some quick googling, I found the following method of inserting multiple rows with one SQL statement that will work for Postgres, MySQL, and DB2 databases.[1] It is valid according to the SQL-92 standard, just not universally supported:

:insert_sql => 'INSERT INTO related_billing_codes
                (`first_billing_code_id`, `second_billing_code_id`)
                VALUES (#{id}, #{record.id}), (#{record.id}, #{id})'

There are some very important things to remember when trying to get custom SQL options to work. The first is to use single quotes around the entire string of custom SQL. If you were to use double quotes, the string would be interpolated in the context of the class where it is being declared, not at the time of your query like you need it to be.

Also, while we’re on the subject of quotation marks and how to use them, note that when I copied the INSERT query over from my log, I ended up with backtick characters around the column names, instead of single quotes. Trying to use single quotes around the column names instead of backtick characters will fail, because the database adapter will escape the quotes, producing invalid syntax. Yes, it’s a pain in the neck, but luckily you shouldn’t need to specify custom SQL very often.

Another thing to remember is that when your custom SQL string is interpolated, it will happen in the context of the object holding the association. The object being associated will be made available as record. If you look closely at the code listing, you’ll notice that to establish the bidirectional link, we just added two rows in the related_billing_codes table, one in each direction.
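
Here is a plain-Ruby sketch of what that interpolation amounts to, so you can see both id and record being filled in. FakeBillingCode and its interpolate_sql method are hypothetical stand-ins written for this illustration; the real ActiveRecord internals may well differ in detail.

```ruby
# Hypothetical stand-in for a BillingCode record; only the id matters here.
class FakeBillingCode
  attr_reader :id

  def initialize(id)
    @id = id
  end

  # Expand the template with self as the owner of the association. The
  # `record` argument is visible to the evaluated string because
  # instance_eval keeps the caller's local variables in scope.
  def interpolate_sql(template, record)
    instance_eval('"' + template + '"')
  end
end

# Single-quoted, so the placeholders stay literal until interpolate_sql runs.
INSERT_TEMPLATE = 'INSERT INTO related_billing_codes ' \
                  '(`first_billing_code_id`, `second_billing_code_id`) ' \
                  'VALUES (#{id}, #{record.id}), (#{record.id}, #{id})'

travel = FakeBillingCode.new(1)
development = FakeBillingCode.new(2)

puts travel.interpolate_sql(INSERT_TEMPLATE, development)
```

The expanded statement ends in VALUES (1, 2), (2, 1): one row for each direction of the relationship, produced from a single template.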

A quick test run confirms that our :insert_sql approach did indeed work. We should also use the :delete_sql option to make sure that the relationship can be broken bidirectionally as well. Again, I’ll drive the implementation in a TDD fashion, adding the following test to BillingCodeTest:

def test_that_deletion_is_bidirectional_too
  billing_codes(:travel).related << billing_codes(:development)
  billing_codes(:travel).related.delete(billing_codes(:development))
  assert !BillingCode.find(1).related.include?(BillingCode.find(2))
  assert !BillingCode.find(2).related.include?(BillingCode.find(1))
end

It’s similar to the previous test method, except that after establishing the relationship, it immediately deletes it. I expect that the first assertion will pass right away, but the second should fail:

$ ruby test/unit/billing_code_test.rb
Loaded suite test/unit/billing_code_test
Started
.F
Finished in 0.159424 seconds.

  1) Failure:
test_that_deletion_is_bidirectional_too(BillingCodeTest)
[test/unit/billing_code_test.rb:16]:
<false> is not true.

2 tests, 4 assertions, 1 failures, 0 errors

Yep, just as expected. Let’s peek at log/test.log and grab the SQL DELETE statement that we’ll work with:

DELETE FROM related_billing_codes WHERE first_billing_code_id = 1 AND
second_billing_code_id IN (2)

Hmph! This might be a little trickier than the insert. Curious about the IN operator, I take a peek inside the active_record/associations/has_and_belongs_to_many_association.rb file and find the following relevant method:

def delete_records(records)
  if sql = @reflection.options[:delete_sql]
    records.each { |record|
      @owner.connection.execute(interpolate_sql(sql, record))
    }
  else
    ids = quoted_record_ids(records)
    sql = "DELETE FROM #{@reflection.options[:join_table]}
           WHERE #{@reflection.primary_key_name} = #{@owner.quoted_id}
           AND #{@reflection.association_foreign_key} IN (#{ids})"
    @owner.connection.execute(sql)
  end
end

The final BillingCode class now looks like this:

class BillingCode < ActiveRecord::Base
  has_and_belongs_to_many :related,
    :join_table => 'related_billing_codes',
    :foreign_key => 'first_billing_code_id',
    :association_foreign_key => 'second_billing_code_id',
    :class_name => 'BillingCode',
    :insert_sql => 'INSERT INTO related_billing_codes
                    (`first_billing_code_id`, `second_billing_code_id`)
                    VALUES (#{id}, #{record.id}), (#{record.id}, #{id})'
end
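
The listing above does not include a :delete_sql value. As a hedged sketch only, here is one shape such a template could take, based on the delete_records method shown earlier, which runs the custom SQL once per record. The template text, FakeCode, and delete_statements are all assumptions made for illustration; verify any real template against your own database and log output.

```ruby
# Hypothetical stand-in for a BillingCode record; only the id matters here.
FakeCode = Struct.new(:id) do
  # Expand the template with self as the owner and `record` in scope.
  def interpolate_sql(template, record)
    instance_eval('"' + template + '"')
  end

  # Mirrors the records.each loop in delete_records: one statement per record.
  def delete_statements(template, records)
    records.map { |record| interpolate_sql(template, record) }
  end
end

# Assumed template: remove the join row in both directions at once.
DELETE_TEMPLATE =
  'DELETE FROM related_billing_codes WHERE ' \
  '(first_billing_code_id = #{id} AND second_billing_code_id = #{record.id}) OR ' \
  '(first_billing_code_id = #{record.id} AND second_billing_code_id = #{id})'

travel = FakeCode.new(1)
development = FakeCode.new(2)

puts travel.delete_statements(DELETE_TEMPLATE, [development])
```

The idea is simply the mirror image of the insert: since two rows were written to link the records, the delete has to match both of them.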

Linking Two Existing Objects Efficiently

Prior to Rails 2.0, the << method loaded the entire contents of the associated collection from the database into memory, which, depending on how many associated records you have in your database, could take a really long time!

Extra Columns on has_and_belongs_to_many Join Tables

Rails won’t have a problem with you adding as many extra columns as you want to habtm’s join table. The extra attributes will be read in and added onto model objects accessed via the habtm association. However, speaking from experience, the severe annoyances you will deal with in your application code make it really unattractive to go that route.

What kind of annoyances? For one, records returned from join tables with additional attributes will be marked as read-only, because it’s not possible to save changes to those additional attributes.

You should also consider that the way that Rails makes those extra columns of the join table available might cause problems in other parts of your codebase. Having extra attributes appear magically on an object is kind of cool, but what happens when you try to access those extra properties on an object that wasn’t fetched via the habtm association? Kaboom! Get ready for some potentially bewildering debugging exercises.

Other than the deprecated push_with_attributes, methods of the habtm proxy act just as they would for a has_many relationship. Similarly, habtm shares options with has_many; only its :join_table option is unique. It allows customization of the join table name.

To sum up, habtm is a simple way to establish a many-to-many relationship using a join table. As long as you don’t need to capture additional data about the relationship, everything is fine. The problems with habtm begin once you want to add extra columns to the join table, after which you’ll want to upgrade the relationship to use has_many :through instead.

“Real Join Models” and habtm

Rails 1.2 documentation advises readers that: “It’s strongly recommended that you upgrade any [habtm] associations with attributes to a real join model.” Use of habtm, which was one of the original innovative features in Rails, fell out of favor once the ability to create real join models was introduced via the has_many :through association.

Realistically, habtm is not going to be removed from Rails, for a couple of sensible reasons. First of all, plenty of legacy Rails applications need it. Second, habtm provides a way to join classes without a primary key defined on the join table, which is occasionally useful. But most of the time you’ll find yourself wanting to model many-to-many relationships with has_many :through.

has_many :through

Well-known Rails guy and fellow cabooser Josh Susser is considered the expert on ActiveRecord associations; even his blog is called has_many :through. His description of the :through association[2], written back when the feature was originally introduced in Rails 1.1, is so concise and well-written that I couldn’t hope to do any better. So here it is:

The has_many :through association allows you to specify a one-to-many relationship indirectly via an intermediate join table. In fact, you can specify more than one such relationship via the same table, which effectively makes it a replacement for has_and_belongs_to_many. The biggest advantage is that the join table contains full-fledged model objects complete with primary keys and ancillary data. No more push_with_attributes; join models just work the same way all your other ActiveRecord models do.

Join Models

To illustrate the has_many :through association, we’ll set up a Client model so that it has many Timesheet objects, through a normal has_many association named billable_weeks.

class Client < ActiveRecord::Base
  has_many :billable_weeks
  has_many :timesheets, :through => :billable_weeks
end

The BillableWeek class was already in our sample application and is ready to be used as a join model:

class BillableWeek < ActiveRecord::Base
  belongs_to :client
  belongs_to :timesheet
end

We can also set up the inverse relationship, from timesheets to clients, like this.

class Timesheet < ActiveRecord::Base
  has_many :billable_weeks
  has_many :clients, :through => :billable_weeks
end

Notice that has_many :through is always used in conjunction with a normal has_many association. Also, notice that the normal has_many association will often have the same name on both classes that are being joined together, which means the :through option will read the same on both sides.

:through => :billable_weeks

How about the join model; will it always have two belongs_to associations? No.

You can also use has_many :through to easily aggregate has_many or has_one associations on the join model. Forgive me for switching to a completely nonrealistic domain for a moment. It’s only intended to clearly demonstrate what I’m trying to describe:

class Grandparent < ActiveRecord::Base
  has_many :parents
  has_many :grand_children, :through => :parents, :source => :childs
end

class Parent < ActiveRecord::Base
  belongs_to :grandparent
  has_many   :childs
end

For the sake of clarity in later chapters, I’ll refer to this usage of has_many :through as aggregating.

Usage Considerations and Examples

You can use nonaggregating has_many :through associations in almost the same ways as any other has_many associations. The limitations have to do with handling of unsaved records.

>> c = Client.create(:name => "Trotter's Tomahawks", :code => "ttom")
=> #<Client:0x2228410...>

>> c.timesheets << Timesheet.new
ActiveRecord::HasManyThroughCantAssociateNewRecords: Cannot associate
new records through 'Client#billable_weeks' on '#'. Both records must
have an id in order to create the has_many :through record associating
them.

Hmm, seems like we had a hiccup. Unlike a normal has_many, ActiveRecord won’t let us add an object to the has_many :through association if both ends of the relationship are unsaved records.

The create method saves the record before adding it, so it does work as expected, provided the parent object isn’t unsaved itself.

>> c.save
=> true

>> c.timesheets.create
=> [#<Timesheet:0x2212354 @new_record=false, @new_record_before_save=
true, @attributes={"updated_at"=>Sun Mar 18 15:37:18 UTC 2007,
"id"=>2,
"user_id"=>nil, "submitted"=>nil, "created_at"=>Sun Mar 18 15:37:18
UTC
2007}, @errors=#<ActiveRecord::Errors:0x2211940 @base=
#<Timesheet:0x2212354 ...>, @errors={}>> ]

The main benefit of has_many :through is that ActiveRecord takes care of managing the instances of the join model for you. If we call reload on the billable_weeks association, we’ll see that there was a billable week object created for us:

>> c.billable_weeks.reload
=> [#<BillableWeek:0x139329c @attributes={"tuesday_hours"=>nil,
"start_date"=>nil, "timesheet_id"=>"2", "billing_code_id"=>nil,
"sunday_hours"=>nil, "friday_hours"=>nil, "monday_hours"=>nil,
"client_id"=>"2", "id"=>"2", "wednesday_hours"=>nil,
"saturday_hours"=>nil, "thursday_hours"=>nil}> ]

The BillableWeek object that was created is properly associated with both the Client and the Timesheet. Unfortunately, there are a lot of other attributes (e.g., start_date, and the hours columns) that were not populated.

One possible solution is to use create on the billable_weeks association instead, and include the new Timesheet object as one of the supplied properties.

>> bw = c.billable_weeks.create(:start_date => Time.now,
                                :timesheet => Timesheet.new)
=> #<BillableWeek:0x250fe08 @timesheet=#<Timesheet:0x2510100
@new_record=false, ...>

Aggregating Associations

When you’re using has_many :through to aggregate multiple child associations, there are more significant limitations. Essentially, you can query to your heart’s content using find and friends, but you can’t append or create new records through them.

For example, let’s add a billable_weeks association to our sample User class:

class User < ActiveRecord::Base
  has_many :timesheets
  has_many :billable_weeks, :through => :timesheets
  ...

The billable_weeks association aggregates all the billable week objects belonging to all of the user’s timesheets.

class Timesheet < ActiveRecord::Base
  belongs_to :user
  has_many :billable_weeks, :include => [:billing_code]
  ...

Now let’s go into the Rails console and set up some example data so that we can use the new billable_weeks collection (on User).

>> quentin = User.find :first
=> #<User id: 1, login: "quentin" ...>

>> quentin.timesheets
=> []

>> ts1 = quentin.timesheets.create
=> #<Timesheet id: 1 ...>

>> ts2 = quentin.timesheets.create
=> #<Timesheet id: 2 ...>

>> ts1.billable_weeks.create(:start_date => 1.week.ago)
=> #<BillableWeek id: 1, timesheet_id: 1 ...>

>> ts2.billable_weeks.create :start_date => 2.weeks.ago
=> #<BillableWeek id: 2, timesheet_id: 2 ...>

>> quentin.billable_weeks
=> [#<BillableWeek id: 1, timesheet_id: 1 ...>, #<BillableWeek id: 2,
timesheet_id: 2 ...>]

Just for fun, let’s see what happens if we try to create a BillableWeek with a User instance:

>> quentin.billable_weeks.create(:start_date => 3.weeks.ago)
NoMethodError: undefined method `user_id=' for
#<BillableWeek:0x3f84424>

There you go... BillableWeek doesn’t belong to a user, it belongs to a timesheet, so it doesn’t have a user_id field.

Join Models and Validations

When you append to a non-aggregating has_many :through association with <<, ActiveRecord will always create a new join model, even if one already exists for the two records being joined. You can add validates_uniqueness_of constraints on the join model to keep duplicate joins from happening.

This is what such a constraint might look like on our BillableWeek join model.

validates_uniqueness_of :client_id, :scope => :timesheet_id

That says, in effect: “There should only be one of each client per timesheet.”
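
If the role of :scope is unclear, think of the validation as checking the pair of values rather than either column alone. The following plain-Ruby sketch models that idea with a set of pairs. FakeJoinTable is a hypothetical class made up for the illustration, and it does not reproduce how the validation is actually implemented.

```ruby
require 'set'

# Hypothetical in-memory stand-in for the billable_weeks join table.
class FakeJoinTable
  def initialize
    @pairs = Set.new
  end

  # Accept a join row only if this client/timesheet pair is not present yet.
  # Set#add? returns nil when the element already exists.
  def add(client_id, timesheet_id)
    !@pairs.add?([client_id, timesheet_id]).nil?
  end
end

joins = FakeJoinTable.new
p joins.add(1, 10)   # first time this pair is seen
p joins.add(1, 11)   # same client, different timesheet
p joins.add(1, 10)   # duplicate pair is rejected
```

The same client can appear on many timesheets, and a timesheet can have many clients; only an exact repeat of the pair is turned away.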

If your join model has additional attributes with their own validation logic, then there’s another important consideration to keep in mind. Adding records directly to a has_many :through association causes a new join model to be automatically created with a blank set of attributes. Validations on additional columns of the join model will probably fail. If that happens, you’ll need to add new records by creating join model objects and associating them appropriately through their own association proxy.

timesheet.billable_weeks.create(:start_date => 1.week.ago)

has_many :through Options

The options for has_many :through are the same as the options for has_many. Remember that :through is just an option on has_many! However, the use of some of has_many’s options changes or becomes more significant when :through is used.

First of all, the :class_name and :foreign_key options are no longer valid, since they are implied from the target association on the join model.

Here are the rest of the options that have special significance together with has_many :through.

:source

The :source option specifies which association to use on the associated class. This option is not mandatory because normally ActiveRecord assumes that the target association is the singular (or plural) version of the has_many association name. If your association names don’t match up, then you have to set :source explicitly.

For example, the following code will use the BillableWeek’s sheet association to populate timesheets.

has_many :timesheets, :through => :billable_weeks, :source => :sheet

:source_type

The :source_type option is needed when you establish a has_many :through to a polymorphic belongs_to association on the join model.

Consider the following example of clients and contacts:

class Client < ActiveRecord::Base
  has_many :contact_cards
  has_many :contacts, :through => :contact_cards
end

class ContactCard < ActiveRecord::Base
  belongs_to :client
  belongs_to :contacts, :polymorphic => true
end

The most important fact here is that a Client has many contacts, which can be any kind of model since they are declared polymorphically on the join model, ContactCard. For example purposes, let’s associate people and businesses to contact cards:

class Person < ActiveRecord::Base
  has_many :contact_cards, :as => :contacts
end

class Business < ActiveRecord::Base
  has_many :contact_cards, :as => :contacts
end

Now take a moment to consider the backflips that ActiveRecord would have to perform in order to figure out which tables to query for a client’s contacts. It would theoretically need to be aware of every model class that is linked to the other end of the contacts polymorphic association.

In fact, it can’t do those kinds of backflips, which is probably a good thing as far as performance is concerned:

>> Client.find(:first).contacts
ArgumentError: /.../active_support/core_ext/hash/keys.rb:48:
in `assert_valid_keys`: Unknown key(s): polymorphic

The only way to make this scenario work (somewhat) is to give ActiveRecord some help by specifying which table it should search when you ask for the contacts collection, and you do that with the :source_type option. The value of the option is the name of the target class, symbolized:

class Client < ActiveRecord::Base
  has_many :people_contacts, :through => :contact_cards,
           :source => :contacts, :source_type => :person

  has_many :business_contacts, :through => :contact_cards,
           :source => :contacts, :source_type => :business
end

After the :source_type is specified, the association will work as expected.

>> Client.find(:first).people_contacts.create!
[#<Person:0x223e788 @attributes={"id"=>1}, @errors=
#<ActiveRecord::Errors:0x223dc0c @errors={}, @base=
#<Person: 0x...>>, @new_record_before_save=true, @new_record=false>]
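To see why naming the type helps, here is a minimal plain-Ruby sketch of the lookup, with no ActiveRecord involved. The tables, the rows, the contact_type and contact_id keys, and the contacts_of_type helper are all hypothetical stand-ins invented for illustration. The point is only that once the type is known, exactly one target table has to be consulted.

```ruby
# Toy data standing in for database tables (hypothetical rows, not ActiveRecord).
PEOPLE     = { 1 => "Obie", 2 => "Josh" }
BUSINESSES = { 1 => "Acme Corp" }

# Each join row records both the id and the class name of its polymorphic target.
# The key names are assumptions made for this sketch.
CONTACT_CARDS = [
  { :client_id => 1, :contact_type => "Person",   :contact_id => 1 },
  { :client_id => 1, :contact_type => "Business", :contact_id => 1 },
  { :client_id => 1, :contact_type => "Person",   :contact_id => 2 },
  { :client_id => 2, :contact_type => "Person",   :contact_id => 2 }
]

# Naming the source type up front means only one target table is searched.
def contacts_of_type(client_id, source_type, table)
  cards = CONTACT_CARDS.select do |card|
    card[:client_id] == client_id && card[:contact_type] == source_type
  end
  cards.map { |card| table.fetch(card[:contact_id]) }
end

contacts_of_type(1, "Person", PEOPLE)        # => ["Obie", "Josh"]
contacts_of_type(1, "Business", BUSINESSES)  # => ["Acme Corp"]
```

Without the type argument, the helper would have no way to know whether contact_id 1 refers to a person or a business, which is roughly the position ActiveRecord is in when :source_type is missing.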

The code is a bit longer and less magical, but it works. If you’re upset that you cannot associate people_contacts and business_contacts together in a contacts association, you could try writing your own accessor method for a client’s contacts:

class Client < ActiveRecord::Base
  def contacts
    people_contacts + business_contacts
  end
end

Of course, you should be aware that calling that contacts method will result in at least two database requests and will return an Array, without the association proxy methods that you might expect it to have.

:uniq

The :uniq option tells the association to include only unique objects. It is especially useful when using has_many :through, since two different BillableWeeks could reference the same Timesheet.

>> Client.find(:first).timesheets.reload
[#<Timesheet:0x13e79dc @attributes={"id"=>"1", ...}>,
 #<Timesheet:0x13e79b4 @attributes={"id"=>"1", ...}>]

It’s not extraordinary for two distinct model instances of the same database record to be in memory at the same time—it’s just not usually desirable.

class Client < ActiveRecord::Base
  has_many :timesheets, :through => :billable_weeks, :uniq => true
end

After adding the :uniq option, only one instance per record is returned.

>> Client.find(:first).timesheets.reload
[#<Timesheet:0x22332ac ...>]

The implementation of uniq on AssociationCollection is a neat little example of how to build a collection of unique values in Ruby, using a Set and the inject method. It also proves that the record’s primary key (and nothing else) is what’s being used to establish uniqueness:

def uniq(collection = self)
  seen = Set.new
  collection.inject([]) do |kept, record|
    unless seen.include?(record.id)
      kept << record
      seen << record.id
    end
    kept
  end
end
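The same idiom can be tried outside of Rails. The following sketch copies the Set and inject technique into a standalone method; the Record struct, its label field, and the uniq_by_id name are hypothetical and exist only for this illustration.

```ruby
require 'set'

# Hypothetical stand-in for a model instance; only id matters for uniqueness.
Record = Struct.new(:id, :label)

# Keep the first record seen for each id, preserving the original order.
def uniq_by_id(collection)
  seen = Set.new
  collection.inject([]) do |kept, record|
    unless seen.include?(record.id)
      kept << record
      seen << record.id
    end
    kept
  end
end

first_copy  = Record.new(1, "loaded via billable week A")
second_copy = Record.new(1, "loaded via billable week B")
other       = Record.new(2, "a different timesheet")

unique = uniq_by_id([first_copy, second_copy, other])
unique.map { |r| r.id }  # => [1, 2]
```

Note that the two copies with id 1 are not equal as structs (their labels differ), yet only the first survives, because the id alone decides what counts as a duplicate.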

One-to-One Relationships

One of the most basic relationship types is a one-to-one object relationship. In ActiveRecord we declare a one-to-one relationship using the has_one and belongs_to methods together. As in the case of a has_many relationship, you call belongs_to on the model whose database table contains the foreign key column linking the two records together.

has_one

Conceptually, has_one works almost exactly like has_many does, except that when the database query is executed to retrieve the related object, a LIMIT 1 clause is added to the generated SQL so that only one row is returned.

The name of a has_one relationship should be singular, which will make it read naturally, for example: has_one :last_timesheet, has_one :primary_account, has_one :profile_photo, and so on.

Let’s take a look at has_one in action by adding avatars for our users.

class Avatar < ActiveRecord::Base
  belongs_to :user
end

class User < ActiveRecord::Base
  has_one :avatar
  # ... the rest of our User code ...
end

That’s simple enough. Firing this up in script/console, we can look at some of the new methods that has_one adds to User.

>> u = User.find(:first)
>> u.avatar
=> nil

>> u.build_avatar(:url => '/avatars/smiling')
#<Avatar:0x2266bac @new_record=true, @attributes={"url"=>
"/avatars/smiling", "user_id"=>1}>

>> u.avatar.save
=> true

As you can see, we can use build_avatar to build a new avatar object and associate it with the user. While it’s great that has_one will associate an avatar with the user, it isn’t really anything that has_many doesn’t already do. So let’s take a look at what happens when we assign a new avatar to the user.

>> u = User.find(:first)
>> u.avatar
=> #<Avatar:0x2266bac @attributes={"url"=>"/avatars/smiling",
"user_id"=>1}>

>> u.create_avatar(:url => '/avatars/frowning')
=> #<Avatar:0x225071c @new_record=false, @attributes={"url"=>
"/avatars/frowning", "id"=>2, "user_id"=>1}, @errors=
#<ActiveRecord::Errors:0x224fc40 @base=#<Avatar:0x225071c ...>,
@errors={}>>

>> Avatar.find(:all)
=> [#<Avatar:0x22426f8 @attributes={"url"=>"/avatars/smiling",
"id"=>"1", "user_id"=>nil}>, #<Avatar:0x22426d0
@attributes={"url"=>"/avatars/frowning", "id"=>"2", "user_id"=>"1"}>]

The last line from that script/console session is the most interesting, because it shows that our initial avatar is no longer associated with the user. However, the previous avatar was not removed from the database, and removing it is what we want in this scenario. So, we’ll use the :dependent => :destroy option to force avatars to be destroyed when they are no longer associated with a user.

class User
  has_one :avatar, :dependent => :destroy
end

With some fiddling around in the console, we can verify that it works as intended.

>> u = User.find(:first)
>> u.avatar
=> #<Avatar:0x22426d0 @attributes={"url"=>"/avatars/frowning",
"id"=>"2", "user_id"=>"1"}>

>> u.avatar = Avatar.create(:url => "/avatars/jumping")
=> #<Avatar:0x22512ac @new_record=false,
@attributes={"url"=>"/avatars/jumping", "id"=>3, "user_id"=>1},
@errors=#<ActiveRecord::Errors:0x22508e8 @base=#<Avatar:0x22512ac ...>,
@errors={}>>

>> Avatar.find(:all)
=> [#<Avatar:0x22426f8 @attributes={"url"=>"/avatars/smiling", "id"
=>"1", "user_id"=>nil}>, #<Avatar:0x2245920 @attributes={"url"=>
"/avatars/jumping","id"=>"3", "user_id"=>"1"}>]

As you can see, adding :dependent => :destroy got rid of the frowning avatar, but not the smiling avatar. Rails only destroys the avatar that was just removed from the user, so bad data that is in your database from before will still remain. Keep this in mind when you decide to add :dependent => :destroy and remember to manually clear any bad data from before.

As I alluded to earlier, has_one is often used to single out one record of significance alongside an already established has_many relationship. For instance, let’s say we want to easily be able to access the last timesheet a user was working on:

class User < ActiveRecord::Base
  has_many :timesheets
  has_one  :latest_timesheet, :class_name => 'Timesheet'
end

I had to specify a :class_name, so that ActiveRecord knows what kind of object we’re associating. (It can’t figure it out based on the name of the association, :latest_timesheet.)

When adding a has_one relationship to a model that already has a has_many defined to the same related model, it is not necessary to add another belongs_to method call to the target object, just for the new has_one. That might seem a little counterintuitive at first, but if you think about it, the same foreign key value is being used to read the data from the database.

What happens when you replace an existing has_one target object with another? A lot depends on whether the newly related object was created before or after the object that it is replacing, because ActiveRecord doesn’t add any additional ordering parameters to its has_one query.

has_one Options

The options for has_one associations are similar to the ones for has_many.

:as

Allows you to set up a polymorphic association, covered in Chapter 9.

:class_name

Allows you to specify the class this association uses. In has_one :latest_timesheet, :class_name => 'Timesheet', the option tells ActiveRecord that the latest_timesheet association holds a Timesheet object associated with this user. Normally, Rails infers the class from the name of the association.

:conditions

Allows you to specify conditions that the object must meet to be included in the association. The conditions are specified the same way as for ActiveRecord::Base.find.

class User
  has_one :manager,
          :class_name => 'Person',
          :conditions => ["type = ?", "manager"]
end

Here manager is specified as a person object that has type = "manager". I tend to almost always use :conditions in conjunction with has_one. When ActiveRecord loads the association, it’s grabbing one of potentially many rows that have the right foreign key. Absent some explicit conditions (or perhaps an order clause), you’re leaving it in the hands of the database to pick a row.

:dependent

The :dependent option specifies how ActiveRecord should treat associated objects when the parent object is deleted. There are a few different values that you can pass and they work just like the :dependent option of has_many.

If you pass :destroy to it, you tell Rails to destroy the associated object when it is no longer associated with the primary object. Passing :delete will delete the associated object directly, without calling any of Rails’ normal hooks. Finally, the default (:nullify) will simply set the foreign key values to null so that the connection is broken.
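The difference between the three values can be sketched in plain Ruby. Everything below is a hypothetical toy (the ToyAvatar class, its events log, and the apply_dependent helper are invented for this illustration) and is not how ActiveRecord is implemented. It only mirrors the behavior described above: destroy runs callbacks, delete skips them, and nullify merely clears the foreign key.

```ruby
# Hypothetical in-memory stand-in for an associated row; not ActiveRecord.
class ToyAvatar
  attr_accessor :user_id
  attr_reader :events

  def initialize(user_id)
    @user_id = user_id
    @events  = []
  end

  # destroy: callbacks run first, then the row is removed
  def destroy
    @events << :before_destroy_callback
    @events << :row_deleted
  end

  # delete: the row is removed directly, with no callbacks
  def delete
    @events << :row_deleted
  end
end

# Apply one of the three strategies to the associated object.
def apply_dependent(strategy, avatar)
  case strategy
  when :destroy then avatar.destroy
  when :delete  then avatar.delete
  when :nullify then avatar.user_id = nil
  else raise ArgumentError, "unknown strategy: #{strategy.inspect}"
  end
  avatar
end

apply_dependent(:destroy, ToyAvatar.new(1)).events  # => [:before_destroy_callback, :row_deleted]
apply_dependent(:delete,  ToyAvatar.new(1)).events  # => [:row_deleted]
apply_dependent(:nullify, ToyAvatar.new(1)).user_id # => nil
```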

:foreign_key

Specifies the name of the foreign key column on the association’s table.

:include

Allows you to “eager load” additional association objects when your associated object is loaded. See the :include option of the has_many and belongs_to associations for more details.

:order

Allows you to specify a SQL fragment that will be used to order the results. This is an especially useful option with has_one when trying to associate the latest of something or another.

class User
  has_one :latest_timesheet,
          :class_name => 'Timesheet',
          :order => 'created_at desc'
end
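The effect of combining an order clause with a single-row limit can be imitated with ordinary Ruby collections. In this sketch the TimesheetRow struct, the ROWS data, and both helper methods are hypothetical; the array order stands in for whatever order the database happens to return rows in.

```ruby
# Hypothetical timesheet rows; array order mimics the database's arbitrary order.
TimesheetRow = Struct.new(:id, :user_id, :created_at)

ROWS = [
  TimesheetRow.new(1, 7, Time.utc(2007, 1, 5)),
  TimesheetRow.new(2, 7, Time.utc(2007, 3, 9)),
  TimesheetRow.new(3, 7, Time.utc(2007, 2, 1)),
  TimesheetRow.new(4, 8, Time.utc(2007, 4, 1))
]

# No ordering: a one-row limit simply takes the first matching row it is handed.
def first_match(rows, user_id)
  rows.find { |row| row.user_id == user_id }
end

# Ordering by created_at descending puts the newest matching row first.
def latest_match(rows, user_id)
  matching = rows.select { |row| row.user_id == user_id }
  matching.sort_by { |row| row.created_at }.reverse.first
end

first_match(ROWS, 7).id   # => 1
latest_match(ROWS, 7).id  # => 2
```

The first helper returns whichever row happens to come first, which is why an unordered has_one is at the mercy of the database. The second returns the newest row regardless of storage order.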

Unsaved Objects and Associations

You can manipulate objects and associations before they are saved to the database, but there is some special behavior you should be aware of, mostly involving the saving of associated objects. Whether an object is considered unsaved is based on the result of calling new_record?

One-to-One Associations

Assigning an object to a has_one association automatically saves that object and the object being replaced (if there is one), so that their foreign key fields are updated. The exception to this behavior is if the parent object is unsaved, since that would mean that there is no foreign key value to set.

If save fails for either of the objects being updated (due to one of them being invalid) the assignment operation returns false and the assignment is cancelled. That behavior makes sense (if you think about it), but it can be the cause of much confusion when you’re not aware of it. If you have an association that doesn’t seem to work, check the validation rules of the related objects.

If you happen to want to assign an object to a has_one association without saving it, you can use the build method that has_one generates for the association:

user.build_profile_photo(params[:photo])

Assigning an object to a belongs_to association does not save the parent or the associated object.

Collections

Adding an object to has_many and has_and_belongs_to_many collections automatically saves it, unless the parent object (the owner of the collection) is not yet stored in the database.

If objects being added to a collection (via << or similar means) fail to save properly, then the addition operation will return false. If you want your code to be a little more explicit, or you want to add an object to a collection without automatically saving it, then you can use the collection’s build method. It’s exactly like create, except that it doesn’t save.

Members of a collection are automatically saved (or updated) when their parent is saved (or updated).
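The saving rules for collections can be summarized with a small toy model. The ToyRecord and ToyCollection classes below are hypothetical and greatly simplified (for instance, saves never fail here), so treat this as a sketch of the rules just described rather than a picture of the real implementation.

```ruby
# Hypothetical toy classes that imitate the saving rules described above.
class ToyRecord
  def initialize
    @saved = false
  end

  def new_record?
    !@saved
  end

  def save
    @saved = true
  end
end

class ToyCollection
  attr_reader :members

  def initialize(owner)
    @owner   = owner
    @members = []
  end

  # << saves the new member right away, unless the owner is itself unsaved.
  def <<(record)
    record.save unless @owner.new_record?
    @members << record
    self
  end

  # build adds a new member without saving it.
  def build
    record = ToyRecord.new
    @members << record
    record
  end

  # Saving the owner also saves any members that are still unsaved.
  def save_owner
    @owner.save
    @members.each { |member| member.save if member.new_record? }
    true
  end
end

owner      = ToyRecord.new
collection = ToyCollection.new(owner)
early      = ToyRecord.new
collection << early   # owner is unsaved, so early stays unsaved
collection.save_owner # now both owner and early are saved
```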

Association Extensions

The proxy objects that handle access to associations can be extended with your own application code. You can add your own custom finders and factory methods to be used specifically with a particular association.

For example, let’s say you wanted a concise way to refer to an account’s people by name. You might wrap the find_or_create_by_first_name_and_last_name method of a people collection in the following neat little package as shown in Listing 7.5.

Listing 7.5. An Association Extension on a People Collection

class Account < ActiveRecord::Base

  has_many :people do
    def named(name)
      first_name, last_name = name.split(" ", 2)
      find_or_create_by_first_name_and_last_name(first_name, last_name)
    end
  end

end

Now we have a named method available to use on the people collection.

person = Account.find(:first).people.named("David Heinemeier Hansson")
person.first_name # => "David"
person.last_name  # => "Heinemeier Hansson"
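The name parsing relies on the limit argument of String#split, which can be checked in plain Ruby without Rails. The split_name helper below is a hypothetical wrapper added only so the behavior is easy to try out.

```ruby
# Mirrors the parsing step inside the named extension method.
# A limit of 2 means everything after the first space stays in one piece.
def split_name(name)
  name.split(" ", 2)
end

split_name("David Heinemeier Hansson")  # => ["David", "Heinemeier Hansson"]
split_name("Obie Fernandez")            # => ["Obie", "Fernandez"]

# With a single word there is no second element, so last_name ends up nil.
first_name, last_name = split_name("Madonna")
first_name  # => "Madonna"
last_name   # => nil
```

The single-word case is worth keeping in mind: the finder would then be called with a nil last name.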

If you need to share the same set of extensions between many associations, you can specify an extension module instead of a block with method definitions.

Here is the same feature shown in Listing 7.5, except broken out into its own Ruby module:

module ByNameExtension
  def named(name)
    first_name, last_name = name.split(" ", 2)
    find_or_create_by_first_name_and_last_name(first_name, last_name)
  end
end

Now we can use it to extend many different relationships, as long as they’re compatible. (Our contract in the example consists of the find_or_create_by_first_name_and_last_name method.)

class Account < ActiveRecord::Base
  has_many :people, :extend => ByNameExtension
end

class Company < ActiveRecord::Base
  has_many :people, :extend => ByNameExtension
end

If you need to use multiple named extension modules, you can pass an array of modules to the :extend option instead of a single module, like this:

has_many :people, :extend => [ByNameExtension, ByRecentExtension]

In the case of name conflicts, methods contained in modules added later in the array supersede those earlier in the array.
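That precedence follows from how Ruby's Object#extend works when modules are applied one at a time. The sketch below assumes the modules are applied in array order, first to last; the module names, their methods, and the extend_in_order helper are hypothetical.

```ruby
# Two hypothetical extension modules with one conflicting method name.
module EarlierExtension
  def describe
    "earlier"
  end

  def only_in_earlier
    true
  end
end

module LaterExtension
  def describe
    "later"
  end
end

# Extend the object with each module in turn, first to last.
# Each extend call puts its module ahead of the previous ones in method lookup.
def extend_in_order(object, modules)
  modules.each { |mod| object.extend(mod) }
  object
end

proxy = extend_in_order(Object.new, [EarlierExtension, LaterExtension])
proxy.describe         # => "later"
proxy.only_in_earlier  # => true
```

Methods that do not conflict remain available from every module; only the clashing name is decided by position in the array.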

The AssociationProxy Class

AssociationProxy, the parent of all association proxies (refer to Figure 7.1 if needed), contributes a number of useful methods that apply to most kinds of associations and can come into play when you’re writing association extensions.

reload and reset

The reset method puts the association proxy back in its initial state, which is unloaded (cached association objects are cleared). The reload method invokes reset, and then loads associated objects from the database.
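The relationship between the two methods can be sketched with a miniature lazy-loading proxy. The ToyProxy class is hypothetical and far simpler than AssociationProxy; the block passed to the constructor stands in for a database query.

```ruby
# Hypothetical miniature proxy; the loader block stands in for a database query.
class ToyProxy
  def initialize(&loader)
    @loader = loader
    reset
  end

  # Back to the initial, unloaded state: the cached target is dropped.
  def reset
    @target = nil
    @loaded = false
  end

  def loaded?
    @loaded
  end

  # Clear the cache, then fetch fresh data straight away.
  def reload
    reset
    load_target
    self
  end

  # Load lazily on first access and reuse the cached result afterwards.
  def target
    load_target unless loaded?
    @target
  end

  private

  def load_target
    @target = @loader.call
    @loaded = true
  end
end

queries = 0
proxy = ToyProxy.new { queries += 1; ["timesheet"] }
proxy.target   # first access runs the loader
proxy.target   # served from the cache, no second query
proxy.reset    # cache cleared, but nothing is loaded yet
proxy.reload   # cache cleared and the loader runs again
```

The practical difference is timing: reset defers the next query until the target is needed, while reload pays for it immediately.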

proxy_owner, proxy_reflection, and proxy_target

References to the internal owner, reflection, and target attributes of the association proxy, respectively.

The proxy_owner method provides a reference to the parent object holding the association.

The proxy_reflection object is an instance of ActiveRecord::Reflection::AssociationReflection and contains all of the configuration options for the association. That includes both default settings and those that were passed to the association method when it was declared.[3]

The proxy_target is the associated array (or associated object itself in the case of belongs_to and has_one).

It might not appear sane to expose these attributes publicly and allow their manipulation. However, without access to them it would be much more difficult to write advanced association extensions. The loaded?, loaded, target, and target= methods are public for similar reasons.

The following code sample demonstrates the use of proxy_owner within a published_prior_to extension method contributed by Wilson Bilkovich:

class ArticleCategory < ActiveRecord::Base

  acts_as_tree

  has_many :articles do

    def published_prior_to(date, options = {})
      if proxy_owner.top_level?
        Article.find_all_published_prior_to(date, :category => proxy_owner)
      else
        # self is the 'articles' association here so we inherit its scope
        self.find(:all, options)
      end
    end

  end # has_many :articles extension

  def top_level?
    # do we have a parent, and is our parent the root node of the tree?
    self.parent && self.parent.parent.nil?
  end

end

The acts_as_tree ActiveRecord plugin extension creates a self-referential association based on a parent_id column. The proxy_owner reference is used to check if the parent of this association is a “top-level” node in the tree.
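The top_level? check is easy to misread, so here is a plain-Ruby sketch of the same test on a hand-built tree. The ToyCategory class is a hypothetical stand-in for a model using acts_as_tree, and this version is written to return strictly true or false.

```ruby
# Hypothetical tree node standing in for an acts_as_tree model.
class ToyCategory
  attr_reader :parent

  def initialize(parent = nil)
    @parent = parent
  end

  # Same idea as top_level? above: we have a parent, and that parent is the root.
  def top_level?
    !parent.nil? && parent.parent.nil?
  end
end

root   = ToyCategory.new
top    = ToyCategory.new(root)
nested = ToyCategory.new(top)

root.top_level?    # => false (the root itself has no parent)
top.top_level?     # => true  (a direct child of the root)
nested.top_level?  # => false (its parent is not the root)
```

In other words, "top-level" here means the first layer beneath the root, not the root node itself.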

Conclusion

The ability to model associations is what makes ActiveRecord more than just a data-access layer. The ease and elegance with which you can declare those associations are what make ActiveRecord more than your ordinary object-relational mapper.

In this chapter, we covered the basics of how ActiveRecord associations work. We started by taking a look at the class hierarchy of association classes, starting with AssociationProxy. Hopefully, by learning about how associations work under the hood, you’ve gained a better understanding of their power and flexibility.

Finally, the guide to the options and methods of each type of association should be a good reference for your day-to-day development activities.

References

1. http://en.wikipedia.org/wiki/Insert_(SQL)#Multirow_inserts

2. http://blog.hasmanythrough.com/articles/2006/02/28/association-goodness

3. To learn more about how the reflection object can be useful, including an explanation on how to establish has_many :through associations via other has_many :through associations, check out the must-read article: http://www.pivotalblabs.com/articles/2007/08/26/ten-things-i-hate-about-proxy-objects-part-i.

 
