Any time you can reify something, you can create something that embodies a concept, it gives you leverage to work with it more powerfully. That’s exactly what’s going on with has_many :through.
—Josh Susser
Active Record associations let you declaratively express relationships between model classes. The power and readability of the Associations API is an important part of what makes working with Rails so special.
This chapter covers the different kinds of Active Record associations available while highlighting use cases and available customizations for each of them. We also take a look at the classes that give us access to relationships themselves.
Associations typically appear as methods on Active Record model objects. For example, the method timesheets might represent the timesheets associated with a given user.
user.timesheets
However, people might get confused about the type of objects returned by these association methods, because they have a way of masquerading as plain old Ruby objects. For instance, in previous versions of Rails, an association collection would seem to return an array of objects, when in fact the return type was an association proxy. As of Rails 4, asking any association collection what its return type is will tell you that it is an ActiveRecord::Associations::CollectionProxy:
>> user.timesheets
=> #<ActiveRecord::Associations::CollectionProxy []>
It’s actually lying to you, albeit very innocently. Association methods for has_many associations are actually instances of HasManyAssociation.
The CollectionProxy acts like a middleman between the object that owns the association and the actual associated object. Methods that are unknown to the proxy are sent to the target object via method_missing.
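The forwarding idea can be sketched in plain Ruby. This is a toy illustration under stated assumptions, not Active Record's implementation: ToyCollectionProxy and its loader block are hypothetical names invented for the example.

```ruby
# A toy stand-in for an association proxy; NOT Active Record's real code.
class ToyCollectionProxy
  def initialize(&loader)
    @loader = loader   # hypothetical: a block that "queries" for the records
    @target = nil      # the real array, loaded lazily
  end

  # A method the proxy itself knows about.
  def loaded?
    !@target.nil?
  end

  private

  def target
    @target ||= @loader.call
  end

  # Anything the proxy does not define is forwarded to the target array.
  def method_missing(name, *args, &block)
    if target.respond_to?(name)
      target.public_send(name, *args, &block)
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    target.respond_to?(name, include_private) || super
  end
end

proxy = ToyCollectionProxy.new { %w[week1 week2] }
proxy.loaded?         # false until a forwarded call loads the target
proxy.map(&:upcase)   # forwarded to Array#map
proxy.size            # forwarded to Array#size
```

The caller never needs to know whether it holds the proxy or the array, which is why the class of the returned object matters so little in practice.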
Fortunately, it’s not the Ruby way to care about the actual class of an object. What messages an object responds to is a lot more significant.
The parent class of all has_many associations is CollectionAssociation, and most of the methods that it defines work similarly, regardless of the options declared for the relationship. Before we get much further into the details of the association proxies, let’s delve into the most fundamental type of association that is commonly used in Rails applications: the has_many / belongs_to pair, used to define one-to-many relationships.
In our recurring sample application, an example of a one-to-many relationship is the association between the User, Timesheet, and ExpenseReport classes:
class User < ActiveRecord::Base
  has_many :timesheets
  has_many :expense_reports
end
Timesheets and expense reports should be linked in the opposite direction as well, so that it is possible to reference the user to which a timesheet or expense report belongs.
class Timesheet < ActiveRecord::Base
  belongs_to :user
end

class ExpenseReport < ActiveRecord::Base
  belongs_to :user
end
When these relationship declarations are executed, Rails uses some metaprogramming magic to dynamically add code to your models. In particular, proxy collection objects are created that let you manipulate the relationship easily. To demonstrate, let’s play with these relationships in the console. First, I’ll create a user.
>> obie = User.create login: 'obie', password: '1234',
password_confirmation: '1234', email: '[email protected]'
=> #<User...>
Now I’ll verify that I have collections for timesheets and expense reports.
>> obie.timesheets
Timesheet Load (0.4ms) SELECT "timesheets".* FROM "timesheets" WHERE
"timesheets"."user_id" = ? [[nil, 1]]
SQLite3::SQLException: no such column: timesheets.user_id: SELECT
"timesheets".* FROM "timesheets" WHERE "timesheets"."user_id" = ?
As David might say, “Whoops!” I forgot to add the foreign key columns to the timesheets and expense_reports tables, so in order to go forward I’ll generate a migration for the changes:
$ rails generate migration add_user_foreign_keys
invoke active_record
create db/migrate/20130330201532_add_user_foreign_keys.rb
Then I’ll open db/migrate/20130330201532_add_user_foreign_keys.rb and add the missing columns. (Using change_table would mean writing many more lines of code, so we’ll stick with the traditional add_column syntax, which still works fine.)
class AddUserForeignKeys < ActiveRecord::Migration
  def change
    add_column :timesheets, :user_id, :integer
    add_column :expense_reports, :user_id, :integer
  end
end
Running rake db:migrate applies the changes:
$ rake db:migrate
== AddUserForeignKeys: migrating========================================
-- add_column(:timesheets, :user_id, :integer)
-> 0.0011s
-- add_column(:expense_reports, :user_id, :integer)
-> 0.0005s
== AddUserForeignKeys: migrated (0.0018s) ==============================
Index Associations for Performance Boost
Premature optimization is the root of all evil. However, most experienced Rails developers don’t mind adding indexes for foreign keys at the time that those are created. In the case of our migration example, you’d add the following statements:
add_index :timesheets, :user_id
add_index :expense_reports, :user_id
The loading of your associations (which is usually more common than the creation of items) will get a big performance boost.
Now I should be able to add a new blank timesheet to my user and check timesheets again to make sure it’s there:
>> obie = User.find(1)
=> #<User id: 1...>
>> obie.timesheets << Timesheet.new
=> #<ActiveRecord::Associations::CollectionProxy [#<Timesheet id: 1 ...]>
>> obie.timesheets
=> #<ActiveRecord::Associations::CollectionProxy [#<Timesheet id: 1 ...]>
Notice that the Timesheet object gains an id immediately.
As you can deduce from the previous example, appending an object to a has_many collection automatically saves that object, unless the parent object (the owner of the collection) is not yet stored in the database. Let’s make sure that’s the case using Active Record’s reload method, which refetches the attributes of an object from the database:
>> obie.timesheets.reload
=> #<ActiveRecord::Associations::CollectionProxy [#<Timesheet id: 1, user_id: 1 ...]>
There it is. The foreign key, user_id, was automatically set by the << method. It takes one or more association objects to add to the collection, and since it flattens its argument list and inserts each record, push and concat behave identically.
In the blank timesheet example, I could have used the create method on the association proxy, and it would have worked essentially the same way:
>> obie.timesheets.create
=> #<Timesheet id: 1, user_id: 1 ...>
Even though at first glance << and create do the same thing, there are some important differences in how they’re implemented that are covered in the following section.
Association collections are basically fancy wrappers around a Ruby array and have a normal array’s methods. Named scopes and all of ActiveRecord::Base’s class methods are also available on association collections, including find, order, where, and so on.
user.timesheets.where(submitted: true).order('updated_at desc')
user.timesheets.late # assuming a scope :late defined on the Timesheet class
The following methods of CollectionProxy are available to association collections:
Both << and create will add either a single associated object or many, depending on whether you pass them an array or not. They both also trigger the :before_add and :after_add callbacks (covered in this chapter’s section “has_many Options”).
Finally, the return values of the two methods differ. The create method returns the new instance created, which is what you’d expect given its counterpart in ActiveRecord::Base. The << method returns the association proxy, which allows chaining and is also natural behavior for a Ruby array.
However, << will return false and not itself if any of the records being added causes the operation to fail. You shouldn’t depend on the return value of << being an array that you can continue operating on in a chained fashion.
The any? method behaves like its Enumerable counterpart if you give it a block; otherwise, it’s the opposite of empty?. Its companion method many?, which is an Active Support extension to Enumerable, returns true if the size of the collection is greater than one or, when a block is given, if two or more elements match the supplied criteria.
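The semantics just described can be sketched in plain Ruby. The helper many_in? is a hypothetical stand-in written for this example, since many? itself comes from Active Support and is not part of core Ruby.

```ruby
# Plain-Ruby sketch of the semantics described above; many_in? is a
# hypothetical helper, not the Active Support method itself.
def many_in?(collection, &block)
  if block
    collection.count(&block) > 1   # two or more elements satisfy the block
  else
    collection.size > 1            # more than one element overall
  end
end

hours = [8, 0, 6]

hours.any?                      # => true (here the same as !hours.empty?)
hours.any? { |h| h > 7 }        # => true, one element matches
many_in?(hours)                 # => true, three elements
many_in?(hours) { |h| h > 7 }   # => false, only one element matches
```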
The average method is a convenience wrapper for calculate(:average, ...).
Traditionally, the build method has corresponded to the new method of Active Record classes, except that it presets the owner’s foreign key and appends it to the association collection in one operation. However, as of Rails 2.2, the new method has the same behavior and probably should be used instead of build.
user.timesheets.build(attributes)
user.timesheets.new(attributes) # same as calling build
One possible reason to still use build is that as a convenience, if the attributes parameter is an array of hashes (instead of just one), then build executes for each one. However, you would usually accomplish that kind of behavior using accepts_nested_attributes_for on the owning class, covered in Chapter 11, “All about Helpers,” in section 11.9.3, “Integrating Additional Objects in One Form.”
The calculate method provides aggregate (:sum, :average, :minimum, and :maximum) values within the scope of associated records. Covered in detail in Chapter 9, “Advanced Active Record.”
The clear method is similar to invoking delete_all (covered later in this section); however, instead of returning an array of deleted objects, it is chainable.
The count method counts all associated records in the database. The first parameter, column_name, gives you the option of counting on a column instead of generating COUNT(*) in the resulting SQL. If the :counter_sql option is set for the association, it will be used for the query; otherwise, you can pass a custom value via the options hash of this method.
Assuming that no :counter_sql or :finder_sql options are set on the association or passed to count, the target class’s count method is used, scoped to only count associated records.
The create and create! methods instantiate a new record with its foreign key attribute set to the owner’s id, add it to the association collection, and save it, all in one method call. The bang variant raises ActiveRecord::RecordInvalid if saving fails, while the nonbang variant returns the new instance whether or not it could be saved, consistent with the create method of ActiveRecord::Base.
The owning record must be saved in order to use create; otherwise, an ActiveRecord::RecordNotSaved exception is raised.
>> User.new.timesheets.create
ActiveRecord::RecordNotSaved: You cannot call create unless the parent is saved
If a block is passed to create or create!, it will get yielded the newly created instance after the passed-in attributes are assigned but before saving the record to the database.
The delete and delete_all methods are used to sever specified associations or all of them, respectively. Both methods operate transactionally.
Invoking delete_all executes an SQL UPDATE that sets foreign keys for all currently associated objects to nil, effectively disassociating them from their parent.
Note
The names of the delete and delete_all methods can be misleading. By default, they don’t delete anything from the database. They only sever associations by clearing the foreign key field of the associated record. This behavior is related to the :dependent option, which defaults to :nullify. If the association is configured with the :dependent option set to :delete_all or :destroy, then the associated records will actually be deleted from the database.
The destroy and destroy_all methods are used to remove specified or all associated records from the database. Both methods operate transactionally.
The destroy_all method takes no parameters; it’s an all or nothing affair. When called, it begins a transaction and invokes destroy on each object in the association, causing them all to be deleted from the database with individual DELETE SQL statements. There are load issues to consider if you plan to use this method with large association collections, since many objects will be loaded into memory at once.
The empty? method simply calls size.zero?.
The find method finds an associated record by id, a really common operation when dealing with nested RESTful resources. It raises an ActiveRecord::RecordNotFound exception if either the id or foreign_key of the owner record is not found.
The first method returns the first associated record. Wondering how Active Record figures out whether to go to the database instead of loading the entire association collection into memory?
def fetch_first_or_last_using_find?(args)
  if args.first.is_a?(Hash)
    true
  else
    !(loaded? ||
      owner.new_record? ||
      options[:finder_sql] ||
      target.any? { |record| record.new_record? || record.changed? } ||
      args.first.kind_of?(Integer))
  end
end
Passing first an integer argument mimics the semantics of Ruby’s Array#first, returning that number of records.
>> c = Client.first
=> #<Client id: 1, name: "Taigan", code: "TAIGAN", created_at: "2010-01-24
03:18:58", updated_at: "2010-01-24 03:18:58">
>> c.billing_codes.first(2)
=> [#<BillingCode id: 1, client_id: 1, code: "MTG", description: "Meetings">,
#<BillingCode id: 2, client_id: 1, code: "DEV", description: "Development">]
The ids method is a convenience wrapper for pluck(primary_key), covered in detail in Chapter 9, “Advanced Active Record.”
The include? method checks to see if the supplied record exists in the association collection and that it still exists in the underlying database table.
The last method returns the last associated record. Refer to the description of first earlier in this section for more details; it behaves exactly the same except for the obvious.
The length method returns the size of the collection by loading it and calling size on the array.
The maximum method is a convenience wrapper for calculate(:maximum, ...), covered in detail in Chapter 9, “Advanced Active Record.”
The minimum method is a convenience wrapper for calculate(:minimum, ...), covered in detail in Chapter 9, “Advanced Active Record.”
The new method instantiates a new record with its foreign key attribute set to the owner’s id and adds it to the association collection, in one method call.
The pluck method returns an array of attribute values, covered in detail in Chapter 9, “Advanced Active Record.”
The replace method replaces the collection with other_array. It works by deleting objects that exist in the current collection but not in other_array and inserting (using concat) objects that don’t exist in the current collection but do exist in other_array.
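The delete-then-insert strategy can be sketched with plain arrays. The helper replace_collection is hypothetical and only models the set arithmetic; the real method operates on records and the database.

```ruby
# Sketch of the replace strategy described above, using plain arrays.
# replace_collection is a hypothetical helper for illustration only.
def replace_collection(current, other_array)
  to_delete = current - other_array   # in the collection, not in other_array
  to_insert = other_array - current   # in other_array, not yet in the collection
  remaining = current - to_delete     # a new array; current is left untouched
  remaining.concat(to_insert)
end

current = [:mon, :tue, :wed]
replace_collection(current, [:tue, :thu])   # => [:tue, :thu]
```

Records present in both collections are left alone, so only the differences cause any work.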
The select method allows the specification of one or more attributes to be selected for an association result set.
>> user.timesheets.select(:submitted).to_a
=> [#<Timesheet id: nil, submitted: false>,
#<Timesheet id: nil, submitted: true>]
>> user.timesheets.select([:id,:submitted]).to_a
=> [#<Timesheet id: 1, submitted: false>,
#<Timesheet id: 2, submitted: true>]
Keep in mind that only attributes specified will be populated in the resulting objects! For instance, continuing the first example, trying to access updated_at on any of the returned timesheets results in an ActiveModel::MissingAttributeError exception being raised.
>> timesheet = user.timesheets.select(:submitted).first
=> #<Timesheet id: nil, submitted: false>
>> timesheet.updated_at
ActiveModel::MissingAttributeError: missing attribute: updated_at
Alternatively, passing a block to the select method behaves similarly to Array#select. The result set from the database scope is converted into an array of objects and iterated through using Array#select, including only objects where the specified block returns true.
If the collection has already been loaded or its owner object has never been saved, the size method simply returns the size of the current underlying array of associated objects. Otherwise, assuming default options, a SELECT COUNT(*) query is executed to get the size of the associated collection without having to load any objects. The query is bounded to the :limit option of the association, if there is any set.
Note that if there is a counter_cache option set on the association, then its value is used instead of hitting the database.
When you know that you are starting from an unloaded state and it’s likely that there are associated records in the database that you will need to load no matter what, it’s more efficient to use length instead of size.
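The trade-off between size and length can be modeled with a toy collection that records the queries it would issue. Both FakeTable and ToyCollection are hypothetical classes written for this sketch; they are not how Active Record is implemented.

```ruby
# Toy model of the size/length trade-off; NOT Active Record's implementation.
# FakeTable stands in for the database and records the queries it receives.
class FakeTable
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @queries = []
  end

  def count
    @queries << :count
    @rows.size
  end

  def select_all
    @queries << :select
    @rows.dup
  end
end

class ToyCollection
  def initialize(table)
    @table = table
    @records = nil
  end

  def loaded?
    !@records.nil?
  end

  # Cheap when unloaded: asks for a count without instantiating records.
  def size
    loaded? ? @records.size : @table.count
  end

  # Always loads the records, then measures the array.
  def length
    to_a.size
  end

  def to_a
    @records ||= @table.select_all
  end
end

table = FakeTable.new(%w[ts1 ts2 ts3])
sheets = ToyCollection.new(table)
sheets.size      # issues a count query, loads nothing
sheets.to_a      # a second query is needed to load the records

table2 = FakeTable.new(%w[ts1 ts2 ts3])
sheets2 = ToyCollection.new(table2)
sheets2.length   # loads the records once
sheets2.to_a     # already loaded, no further query
```

If the records are going to be loaded anyway, calling length first saves the separate count query.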
Some association options, such as :group and :uniq, come into play when calculating size. Basically they will always force all objects to be loaded from the database so that the resulting size of the association array can be returned.
The sum method is a convenience wrapper for calculate(:sum, ...), covered in detail in Chapter 9, “Advanced Active Record.”
The uniq method iterates over the target collection and populates an Array with the unique values present. Keep in mind that equality of Active Record objects is determined by identity, meaning that the value of the id attribute is the same for both objects being compared.
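To see why identity-based equality matters for uniqueness, here is a plain-Ruby sketch. ToyRecord is a hypothetical class that mimics equality by class and id; it is not an Active Record model.

```ruby
# Sketch only: ToyRecord mimics identity-by-id equality; it is not an
# Active Record class.
class ToyRecord
  attr_reader :id, :note

  def initialize(id, note)
    @id = id
    @note = note
  end

  # Two records are the same if they have the same class and id.
  def ==(other)
    other.instance_of?(self.class) && other.id == id
  end
  alias eql? ==

  # Array#uniq relies on hash and eql?, so they must agree with ==.
  def hash
    [self.class, id].hash
  end
end

records = [
  ToyRecord.new(1, "first load"),
  ToyRecord.new(1, "second load"),
  ToyRecord.new(2, "other")
]
records.uniq.map(&:id)   # => [1, 2]
```

The two objects with id 1 collapse into one even though their other attributes differ.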
A Warning about Association Names
Don’t create associations that have the same name as instance methods of ActiveRecord::Base. Since the association adds a method with that name to its model, it will override the inherited method and break things. For instance, attributes and connection would make really bad choices for association names.
The belongs_to class method expresses a relationship from one Active Record object to a single associated object for which it has a foreign key attribute. The trick to remembering whether a class “belongs to” another one is considering which has the foreign key column in its database table.
Assigning an object to a belongs_to association will set its foreign key attribute to the owner object’s id but will not save the record to the database automatically, as in the following example:
>> timesheet = Timesheet.create
=> #<Timesheet id: 1409, user_id: nil...>
>> timesheet.user = obie
=> #<User id: 1, login: "obie"...>
>> timesheet.user.login
=> "obie"
>> timesheet.reload
=> #<Timesheet id: 1409, user_id: nil...>
Defining a belongs_to relationship on a class creates a method with the same name on its instances. As mentioned earlier, the method is actually a proxy to the related Active Record object and adds capabilities useful for manipulating the relationship.
Just invoking the association method will query the database (if necessary) and return an instance of the related object. The method takes a force_reload parameter that tells Active Record whether to reload the related object, if it happens to have been cached already by a previous access.
In the following capture from my console, I look up a timesheet and view the object_id of its related user object. Notice that the second time I invoke the association via user, the object_id remains the same. The related object has been cached. However, passing true to the accessor reloads the relationship, and I get a new instance.
>> ts = Timesheet.first
=> #<Timesheet id: 3, user_id: 1...>
>> ts.user.object_id
=> 70279541443160
>> ts.user.object_id
=> 70279541443160
>> ts.user(true).object_id
=> 70279549419740
During the belongs_to method’s metaprogramming, it also adds factory methods for creating new instances of the related class and attaching them via the foreign key automatically.
The build_association method does not save the new object, but the create_association method does. Both methods take an optional hash of attribute parameters with which to initialize the newly instantiated objects. Both are essentially one-line convenience methods, which I don’t find particularly useful. It just doesn’t usually make sense to create instances in that direction!
To illustrate, I’ll simply show the code for building a User from a Timesheet or creating a Client from a BillingCode, neither of which would ever happen in real code because it just doesn’t make sense to do so:
>> ts = Timesheet.first
=> #<Timesheet id: 3, user_id: 1...>
>> ts.build_user
=> #<User id: nil, email: nil...>
>> bc = BillingCode.first
=> #<BillingCode id: 1, code: "TRAVEL"...>
>> bc.create_client
=> #<Client id: 1, name: nil, code: nil...>
You’ll find yourself creating instances of belonging objects from the has_many side of the relationship much more often.
The following options can be passed in a hash to the belongs_to method.
The :autosave option indicates whether to automatically save the owning record whenever this record is saved. Defaults to false.
Assume for a moment that we wanted to establish another belongs_to relationship from the Timesheet class to User, this time modeling the relationship to the approver of the timesheet. You might start by adding an approver_id column to the timesheets table and an authorized_approver column to the users table via a migration. Then you would add a second belongs_to declaration to the Timesheet class:
class Timesheet < ActiveRecord::Base
  belongs_to :approver
  belongs_to :user
  ...
Active Record won’t be able to figure out what class you’re trying to link with from just the information provided, because you’ve (legitimately) acted against the Rails convention of naming a relationship according to the related class. It’s time for a :class_name parameter.
class Timesheet < ActiveRecord::Base
  belongs_to :approver, class_name: 'User'
  belongs_to :user
  ...
Use the :counter_cache option to make Rails automatically update a counter field on the associated object with the number of belonging objects. The option value can be true, in which case the pluralized name of the belonging class plus _count is used, or you can supply your own column name to be used:
counter_cache: true
counter_cache: :number_of_children
If a significant percentage of your association collections will be empty at any given moment, you can optimize performance at the cost of some extra database storage by using counter caches liberally. The reason is that when the counter cache attribute is at zero, Rails won’t even try to query the database for the associated records!
Note
The value of the counter cache column must be set to zero by default in the database! Otherwise, the counter caching won’t work at all. That is because Rails implements the counter caching behavior by adding a simple callback that goes directly to the database with an UPDATE command and increments the value of the counter. If you’re not careful and neglect to set a default value of zero for the counter cache column on the database or misspell the column name, the counter cache will still seem to work! There is a magic method on all classes with has_many associations called collection_count, just like the counter cache. It will return a correct count value based on the in-memory object, even if you don’t have a counter cache option set or the counter cache column value is null!
In the case that a counter cache was altered on the database side, you may tell Active Record to reset a potentially stale value to the correct count via the class method reset_counters. Its parameters are the id of the object and a list of association names.
Timesheet.reset_counters(5, :weeks)
The :dependent option specifies a rule that the associated owner record should be destroyed or just deleted from the database, depending on the value of the option. When triggered, :destroy will call the dependent’s callbacks, whereas :delete will not.
Usage of this option might make sense in a has_one / belongs_to pairing. However, it is really unlikely that you want this behavior on a has_many / belongs_to relationship; it just doesn’t seem to make sense to code things that way. Additionally, if the owner record has its :dependent option set on the corresponding has_many association, then destroying one associated record will have the ripple effect of destroying all its siblings.
The :foreign_key option specifies the name of the foreign key column that should be used to find the associated object. Rails will normally infer this setting from the name of the association by adding _id to it. You can override the inferred foreign key name with this option if necessary.
# Without the explicit option, Rails would guess administrator_id.
belongs_to :administrator, foreign_key: 'admin_user_id'
The :inverse_of option explicitly declares the name of the inverse association in a bidirectional relationship. Considered an optimization, use of this option allows Rails to return the same instance of an object no matter which side of the relationship it is accessed from.
This is covered in detail in the section “inverse_of: name_of_belongs_to_association” in this chapter.
Use the :polymorphic option to specify that an object is related to its association in a polymorphic way, which is the Rails way of saying that the type of the related object is stored in the database along with its foreign key. By making a belongs_to relationship polymorphic, you abstract out the association so that any other model in the system can fill it.
Polymorphic associations let you trade some measure of relational integrity for the convenience of implementation in child relationships that are reused across your application. Common examples are models such as photo attachments, comments, notes, line items, and so on.
Let’s illustrate by writing a Comment class that attaches to its subjects polymorphically. We’ll associate it to both expense reports and timesheets. Listing 7.1 has the schema information in migration code, followed by the code for the classes involved. Notice the :subject_type column, which stores the class name of the associated class.
create_table :comments do |t|
  t.text :body
  t.references :subject, polymorphic: true

  # References can be used as a shortcut for the following two statements.
  # t.integer :subject_id
  # t.string :subject_type

  t.timestamps
end

class Comment < ActiveRecord::Base
  belongs_to :subject, polymorphic: true
end

class ExpenseReport < ActiveRecord::Base
  belongs_to :user
  has_many :comments, as: :subject
end

class Timesheet < ActiveRecord::Base
  belongs_to :user
  has_many :comments, as: :subject
end
As you can see in the ExpenseReport and Timesheet classes of Listing 7.1, there is a corresponding syntax where you give Active Record a clue that the relationship is polymorphic by specifying as: :subject. We haven’t covered has_many’s options yet in this chapter, and polymorphic relationships have their own section in Chapter 9, “Advanced Active Record.”
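The idea of storing a class name next to a foreign key can be sketched in plain Ruby. Everything here is a hypothetical stand-in: the Struct-based classes replace real models, REGISTRY replaces the database tables, and subject_for is an illustrative helper rather than anything Rails provides.

```ruby
# Plain-Ruby sketch of resolving a (subject_type, subject_id) pair.
# The Struct classes and REGISTRY are hypothetical stand-ins for models
# and tables; Active Record performs the equivalent lookup for you.
Timesheet     = Struct.new(:id)
ExpenseReport = Struct.new(:id)
Comment       = Struct.new(:body, :subject_type, :subject_id)

REGISTRY = {
  Timesheet     => { 1 => Timesheet.new(1) },
  ExpenseReport => { 1 => ExpenseReport.new(1) }
}

def subject_for(comment)
  klass = Object.const_get(comment.subject_type)   # "Timesheet" -> Timesheet
  REGISTRY.fetch(klass).fetch(comment.subject_id)
end

a = Comment.new("Looks good", "Timesheet", 1)
b = Comment.new("Missing receipt", "ExpenseReport", 1)
subject_for(a)   # the Timesheet with id 1
subject_for(b)   # the ExpenseReport with id 1
```

Because the type travels as a string, the database cannot enforce a foreign key constraint on subject_id, which is the loss of relational integrity mentioned earlier.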
You should never need to use the :primary_key option, except perhaps with strange legacy database schemas. It allows you to specify a surrogate column on the owning record to use as the target of the foreign key instead of the usual primary key.
The :touch option “touches” the owning record’s updated_at timestamp or a specific timestamp column specified by column_name, if it is supplied. Useful for caching schemes where timestamps are used to invalidate cached view content. The column_name option is particularly useful here if you want to do fine-grained fragment caching of the owning record’s view.
For example, let’s set the foundation for doing just that with the user/timesheet association:
$ rails generate migration AddTimesheetsUpdatedAtToUsers timesheets_updated_at:datetime
invoke active_record
create db/migrate/20130413175038_add_timesheets_updated_at_to_users.rb
$ rake db:migrate
== AddTimesheetsUpdatedAtToUsers: migrating ===================
-- add_column(:users, :timesheets_updated_at, :datetime)
-> 0.0005s
== AddTimesheetsUpdatedAtToUsers: migrated (0.0005s) ==========
class Timesheet < ActiveRecord::Base
  belongs_to :user, touch: :timesheets_updated_at
  ...
The :validate option defaults to false on belongs_to associations, contrary to its counterpart setting on has_many. It tells Active Record to validate the owner record, but only in circumstances where it would normally save the owning record, such as when the record is new and a save is required in order to get a foreign key value.
Tim Says ...
Use validates_associated if you want association validation outside of automatic saving.
Sometimes the need arises to have a relationship that must satisfy certain conditions in order for it to be valid. To facilitate this, Rails allows us to supply chained query criteria, or a scope, to a relationship definition as an optional second argument in the form of a lambda. Active Record scopes are covered in detail in Chapter 9, “Advanced Active Record.”
To illustrate supplying a condition to a belongs_to relationship, let’s assume that the users table has a column approver:
class Timesheet < ActiveRecord::Base
  belongs_to :approver,
    -> { where(approver: true) },
    class_name: 'User'
  ...
end
Now in order for the assignment of a user to the approver field to work, that user must be authorized. I’ll go ahead and add a spec that both indicates the intention of my code and shows it in action. I turn my attention to spec/models/timesheet_spec.rb.
require 'spec_helper'

describe Timesheet do
  subject(:timesheet) { Timesheet.create }

  describe '#approver' do
    it 'may have a user associated as an approver' do
      timesheet.approver = User.create(approver: true)
      expect(timesheet.approver).to be
    end
  end
end
It’s a good start, but I also want to make sure something happens to prevent the system from assigning a nonauthorized user to the approver field, so I add another spec:
it 'cannot be associated with a nonauthorized user' do
  timesheet.approver = User.create(approver: false)
  expect(timesheet.approver).to_not be
end
I have my suspicions about the validity of that spec, though, and as I half expected, it doesn’t really work the way I want it to work:
1) Timesheet#approver cannot be associated with a nonauthorized user
Failure/Error: expect(timesheet.approver).to_not be
expected #<User id: 1, approver: false ...> to evaluate to false
The problem is that Active Record (for better or worse, probably worse) allows me to make the invalid assignment. The scope option only applies during the query to get the association back from the database. I’ll have some more work ahead of me to achieve the desired behavior, but I’ll go ahead and prove out Rails’ actual behavior by fixing my specs. I’ll do so by passing true to the approver method’s optional force_reload argument, which tells it to reload its target object:
describe Timesheet do
  subject(:timesheet) { Timesheet.create }

  describe '#approver' do
    it 'may have a user associated as an approver' do
      timesheet.approver = User.create(approver: true)
      timesheet.save
      expect(timesheet.approver(true)).to be
    end

    it 'cannot be associated with a nonauthorized user' do
      timesheet.approver = User.create(approver: false)
      timesheet.save
      expect(timesheet.approver(true)).to_not be
    end
  end
end
Those two specs do pass, but note that I went ahead and saved the timesheet, since just assigning a value to it will not save the record. Then, as mentioned, I took advantage of the force_reload parameter to make Rails reload approver from the database and not just simply give me the same instance I originally assigned to it.
The lesson to learn is that providing a scope on relationships never affects the assignment of associated objects, only how they’re read back from the database. To enforce the rule that a timesheet approver must be authorized, you’d need to add a before_save callback to the Timesheet class itself. Callbacks are covered in detail at the beginning of Chapter 9, “Advanced Active Record.”
In previous versions of Rails, relationship definitions had an :include option that would take a list of second-order association names (on the owning record) that should be eager loaded when the current object was loaded. As of Rails 4, the way to do this is supplying an includes query method to the scope argument of a relationship.
belongs_to :post, -> { includes(:author) }
In general, this technique is used to knock N+1 select operations down to one query plus one for each association being included. It is rare to use this technique on a belongs_to rather than on the has_many side.
If necessary, due to conditions or orders referencing tables other than the main one, a SELECT
statement with the necessary LEFT OUTER JOINS
will be constructed on the fly so that all the data needed to construct a whole object graph is queried in one big database request.
With judicious use of a relationship scope to include second-order associations and careful benchmarking, you can sometimes improve the performance of your application dramatically, mostly by eliminating N+1 queries. On the other hand, pulling lots of data from the database and instantiating large object trees can be very costly, so using an includes scope is no “silver bullet.” As they say, your mileage may vary.
The select scope method replaces the SQL select clause that is normally generated when loading this association, which usually takes the form table_name.*. This is just additional flexibility that is normally never needed.
The readonly scope method locks down the reference to the owning record so that you can’t modify it. Theoretically, this might make sense in terms of constraining your programming contexts very specifically, but I’ve never had a use for it. Still, for illustrative purposes, here is an example where I’ve made the user association on Timesheet readonly:
class Timesheet < ActiveRecord::Base
  belongs_to :user, -> { readonly }
  ...

>> t = Timesheet.first
=> #<Timesheet id: 1, submitted: nil, user_id: 1...>

>> t.user
=> #<User id: 1, login: "admin"...>

>> t.user.save
ActiveRecord::ReadOnlyRecord: ActiveRecord::ReadOnlyRecord
Just like it sounds, the has_many
association allows you to define a relationship in which one model has many other models that belong to it. The sheer readability of code constructs such as has_many
is a major reason that people fall in love with Rails.
The has_many
class method is often used without additional options. If Rails can guess the type of class in the relationship from the name of the association, no additional configuration is necessary. This bit of code should look familiar by now:
class User < ActiveRecord::Base
  has_many :timesheets
  has_many :expense_reports
end
The names of the associations can be singularized and match the names of models in the application, so everything works as expected.
Despite the ease of use of has_many
, there is a surprising amount of power and customization possible for those who know and understand the options available.
The :after_add callback is called after a record is added to the collection via the << method. This is not triggered by the collection’s create method, so careful consideration is needed when relying on association callbacks. A lambda callback is called directly, whereas a symbol names a method on the owning record that takes the newly added child as a parameter. It’s also possible to pass an array of lambdas or symbols.
Add callback method options to a has_many
by passing one or more symbols corresponding to method names or Proc
objects. See Listing 7.2 in the :before_add
option for an example.
The :after_remove callback is called after a record has been removed from the collection with the delete method. A lambda callback is called directly, whereas a symbol names a method on the owning record that takes the removed child as a parameter. It’s also possible to pass an array of lambdas or symbols. See Listing 7.2 in the :before_add option for an example.
The :as option specifies the polymorphic belongs_to association to use on the related class. (See Chapter 9, “Advanced Active Record,” for more about polymorphic relationships.)
The :autosave option indicates whether to automatically save all modified records in an association collection when the parent is saved. It defaults to false, but note that normal Active Record behavior is to save new association records automatically when the parent is saved.
The :before_add callback is triggered when a record is about to be added to the collection via the << method. (Remember that concat and push are aliases of <<.)
A lambda callback is called directly, whereas a symbol names a method on the owning record that takes the newly added child as a parameter. It’s also possible to pass an array of lambdas or symbols.
Raising an exception in the callback will stop the object from getting added to the collection (basically because the callback is triggered right after the type mismatch check and there is no rescue clause to be found inside <<
).
has_many :unchangable_posts,
  class_name: "Post",
  before_add: :raise_exception

private

def raise_exception(object)
  raise "You can't add a post"
end
Of course, that would have been a lot shorter code using a Proc
since it’s a one liner. The owner
parameter is the object with the association. The record
parameter is the object being added.
has_many :unchangable_posts,
class_name: "Post",
before_add: ->(owner, record) { raise "Can't do it!" }
Here it is one more time with a plain proc, which (unlike a lambda) doesn’t check the arity of block parameters:
has_many :unchangable_posts,
  class_name: "Post",
  before_add: proc { raise "You can't add a post" }
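Whether a callable tolerates arguments it does not declare depends on how it was created. The plain-Ruby sketch below involves no Rails at all, and the `invoke` helper is made up for illustration. It assumes the callback is invoked with two arguments, the owner and the record, as the preceding examples describe. On Ruby 1.9 and later, a lambda enforces its arity, while a block created with `proc` quietly ignores extra arguments.

```ruby
# Simulate a caller that always passes two arguments (owner, record).
# Returns the callback's result, or the ArgumentError class if the
# callback refused the arguments.
def invoke(callback)
  callback.call(:owner, :record)
rescue ArgumentError => e
  e.class
end

# A lambda with no parameters rejects the two arguments.
strict = lambda { :called }

# A non-lambda proc with no parameters ignores them.
lenient = proc { :called }

puts strict.lambda?           # true
puts lenient.lambda?          # false
puts invoke(strict).inspect   # ArgumentError
puts invoke(lenient).inspect  # :called
```

Either form still raises an exception in this scenario, so the record would not be added, but only the proc raises the message you wrote yourself.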
The :before_remove callback is called before a record is removed from a collection with the delete method. See before_add for more information. As with :before_add, raising an exception stops the remove operation.
class User < ActiveRecord::Base
  has_many :timesheets,
    before_remove: :check_timesheet_destruction,
    dependent: :destroy

  protected

  def check_timesheet_destruction(timesheet)
    if timesheet.submitted?
      raise TimesheetError, "Cannot destroy a submitted timesheet."
    end
  end
end
Note that this is a somewhat contrived example, because it violates my sense of good object-oriented principles. The User
class shouldn’t really be responsible for knowing when it’s OK to delete a timesheet or not. The check_timesheet_destruction
method would more properly be added as a before_destroy
callback on the Timesheet
class.
The :class_name
option is common to all the associations. It allows you to specify, as a string, the name of the class of the association and is needed when the class name cannot be inferred from the name of the association itself.
has_many :draft_timesheets, -> { where(submitted: false) },
class_name: 'Timesheet'
With dependent: :delete_all, all associated objects are deleted in one fell swoop using a single SQL command. Note: While this option is much faster than :destroy, it doesn’t trigger any destroy callbacks on the associated objects, so you should use this option very carefully. It should only be used on associations that depend solely on the parent object.
With dependent: :destroy, all associated objects are destroyed along with the parent object by iteratively calling their destroy methods.
The default behavior when deleting a record with has_many
associations is to leave those associated records alone. Their foreign key fields will still point at the record that was deleted. The :nullify
option tells Active Record to nullify, or clear, the foreign key that joins them to the parent record.
With dependent: :restrict_with_exception, Rails raises an ActiveRecord::DeleteRestrictionError exception if associated objects are present when the parent object is destroyed.
With dependent: :restrict_with_error, an error is added to the parent object if any associated objects are present, rolling back the deletion from the database.
The :foreign_key option overrides the convention-based foreign key column name that would normally be used in the SQL statement that loads the association. Normally it would be the owning record’s class name with _id appended to it.
The :inverse_of option explicitly declares the name of the inverse association in a bidirectional relationship. Considered an optimization, use of this option allows Rails to return the same instance of an object no matter which side of the relationship it is accessed from.
Consider the following, using our recurring example without the use of inverse_of.
>> user = User.first
>> timesheet = user.timesheets.first
=> #<Timesheet id: 1, user_id: 1...>
>> timesheet.user.equal? user
=> false
If we add :inverse_of to the association declaration on User, like
has_many :timesheets, inverse_of: :user
then timesheet.user.equal?(user) will be true. Try something similar in one of your apps to see it for yourself.
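What `equal?` tests is object identity, not attribute equality. As a plain-Ruby illustration that does not touch Active Record (the `Row` struct is a hypothetical stand-in for a model instance), two copies of the same row compare as equal by value yet remain separate objects, so a change made through one is invisible through the other. That is the situation `inverse_of` is meant to avoid.

```ruby
# Hypothetical stand-in for a model instance; not an Active Record class.
Row = Struct.new(:id, :login)

# Two objects built from the same database row, as happens when each side
# of a relationship loads its own copy.
from_query       = Row.new(1, "admin")
from_association = Row.new(1, "admin")

puts from_query == from_association       # true: same attribute values
puts from_query.equal?(from_association)  # false: two separate objects

# A change made through one copy is not seen through the other.
from_association.login = "root"
puts from_query.login                     # admin
```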
The :primary_key option specifies a surrogate key to use instead of the owning record’s primary key, whose value should be used when querying to fill the association collection.
The :source and :source_type options are used only with has_many :through associations; :source_type in particular assists with a polymorphic belongs_to. Both are covered in detail later in this chapter.
The :through option creates an association collection via another association. See the section in this chapter titled “has_many :through” for more information.
In cases where the child records in the association collection would be automatically saved by Active Record, the :validate option (true by default) dictates whether to ensure that they are valid. If you always want to check the validity of associated records when saving the owning record, then use validates_associated :association_name.
The has_many
association provides the ability to customize the query used by the database to retrieve the association collection. This is achieved by passing a scope block to the has_many
method definition using any of the standard Active Record query methods, as covered in Chapter 5, “Working with Active Record.” In this section, we’ll cover the most common scope methods used with has_many
associations.
Using the query method where
, one could add extra conditions to the Active Record–generated SQL query that brings back the objects in the association.
You can apply extra conditions to an association for a variety of reasons. How about approval of comments?
has_many :comments, -> { where(approved: true) }
Plus, there’s no rule that you can’t have more than one has_many
association exposing the same two related tables in different ways. Just remember that you’ll probably have to specify the class name too.
has_many :pending_comments, -> { where(approved: false) },
  class_name: 'Comment'
The extending query method specifies one or more modules with methods that will extend the association collection proxy. This is used as an alternative to defining additional methods in a block passed to the has_many method itself. It is discussed in the section “Association Extensions” in this chapter.
The group query method adds a GROUP BY SQL clause to the queries used to load the contents of the association collection.
The having query method must be used in conjunction with the group query method and adds extra conditions to the resulting SQL query used to load the contents of the association collection.
The includes query method takes an array of second-order association names that should be eager loaded when this collection is loaded. With judicious use of the includes query method and careful benchmarking, you can sometimes improve the performance of your application dramatically.
To illustrate, let’s analyze how includes
affects the SQL generated while navigating relationships. We’ll use the following simplified versions of Timesheet
, BillableWeek
, and BillingCode
:
class Timesheet < ActiveRecord::Base
  has_many :billable_weeks
end

class BillableWeek < ActiveRecord::Base
  belongs_to :timesheet
  belongs_to :billing_code
end

class BillingCode < ActiveRecord::Base
  belongs_to :client
  has_many :billable_weeks
end
First, I need to set up my test data, so I create a timesheet instance and add a couple of billable weeks to it. Then I assign a billing code to each billable week, which results in an object graph (with five objects linked together via associations).
Next I do a fancy one-line collect
, which gives me an array of the billing codes associated with the timesheet:
>> Timesheet.find(3).billable_weeks.collect(&:code)
=> ["TRAVEL", "DEVELOPMENT"]
Without the includes
scope method set on the billable_weeks
association of Timesheet
, that operation cost me the following four database hits (copied from log/development.log
and prettied up a little):
Timesheet Load (0.3ms) SELECT timesheets.* FROM timesheets WHERE
(timesheets.id = 3) LIMIT 1
BillableWeek Load (1.3ms) SELECT billable_weeks.* FROM billable_weeks WHERE
(billable_weeks.timesheet_id = 3)
BillingCode Load (1.2ms) SELECT billing_codes.* FROM billing_codes WHERE
(billing_codes.id = 7) LIMIT 1
BillingCode Load (3.2ms) SELECT billing_codes.* FROM billing_codes WHERE
(billing_codes.id = 8) LIMIT 1
This demonstrates the “N+1 select” problem that inadvertently plagues many systems. After the one query that loads the billable weeks, it costs me one more select statement per billable week (N in total) to retrieve the associated billing codes. Now let’s provide the billable_weeks association a scope block using includes, after which the Timesheet class looks as follows:
class Timesheet < ActiveRecord::Base
  has_many :billable_weeks, -> { includes(:billing_code) }
end
Simple! Rerunning our test statement yields the same results in the console:
>> Timesheet.find(3).billable_weeks.collect(&:code)
=> ["TRAVEL", "DEVELOPMENT"]
But look at how different the generated SQL is:
Timesheet Load (0.4ms) SELECT timesheets.* FROM timesheets WHERE (timesheets.id
= 3) LIMIT 1
BillableWeek Load (0.6ms) SELECT billable_weeks.* FROM billable_weeks WHERE
(billable_weeks.timesheet_id = 3)
BillingCode Load (2.1ms) SELECT billing_codes.* FROM billing_codes WHERE
(billing_codes.id IN (7,8))
Active Record smartly figures out exactly which BillingCode
records it will need and pulls them in using one query. For large datasets, the performance improvement can be quite dramatic!
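The difference in query counts can be modeled without a database. The following plain-Ruby sketch uses a hypothetical `FakeDatabase` class that simply counts lookups. It illustrates the access pattern only, not how Active Record is implemented, and it leaves out the initial Timesheet load that appears in the logs above.

```ruby
# Hypothetical in-memory "database" that counts every query issued.
class FakeDatabase
  attr_reader :query_count

  def initialize
    @query_count = 0
    @billable_weeks = [
      { id: 1, timesheet_id: 3, billing_code_id: 7 },
      { id: 2, timesheet_id: 3, billing_code_id: 8 }
    ]
    @billing_codes = { 7 => "TRAVEL", 8 => "DEVELOPMENT" }
  end

  # One query: all billable weeks for a timesheet.
  def weeks_for(timesheet_id)
    @query_count += 1
    @billable_weeks.select { |week| week[:timesheet_id] == timesheet_id }
  end

  # One query per call: a single billing code by id.
  def code(id)
    @query_count += 1
    @billing_codes[id]
  end

  # One query for many ids, like WHERE id IN (...).
  def codes(ids)
    @query_count += 1
    @billing_codes.select { |id, _name| ids.include?(id) }
  end
end

# Lazy loading: one query for the weeks, then one per week for its code.
def lazy_codes(db, timesheet_id)
  db.weeks_for(timesheet_id).map { |week| db.code(week[:billing_code_id]) }
end

# Eager loading: one query for the weeks, one for all of their codes.
def eager_codes(db, timesheet_id)
  weeks = db.weeks_for(timesheet_id)
  codes = db.codes(weeks.map { |week| week[:billing_code_id] })
  weeks.map { |week| codes[week[:billing_code_id]] }
end

lazy_db = FakeDatabase.new
lazy_codes(lazy_db, 3)
puts lazy_db.query_count   # 3 here; grows with the number of weeks

eager_db = FakeDatabase.new
eager_codes(eager_db, 3)
puts eager_db.query_count  # 2, however many weeks there are
```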
It’s generally easy to find N+1 select issues just by watching the log scroll by while clicking through the different screens of your application. (Of course, make sure that you’re looking at realistic data or the exercise will be pointless.) Screens that might benefit from eager loading will cause a flurry of single-row SELECT
statements—one for each record in a given association being used.
If you’re feeling particularly daring (perhaps masochistic is a better term), you can try including a deep hierarchy of associations by mixing hashes into your includes
query method, like in this fictional example from a bulletin board:
has_many :posts, -> { includes([:author, {comments: {author: :avatar }}]) }
That example snippet will grab not only all the comments for a Post
but all their authors and avatar pictures as well. You can mix and match symbols, arrays, and hashes in any combination to describe the associations you want to load.
The biggest potential problem with “deep” includes is pulling too much data out of the database. You should always start out with the simplest solution that will work and then use benchmarking and analysis to figure out if optimizations such as eager loading help improve your performance.
Wilson Says ...
Let people learn eager loading by crawling across broken glass, like we did. It builds character!
The limit query method appends a LIMIT clause to the SQL generated for loading this association. This option is potentially useful in capping the size of very large association collections. Use in conjunction with the order query method to make sure you’re grabbing the most relevant records.
The offset query method takes an integer determining the offset from where the rows should be fetched when loading the association collection. I assume this is here mostly for completeness, since it’s hard to envision a valid use case.
The order query method specifies the order in which the associated objects are returned via an “ORDER BY” SQL fragment, such as "last_name, first_name DESC".
The readonly query method sets all records in the association collection to readonly mode, which prevents saving them.
The select query method sets the columns in the SELECT clause. By default, this is * as in SELECT * FROM but can be changed if you, for example, want to add additional calculated columns or “piggyback” additional columns that will be joined with the associated object as it is loaded.
The distinct query method strips duplicate objects from the collection. Sometimes useful in conjunction with has_many :through.
Associating persistent objects via a join table can be one of the trickier aspects of object-relational mapping to implement correctly in a framework. Rails has a couple of techniques that let you represent many-to-many relationships in your model. We’ll start with the older and simpler has_and_belongs_to_many
and then cover the newer has_many :through
.
Before proceeding with this section, I must clear my conscience by stating that has_and_belongs_to_many
is practically obsolete in the minds of many Rails developers, including the authors of this book. Use has_many :through
instead and your life should be a lot easier. The section is preserved in this edition almost exactly as it appeared in the previous editions because it contains good techniques that enlighten the reader about nuances of Active Record behavior.
The has_and_belongs_to_many
method establishes a link between two associated Active Record models via an intermediate join table. Unless the join table is explicitly specified as an option, Rails guesses its name by concatenating the table names of the joined classes in alphabetical order and separated with an underscore.
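The naming convention just described can be sketched in a few lines of plain Ruby. The `default_join_table_name` helper below is hypothetical and exists only to illustrate the rule as stated here; the real implementation in Rails may handle cases this sketch does not.

```ruby
# Simplified model of the convention: sort the two table names
# alphabetically and join them with an underscore.
def default_join_table_name(first_table, second_table)
  [first_table, second_table].map(&:to_s).sort.join("_")
end

# The order of the arguments does not matter.
puts default_join_table_name(:timesheets, :billing_codes)  # billing_codes_timesheets
puts default_join_table_name(:billing_codes, :timesheets)  # billing_codes_timesheets
```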
For example, if I was using has_and_belongs_to_many
(or habtm
for short) to establish a relationship between Timesheet
and BillingCode
, the join table would be named billing_codes_timesheets
and the relationship would be defined in the models. Both the migration class and models are listed:
class CreateBillingCodesTimesheets < ActiveRecord::Migration
  def change
    create_table :billing_codes_timesheets, id: false do |t|
      t.references :billing_code, null: false
      t.references :timesheet, null: false
    end
  end
end

class Timesheet < ActiveRecord::Base
  has_and_belongs_to_many :billing_codes
end

class BillingCode < ActiveRecord::Base
  has_and_belongs_to_many :timesheets
end
Note that an id
primary key is not needed; hence, the id: false
option was passed to the create_table
method. Also, since the foreign key columns are both needed, we pass them a null: false
option. (In real code, you would also want to make sure both of the foreign key columns were indexed properly.)
Kevin Says ...
A new migration method create_join_table was added to Rails 4 to create a join table whose name is derived from the lexical order of the first two arguments. The migration in the preceding code example is equivalent to the following:
class CreateBillingCodesTimesheets < ActiveRecord::Migration
  def change
    create_join_table :billing_codes, :timesheets
  end
end
What about self-referential many-to-many relationships? Linking a model to itself via a habtm
relationship is easy—you just have to provide explicit options. In Listing 7.3, I’ve created a join table and established a link between related BillingCode
objects. Again, both the migration and model class are listed:
class CreateRelatedBillingCodes < ActiveRecord::Migration
  def change
    create_table :related_billing_codes, id: false do |t|
      t.column :first_billing_code_id, :integer, null: false
      t.column :second_billing_code_id, :integer, null: false
    end
  end
end

class BillingCode < ActiveRecord::Base
  has_and_belongs_to_many :related,
    join_table: 'related_billing_codes',
    foreign_key: 'first_billing_code_id',
    association_foreign_key: 'second_billing_code_id',
    class_name: 'BillingCode'
end
It’s worth noting that the related
relationship of the BillingCode
in Listing 7.3 is not bidirectional. Just because you associate two objects in one direction does not mean they’ll be associated in the other direction. But what if you need to automatically establish a bidirectional relationship?
First let’s write a spec for the BillingCode class to prove our solution. When we add bidirectional behavior, we don’t want to break the normal behavior, so at first my spec example establishes that the normal habtm relationship works:
describe BillingCode do
  let(:travel_code) { BillingCode.create(code: 'TRAVEL') }
  let(:dev_code) { BillingCode.create(code: 'DEV') }

  it "has a working related habtm association" do
    travel_code.related << dev_code
    expect(travel_code.reload.related).to include(dev_code)
  end
end
I run the spec and it passes. Now I can modify the example to prove that the bidirectional behavior that we’re going to add works. It ends up looking very similar to the first example.
describe BillingCode do
  let(:travel_code) { BillingCode.create(code: 'TRAVEL') }
  let(:dev_code) { BillingCode.create(code: 'DEV') }

  it "has a bidirectional habtm association" do
    travel_code.related << dev_code
    expect(travel_code.reload.related).to include(dev_code)
    expect(dev_code.reload.related).to include(travel_code)
  end
end
Of course, the new version fails, since we haven’t added the new behavior yet. I’ll omit the output of running the spec, since it doesn’t tell us anything we don’t know already.
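One possible way to get the bidirectional behavior is an add callback that inserts the reverse link whenever a link is created. The plain-Ruby sketch below models only that idea. `Code`, `add_related`, and `after_add` are hypothetical stand-ins rather than Active Record API, and a real solution would also have to write the reverse row to the join table.

```ruby
# Hypothetical plain-Ruby stand-in for a model with a "related" collection.
class Code
  attr_reader :name, :related

  def initialize(name)
    @name = name
    @related = []
  end

  # Append to the collection, then fire the after-add hook.
  # The include? guard stops the two objects from calling each other forever.
  def add_related(other)
    return if @related.include?(other)
    @related << other
    after_add(other)
  end

  private

  # Mirror the link on the other side of the relationship.
  def after_add(other)
    other.add_related(self)
  end
end

travel = Code.new("TRAVEL")
dev    = Code.new("DEV")
travel.add_related(dev)

puts travel.related.map(&:name).inspect  # ["DEV"]
puts dev.related.map(&:name).inspect     # ["TRAVEL"]
```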
Rails won’t have a problem with you adding as many extra columns as you want to habtm
’s join table. The extra attributes will be read in and added onto model objects accessed via the habtm
association. However, speaking from experience, the severe annoyances you will deal with in your application code make it really unattractive to go that route.
What kind of annoyances? For one, records returned from join tables with additional attributes will be marked as readonly, because it’s not possible to save changes to those additional attributes.
You should also consider that the way that Rails makes those extra columns of the join table available might cause problems in other parts of your codebase. Having extra attributes appear magically on an object is kind of cool, but what happens when you try to access those extra properties on an object that wasn’t fetched via the habtm
association? Kaboom! Get ready for some potentially bewildering debugging exercises.
Methods of the habtm
proxy act just as they would for a has_many
relationship. Similarly, habtm
shares options with has_many
; only its :join_table
option is unique. It allows customization of the join table name.
To sum up, habtm
is a simple way to establish a many-to-many relationship using a join table. As long as you don’t need to capture additional data about the relationship, everything is fine. The problems with habtm
begin once you want to add extra columns to the join table, after which you’ll want to upgrade the relationship to use has_many :through
instead.
The Rails documentation advises readers that “it’s strongly recommended that you upgrade any [habtm
] associations with attributes to a real join model.” Use of habtm
, which was one of the original innovative features in Rails, fell out of favor once the ability to create real join models was introduced via the has_many :through
association.
Realistically, habtm
is not going to be removed from Rails for a couple of sensible reasons. First of all, plenty of legacy Rails applications need it. Second, habtm
provides a way to join classes without a primary key defined on the join table, which is occasionally useful. But most of the time you’ll find yourself wanting to model many-to-many relationships with has_many :through
.
Well-known Rails guy Josh Susser is considered the expert on Active Record associations—even his blog is called has_many :through
. His description of the :through
association, written back when the feature was originally introduced in Rails 1.1, is so concise and well-written that I couldn’t hope to do any better. So here it is:
The has_many :through
association allows you to specify a one-to-many relationship indirectly via an intermediate join table. In fact, you can specify more than one such relationship via the same table, which effectively makes it a replacement for has_and_belongs_to_many
. The biggest advantage is that the join table contains full-fledged model objects complete with primary keys and ancillary data. No more push_with_attributes
; join models just work the same way all your other Active Record models do.1
1. http://blog.hasmanythrough.com/2006/2/28/association-goodness
To illustrate the has_many :through
association, we’ll set up a Client
model so that it has many Timesheet
objects through a normal has_many
association named billable_weeks
.
class Client < ActiveRecord::Base
  has_many :billable_weeks
  has_many :timesheets, through: :billable_weeks
end
The BillableWeek
class was already in our sample application and is ready to be used as a join model:
class BillableWeek < ActiveRecord::Base
  belongs_to :client
  belongs_to :timesheet
end
We can also set up the inverse relationship, from timesheets to clients, like this:
class Timesheet < ActiveRecord::Base
  has_many :billable_weeks
  has_many :clients, through: :billable_weeks
end
Notice that has_many :through
is always used in conjunction with a normal has_many
association. Also, notice that the normal has_many
association will often have the same name on both classes that are being joined together, which means the :through
option will read the same on both sides.
through: :billable_weeks
How about the join model; will it always have two belongs_to
associations? No.
You can also use has_many :through to easily aggregate has_many or has_one associations on the join model. Forgive me for switching to a completely unrealistic domain for a moment; it’s only intended to clearly demonstrate what I’m trying to describe:
class Grandparent < ActiveRecord::Base
  has_many :parents
  has_many :grand_children, through: :parents, source: :children
end

class Parent < ActiveRecord::Base
  belongs_to :grandparent
  has_many :children
end
For the sake of clarity in later chapters, I’ll refer to this usage of has_many :through
as aggregating.
Courtenay Says ...
We use has_many :through
so much! It has pretty much replaced the old has_and_belongs_to_many
because it allows your join models to be upgraded to full objects. It’s like when you’re just dating someone and they start talking about the relationship (or, eventually, marriage). It’s an example of an association being promoted to something more important than the individual objects on each side.
You can use nonaggregating has_many :through associations in almost the same ways as any other has_many associations. For instance, appending an object to a has_many :through collection will save the object as expected:
>> c = Client.create(name: "Trotter's Tomahawks", code: "ttom")
=> #<Client id: 5 ...>
>> c.timesheets << Timesheet.new
=> #<ActiveRecord::Associations::CollectionProxy [#<Timesheet id: 2 ...>]>
The main benefit of has_many :through
is that Active Record takes care of managing the instances of the join model for you. If we call reload
on the billable_weeks
association, we’ll see that there was a billable week object created for us:
>> c.billable_weeks.reload.to_a
=> [#<BillableWeek id: 2, tuesday_hours: nil, start_date: nil,
timesheet_id: 2, billing_code_id: nil, sunday_hours: nil,
friday_hours: nil, monday_hours: nil, client_id: 2, wednesday_hours: nil,
saturday_hours: nil, thursday_hours: nil>]
The BillableWeek
object that was created is properly associated with both the client and the Timesheet
. Unfortunately, there are a lot of other attributes (e.g., start_date
and the hours columns) that were not populated.
One possible solution is to use create
on the billable_weeks
association instead and include the new Timesheet
object as one of the supplied properties.
>> bw = c.billable_weeks.create(start_date: Time.now,
timesheet: Timesheet.new)
When you’re using has_many :through to aggregate multiple child associations, there are more significant limitations. Essentially, you can query to your heart’s content using find and friends, but you can’t append or create new records through them.
For example, let’s add a billable_weeks
association to our sample User
class:
class User < ActiveRecord::Base
  has_many :timesheets
  has_many :billable_weeks, through: :timesheets
  ...
The billable_weeks
association aggregates all the billable week objects belonging to the user’s timesheets.
class Timesheet < ActiveRecord::Base
  belongs_to :user
  has_many :billable_weeks, -> { includes(:billing_code) }
  ...
Now let’s go into the Rails console and set up some example data so that we can use the new billable_weeks
collection (on User
).
>> quentin = User.first
=> #<User id: 1, login: "quentin" ...>
>> quentin.timesheets.to_a
=> []
>> ts1 = quentin.timesheets.create
=> #<Timesheet id: 1 ...>
>> ts2 = quentin.timesheets.create
=> #<Timesheet id: 2 ...>
>> ts1.billable_weeks.create(start_date: 1.week.ago)
=> #<BillableWeek id: 1, timesheet_id: 1 ...>
>> ts2.billable_weeks.create(start_date: 2.week.ago)
=> #<BillableWeek id: 2, timesheet_id: 2 ...>
>> quentin.billable_weeks.to_a
=> [#<BillableWeek id: 1, timesheet_id: 1 ...>, #<BillableWeek id: 2,
timesheet_id: 2 ...>]
Just for fun, let’s see what happens if we try to create a BillableWeek
with a User
instance:
>> quentin.billable_weeks.create(start_date: 3.weeks.ago)
ActiveRecord::HasManyThroughCantAssociateThroughHasOneOrManyReflection:
Cannot modify association 'User#billable_weeks' because the source
reflection class 'BillableWeek' is associated to 'Timesheet' via :has_many.
There you go. Since BillableWeek
only belongs to a timesheet and not a user, Rails raises a HasManyThroughCantAssociateThroughHasOneOrManyReflection
exception.
When you append to a nonaggregating has_many :through
association with <<
, Active Record will always create a new join model, even if one already exists for the two records being joined. You can add validates_uniqueness_of
constraints on the join model to keep duplicate joins from happening.
This is what such a constraint might look like on our BillableWeek
join model.
validates_uniqueness_of :client_id, scope: :timesheet_id
That says, in effect, “There should only be one of each client per timesheet.”
If your join model has additional attributes with their own validation logic, then there’s another important consideration to keep in mind. Adding records directly to a has_many :through
association causes a new join model to be automatically created with a blank set of attributes. Validations on additional columns of the join model will probably fail. If that happens, you’ll need to add new records by creating join model objects and associating them appropriately through their own association proxy.
timesheet.billable_weeks.create(start_date: 1.week.ago)
The options for has_many :through are the same as the options for has_many; remember that :through is just an option on has_many! However, the use of some of has_many’s options changes or becomes more significant when :through is used.
First of all, the :class_name
and :foreign_key
options are no longer valid since they are implied from the target association on the join model. The following are the rest of the options that have special significance together with has_many :through
.
The :source
option specifies which association to use on the associated class. This option is not mandatory because normally Active Record assumes that the target association is the singular (or plural) version of the has_many
association name. If your association names don’t match up, then you have to set :source
explicitly.
For example, the following code will use the BillableWeek
’s sheet
association to populate timesheets
.
has_many :timesheets, through: :billable_weeks, source: :sheet
The :source_type
option is needed when you establish a has_many :through
to a polymorphic belongs_to
association on the join model. Consider the following example concerning clients and contacts:
class Client < ActiveRecord::Base
  has_many :client_contacts
  has_many :contacts, through: :client_contacts
end

class ClientContact < ActiveRecord::Base
  belongs_to :client
  belongs_to :contact, polymorphic: true
end
In this somewhat contrived example, the most important fact is that a Client
has many contacts
through their polymorphic relationship to the join model, ClientContact
. There isn’t a Contact
class; we just want to be able to refer to contacts in a polymorphic sense, meaning either a Person
or a Business
.
class Person < ActiveRecord::Base
  has_many :client_contacts, as: :contact
end

class Business < ActiveRecord::Base
  has_many :client_contacts, as: :contact
end
Now take a moment to consider the backflips that Active Record would have to perform in order to figure out which tables to query for a client’s contacts. Remember that there isn’t a contacts table!
>> Client.first.contacts
Active Record would theoretically need to be aware of every model class that is linked to the other end of the contacts polymorphic association. In fact, it cannot do those kinds of backflips, which is probably a good thing as far as performance is concerned:
>> Client.first.contacts
ActiveRecord::HasManyThroughAssociationPolymorphicSourceError: Cannot have a
has_many :through association 'Client#contacts' on the polymorphic object
'Contact#contact' without 'source_type'.
The only way to make this scenario work (somewhat) is to give Active Record some help by specifying which table it should search when you ask for the contacts collection. You do that with the :source_type option, which names the target class. The value has to match what is stored in the polymorphic type column, so it is given as the class name:
class Client < ActiveRecord::Base
  has_many :client_contacts
  has_many :people, through: :client_contacts,
    source: :contact, source_type: 'Person'

  has_many :businesses, through: :client_contacts,
    source: :contact, source_type: 'Business'
end
After the :source_type is specified, the association will work as expected, but sadly we don’t get a general purpose contacts collection to work with, as it seemed might be possible at first.
>> Client.first.people.create!
=> #<Person id: 1>
If you’re upset that you cannot associate people and businesses together in a contacts association, you could try writing your own accessor method for a client’s contacts:
class Client < ActiveRecord::Base
  def contacts
    # Combines the two has_many :through collections declared on Client.
    people + businesses
  end
end
Of course, you should be aware that calling that contacts method will result in at least two database requests and will return an Array, without the association proxy methods that you might expect it to have.
The distinct scope method tells the association to include only unique objects. It is especially useful when using has_many :through, since two different BillableWeeks could reference the same Timesheet.
>> Client.first.timesheets.reload.to_a
=> [#<Timesheet id: 1...>, #<Timesheet id: 1...>]
It’s not extraordinary for two distinct model instances of the same database record to be in memory at the same time—it’s just not usually desirable.
class Client < ActiveRecord::Base
  has_many :timesheets, -> { distinct }, through: :billable_weeks
end
After adding the distinct scope to the has_many :through association, only one instance per record is returned.
>> Client.first.timesheets.reload.to_a
=> [#<Timesheet id: 1...>]
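To see where the duplicates come from in the first place, it can help to model the join in plain Ruby. The following is only an in-memory analogy, not what Active Record executes: the Struct definitions, the timesheets_through helper, and the sample data are all made up for illustration, and uniq stands in for what DISTINCT does on the SQL side.

```ruby
# Plain-Ruby stand-ins for the models; no database or Rails involved.
Timesheet    = Struct.new(:id)
BillableWeek = Struct.new(:id, :timesheet_id)

# Hypothetical helper: roughly what the join produces, namely one
# separately instantiated timesheet object per billable week.
def timesheets_through(billable_weeks, timesheets)
  billable_weeks.map do |week|
    sheet = timesheets.find { |t| t.id == week.timesheet_id }
    sheet.dup # a fresh object each time, like separate model instances
  end
end

sheets = [Timesheet.new(1), Timesheet.new(2)]
weeks  = [
  BillableWeek.new(10, 1),
  BillableWeek.new(11, 1), # a second week pointing at the same timesheet
  BillableWeek.new(12, 2)
]

joined = timesheets_through(weeks, sheets)
joined.map(&:id)              # => [1, 1, 2]
joined[0].equal?(joined[1])   # => false (two objects, one record)
joined.uniq(&:id).map(&:id)   # => [1, 2]
```

Each billable week contributes its own row to the result, which is why the same timesheet shows up once per week that references it.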
One of the most basic relationship types is a one-to-one object relationship. In Active Record we declare a one-to-one relationship using the has_one and belongs_to methods together. As in the case of a has_many relationship, you call belongs_to on the model whose database table contains the foreign key column linking the two records together.
Conceptually, has_one works almost exactly like has_many does, except that when the database query is executed to retrieve the related object, a LIMIT 1 clause is added to the generated SQL so that only one row is returned.
The name of a has_one relationship should be singular, which will make it read naturally: for example, has_one :last_timesheet, has_one :primary_account, has_one :profile_photo, and so on. Let’s take a look at has_one in action by adding avatars for our users.
class Avatar < ActiveRecord::Base
  belongs_to :user
end

class User < ActiveRecord::Base
  has_one :avatar
  # ... the rest of our User code ...
end
That’s simple enough. Firing this up in rails console, we can look at some of the new methods that has_one adds to User.
>> u = User.first
>> u.avatar
=> nil
>> u.build_avatar(url: '/avatars/smiling')
=> #<Avatar id: nil, url: "/avatars/smiling", user_id: 1>
>> u.avatar.save
=> true
As you can see, we can use build_avatar to build a new avatar object and associate it with the user. While it’s great that has_one will associate an avatar with the user, it isn’t really anything that has_many doesn’t already do. So let’s take a look at what happens when we assign a new avatar to the user.
>> u = User.first
>> u.avatar
=> #<Avatar id: 1, url: "/avatars/smiling", user_id: 1>
>> u.create_avatar(url: '/avatars/frowning')
=> #<Avatar id: 2, url: "/avatars/frowning", user_id: 1>
>> Avatar.all.to_a
=> [#<Avatar id: 1, url: "/avatars/smiling", user_id: nil>, #<Avatar id: 2, url:
"/avatars/frowning", user_id: 1>]
The last line from that console session is the most interesting, because it shows that our initial avatar is no longer associated with the user. Note, however, that the previous avatar was not removed from the database, which is what we would want in this scenario. So we’ll use the dependent: :destroy option to force avatars to be destroyed when they are no longer associated with a user.
class User < ActiveRecord::Base
  has_one :avatar, dependent: :destroy
end
With some additional fiddling around in the console, we can verify that it works as intended. In doing so, you might notice that Rails only destroys the avatar that was just removed from the user, so bad data that was in your database from before will still remain. Keep this in mind when you decide to add dependent: :destroy to your code and remember to manually clear orphaned data that might otherwise remain.
As I alluded to earlier, has_one is sometimes used to single out one record of significance alongside an already established has_many relationship. For instance, let’s say we want to easily be able to access the last timesheet a user was working on:
class User < ActiveRecord::Base
  has_many :timesheets

  has_one :latest_sheet,
    -> { order('created_at desc') },
    class_name: 'Timesheet'
end
I had to specify a :class_name so that Active Record knows what kind of object we’re associating. (It can’t figure it out based on the name of the association :latest_sheet.)
When adding a has_one relationship to a model that already has a has_many defined to the same related model, it is not necessary to add another belongs_to method call to the target object just for the new has_one. That might seem a little counterintuitive at first, but if you think about it, the same foreign key value is being used to read the data from the database.
The options for has_one associations are similar to the ones for has_many. For your convenience, we briefly cover the most relevant ones here.
The :as option allows you to set up a polymorphic association, covered in Chapter 9, “Advanced Active Record.”
The :class_name option allows you to specify the class this association uses. When you’re doing has_one :latest_timesheet, class_name: 'Timesheet', the class_name: 'Timesheet' part specifies that latest_timesheet is a Timesheet object associated with this user (which one is picked depends on the association’s scope). Normally, this option is inferred by Rails from the name of the association.
The :dependent option specifies how Active Record should treat associated objects when the parent object is deleted. (The default is to do nothing with associated objects, which will leave orphaned records in the database.) There are a few different values that you can pass, and they work just like the :dependent option of has_many. If you pass :destroy to it, you tell Rails to destroy the associated object when it is no longer associated with the primary object. Setting the :dependent option to :delete will delete the associated object directly from the database without running any of Rails’ normal callbacks. Passing :restrict_with_exception causes Rails to raise an exception if there is any associated object present, while :restrict_with_error adds an error to the owner object instead, so that the owner is not destroyed. Finally, :nullify will simply set the foreign key values to nil so that the relationship is broken.
The scopes for has_one associations are similar to the ones for has_many. For your convenience, we briefly cover the most relevant ones here.
The where scope method allows you to specify conditions that the object must meet to be included in the association.
class User < ActiveRecord::Base
  has_one :manager, -> { where(type: 'manager') },
    class_name: 'Person'
end
Here manager is specified as a person object that has type = 'manager'. I almost always use a where scope block in conjunction with has_one. When Active Record loads the association, it’s grabbing one of potentially many rows that have the right foreign key. Absent some explicit conditions (or perhaps an order scope), you’re leaving it in the hands of the database to pick a row.
The order scope method allows you to specify an SQL fragment that will be used to order the results. This is an especially useful option with has_one when trying to associate the latest of something or another.
class User < ActiveRecord::Base
  has_one :latest_timesheet,
    -> { order('created_at desc') },
    class_name: 'Timesheet'
end
The readonly scope method sets the record in the association to readonly mode, which prevents saving it.
You can manipulate objects and associations before they are saved to the database, but there is some special behavior you should be aware of, mostly involving the saving of associated objects. Whether an object is considered unsaved is based on the result of calling new_record?.
Assigning an object to a belongs_to association does not save the parent or the associated object.
Assigning an object to a has_one association automatically saves that object and the object being replaced (if there is one) so that their foreign key fields are updated. The exception to this behavior is if the parent object is unsaved, since that would mean that there is no foreign key value to set. If save fails for either of the objects being updated (due to one of them being invalid), the assignment operation returns false and the assignment is cancelled. That behavior makes sense (if you think about it), but it can be the cause of much confusion when you’re not aware of it. If you have an association that doesn’t seem to work, check the validation rules of the related objects.
Adding an object to has_many and has_and_belongs_to_many collections automatically saves it, unless the parent object (the owner of the collection) is not yet stored in the database.
If objects being added to a collection (via << or similar means) fail to save properly, then the addition operation will return false. If you want your code to be a little more explicit or you want to add an object to a collection without automatically saving it, then you can use the collection’s build method. It’s exactly like create except that it doesn’t save.
Members of a collection are automatically saved or updated when their parent is saved or updated, unless autosave: false is set on the association.
Associations that are set with an autosave: true option are also afforded the ability to have their records deleted when an inverse record is saved. This is to allow the records from both sides of the association to get persisted within the same transaction and is handled through the mark_for_destruction method. Consider our User and Timesheet models again:
class User < ActiveRecord::Base
  has_many :timesheets, autosave: true
end
If I would like to have a Timesheet destroyed when the User is saved, I mark it for destruction.
user = User.find_by(name: "Durran")      # a single user, not a relation
timesheet = user.timesheets.closed.first # a single timesheet
timesheet.mark_for_destruction # => Flags timesheet
user.save # => The timesheet gets deleted.
Since both are persisted in the same transaction, if the operation were to fail, the database would not be in an inconsistent state. Do note that although the child record did not get deleted in that case, it still would be marked for destruction and any later attempts to save the inverse would once again attempt to delete it.
The proxy objects that handle access to associations can be extended with your own application code. You can add your own custom finders and factory methods to be used specifically with a particular association.
For example, let’s say you wanted a concise way to refer to an account’s people by name. You may create an extension on the association like the following:
class Account < ActiveRecord::Base
  has_many :people do
    def named(full_name)
      first_name, last_name = full_name.split(" ", 2)
      where(first_name: first_name, last_name: last_name).first_or_create
    end
  end
end
Now we have a named method available to use on the people collection.
account = Account.first
person = account.people.named("David Heinemeier Hansson")
person.first_name # => "David"
person.last_name # => "Heinemeier Hansson"
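The second argument to split is what keeps multi-word surnames together. Here is that step pulled out into a standalone helper (split_full_name is a hypothetical name, used only for this sketch) so you can try it in irb without a database:

```ruby
# Hypothetical helper isolating the name-splitting step used by `named`.
def split_full_name(full_name)
  # A limit of 2 means at most two pieces: everything after the
  # first run of whitespace stays together in the second piece.
  full_name.split(" ", 2)
end

first_name, last_name = split_full_name("David Heinemeier Hansson")
first_name # => "David"
last_name  # => "Heinemeier Hansson"

# With a single-word name there is no second piece at all.
only_name, missing = split_full_name("Prince")
only_name # => "Prince"
missing   # => nil
```

Note that a single-word name leaves last_name as nil, so you may want to guard against that case before handing the values to where and first_or_create.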
If you need to share the same set of extensions between many associations, you can specify an extension module instead of a block with method definitions. Here is the same feature shown in Listing 7.4, except broken out into its own Ruby module:
module ByNameExtension
  def named(full_name)
    first_name, last_name = full_name.split(" ", 2)
    where(first_name: first_name, last_name: last_name).first_or_create
  end
end
Now we can use it to extend many different relationships as long as they’re compatible. (Our contract in the example consists of a model with columns first_name and last_name.)
class Account < ActiveRecord::Base
  has_many :people, -> { extending(ByNameExtension) }
end

class Company < ActiveRecord::Base
  has_many :people, -> { extending(ByNameExtension) }
end
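If the idea of extending a single collection rather than a whole class feels unfamiliar, plain Ruby’s Object#extend offers a rough analogy. In this sketch the PersonRow struct, the ByNameLookup module, and the arrays are stand-ins I’ve invented, and find replaces the where/first_or_create call, so treat it as an illustration of the contract rather than of Active Record itself:

```ruby
# Invented stand-in for a row with first_name and last_name columns.
PersonRow = Struct.new(:first_name, :last_name)

# Hypothetical module. Contract: the extended object is Enumerable and
# its members respond to first_name and last_name.
module ByNameLookup
  def named(full_name)
    first_name, last_name = full_name.split(" ", 2)
    find { |p| p.first_name == first_name && p.last_name == last_name }
  end
end

# Two unrelated collections pick up the same behavior.
account_people = [PersonRow.new("David", "Heinemeier Hansson")].extend(ByNameLookup)
company_people = [PersonRow.new("Josh", "Susser")].extend(ByNameLookup)

account_people.named("David Heinemeier Hansson").first_name # => "David"
company_people.named("Josh Susser").last_name               # => "Susser"

# Only the extended objects gain the method; other arrays are untouched.
[].respond_to?(:named) # => false
```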
If you need to use multiple named extension modules, you can pass several modules to the extending query method instead of a single module, like this:
has_many :people, -> { extending(ByNameExtension, ByRecentExtension) }
In the case of name conflicts, methods contained in modules added later in the array supersede those earlier in the array.
Unless you have a valid reason to reuse the extension logic with more than one type of model, you’re probably better off leveraging the fact that class methods are automatically available on has_many associations.
class Person < ActiveRecord::Base
  belongs_to :account

  def self.named(full_name)
    first_name, last_name = full_name.split(" ", 2)
    where(first_name: first_name, last_name: last_name).first_or_create
  end
end
CollectionProxy, the parent of all association proxies, contributes a handful of useful methods that apply to most kinds of associations and can come into play when you’re writing association extensions.
The owner method provides a reference to the parent object holding the association.
The reflection object is an instance of ActiveRecord::Reflection::AssociationReflection and contains all the configuration options for the association. That includes both default settings and those that were passed to the association method when it was declared.
Finally, the target is the associated collection of objects (or associated object itself in the case of belongs_to and has_one).
It might not appear sane to expose these attributes publicly and allow their manipulation. However, without access to them, it would be much more difficult to write advanced association extensions. The loaded?, loaded, target, and target= methods are public for similar reasons.
The following code sample demonstrates the use of owner within a published_prior_to extension method, originally contributed by Wilson Bilkovich:
class ArticleCategory < ActiveRecord::Base
  has_ancestry

  has_many :articles do
    def published_prior_to(date, options = {})
      # Inside an extension, the owner is reached through proxy_association.
      owner = proxy_association.owner
      if owner.is_root?
        Article.where('published_at < ? and category_id = ?', date, owner.id)
      else
        # Self is the "articles" association here, so we inherit its scope.
        self.all(options)
      end
    end
  end
end
The has_ancestry Active Record extension gem adds the ability to organize Active Record models as a tree structure. The self-referential association is based on an ancestry string column. The owner reference is used to check if the parent of this association is a “top-level” node in the tree.
The reset method puts the association proxy back in its initial state, which is unloaded (cached association objects are cleared). The reload method invokes reset and then loads associated objects from the database.
The ability to model associations is what makes Active Record more than just a data-access layer. The ease and elegance with which you can declare those associations are what make Active Record more than your ordinary object-relational mapper.
In this chapter, we covered the basics of how Active Record associations work. We started by taking a look at the class hierarchy of association classes, starting with CollectionProxy. Hopefully, by learning about how associations work under the hood, you’ve picked up some enhanced understanding about their power and flexibility.
Finally, the options and methods guide for each type of association should be a good reference guide for your day-to-day development activities.