“When we .NET developers say that getting into Rails is tough, this is the kind of stuff we’re talking about. [...] It’s the Linux shell, server applications, and other things we’re not used to that trip us up.”
—Brian Eng, http://www.softiesonrails.com/2007/4/5/the-absolute-moron-s-guide-to-capistrano
Necessity being the mother of invention, the story goes that Jamis Buck, in his work for 37signals, created SwitchTower (later renamed Capistrano[1]) when Basecamp grew to be hosted on more than one production server. It’s a tool for automating tasks on remote servers.
Although Capistrano is now firmly established as the standard solution to Rails deployment challenges, its multiserver transactional approach to running remote commands lends it to a fairly broad range of deployment scenarios.
This chapter delves into Capistrano (version 2[2]), clearly showing how it works out of the box to take care of tricky and time-consuming deployment chores. We also show what it doesn’t do for you and how you can extend it by crafting your own deployment recipes to customize Capistrano’s capabilities. To wrap up, we’ll explore a few great recipes created by the Capistrano community that have gained broad acceptance.
We’ll start with a high-level overview of Capistrano. The first stop on that journey is to take stock of Capistrano’s domain-specific terminology.
Capistrano has its own vocabulary that distinguishes it from simply being an extension of Rake, on which it is based:
One or many deployment machines are the computers to which we will deploy.

Recipes are what tasks are to Rake. Whether it comprises a single task or many, a recipe packages a desired solution; recipes are made up of tasks.

Tasks are atomic units of functionality, and exist to be called directly by an end user or internally by another task. Tasks sit within namespaces.

A Capistrano namespace groups tasks logically. A task name may therefore be repeated in different namespaces, which makes namespaces a great way for contributing authors to add recipes without fear of task-name collisions.

Roles such as :app and :db are a means of grouping task execution—we can target a task at a role so that it runs only against the machines in that role. Think of a role as a task qualifier, where the qualifier is generally a class of deployment machine such as :app, :db, or :web.

Variables are global values available to the rest of the script.
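This vocabulary maps directly onto the recipe DSL. A hypothetical fragment tying the terms together (application name and hosts invented for illustration) might read:

```ruby
# Hypothetical recipe fragment illustrating the vocabulary above.
set :application, "my_project"          # a variable, global to the recipe

role :app, "app1.example.com"           # roles group deployment machines
role :db,  "db1.example.com", :primary => true

namespace :utils do                     # a namespace groups related tasks
  desc "Show free disk space on the :app machines"
  task :disk_space, :roles => :app do   # a task, targeted at one role
    run "df -h"                         # executed remotely over SSH
  end
end
```

Together, such fragments make up a recipe.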
The first time I sat down to use Capistrano, I asked myself: What do I need to do to use Capistrano? What does it expect of my app? Of my server? And when a deployment is finished, what has Capistrano done and what hasn’t it?
The following sections answer these questions and should set in place the larger view of Capistrano in your project development workflow.
First, it’s important to note that Capistrano is only installed on your development (or “deploy from”) computer. Capistrano has no requirements to be installed on the deployment machine itself. It operates solely on its core requirements and assumptions of the deployment machine, and executes commands to the deployment machines through an established and secure SSH infrastructure.
So, the short answer is to consider the assumptions and requirements and address them. You’ll likely find your first few deployments are a little rough—trial and error. But over time I bet you’ll find Capistrano to be your key deployment ally.
Some requirements are mandatory and some are baseline assumptions that can be overridden. This core philosophy encourages best practices while not locking you in to them.
Capistrano’s core requirements are that you can reach your deployment machines over SSH and that each of them provides a POSIX-compatible shell.
The following assumptions are overridable:
You want to deploy a Rails application.
You’re using Subversion to manage source code.
You’ve got your production environment already built (operating system, Ruby, Rails, other gems, database, web/app/db servers).
Passwords are the same for your deployment machine and your svn repository.
You’ve created the deployment database plus a user that can access it.
You have all config files in Subversion ready to run in your production environment (including the username/password for the aforementioned deployment database).
Your migrations will run from 0 to 100 (deploy:migrate).
Your web/app servers are configured with a spin script to start/stop/restart the web app.
As mentioned, these requirements and assumptions are the defaults. If they don’t suit you, stay tuned, because we’ll show you how to build tasks and callbacks to customize Capistrano to suit your specific needs.
When you’re done getting Capistrano and its requirements addressed (either directly or through customization), you’re left in deployment Nirvana. Okay, that’s a little strong, but your code will be deployed from your repository, migrations run, Apache and your app server (Mongrel, fastcgi) will be booted, and well... now you’re deployed, and ready for that next deployment event.
So now you can start doing great things like updating your servers with the latest svn check-in using cap deploy:update, or go full out and update to the latest release and restart your app servers using cap deploy. You can even put up a maintenance page during extended downtime using cap deploy:web:disable, and roll back when you mess up a deployment using cap deploy:rollback.
If you’ve got multiple servers to manage, the commands don’t change, just your setup. A little config-file tweaking and you’re using cap deploy and cap invoke to run arbitrary commands on all servers simultaneously.
Now that we’ve seen the view from 10,000 feet, we’re going to get on the ground. For the sake of this first exercise, we’re going to take for granted that all of Capistrano’s expectations as described in the last section have been met.
To get started, let’s install Capistrano:
$ sudo gem install capistrano
Install required dependency net-ssh? [Yn]
Install required dependency net-sftp? [Yn]
Install required dependency highline? [Yn]
Successfully installed capistrano-2.0
Running cap with the --tasks switch will tell us what tasks Capistrano knows about.
$ cap --tasks
cap invoke  # Invoke a single command on the remote servers.
cap shell   # Begin an interactive Capistrano session.

Learn more about each with the explain switch (-e):

$ cap -e command  # i.e., cap -e deploy:pending:diff
Both of these are general-purpose, built-in tasks that allow you to run one or more commands on the deployment machines. The invoke task will run a single command, while shell opens up an irb-like command prompt where you can issue multiple commands. But where are all the great things you can do with Capistrano?
In order to prepare your project for deployment, Capistrano provides us with the capify command, which builds the basic configuration files; with a little bit of editing on our part, our app will be ready to deploy. Taking my_project, previously configured as per Capistrano’s assumptions, we’ll create two files:
$ cd my_project
$ capify .
[add] writing `./Capfile'
[add] writing `./config/deploy.rb'
[done] capified!
Now, before we look at the two files created, let’s run cap --tasks again.
$ cap --tasks
cap deploy               # Deploys your project.
cap deploy:check         # Test deployment dependencies.
cap deploy:cleanup       # Clean up old releases.
cap deploy:cold          # Deploys and starts a 'cold' application.
cap deploy:migrate       # Run the migrate rake task.
cap deploy:migrations    # Deploy and run pending migrations.
cap deploy:pending       # Displays the commits since your last deploy.
cap deploy:pending:diff  # Displays the diff since your last deploy.
cap deploy:restart       # Restarts your application.
cap deploy:rollback      # Rolls back to a previous version and restarts.
cap deploy:rollback_code # Rolls back to the previously deployed version.
cap deploy:setup         # Prepares one or more servers for deployment.
cap deploy:start         # Start the application servers.
cap deploy:stop          # Stop the application servers.
cap deploy:symlink       # Updates the symlink to the deployed version.
cap deploy:update        # Copies your project and updates the symlink.
cap deploy:update_code   # Copies your project to the remote servers.
cap deploy:upload        # Copy files to the currently deployed version.
cap deploy:web:disable   # Present a maintenance page to visitors.
cap deploy:web:enable    # Makes the application web-accessible again.
cap invoke               # Invoke a single command on the remote servers.
cap shell                # Begin an interactive Capistrano session.
Now that’s more like it! But where did these new tasks come from? For that answer, we’ll look at Capfile:
$ cat Capfile
require 'capistrano/version'
load 'deploy' if respond_to?(:namespace) # cap2 differentiator
load 'config/deploy'
It’s short! As you can see, the cap command loads up recipes by reading the Capfile in the present directory. Just like Rake, Capistrano will search up the directory tree until it finds a Capfile, which means you can run cap in a subdirectory of your project.
The Capfile built by capify loads up a bunch of standard Capistrano recipes in deploy, plus it also loads up your project-specific recipes in config/deploy:
my_project> $ cat config/deploy.rb
set :application, "set your application name here"
set :repository, "set your repository location here"

# If you aren't deploying to /u/apps/#{application} on the target
# servers (which is the default), you can specify the actual location
# via the :deploy_to variable:
# set :deploy_to, "/var/www/#{application}"

# If you aren't using Subversion to manage your source code, specify
# your SCM below:
# set :scm, :subversion

role :app, "your app-server here"
role :web, "your web-server here"
role :db, "your db-server here", :primary => true
Minor edits to the boilerplate config/deploy.rb file are all that’s required to prepare your app for deployment. The deploy.rb file describes your application deployment in simple-to-read language.
The first basic setting is the name of your application:
set :application, "set your application name here" # used as a folder name
Although the help text in deploy.rb has spaces in the application name, you probably don’t want them, because the application name will be used as a directory name on the deployment machine.
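To see why, note how the value is interpolated straight into the default deployment path. A tiny plain-Ruby illustration (the application name here is invented):

```ruby
# Plain Ruby, outside Capistrano: how :application feeds the default
# deploy path. "my_project" is an illustrative name, not a requirement.
application = "my_project"
deploy_to   = "/u/apps/#{application}"  # the default :deploy_to pattern

puts deploy_to  # => /u/apps/my_project
```

A name with spaces would produce an awkward directory on the server, which is why single-word names are the norm.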
Next, we need to tell Capistrano where to find the source code for your application:
set :repository, "set your repository location here"
Assuming a Subversion server, set :repository to a Subversion URL (whether HTTP, svn, or svn+ssh). Another built-in assumption is that the username and password for the Subversion account are the same as those of your deploy user. The login name of the user running Capistrano will be used to connect to svn, and you will be prompted if your svn server requires authentication.
Next we’re going to point Capistrano at the domain name or IP address of your deployment machine(s). Capistrano will SSH to this address to perform the actions that you script. For this easy case, all three Roles (or classes) of machines will be the same.
role :app, "my_deployment_machine" #or you can use an IP address role :web, "my_deployment_machine" role :db, "my_deployment_machine", :primary => true
Roles are a powerful means by which we can target execution of specific tasks on a class of machines. For instance, we can use cap shell to run a grep on all :app machines.
In our example, the :db role is marked with the option :primary => true. This attribute indicates the primary database server. Certain tasks, such as database migrations, will run only on the primary database server, since the most common database-clustering scenario calls for slaves to sync up with the primary. You can also specify that a given role is a slave (using :slave => true) so that you can target certain types of tasks, such as backups, at it. Think of these attributes as qualifiers of the role for finer-grained control; although Capistrano provides standard qualifiers, you can define your own.
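Role attributes pair naturally with a task’s :only option for this kind of targeting. A hedged sketch (host names and backup command invented) of a task aimed only at slaves:

```ruby
# Hypothetical hosts; :primary and :slave are the role qualifiers.
role :db, "master.example.com", :primary => true
role :db, "slave1.example.com", :slave   => true

desc "Back up the database, but only on slave machines"
task :backup, :roles => :db, :only => { :slave => true } do
  run "/usr/local/bin/backup_db"  # illustrative command, not a real script
end
```

The :only hash filters the role’s server list down to machines whose attributes match, which is exactly how the built-in migration task restricts itself to the :primary database server.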
That’s it! Once the name of the application, the source-control information, and roles are defined, you are done with configuration.
After a successful deploy, Capistrano is going to try to start (spin) your application. By default, it will look for ./script/spin, expecting that file to contain a script to start your application servers. It’s your job to write that script (since Rails doesn’t come with one), starting the app server of your choice. The easiest approach is to have your spin script talk to the spawner script provided by Rails, since it knows how to fire up both FCGI and Mongrel.
You can read about the spawner tool easily; just type script/process/spawner --help at the console. A simple spin script that calls spawner to start Mongrel[5] will look like this:
/deploy_to/current/script/process/spawner -p 8256 -i 2
Add that line of code as ./script/spin to your repository, and upon a successful deploy, Capistrano will start two Mongrel instances, listening on ports 8256 and 8257. The distinct advantage of using the standard Rails spawner is that it tracks the process IDs, which means the other standard Rails scripts are available, including script/reaper, used for restarting, monitoring, and stopping application server instances.
If you decide to step outside the tightly integrated spin solution, perhaps because you want to roll in some background process management, or you want to use a third-party startup script such as mongrel_cluster, then you’ll have to override the standard deploy tasks. Later in the chapter, in Baking Exercise #2, we’ll show you how to do just that.
Now that we’ve got our default configuration in place, and all assumptions are covered, we can ask Capistrano to set up our deployment machine. What does setup mean? The deploy:setup task essentially creates the directory structure that holds your application deployments.
$ cap deploy:setup
After running deploy:setup, SSH over to your deployment server and peruse the directory structure that was created. The default is /u/apps/application_name, containing the following subdirectories:
releases
current
shared
shared/log
shared/system
shared/pids
This structure bears some discussion. Whenever you deploy, Capistrano checks out (or exports) your project from svn and places the files in the releases folder, each deploy in its own release folder named after the current date and time. After this happens successfully, Capistrano points the application_name/current symbolic link at the new release folder; current is where your currently deployed web application is found.
Capistrano also makes the following symbolic links on each deployment:

application_name/shared/log is linked to your current project’s log directory, so that logs persist across releases.

application_name/shared/pids is linked to your current project’s tmp/pids directory.

application_name/shared/system is linked to your current project’s public/system directory. This is where Capistrano stores the HTML files that show your project in maintenance mode. (See cap deploy:web:disable/enable.)
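To make the layout concrete, the release-plus-symlink mechanics can be simulated in a few lines of plain Ruby (a toy sketch run in a temp directory, not Capistrano’s actual implementation):

```ruby
require "fileutils"
require "tmpdir"

# Toy simulation of Capistrano's releases/current layout.
deploy_to = Dir.mktmpdir("app")            # stand-in for the deploy root
releases  = File.join(deploy_to, "releases")
FileUtils.mkdir_p(releases)

# Each deploy lands in a timestamped release directory...
release = File.join(releases, Time.now.utc.strftime("%Y%m%d%H%M%S"))
FileUtils.mkdir_p(release)

# ...and "current" is re-pointed at it (the ln -nfs equivalent).
current = File.join(deploy_to, "current")
FileUtils.rm_f(current)
File.symlink(release, current)

puts File.readlink(current)  # the freshly deployed release path
```

Because only the symlink moves, a rollback is just re-pointing current at the previous release directory.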
Before we move on to our first deploy, Capistrano provides us with the deploy:check task to verify that all assumptions and pieces are in place:
$ cap --quiet deploy:check  # --quiet shuts out the verbosity
In addition to the default checks for permissions, required utilities (svn), and the like, deploy:check also provides a means to verify application-specific dependencies. These can be declared quite readably within your deploy.rb, and that works for both local and remote dependencies:
depend :remote, :gem, "runt", ">= 0.3.0"
depend :remote, :directory, :writeable, "/var/www/current/config"
depend :remote, :command, "monit"
depend :local, :command, "svn"
When a dependency fails during the deploy:check task, you’ll see a message like the following one:

The following dependencies failed. Please check them and try again:
--> gem `runt' >= 0.3.0 could not be found (my_deployment_machine)
If you’ve done everything correctly up to now in setting up your application’s deploy.rb script and you have set up the remote machines, then you’re ready to actually do a deployment.
On a first deploy of this app, we’ll run cap deploy:cold—otherwise we’d run the default task, cap deploy. The only difference between these two is that cap deploy will try to shut down your server first, which won’t work since it isn’t running yet.
$ cap deploy:cold  # cold will "svn co", run migrations, link this release
                   # to current, and start your servers
In most cases, you will be asked to enter the password to your svn server. This is a case where it is extremely handy to have set up SSH keys on the local and remote machines, so that you authenticate automatically with keys instead of having to enter passwords multiple times.
If all went well, you shouldn’t see any error messages, and a browser session will show your app is up. If not, read the verbose output, go over the assumptions, and start again.
Now that we’ve learned how to use Capistrano with its standard assumptions, we’ll take a look at scenarios in which they need to be worked around. Some scenarios involve simply setting additional Capistrano variables, while others involve overriding existing tasks or hooking into callback functions. Some scenarios require entirely new tasks to be defined.
To use a remote user account other than the currently logged-in user, just set the :user variable to the desired remote user account name.
set :user, "deploy"
That was simple, and also makes an important point. Capistrano tries to simplify working outside the stated assumptions. Another example would be changing the source-control system to something other than Subversion.
Although the default SCM system is Subversion, Perforce, Bzr, and Darcs are also supported.
set :scm, :subversion # default. :perforce, :bzr, :darcs
You can also customize the deployment strategy, and we do not recommend that you stick with the default, :checkout, since it is very inefficient—all those little .svn directories chew up disk space like crazy! Instead, try the :export option, which will do an svn export of your codebase into the release directory. The :remote_cache strategy performs well too, since it makes a copy of the last release and then executes svn up to get the latest code, but we like :export best.
set :deploy_via, :checkout     # default
set :deploy_via, :export
set :deploy_via, :remote_cache # copies from cache, then svn up
Sometimes, particularly for security reasons, you don’t have (or want) access to SCM on your deployment machine. For these cases, Capistrano provides a means to deploy_via :copy. The :copy strategy tars and gzips your project before using SFTP to upload it to the release directory on the remote machine. In the rare case that your local machine does not have the required binaries, you can tell Capistrano to use Ruby’s internal zip library.
set :deploy_via, :copy       # local scm checkout; Cap will tar/gzip and
                             # sftp to the deployment machine
set :copy_strategy, :export  # changes deploy_via :copy to use an export
                             # instead of the default scm checkout
set :copy_compression, :zip  # if you don't have tar/gzip binaries,
                             # Cap will zip for you
Hopefully you are noticing that although Capistrano’s assumptions are good defaults, you have a great deal of flexibility in specifying alternatives, without having to write your own tasks.
People sometimes leave configuration files such as database.yml out of the repository for a number of reasons, among them security concerns. We don’t advise this practice, since it complicates distributed development and local config requirements. However, learning how to work around the challenge posed by nonversioned configuration files provides us a valuable and realistic opportunity to teach you how to move beyond a basic understanding of Capistrano.
Note that out of the following three options, there is only one that is really good in our opinion. However, you stand to learn from all three.
In this option, you add a file like production.database.yml to the repository, and then rename it to database.yml automatically on deployment. We might have gotten to this challenge because we didn’t want passwords in our repository to begin with, but at least some Rails developers will feel this is a legitimate solution. After all, the more popular reason to leave database.yml out of the repository is so that each developer on a team can have their own local database connection configuration.
Pros Easy solution. No default configuration; just create the additional task and add a file named production.database.yml to the repository.
Cons Plaintext passwords in the repository, which could be a big problem in some shops.
To implement this option, add a file named production.database.yml to the repository and give it your production database details. Then add the task definition (as shown in Listing 21.1) to config/deploy.rb.
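A sketch of what such a rename callback might contain (a reconstruction in the spirit of Listing 21.1; the exact listing may differ):

```ruby
# Sketch: after the code is updated, rename the checked-in
# production.database.yml into place as database.yml.
task :after_update_code, :roles => :app do
  run "mv #{release_path}/config/production.database.yml " +
      "#{release_path}/config/database.yml"
end
```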
Tasks named :after_update_code function as callbacks, invoked after the code for the new release is updated. Here we also see demonstrated the run command, which gives you an idea of how easy it is to run commands remotely.
Here’s another solution, but take heed that it isn’t the best one either. We’re giving it to you mainly to demonstrate the :after_symlink callback.
Pros No username and password information stored in your repository.
Cons Requires manually copying configuration files to the deployment machine.
To implement this option, add the following task to config/deploy.rb:
task :after_symlink, :roles => :app do
  run "cp #{shared_path}/config/database.yml #{release_path}/config/database.yml"
end

Then:

a. cap deploy:setup
b. copy a production-ready database.yml file to the shared/config folder
c. cap deploy:cold
This one just sounds so right that we hope you don’t even consider the other two. You autogenerate database.yml (as shown in Listing 21.2) and put it in the shared config directory on the remote machine. Then you link it to your release’s config folder.
Pros Easy reuse, flexibility, and no passwords in the repository.
Cons A little harder to code, but we only do this once.[6]
Example 21.2. Create database.yml in Shared Path Based on Template
require 'erb'

before "deploy:setup", :db
after "deploy:update_code", "db:symlink"

namespace :db do
  desc "Create database yaml in shared path"
  task :default do
    db_config = ERB.new <<-EOF
    base: &base
      adapter: mysql
      socket: /tmp/mysql.sock
      username: #{user}
      password: #{password}

    development:
      database: #{application}_dev
      <<: *base

    test:
      database: #{application}_test
      <<: *base

    production:
      database: #{application}_prod
      <<: *base
    EOF

    run "mkdir -p #{shared_path}/config"
    put db_config.result, "#{shared_path}/config/database.yml"
  end

  desc "Make symlink for database yaml"
  task :symlink do
    run "ln -nfs #{shared_path}/config/database.yml #{release_path}/config/database.yml"
  end
end
Then, to use your new .yml file, execute the following commands:
$ cap deploy:setup
$ cap deploy:update
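To see what Capistrano will upload, the template expansion from Listing 21.2 can be exercised in plain Ruby, outside of Capistrano entirely. Note that the substitution is done by ordinary Ruby heredoc interpolation (the values below are invented stand-ins for the Capistrano variables):

```ruby
# Stand-ins for the Capistrano variables used in Listing 21.2.
application = "my_project"  # invented values, for illustration only
user        = "deploy"
password    = "secret"      # never hard-code a real password

# Plain heredoc interpolation builds the YAML that Capistrano's `put`
# would then upload to shared/config/database.yml on the server.
db_config = <<-EOF
production:
  adapter: mysql
  database: #{application}_prod
  username: #{user}
  password: #{password}
EOF

puts db_config
```

Because every value comes from recipe variables, the same recipe serves any application and any credentials without touching the repository.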
The Capistrano recipe deploy:migrations expects a production database to be specified in database.yml; it does a deployment as usual and then runs any pending migrations. Unfortunately, it is pretty common to get yourself into a situation where your whole suite of migrations won’t run without an error.
One possible solution is to create a task that sets up your database using db/schema.rb, a Ruby script that (using the Migrations API) handily stores the most recently migrated version of your database schema (DDL). The Capistrano task in Listing 21.3 loads that schema.
Example 21.3. Load Database Schema Remotely Using schema.rb
namespace :deploy do
  desc "runs 'rake db:schema:load' for current release"
  task :load_schema, :roles => :db do
    rake = fetch(:rake, "rake")
    rails_env = fetch(:rails_env, "production")
    run "cd #{current_release}; #{rake} RAILS_ENV=#{rails_env} db:schema:load"
  end
end
Run the task in two steps: cap deploy:cold followed by cap deploy:load_schema.
One of the greatest things about Capistrano is the ease with which we can roll our own recipes. But before we do, there are some things we need to clear up about variables.
There are two ways we can specify variables. One is from the recipe definitions (as we’ve already seen in this chapter). The other is from the command line.
From the command line we can set variable values using either an -s or -S switch. Typing cap -s foo=bar is equivalent to having set :foo, "bar" after all your recipes are loaded, while cap -S foo=bar does so before the recipes are loaded.
As with Rake, you can also specify variables within the OS environment. As this is a little tougher in Windows, Capistrano command-line switches are a better cross-platform solution.
Whenever possible, I prefer to create specific Capistrano tasks that set common variables rather than depend on shell scripts or command-line operations. This keeps the deployment knowledge within Capistrano, and not spread around your environment.
A topic that deserves a little investigation at this time is the scoping of Capistrano variables. Are variables local to the task? Local to the namespace? Or global to Capistrano in general? To figure out this bit of undocumented Capistrano behavior, we’re going to write some tasks (Listing 21.4) to play with the possibilities. We’ll start by creating two namespaces (:one and :two) and assign identically named variables (:var_one) in them. Then we’ll set and get those values from a third namespace. The resulting values should teach us much about Capistrano’s scoping rules.
Example 21.4. Exploring Scoping of Capistrano Variables
namespace :one do
  task :default do
    set :var_one, "var_one value"
  end
end

namespace :two do
  task :default do
    set :var_one, "var_two value"
  end
end

namespace :three do
  task :default do
    puts "!!!! one.var_one == #{one.var_one}"
    puts "!!!! global name space var_one == #{var_one}"
    two.default
    puts "!!!! one.var_one == #{one.var_one}"
    puts "!!!! two.var_one == #{two.var_one}"
    puts "!!!! global name space var_one == #{var_one}"
  end
end

before "deploy:update", :one
before "deploy:update", :two
after "deploy:update", :three

$ cap deploy:update
Running the code in Listing 21.4 dumps the following output to the console:
* executing `three'
# run one
!!!! one.var_one == var_one value
!!!! global name space var_one == var_one value

* executing `two'
# run two
!!!! one.var_one == var_two value
!!!! two.var_one == var_two value
!!!! global name space var_one == var_two value
What does this mean? First, we can see that referencing one.var_one and the global var_one returns the same value; it looks as if they’re one and the same. Taking a second kick at it, in the second run we call task two.default. Setting only two.var_one confirms that there isn’t a per-namespace variable scope—there is simply one global namespace for variables.
In the preceding tester code you may have noticed the funny two.default call—this is simply the way to call namespaced tasks. In the case of a default task, we use the explicit name (merely two won’t resolve the way the namespace syntax shortcut :two does).
In summary, we have shown that variables aren’t scoped by namespace; referencing them by namespace will always return the global value and will probably confuse you and others down the line. But be aware that scoped variables are planned for a future version of Capistrano.
A particularly useful trick with Capistrano is the ability to initialize a starting configuration for tasks. A great use for such preinitialization is for setting up the environment for which we will be deploying.
We can do this in two ways. First, using the -S switch to set an initial value:
$ cap -S app_server=the_rails_way.com,secure_ssh_port=8256 deploy
The deploy task will be run with the app_server and secure_ssh_port variables set. This strategy rapidly gets out of hand as we need new parameters. Do we need an additional shell script to hold those parameters? Yuck! Capistrano can be a powerful friend, especially if you don’t spread your deployment logic around.
The second, and much more DRY, method is to code tasks that define each of your staging environments, so that you can say something like cap production deploy, letting the production task set up the needed variables, just as we show you in Listing 21.5.
Example 21.5. Production and Staging Environment Tasks
desc "deploy to the production environment"
task :production do
  set :tag, "release-1.0" unless variables[:tag]
  set :domain, "the-rails-way.com"
  set :repository, "http://svn.nnovation.ca/svn/the-rails-way/tags/#{tag}"
  set :rails_env, "production"
  set :app_server, "the-rails-way.net"
  set :secure_ssh_port, 8256

  role :app, "#{app_server}:#{secure_ssh_port}"
  role :web, "#{app_server}:#{secure_ssh_port}"
  role :db,  "#{app_server}:#{secure_ssh_port}", :primary => true
end

desc "deploy to the staging environment"
task :staging do
  set :domain, "staging.the-rails-way.com"
  set :repository, "http://svn.nnovation.ca/svn/the-rails-way/trunk"
  set :rails_env, "development"
  set :app_server, "staging.the-rails-way.com"
  set :secure_ssh_port, 8256

  role :app, "#{app_server}:#{secure_ssh_port}"
  role :web, "#{app_server}:#{secure_ssh_port}"
  role :db,  "#{app_server}:#{secure_ssh_port}", :primary => true
end
Thus, we can deploy very concisely and without worrying about command-line parameters.
$ cap staging deploy     # trunk to staging
$ cap production deploy  # tags/release-1.0 to production
Now, I don’t know about you, but those repetitive role assignments smell a little. How about we just move them out of the tasks, perhaps below the two task definitions? We tried, and learned an important lesson when Capistrano reported that app_server and secure_ssh_port weren’t understood. Is it a matter of scoping? Is our execution order off?
The answer is that both scoping and execution order come into play. When you move the role assignments out of the tasks’ do..end blocks into the main body of the script, you change the timing of their evaluation. Those lines will in fact execute prior to the code inside the task definitions, so in this particular case the role assignments would be executed before an app_server value is set.
Luckily for us, the solution is straightforward and tidies our code nicely. We’ll do a Capistrano version of an extract method refactoring, except with an additional task instead of a method.
Define a task named :finalize_staging_init, and then add a call to it at the end of the staging and production tasks.
task :staging do
  set :domain, "staging.the-rails-way.com"
  set :repository, "http://svn.nnovation.ca/svn/the-rails-way/trunk"
  set :rails_env, "development"
  set :app_server, "staging.the-rails-way.com"
  set :secure_ssh_port, 8256
  finalize_staging_init
end

task :production do
  set :tag, "release-1.0" unless variables[:tag]
  set :domain, "the-rails-way.com"
  set :repository, "http://svn.nnovation.ca/svn/the-rails-way/tags/#{tag}"
  set :rails_env, "production"
  set :app_server, "the-rails-way.net"
  set :secure_ssh_port, 8256
  finalize_staging_init
end

task :finalize_staging_init do
  role :app, "#{app_server}:#{secure_ssh_port}"
  role :web, "#{app_server}:#{secure_ssh_port}"
  role :db,  "#{app_server}:#{secure_ssh_port}", :primary => true
end
A typical Rails deployment scenario involves a cluster of Mongrels, and perhaps some additional processes such as backgroundrb, memcache, and search engine daemons.
The canned deploy:start, :stop, and :restart tasks take care of a single Mongrel instance fronted by Apache. In this exercise, as shown in Listing 21.6, we’re going to override the default tasks and insert our own to manage a Mongrel cluster and a BackgrounDRb installation; note, however, that this recipe doesn’t manage Apache, since I rarely bring it up or down. At this point you should know enough about Capistrano to understand the recipe and easily bake in Apache support.
Example 21.6. A Comprehensive Deploy Task
namespace :deploy do
  desc "Restart the Mongrel cluster and backgroundrb"
  task :restart, :roles => :app do
    stop
    start
  end

  desc "Start the mongrel cluster and backgroundrb"
  task :start, :roles => :app do
    start_mongrel
    start_backgroundrb
  end

  desc "Stop the mongrel cluster and backgroundrb"
  task :stop, :roles => :app do
    stop_mongrel
    stop_backgroundrb
  end

  desc "Start Mongrel"
  task :start_mongrel, :roles => :app do
    begin
      run "mongrel_cluster_ctl start -c #{app_mongrel_config_dir}"
    rescue RuntimeError => e
      puts e
      puts "Mongrel appears to be down already."
    end
  end

  desc "Stop Mongrel"
  task :stop_mongrel, :roles => :app do
    begin
      run "mongrel_cluster_ctl stop -c #{app_mongrel_config_dir}"
    rescue RuntimeError => e
      puts e
      puts "Mongrel appears to be down already."
    end
  end

  desc "Start the backgroundrb server"
  task :start_backgroundrb, :roles => :app do
    begin
      puts "starting brb in folder #{current_path}"
      run "cd #{current_path} && RAILS_ENV=#{rails_env} nohup ./script/backgroundrb start > /dev/null 2>&1"
    rescue RuntimeError => e
      puts e
      puts "Problems starting backgroundrb - running already?"
    end
  end

  desc "Stop the backgroundrb server"
  task :stop_backgroundrb, :roles => :app do
    begin
      puts "stopping brb in folder #{current_path}"
      run "cd #{current_path} && ./script/backgroundrb stop"
    rescue RuntimeError => e
      puts e
      puts "Backgroundrb appears to be down already."
    end
  end
end
Things are going great on your one server. But as life would have it, business is booming and you decide to build out a cluster of servers. Some are specialized to serve your application, others run the web server, one is dedicated to asynchronous processing, and so forth. A booming business sure is nice, but traditional deployment is no fun at all. Deploying to 10 machines? You might be tempted to call in sick on deployment day.
Not with Capistrano! It was built right from the beginning to handle multiserver deployments. Jamis really wanted to make deploying to 100 machines as easy as deploying to one. In fact, many people claim that is exactly where Capistrano shines brightest.
Capistrano succeeds so well in its multiserver handling that you won’t notice the difference either from the command line or in writing the task definitions. The secret sauce to multiserver deployments lies in the role[7] command, by which we can assign one or more deployment machines per task. When a task qualifies for execution, every server assigned to its role will have the task executed, in parallel!
role :app, "uno.booming-biz.com", "dos.booming-biz.com"
role :db, "kahuna.booming-biz.com", :primary => true
Adding a second (or third) server to our deployment means that all qualifying tasks will automatically execute on it as well. More specifically, all tasks that explicitly require the :app role, plus all tasks that don’t indicate a qualifying role, will now also run on your newly added server.
namespace :monitor do
  task :exceptions, :roles => :app do
    run "grep 'Exception' #{shared_path}/log/*.log"
  end
end
As an example, running cap monitor:exceptions will run against the entire :app role of machines you throw at it. Capistrano will grep all log files in parallel, streaming a merged result back to your terminal.
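Under the hood, the fan-out is essentially concurrent connections, one per host, with the results merged as they arrive. A plain-Ruby simulation of the idea; the threads and the fake_grep helper are illustrative stand-ins invented for this sketch, not Capistrano internals.

```ruby
# The hosts assigned to the :app role (names taken from the role
# declarations earlier in the chapter).
app_servers = %w[uno.booming-biz.com dos.booming-biz.com]

# Stand-in for running grep over SSH on one host.
def fake_grep(host)
  "#{host}: NoMethodError (Exception) in production.log"
end

# One thread per host: each runs the command concurrently, and
# Thread#value collects the results in declaration order.
results = app_servers
  .map { |host| Thread.new { fake_grep(host) } }
  .map(&:value)

puts results
```

Because each host gets its own connection, adding a tenth server makes the run no slower than the slowest single host.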
Capistrano’s adherence to the DRY principle means that you can scale your physical deployment with zero impact on deployment rules and very little impact on your deployment configuration.
What about the impact on your deployment process? More machines means that mistakes and unexpected errors have bigger negative consequences, right? Not necessarily, since Capistrano has a feature that is usually associated with databases, not deployment systems: Transactions!
Although a failed or incomplete install on one deployment machine can be tough to restore, consider when you have many more, each with its own particular flavor of install failure. Capistrano provides tasks with a transaction infrastructure that wraps and protects key deployment commands. It also has unique on_rollback handlers to ensure that we can recover from a disastrous deployment scenario with as little collateral damage as possible.
For example, look at the code for the :update_code and :symlink tasks; both have on_rollback blocks that clean up their respective actions, if necessary. This bears some similarity to the up-and-down migration methods of ActiveRecord.
namespace :deploy do
  task :update do
    transaction do
      update_code
      symlink
    end
  end

  task :update_code do
    on_rollback { run "rm -rf #{release_path}" }
    strategy.deploy!
    finalize_update
  end

  task :symlink, :except => { :no_release => true } do
    on_rollback do
      run "rm -f #{current_path}"
      run "ln -s #{previous_release} #{current_path}; true"
    end
    run "rm -f #{current_path} && ln -s #{release_path} #{current_path}"
  end
end
The preceding example is yanked right from Capistrano’s source code[8]: the :update task uses your SCM strategy of choice to update the deployed codebase, and then sets up the symbolic link from the ./current folder to the newly installed application.
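The releases/current symlink dance is easy to see with a throwaway directory tree. The following sketch builds one in a temp dir (all paths here are invented for the demo, not a real deploy layout) and performs what the rm -f plus ln -s in :symlink effectively does.

```ruby
require 'tmpdir'
require 'fileutils'

# Each deploy gets a timestamped release directory under releases/,
# and the current symlink is re-pointed at the newest one.
deploy_to   = Dir.mktmpdir
old_release = File.join(deploy_to, 'releases', '20070501120000')
new_release = File.join(deploy_to, 'releases', '20070502093000')
current     = File.join(deploy_to, 'current')
FileUtils.mkdir_p [old_release, new_release]

# What `rm -f current && ln -s release_path current` boils down to:
FileUtils.rm_f current
FileUtils.ln_s new_release, current

puts File.readlink(current)
```

Since the old release directory stays on disk, the on_rollback block only has to re-point the symlink at previous_release to restore service.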
But what if our app failed to deploy at strategy.deploy!, perhaps because of problems connecting to the Subversion server? Would the deploy continue creating symlinks? Or perhaps the Subversion deploy worked, but the symlinks failed to happen? Either scenario would leave our application in a fractured state. The problem would be compounded if we deployed successfully to one deployment machine but failed on the second; problems would not necessarily be apparent right away!
task :update_code do
  on_rollback { run "rm -rf #{release_path}" }
  strategy.deploy!
  finalize_update
end
To mitigate the risks of failure and handle the greatest number of reasons for possible failure, the :update task is wrapped in a transaction. If a fault condition were to occur, each and every on_rollback block would be called, in reverse. That’s why each on_rollback block should be designed to algorithmically reverse the current task’s operation.
For example, the preceding on_rollback block removes all files possibly created by both strategy.deploy! and finalize_update, correcting a potentially fractured deployment.
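The mechanics boil down to a stack of rollback blocks unwound in reverse when any task raises. A minimal plain-Ruby sketch of that idea follows; the Transaction class, the lambdas, and the log are all made up for illustration and bear no resemblance to Capistrano’s actual classes.

```ruby
# A toy transaction: tasks register on_rollback blocks as they
# succeed; if a later task raises, the blocks run last-in, first-out.
class Transaction
  def initialize
    @rollbacks = []
  end

  def on_rollback(&block)
    @rollbacks << block
  end

  def run(tasks)
    tasks.each { |task| task.call(self) }
  rescue RuntimeError
    @rollbacks.reverse_each(&:call)   # unwind in reverse order
    raise
  end
end

log = []

update_code = lambda do |txn|
  txn.on_rollback { log << 'rm -rf release_path' }
  log << 'deployed code'
end

symlink = lambda do |txn|
  txn.on_rollback { log << 'restored old symlink' }
  raise RuntimeError, 'ln failed'     # simulate a failed symlink
end

begin
  Transaction.new.run([update_code, symlink])
rescue RuntimeError
  log << 'rolled back'
end

p log
```

Note the order: the symlink is restored before the release directory is removed, mirroring how Capistrano undoes the most recent action first.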
The transaction system employed by Capistrano isn’t like any that you may have encountered before. For example, it doesn’t keep track of local or remote object changes. The simple but effective transaction system puts you in control of the rollback. It should also be said that Capistrano doesn’t place migrations under transaction—DDL transactions aren’t widely supported by databases,[9] which makes it very difficult to roll back a failed migration.
Real-world deployments often protect application servers through the use of secure proxies and firewalls. The result is that we won’t be able to SSH directly to the deployment machine. However, don’t let that stop you from using Capistrano, thanks to its support for proxied access to deployment machines using the :gateway setting:
set :gateway, 'gateway.booming-biz.com'
role :app, "192.168.1.100", "192.168.1.101"
role :db, "192.168.1.200", :primary => true
Setting a :gateway will cause all requests to tunnel securely to your roled machines through the specified gateway machine. The assumption made by Capistrano is that the roled hosts are not directly accessible, so to access them it must first establish a connection to gateway.booming-biz.com and open SSH tunnels from there.
It’s magical; well, the magic of port forwarding, anyway.[10] Other than making sure that the roled machines can be reached through TCP/IP, there’s very little you need to do to support the gateway capability. In fact, if you’re using passwords to authenticate, there’s nothing else to do; you will simply be prompted. But if you’re using PKI, you’ll have to add the public key of your gateway server to your nonpublic roled servers.
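The port-forwarding idea can be sketched in-process with plain sockets: a stand-in “gateway” listens on a forwarded port and relays bytes to an “app server” that the client never contacts directly. Everything here runs on loopback sockets invented for the demo; the real thing tunnels over encrypted SSH channels, which this sketch does not attempt.

```ruby
require 'socket'

# The "app server": only the gateway can reach it; it echoes back
# whatever line it receives.
app      = TCPServer.new('127.0.0.1', 0)
app_port = app.addr[1]
Thread.new do
  client = app.accept
  client.write("app saw: #{client.gets}")
  client.close
end

# The "gateway": listens on a forwarded local port and pipes traffic
# through to the app server, one line each way.
forward      = TCPServer.new('127.0.0.1', 0)
forward_port = forward.addr[1]
Thread.new do
  local  = forward.accept
  remote = TCPSocket.new('127.0.0.1', app_port)
  remote.write(local.gets)   # relay the request inward
  local.write(remote.gets)   # relay the response back out
  [local, remote].each(&:close)
end

# The client (Capistrano, in this analogy) only ever connects to the
# forwarded port, yet the reply comes from the hidden app server.
sock = TCPSocket.new('127.0.0.1', forward_port)
sock.puts 'deploy'
response = sock.gets
puts response
sock.close
```

Swap the loopback relay for an SSH connection to the gateway host and you have the shape of what Capistrano sets up behind :gateway.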
This chapter gave you a crash course in using Capistrano to automate your Rails deployment tasks. It should have also pointed you in the right direction to begin using Capistrano as your systems administration helper, given its ability to reliably automate tasks and execute tasks in parallel across one or dozens of remote servers.
1. Switchtower, the original name, was changed to Capistrano in response to a trademark violation. For details see http://weblog.rubyonrails.org/2006/3/6/switchtower-is-now-capistrano.
2. Since it is well documented on the web, but now obsolete, we omit coverage of Capistrano 1.x. The site capify.org provides ample upgrade instructions for developers wanting to migrate to the latest versions of Capistrano.
3. This means “you,” Windows, although some have had success with cygwin.
4. This probably doesn’t need to be said, but please, please consider PKI; you’re going to seriously reduce the possibility of break-ins.
5. Although Mongrel is today’s best-of-show choice, fastcgi can be as easily configured.
6. http://shanesbrain.net/articles/2007/05/30/database-yml-management-with-capistrano-2-0
7. You can also use the […]
8. Use […]
9. MySQL doesn’t, but I understand that Postgres may support DDL transactions.
10. For all the gory details, read Jamis Buck’s post: http://weblog.jamisbuck.org/2006/9/26/inside-capistrano-the-gateway-implementation.