Chapter 21. Capistrano

 

“When we .NET developers say that getting into Rails is tough, this is the kind of stuff we’re talking about. [...] It’s the Linux shell, server applications, and other things we’re not used to that trip us up.”

 
 --Brian Eng http://www.softiesonrails.com/2007/4/5/the-absolute-moron-s-guide-to-capistrano

Necessity being the mother of invention, the story goes that in his work for 37signals, Jamis Buck created SwitchTower (later renamed Capistrano[1]) when Basecamp grew to be hosted on more than one production server. It’s a tool for automating tasks on remote servers.

Although Capistrano is now firmly established as the standard solution to Rails deployment challenges, its multiserver, transactional approach to running remote commands suits a fairly broad range of deployment scenarios.

This chapter delves into Capistrano (version 2[2]), clearly showing how it works out of the box to solve tricky and time-consuming activities. We also show what it doesn’t do for you and how you can extend it by crafting your own deployment recipes to customize Capistrano’s capabilities. In conclusion, we’ll explore a few great recipes created by the Capistrano community that have gained broad acceptance.

Overview of Capistrano

We’ll start with a high-level overview of Capistrano. The first stop on that journey is to take stock of Capistrano’s domain-specific terminology.

Terminology

Capistrano has its own vocabulary, which distinguishes it from being simply an extension of Rake, whose syntax it is modeled on:

  • One or many deployment machines are the computers to which we will deploy.

  • Recipes are to Capistrano what tasks are to Rake: whether they comprise a single task or several, a recipe provides a complete solution to a deployment problem. Recipes are made up of tasks.

  • Tasks are atomic units of functionality, and exist to be called directly by an end user, or internally by another task. Tasks sit within namespaces.

  • A Capistrano namespace groups tasks logically. A task name may be repeated within another namespace, which makes namespaces a great way for contributing authors to add recipes without fear of task-name collisions.

  • Roles such as :app and :db let us group task execution: a task can be targeted at a role so that it runs only on the machines in that role. Think of a role as a task qualifier, where the qualifier is generally a class of deployment machine, such as :app, :db, or :web.

  • Variables are named values, set with the set keyword, that are globally available to the rest of the script.
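
To make the vocabulary concrete, here is a small illustrative recipe fragment (the hostnames and the task are invented for this example):

```ruby
set :application, "my_project"        # a variable, global to the whole script

role :app, "app1.example.com"         # roles group the deployment machines
role :db,  "db1.example.com", :primary => true

namespace :utils do                   # namespaces keep task names from colliding
  desc "Report free disk space on the app servers"
  task :disk_free, :roles => :app do  # this task runs only on :app machines
    run "df -h"
  end
end
```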

The Basics

The first time I sat down to use Capistrano, I asked myself: What do I need to do to use Capistrano? What does it expect of my app? Of my server? And when a deployment is finished, what has Capistrano done and what hasn’t it?

The following sections answer these questions and should set in place the larger view of Capistrano in your project development workflow.

What Do I Need to Do to Use Capistrano?

First, it’s important to note that Capistrano is installed only on your development (or “deploy from”) computer. Capistrano does not need to be installed on the deployment machines themselves. It relies solely on its core requirements and assumptions about the deployment machine, and executes commands on the deployment machines over an established, secure SSH connection.

So, the short answer is to consider the assumptions and requirements and address them. You’ll likely find your first few deployments are a little rough—trial and error. But over time I bet you’ll find Capistrano to be your key deployment ally.

What Does Capistrano Expect?

Some requirements are mandatory and some are baseline assumptions that can be overridden. This core philosophy encourages best practices while not locking you in to them.

Capistrano’s core requirements are that

  • You are using SSH to access the remote machine.

  • The deployment machine has a POSIX[3]-compatible shell.

  • If you’re using passwords, all deployment machines must have the same password (PKI is the recommended solution).[4]

The following assumptions are overridable:

  • You want to deploy a Rails application.

  • You’re using Subversion to manage source code.

  • You’ve got your production environment already built (operating system, Ruby, Rails, other gems, database, web/app/db servers).

  • Passwords are the same for your deployment machine and your svn repository.

  • You’ve created the deployment database plus a user that can access it.

  • You have all config files in Subversion, ready to run in your production environment (which includes the user/password for the aforementioned deployment database).

  • Your migrations will run from 0 to 100 (deploy:migrate).

  • Your web/app servers are configured with a spin script to start/stop/restart the web app.

As mentioned, these requirements and assumptions are the defaults. If they don’t suit you, stay tuned, because we’ll show you how to build tasks and callbacks to customize Capistrano to suit your specific needs.

What Has Capistrano Done and What Hasn’t It?

When you’re done getting Capistrano and its requirements addressed (either directly or through customization), you’re left in deployment Nirvana. Okay, that’s a little strong, but your code will be deployed from your repository, migrations run, Apache and your app server (Mongrel, fastcgi) will be booted, and well... now you’re deployed, and ready for that next deployment event.

So now you can start doing great things like updating your servers with the latest svn check-in using cap deploy:update, or go full out and update to the latest release and restart your app servers using cap deploy. You can even put up a maintenance page during extended downtime using cap deploy:web:disable, and roll back when you mess up a deployment, using cap deploy:rollback.

If you’ve got multiple servers to manage, the commands don’t change, just your setup. Just a little config-file tweaking and you’re using cap deploy and cap deploy:invoke to run arbitrary commands on all servers simultaneously.

Getting Started

Now that we’ve seen the view from 10,000 feet, we’re going to get down on the ground. For the sake of this first exercise, we’re going to take for granted that all of Capistrano’s expectations as described in the last section have been met.

Installation

To get started, let’s install Capistrano:

$ sudo gem install capistrano
Install required dependency net-ssh? [Yn]
Install required dependency net-sftp? [Yn]
Install required dependency highline? [Yn]

Successfully installed capistrano-2.0

Running cap with the --tasks switch will tell us what tasks Capistrano knows about.

$ cap --tasks
cap invoke     #Invoke a single command on the remote servers.
cap shell      #Begin an interactive Capistrano session.

Learn more about each with the explain switch (-e):
cap -e command # e.g., cap -e deploy:pending:diff

Both of these are general-purpose, built-in tasks that allow you to run one or more commands on the deployment machines. The invoke task will run a single command, while shell opens up an irb-like command-prompt where you can issue multiple commands. But where are all the great things you can do with Capistrano?

“Capify” Your Rails Application

In order to prepare your project for deployment, Capistrano provides us with the capify command, which builds the basic configuration files; with a little bit of editing on our part, our app will be ready to deploy. Taking my_project, previously configured as per Capistrano’s assumptions, we’ll create two files:

$ cd my_project
$ capify .
[add] writing `./Capfile'
[add] writing `./config/deploy.rb'
 [done] capified!

Now before we look at the two files created, let’s run cap --tasks again.

$ cap --tasks
cap deploy                #Deploys your project.
cap deploy:check          #Test deployment dependencies.
cap deploy:cleanup        #Clean up old releases.
cap deploy:cold           #Deploys and starts a 'cold' application.
cap deploy:migrate        #Run the migrate rake task.
cap deploy:migrations     #Deploy and run pending migrations.
cap deploy:pending        #Displays the commits since your last deploy.
cap deploy:pending:diff   #Displays the diff since your last deploy.
cap deploy:restart        #Restarts your application.
cap deploy:rollback       #Rolls back to a previous version and restarts.
cap deploy:rollback_code  #Rolls back to the previously deployed version.
cap deploy:setup          #Prepares one or more servers for deployment.
cap deploy:start          #Start the application servers.
cap deploy:stop           #Stop the application servers.
cap deploy:symlink        #Updates the symlink to the deployed version.
cap deploy:update         #Copies your project and updates the symlink.
cap deploy:update_code    #Copies your project to the remote servers.
cap deploy:upload         #Copy files to the currently deployed version.
cap deploy:web:disable    #Present a maintenance page to visitors.
cap deploy:web:enable     #Makes the application web-accessible again.
cap invoke                #Invoke a single command on the remote servers.
cap shell                 #Begin an interactive Capistrano session.

Now that’s more like it! But where did these new tasks come from? For that answer, we’ll look at Capfile:

$ cat Capfile
require 'capistrano/version'
load 'deploy' if respond_to?(:namespace) # cap2 differentiator
load 'config/deploy'

It’s short! As you can see, the cap command loads up recipes by reading the Capfile in the present directory. Just like Rake, Capistrano will search up the directory tree until it finds a Capfile, which means you can run cap in a subdirectory of your project.

The Capfile built by capify loads a bunch of standard Capistrano recipes from deploy, and also loads your project-specific recipes from config/deploy:

my_project> $ cat config/deploy.rb

set :application, "set your application name here"
set :repository,  "set your repository location here"

# If you aren't deploying to /u/apps/#{application} on the target
# servers (which is the default), you can specify the actual location
# via the :deploy_to variable:
# set :deploy_to, "/var/www/#{application}"

# If you aren't using Subversion to manage your source code, specify
# your SCM below:
# set :scm, :subversion

role :app, "your app-server here"
role :web, "your web-server here"
role :db,  "your db-server here", :primary => true

Configuring the Deployment

Minor edits to the boilerplate config/deploy.rb file are all that’s required to prepare your app for deployment. The deploy.rb file describes your application deployment in simple-to-read language.

Name Your Application

The first basic setting is the name of your application:

set :application, "set your application name here" # used as a folder name

Although the help text in deploy.rb has spaces in the application name, you probably don’t want them, because the application name will be used as a directory name on the deployment machine.

Repository Info

Next, we need to tell Capistrano where to find the source code for your application:

set :repository, "set your repository location here"

Assuming a Subversion server, set :repository to a Subversion URL (whether HTTP, svn, or svn+ssh). Another built-in assumption is that the username and password for the Subversion account are the same as those of your deploy user. The login name of the user running Capistrano will be used to connect to svn, and you will be prompted if your svn server requires authentication.
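
If your Subversion credentials do in fact differ from your deploy user, they can be supplied as variables instead. This sketch assumes Capistrano 2’s :scm_username and :scm_password variables and uses a placeholder repository URL:

```ruby
set :repository, "http://svn.example.com/my_project/trunk"  # placeholder URL

# Override the "same user/password as the deploy user" assumption:
set :scm_username, "svn_user"
set :scm_password, proc { Capistrano::CLI.password_prompt("svn password: ") }
```

Wrapping the password in a proc defers the prompt until the value is first needed.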

Define Roles

Next we’re going to point Capistrano at the domain name or IP address of your deployment machine(s). Capistrano will SSH to this address to perform the actions that you script. For this easy case, all three Roles (or classes) of machines will be the same.

role :app, "my_deployment_machine" #or you can use an IP address
role :web, "my_deployment_machine"
role :db,  "my_deployment_machine", :primary => true

Roles are a powerful means by which we can target execution of specific tasks on a class of machines. For instance, we can use cap shell to run a grep on all :app machines.

Extra Role Properties

In our example, the :db role is marked with the option :primary => true. This attribute indicates the primary database server. Certain tasks, such as database migrations, will run only on the primary database server, since the most common database-clustering scenario calls for slaves to sync up with the primary. You can also specify that a given role entry is a slave (using :slave => true) so that you can target certain types of tasks, such as backups, at it. Think of these attributes as qualifiers of the role, for finer-grained control; Capistrano provides standard qualifiers, but you can also define your own.
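
As a sketch of that idea, a custom :slave qualifier can be declared on role entries and then matched with Capistrano 2’s :only option on a task (the hostnames and the backup command here are invented for illustration):

```ruby
role :db, "master.example.com", :primary => true  # migrations run here
role :db, "slave1.example.com", :slave => true    # custom qualifier
role :db, "slave2.example.com", :slave => true

namespace :db do
  desc "Run backups on the slave databases only"
  task :backup, :roles => :db, :only => { :slave => true } do
    run "echo 'backup command would go here'"     # placeholder command
  end
end
```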

That’s it! Once the name of the application, the source-control information, and roles are defined, you are done with configuration.

A Little Spin, Please...

After a successful deploy, Capistrano is going to try to start (spin) your application. By default, it will look for ./script/spin, expecting that file to contain a script to start your application servers. It’s your job to write that script (since Rails doesn’t come with one), starting the app server of your choice. The easiest approach is to have your spin script talk to the spawner script, provided by Rails, since it knows how to fire up both FCGI and Mongrel.

You can read about the spawner tool easily; just type script/process/spawner --help at the console. A simple spin script to call spawner to start Mongrel[5] will look like this:

/deploy_to/current/script/process/spawner -p 8256 -i 2

Add that line of code as ./script/spin to your repository, and upon a successful deploy, Capistrano will start two Mongrel instances, listening on ports 8256 and 8257. The distinct advantage of using the standard Rails spawner is that it will track the process ids, which means other standard Rails scripts are available. This includes script/reaper, used for restarting, monitoring, and stopping application server instances.

If you decide to step outside the tightly integrated spin solution, perhaps because you want to roll in some background process management, or you want to use a third-party startup script such as mongrel_cluster, then you’ll have to override the standard deploy tasks. Later on in the chapter, in Baking Exercise #2, we’ll show you how to do just that.

Set Up the Deployment Machine

Now that we’ve got our default configuration in place, and all assumptions are covered, we can ask Capistrano to set up our deployment machine. What does setup mean? The deploy:setup task essentially creates the directory structure that holds your application deployments.

$ cap deploy:setup

Deployment Directory Structure

After running deploy:setup, SSH over to your deployment server, and peruse the directory structure that was created. The default is /u/apps/application_name, containing the following subdirectories:

releases
current
shared
shared/log
shared/system
shared/pids

This structure bears some discussion. Whenever you deploy, Capistrano checks out (or exports) your project from svn and places the files in the releases folder, each deployment in its own release folder named after the current date and time. After this happens successfully, Capistrano points the symbolic link application_name/current at the new release folder, and current is where your currently deployed web application is found.
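
You can mimic these mechanics by hand in a scratch directory with plain shell commands (nothing Capistrano-specific; the timestamp is invented):

```shell
#!/bin/sh
set -e

# Scratch version of Capistrano's deployment layout.
base=$(mktemp -d)/my_project
mkdir -p "$base/releases" "$base/shared/log" "$base/shared/system" "$base/shared/pids"

# Each deploy gets a timestamped folder under releases/ ...
release="$base/releases/20070101120000"
mkdir -p "$release"

# ... and "current" is a symlink repointed at the newest release.
ln -nfs "$release" "$base/current"

readlink "$base/current"
```

Repointing the symlink is cheap and nearly atomic, which is why rolling back is essentially just another ln -nfs at the previous release folder.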

Symbolic Links

Capistrano also makes the following symbolic links on each deployment:

  • application_name/shared/log is linked to your current project’s log directory, so that logs persist across releases.

  • application_name/shared/pids is linked to your current project’s tmp/pids directory.

  • application_name/shared/system is linked to your current project’s public/system directory. This is where Capistrano stores HTML files that show your project in maintenance mode. (See cap deploy:web:disable/enable.)

Checking a New Deployment Setup

Before we move on to our first deploy, Capistrano provides us with the deploy:check task to verify that all assumptions and pieces are in place:

cap --quiet deploy:check # quiet shuts out the verbosity

In addition to its default checks for permissions, required utilities (such as svn), and the like, deploy:check also provides a means to verify application-specific dependencies. These can be declared quite readably within your deploy.rb, for both local and remote dependencies:

depend :remote, :gem, "runt", ">= 0.3.0"
depend :remote, :directory, :writeable, "/var/www/current/config"
depend :remote, :command, "monit"
depend :local, :command, "svn"

When a dependency fails during the deploy:check task, you’ll see a message like the following one:

The following dependencies failed. Please check them and try again:
--> gem `runt' >= 0.3.0 could not be found (my_deployment_machine)

Deploy!

If you’ve done everything correctly up to now in setting up your application’s deploy.rb script and you have set up the remote machines, then you’re ready to actually do a deployment.

On a first deploy of this app, we’ll run cap deploy:cold; otherwise we’d run the default task, cap deploy. The only difference between these two is that cap deploy will try to shut down your server first, which won’t work since it isn’t running yet.

$ cap deploy:cold # cold will "svn co", run migrations, link this release
                  # to current, and start your servers

In most cases, you will be asked to enter the password for your svn server. This is a case where it is extremely handy to have set up SSH keys between your local machine and the deployment machines, so that you authenticate automatically with keys instead of having to enter passwords multiple times.

If all went well, you shouldn’t see any error messages, and a browser session will show your app is up. If not, read the verbose output, go over the assumptions, and start again.

Overriding Capistrano Assumptions

Now that we’ve learned how to use Capistrano with its standard assumptions, we’ll take a look at scenarios in which they need to be worked around. Some scenarios involve simply setting additional Capistrano variables, while others involve overriding existing tasks or hooking into callback functions. Some scenarios require entirely new tasks to be defined.

Using a Remote User Account

To use a remote user account other than the currently logged-in user, just set the :user variable to the desired remote user account name.

set :user, "deploy"

That was simple, and also makes an important point. Capistrano tries to simplify working outside the stated assumptions. Another example would be changing the source-control system to something other than Subversion.

Customizing the SCM System Used by Capistrano

Although the default SCM system is Subversion, Perforce, Bzr, and Darcs are also supported.

set :scm, :subversion # default. :perforce, :bzr, :darcs

You can also customize the deployment strategy, and we do not recommend sticking with the default, :checkout, which is quite inefficient: all those little .svn directories chew up disk space like crazy! Instead, try the :export option, which does an svn export of your codebase into the release directory. The :remote_cache strategy also performs well, since it keeps a cached copy of the code on the server and executes svn up to fetch only the latest changes, but we like :export best.

set :deploy_via, :checkout         # default
set :deploy_via, :export
set :deploy_via, :remote_cache # copies from cache, then svn up

Working without SCM Access from the Deployment Machine

Sometimes, particularly for security reasons, you don’t have (or want) access to SCM on your deployment machine. For these cases, Capistrano provides a means to deploy_via :copy. The :copy strategy tars and gzips your project before using SFTP to upload it to the release directory on the remote machine. In the rare case that your local machine does not have the required binaries, you can tell Capistrano to use Ruby’s internal zip library.

set :deploy_via, :copy         # local scm checkout;
                               # Cap will tar/gzip and sftp to the deployment machine

set :copy_strategy, :export    # changes deploy_via :copy to do an export
                               # instead of the default scm checkout

set :copy_compression, :zip    # if you don't have tar/gzip binaries,
                               # Cap will zip for you

Hopefully you are noticing that although Capistrano’s assumptions are good defaults, you have a great deal of flexibility in specifying alternatives, without having to write your own tasks.

What If I Don’t Store database.yml in My SCM Repository?

People sometimes leave configuration files such as database.yml out of the repository for a number of reasons, among them security concerns. We don’t advise this practice, since it complicates distributed development and local config requirements. However, learning how to work around the challenge posed by nonversioned configuration files provides us a valuable and realistic opportunity to teach you how to move beyond basic understanding of Capistrano.

Note that out of the following three options, there is only one that is really good in our opinion. However, you stand to learn from all three.

Option A: Version It Anyway, but with a Different Name

In this option, you add a file like production.database.yml to the repository, and then rename it to database.yml automatically on deployment. We might have gotten to this challenge because we didn’t want passwords in our repository to begin with, but at least some Rails developers will feel this is a legitimate solution. After all, the more popular reason to leave database.yml out of the repository is so that each developer on a team can have their own local database connection configuration.

  • Pros Easy solution. No default configuration; just create the additional task and add a file named production.database.yml to the repository.

  • Cons Plaintext passwords in the repository, which could be a big problem in some shops.

To implement this option, add a file named production.database.yml to the repository and give it your production database details. Then add the task definition (as shown in Listing 21.1) to config/deploy.rb.

Example 21.1. Copying a production.database.yml Configuration after Code Update

task :after_update_code, :roles => :app do
  run "cp #{release_path}/config/production.database.yml " \
      "#{release_path}/config/database.yml"
end

Tasks named :after_update_code function as callbacks, invoked after the code for the new release is updated. Here we also see demonstrated the run command, which gives you an idea of how easy it is to run commands remotely.

Option B: Store Production-Ready database.yml in the shared/config Folder

Here’s another solution, but take heed that it isn’t the best one either. We’re giving it to you mainly to demonstrate the :after_symlink callback.

  • Pros No username and password information stored in your repository.

  • Cons Requires manually copying configuration files to the deployment machine.

To implement this option, add the following task to config/deploy.rb:

task :after_symlink, :roles => :app do
  run "cp #{shared_path}/config/database.yml " \
      "#{release_path}/config/database.yml"
end

then...

 a. cap deploy:setup
 b. copy your production-ready database.yml file to the shared/config folder
 c. cap deploy:cold

Option C: The Best Option: Autogenerate database.yml

This one just sounds so right that we hope you don’t even consider the other two. You autogenerate database.yml (as shown in Listing 21.2) and put it in the shared config directory on the remote machine. Then you link it to your release’s config folder.

  • Pros Easy reuse, flexibility, and no passwords in the repository.

  • Cons A little harder to code, but we only do this once.[6]

Example 21.2. Create database.yml in Shared Path Based on Template

require 'erb'

before "deploy:setup", :db
after "deploy:update_code", "db:symlink"

namespace :db do
  desc "Create database yaml in shared path"
  task :default do
    db_config = ERB.new <<-EOF
    base: &base
      adapter: mysql
      socket: /tmp/mysql.sock
      username: #{user}
      password: #{password}

    development:
      database: #{application}_dev
      <<: *base

    test:
      database: #{application}_test
      <<: *base

    production:
      database: #{application}_prod
      <<: *base
    EOF

    run "mkdir -p #{shared_path}/config"
    put db_config.result, "#{shared_path}/config/database.yml"
  end

  desc "Make symlink for database yaml"
  task :symlink do
    run "ln -nfs #{shared_path}/config/database.yml " \
        "#{release_path}/config/database.yml"
  end
end

Then to use your new .yml file, execute the following commands:

$ cap deploy:setup
$ cap deploy:update
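
If you’d like to preview what the template will render before deploying, you can stand in for the Capistrano variables with plain Ruby; the values below are placeholders:

```ruby
require 'erb'

# Stand-ins for the Capistrano variables used by the template (placeholders).
application = "my_project"
user        = "db_user"
password    = "secret"

db_config = ERB.new <<-EOF
base: &base
  adapter: mysql
  socket: /tmp/mysql.sock
  username: #{user}
  password: #{password}

production:
  database: #{application}_prod
  <<: *base
EOF

yaml = db_config.result
puts yaml.include?("database: my_project_prod")  # => true
puts yaml.include?("username: db_user")          # => true
```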

What If My Migrations Won’t Run from 0 to 100?

The Capistrano recipe deploy:migrations expects your database.yml to specify a production database; it does a deployment as usual and then runs any pending migrations. Unfortunately, it is pretty common to get yourself into a situation where your whole suite of migrations won’t run without an error.

One possible solution is to create a task that sets up your database using db/schema.rb, a Ruby script that (using the Migrations API) handily stores the most recently migrated version of your database schema (DDL). The Capistrano task in Listing 21.3 loads the schema.

Example 21.3. Load Database Schema Remotely Using schema.rb

namespace :deploy do
  desc "runs 'rake db:schema:load' for current release"
  task :load_schema, :roles => :db do

    rake = fetch(:rake, "rake")
    rails_env = fetch(:rails_env, "production")

    run "cd #{current_release}; #{rake} RAILS_ENV=#{rails_env} db:schema:load"
  end
end

Run the task in two steps: cap deploy:cold followed by cap deploy:load_schema.

Useful Capistrano Recipes

One of the greatest things about Capistrano is the ease with which we can roll our own recipes. But before we do, there are some things we need to clear up about variables.

Variables and Their Scope

There are two ways we can specify variables. One is from the recipe definitions (as we’ve already seen in this chapter). The other is from the command line.

From the command line we can set variable values using either an -s or -S switch. Typing cap -s foo=bar is equivalent to having set :foo, "bar" after all your recipes are loaded, and cap -S foo=bar does so before recipes are loaded.

As with Rake, you can also specify variables within the OS environment. As this is a little tougher in Windows, Capistrano command-line switches are a better cross-platform solution.

Whenever possible, I prefer to create specific Capistrano tasks that set common variables rather than depend on shell scripts or command-line operations. This keeps the deployment knowledge within Capistrano, and not spread around your environment.

A topic that deserves a little investigation at this time is the scoping of Capistrano variables. Are variables local to the task? Local to the namespace? Or global to Capistrano in general? To figure out this bit of undocumented Capistrano, we’re going to write some tasks (Listing 21.4) to play with the possibilities. We’ll start by creating two namespaces (:one and :two) and assign identically named variables (:var_one) to them. Then we’ll set and get those values from a third namespace. The resulting values should teach us much about Capistrano’s scoping rules.

Example 21.4. Exploring Scoping of Capistrano Variables

namespace :one do
  task :default do
    set :var_one, "var_one value"
  end
end

namespace :two do
  task :default do
    set :var_one, "var_two value"
  end
end

namespace :three do
  task :default do
    puts "!!!! one.var_one == #{one.var_one}"
    puts "!!!! global name space var_one == #{var_one}"

    two.default

    puts "!!!! one.var_one == #{one.var_one}"
    puts "!!!! two.var_one == #{two.var_one}"
    puts "!!!! global name space var_one == #{var_one}"
  end
end

before "deploy:update", :one
before "deploy:update", :two
after "deploy:update", :three

$ cap deploy:update

Running the code in Listing 21.4 dumps the following output to the console:

  * executing `three' # run one
!!!! one.var_one == var_one value
!!!! global name space var_one == var_one value

  * executing `two' # run two
!!!! one.var_one == var_two value
!!!! two.var_one == var_two value
!!!! global name space var_one == var_two value

What does this mean? First we can see that referencing one.var_one and (the global) var_one returns the same value; looks as if they’re one and the same. Taking a second kick at it, in the second run we call task two.default. Setting only two.var_one confirms that there isn’t a local variable namespace—but simply one global namespace for variables.

In the preceding tester code you may have noticed the funny two.default call in the example—this is simply the way to call namespaced tasks. In the case of default, we use the explicit name (merely two won’t resolve as it does in the namespace syntax shortcut :two).

In summary, we have shown that variables aren’t scoped by namespace; referencing them by namespace will always return the global value and will probably confuse you and others down the line. But be aware that scoped variables are planned for a future version of Capistrano.
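
That observed behavior can be modeled as a single flat table of variables. The following toy stand-in (plain Ruby, not Capistrano code) reproduces the experiment’s result:

```ruby
# One global store, no matter which namespace's task calls set.
VARIABLES = {}

def set(name, value)
  VARIABLES[name] = value
end

def fetch(name)
  VARIABLES[name]
end

set :var_one, "var_one value"   # what namespace :one's task does
puts fetch(:var_one)            # => var_one value

set :var_one, "var_two value"   # namespace :two overwrites the same slot
puts fetch(:var_one)            # => var_two value
```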

Exercise #1: Staging

A particularly useful trick with Capistrano is the ability to initialize a starting configuration for tasks. A great use for such preinitialization is setting up the environment to which we will deploy.

We can do this in two ways; first, using the -S switch to set an initial value:

$ cap -S app_server=the_rails_way.com -S secure_ssh_port=8256 deploy

The deploy task will run with the app_server and secure_ssh_port variables set. This strategy rapidly gets out of hand as we need new parameters. Do we need an additional shell script to hold those parameters? Yuck! Capistrano can be a powerful ally, especially if you keep your deployment logic inside it rather than scattered around your environment.

The second, and much more DRY, method is to code tasks that define each of your staging environments, so that you can say something like cap production deploy, letting the production task set up the needed variables, just as we show you in Listing 21.5.

Example 21.5. Production and Staging Environment Tasks

desc "deploy to the production environment"
task :production do
   set :tag, "release-1.0" unless variables[:tag]
   set :domain, "the-rails-way.com"
   set :repository, "http://svn.nnovation.ca/svn/the-rails-way/tags/#{tag}"
   set :rails_env, "production"
   set :app_server, "the-rails-way.net"
   set :secure_ssh_port, 8256

   role :app, "#{app_server}:#{secure_ssh_port}"
   role :web, "#{app_server}:#{secure_ssh_port}"
   role :db,  "#{app_server}:#{secure_ssh_port}", :primary => true
end

desc "deploy to the staging environment"
task :staging do
  set :domain, "staging.the-rails-way.com"
  set :repository, "http://svn.nnovation.ca/svn/the-rails-way/trunk"
  set :rails_env, "development"
  set :app_server, "staging.the-rails-way.com"
  set :secure_ssh_port, 8256

  role :app, "#{app_server}:#{secure_ssh_port}"
  role :web, "#{app_server}:#{secure_ssh_port}"
  role :db,  "#{app_server}:#{secure_ssh_port}", :primary => true
end

Thus, we can deploy very concisely and without worrying about command-line parameters.

$ cap staging deploy    # trunk to staging
$ cap production deploy # tags/release-1.0 to production

Now I don’t know about you, but those repetitive role assignments smell a little. How about we just move them out of the tasks, perhaps below the two task definitions? We tried, and learned an important lesson when Capistrano reported that app_server and secure_ssh_port weren’t understood. Is it a matter of scoping? Is our execution order off?

The answer is that both scoping and execution order are coming into play. When you move the role assignments out of the task’s do..end blocks into the main body of the script, you are changing the timing of their evaluation. Those lines will in fact execute prior to the code that is inside of the task definition. So in this particular case, the role assignments would get executed before an app_server value is set.

Luckily for us, the solution is pretty simple and simplifies our code nicely. We’ll do a Capistrano version of an extract method refactoring, except with an additional task instead of a method.

Define a task named :finalize_staging_init, and then add a call to it at the end of the staging and production tasks.

task :staging do
  set :domain, "staging.the-rails-way.com"
  set :repository, "http://svn.nnovation.ca/svn/the-rails-way/trunk"
  set :rails_env, "development"
  set :app_server, "staging.the-rails-way.com"
  set :secure_ssh_port, 8256

  finalize_staging_init
end

task :production do
  set :tag, "release-1.0" unless variables[:tag]
  set :domain, "the-rails-way.com"
  set :repository, "http://svn.nnovation.ca/svn/the-rails-way/tags/#{tag}"
  set :rails_env, "production"
  set :app_server, "the-rails-way.net"
  set :secure_ssh_port, 8256

  finalize_staging_init
end

task :finalize_staging_init do
  role :app, "#{app_server}:#{secure_ssh_port}"
  role :web, "#{app_server}:#{secure_ssh_port}"
  role :db,  "#{app_server}:#{secure_ssh_port}", :primary => true
end

Exercise #2: Managing Other Services

A typical Rails deployment scenario involves a cluster of Mongrels, and perhaps some additional processes such as backgroundrb, memcache, and search engine daemons.

The canned deploy:start, :stop, and :restart tasks take care of a single Mongrel instance fronted by Apache. In this exercise, as shown in Example 21.6, we’re going to override the default tasks and insert our own to manage a Mongrel cluster and a BackgrounDRb installation. Note that this recipe doesn’t manage Apache, as I rarely bring it up or down; at this point you should know enough about Capistrano to understand the recipe and easily bake in Apache support.

Example 21.6. A Comprehensive Deploy Task

namespace :deploy do

  desc "Restart the Mongrel cluster and backgroundrb"
  task :restart, :roles => :app do
    stop
    start
  end

  desc "Start the mongrel cluster and backgroundrb"
  task :start, :roles => :app do
    start_mongrel
    start_backgroundrb
  end

  desc "Stop the mongrel cluster and backgroundrb"
  task :stop, :roles => :app do
    stop_mongrel
    stop_backgroundrb
  end

  desc "Start Mongrel"
  task :start_mongrel, :roles => :app do
    begin
      run "mongrel_cluster_ctl start -c #{app_mongrel_config_dir}"
    rescue RuntimeError => e
      puts e
      puts "Mongrel appears to be running already."
    end
  end

  desc "Stop Mongrel"
  task :stop_mongrel, :roles => :app do
    begin
      run "mongrel_cluster_ctl stop -c #{app_mongrel_config_dir}"
    rescue RuntimeError => e
      puts e
      puts "Mongrel appears to be down already."
    end
  end

  desc "Start the backgroundrb server"
  task :start_backgroundrb, :roles => :app do
    begin
      puts "starting brb in folder #{current_path}"
      run "cd #{current_path} && RAILS_ENV=#{rails_env} nohup ./script/backgroundrb start > /dev/null 2>&1"
    rescue RuntimeError => e
      puts e
      puts "Problems starting backgroundrb - running already?"
    end
  end

  desc "Stop the backgroundrb server"
  task :stop_backgroundrb, :roles => :app do
    begin
      puts "stopping brb in folder #{current_path}"
      run "cd #{current_path} && ./script/backgroundrb stop"
    rescue RuntimeError => e
      puts e
      puts "Backgroundrb appears to be down already."
    end
  end
end

Multiserver Deployments

Things are going great on your one server. But as life would have it, business is booming and you decide to build out a cluster of servers. Some are specialized to serve your application, others run the web server, one is dedicated to asynchronous processing, and so forth. A booming business sure is nice, but traditional deployment is no fun at all. Deploying to 10 machines? You might be tempted to call in sick on deployment day.

Not with Capistrano! It was built right from the beginning to handle multiserver deployments. Jamis really wanted to make deploying to 100 machines as easy as deploying to one. In fact, many people claim that is exactly where Capistrano shines brightest.

Capistrano succeeds so well in its multiserver handling that you won’t notice the difference either from the command line or in writing the task definitions. The secret sauce to multiserver deployments lies in the role[7] command, by which we can assign one or more deployment machines per task. When a task with many machines qualifies for simultaneous execution, each server assigned to the role will have the task executed—in parallel!

role :app, "uno.booming-biz.com", "dos.booming-biz.com"
role :db, "kahuna.booming-biz.com", :primary => true

Adding a second (or third) server to a role means that every qualifying task will automatically execute on the new machine as well: all tasks that specifically require the :app role, plus all tasks that don’t indicate any role at all.

namespace :monitor do
  task :exceptions, :roles => :app do
    run "grep 'Exception' #{shared_path}/log/*.log"
  end
end

As an example, running cap monitor:exceptions will run against the entire :app role of machines you throw at it. Capistrano will grep all log files in parallel, streaming a merged result back to your terminal.

Capistrano’s adherence to the DRY principle means that you can scale your physical deployment with zero impact on deployment rules and very little impact on your deployment configuration.
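You can see why scaling is configuration-only by modeling roles as plain data. This is a toy sketch, not Capistrano internals, and the third hostname is made up for illustration:

```ruby
# Roles are just named lists of hosts; a "task" runs its command
# against whatever hosts its role currently holds.
roles  = Hash.new { |hash, key| hash[key] = [] }
role   = ->(name, *hosts) { roles[name].concat(hosts) }
run_on = ->(name, cmd) { roles[name].map { |host| "#{host}: #{cmd}" } }

role.call(:app, "uno.booming-biz.com", "dos.booming-biz.com")
first = run_on.call(:app, "grep 'Exception' log/*.log")
first.size   # => 2

# Scaling out is one line of configuration; the task is untouched.
role.call(:app, "tres.booming-biz.com")   # hypothetical third server
second = run_on.call(:app, "grep 'Exception' log/*.log")
second.size  # => 3
```

The task definition never mentions a hostname, so growing the role from two servers to three changes configuration only, which is exactly the property Capistrano exploits.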

What about the impact on your deployment process? More machines means that mistakes and unexpected errors have bigger negative consequences, right? Not necessarily, since Capistrano has a feature that is usually associated with databases, not deployment systems: Transactions!

Transactions

A failed or incomplete install on one deployment machine can be tough enough to recover from; now consider having many more machines, each with its own particular flavor of install failure. Capistrano provides tasks with a transaction infrastructure that wraps and protects key deployment commands. It also has on_rollback handlers to ensure that we can recover from a disastrous deployment scenario with as little collateral damage as possible.

For example, look at the code for the :update_code and :symlink tasks—both have on_rollback blocks that clean up their respective actions, if necessary. This bears some similarity to the up-and-down migration methods of ActiveRecord.

namespace :deploy do
  task :update do
    transaction do
      update_code
      symlink
    end
  end

  task :update_code do
    on_rollback { run "rm -rf #{release_path}" }
    strategy.deploy!
    finalize_update
  end

  task :symlink, :except => { :no_release => true } do
    on_rollback do
      run "rm -f #{current_path}"
      run "ln -s #{previous_release} #{current_path}; true"
    end

    run "rm -f #{current_path} && ln -s #{release_path} #{current_path}"
  end
end

The preceding example is yanked right from Capistrano’s source code[8]—the :update task uses your SCM strategy of choice to update the deployed codebase, and then sets up the symbolic links of the newly installed application to the ./current folder.

But what if our app failed to deploy at strategy.deploy!, perhaps because of problems connecting to the Subversion server? Would the deploy continue creating symlinks? Or perhaps the Subversion deploy worked, but the symlinks failed to happen? Either scenario would leave our application in a fractured state. The problem would be compounded if we deployed successfully to one deployment machine, but failed on the second—problems would not necessarily be apparent right away!

task :update_code do
  on_rollback { run "rm -rf #{release_path}" }
  strategy.deploy!
  finalize_update
end

To mitigate the risks of failure and handle the greatest number of possible failure modes, the :update task is wrapped in a transaction. If a fault condition occurs, every on_rollback block registered so far is called, in reverse order. That’s why each on_rollback block should be designed to algorithmically reverse its own task’s operation.

For example, the preceding on_rollback block removes any files created by strategy.deploy! and finalize_update, correcting a potentially fractured deployment.
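The unwinding behavior is straightforward to model in plain Ruby. Here is a toy transaction (an illustrative sketch, not Capistrano’s actual implementation) that stacks a rollback handler as each step registers one, then calls the handlers in reverse when a step raises:

```ruby
log = []        # records the "remote commands" we pretend to run
rollbacks = []  # handlers registered so far, oldest first

run_transaction = lambda do |*steps|
  begin
    steps.each(&:call)
  rescue RuntimeError
    # On failure, unwind every registered handler, newest first.
    rollbacks.reverse_each(&:call)
  end
end

update_code = lambda do
  rollbacks << -> { log << "rm -rf release_path" }
  log << "checkout"
end

symlink = lambda do
  rollbacks << -> { log << "relink previous release" }
  raise "ln failed"   # simulate the symlink step blowing up
end

run_transaction.call(update_code, symlink)
log  # => ["checkout", "relink previous release", "rm -rf release_path"]
```

Note how the symlink rollback runs before the update_code rollback: the handlers unwind newest-first, mirroring how Capistrano reverses the tasks that have already executed.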

The transaction system employed by Capistrano isn’t like any that you may have encountered before. For example, it doesn’t keep track of local or remote object changes. This simple but effective transaction system puts you in control of the rollback. It should also be said that Capistrano doesn’t place migrations under a transaction—DDL transactions aren’t widely supported by databases,[9] which makes it very difficult to roll back a failed migration.

Proxied Access to Deployment Machines

Real-world deployments often protect application servers through the use of secure proxies and firewalls. The result is that we won’t be able to SSH directly to the deployment machine. However, don’t let that stop you from using Capistrano, thanks to its support for proxied access to deployment machines using the :gateway setting:

set :gateway, 'gateway.booming-biz.com'
role :app, "192.168.1.100", "192.168.1.101"
role :db, "192.168.1.200", :primary => true

Setting a :gateway will cause all requests to tunnel securely to your roled machines through the specified gateway machine. Capistrano assumes that the roled hosts are not directly accessible, so to reach them it first establishes a connection to gateway.booming-biz.com and sets up SSH tunnels from there.

It’s magical—well, the magic of port forwarding anyway.[10] Other than making sure that the roled machines can be reached through TCP/IP, there’s very little you need to do to support the gateway capability. In fact if you’re using passwords to authenticate there’s nothing else—you will be prompted. But if you’re using PKI, you’ll have to add the public key of your gateway server to your nonpublic roled servers.

Conclusion

This chapter gave you a crash course in using Capistrano to automate your Rails deployment tasks. It should have also pointed you in the right direction to begin using Capistrano as your systems administration helper, given its ability to reliably automate tasks and execute tasks in parallel across one or dozens of remote servers.

References

1. Switchtower, the original name, was changed to Capistrano in response to a trademark violation. For details see http://weblog.rubyonrails.org/2006/3/6/switchtower-is-now-capistrano.

2. Since it is well documented on the web, but now obsolete, we omit coverage of Capistrano 1.x. The site capify.org provides ample upgrade instructions for developers wanting to migrate to the latest versions of Capistrano.

3. This means “you,” Windows, although some have had success with cygwin.

4. This probably doesn’t need to be said, but please, please consider PKI—you’re gonna seriously reduce the possibility of break-ins.

5. Although Mongrel is today’s best-of-show choice, fastcgi can be as easily configured.

6. http://shanesbrain.net/articles/2007/05/30/database-yml-management-with-capistrano-2-0

7. You can also use the :host qualifier for a task, but rolling host assignments up to the role will simplify your life when you roll out more servers.

8. Use gem environment to find the gem source, and pore over Capistrano’s source code. This is a great way to learn.

9. MySQL doesn’t, but I understand that Postgres may support DDL transactions.

10. For all the gory details, read Jamis Buck’s post: http://weblog.jamisbuck.org/2006/9/26/inside-capistrano-the-gateway-implementation.
