Chapter 15. XML and Active Resource

Structure is nothing if it is all you got. Skeletons spook people if they try to walk around on their own. I really wonder why XML does not.

—Erik Naggum

XML doesn’t get much respect from the Rails community. It’s enterprisey. In the Ruby world, that other markup language, YAML (YAML Ain’t Markup Language), and the data interchange format JSON (JavaScript Object Notation) get a heck of a lot more attention. However, use of XML is a fact of life for many projects, especially when it comes to interoperability with legacy systems. Luckily, Ruby on Rails gives us some pretty good functionality related to XML.

This chapter examines how to both generate and parse XML in your Rails applications, starting with a thorough examination of the to_xml method that most objects have in Rails.

15.1 The to_xml Method

Sometimes you just want an XML representation of an object, and Active Record models provide easy, automatic XML generation via the to_xml method. Let’s play with this method in the console and see what it can do.

I’ll fire up the console for my book-authoring sample application and find an Active Record object to manipulate.

image

There we go, a User instance. Let’s see that instance in its generic XML representation.

image

Ugh, that’s ugly. Printing the string with Ruby’s print function might help us out here.

image

Much better! So what do we have here? Looks like a fairly straightforward serialized representation of our User instance in XML.

15.1.1 Customizing to_xml Output

The standard processing instruction is at the top, followed by an element name corresponding to the class name of the object. The properties are represented as subelements, with non-string data fields including a type attribute. Mind you, this is the default behavior, and we can customize it with some additional parameters to the to_xml method.

We’ll strip down that XML representation of a user to just an email and login using the only parameter. It’s provided in a familiar options hash, with the value of the :only parameter as an array:

image

Following the familiar Rails convention, the only parameter is complemented by its inverse, except, which will exclude the specified properties. What if I want my user’s email and login as a snippet of XML that will be included in another document? Then let’s get rid of that pesky instruction, too, using the skip_instruct parameter.

image

We can change the root element in our XML representation of User and the indenting from two to four spaces by using the root and indent parameters, respectively.

image

By default Rails converts CamelCase and underscore attribute names to dashes as in created-at and client-id. You can force underscore attribute names by setting the dasherize parameter to false.

image

In the preceding output, the attribute type is included. This too can be configured using the skip_types parameter.

image
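Since the options above are easier to compare side by side, here is a toy sketch that imitates the shape of the output for a flat hash of attributes. It is not Rails code and not how to_xml is implemented: toy_to_xml is a hypothetical helper, it handles only a handful of value types, and it assumes the option names behave as described in this section.

```ruby
# Toy illustration only: this is NOT how Rails implements to_xml.
# toy_to_xml is a hypothetical helper that mimics the documented output
# shape for a flat hash of attributes.
def toy_to_xml(root, attributes, options = {})
  attrs = attributes
  attrs = attrs.select { |name, _| options[:only].include?(name) } if options[:only]
  attrs = attrs.reject { |name, _| options[:except].include?(name) } if options[:except]

  # Underscores become dashes unless :dasherize is explicitly false.
  rename = lambda do |name|
    options[:dasherize] == false ? name.to_s : name.to_s.tr("_", "-")
  end
  # Minimal escaping so values cannot break the markup.
  escape = lambda do |value|
    value.to_s.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
  end

  lines = []
  lines << %(<?xml version="1.0" encoding="UTF-8"?>) unless options[:skip_instruct]
  root_name = rename.call(options[:root] || root)
  lines << "<#{root_name}>"
  attrs.each do |name, value|
    # Non-string values get a type attribute unless :skip_types is set.
    type = case value
           when Integer then "integer"
           when Float then "float"
           when true, false then "boolean"
           end
    type_attr = type && !options[:skip_types] ? %( type="#{type}") : ""
    tag = rename.call(name)
    lines << "  <#{tag}#{type_attr}>#{escape.call(value)}</#{tag}>"
  end
  lines << "</#{root_name}>"
  lines.join("\n")
end

user = { id: 1, login: "obie", client_id: 7 }
puts toy_to_xml(:user, user, only: [:login, :client_id], skip_instruct: true)
```

Playing with the options on a toy like this makes it easier to predict what the real method will emit before you try it in the console.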

15.1.2 Associations and to_xml

So far we’ve only worked with a base Active Record object and not with any of its associations. What if we wanted an XML representation of not just a book but also its associated chapters? Rails provides the :include parameter for just this purpose. The :include parameter will also take an array of associations to represent in XML.

image

image

Rails 3 has a much more useful to_xml method on core classes. Unlike Rails 2, arrays are easily serializable to XML, with element names inferred from the name of the Ruby type:

image

If you have mixed types in the array, this is also reflected in the XML output:

image

To construct a more semantic structure, the root option on to_xml triggers more expressive element names:

image

Ruby hashes are naturally representable in XML, with keys corresponding to element names, and their values corresponding to element contents. Rails automatically calls to_s on the values to get string values for them:

image


JoshG says ...

This simplistic serialization may not be appropriate for certain interoperability contexts, especially if the output must pass XML Schema (XSD) validation, where the order of elements is often important. In Ruby 1.8.x, the Hash class does not order keys for enumeration. In Ruby 1.9.x, the Hash class uses insertion order. Neither of these may be adequate for producing output that matches an XSD. The section “The XML Builder” will discuss Builder::XmlMarkup to address this situation.


The :include option of to_xml is not used on Array and Hash objects.

15.1.3 Advanced to_xml Usage

By default, Active Record’s to_xml method only serializes persistent attributes into XML. However, there are times when transient, derived, or calculated values need to be serialized out into XML form as well. For example, our User model has a method that returns only draft timesheets:

image

To include the result of this method when we serialize the XML, we use the :methods parameter:

image

We could also set the methods parameter to an array of method names to be called.

15.1.4 Dynamic Runtime Attributes

In cases where we want to include extra elements unrelated to the object being serialized, we can pass to_xml a block, or use the :procs option.

If we apply the same logic across different to_xml calls, we can construct lambdas ahead of time and use one or more of them in the :procs option. They will be called with to_xml’s option hash, through which we access the underlying XML builder, a Builder::XmlMarkup instance. (Builder::XmlMarkup provides the principal means of XML generation in Rails.)

image

image

Note that the :procs are applied to each top-level resource in the collection (or the single resource if the top level is not a collection). Use the sample application to compare the output with the output from the following:

image

To add custom elements only to the root node, to_xml will yield a Builder::XmlMarkup instance when given a block:

image

Unfortunately, both :procs and the optional block are hobbled by a puzzling limitation: The record being serialized is not exposed to the procs being passed in as arguments, so only data external to the object may be added in this fashion.

To gain complete control over the XML serialization of Rails objects, you need to override the to_xml method and implement it yourself.

15.1.5 Overriding to_xml

Sometimes you need to do something out of the ordinary when trying to represent data in XML form. In those situations, you can create the XML by hand.

image

This would give the following result:

image

Of course, you could just go ahead and use good object-oriented design, with a separate class responsible for translating between your model and an external representation.

15.2 The XML Builder

Builder::XmlMarkup is the class used internally by Rails when it needs to generate XML. When to_xml is not enough and you need to generate custom XML, you will use Builder instances directly. Fortunately, the Builder API is one of the most powerful Ruby libraries available and is very easy to use, once you get the hang of it.

The API documentation says: “All (well, almost all) methods sent to an XmlMarkup object will be translated to the equivalent XML markup. Any method with a block will be treated as an XML markup tag with nested markup in the block.”

That is a very concise way of describing how Builder works, but it is easier to understand with some examples, again taken from Builder’s API documentation. The xm variable is a Builder::XmlMarkup instance:

image
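To make the “methods become markup” idea concrete, here is a minimal sketch of the underlying trick, written with nothing but method_missing. TinyMarkup is a hypothetical toy class, not Builder::XmlMarkup: it does no escaping or indentation and exists only to show how a method call with a block can turn into a tag with nested content.

```ruby
# TinyMarkup is a hypothetical toy, not Builder::XmlMarkup itself.
class TinyMarkup
  attr_reader :target

  def initialize
    @target = +""   # unfrozen string that accumulates the markup
  end

  # Any unknown method name becomes a tag. A block means nested markup;
  # otherwise the first argument is the tag's text content.
  def method_missing(tag, content = nil, attrs = {})
    attr_text = attrs.map { |key, value| %( #{key}="#{value}") }.join
    if block_given?
      @target << "<#{tag}#{attr_text}>"
      yield self
      @target << "</#{tag}>"
    else
      @target << "<#{tag}#{attr_text}>#{content}</#{tag}>"
    end
    @target
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

xm = TinyMarkup.new
xm.user do
  xm.login "obie"
  xm.email "obie@example.com"
end
puts xm.target
# prints <user><login>obie</login><email>obie@example.com</email></user>
```

The real Builder library adds escaping, indentation, processing instructions, and much more on top of this basic mechanism.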

A common use for Builder::XmlMarkup is to render XML in response to a request. Previously we talked about overriding to_xml on Active Record to generate our custom XML. Another way, though not as recommended, is to use an XML template.

We could alter our UsersController show method to use an XML template by changing it from:

image

Now Rails will look for a file called show.xml.builder in the RAILS_ROOT/app/views/users directory. That file contains Builder::XmlMarkup code like:

image

In this view, the variable xml is an instance of Builder::XmlMarkup. Just as in other views, we have access to the instance variables we set in our controller, in this case @user. Using the Builder in a view can provide a convenient way to generate XML.

15.3 Parsing XML

Ruby has a full-featured XML library named REXML, and covering it in any level of detail is outside the scope of this book. If you have basic parsing needs, such as parsing responses from web services, you can use the simple XML parsing capability built into Rails.

15.3.1 Turning XML into Hashes

Rails lets you turn arbitrary snippets of XML markup into Ruby hashes, with the from_xml method that it adds to the Hash class.

To demonstrate, we’ll throw together a string of simplistic XML and turn it into a hash:

image

There are no options for from_xml. You can also pass it an IO object:

image

15.3.2 Typecasting

Typecasting is done by using a type attribute in the XML elements. For example, here’s the auto-generated XML for a User object.

image

As part of the to_xml method, Rails sets attributes called type that identify the class of the value being serialized. If we take this XML and feed it to the from_xml method, Rails will typecast the strings to their corresponding Ruby objects:

image
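As a rough illustration of what typecasting involves, the sketch below uses REXML directly to turn a flat document into a hash, converting each value according to its type attribute. This is not the from_xml implementation: flat_xml_to_hash and cast_value are hypothetical helpers, only a few types are handled, and the sample XML is made up for the example.

```ruby
require "rexml/document"
require "time"

# Hypothetical helper: converts a string according to its type attribute.
def cast_value(text, type)
  case type
  when "integer"  then Integer(text)
  when "float"    then Float(text)
  when "boolean"  then text == "true"
  when "datetime" then Time.iso8601(text)
  else text # no type attribute (or an unknown one): keep the string
  end
end

# Hypothetical helper: handles one level of child elements only.
def flat_xml_to_hash(xml)
  root = REXML::Document.new(xml).root
  fields = {}
  root.elements.each do |element|
    key = element.name.tr("-", "_") # created-at becomes created_at
    fields[key] = cast_value(element.text, element.attributes["type"])
  end
  { root.name => fields }
end

xml = <<~XML
  <user>
    <id type="integer">1</id>
    <login>obie</login>
    <admin type="boolean">true</admin>
    <created-at type="datetime">2010-05-18T19:31:40Z</created-at>
  </user>
XML

p flat_xml_to_hash(xml)
```

Elements without a type attribute, such as login here, simply stay strings.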

15.4 Active Resource

Web applications often need to serve both users in front of web browsers and other systems via some API. Other languages accomplish this using SOAP or some form of XML-RPC, but Rails takes a simpler approach. In Chapter 3, REST, Resources, and Rails, we talked about building RESTful controllers and using respond_to to return different representations of resources. By doing so we could connect to http://localhost:3000/auctions.xml and get back an XML representation of all auctions in the system. We can now write a client to consume this data using Active Resource.

Active Resource is a standard part of the Rails framework. It has complete understanding of RESTful routing and XML representation, and is designed to look and feel much like Active Record.

15.4.1 List

The simplest Active Resource model would look something like this:

image

To get a list of auctions we would call its all method:

>> auctions = Auction.all

This will connect to http://localhost:3000/auctions.xml.

Active Resource can’t automatically filter the resources like you would with Active Record’s where method, but you can use :params to pass options to the server, which can then filter the results.

image

And then from the consumer application, you might do:

>> auctions = Auction.all(:params => { :reserve => 100 })

This method, however, could easily become unmanageable, since in reality you would want to filter out unsupported params. A much better solution when you want to filter your results is to define a custom collection method on the server, and query against that instead.

image

It is then trivial to query this collection from Active Resource:

>> Auction.all(:from => :open)

Active Resource also supports nested resource routes like those discussed in Chapter 3, “REST, Resources, and Rails.”

image

And now from your consumer application, you can pull back all of the items for an auction:

>> Item.all(:params => {:auction_id => 1})
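To keep the four kinds of collection request straight, it can help to see the URLs they map to. The helper below is a hypothetical sketch, not part of Active Resource. It assumes the XML format used throughout this chapter and a site prefix of the form http://localhost:3000/auctions/:auction_id for the nested Item resource.

```ruby
require "uri"

# Hypothetical helper (not part of Active Resource) that mirrors the URL
# patterns described in this section.
def collection_url(site, collection, from: nil, params: {})
  remaining = params.dup
  # Fill /:placeholders in the site prefix from params, as nested routes do.
  prefix = site.gsub(%r{/:(\w+)}) { "/#{remaining.delete($1.to_sym)}" }
  path = from ? "#{prefix}/#{collection}/#{from}.xml" : "#{prefix}/#{collection}.xml"
  # Whatever was not consumed by the prefix becomes the query string.
  query = URI.encode_www_form(remaining)
  query.empty? ? path : "#{path}?#{query}"
end

site = "http://localhost:3000"
puts collection_url(site, "auctions")                            # Auction.all
puts collection_url(site, "auctions", params: { reserve: 100 })  # :params
puts collection_url(site, "auctions", from: :open)               # :from => :open
puts collection_url("#{site}/auctions/:auction_id", "items",
                    params: { auction_id: 1 })                   # nested items
```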

15.4.2 Show

Finding specific resources with Active Resource follows the same pattern as retrieving a collection. To fetch the auction with the id 1986, for instance, we can do:

>> Auction.find(1986)

If instead we just want to get the first auction, we can do:

>> Auction.first

You should note that Auction.first is equivalent to calling Auction.all.first (i.e., it will load http://localhost:3000/auctions.xml and then call first on the returned collection).

If we want to find the newest Auction, we can do something similar to the open example, but with a newest method.

image

Now we can retrieve the newest auction.

>> Auction.find(:one, :from => :newest)

You need to remember that unlike with Active Record, first is not the same as find(:one).

It’s also important to understand how a request to a nonexistent item is handled. If we tried to access an item with an id of -1 (there isn’t any such item), we would get an HTTP 404 status code back. This is exactly what Active Resource receives, and it responds by raising a ResourceNotFound exception. Active Resource makes heavy use of the HTTP status codes, as we’ll see throughout this chapter.
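The general idea of translating status codes into exceptions can be sketched as a simple lookup. The function below is hypothetical and deliberately incomplete. The exception class names are the ones we recall from Active Resource, given as strings so the sketch runs without the library, and you should verify them against the version you are using.

```ruby
# Hypothetical lookup; class names are strings so that this runs without
# Active Resource installed. Redirects and several other codes are left
# out of the sketch and simply return nil.
def exception_name_for(status)
  case status
  when 200...300 then nil # success: no exception
  when 401 then "ActiveResource::UnauthorizedAccess"
  when 404 then "ActiveResource::ResourceNotFound"
  when 422 then "ActiveResource::ResourceInvalid"
  when 400...500 then "ActiveResource::ClientError"
  when 500...600 then "ActiveResource::ServerError"
  end
end

puts exception_name_for(404)
# prints ActiveResource::ResourceNotFound
```

Because the specific codes are checked before the 400...500 range, a 404 is reported as ResourceNotFound rather than as a generic client error.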

15.4.3 Create

Active Resource is not limited to just retrieving data; it can also create it. If we wanted to place a new bid on an item via Active Resource, we would do the following:

image

This would issue an HTTP POST to the URL http://localhost:3000/auctions/3/items/6/bids.xml with the supplied data. In our controller, the following would exist:

image

If the bid is successfully created, the newly created bid is returned with an HTTP 201 status code and the Location header is set pointing to the location of the newly created bid. With the Location header set, we can determine what the newly created bid’s id is. For example:

image
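Recovering an id from a Location header boils down to taking the last path segment and dropping any format extension. The helper below is a hypothetical sketch of that step, with a regular expression of our own and a made-up example URL. It is not a copy of Active Resource’s internal code.

```ruby
# Hypothetical helper: pulls the trailing id out of a Location header value.
def id_from_location(location)
  return nil if location.nil?
  # Last path segment, minus an optional extension such as .xml
  location[%r{/([^/]+?)(\.\w+)?\z}, 1]
end

puts id_from_location("http://localhost:3000/auctions/3/items/6/bids/12.xml")
# prints 12
```

Note that the id comes back as a string, since it was extracted from a URL.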

If we tried to create the preceding bid again but without a dollar amount, we could interrogate the errors.

image

In this case a new Bid object is returned from the create method, but it’s not valid. If we try to see what its id is, we also get nil. We can see what caused the create to fail by calling the ActiveResource#errors method. This method behaves just like ActiveRecord#errors, with one important exception. With Active Record, if we called Errors#on, we would get the error for that attribute. In the preceding example, we got nil instead. The reason is that Active Resource, unlike Active Record, doesn’t know what attributes there are. Active Record does a SHOW FIELDS FROM <table> to get this, but Active Resource has no equivalent. The only way Active Resource knows an attribute exists is if we tell it. For example:

image

In this case we told Active Resource that there is an amount attribute through the create method. As a result we can now call Errors#on without a problem.

15.4.4 Update

Editing an Active Resource follows the same Active Record pattern.

image

If we set the amount to nil, ActiveResource.save would return false. In this case we could interrogate ActiveResource::Errors for the reason, just as we would with create. An important difference between Active Resource and Active Record is the absence of the save! and update! methods.

15.4.5 Delete

Removing an Active Resource can happen in two ways. The first is without instantiating the Active Resource:

>> Bid.delete(1)

The other way requires instantiating the Active Resource first:

>> bid = Bid.find(1)
>> bid.destroy

15.4.6 Headers

Active Resource allows for the setting of HTTP headers on each request too. This can be done in two ways. The first is to set it as a variable:

image

This will cause every connection to the site to include the HTTP header: HTTP-X-FLAVOR: orange. In our controller, we could use the header value.

image

The second way to set the headers for an Active Resource is to override the headers method.

image

15.4.7 Customizing URLs

Active Resource assumes RESTful URLs, but that doesn’t always happen. Fortunately, you can customize the URL prefix and collection_name. Suppose we have the following Active Resource class:

image

The following URLs will be used:

image

We could also change the element name used to generate XML. In the preceding Active Resource, a create of an OldAuctionSystem would look like the following in XML:

image

The element name can be changed with

image

which will produce:

image

One consequence of setting the element_name is that Active Resource will use its plural form to generate URLs. In this case it would be 'auctions' and not 'OldAuctionSystems'. If the server still expects the original collection path, you will need to set the collection_name as well.

It is also possible to set the primary key field Active Resource uses with self.primary_key:

image

15.4.8 Hash Forms

The methods find, create, save, and delete correspond to the HTTP methods of GET, POST, PUT, and DELETE, respectively. Active Resource has a method for each of these HTTP methods, too. They take the same arguments as find, create, save, and delete but return a hash of the XML received.

image

15.5 Active Resource Authentication

Active Resource comes with support for both HTTP Basic and HTTP Digest Authentication, as well as SSL authentication using X.509 certificates. Each has various compromises of simplicity, strength, interoperability, and infrastructure/system-administration support needs.

As with most HTTP clients and servers, MD5 is the only hashing algorithm supported in HTTP Digest. This is the only algorithm mentioned by RFC 2617, but Rails supports the extended properties of the RFC that strengthen the protocol despite the hashing algorithm used.

Other authentication mechanisms, like OAuth, CAS, and Kerberos, can be found in HTTP servers, middleware, Ruby gems, and Rails plugins.

15.5.1 HTTP Basic Authentication

When using Basic Authentication, the credentials are sent in plain text and as such can be easily snooped. For this reason, an HTTPS connection should be used when using Basic Authentication.

Here is a basic model class that consumes a RESTful service to obtain data, and specifies credentials for an authenticated connection to the service:

image

You can also use URI-style credentials by putting them in the service’s URL. This is particularly useful if you have a fully-qualified URL in a configuration file that has been supplied by the service provider:

image

As soon as you supply any credential to the API, Active Resource will automatically attempt to authenticate on each connection. If the username and/or password is invalid, an ActiveResource::ClientError is generated and handled in the consuming application.
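To see why Basic Authentication needs HTTPS, it helps to look at what actually goes over the wire. The sketch below uses only Ruby’s standard library: it parses URI-style credentials and builds the Authorization header by hand. The host name and credentials are made up, and basic_auth_header is a hypothetical helper, not an Active Resource method.

```ruby
require "uri"

# Hypothetical helper: builds a Basic Authorization header value.
# pack("m0") Base64-encodes without inserting line breaks.
def basic_auth_header(user, password)
  "Basic " + ["#{user}:#{password}"].pack("m0")
end

# URI-style credentials, as a service provider might supply them.
site = URI.parse("https://obie:secret@api.example.com/")
header = basic_auth_header(site.user, site.password)
puts header

# Anyone who sees the header can reverse it. Base64 is encoding, not encryption.
puts header.split(" ").last.unpack1("m0")
# prints obie:secret
```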

15.5.2 HTTP Digest Authentication

Setting the auth_type tells Active Resource to use Digest Authentication.

image

It’s as simple as that! Rails takes care of the rest (pardon the pun).

Dealing with only a hashed value (HA1 being the hash of colon-separated username, authentication realm, and password) is good, as your password is never transmitted — except perhaps when you (re-)set it. However, if the repository storing the HA1 is compromised, passwords will have to be reset (even if it’s just to the same password using a new secret or realm) as the HA1 could then be used by anyone to access your account on that server only. They still won’t know your password or be able to use the HA1 within another authentication realm. As such, despite its many known limitations and interoperability issues, Digest is definitely a step above Basic.
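The HA1 value itself is easy to compute, which makes the realm’s role easier to see. This is a minimal sketch using Ruby’s standard Digest library; the username, realm names, and password are made up for illustration, and ha1 is a hypothetical helper name.

```ruby
require "digest/md5"

# HA1 is the MD5 hash of "username:realm:password".
def ha1(username, realm, password)
  Digest::MD5.hexdigest([username, realm, password].join(":"))
end

first  = ha1("obie", "Auctions", "secret")
second = ha1("obie", "Timesheets", "secret")

puts first
puts second
# Same user and password, different realm: the hashes differ, so a stolen
# HA1 is of no use within another authentication realm.
puts first == second
# prints false
```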

15.5.3 Certificate Authentication

Certificate authentication is a type of public key authentication. You may also hear it referred to as “client-side certificate authentication.” When used in conjunction with username/password credentials, it is a form of two-factor authentication, as it involves something you have (the certificate) and something you know (the credentials).

In this form of SSL-based authentication, the server provides its certificate as usual (creating the SSL connection), and then the client provides its certificate so that the server continues with the SSL session.

15.5.4 Proxy Server Authentication

Sometimes your Active Resource model may need to access a service on another network that is only accessible through a proxy server on your network (a “forward” proxy). This is often the case in your development environment, where you may have to access the Internet through a proxy server, or perhaps with an intranet application that needs data from the Internet.

In particularly thrifty enterprise networks (where Internet access is actively discouraged), the proxy server may even require authentication. It is far better to work with the infrastructure teams to remove the need for proxy authentication from selected machines (like your development workstation, and the production server even more so), and preferably to have no explicit proxy at all.


JoshG says ...

If the organization hasn’t made it to the 90’s yet with its Internet connectivity, or only trusts its information technologists as far as it can kick them, you may have bigger problems than configuring your Rails app.


You can connect through your proxy server, supplying the additional credentials it requires, either by providing a URI:

image

or using URI-style:

image

15.5.5 Authentication in the Web Service Controller

On the other side of the connection, in the RESTful service that our Active Resource model is consuming, we can use the authentication built into Rails:

image

If the service is supporting HTTP Digest Authentication:

image

The authenticate_or_request_with_http_digest method will first try to authenticate using an HA1-style digest password (which is what our example above uses). If that fails, it will attempt to hash a plain text password and match it against the hash in the request.

Initial authentication of client certificates is done by whatever part of your HTTP stack negotiates the SSL session (e.g., httpd, nginx), not in your Rails application.

Depending on your infrastructure technology, you may have access to additional environment variables like SSL_CLIENT_CERT, REMOTE_USER, and X-HTTP_AUTHORIZATION. These can be used for deeper authentication (e.g., comparing a certificate’s DN, email, and CN) and for authorization (to verify if an authenticated user is allowed to perform specific actions).

15.6 Conclusion

In practice, the to_xml and from_xml methods meet the XML handling needs for most situations that the average Rails developer will ever encounter. Their simplicity masks a great degree of flexibility and power, and in this chapter we attempted to explain them in sufficient detail to inspire your own exploration of XML handling in the Ruby world.

As a pair, the to_xml and from_xml methods also enabled the creation of a framework that makes tying Rails applications together using authenticated RESTful web services drop-dead easy. That framework is named Active Resource, and this chapter gave you a crash-course introduction to it.
