Sessions allow an application to customize the responses sent out to different users. They do this by enabling us to store data about each session and use that data as the foundation for statefulness between requests. When used in conjunction with user authentication, session data also means that we can trust that a request comes from a particular user without having to reauthenticate on subsequent requests.
Strictly speaking, HTTP is a stateless protocol, and every request that comes to an HTTP server is treated like any other. However, at an application level we can transcend the limitations of statelessness using sessions. If we had to construct sessions from the ground up, we would have to design a way to distribute HTTP cookies with session IDs and then store session data either client- or server-side. However, Merb’s built-in session support eliminates the need to do any of this work, typically streamlining it down to the selection of configuration options and interaction with a hash named session.
Merb’s emphasis on hackability is not lost, however, through the streamlining of session maintenance. In this chapter we’ll explore the construction of session containers and stores and find that building them poses little difficulty to the developer with such a need. One warning, though: sessions also bring up security concerns, and we’ll point these out as they arise and explain how to deal with them.
Assuming that sessions are enabled on a Merb application, a session ID is established for each capable browser that sends a request. This session ID is stored client-side as an encrypted value inside an HTTP cookie. All subsequent requests made by the user’s browser will include this cookie as a header, and Merb will decrypt the session ID from it. In this way we can establish and maintain the identity of the individual users of our application.
However, data related to each session must be stored separately, and Merb provides us with multiple storage options. These include saving data in memory, in a database, cached with memcache, or even within the HTTP cookie itself. Each of these options is appropriate under different conditions. We’ll go over when and how to use them later on.
Not all applications need sessions to the same degree and some do not need them at all. At the same time, different environments make different demands on how session data must be stored and accessed. For these reasons, our sessions must be configured.
Configuration is typically done in either config/init.rb or one of the environment files in config/environments. Remember that any configuration settings made in an environment file override what is in the init script. Consequently, the configuration in the init script tends to be the development configuration and acts only as a template for the other environments.
There are five basic values for configuration that can be set, but depending on the storage mechanism there may be additional settings to configure:
• session_store: the type of session to use (cookie, memory, memcache, container, datamapper, etc.)
• session_id_key: defaults to _session_id but can be changed when you need to differentiate sessions for the same domain
• session_expiry: defaults to two weeks, but its value can be set as an integer in seconds
• session_secret_key: a string of at least 16 characters used when generating the digest that protects the cookie session store
• default_cookie_domain: the domain the cookies are for
Typically, some of these values are set from within config/init.rb as follows:
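As a rough illustration of the settings listed above, the sketch below writes the five values as a plain Ruby hash and then applies an environment-level override. The constant names and every value shown are placeholders chosen for this example, not taken from a real application. In an actual config/init.rb the same keys would normally be assigned inside Merb’s configuration block rather than in a standalone hash.

```ruby
# Placeholder values for the five basic session settings. In a real
# config/init.rb these keys would typically be assigned inside a block
# such as:
#   Merb::Config.use { |c| c[:session_store] = 'cookie' }
INIT_SESSION_CONFIG = {
  :session_store         => 'cookie',
  :session_id_key        => '_session_id',
  :session_expiry        => 14 * 24 * 60 * 60, # two weeks, in seconds
  :session_secret_key    => 'change-me-to-a-long-random-string',
  :default_cookie_domain => 'example.com'
}

# Settings made in an environment file win over those in the init script.
# This hash stands in for something like config/environments/production.rb.
PRODUCTION_OVERRIDES = { :session_store => 'memcache' }

EFFECTIVE_SESSION_CONFIG = INIT_SESSION_CONFIG.merge(PRODUCTION_OVERRIDES)
```

Only the store type changes in the merged result; the remaining keys keep the values given in the init script.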
The mechanics behind sessions are for the most part not necessary for an application developer to know. However, in the interest of understanding the tools we use, we’ll cover the topic in some detail. Before jumping in, you should know that there are two central classes in session storage: SessionContainer and SessionStoreContainer.
The fundamental structure storing individual session data in Merb is a session container. The class SessionContainer inherits from Mash, making its keys accessible by either symbol or equivalent string. The reason the Mash class has been used here again is so that request parameter data may easily be stored within session containers without complication.
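To make the symbol-or-string behavior concrete, here is a minimal stand-in. TinyMash is a hypothetical class written for this example; the real Mash covers many more methods. The sketch only shows the idea of normalizing every key to a string.

```ruby
# A minimal stand-in for Mash: keys are normalized to strings, so
# :user_id and "user_id" address the same entry.
class TinyMash < Hash
  def [](key)
    super(key.to_s)
  end

  def []=(key, value)
    super(key.to_s, value)
  end

  def key?(key)
    super(key.to_s)
  end
end

params_like = TinyMash.new
params_like[:user_id] = 42      # written with a symbol
params_like['user_id']          # read back with the equivalent string
```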
Nearly all the methods of SessionContainer are effectively stubs. This is because the class is intended to be made more concrete through various subclasses of containers for particular storage mechanisms. We get a feel for this through the subclasses class attribute, which is used upon inheritance to grow a list of subtypes.
The class methods setup and generate are used to indirectly initialize a new store with a unique session ID. Along with the instance methods finalize and regenerate, they must be defined by a more concrete container class. However, both clear! and session_id= are basically in their final forms; clear! simply empties the mash, and session_id= changes the ID while flagging that a new cookie is needed. The method session_id= was built for cases where a session ID has been exposed and a new one must be used to avoid compromising the session.
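The shape described above can be sketched roughly as follows. This is not Merb’s source: the class names, the known_subclasses attribute, and the needs_new_cookie flag are hypothetical, and a plain Hash stands in for Mash. The point is only to show stubbed methods, a subclass registry filled in on inheritance, and the two methods that are already in their final form.

```ruby
require 'securerandom'

# Hypothetical, simplified base container.
class SketchContainer < Hash
  @known_subclasses = []

  class << self
    attr_reader :known_subclasses

    # Ruby calls this hook whenever a class inherits from SketchContainer.
    def inherited(klass)
      super
      SketchContainer.known_subclasses << klass
    end

    # Stubs: concrete containers are expected to override these.
    def generate; end
    def setup(request); end
  end

  attr_reader :session_id
  attr_accessor :needs_new_cookie

  def initialize(session_id)
    super()
    @session_id = session_id
    @needs_new_cookie = false
  end

  # Changing the ID flags that a fresh cookie must be sent to the client.
  def session_id=(new_id)
    @needs_new_cookie = (@session_id != new_id)
    @session_id = new_id
  end

  # Empties the stored data while keeping the container itself.
  def clear!
    clear
  end

  # Stubs for subclasses.
  def finalize(request); end
  def regenerate; end
end

# A concrete container fills in the stubs it needs.
class SketchCookieContainer < SketchContainer
  def regenerate
    self.session_id = SecureRandom.hex(16)
  end
end
```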
Some session containers also need to be stored server-side. This is accomplished by making use of the class SessionStoreContainer, which itself inherits from the class SessionContainer. Let’s take a look at the added-on and no longer stubbed methods of this class:
The first thing you should notice is that a new class-inheritable attribute store has been added. This bridges the container with its storage mechanism. There’s also a fingerprint, which is used to recognize the dirtiness of a yet-to-be-persisted container. Moving on, the class methods setup and generate have been filled in and are now ready to generate unique IDs and create store instances. A private class method retrieve looks for existing stored sessions and is used by setup to ensure uniqueness. It’s also capable of moving stored session data from one session store type to another. The other two previously stubbed methods, finalize and regenerate, are also fleshed out. The former is used to persist data (note the use of store) and the latter to refresh the session ID.
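The interplay between the store attribute, the fingerprint, and finalize can be sketched as below. All names are hypothetical and the logic is deliberately simplified: lookup is folded into setup rather than kept in a separate private retrieve method, and the fingerprint is assumed to be a hash of the marshaled contents. It is meant as an illustration of the idea, not a copy of Merb’s implementation.

```ruby
require 'securerandom'

# Hypothetical in-memory store exposing the three methods a store needs.
class SketchStore
  attr_reader :writes

  def initialize
    @data = {}
    @writes = 0
  end

  def retrieve_session(id)
    @data[id]
  end

  def store_session(id, data)
    @writes += 1
    @data[id] = data
  end

  def delete_session(id)
    @data.delete(id)
  end
end

class SketchStoreContainer < Hash
  class << self
    attr_accessor :store # bridges the container class to its storage

    def generate
      new(SecureRandom.hex(16))
    end

    # Reuse stored data when the ID is known, otherwise start fresh.
    def setup(session_id)
      data = session_id && store.retrieve_session(session_id)
      return generate unless data
      container = new(session_id)
      container.update(data)
      container.remember_fingerprint
      container
    end
  end

  attr_reader :session_id

  def initialize(session_id)
    super()
    @session_id = session_id
    remember_fingerprint
  end

  def remember_fingerprint
    @fingerprint = Marshal.dump(to_a).hash
  end

  # True when the contents differ from the last remembered fingerprint.
  def dirty?
    @fingerprint != Marshal.dump(to_a).hash
  end

  # Persist only when something changed.
  def finalize
    return unless dirty?
    self.class.store.store_session(session_id, to_h)
    remember_fingerprint
  end

  # Move the data to a fresh ID and drop the old entry.
  def regenerate
    self.class.store.delete_session(session_id)
    @session_id = SecureRandom.hex(16)
    self.class.store.store_session(session_id, to_h)
  end
end
```

Comparing fingerprints means a request that only reads the session never triggers a write to the store.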
There are multiple storage mechanisms that implement the previously described session stores. We’ll go through each of these, starting off not necessarily with the simplest, but with the only one to use SessionContainer directly, CookieSession.
Cookie sessions have been designed to keep all of the session data client-side. This eliminates the need for a storage mechanism server-side but comes at the cost of limited session size and possible security implications. Let’s open up the source for this session type:
Above we see that CookieSession inherits directly from SessionContainer and does not use any store mechanisms. It also sets a limitation on size, 4K. This limitation is necessary as it is the largest cookie size allowed by some browsers. The use of OpenSSL will become clear as we see the digest above used to sign the cookie session data.
The class methods generate and setup have been filled in above. The first simply generates a new cookie session, generating a session ID by way of UUID and encrypting it with the application’s configured session secret key. The setup method does nearly the same but passes in preexisting cookie values so that they may also be stored. The conversion of session data to a cookie format may be of interest:
Here we see that the data is serialized (by a method we have yet to encounter) and compounded with a message digest. If, however, the data stretches over the 4K max, an overflow error is raised. In practice, you should limit the size of session data when using cookie sessions mainly to storing things like related model object IDs.
Here we see the methods that serialize the session data for the cookie. Note that the session data itself is readable by any user on the client side who knows enough to figure out that it was encoded with Base64. Consequently, do not under any circumstances store sensitive data in a cookie session. What prevents users from altering their cookie sessions, or forging those of other users, is the message digest produced by generate_digest. This hash of the data incorporates our session secret, meaning that unless the secret is compromised, client-altered session data should not be accepted.
Above we witness how the unmarshaling of the data does in fact raise the error TamperedWithCookie if the incoming digest does not hash out correctly. Note that we can also prevent this error from being raised (perhaps for testing) by setting the configuration value ignore_tampered_cookies. Though there are other methods to explore within this class, we’ve encountered those most fundamental and characteristic of cookie sessions themselves and will leave the other methods, if of interest, to the reader.
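The serialize, sign, and verify cycle described over the last few paragraphs can be approximated with the standard library alone. The module, error classes, and separator below are hypothetical names for this sketch, and HMAC with SHA-256 is an assumption made here rather than a statement about which digest Merb uses. The data is marshaled, Base64-encoded, joined to a keyed digest, and rejected on the way back in if the digest does not match.

```ruby
require 'openssl'

class SketchTamperedCookie < StandardError; end
class SketchCookieOverflow < StandardError; end

module SketchCookieCodec
  MAX_BYTES = 4 * 1024 # the 4K cookie limit
  SEPARATOR = '--'     # never appears in Base64 output

  module_function

  # Keyed digest: only someone holding the secret can produce it.
  def digest(data, secret)
    OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('SHA256'), secret, data)
  end

  def encode(session_hash, secret)
    # 'm0' is Ruby's pack directive for Base64 without line breaks.
    data = [Marshal.dump(session_hash)].pack('m0')
    cookie = "#{data}#{SEPARATOR}#{digest(data, secret)}"
    raise SketchCookieOverflow, 'session data exceeds 4K' if cookie.bytesize > MAX_BYTES
    cookie
  end

  def decode(cookie, secret)
    data, sent_digest = cookie.to_s.split(SEPARATOR, 2)
    # A constant-time comparison would be preferable in production code.
    unless data && sent_digest && digest(data, secret) == sent_digest
      raise SketchTamperedCookie, 'digest mismatch'
    end
    # Only reached once the digest has been verified.
    Marshal.load(data.unpack1('m0'))
  end
end
```

Note that the payload is merely encoded, not encrypted: anyone holding the cookie can decode the first half and read it, which is why only identifiers and other non-sensitive values belong in it.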
Memory sessions are possibly the second-easiest way to set up sessions and probably your best bet when doing application development and testing. However, they do require that all Merb workers behave in a thread-safe manner when accessing session data. Even more important for production environments, they prohibit the distribution of Merb workers across several physical devices. In any case, let’s explore the source behind memory sessions, taking away some lessons on thread safety as well as the structure of the more complex session stores:
The fundamental difference between session store containers and regular session containers is that the former must connect to some storage mechanism. Above we see this as MemorySessionStore. Note that memory session stores are also time-limited and that we can set this limit with the configuration variable memory_session_ttl.
With the initialization of the class MemorySessionStore, we see a mutual exclusion lock that prevents simultaneous access of session data. Additionally, the time limitation comes into play and defaults to one hour. Let’s pull up the start_timer method to see how it will work:
A new thread with a simple loop, sleeping for the required time, repeatedly calls up the reaping of sessions. During the reap, sessions that have not been updated in the specified number of seconds are deleted. Interestingly, the garbage collector is forcibly invoked to assure the reduction of memory used.
The final three methods are essential to any store, allowing sessions to be individually manipulated or deleted. Though there isn’t much of anything special in them, focus on the use of Mutex#synchronize and the combined setting of a timestamp along with session data. If you need to create your own custom store, the methods shown above will serve well as an initial template, especially for thread safety.
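For readers who want a starting point for a custom store, the following is a simplified sketch of the pattern just described: a Mutex around every access, a timestamp recorded alongside each write, and a background loop that reaps stale entries. The class name is hypothetical, and the optional now arguments are an addition made here so the reaping logic can be exercised without waiting. It should not be read as Merb’s actual code.

```ruby
# Hypothetical, simplified memory store.
class SketchMemoryStore
  def initialize(ttl = 3600) # seconds; one hour by default
    @sessions = {}
    @timestamps = {}
    @mutex = Mutex.new
    @ttl = ttl
  end

  def retrieve_session(id)
    @mutex.synchronize { @sessions[id] }
  end

  # The timestamp and the data are always written together.
  def store_session(id, data, now = Time.now)
    @mutex.synchronize do
      @timestamps[id] = now
      @sessions[id] = data
    end
  end

  def delete_session(id)
    @mutex.synchronize do
      @timestamps.delete(id)
      @sessions.delete(id)
    end
  end

  # Remove every session not written within the last @ttl seconds and
  # return how many were removed.
  def reap_expired_sessions(now = Time.now)
    @mutex.synchronize do
      expired = @timestamps.select { |_, stamp| stamp + @ttl < now }.keys
      expired.each do |id|
        @timestamps.delete(id)
        @sessions.delete(id)
      end
      expired.size
    end
  end

  # Background loop: sleep for the TTL, reap, then nudge the garbage
  # collector so the freed sessions actually release memory.
  def start_timer
    Thread.new do
      loop do
        sleep @ttl
        reap_expired_sessions
        GC.start
      end
    end
  end
end
```

The reaper deletes entries inline rather than calling delete_session, because Ruby’s Mutex is not reentrant and locking it twice from the same thread raises an error.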
The code behind a memcache session is actually remarkably simple, because really, memcache itself does all the work. Opening up the source, we find the following:
Notably absent is a definition of store. This is because it is expected that you will define it from within a configuration (or preferably environment) file:
Pay special attention to the need to require the memcache gem as well as to have the code placed within an after_app_loads block.
DataMapper sessions are an alternative way to accomplish production-suitable sessions without the need for a memcache server. This may suit the needs of smaller production environments where putting up memcache is not a possibility. However, we still strongly recommend not using database sessions if for no other reason than session data is retrieved frequently, and memcache is designed to handle that better. Nonetheless, as an exploration of the versatility of Merb sessions, let’s take a look at how it’s done:
There’s nothing special so far, but by now you should be capable of making your own store and connecting it to the container.
In this way, a DataMapper resource has effectively been created, but special attention has been given to the name of the table and the repository to use. These two variables can be altered within our configuration files.
The three properties above form the basis of DataMapper session storage. Note that created_at has used its own Proc, meaning that no requirements on dm-timestamps are made. Additionally, we see a good use of the database-admin-frightening property type Object. It’s important that it’s not lazy-loaded since its primitive type is Text, and we will want to access the session data whenever we pull up the record.
Finally, the three methods we’ve come to expect appear again, this time using the DataMapper retrieval methods to do their work.
Within Merb, sessions are glued onto their associated requests. To accomplish this, the RequestMixin adds several methods to the Request class. Only the first is of importance to the typical application developer, so we’ll take a look at it, letting the reader dig deeper if desired.
Note that if we are using multiple session stores, we can specify the desired one to check inside. If it’s not found, Merb falls back to the first session store defined. A practical use of the method from the application developer’s perspective would be the use of session data from inside router deferral blocks.
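The lookup-with-fallback behavior can be pictured with a small stand-in. SketchRequest and its constructor are hypothetical and exist only for this example; the sketch assumes sessions are kept in a hash keyed by store name, with the first entry acting as the default.

```ruby
# Hypothetical stand-in for a request that knows about several stores.
class SketchRequest
  def initialize(sessions_by_store)
    # Insertion order matters: the first entry acts as the default store.
    @sessions = sessions_by_store
  end

  # Return the session for the named store, or the first one defined
  # when no name is given or the name is unknown.
  def session(store = nil)
    @sessions[store] || @sessions.values.first
  end
end

request = SketchRequest.new(
  'cookie'   => { 'theme' => 'dark' },
  'memcache' => { 'user_id' => 7 }
)
request.session('memcache') # the memcache-backed session
request.session             # falls back to the cookie session
```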
The controller also needs access to the session. This is likewise accomplished through a controller session mixin:
Here we see that the session method simply delegates to the method we saw defined within the request mixin. It is somewhat intriguing that the RequestMixin we saw a moment ago is actually embedded within the controller mixin. This kind of organization is typical among Merb request enhancements and minimizes redundancy. Anyway, as application developers, we have full access to the session data via this method and can set its values by treating it as a hash.
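From the application developer’s side, the delegation looks roughly like this. Both classes and the login, logout, and current_user_id methods are hypothetical stand-ins written for the example; the only point being illustrated is that the controller’s session method hands back the request’s session, which is then read and written like a hash.

```ruby
# Hypothetical request whose session is a plain hash.
class SketchSessionRequest
  def session(store = nil)
    @session ||= {}
  end
end

# Hypothetical controller that delegates to its request.
class SketchController
  attr_reader :request

  def initialize(request)
    @request = request
  end

  # Delegates straight to the request.
  def session(store = nil)
    request.session(store)
  end

  def login(user_id)
    session[:user_id] = user_id # set a value as with any hash
  end

  def logout
    session.delete(:user_id)
  end

  def current_user_id
    session[:user_id]
  end
end
```

Because the data lives on the request, any controller handling that request sees the same session.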
Merb handles sessions in such a way that application developers do not have to be aware of the storage mechanisms being used. Nonetheless, the limitations and security implications behind some session storage, mainly cookie storage, demand the developer’s awareness when they are used. Our exploration of the internals of both session containers and stores should make it easy to extend Merb sessions as needed for custom or composite storage mechanisms when the nearly standard production use of memcache will not do.