RUM Concerns and Trends

Watching end user activity raises its own concerns and pitfalls, from privacy to portability and beyond. We’ve already considered many of the privacy concerns in the section on WIA, so be sure to check there for details on data collection.

Cookie Encryption and Session Reassembly

Some websites store session attributes in encrypted cookies. Unfortunately, obfuscating personally identifiable information may make it hard to reassemble a user’s visit or to identify one user across several visits. Whenever the visitor changes the application state (for example, by adding something to a shopping cart) the entire encrypted cookie changes.

Your development team should separate the things you need to hide (such as an account number) from the things that you don’t (such as a session ID). Better yet, store session state on the servers rather than in cookies—it’s safer and makes the cookies smaller, improving performance. This is particularly true if your sessionization relies on the information in that cookie.
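
As a rough illustration of that separation, here is a minimal Python sketch that keeps sensitive state on the server and puts only an opaque session ID in the cookie. The in-memory store, the function names, and the cookie name are all assumptions made for the example, not part of any particular product.

```python
import secrets

# Hypothetical in-memory store; a real site would use a shared cache or database.
SESSION_STORE = {}

def create_session(account_number):
    """Keep sensitive state on the server and return only an opaque session ID."""
    session_id = secrets.token_hex(16)
    SESSION_STORE[session_id] = {"account_number": account_number, "cart": []}
    return session_id

def add_to_cart(session_id, item):
    """Change application state without changing the cookie value."""
    SESSION_STORE[session_id]["cart"].append(item)

def cookie_header(session_id):
    """Build a cookie that carries only the session ID (cookie name is illustrative)."""
    return f"session_id={session_id}; HttpOnly; Secure"
```

Because the cookie value never changes when the cart does, a RUM tool that sessionizes on it can follow the whole visit, and the account number never leaves the server.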

Privacy

RUM tools may extract content from the page to add business context to a visit record. While this is less risky than collecting an entire page for replay (as we do in some WIA tools), you still need to be careful about what you’re capturing.

When you implement your data collection strategy, you should ensure that someone with legal authority has reviewed it. In particular, pay attention to POST parameters, URI parameters, and cookies. You’ll need to decide on a basic approach to collection: either capture everything except what’s blocked, or block everything that’s not explicitly captured.

A permissive capture strategy might, for example, tell the RUM solution to blank out the POST parameter for “password.” Any parameter that isn’t explicitly blocked will be stored. Permissive capture means you may accidentally collect data you shouldn’t, but it also means that a transcript of the visit will contain everything the visitor submitted, making it easier to understand what went wrong during the visit.

On the other hand, a restrictive capture strategy will capture only what you tell it to. So you might, for example, collect the user’s account number, the checkout amount, and the number of items in a shopping cart. While this is the more secure approach (you won’t accidentally collect things you shouldn’t), it means you can’t go back and look for something else later on. Figure 10-11 shows an example of a restrictive capture configuration screen in a RUM tool—everything that isn’t explicitly captured has its value deleted from the visit record.


Figure 10-11. Configuring confidentiality policies in Coradiant TrueSight
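
The difference between the two strategies can be sketched in a few lines of Python. This is only an illustration of the idea: the function names, the parameter names, and the choice of an empty string as the blanked value are assumptions, and a real RUM tool would expose this as configuration rather than code.

```python
MASK = ""  # value stored in place of anything that must not be kept

def permissive_capture(params, blocked):
    """Keep everything; blank only the parameters named in `blocked`."""
    return {k: (MASK if k in blocked else v) for k, v in params.items()}

def restrictive_capture(params, allowed):
    """Blank everything; keep only the parameters named in `allowed`."""
    return {k: (v if k in allowed else MASK) for k, v in params.items()}

# Hypothetical POST parameters from a checkout form.
submitted = {
    "account": "A-1001",
    "password": "hunter2",
    "promo_code": "SPRING",
    "cart_items": "3",
}

permissive = permissive_capture(submitted, blocked={"password"})
restrictive = restrictive_capture(submitted, allowed={"account", "cart_items"})
```

In this sketch the permissive record still contains the promo code, which nobody thought to block, while the restrictive record has lost it for good.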


RIA Integration

We’ve looked at programmatic RUM using client-side JavaScript. More and more applications, however, are written for browser plug-ins (like Flash and Silverlight) or even browser/desktop clients (Adobe AIR and Sun’s JavaFX, for example).

The methods described here for sending messages back to a hosted RUM service work just as well for RIAs. The application developer has to create events within the application that are sent back to the service. Episodes is a good model for this because it’s easily extensible. As part of their RUM offerings, some solutions provide JavaScript tags or ActionScript libraries that can also capture multimedia data like startup time, rebuffer count, rebuffer ratio, and so on.
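
To make the idea of application-generated events concrete, the following Python sketch encodes one event and its measurements as a beacon URL. It is written in Python only for illustration; a real RIA would do the equivalent in its own runtime. The endpoint, the parameter names, and the metric names are all hypothetical and do not reflect the Episodes format or any vendor’s API.

```python
from urllib.parse import urlencode

BEACON_URL = "https://rum.example.com/beacon"  # hypothetical collection endpoint

def build_beacon(session_id, event_name, metrics):
    """Encode one application event and its measurements as a GET beacon URL."""
    params = {"sid": session_id, "event": event_name}
    params.update(metrics)
    return f"{BEACON_URL}?{urlencode(params)}"

# A media player reporting its startup time and rebuffer count.
url = build_beacon("abc123", "video_start", {"startup_ms": 840, "rebuffer_count": 2})
```

The application decides which events matter and when to fire them; the service only has to accept the request and parse the query string.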

Storage Issues

As we’ve noted, capturing user sessions generates a tremendous amount of information, particularly if those sessions include all of the content on the page itself. If you’re planning on running your own RUM, make sure your budget includes storage.

Many server-side RUM tools allow you to extract session logs so that they can be loaded into a business intelligence (BI) tool for further analysis (Figure 10-12).


Figure 10-12. Bulk data export in Coradiant’s TrueSight


With a hosted RUM service, it’s important to understand the granularity of the offering, specifically whether it can drill down to an individual page or object, as well as the length of time that the stored information is available. Some systems only store user session information for sessions that had problems or were excessively slow.

Exportability and Portability

RUM data must be portable. Whatever technology you deploy, you need to be sure you can take your data and move it around. Often, this will be in the form of a flat logfile (for searching) or a data warehouse (for segmentation and sharing with other departments).

With the advent of new tools for visualization and data exchange, you will often want to provide RUM data in real time and in other formats. For example, if you want to stream user events to a dashboard as structured data, you’ll want a data feed of some kind, such as the one shown in Figure 10-13.


Figure 10-13. Raw data of individual object requests from a streaming API


You may also want to overlay visitor information atop third-party visualization tools such as Google Earth, particularly if you’re trying to find a geographic pattern. For example, you may want to demonstrate that visitors who are prolific posters are in fact coming from a single region overseas and are polluting your community pages with blog spam, as is the case in Figure 10-14.


Figure 10-14. User visits showing performance and availability, visualized in Google Earth


These kinds of export and visualization are especially important for gaining executive sponsorship and buy-in, since they present a complex pattern intuitively. When selecting a RUM solution, be sure you have access to real-time and exported data feeds.

Data Warehousing

Since we’re on the topic of data warehousing, let’s look at some of the characteristics your RUM solution needs to have if it is to work well with other analytical tools.

  • It must support regular exports so that the BI tool can extract data from it and put it into the warehouse at regular intervals. The BI tool must also be able to “recover” data it missed because of an outage.

  • It must mark session, page, and object records with universally unique identifiers. In this way, the BI tool can tell which objects belong to which pages and which pages belong to which sessions. Without a way of understanding this relationship, the BI tool won’t be capable of drilling down from a visit to its pages and components.

  • If the data includes custom fields (such as “password” or “shopping cart value”), the exported data must include headers that allow the BI tool to import the data cleanly, even when you create new fields or remove old ones.
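
The identifier and header requirements above might look like this in practice. The sketch below is an assumption-laden example, not any tool’s export format: the field names, the CSV layout, and the helper functions are invented for illustration.

```python
import csv
import io
import uuid

def new_id():
    """Return a universally unique identifier as a string."""
    return str(uuid.uuid4())

def export_rows(rows, fieldnames):
    """Write records as CSV with a header row so the importer can map columns by name."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

session_id = new_id()
page_id = new_id()

# Each page points at its session; each object points at its page.
pages = [{"page_id": page_id, "session_id": session_id,
          "url": "/checkout", "cart_value": "59.90"}]
objects = [{"object_id": new_id(), "page_id": page_id, "url": "/img/logo.png"}]

pages_csv = export_rows(pages, ["page_id", "session_id", "url", "cart_value"])
objects_csv = export_rows(objects, ["object_id", "page_id", "url"])
```

Because every export starts with a header row, adding or dropping a custom field such as cart_value changes the header rather than silently shifting columns, and the shared page_id lets the BI tool join objects back to their page.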

We’ll look at consolidating many sources of monitoring data at the end of the book, in Chapter 17.

Network Topologies and the Opacity of the Load Balancer

A load balancer terminates the connection with clients and reestablishes its own, more efficient connection to each server. In doing so, it presents a single IP address to the Internet, even though each server has its own address. This means that the server’s identity is opaque to monitoring tools that are deployed in front of the load balancer, including inline monitoring devices and client-side monitoring.

To overcome this issue, some load balancers can insert a server identifier into the HTTP header that the RUM tool can read. This allows you to segment traffic by server even though the server’s IP address is hidden. We strongly suggest this approach, as it will allow you to narrow a problem down to a specific server much more quickly. You can use a similar technique to have the application server insert a server identifier, further enhancing your ability to troubleshoot problems.
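
Once the identifier is in the response, segmenting by server is straightforward. The following Python sketch averages latency per server from a list of observed responses; the header name X-Server-ID and the shape of the response records are assumptions for the example, since the actual header depends on how your load balancer is configured.

```python
from collections import defaultdict
from statistics import mean

SERVER_HEADER = "X-Server-ID"  # hypothetical header name set by the load balancer

def latency_by_server(responses):
    """Group observed response times by the server identifier in the HTTP header."""
    grouped = defaultdict(list)
    for response in responses:
        server = response["headers"].get(SERVER_HEADER, "unknown")
        grouped[server].append(response["latency_ms"])
    return {server: mean(values) for server, values in grouped.items()}

# Hypothetical observations captured in front of the load balancer.
observed = [
    {"headers": {"X-Server-ID": "web-01"}, "latency_ms": 100},
    {"headers": {"X-Server-ID": "web-02"}, "latency_ms": 900},
    {"headers": {"X-Server-ID": "web-01"}, "latency_ms": 300},
]

averages = latency_by_server(observed)
```

Even though every response came from the same public IP address, the slow server stands out as soon as the traffic is split by identifier.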
