Implementing your own monitoring points

As soon as you experience the first performance bottleneck in your application, there are two steps to carry out: analyzing the problem and fixing it. If your application design is not completely flawed, fixing usually takes a fraction of the time needed for analyzing. The analysis is split into two parts: finding the occurrence of the problem on your platform, and reproducing the problem reliably in order to fix it. Most likely the problem will occur in production, because real-life problems will never be completely uncovered by lab testing.

After you have fixed the problem, the operations department, or even you yourself, will hopefully raise the rhetorical question of how to make sure this does not happen again. The answer is simple: monitoring. You should always be able to get reliable statistics from the core of the application instead of relying on external measuring points such as database query times or HTTP request and response times.

There should be data about hits and misses of your custom caches. There should be very application-specific data, such as the number of API login calls, the count and top 10 of your daily search terms, or your daily revenue. Anything that remotely helps to track down problems should be easy to monitor.

The Play framework ships with JAMon, which is short for Java Application Monitor and provides exactly the infrastructure needed to integrate such monitoring points as easily as possible.

The source code of the example is available at examples/chapter7/monitoring.

Getting ready

You should be familiar with the output of play status before starting this recipe. Play already provides the minimum, maximum, and average run times, as well as the hit count, for all of your controllers, error pages, jobs, and even all of your custom tags. An already existing bottleneck is not required, but it might help with your concrete problem.

The example supplied with this recipe is a query to another system, which should help us make sure that there are no performance issues when accessing an external API.

How to do it...

Imagine the following job, which fetches remote data, transforms it from JSON into an object, and stores it in the cache. Now the task is to find out whether it is the complex JSON parsing or the retrieval of the content from the remote system that takes so long. This job shows how to do it:

@Every("5s")
public class RemoteApiJob extends Job {

   public void doJob() {
      Logger.info("Running job");
      Monitor monitor = MonitorFactory.start("RemoteApiCall");
      WSRequest req = WS.url("http://some.URL");
      HttpResponse resp = req.get();
      monitor.stop();
      
      // some other stuff like parsing JSON/XML into entities
      // ...
   }
}

This job starts and stops a monitor; once started, the monitor automatically keeps track of the hit count as well as the minimum, maximum, and average execution times. If you only want to count cache hits or misses, this timing functionality is not needed and you can go for the following in your code:

String foo = Cache.get("anyKey", String.class);

// count the hit or miss under its own monitor, using "cnt" as the unit
String monitorName = foo == null ? "cacheMiss" : "cacheHit";
MonitorFactory.add(monitorName, "cnt", 1);

Cache.set("anyKey", "anyContent", "5mn");

Instead of string names for the monitors, you could also use the MonKey and MonKeyItem interfaces, if you are used to them from other applications using JAMon. However, using simple strings such as cacheMiss and cacheHit to identify your monitors should be sufficient in most cases.
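
As a rough sketch of that variant: a MonKey bundles the monitor label and its unit into one object. The MonKeyImp class and the MonitorFactory.start(MonKey) overload shown below are assumptions about the JAMon API, so check the JAMon Javadoc for the exact constructors before relying on them:

// classes live in the com.jamonapi package; constructor arguments are an assumption
MonKey key = new MonKeyImp("RemoteApiCall", "ms.");
Monitor monitor = MonitorFactory.start(key);
// ... the code to be measured ...
monitor.stop();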

How it works...

As you can see from the rather short examples, integrating new monitors takes only a few lines of code. As already mentioned, there will mostly be two different kinds of monitors in your code. There are simple monitors, which are counters, such as cache hits and misses or the number of currently open orders in your e-commerce shop system. The other kind is the monitor that needs a whole data set to provide real value to the reader. If your system suffers from a peak for one or two minutes a day and you only take request and response times from that moment, you will not have a statistically sound average to help you find bottlenecks. But when you collect values over a complete time span, you can spot a peak load in a certain period and then analyze that specific time window, instead of not being able to pinpoint the problem at all. This also allows you to measure performance degradation over time in a much better way.

Both monitor types differ slightly in implementation. One can be fed with monitor.add(), or you can use the MonitorFactory directly as done in the example above. The other one needs a start() and a stop() call in order to calculate the time differences correctly.
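
If you prefer to keep a reference to the counter instead of going through the factory on every call, the same counter can be written as in the following sketch; the cacheHit label and the "cnt" unit are just the examples from above:

Monitor hitCounter = MonitorFactory.getMonitor("cacheHit", "cnt");
hitCounter.add(1);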

One of the bigger disadvantages of this approach remains: whenever you measure your code this way, you clutter it with monitoring calls. You might be able to create a nicer solution via AOP and/or bytecode enhancement, but that would be a lot more work for not too much gain. Think about whether this is really needed in your case.

You can get further statistics via the Monitor class, for example the standard deviation, the last values for minimum, maximum, and average, or the date of the last access.
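
The following sketch reads such values back for the RemoteApiCall monitor from the job above; it assumes that time monitors created via MonitorFactory.start() are registered under the "ms." unit:

Monitor monitor = MonitorFactory.getMonitor("RemoteApiCall", "ms.");
Logger.info("hits=%s avg=%s max=%s stddev=%s lastAccess=%s",
   monitor.getHits(), monitor.getAvg(), monitor.getMax(),
   monitor.getStdDev(), monitor.getLastAccess());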

There's more...

Monitoring is often a crucially undervalued aspect of application development, as it is never a functional requirement. Make sure you know your application well enough to spot upcoming operational problems as early as possible.

More about JAMon

Although JAMon is currently not in active development (the last release is from 2007), it serves its purpose well. You can find the complete documentation of JAMon at http://jamonapi.sourceforge.net/.

See also

After adding new monitors, the next logical step is of course to actually monitor this data. Read on to find out how to do that with Icinga and Munin.
