Chapter 5. Hibernate Cache

In this chapter, we discuss Hibernate caching. One of the advantages of Hibernate is its ability to cache entities and so minimize trips to the database. To use this feature correctly, you must be aware of its pitfalls. For example, if you cache entities that can be updated by another application, your cached data may become stale. This chapter covers the cache levels, how to enable caching, caching strategies, and more:

  • Cache structure:
    • Cache scope
    • First-level cache
    • Second-level cache
    • Query cache
  • Caching benefits and pitfalls
  • Caching strategies:
    • Read only
    • Non-strict read write
    • Read write
    • Transactional
    • Object identity
  • Managing cache:
    • Remove cached entities
    • Cache modes
  • Cache metrics

Cache structure

You may not have realized it, but you have already been using the first-level cache, which is managed by the Hibernate session. In addition to the first-level cache, Hibernate lets you set up a second-level cache. It can also cache the results of frequently executed queries; this is known as the query cache. Let's discuss each of these here.

Cache scope

Before we discuss cache levels, we need to understand the scope of a cached entity:

  • A cached entity may only span the life of a transaction; this means the cached entity is only available for the thread that owns the persistence unit of work.
  • The next wider scope is the process, for example, your web application. An entity may be cached as long as the web application is running. In this case, the cached entity may be shared by multiple threads, perhaps even concurrently.
  • Finally, if you are in a clustered environment, entities can be cached on every node of the cluster. Clearly, when an entity changes, nodes have to be synchronized to ensure that the entities' values are the same across all the nodes.

Another important point to keep in mind is entity identity. The discussion on object identity and equality in Chapter 1, Entity and Session, is also relevant here. When you access a cached entity in a transaction scope cache, you get back the same entity by reference every time, which means that two references to the same entity are equal from both the Java and the persistence perspective. This may not always be the case in the process scope cache, and it's never the case in the cluster scope cache. We will return to this when we discuss caching strategies later in this chapter.

First-level cache

When an entity is placed in the persistence context, it is cached until the session is closed or until the entity is evicted from the session. This is the first-level cache whose scope is the unit of work. This is also known as the transaction-level cache. Clearly, this does not apply to stateless sessions, which were discussed in Chapter 1, Entity and Session, because, as you may recall, a stateless session does not have a persistence context.

An important feature of the Hibernate first-level cache is automated dirty checking. Hibernate keeps track of the state each entity had when it was fetched from the database and compares it with the entity's current state. If the state of the entity doesn't change during a persistence unit of work, Hibernate will not attempt to synchronize this entity with the database when the session is flushed. Even if you modify an entity and then change it back to its original state, Hibernate is smart enough to determine that it doesn't need to be synchronized with the database because the original value was restored.
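
The idea behind dirty checking can be sketched in plain Java. This is only a toy model under stated assumptions, not Hibernate's implementation: the Person class, the ToyPersistenceContext class, and its manage and isDirty methods are hypothetical names invented for this illustration.

```java
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.Map;

// Toy model of snapshot-based dirty checking; not Hibernate's actual implementation.
class DirtyCheckSketch {

    // Hypothetical entity used only for this illustration.
    static class Person {
        final long id;
        String firstname;
        String lastname;

        Person(long id, String firstname, String lastname) {
            this.id = id;
            this.firstname = firstname;
            this.lastname = lastname;
        }

        // The persistent state as an array of values.
        Object[] state() {
            return new Object[] { firstname, lastname };
        }
    }

    // Hypothetical stand-in for a persistence context.
    static class ToyPersistenceContext {
        private final Map<Person, Object[]> snapshots = new IdentityHashMap<>();

        // Remember the state the entity had when it was loaded.
        void manage(Person person) {
            snapshots.put(person, person.state());
        }

        // An entity is dirty only if its current state differs from the snapshot.
        boolean isDirty(Person person) {
            return !Arrays.equals(snapshots.get(person), person.state());
        }
    }

    public static void main(String[] args) {
        ToyPersistenceContext context = new ToyPersistenceContext();
        Person person = new Person(1L, "John", "Smith");
        context.manage(person);

        person.firstname = "Jack";
        System.out.println(context.isDirty(person)); // true: differs from the snapshot

        person.firstname = "John";
        System.out.println(context.isDirty(person)); // false: original value restored
    }
}
```

Because the comparison is made against the loaded snapshot rather than against a "modified" flag, changing a value and then restoring it leaves the entity clean.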

Furthermore, the Hibernate first-level cache offers repeatable reads within the unit of work. If an entity already exists in the persistence context and the application tries to fetch this entity by ID, Hibernate simply fetches this from the first-level cache and no database queries are executed. You can access all entities in the first-level cache by ID without causing any database hits if they are present.

To demonstrate this, consider the following code block:

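// Load every Person into the persistence context (one database query)
Query query = session.createQuery("from Person");
List<Person> persons = query.list();

for (Person person : persons) {
  // Looked up by ID in the first-level cache; no additional database query
  Person newPerson = (Person) session.get(Person.class, person.getId());
}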

First, we fetch all the Person entities into the persistence context. Then, inside the for loop, we fetch each Person entity by its ID. Because we fetch by ID, Hibernate first checks whether the entity already exists in the first-level cache. It does, so Hibernate returns the cached entity without making a trip to the database. It is important to realize that the fetch (query) and the lookup via session.get occur within the same unit of work.
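
The lookup behavior described above can be modeled with a simple map keyed by ID. The following is a minimal sketch, assuming a hypothetical ToySession class and a loader function that stands in for the database; it does not reflect how Hibernate's session is actually implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Toy model of a session-scoped identity map; not Hibernate's actual implementation.
class FirstLevelCacheSketch {

    // Hypothetical entity used only for this illustration.
    static class Person {
        final long id;

        Person(long id) {
            this.id = id;
        }
    }

    // Hypothetical stand-in for a session: one cache per unit of work.
    static class ToySession<T> {
        private final Map<Long, T> cache = new HashMap<>();
        private final LongFunction<T> databaseLoader;
        int databaseHits = 0;

        ToySession(LongFunction<T> databaseLoader) {
            this.databaseLoader = databaseLoader;
        }

        // Return the cached instance if present; otherwise load it once and cache it.
        T get(long id) {
            T entity = cache.get(id);
            if (entity == null) {
                databaseHits++;
                entity = databaseLoader.apply(id);
                cache.put(id, entity);
            }
            return entity;
        }
    }

    public static void main(String[] args) {
        ToySession<Person> session = new ToySession<>(Person::new);
        Person first = session.get(1L);
        Person second = session.get(1L);

        System.out.println(first == second);      // true: same reference
        System.out.println(session.databaseHits); // 1: the second get was a cache hit
    }
}
```

Note that a new ToySession starts with an empty map, which mirrors the point that the cached lookup only works within the same unit of work.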

The persistence context also checks the result set of a query execution against the cached entities. This doesn't mean that the query won't be executed; it still is. However, before Hibernate processes the rest of a row in the query result, it uses the entity ID to check whether the entity already exists in the cache. If it does, Hibernate doesn't process the rest of the result for this entity. The reason is that pulling an entity into the persistence context and keeping it there requires some work: Hibernate doesn't just maintain the entity, it also maintains a lot of metadata associated with it. So, skipping this work is a real performance gain.
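
As a rough illustration of this row handling, the sketch below models a persistence context that receives raw rows and reuses an already managed instance instead of building a new one. All names (ToyPersistenceContext, resolve, hydrations) are hypothetical, and the row layout of ID followed by firstname is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of matching query rows against managed entities; not Hibernate's actual code.
class RowResolutionSketch {

    // Hypothetical entity used only for this illustration.
    static class Person {
        final long id;
        final String firstname;

        Person(long id, String firstname) {
            this.id = id;
            this.firstname = firstname;
        }
    }

    // Hypothetical stand-in for a persistence context.
    static class ToyPersistenceContext {
        private final Map<Long, Person> managed = new HashMap<>();
        int hydrations = 0;

        // Each row is {id, firstname}. The query has already run; only row processing is modeled.
        List<Person> resolve(List<Object[]> rows) {
            List<Person> result = new ArrayList<>();
            for (Object[] row : rows) {
                Long id = (Long) row[0];
                Person person = managed.get(id);
                if (person == null) {
                    // Not managed yet: build the entity from the remaining columns.
                    hydrations++;
                    person = new Person(id, (String) row[1]);
                    managed.put(id, person);
                }
                result.add(person);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        ToyPersistenceContext context = new ToyPersistenceContext();

        List<Object[]> firstRun = new ArrayList<>();
        firstRun.add(new Object[] { 1L, "John" });
        Person first = context.resolve(firstRun).get(0);

        // The same ID arrives again, this time with a different column value.
        List<Object[]> secondRun = new ArrayList<>();
        secondRun.add(new Object[] { 1L, "Johnny" });
        Person second = context.resolve(secondRun).get(0);

        System.out.println(first == second);    // true: the managed instance is reused
        System.out.println(second.firstname);   // John: the rest of the row was skipped
        System.out.println(context.hydrations); // 1
    }
}
```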

Second-level cache

In addition to the first-level cache, Hibernate provides an API for plugging in a second-level cache. Unlike the first-level cache, the second-level cache only stores entities that are marked as cacheable. In earlier versions of Hibernate, the second-level cache implementation was Ehcache; Hibernate has since moved to the OSGi model, and anyone can implement a second-level cache by providing an implementation of org.hibernate.cache.spi.RegionFactory.

You can set up multiple cache regions; a second-level cache typically has several. Besides the region that stores root entities, there are regions for collections, query results, and update timestamps. The collection and query regions typically store only the IDs of the cached entities. Hibernate uses the update timestamps region internally to determine whether an entry stored in the cache is stale.
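
To make the role of the update timestamps region more concrete, here is a minimal sketch of a timestamp-based staleness check. It is a simplification under stated assumptions: the ToyTimestampRegion class and its methods are hypothetical, timestamps are plain logical clock values, and Hibernate's real check is more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a timestamp-based staleness check; not Hibernate's actual implementation.
class UpdateTimestampSketch {

    // Hypothetical stand-in for the update timestamps region.
    static class ToyTimestampRegion {
        private final Map<String, Long> lastUpdateByTable = new HashMap<>();

        // Record that a table was modified at the given (logical) time.
        void recordUpdate(String table, long timestamp) {
            lastUpdateByTable.put(table, timestamp);
        }

        // A cached result is stale if the table changed at or after the time it was cached.
        boolean isStale(String table, long cachedAt) {
            Long lastUpdate = lastUpdateByTable.get(table);
            return lastUpdate != null && lastUpdate >= cachedAt;
        }
    }

    public static void main(String[] args) {
        ToyTimestampRegion region = new ToyTimestampRegion();
        long cachedAt = 10L; // logical time at which a result was cached

        System.out.println(region.isStale("PERSON", cachedAt)); // false: no update recorded

        region.recordUpdate("PERSON", 20L);
        System.out.println(region.isStale("PERSON", cachedAt)); // true: updated after caching
    }
}
```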

Typically, a second-level cache holds only the values of each entity rather than a serialized copy of the entity object. (This is not Java serialization. It may simply be a key-value map of an entity's state; this depends on the provider's implementation.) Additionally, each value set may hold identifying references to the associated entities, which may or may not be present in the cache region. The internals of a cache implementation are vendor-specific, so they will not be discussed here. However, in this section, we will show you how to set up Ehcache as a second-level cache to work with Hibernate.

Hibernate provides support for Ehcache as the second-level cache provider, along with other providers. In order to set up Ehcache for Hibernate, you will need the following:

  • Cache provider interface
  • Ehcache implementation
  • Cache configuration
  • Cacheable entity configuration

These are discussed in the following sections.

Cache provider interface

Hibernate's interface for Ehcache is implemented in another module, called hibernate-ehcache, which is available as a jar, and it needs to be in your runtime class path. If you use Maven, you can simply add this as a dependency:

<dependency>
  <groupId>org.hibernate</groupId>
  <artifactId>hibernate-ehcache</artifactId>
  <version>${hibernateVersion}</version>
</dependency>

This module includes an implementation of RegionFactory, called EhCacheRegionFactory, which we'll use later in the configuration.

Ehcache implementation

The implementation of Ehcache itself (the part that performs caching functions, such as storage, access, and concurrency) is another module, which you also have to import into your project and runtime. A simple Maven dependency should take care of this:

<dependency>
  <groupId>net.sf.ehcache</groupId>
  <artifactId>ehcache</artifactId>
  <version>${ehcacheVersion}</version>
</dependency>

Cache configuration

To configure a second-level cache for Hibernate using Ehcache, you will need to enable this in Hibernate and specify a cache region factory class by adding the following lines to your hibernate.cfg.xml Hibernate configuration file:

<property name="hibernate.cache.use_second_level_cache">
  true
</property>

<property name="hibernate.cache.region.factory_class">
  org.hibernate.cache.ehcache.EhCacheRegionFactory
</property>

Ehcache already comes with a default cache. However, you can further customize this. For example, you can define multiple cache regions, set a maximum number of entities in regions, and provide an expiration time for cached entities.

Consider the following example:

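// javax.persistence is assumed here, matching Hibernate versions that ship hibernate-ehcache
import java.util.Date;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

@Entity
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE, region="personCache")
public class Person {
  @Id
  @GeneratedValue
  private long id;
  private String firstname;
  private String lastname;
  private String ssn;
  private Date birthdate;
  // getters and setters
}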

This entity is now declared as cacheable and it uses a region called personCache. (We will discuss cache strategy later in this chapter.) You can further configure this region in the ehcache.xml file:

<cache name="personCache"
  maxElementsInMemory="1000"
  eternal="false"
  timeToIdleSeconds="300"
  timeToLiveSeconds="600"
  overflowToDisk="true"
/>

There are other important topics when working with cache, especially a second-level cache, such as locking and transactions. We will discuss these later in this chapter. For now, let's explore query cache.

Query cache

In addition to the second-level cache, you can also enable the query cache. This is especially useful for queries that are executed frequently while your application runs and whose results rarely change. For example, in a forum application, you can cache the result of the query that fetches article headlines. Most of the time, users are just reading the forum, and if the list of headlines is in the cache, you don't need to query the database for it.

Of course, when a new article is posted or an existing article is updated, the cached query result needs to expire or be refreshed. Luckily, Hibernate does this for us automatically. When the persistence layer commits changes to the database in the form of an INSERT, UPDATE, or DELETE, Hibernate checks whether the affected entity is cached or could be part of a cached query result. If so, Hibernate expires the cached data.

The query cache doesn't store entities or entity values. It only stores the ID for the entities and the values of non-entities that would be returned in the result set of the query. It uses the second-level cache to store the values of entities.

However, if the entities returned by the query are not marked as cacheable, Hibernate is forced to make additional trips to the database to fetch them. You should always examine the root and associated entities returned from a query before deciding to cache the query.
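
The split between the two caches can be sketched as follows. This is a toy model, not Hibernate's implementation: the ToyCaches class, its two maps, and the "loaded-" placeholder value are hypothetical and exist only to show why an entity that is missing from the second-level cache costs an extra database trip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a query cache that stores only IDs; not Hibernate's actual implementation.
class QueryCacheSketch {

    // Hypothetical stand-in for the query cache plus the second-level cache.
    static class ToyCaches {
        final Map<String, List<Long>> queryCache = new HashMap<>(); // query text -> entity IDs
        final Map<Long, String> secondLevelCache = new HashMap<>(); // entity ID -> cached value
        int entityLoadsFromDatabase = 0;

        // IDs come from the query cache; values come from the second-level cache when present.
        List<String> resultFor(String query) {
            List<String> result = new ArrayList<>();
            for (long id : queryCache.getOrDefault(query, Collections.emptyList())) {
                String value = secondLevelCache.get(id);
                if (value == null) {
                    // Entity is not cached: one extra database trip for this ID.
                    entityLoadsFromDatabase++;
                    value = "loaded-" + id; // placeholder for a per-entity fetch
                }
                result.add(value);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        ToyCaches caches = new ToyCaches();
        String query = "from Person where firstname like 'J%'";
        caches.queryCache.put(query, Arrays.asList(1L, 2L));
        caches.secondLevelCache.put(1L, "John"); // only entity 1 is in the second-level cache

        System.out.println(caches.resultFor(query));        // [John, loaded-2]
        System.out.println(caches.entityLoadsFromDatabase); // 1
    }
}
```

In this model, a query cache hit on a list of N uncached entities would cost N individual loads, which is why caching a query over non-cacheable entities can be slower than not caching it at all.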

Query cache is not enabled by default. You'll have to enable it in your configuration and also when you execute the query. As query cache doesn't actually store any entities or values, it won't work by itself. So, you have to enable the second-level cache as well.

Consider the following example:

Query query = 
session.createQuery("from Person where firstname like 'J%'")
  .setCacheable(true);

Let's assume that you have enabled query cache in your Hibernate configuration file, as shown here:

<property name="hibernate.cache.use_query_cache">true</property>

When the query executes for the first time, Hibernate fetches the entities from the database, records the query result in the query cache, and saves the entities' values in the second-level cache. Any future executions of this same query will consult the query cache first. Furthermore, if you fetch any Person by ID and it happens to be in the second-level cache, either as a result of this or another query, it will be fetched from the second-level cache instead of the database.

However, if you change the cached data through the persistence context after the query result is cached, Hibernate will expire the cached result, and any subsequent execution of the query will be forced to make a database trip.

Consider the preceding code again. If the cached query is executed again, it will not cause a database hit. However, if you delete one of the cached entities, the cache will expire:

Query query = 
session1.createQuery("from Person where firstname like 'J%'")
    .setCacheable(true);

…

session2.delete(somePerson);

…

// the cached result has expired, so executing this query hits the database
Query refreshedQuery = 
session3.createQuery("from Person where firstname like 'J%'")
    .setCacheable(true);