Chapter 10. Managing data

In this chapter

  • The lifecycle and states of objects
  • Working with the Java Persistence API
  • Working with detached state

You now understand how Hibernate and ORM solve the static aspects of the object/relational mismatch. With what you know so far, you can create a mapping between Java classes and an SQL schema, solving the structural mismatch problem. For a reminder of the problems you’re solving, see section 1.2.

An efficient application solution requires something more: you must investigate strategies for runtime data management. These strategies are crucial to the performance and correct behavior of your applications.

In this chapter, we discuss the life cycle of entity instances—how an instance becomes persistent, and how it stops being considered persistent—and the method calls and management operations that trigger these transitions. The JPA EntityManager is your primary interface for accessing data.

Before we look at the API, let’s start with entity instances, their life cycle, and the events that trigger a change of state. Although some of the material may be formal, a solid understanding of the persistence life cycle is essential.

Major new features in JPA 2

  • You can get a vendor-specific variation of the persistence manager API with EntityManager#unwrap(): for example, the org.hibernate.Session API. Use EntityManagerFactory#unwrap() to obtain the org.hibernate.SessionFactory.
  • The new detach() operation provides fine-grained management of the persistence context, evicting individual entity instances.
  • From an existing EntityManager, you can obtain the EntityManagerFactory used to create the persistence context with getEntityManagerFactory().
  • The new static Persistence(Unit)Util helper methods determine whether an entity instance (or one of its properties) was fully loaded or is an uninitialized reference (Hibernate proxy or unloaded collection wrapper).

10.1. The persistence life cycle

Because JPA is a transparent persistence mechanism (classes are unaware of their own persistence capability), it’s possible to write application logic that’s unaware whether the data it operates on represents persistent state or temporary state that exists only in memory. The application shouldn’t necessarily need to care that an instance is persistent when invoking its methods. You can, for example, invoke the Item#calculateTotalPrice() business method without having to consider persistence at all (for example, in a unit test).

Any application with persistent state must interact with the persistence service whenever it needs to propagate state held in memory to the database (or vice versa). In other words, you have to call the Java Persistence interfaces to store and load data.

When interacting with the persistence mechanism that way, the application must concern itself with the state and life cycle of an entity instance with respect to persistence. We refer to this as the persistence life cycle: the states an entity instance goes through during its life. We also use the term unit of work: a set of (possibly) state-changing operations considered one (usually atomic) group. Another piece of the puzzle is the persistence context provided by the persistence service. Think of the persistence context as a service that remembers all the modifications and state changes you made to data in a particular unit of work (this is somewhat simplified, but it’s a good starting point).

We now dissect all these terms: entity states, persistence contexts, and managed scope. You’re probably more accustomed to thinking about what SQL statements you have to manage to get stuff in and out of the database; but one of the key factors of your success with Java Persistence is your understanding of state management, so stick with us through this section.

10.1.1. Entity instance states

Different ORM solutions use different terminology and define different states and state transitions for the persistence life cycle. Moreover, the states used internally may be different from those exposed to the client application. JPA defines four states, hiding the complexity of Hibernate’s internal implementation from the client code. Figure 10.1 shows these states and their transitions.

Figure 10.1. Entity instance states and their transitions

The state chart also includes the method calls to the EntityManager (and Query) API that trigger transitions. We discuss this chart in this chapter; refer to it whenever you need an overview. Let’s explore the states and transitions in more detail.

Transient state

Instances created with the new Java operator are transient, which means their state is lost and garbage-collected as soon as they’re no longer referenced. For example, new Item() creates a transient instance of the Item class, just like new Long() and new BigDecimal(). Hibernate doesn’t provide any rollback functionality for transient instances; if you modify the price of a transient Item, you can’t automatically undo the change.

For an entity instance to transition from transient to persistent state (that is, to become managed), you must either call EntityManager#persist() or create a reference to it from an already-persistent instance, with cascading of state enabled for that mapped association.

Persistent state

A persistent entity instance has a representation in the database. It’s stored in the database—or it will be stored when the unit of work completes. It’s an instance with a database identity, as defined in section 4.2; its database identifier is set to the primary key value of the database representation.

The application may have created instances and then made them persistent by calling EntityManager#persist(). There may be instances that became persistent when the application created a reference to the object from another persistent instance that the JPA provider already manages. A persistent entity instance may be an instance retrieved from the database by execution of a query, by an identifier lookup, or by navigating the object graph starting from another persistent instance.

Persistent instances are always associated with a persistence context. You see more about this in a moment.

Removed state

You can delete a persistent entity instance from the database in several ways. For example, you can remove it with EntityManager#remove(). It may also become available for deletion if you remove a reference to it from a mapped collection with orphan removal enabled.

An entity instance is then in the removed state: the provider will delete it at the end of a unit of work. You should discard any references you may hold to it in the application after you finish working with it—for example, after you’ve rendered the removal-confirmation screen your users see.

Detached state

To understand detached entity instances, consider loading an instance. You call EntityManager#find() to retrieve an entity instance by its (known) identifier. Then you end your unit of work and close the persistence context. The application still has a handle: a reference to the instance you loaded. It’s now in the detached state, and the data is becoming stale. You could discard the reference and let the garbage collector reclaim the memory. Or, you could continue working with the data in the detached state and later call the merge() method to save your modifications in a new unit of work. We’ll discuss detachment and merging again later in this chapter, in a dedicated section.
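
The four states and the transitions described so far can be summarized as a small state machine. The following is only a toy model in plain Java, not a JPA type: the EntityState enum and its method names are made up for illustration, and the sketch rejects every transition this section doesn’t describe, which is stricter than what a real provider does.

```java
/** Toy model of the four entity states described above; not a JPA type. */
enum EntityState {
    TRANSIENT, PERSISTENT, REMOVED, DETACHED;

    /** Models EntityManager#persist(): transient and removed instances become persistent. */
    EntityState persist() {
        switch (this) {
            case TRANSIENT:
            case REMOVED:     // persisting a removed instance cancels the deletion
            case PERSISTENT:  // already managed, nothing changes
                return PERSISTENT;
            default:
                throw new IllegalStateException("persist() is not modeled for state " + this);
        }
    }

    /** Models EntityManager#remove(): a persistent instance is scheduled for deletion. */
    EntityState remove() {
        if (this == PERSISTENT) {
            return REMOVED;
        }
        throw new IllegalStateException("remove() is not modeled for state " + this);
    }

    /** Models closing the persistence context or calling detach(). */
    EntityState detach() {
        if (this == PERSISTENT) {
            return DETACHED;
        }
        throw new IllegalStateException("detach() is not modeled for state " + this);
    }

    /** Models merge(): detached state is brought back under management (simplified). */
    EntityState merge() {
        if (this == DETACHED) {
            return PERSISTENT;
        }
        throw new IllegalStateException("merge() is not modeled for state " + this);
    }
}
```

For example, EntityState.TRANSIENT.persist().detach().merge() ends in PERSISTENT again. A real merge() returns a managed copy rather than changing the state of the detached instance itself; the model glosses over that detail.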

You should now have a basic understanding of entity instance states and their transitions. Our next topic is the persistence context: an essential service of any Java Persistence provider.

10.1.2. The persistence context

In a Java Persistence application, an EntityManager has a persistence context. You create a persistence context when you call EntityManagerFactory#createEntityManager(). The context is closed when you call EntityManager#close(). In JPA terminology, this is an application-managed persistence context; your application defines the scope of the persistence context, demarcating the unit of work.

The persistence context monitors and manages all entities in persistent state. The persistence context is the centerpiece of much of the functionality of a JPA provider.

The persistence context allows the persistence engine to perform automatic dirty checking, detecting which entity instances the application modified. The provider then synchronizes with the database the state of instances monitored by a persistence context, either automatically or on demand. Typically, when a unit of work completes, the provider propagates state held in memory to the database through the execution of SQL INSERT, UPDATE, and DELETE statements (all part of the Data Manipulation Language [DML]). This flushing procedure may also occur at other times. For example, Hibernate may synchronize with the database before execution of a query. This ensures that queries are aware of changes made earlier during the unit of work.

The persistence context acts as a first-level cache; it remembers all entity instances you’ve handled in a particular unit of work. For example, if you ask Hibernate to load an entity instance using a primary key value (a lookup by identifier), Hibernate can first check the current unit of work in the persistence context. If Hibernate finds the entity instance in the persistence context, no database hit occurs—this is a repeatable read for an application. Consecutive em.find(Item.class, ITEM_ID) calls with the same persistence context will yield the same result.

This cache also affects results of arbitrary queries, executed for example with the javax.persistence.Query API. Hibernate reads the SQL result set of a query and transforms it into entity instances. This process first tries to resolve every entity instance in the persistence context by identifier lookup. Only if an instance with the same identifier value can’t be found in the current persistence context does Hibernate read the rest of the data from the result-set row. Hibernate ignores any potentially newer data in the result set, due to read-committed transaction isolation at the database level, if the entity instance is already present in the persistence context.
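
To make this resolution step concrete, here is a simplified model in plain Java. It isn’t Hibernate’s implementation: QueryResultResolver, Row, and Item are hypothetical names, and a row carries only an identifier and a name. The point is the rule just described: if the context already holds an instance for an identifier, that instance is returned and the row’s data is ignored.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of resolving query result rows against a persistence context. */
class QueryResultResolver {

    /** One row of a hypothetical result set: identifier plus a single column. */
    static final class Row {
        final long id;
        final String name;

        Row(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** A hypothetical entity with just an identifier and a name. */
    static final class Item {
        final long id;
        String name;

        Item(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // The persistence context: at most one instance per identifier.
    private final Map<Long, Item> context = new HashMap<>();

    List<Item> resolve(List<Row> resultSet) {
        List<Item> result = new ArrayList<>();
        for (Row row : resultSet) {
            Item item = context.get(row.id);
            if (item == null) {
                // Not yet managed: read the rest of the row and register the instance.
                item = new Item(row.id, row.name);
                context.put(row.id, item);
            }
            // Already managed: the existing instance wins; the row's data is ignored.
            result.add(item);
        }
        return result;
    }
}
```

If the same resolver first sees a row (1, "Old name") and later a row (1, "New name"), both calls return the same Item instance, and its name is still "Old name".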

The persistence context cache is always on—it can’t be turned off. It ensures the following:

  • The persistence layer isn’t vulnerable to stack overflows in the case of circular references in an object graph.
  • There can never be conflicting representations of the same database row at the end of a unit of work. The provider can safely write all changes made to an entity instance to the database.
  • Likewise, changes made in a particular persistence context are always immediately visible to all other code executed inside that unit of work and its persistence context. JPA guarantees repeatable entity-instance reads.

The persistence context provides a guaranteed scope of object identity; in the scope of a single persistence context, only one instance represents a particular database row. Consider the comparison of references entityA == entityB. This is true only if both are references to the same Java instance on the heap. Now, consider the comparison entityA.getId().equals(entityB.getId()). This is true if both have the same database identifier value. Within one persistence context, Hibernate guarantees that both comparisons will yield the same result. This solves one of the fundamental O/R mismatch problems we introduced in section 1.2.3.
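
The identity guarantee follows naturally from keeping one map per persistence context. The sketch below is a toy model, assuming a hypothetical ToyPersistenceContext class whose loader function stands in for a SELECT by primary key; it is not how Hibernate is implemented internally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

/** Toy persistence context: one in-memory instance per identifier. */
class ToyPersistenceContext<T> {

    private final Map<Long, T> managed = new HashMap<>();
    private final LongFunction<T> loader; // stands in for a SELECT by primary key
    private int databaseHits = 0;

    ToyPersistenceContext(LongFunction<T> loader) {
        this.loader = loader;
    }

    /** Lookup by identifier: consult the context first, then the "database". */
    T find(long id) {
        T cached = managed.get(id);
        if (cached != null) {
            return cached; // repeatable read: no database hit
        }
        databaseHits++;
        T loaded = loader.apply(id);
        if (loaded != null) {
            managed.put(id, loaded);
        }
        return loaded;
    }

    int databaseHits() {
        return databaseHits;
    }
}
```

Two find(1L) calls on the same context return the same reference, so a == b holds and only one load happens. Two different contexts each load their own copy, so reference equality across contexts fails even though the identifier values are equal.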

Would process-scoped identity be better?

For a typical web or enterprise application, persistence context-scoped identity is preferred. Process-scoped identity, where only one in-memory instance represents the row in the entire process (JVM), would offer some potential advantages in terms of cache utilization. In a pervasively multithreaded application, though, the cost of always synchronizing shared access to persistent instances in a global identity map is too high a price to pay. It’s simpler and more scalable to have each thread work with a distinct copy of the data in each persistence context.

The life cycle of entity instances and the services provided by the persistence context can be difficult to understand at first. Let’s look at some code examples of dirty checking, caching, and how the guaranteed identity scope works in practice. To do this, you work with the persistence manager API.

10.2. The EntityManager interface

Any transparent persistence tool includes a persistence manager API. This persistence manager usually provides services for basic CRUD (create, read, update, delete) operations, query execution, and controlling the persistence context. In Java Persistence applications, the main interface you interact with is the EntityManager, to create units of work.

10.2.1. The canonical unit of work

In Java SE and some EE architectures (if you only have plain servlets, for example), you get an EntityManager by calling EntityManagerFactory#createEntityManager(). Your application code shares the EntityManagerFactory, representing one persistence unit, or one logical database. Most applications have only one shared EntityManagerFactory.

You use the EntityManager for a single unit of work in a single thread, and it’s inexpensive to create. The following listing shows the canonical, typical form of a unit of work.

Listing 10.1. A typical unit of work

Path: /examples/src/test/java/org/jpwh/test/simple/SimpleTransitions.java

(The TM class is a convenience class bundled with the example code of this book. Here it simplifies the lookup of the standard UserTransaction API in JNDI. The JPA class provides convenient access to the shared EntityManagerFactory.)

Everything between tx.begin() and tx.commit() occurs in one transaction. For now, keep in mind that all database operations in transaction scope, such as the SQL statements executed by Hibernate, either succeed completely or fail completely. Don’t worry too much about the transaction code for now; you’ll read more about concurrency control in the next chapter. We’ll look at the same example again with a focus on the transaction and exception-handling code. Don’t write empty catch clauses in your code, though: you’ll have to roll back the transaction and handle exceptions.

Creating an EntityManager starts its persistence context. Hibernate won’t access the database until necessary; the EntityManager doesn’t obtain a JDBC Connection from the pool until SQL statements have to be executed. You can create and close an EntityManager without hitting the database. Hibernate executes SQL statements when you look up or query data and when it flushes changes detected by the persistence context to the database. Hibernate joins the in-progress system transaction when an EntityManager is created and waits for the transaction to commit. When Hibernate is notified (by the JTA engine) of the commit, it performs dirty checking of the persistence context and synchronizes with the database. You can also force dirty checking synchronization manually by calling EntityManager#flush() at any time during a transaction.

You decide the scope of the persistence context by choosing when to close() the EntityManager. You have to close the persistence context at some point, so always place the close() call in a finally block.

How long should the persistence context be open? Let’s assume for the following examples that you’re writing a server, and each client request will be processed with one persistence context and system transaction in a multithreaded environment. If you’re familiar with servlets, imagine the code in listing 10.1 embedded in a servlet’s service() method. Within this unit of work, you access the EntityManager to load and store data.

10.2.2. Making data persistent

Let’s create a new instance of an entity and bring it from transient into persistent state:

Path: /examples/src/test/java/org/jpwh/test/simple/SimpleTransitions.java

You can see the same unit of work and how the Item instance changes state in figure 10.2.

Figure 10.2. Making an instance persistent in a unit of work

A new transient Item is instantiated as usual. Of course, you may also instantiate it before creating the EntityManager. A call to persist() makes the transient instance of Item persistent. It’s now managed by and associated with the current persistence context.

To store the Item instance in the database, Hibernate has to execute an SQL INSERT statement. When the transaction of this unit of work commits, Hibernate flushes the persistence context, and the INSERT occurs at that time. Hibernate may even batch the INSERT at the JDBC level with other statements. When you call persist(), only the identifier value of the Item is assigned. Alternatively, if your identifier generator isn’t pre-insert, the INSERT statement will be executed immediately when persist() is called. You may want to review section 4.2.5.

Detecting entity state using the identifier

Sometimes you need to know whether an entity instance is transient, persistent, or detached. An entity instance is in persistent state if EntityManager#contains(e) returns true. It’s in transient state if PersistenceUnitUtil#getIdentifier(e) returns null. It’s in detached state if it’s not persistent, and PersistenceUnitUtil#getIdentifier(e) returns the value of the entity’s identifier property. You can get to the PersistenceUnitUtil from the EntityManagerFactory.

There are two issues to look out for. First, be aware that the identifier value may not be assigned and available until the persistence context is flushed. Second, Hibernate (unlike some other JPA providers) never returns null from PersistenceUnitUtil#getIdentifier() if your identifier property is a primitive (a long and not a Long).
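
The decision rule can be written down as a small function. This is a sketch in plain Java with hypothetical names (DetectedState, EntityStateDetector): the two parameters stand for the results you would get from EntityManager#contains(e) and PersistenceUnitUtil#getIdentifier(e), which the sketch doesn’t call itself.

```java
/** The three states that can be told apart with contains() and getIdentifier(). */
enum DetectedState { TRANSIENT, PERSISTENT, DETACHED }

class EntityStateDetector {

    /**
     * @param containedInContext stands for the result of EntityManager#contains(e)
     * @param identifier         stands for the result of PersistenceUnitUtil#getIdentifier(e)
     */
    static DetectedState detect(boolean containedInContext, Object identifier) {
        if (containedInContext) {
            return DetectedState.PERSISTENT;
        }
        if (identifier == null) {
            return DetectedState.TRANSIENT;
        }
        return DetectedState.DETACHED;
    }
}
```

The second issue shows up directly in this rule: a primitive long identifier that was never assigned is reported as 0, not null, so detect(false, 0L) classifies a brand-new instance as DETACHED. With a Long identifier property, the unassigned value is null and the same instance is classified as TRANSIENT.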

It’s better (but not required) to fully initialize the Item instance before managing it with a persistence context. The SQL INSERT statement contains the values that were held by the instance at the point when persist() was called. If you don’t set the name of the Item before making it persistent, a NOT NULL constraint may be violated. You can modify the Item after calling persist(), and your changes will be propagated to the database with an additional SQL UPDATE statement.

If one of the INSERT or UPDATE statements made when flushing fails, Hibernate causes a rollback of changes made to persistent instances in this transaction at the database level. But Hibernate doesn’t roll back in-memory changes to persistent instances. If you change the Item#name after persist(), a commit failure won’t roll back to the old name. This is reasonable because a failure of a transaction is normally non-recoverable, and you have to discard the failed persistence context and EntityManager immediately. We’ll discuss exception handling in the next chapter.

Next, you load and modify the stored data.

10.2.3. Retrieving and modifying persistent data

You can retrieve persistent instances from the database with the EntityManager. For the next example, we assume you’ve kept the identifier value of the Item stored in the previous section somewhere and are now looking up the same instance in a new unit of work by identifier:

Path: /examples/src/test/java/org/jpwh/test/simple/SimpleTransitions.java

Figure 10.3 shows this transition graphically.

Figure 10.3. Making an instance persistent in a unit of work

You don’t need to cast the returned value of the find() operation; it’s a generic method, and its return type is set as a side effect of the first parameter. The retrieved entity instance is in persistent state, and you can now modify it inside the unit of work.

If no persistent instance with the given identifier value can be found, find() returns null. The find() operation always hits the database if there was no hit for the given entity type and identifier in the persistence context cache. The entity instance is always initialized during loading. You can expect to have all of its values available later in detached state: for example, when rendering a screen after you close the persistence context. (Hibernate may not hit the database if its optional second-level cache is enabled; we’ll discuss this shared cache in section 20.2.)

You can modify the Item instance, and the persistence context will detect these changes and record them in the database automatically. When Hibernate flushes the persistence context during commit, it executes the necessary SQL DML statements to synchronize the changes with the database. Hibernate propagates state changes to the database as late as possible, toward the end of the transaction. DML statements usually create locks in the database that are held until the transaction completes, so Hibernate keeps the lock duration in the database as short as possible.

Hibernate writes the new Item#name to the database with an SQL UPDATE. By default, Hibernate includes all columns of the mapped ITEM table in the SQL UPDATE statement, updating unchanged columns to their old values. Hence, Hibernate can generate these basic SQL statements at startup, not at runtime. If you want to include only modified (or non-nullable for INSERT) columns in SQL statements, you can enable dynamic SQL generation as discussed in section 4.3.2.
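
The difference between the two strategies is easy to see in a toy SQL builder. The following sketch is plain Java with made-up names (UpdateSqlSketch, a PRICE column next to NAME); the statements Hibernate actually generates look different in detail. What matters is that the first method needs only mapping metadata, whereas the second needs to know which columns are dirty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class UpdateSqlSketch {

    /** Sets every mapped column; depends only on the mapping, so it can be built once. */
    static String staticUpdate(String table, List<String> columns, String idColumn) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        List<String> assignments = new ArrayList<>();
        for (String column : columns) {
            assignments.add(column + " = ?");
        }
        return "update " + table + " set " + String.join(", ", assignments)
                + " where " + idColumn + " = ?";
    }

    /** Sets only the dirty columns; depends on runtime state, so it is built per flush. */
    static String dynamicUpdate(String table, List<String> columns,
                                Set<String> dirty, String idColumn) {
        List<String> changed = new ArrayList<>();
        for (String column : columns) {
            if (dirty.contains(column)) {
                changed.add(column);
            }
        }
        return staticUpdate(table, changed, idColumn);
    }
}
```

With columns NAME and PRICE, the static form always binds both values, even if only the name changed. The dynamic form with a dirty set containing only NAME produces a statement that binds the name alone.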

Hibernate detects the changed name by comparing the Item with a snapshot copy it took before, when the Item was loaded from the database. If your Item is different from the snapshot, an UPDATE is necessary. This snapshot in the persistence context consumes memory. Dirty checking with snapshots can also be time consuming, because Hibernate has to compare all instances in the persistence context with their snapshot during flushing.
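
Snapshot comparison can be modeled in a few lines of plain Java. This is a simplified sketch, not Hibernate’s implementation: SnapshotDirtyChecker is a hypothetical class, and entity state is represented as a map from property name to value instead of real entity instances.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of snapshot-based dirty checking. */
class SnapshotDirtyChecker {

    // One copy of the loaded state per identifier; this is the memory cost of dirty checking.
    private final Map<Object, Map<String, Object>> snapshots = new HashMap<>();

    /** Called when an instance is loaded: keep a copy of its property values. */
    void takeSnapshot(Object id, Map<String, Object> loadedState) {
        snapshots.put(id, new HashMap<>(loadedState));
    }

    /** Called at flush time: names of properties whose value differs from the snapshot. */
    Set<String> dirtyProperties(Object id, Map<String, Object> currentState) {
        Map<String, Object> snapshot = snapshots.get(id);
        if (snapshot == null) {
            throw new IllegalArgumentException("no snapshot for identifier " + id);
        }
        Set<String> dirty = new TreeSet<>();
        for (Map.Entry<String, Object> entry : currentState.entrySet()) {
            if (!Objects.equals(entry.getValue(), snapshot.get(entry.getKey()))) {
                dirty.add(entry.getKey());
            }
        }
        return dirty;
    }
}
```

An UPDATE is necessary only if dirtyProperties() returns a non-empty set. Both costs mentioned above are visible: every managed instance needs a second copy of its state, and every flush walks all properties of all managed instances.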

You may want to customize how Hibernate detects dirty state, using an extension point. Set the property hibernate.entity_dirtiness_strategy in your persistence.xml configuration file to a class name that implements org.hibernate.CustomEntityDirtinessStrategy. See the Javadoc of this interface for more information. org.hibernate.Interceptor is another extension point used to customize dirty checking, by implementing its findDirty() method. You can find an example interceptor in section 13.2.2.

We mentioned earlier that the persistence context enables repeatable reads of entity instances and provides an object-identity guarantee:

Path: /examples/src/test/java/org/jpwh/test/simple/SimpleTransitions.java

The first find() operation hits the database and retrieves the Item instance with a SELECT statement. The second find() is resolved in the persistence context, and the same cached Item instance is returned.

Sometimes you need an entity instance but you don’t want to hit the database.

10.2.4. Getting a reference

If you don’t want to hit the database when loading an entity instance, because you aren’t sure you need a fully initialized instance, you can tell the EntityManager to attempt the retrieval of a hollow placeholder—a proxy:

Path: /examples/src/test/java/org/jpwh/test/simple/SimpleTransitions.java

  1. If the persistence context already contains an Item with the given identifier, that Item instance is returned by getReference() without hitting the database. Furthermore, if no persistent instance with that identifier is currently managed, Hibernate produces a hollow placeholder: a proxy. This means getReference() won’t access the database, and it doesn’t return null, unlike find().
  2. JPA offers PersistenceUnitUtil helper methods such as isLoaded() to detect whether you’re working with an uninitialized proxy.
  3. As soon as you call any method such as Item#getName() on the proxy, a SELECT is executed to fully initialize the placeholder. The exception to this rule is a mapped database identifier getter method, such as getId(). A proxy may look like the real thing, but it’s only a placeholder carrying the identifier value of the entity instance it represents. If the database record no longer exists when the proxy is initialized, an EntityNotFoundException is thrown. Note that the exception can be thrown when Item#getName() is called.
  4. Hibernate has a convenient static initialize() method that loads the proxy’s data.
  5. After the persistence context is closed, item is in detached state. If you don’t initialize the proxy while the persistence context is still open, you get a LazyInitializationException if you access the proxy. You can’t load data on demand once the persistence context is closed. The solution is simple: load the data before you close the persistence context.
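
The behavior of such a placeholder can be imitated with a hand-written class. This is only a sketch in plain Java: LazyItemProxy is a hypothetical class, the Supplier stands in for the SELECT, and the exceptions are standard JDK exceptions in place of EntityNotFoundException and LazyInitializationException. Hibernate generates its proxies at runtime and doesn’t require you to write anything like this.

```java
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/** Toy model of a hollow placeholder that loads its data on first access. */
class LazyItemProxy {

    private final Long id;
    private final Supplier<String> nameLoader; // stands in for the SELECT
    private boolean contextOpen = true;
    private boolean initialized = false;
    private String name;

    LazyItemProxy(Long id, Supplier<String> nameLoader) {
        this.id = id;
        this.nameLoader = nameLoader;
    }

    /** The identifier is known up front, so this never triggers loading. */
    Long getId() {
        return id;
    }

    boolean isInitialized() {
        return initialized;
    }

    /** Any other accessor initializes the placeholder on first use. */
    String getName() {
        if (!initialized) {
            if (!contextOpen) {
                // Models the failure you get when accessing a detached, uninitialized proxy.
                throw new IllegalStateException("persistence context is closed");
            }
            String loaded = nameLoader.get();
            if (loaded == null) {
                // Models the case where the database record no longer exists.
                throw new NoSuchElementException("no row for identifier " + id);
            }
            name = loaded;
            initialized = true;
        }
        return name;
    }

    /** Models closing the persistence context; the proxy is now detached. */
    void closeContext() {
        contextOpen = false;
    }
}
```

Calling getId() leaves the proxy uninitialized; the first getName() runs the loader once, and later calls reuse the loaded value. If closeContext() is called before the first getName(), the access fails, which is why you load the data before closing the persistence context.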

We’ll have much more to say about proxies, lazy loading, and on-demand fetching in chapter 12.

If you want to remove the state of an entity instance from the database, you have to make it transient.

10.2.5. Making data transient

To make an entity instance transient and delete its database representation, call the remove() method on the EntityManager:

Path: /examples/src/test/java/org/jpwh/test/simple/SimpleTransitions.java

  1. If you call find(), Hibernate executes a SELECT to load the Item. If you call getReference(), Hibernate attempts to avoid the SELECT and returns a proxy.
  2. Calling remove() queues the entity instance for deletion when the unit of work completes; it’s now in removed state. If remove() is called on a proxy, Hibernate executes a SELECT to load the data. An entity instance must be fully initialized during life cycle transitions. You may have life cycle callback methods or an entity listener enabled (see section 13.2), and the instance must pass through these interceptors to complete its full life cycle.
  3. An entity in removed state is no longer in persistent state. You can check this with the contains() operation.
  4. You can make the removed instance persistent again, cancelling the deletion.
  5. When the transaction commits, Hibernate synchronizes the state transitions with the database and executes the SQL DELETE. The JVM garbage collector detects that the item is no longer referenced by anyone and finally deletes the last trace of the data.

Figure 10.4 shows the same process.

Figure 10.4. Removing an instance in a unit of work

By default, Hibernate won’t alter the identifier value of a removed entity instance. This means the item.getId() method still returns the now outdated identifier value. Sometimes it’s useful to work with the “deleted” data further: for example, you might want to save the removed Item again if your user decides to undo. As shown in the example, you can call persist() on a removed instance to cancel the deletion before the persistence context is flushed. Alternatively, if you set the property hibernate.use_identifier_rollback to true in persistence.xml, Hibernate will reset the identifier value after removal of an entity instance. In the previous code example, the identifier value is reset to the default value of null (it’s a Long). The Item is now the same as in transient state, and you can save it again in a new persistence context.

Java Persistence also offers bulk operations that translate into direct SQL DELETE statements without life cycle interceptors in the application. We’ll discuss these operations in section 20.1.

Let’s say you load an entity instance from the database and work with the data. For some reason, you know that another application or maybe another thread of your application has updated the underlying row in the database. Next, we’ll see how to refresh the data held in memory.

10.2.6. Refreshing data

The following example demonstrates refreshing a persistent entity instance:

Path: /examples/src/test/java/org/jpwh/test/simple/SimpleTransitions.java

Item item = em.find(Item.class, ITEM_ID);
item.setName("Some Name");

// Someone updates this row in the database

String oldName = item.getName();
em.refresh(item);
assertNotEquals(item.getName(), oldName);

After you load the entity instance, you realize (how isn’t important) that someone else changed the data in the database. Calling refresh() causes Hibernate to execute a SELECT to read and marshal a whole result set, overwriting changes you already made to the persistent instance in application memory. If the database row no longer exists (someone deleted it), Hibernate throws an EntityNotFoundException on refresh().

Most applications don’t have to manually refresh in-memory state; concurrent modifications are typically resolved at transaction commit time. The best use case for refreshing is with an extended persistence context, which might span several request/response cycles and/or system transactions. While you wait for user input with an open persistence context, data gets stale, and selective refreshing may be required depending on the duration of the conversation and the dialogue between the user and the system. Refreshing can be useful to undo changes made in memory during a conversation, if the user cancels the dialogue. We’ll have more to say about refreshing in a conversation in section 18.3.

Another infrequently used operation is replication of an entity instance.

10.2.7. Replicating data

Replication is useful, for example, when you need to retrieve data from one database and store it in another. Replication takes detached instances loaded in one persistence context and makes them persistent in another persistence context. You usually open these contexts from two different EntityManagerFactory configurations, enabling two logical databases. You have to map the entity in both configurations.

The replicate() operation is only available on the Hibernate Session API. Here is an example that loads an Item instance from one database and copies it into another:

Path: /examples/src/test/java/org/jpwh/test/simple/SimpleTransitions.java

tx.begin();

EntityManager emA = getDatabaseA().createEntityManager();
Item item = emA.find(Item.class, ITEM_ID);

EntityManager emB = getDatabaseB().createEntityManager();
emB.unwrap(Session.class)
    .replicate(item, org.hibernate.ReplicationMode.LATEST_VERSION);

tx.commit();
emA.close();
emB.close();

Connections to both databases can participate in the same system transaction.

ReplicationMode controls the details of the replication procedure:

  • IGNORE—Ignores the instance when there is an existing database row with the same identifier in the database.
  • OVERWRITE—Overwrites any existing database row with the same identifier in the database.
  • EXCEPTION—Throws an exception if there is an existing database row with the same identifier in the target database.
  • LATEST_VERSION—Overwrites the row in the database if its version is older than the version of the given entity instance, or ignores the instance otherwise. Requires enabled optimistic concurrency control with entity versioning (see section 11.2.2).
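
The four modes amount to a small decision table. The sketch below restates it in plain Java with hypothetical types (ToyReplicationMode, ReplicationAction); it isn’t the Hibernate enum. It also assumes that an instance is simply inserted when no row with its identifier exists in the target database, a case the list above doesn’t spell out.

```java
/** What the sketch decides to do with the replicated instance. */
enum ReplicationAction { INSERT, OVERWRITE, SKIP, FAIL }

/** Toy restatement of the replication modes; not org.hibernate.ReplicationMode. */
enum ToyReplicationMode {
    IGNORE, OVERWRITE, EXCEPTION, LATEST_VERSION;

    /**
     * @param rowVersion      version of the existing target row, or null if no row exists
     * @param instanceVersion version of the entity instance being replicated
     */
    ReplicationAction decide(Integer rowVersion, int instanceVersion) {
        if (rowVersion == null) {
            // Assumption: without an existing row, every mode just inserts.
            return ReplicationAction.INSERT;
        }
        switch (this) {
            case IGNORE:
                return ReplicationAction.SKIP;
            case OVERWRITE:
                return ReplicationAction.OVERWRITE;
            case EXCEPTION:
                return ReplicationAction.FAIL;
            case LATEST_VERSION:
                // Overwrite only if the row is older than the given instance.
                return rowVersion < instanceVersion
                        ? ReplicationAction.OVERWRITE
                        : ReplicationAction.SKIP;
            default:
                throw new AssertionError(this);
        }
    }
}
```

For example, ToyReplicationMode.LATEST_VERSION.decide(1, 2) yields OVERWRITE because the row is older than the instance, whereas decide(2, 2) yields SKIP.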

You may need replication when you reconcile data entered into different databases. An example case is a product upgrade: if the new version of your application requires a new database (schema), you may want to migrate and replicate the existing data once.

The persistence context does many things for you: automatic dirty checking, guaranteed scope of object identity, and so on. It’s equally important that you know some of the details of its management, and that you sometimes influence what goes on behind the scenes.

10.2.8. Caching in the persistence context

The persistence context is a cache of persistent instances. Every entity instance in persistent state is associated with the persistence context.

Many Hibernate users who ignore this simple fact run into an OutOfMemoryError. This is typically the case when you load thousands of entity instances in a unit of work but never intend to modify them. Hibernate still has to create a snapshot of each instance in the persistence context cache, which can lead to memory exhaustion. (Obviously, you should execute a bulk data operation if you modify thousands of rows; we’ll get back to this kind of unit of work in section 20.1.)

The persistence context cache never shrinks automatically. Keep the size of your persistence context to the necessary minimum. Often, many persistent instances in your context are there by accident—for example, because you needed only a few items but queried for many. Extremely large graphs can have a serious performance impact and require significant memory for state snapshots. Check that your queries return only data you need, and consider the following ways to control Hibernate’s caching behavior.

You can call EntityManager#detach(i) to evict a persistent instance manually from the persistence context. You can call EntityManager#clear() to detach all persistent entity instances, leaving you with an empty persistence context.

The native Session API has some extra operations you might find useful. You can set the entire persistence context to read-only mode. This disables state snapshots and dirty checking, and Hibernate won’t write modifications to the database:

Path: /examples/src/test/java/org/jpwh/test/fetching/ReadOnly.java

You can disable dirty checking for a single entity instance:

Path: /examples/src/test/java/org/jpwh/test/fetching/ReadOnly.java

A query with the org.hibernate.Query interface can return read-only results, which Hibernate doesn’t check for modifications:

Path: /examples/src/test/java/org/jpwh/test/fetching/ReadOnly.java

Thanks to query hints, you can also disable dirty checking for instances obtained with the JPA standard javax.persistence.Query interface:

Query query = em.createQuery(queryString)
    .setHint(
        org.hibernate.annotations.QueryHints.READ_ONLY,
        true
    );

Be careful with read-only entity instances: you can still delete them, and modifications to collections are tricky! The Hibernate manual has a long list of special cases you need to read if you use these settings with mapped collections. You’ll see more query examples in chapter 14.

So far, flushing and synchronization of the persistence context have occurred automatically, when the transaction commits. In some cases, you need more control over the synchronization process.

10.2.9. Flushing the persistence context

By default, Hibernate flushes the persistence context of an EntityManager and synchronizes changes with the database whenever the joined transaction is committed. All the previous code examples, except some in the last section, have used that strategy. JPA allows implementations to synchronize the persistence context at other times, if they wish.

Hibernate, as a JPA implementation, synchronizes at the following times:

  • When a joined JTA system transaction is committed
  • Before a query is executed—we don’t mean lookup with find() but a query with javax.persistence.Query or the similar Hibernate API
  • When the application calls flush() explicitly

You can control this behavior with the FlushModeType setting of an EntityManager:

Path: /examples/src/test/java/org/jpwh/test/simple/SimpleTransitions.java

Here, you load an Item instance and change its name. Then you query the database, retrieving the item’s name. Usually Hibernate recognizes that data has changed in memory and synchronizes these modifications with the database before the query. This is the behavior of FlushModeType.AUTO, the default if you join the EntityManager with a transaction. With FlushModeType.COMMIT, you’re disabling flushing before queries, so you may see different data returned by the query than what you have in memory. The synchronization then occurs only when the transaction commits.

You can at any time, while a transaction is in progress, force dirty checking and synchronization with the database by calling EntityManager#flush().

This concludes our discussion of the transient, persistent, and removed entity states, and the basic usage of the EntityManager API. Mastering these state transitions and API methods is essential; every JPA application is built with these operations.

Next, we look at the detached entity state. We already mentioned some issues you’ll see when entity instances aren’t associated with a persistence context anymore, such as disabled lazy initialization. Let’s explore the detached state with some examples, so you know what to expect when you work with data outside of a persistence context.

10.3. Working with detached state

If a reference leaves the scope of guaranteed identity, we call it a reference to a detached entity instance. When the persistence context is closed, it no longer provides an identity-mapping service. You’ll run into aliasing problems when you work with detached entity instances, so make sure you understand how to handle the identity of detached instances.

10.3.1. The identity of detached instances

If you look up data using the same database identifier value in the same persistence context, the result is two references to the same in-memory instance on the JVM heap. Consider the two units of work shown next.

Listing 10.2. Guaranteed scope of object identity in Java Persistence

Path: /examples/src/test/java/org/jpwh/test/simple/SimpleTransitions.java

In the first unit of work, at begin(), you start by creating a persistence context and loading some entity instances. Because references a and b are obtained from the same persistence context, they have the same Java identity. They’re equal because by default equals() relies on Java identity comparison. They obviously have the same database identity. They reference the same Item instance, in persistent state, managed by the persistence context for that unit of work. The first part of this example finishes by committing the transaction and closing the persistence context.

References a and b are in detached state when the first persistence context is closed. You’re dealing with instances that live outside of a guaranteed scope of object identity.

You can see that a and c, loaded in a different persistence context, aren’t identical. The test for equality with a.equals(c) is also false. A test for database identity still returns true. This behavior can lead to problems if you treat entity instances as equal in detached state. For example, consider the following extension of the code, after the second unit of work has ended:

This example adds all three references to a Set. All are references to detached instances. Now, if you check the size of the collection—the number of elements—what result do you expect?

A Set doesn’t allow duplicate elements. Duplicates are detected by the Set; whenever you add a reference, the Item#equals() method is called automatically against all other elements already in the collection. If equals() returns true for any element already in the collection, the addition doesn’t occur.

By default, all Java classes inherit the equals() method of java.lang.Object. This implementation uses a double-equals (==) comparison to check whether two references refer to the same in-memory instance on the Java heap.

You may guess that the number of elements in the collection is two. After all, a and b are references to the same in-memory instance; they have been loaded in the same persistence context. You obtained reference c from another persistence context; it refers to a different instance on the heap. You have three references to two instances, but you know this only because you’ve seen the code that loaded the data. In a real application, you may not know that a and b are loaded in a different context than c. Furthermore, you obviously expect that the collection has exactly one element, because a, b, and c represent the same database row, the same Item.
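The effect can be reproduced without a database. The following is a minimal sketch in plain Java, assuming a hypothetical Item stand-in that inherits equals() and hashCode() from java.lang.Object; the two new Item(1L) calls simulate loading the same row in two different persistence contexts.

```java
import java.util.HashSet;
import java.util.Set;

class DefaultEqualityDemo {

    // Hypothetical stand-in for a mapped entity. It does not override equals()
    // or hashCode(), so equality means reference identity.
    static class Item {
        final Long id;

        Item(Long id) {
            this.id = id;
        }
    }

    // Adds three references to a Set and returns the number of elements kept.
    static int distinctCount() {
        Item a = new Item(1L); // instance loaded in the first persistence context
        Item b = a;            // second lookup in the same context: same instance
        Item c = new Item(1L); // same row, loaded again in a second context

        Set<Item> items = new HashSet<>();
        items.add(a);
        items.add(b);
        items.add(c);
        return items.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount()); // prints 2
    }
}
```

The Set keeps two elements, although all three references stand for one database row.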

Whenever you work with instances in detached state and you test them for equality (usually in hash-based collections), you need to supply your own implementation of the equals() and hashCode() methods for your mapped entity class. This is an important issue: if you don’t work with entity instances in detached state, no action is needed, and the default equals() implementation of java.lang.Object is fine. You rely on Hibernate’s guaranteed scope of object identity within a persistence context. Even if you work with detached instances, you don’t have to worry as long as you never test them for equality, never put them in a Set, and never use them as keys in a Map. If all you do is render a detached Item on the screen, you aren’t comparing it to anything.

Many developers new to JPA think they always have to provide a custom equality routine for all entity classes, but this isn’t the case. In section 18.3, we’ll show you an application design with an extended persistence context strategy. This strategy will also extend the scope of guaranteed object identity to span an entire conversation and several system transactions. Note that you still need the discipline not to compare detached instances obtained in two conversations!

Let’s assume that you want to use detached instances and that you have to test them for equality with your own method.

10.3.2. Implementing equality methods

You can implement equals() and hashCode() methods several ways. Keep in mind that when you override equals(), you always need to also override hashCode() so the two methods are consistent. If two instances are equal, they must have the same hash value.

A seemingly clever approach is to implement equals() to compare just the database identifier property, which is often a surrogate primary key value. Basically, if two Item instances have the same identifier returned by getId(), they must be the same. If getId() returns null, it must be a transient Item that hasn’t been saved.

Unfortunately, this solution has one huge problem: identifier values aren’t assigned by Hibernate until an instance becomes persistent. If a transient instance were added to a Set before being saved, then when you save it, its hash value would change while it’s contained by the Set. This is contrary to the contract of java.util.Set, breaking the collection. In particular, this problem makes cascading persistent state useless for mapped associations based on sets. We strongly discourage database identifier equality.
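The problem is easy to demonstrate in plain Java. This sketch assumes a hypothetical Item whose equals() and hashCode() use only the identifier; the setId() call stands in for the identifier assignment that happens when the instance is saved.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class IdentifierEqualityDemo {

    // Hypothetical entity whose equality is based only on the database identifier.
    static class Item {
        private Long id; // null until the instance becomes persistent

        Long getId() {
            return id;
        }

        void setId(Long id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Item)) return false;
            Item that = (Item) other;
            return id != null && id.equals(that.getId());
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(id); // 0 while the identifier is null
        }
    }

    // Returns whether the Set still finds the item after its identifier is assigned.
    static boolean foundAfterIdAssigned() {
        Set<Item> items = new HashSet<>();
        Item item = new Item();
        items.add(item);   // stored under the hash value of a null identifier
        item.setId(42L);   // simulates identifier assignment when saving
        return items.contains(item);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterIdAssigned()); // prints false
    }
}
```

The instance is still in the Set, but the Set can no longer find it, because the hash value changed after insertion.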

To get to the solution that we recommend, you need to understand the notion of a business key. A business key is a property, or some combination of properties, that is unique for each instance with the same database identity. Essentially, it’s the natural key that you would use if you weren’t using a surrogate primary key instead. Unlike a natural primary key, it isn’t an absolute requirement that the business key never changes—as long as it changes rarely, that’s enough.

We argue that essentially every entity class should have a business key, even if it includes all properties of the class (which would be appropriate for some immutable classes). If your user is looking at a list of items on screen, how do they differentiate between items A, B, and C? The same property, or combination of properties, is your business key. The business key is what the user thinks of as uniquely identifying a particular record, whereas the surrogate key is what the application and database systems rely on. The business key property or properties are most likely constrained UNIQUE in your database schema.

Let’s write custom equality methods for the User entity class; this is easier than comparing Item instances. For the User class, username is a great candidate business key. It’s always required, it’s unique with a database constraint, and it changes rarely, if ever.

Listing 10.3. Custom implementation of User equality
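A minimal sketch of such an implementation follows, assuming a simplified User class with a username property and without mapping annotations. The surrounding UserEqualityDemo class and the distinctUsers() helper are hypothetical and exist only to make the sketch runnable.

```java
import java.util.HashSet;
import java.util.Set;

class UserEqualityDemo {

    // Simplified stand-in for the mapped User entity (mapping annotations omitted).
    static class User {
        private Long id;         // surrogate key, deliberately ignored by equals()
        private String username; // business key: required, unique, rarely changed

        User(String username) {
            this.username = username;
        }

        public Long getId() {
            return id;
        }

        public String getUsername() {
            return username;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (other == null) return false;
            if (!(other instanceof User)) return false; // also accepts subclasses
            User that = (User) other;
            // Read the other side through its getter, never through its field
            return this.getUsername().equals(that.getUsername());
        }

        @Override
        public int hashCode() {
            return getUsername().hashCode();
        }
    }

    // Simulates the same row loaded in two persistence contexts.
    static int distinctUsers() {
        User a = new User("johndoe");
        User c = new User("johndoe");
        Set<User> users = new HashSet<>();
        users.add(a);
        users.add(c);
        return users.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctUsers()); // prints 1
    }
}
```

Two instances with the same username now count as one element in a Set, regardless of where they were loaded.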

You may have noticed that the equals() method code always accesses the properties of the “other” reference via getter methods. This is extremely important, because the reference passed as other may be a Hibernate proxy, not the actual instance that holds the persistent state. You can’t access the username field of a User proxy directly. To initialize the proxy to get the property value, you need to access it with a getter method. This is one point where Hibernate isn’t completely transparent, but it’s good practice anyway to use getter methods instead of direct instance variable access.

Check the type of the other reference with instanceof, not by comparing the values of getClass(). Again, the other reference may be a proxy, which is a runtime-generated subclass of User, so this and other may not be exactly the same type but a valid super/subtype. You can find more about proxies in section 12.1.1.

You can now safely compare User references in persistent state:

tx.begin();
em = JPA.createEntityManager();

User a = em.find(User.class, USER_ID);
User b = em.find(User.class, USER_ID);
assertTrue(a == b);
assertTrue(a.equals(b));
assertEquals(a.getId(), b.getId());

tx.commit();
em.close();

In addition, of course, you get correct behavior if you compare references to instances in persistent and detached state.

For some other entities, the business key may be more complex, consisting of a combination of properties. Here are some hints that should help you identify a business key in your domain model classes:

  • Consider what attributes users of your application will refer to when they have to identify an object (in the real world). How do users tell the difference between one element and another if they’re displayed on the screen? This is probably the business key you’re looking for.
  • Every immutable attribute is probably a good candidate for the business key. Mutable attributes may be good candidates, too, if they’re updated rarely or if you can control the case when they’re updated—for example, by ensuring the instances aren’t in a Set at the time.
  • Every attribute that has a UNIQUE database constraint is a good candidate for the business key. Remember that the precision of the business key has to be good enough to avoid overlaps.
  • Any date or time-based attribute, such as the creation timestamp of the record, is usually a good component of a business key, but the accuracy of System.currentTimeMillis() depends on the virtual machine and operating system. Our recommended safety buffer is 50 milliseconds, which may not be accurate enough if the time-based property is the single attribute of a business key.
  • You can use database identifiers as part of the business key. This seems to contradict our previous statements, but we aren’t talking about the database identifier value of the given entity. You may be able to use the database identifier of an associated entity instance. For example, a candidate business key for the Bid class is the identifier of the Item it matches together with the bid amount. You may even have a unique constraint that represents this composite business key in the database schema. You can use the identifier value of the associated Item because it never changes during the life cycle of a Bid—the Bid constructor can require an already-persistent Item.
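The last hint can be sketched as follows. This is an illustration under assumptions, not the book's Bid mapping: the hypothetical Bid below stores the Item identifier directly instead of an Item association, and it compares amounts with compareTo() so that 10.0 and 10.00 count as the same amount.

```java
import java.math.BigDecimal;
import java.util.Objects;

class BidEqualityDemo {

    // Hypothetical Bid whose business key is (identifier of the Item, amount).
    static class Bid {
        private final Long itemId;       // identifier of an already-persistent Item
        private final BigDecimal amount;

        Bid(Long itemId, BigDecimal amount) {
            this.itemId = Objects.requireNonNull(itemId, "Item must be persistent");
            this.amount = Objects.requireNonNull(amount, "amount is required");
        }

        public Long getItemId() {
            return itemId;
        }

        public BigDecimal getAmount() {
            return amount;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Bid)) return false;
            Bid that = (Bid) other;
            return getItemId().equals(that.getItemId())
                && getAmount().compareTo(that.getAmount()) == 0; // ignores scale
        }

        @Override
        public int hashCode() {
            // Strip trailing zeros so amounts that compare as equal hash alike
            return Objects.hash(getItemId(), getAmount().stripTrailingZeros());
        }
    }

    public static void main(String[] args) {
        Bid first = new Bid(1L, new BigDecimal("10.0"));
        Bid second = new Bid(1L, new BigDecimal("10.00"));
        Bid otherItem = new Bid(2L, new BigDecimal("10.0"));

        System.out.println(first.equals(second));    // prints true
        System.out.println(first.equals(otherItem)); // prints false
    }
}
```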

If you follow our advice, you shouldn’t have much difficulty finding a good business key for all your business classes. If you encounter a difficult case, try to solve it without considering Hibernate. After all, it’s purely an object-oriented problem. Notice that it’s almost never correct to override equals() on a subclass and include another property in the comparison. It’s a little tricky to satisfy the Object identity and equality requirements that equality be both symmetric and transitive in this case; and, more important, the business key may not correspond to any well-defined candidate natural key in the database (subclass properties may be mapped to a different table). For more information on customizing equality comparisons, see Effective Java, 2nd edition, by Joshua Bloch (Bloch, 2008), a mandatory book for all Java programmers.
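The symmetry problem with subclass overrides can be shown in a few lines. In this sketch, Administrator and its department property are hypothetical; User is reduced to its business key.

```java
class SubclassEqualityDemo {

    static class User {
        private final String username;

        User(String username) {
            this.username = username;
        }

        public String getUsername() {
            return username;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof User)) return false;
            return getUsername().equals(((User) other).getUsername());
        }

        @Override
        public int hashCode() {
            return getUsername().hashCode();
        }
    }

    // Hypothetical subclass that adds another property to the comparison.
    static class Administrator extends User {
        private final String department;

        Administrator(String username, String department) {
            super(username);
            this.department = department;
        }

        public String getDepartment() {
            return department;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Administrator)) return false;
            return super.equals(other)
                && getDepartment().equals(((Administrator) other).getDepartment());
        }
        // hashCode() is inherited; equal administrators still share a username
    }

    public static void main(String[] args) {
        User user = new User("johndoe");
        User admin = new Administrator("johndoe", "IT");

        System.out.println(user.equals(admin)); // prints true
        System.out.println(admin.equals(user)); // prints false
    }
}
```

The two calls disagree, so equality is no longer symmetric, and collections may behave differently depending on which instance was added first.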

The User class is now prepared for detached state; you can safely put instances loaded in different persistence contexts into a Set. Next, we’ll look at some examples that involve detached state, and you’ll see some of the benefits of this concept.

Sometimes you might want to detach an entity instance manually from the persistence context.

10.3.3. Detaching entity instances

You don’t have to wait for the persistence context to close. You can evict entity instances manually:

Path: /examples/src/test/java/org/jpwh/test/simple/SimpleTransitions.java

User user = em.find(User.class, USER_ID);

em.detach(user);

assertFalse(em.contains(user));

This example also demonstrates the EntityManager#contains() operation, which returns true if the given instance is in managed persistent state in this persistence context.

You can now work with the user reference in detached state. Many applications only read and render the data after the persistence context is closed.

Modifying the loaded user after the persistence context is closed has no effect on its persistent representation in the database. JPA allows you to merge any changes back into the database in a new persistence context, though.

10.3.4. Merging entity instances

Let’s assume you’ve retrieved a User instance in a previous persistence context, and now you want to modify it and save these modifications:

Path: /examples/src/test/java/org/jpwh/test/simple/SimpleTransitions.java

Consider the graphical representation of this procedure in figure 10.5. It’s not as difficult as it seems.

Figure 10.5. Making an instance persistent in a unit of work

The goal is to record the new username of the detached User. First, when you call merge(), Hibernate checks whether a persistent instance in the persistence context has the same database identifier as the detached instance you’re merging.

In this example, the persistence context is empty; nothing has been loaded from the database. Hibernate therefore loads an instance with this identifier from the database. Then, merge() copies the detached entity instance onto this loaded persistent instance. In other words, the new username you have set on the detached User is also set on the persistent merged User, which merge() returns to you.

Now discard the old reference to the stale and outdated detached state; the detachedUser no longer represents the current state. You can continue modifying the returned mergedUser; Hibernate will execute a single UPDATE when it flushes the persistence context during commit.

If there is no persistent instance with the same identifier in the persistence context, and a lookup by identifier in the database is negative, Hibernate instantiates a fresh User. Hibernate then copies your detached instance onto this fresh instance, which it inserts into the database when you synchronize the persistence context with the database.

If the instance you’re giving to merge() is not detached but rather is transient (it doesn’t have an identifier value), Hibernate instantiates a fresh User, copies the values of the transient User onto it, and then makes it persistent and returns it to you. In simpler terms, the merge() operation can handle detached and transient entity instances. Hibernate always returns the result to you as a persistent instance.

An application architecture based on detachment and merging may never call the persist() operation. You can merge new and detached entity instances to store data. The important difference is the returned current state and how you handle this switch of references in your application code. You have to discard the detachedUser and from now on reference the current mergedUser. Every other component in your application still holding on to detachedUser has to switch to mergedUser.
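The switch of references can be illustrated with a toy model in plain Java. This is only a sketch of the copy-and-return behavior described above, not Hibernate's implementation: ToyContext, its maps, and its merge() and contains() methods are hypothetical, and the model ignores transient instances, flushing, and SQL.

```java
import java.util.HashMap;
import java.util.Map;

class MergeSketch {

    static class User {
        final Long id;
        String username;

        User(Long id, String username) {
            this.id = id;
            this.username = username;
        }
    }

    // Toy persistence context: an identity map in front of a map of "rows".
    static class ToyContext {
        private final Map<Long, User> database;                  // stands in for table rows
        private final Map<Long, User> managed = new HashMap<>(); // identifier -> managed instance

        ToyContext(Map<Long, User> database) {
            this.database = database;
        }

        // Copies the state of the given instance onto a managed instance and returns it.
        User merge(User given) {
            User target = managed.get(given.id);
            if (target == null) {
                User row = database.get(given.id);
                // Load a copy of the row, or start from a fresh instance if none exists
                target = new User(given.id, row != null ? row.username : null);
                managed.put(given.id, target);
            }
            target.username = given.username;
            return target;
        }

        boolean contains(User user) {
            return managed.get(user.id) == user;
        }
    }

    public static void main(String[] args) {
        Map<Long, User> database = new HashMap<>();
        database.put(1L, new User(1L, "johndoe"));

        User detachedUser = new User(1L, "doejohn"); // modified outside any context
        ToyContext context = new ToyContext(database);
        User mergedUser = context.merge(detachedUser);

        System.out.println(mergedUser == detachedUser);     // prints false
        System.out.println(context.contains(mergedUser));   // prints true
        System.out.println(context.contains(detachedUser)); // prints false
        System.out.println(mergedUser.username);            // prints doejohn
    }
}
```

The given instance is never managed by the context; only the returned reference is, which is why every holder of detachedUser has to switch.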

Can I reattach a detached instance?

The Hibernate Session API has a method for reattachment called saveOrUpdate(). It accepts either a transient or a detached instance and doesn’t return anything. The given instance will be in persistent state after the operation, so you don’t have to switch references. Hibernate will execute an INSERT if the given instance was transient or an UPDATE if it was detached. We recommend that you rely on merging instead, because it’s standardized and therefore easier to integrate with other frameworks. In addition, instead of an UPDATE, merging may only trigger a SELECT if the detached data wasn’t modified. If you’re wondering what the saveOrUpdateCopy() method of the Session API does, it’s the same as merge() on the EntityManager.

If you want to delete a detached instance, you have to merge it first. Then call remove() on the persistent instance returned by merge().

We’ll look at detached state and merging again in chapter 18 and implement a more complex conversation between a user and the system using this strategy.

10.4. Summary

  • We discussed the most important strategies and some optional ones for interacting with entity instances in a JPA application.
  • You learned about the life cycle of entity instances and how they become persistent, detached, and removed.
  • The most important interface in JPA is the EntityManager.
  • In most applications, data isn’t stored and loaded in isolation. Hibernate is typically integrated in a multiuser application, and the database is accessed concurrently in many threads.