Chapter 4. Mapping persistent classes

In this chapter

  • Understanding entities and value type concepts
  • Mapping entity classes with identity
  • Controlling entity-level mapping options

This chapter presents some fundamental mapping options and explains how to map entity classes to SQL tables. We show and discuss how you can handle database identity and primary keys, and how you can use various other metadata settings to customize how Hibernate loads and stores instances of your domain model classes. All mapping examples use JPA annotations. First, though, we define the essential distinction between entities and value types, and explain how you should approach the object/relational mapping of your domain model.

Major new feature in JPA 2

You can globally enable escaping of all names in generated SQL statements with the <delimited-identifiers> element, which you declare in the persistence unit defaults of an XML mapping file such as orm.xml.

4.1. Understanding entities and value types

When you look at your domain model, you’ll notice a difference between classes: some of the types seem more important, representing first-class business objects (the term object is used here in its natural sense). Examples are the Item, Category, and User classes: these are entities in the real world you’re trying to represent (refer back to figure 3.3 for a view of the example domain model). Other types present in your domain model, such as Address, String, and Integer, seem less important. In this section, we look at what it means to use fine-grained domain models and how to make the distinction between entity and value types.

4.1.1. Fine-grained domain models

A major objective of Hibernate is support for fine-grained and rich domain models. It’s one reason we work with POJOs. In crude terms, fine-grained means more classes than tables.

For example, a user may have a home address in your domain model. In the database, you may have a single USERS table with the columns HOME_STREET, HOME_CITY, and HOME_ZIPCODE. (Remember the problem of SQL types we discussed in section 1.2.1?)

In the domain model, you could use the same approach, representing the address as three string-valued properties of the User class. But it’s much better to model this using an Address class, where User has a homeAddress property. This domain model achieves improved cohesion and greater code reuse, and it’s more understandable than SQL with inflexible type systems.

JPA emphasizes the usefulness of fine-grained classes for implementing type safety and behavior. For example, many people model an email address as a string-valued property of User. A more sophisticated approach is to define an EmailAddress class, which adds higher-level semantics and behavior—it may provide a prepareMail() method (it shouldn’t have a sendMail() method, because you don’t want your domain model classes to depend on the mail subsystem).
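As a rough illustration of such a fine-grained class, the following plain Java sketch wraps the string in an EmailAddress type. The validation rule, the String return type of prepareMail(), and the header format are assumptions made for this example only; the text names the class and the method but doesn't prescribe their details.

```java
import java.util.Objects;

// Hypothetical fine-grained value class; the rules below are illustrative only.
final class EmailAddress {

    private final String value;

    EmailAddress(String value) {
        Objects.requireNonNull(value, "value");
        int at = value.indexOf('@');
        // Deliberately naive check: exactly one '@' with text on both sides.
        if (at <= 0 || at != value.lastIndexOf('@') || at == value.length() - 1) {
            throw new IllegalArgumentException("Not an email address: " + value);
        }
        this.value = value;
    }

    // Prepares a message header but doesn't send anything:
    // the class has no dependency on a mail subsystem.
    String prepareMail(String subject) {
        return "To: " + value + "\nSubject: " + subject;
    }

    // Value semantics: two instances with the same address are equal.
    @Override
    public boolean equals(Object other) {
        return other instanceof EmailAddress
            && value.equals(((EmailAddress) other).value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public String toString() {
        return value;
    }
}
```

A User property typed as EmailAddress can no longer be assigned an arbitrary string by accident, which is the type-safety benefit the paragraph refers to.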

This granularity problem leads us to a distinction of central importance in ORM. In Java, all classes are of equal standing—all instances have their own identity and life cycle. When you introduce persistence, some instances may not have their own identity and life cycle but depend on others. Let’s walk through an example.

4.1.2. Defining application concepts

Two people live in the same house, and they both register user accounts in CaveatEmptor. Let’s call them John and Jane.

An instance of User represents each account. Because you want to load, save, and delete these User instances independently, User is an entity class and not a value type. Finding entity classes is easy.

The User class has a homeAddress property; it’s an association with the Address class. Do both User instances have a runtime reference to the same Address instance, or does each User instance have a reference to its own Address? Does it matter that John and Jane live in the same house?

In figure 4.1, you can see how two User instances share a single Address instance (this is a UML object diagram, not a class diagram). If Address is supposed to support shared runtime references, it’s an entity type. The Address instance has its own life cycle: you can’t delete it when John removes his User account, because Jane might still have a reference to the Address.

Figure 4.1. Two User instances have a reference to a single Address.

Now let’s look at the alternative model where each User has a reference to its own homeAddress instance, as shown in figure 4.2. In this case, you can make an instance of Address dependent on an instance of User: you make it a value type. When John removes his User account, you can safely delete his Address instance. Nobody else will hold a reference.

Figure 4.2. Two User instances each have their own dependent Address.
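The difference between the two object graphs can be sketched in plain Java, without any persistence involved. The classes below are simplified stand-ins for the domain model classes, and the street names are made up for the example.

```java
class SharedReferenceDemo {

    // Plain Java stand-ins for the domain classes; nothing here is mapped.
    static final class Address {
        String street;

        Address(String street) {
            this.street = street;
        }
    }

    static final class User {
        final String name;
        final Address homeAddress;

        User(String name, Address homeAddress) {
            this.name = name;
            this.homeAddress = homeAddress;
        }
    }

    public static void main(String[] args) {
        // Figure 4.1: John and Jane reference the same Address instance.
        Address shared = new Address("1 Shared Street");
        User john = new User("John", shared);
        User jane = new User("Jane", shared);
        john.homeAddress.street = "2 New Street";
        System.out.println(jane.homeAddress.street); // prints "2 New Street"

        // Figure 4.2: each User has its own dependent Address instance.
        User john2 = new User("John", new Address("1 Shared Street"));
        User jane2 = new User("Jane", new Address("1 Shared Street"));
        john2.homeAddress.street = "2 New Street";
        System.out.println(jane2.homeAddress.street); // prints "1 Shared Street"
    }
}
```

In the first graph a change made through John is visible through Jane, so the Address can't simply disappear with John's account. In the second graph nothing else can observe John's Address, which is what makes it safe to treat as a dependent value.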

Hence, we make the following essential distinction:

  • You can retrieve an instance of entity type using its persistent identity: for example, a User, Item, or Category instance. A reference to an entity instance (a pointer in the JVM) is persisted as a reference in the database (a foreign key–constrained value). An entity instance has its own life cycle; it may exist independently of any other entity. You map selected classes of your domain model as entity types.
  • An instance of value type has no persistent identifier property; it belongs to an entity instance. Its lifespan is bound to the owning entity instance. A value type instance doesn’t support shared references. The most obvious value types are all JDK-defined classes such as String, Integer, and even primitives. You can also map your own domain model classes as value types: for example, Address and MonetaryAmount.

If you read the JPA specification, you’ll find the same concept. But value types in JPA are called basic property types or embeddable classes. We come back to this in the next chapter; first our focus is on entities.

Identifying entities and value types in your domain model isn’t an ad hoc task but follows a certain procedure.

4.1.3. Distinguishing entities and value types

You may find it helpful to add stereotype (a UML extensibility mechanism) information to your UML class diagrams so you can immediately recognize entities and value types. This practice also forces you to think about this distinction for all your classes, which is a first step to an optimal mapping and well-performing persistence layer. Figure 4.3 shows an example.

Figure 4.3. Diagramming stereotypes for entities and value types

The Item and User classes are obvious entities. They each have their own identity, their instances have references from many other instances (shared references), and they have independent lifespans.

Marking the Address as a value type is also easy: a single User instance references a particular Address instance. You know this because the association has been created as a composition, where the User instance has been made fully responsible for the life cycle of the referenced Address instance. Therefore, Address instances can’t be referenced by anyone else and don’t need their own identity.

The Bid class could be a problem. In object-oriented modeling, this is marked as a composition (the association between Item and Bid with the diamond). Thus, an Item is the owner of its Bid instances and holds a collection of references. At first, this seems reasonable, because bids in an auction system are useless when the item they were made for is gone.

But what if a future extension of the domain model requires a User#bids collection, containing all bids made by a particular User? Right now, the association between Bid and User is unidirectional; a Bid has a bidder reference. What if this was bidirectional?

In that case, you have to deal with possible shared references to Bid instances, so the Bid class needs to be an entity. It has a dependent life cycle, but it must have its own identity to support (future) shared references.

You’ll often find this kind of mixed behavior; but your first reaction should be to make everything a value-typed class and promote it to an entity only when absolutely necessary. Try to simplify your associations: persistent collections, for example, frequently add complexity without offering any advantages. Instead of mapping Item#bids and User#bids collections, you can write queries to obtain all the bids for an Item and those made by a particular User. The associations in the UML diagram would point from the Bid to the Item and User, unidirectionally, and not the other way. The stereotype on the Bid class would then be <<Value type>>. We come back to this point again in chapter 7.

Next, take your domain model diagram and implement POJOs for all entities and value types. You’ll have to take care of three things:

  • Shared references— Avoid shared references to value type instances when you write your POJO classes. For example, make sure only one User can reference an Address. You can make Address immutable with no public setUser() method and enforce the relationship with a public constructor that has a User argument. Of course, you still need a no-argument, probably protected constructor, as we discussed in the previous chapter, so Hibernate can also create an instance.
  • Life cycle dependencies— If a User is deleted, its Address dependency has to be deleted as well. Persistence metadata will include the cascading rules for all such dependencies, so Hibernate (or the database) can take care of removing the obsolete Address. You must design your application procedures and user interface to respect and expect such dependencies—write your domain model POJOs accordingly.
  • Identity— Entity classes need an identifier property in almost all cases. Value type classes (and of course JDK classes such as String and Integer) don’t have an identifier property, because instances are identified through the owning entity.
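The first point in the list, avoiding shared references, could look like the following plain Java sketch. It leaves out all mapping annotations, and the field names street, zipcode, and city as well as the wrapper class ValueTypeSketch are assumptions for this example, not taken from the book's listings.

```java
import java.util.Objects;

class ValueTypeSketch {

    static class User {
        private String username;
        private Address homeAddress;

        // No-argument constructor so a persistence provider can instantiate the class.
        protected User() {
        }

        public User(String username) {
            this.username = username;
        }

        public String getUsername() {
            return username;
        }

        public Address getHomeAddress() {
            return homeAddress;
        }

        public void setHomeAddress(Address homeAddress) {
            this.homeAddress = homeAddress;
        }
    }

    static class Address {
        // No identifier property: an Address is identified through its owning User.
        private User user;
        private String street;
        private String zipcode;
        private String city;

        // No-argument constructor for the persistence provider only.
        protected Address() {
        }

        // The owning User is fixed at construction time; there is no setUser().
        public Address(User user, String street, String zipcode, String city) {
            this.user = Objects.requireNonNull(user, "user");
            this.street = street;
            this.zipcode = zipcode;
            this.city = city;
        }

        public User getUser() {
            return user;
        }

        public String getStreet() {
            return street;
        }

        public String getZipcode() {
            return zipcode;
        }

        public String getCity() {
            return city;
        }
    }
}
```

Because Address exposes no setters, changing a user's address means creating a new Address instance for that user rather than modifying one that another instance might reference.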

We come back to references, associations, and life cycle rules when we discuss more-advanced mappings throughout later chapters in this book. Object identity and identifier properties are our next topic.

4.2. Mapping entities with identity

Before we can walk through an entity class example and its mapping, you need to understand the difference between Java object identity and object equality, because terms like database identity and the way JPA manages identity build on it. After that, we’ll dig deeper and select a primary key, configure key generators, and finally go through identifier generator strategies.

4.2.1. Understanding Java identity and equality

Java developers understand the difference between Java object identity and equality. Object identity (==) is a notion defined by the Java virtual machine. Two references are identical if they point to the same memory location.

On the other hand, object equality is a notion defined by a class’s equals() method, sometimes also referred to as equivalence. Equivalence means two different (non-identical) instances have the same value—the same state. Two different instances of String are equal if they represent the same sequence of characters, even though each has its own location in the memory space of the virtual machine. (If you’re a Java guru, we acknowledge that String is a special case. Assume we used a different class to make the same point.)

Persistence complicates this picture. With object/relational persistence, a persistent instance is an in-memory representation of a particular row (or rows) of a database table (or tables). Along with Java identity and equality, we define database identity. You now have three methods for distinguishing references:

  • Objects are identical if they occupy the same memory location in the JVM. This can be checked with the a == b operator. This concept is known as object identity.
  • Objects are equal if they have the same state, as defined by the a.equals(Object b) method. Classes that don’t explicitly override this method inherit the implementation defined by java.lang.Object, which compares object identity with ==. This concept is known as object equality.
  • Objects stored in a relational database are identical if they share the same table and primary key value. This concept, mapped into the Java space, is known as database identity.
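The three notions can be compared side by side in plain Java. In this sketch, ItemRow is a hypothetical stand-in for an instance loaded from a table, and sameDatabaseRow() is an illustrative helper, not part of any persistence API.

```java
class IdentityDemo {

    // Hypothetical stand-in for an in-memory copy of a database row.
    // It doesn't override equals(), so it inherits identity comparison from Object.
    static final class ItemRow {
        final Long id;
        final String name;

        ItemRow(Long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Database identity: same type (table) and same non-null primary key value.
    static boolean sameDatabaseRow(ItemRow a, ItemRow b) {
        return a.id != null && a.id.equals(b.id);
    }

    public static void main(String[] args) {
        // Two distinct String instances with the same character sequence.
        String a = new String("auction");
        String b = new String("auction");
        System.out.println(a == b);      // prints "false": different objects
        System.out.println(a.equals(b)); // prints "true": same state

        // Two in-memory copies of the same row.
        ItemRow first = new ItemRow(123L, "Vase");
        ItemRow second = new ItemRow(123L, "Vase");
        System.out.println(first == second);                // prints "false"
        System.out.println(first.equals(second));           // prints "false"
        System.out.println(sameDatabaseRow(first, second)); // prints "true"
    }
}
```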

We now need to look at how database identity relates to object identity and how to express database identity in the mapping metadata. As an example, you’ll map an entity of a domain model.

4.2.2. A first entity class and mapping

We weren’t completely honest in the previous chapter: the @Entity annotation isn’t enough to map a persistent class. You also need an @Id annotation, as shown in the following listing.

Listing 4.1. Mapped Item entity class with an identifier property

Path: /model/src/main/java/org/jpwh/model/simple/Item.java
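The body of the listing isn't reproduced in this text. The following is a minimal sketch reconstructed from the surrounding description: an @Entity class with a java.lang.Long identifier field named id, an @Id mapping, and the named generator ID_GENERATOR mentioned in section 4.2.4. It assumes the JPA API (javax.persistence) is on the classpath and that a generator with that name is configured elsewhere; the original listing may differ in detail.

```java
package org.jpwh.model.simple;

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

// Marks the class as persistence capable; maps to the ITEM table by default.
@Entity
public class Item {

    // @Id on a field: the provider accesses fields directly and treats
    // all fields of the class as persistent state by default.
    @Id
    @GeneratedValue(generator = "ID_GENERATOR") // named generator, configured elsewhere
    protected Long id;

    // A public getter is optional but useful; there is deliberately no setter.
    public Long getId() {
        return id;
    }
}
```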

This is the most basic entity class, marked as “persistence capable” with the @Entity annotation, and with an @Id mapping for the database identifier property. The class maps by default to a table named ITEM in the database schema.

Every entity class has to have an @Id property; it’s how JPA exposes database identity to the application. We don’t show the identifier property in our diagrams; we assume that each entity class has one. In our examples, we always name the identifier property id. This is a good practice for your own project; use the same identifier property name for all your domain model entity classes. If you specify nothing else, this property maps to a primary key column named ID of the ITEM table in your database schema.

Hibernate will use the field to access the identifier property value when loading and storing items, not getter or setter methods. Because @Id is on a field, Hibernate will now enable every field of the class as a persistent property by default. The rule in JPA is this: if @Id is on a field, the JPA provider will access fields of the class directly and consider all fields part of the persistent state by default. You’ll see how to override this later in this chapter—in our experience, field access is often the best choice, because it gives you more freedom for accessor method design.

Should you have a (public) getter method for the identifier property? Well, the application often uses database identifiers as a convenient handle to a particular instance, even outside the persistence layer. For example, it’s common for web applications to display the results of a search screen to the user as a list of summaries. When the user selects a particular element, the application may need to retrieve the selected item, and it’s common to use a lookup by identifier for this purpose—you’ve probably already used identifiers this way, even in applications that rely on JDBC.

Should you have a setter method? Primary key values never change, so you shouldn’t allow modification of the identifier property value. Hibernate won’t update a primary key column, and you shouldn’t expose a public identifier setter method on an entity.

The Java type of the identifier property, java.lang.Long in the previous example, depends on the primary key column type of the ITEM table and how key values are produced. This brings us to the @GeneratedValue annotation and primary keys in general.

4.2.3. Selecting a primary key

The database identifier of an entity is mapped to some table primary key, so let’s first get some background on primary keys without worrying about mappings. Take a step back and think about how you identify entities.

A candidate key is a column or set of columns that you could use to identify a particular row in a table. To become the primary key, a candidate key must satisfy the following requirements:

  • The value of any candidate key column is never null. You can’t identify something with data that is unknown, and there are no nulls in the relational model. Some SQL products allow defining (composite) primary keys with nullable columns, so you must be careful.
  • The value of the candidate key column(s) is a unique value for any row.
  • The value of the candidate key column(s) never changes; it’s immutable.

Must primary keys be immutable?

The relational model defines that a candidate key must be unique and irreducible (no subset of the key attributes has the uniqueness property). Beyond that, picking a candidate key as the primary key is a matter of taste. But Hibernate expects a candidate key to be immutable when used as the primary key. Hibernate doesn’t support updating primary key values with an API; if you try to work around this requirement, you’ll run into problems with Hibernate’s caching and dirty-checking engine. If your database schema relies on updatable primary keys (and maybe uses ON UPDATE CASCADE foreign key constraints), you must change the schema before it will work with Hibernate.

If a table has only one identifying attribute, it becomes, by definition, the primary key. But several columns or combinations of columns may satisfy these properties for a particular table; you choose between candidate keys to decide the best primary key for the table. You should declare candidate keys not chosen as the primary key as unique keys in the database if their value is indeed unique (but maybe not immutable).

Many legacy SQL data models use natural primary keys. A natural key is a key with business meaning: an attribute or combination of attributes that is unique by virtue of its business semantics. Examples of natural keys are the US Social Security Number and Australian Tax File Number. Distinguishing natural keys is simple: if a candidate key attribute has meaning outside the database context, it’s a natural key, regardless of whether it’s automatically generated. Think about the application users: if they refer to a key attribute when talking about and working with the application, it’s a natural key: “Can you send me the pictures of item #123-abc?”

Experience has shown that natural primary keys usually cause problems in the end. A good primary key must be unique, immutable, and never null. Few entity attributes satisfy these requirements, and some that do can’t be efficiently indexed by SQL databases (although this is an implementation detail and shouldn’t be the deciding factor for or against a particular key). In addition, you should make certain that a candidate key definition never changes throughout the lifetime of the database. Changing the value (or even definition) of a primary key, and all foreign keys that refer to it, is a frustrating task. Expect your database schema to survive decades, even if your application won’t.

Furthermore, you can often only find natural candidate keys by combining several columns in a composite natural key. These composite keys, although certainly appropriate for some schema artifacts (like a link table in a many-to-many relationship), potentially make maintenance, ad hoc queries, and schema evolution much more difficult. We talk about composite keys later in the book, in section 9.2.1.

For these reasons, we strongly recommend that you add synthetic identifiers, also called surrogate keys. Surrogate keys have no business meaning—they have unique values generated by the database or application. Application users ideally don’t see or refer to these key values; they’re part of the system internals. Introducing a surrogate key column is also appropriate in the common situation when there are no candidate keys. In other words, (almost) every table in your schema should have a dedicated surrogate primary key column with only this purpose.

There are a number of well-known approaches to generating surrogate key values. The aforementioned @GeneratedValue annotation is how you configure this.

4.2.4. Configuring key generators

The @Id annotation is required to mark the identifier property of an entity class. Without the @GeneratedValue next to it, the JPA provider assumes that you’ll take care of creating and assigning an identifier value before you save an instance. We call this an application-assigned identifier. Assigning an entity identifier manually is necessary when you’re dealing with a legacy database and/or natural primary keys. We have more to say about this kind of mapping in a dedicated section, 9.2.1.

Usually you want the system to generate a primary key value when you save an entity instance, so you write the @GeneratedValue annotation next to @Id. JPA standardizes several value-generation strategies with the javax.persistence.GenerationType enum, which you select with @GeneratedValue(strategy = ...):

  • GenerationType.AUTO—Hibernate picks an appropriate strategy, asking the SQL dialect of your configured database what is best. This is equivalent to @GeneratedValue() without any settings.
  • GenerationType.SEQUENCE—Hibernate expects (and creates, if you use the tools) a sequence named HIBERNATE_SEQUENCE in your database. The sequence will be called separately before every INSERT, producing sequential numeric values.
  • GenerationType.IDENTITY—Hibernate expects (and creates in table DDL) a special auto-incremented primary key column that automatically generates a numeric value on INSERT, in the database.
  • GenerationType.TABLE—Hibernate will use an extra table in your database schema that holds the next numeric primary key value, one row for each entity class. This table will be read and updated accordingly, before INSERTs. The default table name is HIBERNATE_SEQUENCES with columns SEQUENCE_NAME and SEQUENCE_NEXT_HI_VALUE. (The internal implementation uses a more complex but efficient hi/lo generation algorithm; more on this later.)

Although AUTO seems convenient, you usually need more control, so you shouldn’t rely on it; configure a primary key generation strategy explicitly instead. In addition, most applications work with database sequences, but you may want to customize the name and other settings of the database sequence. Therefore, instead of picking one of the JPA strategies, we recommend a mapping of the identifier with @GeneratedValue(generator = "ID_GENERATOR"), as shown in listing 4.1.

This is a named identifier generator; you are now free to set up the ID_GENERATOR configuration independently from your entity classes.

JPA has two built-in annotations you can use to configure named generators: @javax.persistence.SequenceGenerator and @javax.persistence.TableGenerator. With these annotations, you can create a named generator with your own sequence and table names. As usual with JPA annotations, you can unfortunately only use them at the top of a (maybe otherwise empty) class, and not in a package-info.java file.

Hibernate Feature

For this reason, and because the JPA annotations don’t give us access to the full Hibernate feature set, we prefer an alternative: the native @org.hibernate.annotations.GenericGenerator annotation. It supports all Hibernate identifier generator strategies and their configuration details. Unlike the rather limited JPA annotations, you can use the Hibernate annotation in a package-info.java file, typically in the same package as your domain model classes. The next listing shows a recommended configuration.

Listing 4.2. Hibernate identifier generator configured as package-level metadata

Path: /model/src/main/java/org/jpwh/model/package-info.java
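The body of the listing isn't reproduced in this text. The following sketch is consistent with the description that follows it: a generator named ID_GENERATOR using the enhanced-sequence strategy with the sequence_name and initial_value parameters. The sequence name JPWH_SEQUENCE is a hypothetical placeholder, and the initial value of 1000 is inferred from the test-data discussion; the package name follows from the file path. It assumes Hibernate's annotations are on the classpath.

```java
// Package-level metadata: applies to the whole persistence unit,
// not only to classes in this package.
@org.hibernate.annotations.GenericGenerator(
    name = "ID_GENERATOR",            // referenced by @GeneratedValue(generator = ...)
    strategy = "enhanced-sequence",
    parameters = {
        @org.hibernate.annotations.Parameter(
            name = "sequence_name",
            value = "JPWH_SEQUENCE"   // hypothetical sequence name
        ),
        @org.hibernate.annotations.Parameter(
            name = "initial_value",
            value = "1000"            // leaves lower values free for test data
        )
    }
)
package org.jpwh.model;
```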

This Hibernate-specific generator configuration has the following advantages:

  • The enhanced-sequence strategy produces sequential numeric values. If your SQL dialect supports sequences, Hibernate will use an actual database sequence. If your DBMS doesn’t support native sequences, Hibernate will manage and use an extra “sequence table,” simulating the behavior of a sequence. This gives you real portability: the generator can always be called before performing an SQL INSERT, unlike, for example, auto-increment identity columns, which produce a value on INSERT that has to be returned to the application afterward.
  • You can configure the sequence_name. Hibernate will either use an existing sequence or create it when you generate the SQL schema automatically. If your DBMS doesn’t support sequences, this will be the special “sequence table” name.
  • You can start with an initial_value that gives you room for test data. For example, when your integration test runs, Hibernate will make any new data insertions from test code with identifier values greater than 1000. Any test data you want to import before the test can use numbers 1 to 999, and you can refer to the stable identifier values in your tests: “Load item with id 123 and run some tests on it.” This is applied when Hibernate generates the SQL schema and sequence; it’s a DDL option.

You can share the same database sequence among all your domain model classes. There is no harm in specifying @GeneratedValue(generator = "ID_GENERATOR") in all your entity classes. It doesn’t matter if primary key values aren’t contiguous for a particular entity, as long as they’re unique within one table. If you’re worried about contention, because the sequence has to be called prior to every INSERT, we discuss a variation of this generator configuration later, in section 20.1.

Finally, you use java.lang.Long as the type of the identifier property in the entity class, which maps perfectly to a numeric database sequence generator. You could also use a long primitive. The main difference is what someItem.getId() returns on a new item that hasn’t been stored in the database: either null or 0. If you want to test whether an item is new, a null check is probably easier to understand for someone else reading your code. You shouldn’t use another integral type such as int or short for identifiers. Although they will work for a while (perhaps even years), as your database size grows, you may be limited by their range. If you generated a new identifier each millisecond with no gaps, the positive range of an Integer would be exhausted in about 25 days (the full 32-bit range, negative values included, in just under 50 days), whereas the positive range of a Long would last for about 300 million years.
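The range estimate can be checked with a few lines of arithmetic. This sketch assumes one new identifier per millisecond with no gaps and, for simplicity, 365-day years; the class and method names are made up for the example. It reports the int range twice: once for positive values only and once for all 2^32 values.

```java
class IdentifierRangeEstimate {

    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Whole days until 'capacity' identifiers are used up at one per millisecond.
    static long daysToExhaust(long capacity) {
        return capacity / MILLIS_PER_DAY;
    }

    public static void main(String[] args) {
        long intDays = daysToExhaust(Integer.MAX_VALUE); // positive int values only
        long intFullRangeDays = daysToExhaust(1L << 32); // all 32-bit values
        long longYears = daysToExhaust(Long.MAX_VALUE) / 365;

        System.out.println("int, positive values: " + intDays + " days");          // 24 days
        System.out.println("int, full range:      " + intFullRangeDays + " days"); // 49 days
        System.out.println("long, positive values: " + longYears + " years");
    }
}
```

The last line prints a value of roughly 292 million years, which is where the "about 300 million" figure comes from.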

Although recommended for most applications, the enhanced-sequence strategy as shown in listing 4.2 is just one of the strategies built into Hibernate.

4.2.5. Identifier generator strategies

Hibernate Feature

Following is a list of all available Hibernate identifier generator strategies, their options, and our usage recommendations. If you don’t want to read the whole list now, enable GenerationType.AUTO and check what Hibernate defaults to for your database dialect. It’s most likely sequence or identity—a good but maybe not the most efficient or portable choice. If you require consistent portable behavior, and identifier values to be available before INSERTs, use enhanced-sequence, as shown in the previous section. This is a portable, flexible, and modern strategy, also offering various optimizers for large datasets.

We also show the relationship between each standard JPA strategy and its native Hibernate equivalent. Hibernate has been growing organically, so there are now two sets of mappings between standard and native strategies; we call them Old and New in the list. You can switch this mapping with the hibernate.id.new_generator_mappings setting in your persistence.xml file. The default is true; hence the New mapping. Software doesn’t age quite as well as wine.

Generating identifiers before or after INSERT: what’s the difference?

An ORM service tries to optimize SQL INSERTs: for example, by batching several at the JDBC level. Hence, SQL execution occurs as late as possible during a unit of work, not when you call entityManager.persist(someItem). This merely queues the insertion for later execution and, if possible, assigns the identifier value. But if you now call someItem.getId(), you might get null back if the engine wasn’t able to generate an identifier before the INSERT. In general, we prefer pre-insert generation strategies that produce identifier values independently, before INSERT. A common choice is a shared and concurrently accessible database sequence. Autoincremented columns, column default values, or trigger-generated keys are only available after the INSERT.

  • native—Automatically selects other strategies, such as sequence or identity, depending on the configured SQL dialect. You have to look at the Javadoc (or even the source) of the SQL dialect you configured in persistence.xml. Equivalent to JPA GenerationType.AUTO with the Old mapping.
  • sequence—Uses a native database sequence named HIBERNATE_SEQUENCE. The sequence is called before each INSERT of a new row. You can customize the sequence name and provide additional DDL settings; see the Javadoc for the class org.hibernate.id.SequenceGenerator.
  • sequence-identity—Generates key values by calling a database sequence on insertion: for example, insert into ITEM(ID) values (HIBERNATE_SEQUENCE.nextval). The key value is retrieved after INSERT, the same behavior as the identity strategy. Supports the same parameters and property types as the sequence strategy; see the Javadoc for the class org.hibernate.id.SequenceIdentityGenerator and its parent.
  • enhanced-sequence—Uses a native database sequence when supported; otherwise falls back to an extra database table with a single column and row, emulating a sequence. Defaults to name HIBERNATE_SEQUENCE. Always calls the database “sequence” before an INSERT, providing the same behavior independently of whether the DBMS supports real sequences. Supports an org.hibernate.id.enhanced.Optimizer to avoid hitting the database before each INSERT; defaults to no optimization and fetching a new value for each INSERT. You can find more examples in chapter 20. For all parameters, see the Javadoc for the class org.hibernate.id.enhanced.SequenceStyleGenerator. Equivalent to JPA GenerationType.SEQUENCE and GenerationType.AUTO with the New mapping enabled, most likely your best option of the built-in strategies.
  • seqhilo—Uses a native database sequence named HIBERNATE_SEQUENCE, optimizing calls before INSERT by combining hi/lo values. If the hi value retrieved from the sequence is 1, the next 9 insertions will be made with key values 11, 12, 13, ..., 19. Then the sequence is called again to obtain the next hi value (2 or higher), and the procedure repeats with 21, 22, 23, and so on. You can configure the maximum lo value (9 is the default) with the max_lo parameter. Unfortunately, due to a quirk in Hibernate’s code, you can’t configure this strategy in @GenericGenerator. The only way to use it is with JPA GenerationType.SEQUENCE and the Old mapping. You can configure it with the standard JPA @SequenceGenerator annotation on a (maybe otherwise empty) class. See the Javadoc for the class org.hibernate.id.SequenceHiLoGenerator and its parent for more information. Consider using enhanced-sequence instead, with an optimizer.
  • hilo—Uses an extra table named HIBERNATE_UNIQUE_KEY with the same algorithm as the seqhilo strategy. The table has a single column and row, holding the next value of the sequence. The default maximum lo value is 32767, so you most likely want to configure it with the max_lo parameter. See the Javadoc for the class org.hibernate.id.TableHiLoGenerator for more information. We don’t recommend this legacy strategy; use enhanced-sequence instead with an optimizer.
  • enhanced-table—Uses an extra table named HIBERNATE_SEQUENCES, with one row by default representing the sequence, storing the next value. This value is selected and updated when an identifier value has to be generated. You can configure this generator to use multiple rows instead: one for each generator; see the Javadoc for org.hibernate.id.enhanced.TableGenerator. Equivalent to JPA GenerationType.TABLE with the New mapping enabled. Replaces the outdated but similar org.hibernate.id.MultipleHiLoPerTableGenerator, which is the Old mapping for JPA GenerationType.TABLE.
  • identity—Supports IDENTITY and auto-increment columns in DB2, MySQL, MS SQL Server, and Sybase. The identifier value for the primary key column will be generated on INSERT of a row. Has no options. Unfortunately, due to a quirk in Hibernate’s code, you can’t configure this strategy in @GenericGenerator. The only way to use it is with JPA GenerationType.IDENTITY and the Old or New mapping, making it the default for GenerationType.IDENTITY.
  • increment—At Hibernate startup, reads the maximum (numeric) primary key column value of each entity’s table and increments the value by one each time a new row is inserted. Especially efficient if a non-clustered Hibernate application has exclusive access to the database; but don’t use it in any other scenario.
  • select—Hibernate won’t generate a key value or include the primary key column in an INSERT statement. Hibernate expects the DBMS to assign a (default in schema or by trigger) value to the column on insertion. Hibernate then retrieves the primary key column with a SELECT query after insertion. Required parameter is key, naming the database identifier property (such as id) for the SELECT. This strategy isn’t very efficient and should only be used with old JDBC drivers that can’t return generated keys directly.
  • uuid2—Produces a unique 128-bit UUID in the application layer. Useful when you need globally unique identifiers across databases (say, you merge data from several distinct production databases in batch runs every night into an archive). The UUID can be encoded either as a java.lang.String, a byte[16], or a java.util.UUID property in your entity class. Replaces the legacy uuid and uuid.hex strategies. You configure it with an org.hibernate.id.UUIDGenerationStrategy; see the Javadoc for the class org.hibernate.id.UUIDGenerator for more details.
  • guid—Uses a globally unique identifier produced by the database, with an SQL function available on Oracle, Ingres, MS SQL Server, and MySQL. Hibernate calls the database function before an INSERT. Maps to a java.lang.String identifier property.

If you need full control over identifier generation, configure the strategy of @GenericGenerator with the fully qualified name of a class that implements the org.hibernate.id.IdentifierGenerator interface.
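
The hi/lo arithmetic described for the seqhilo and hilo strategies can be sketched in plain Java. This is a simplified illustration, not Hibernate's generator code: the class HiLoSketch and its methods are hypothetical, and an AtomicLong stands in for the database sequence. It assumes the scheme from the seqhilo example, where each hi value covers the keys hi * (max_lo + 1) + 1 through hi * (max_lo + 1) + max_lo.

```java
import java.util.concurrent.atomic.AtomicLong;

// Simplified hi/lo sketch; HiLoSketch is a hypothetical name, not a Hibernate class.
final class HiLoSketch {

    private final AtomicLong sequence;   // stands in for the database sequence
    private final int maxLo;             // largest lo value used per hi value
    private long hi;                     // hi value currently in use
    private int lo;                      // last lo value handed out
    private int sequenceCalls = 0;       // counts simulated database round trips

    HiLoSketch(long firstHi, int maxLo) {
        this.sequence = new AtomicLong(firstHi);
        this.maxLo = maxLo;
        this.lo = maxLo;                 // forces a sequence call on first use
    }

    // Returns the next identifier; the "sequence" is called only when lo is exhausted
    synchronized long next() {
        if (lo >= maxLo) {
            hi = sequence.getAndIncrement();
            sequenceCalls++;
            lo = 0;
        }
        lo++;
        return hi * (maxLo + 1) + lo;
    }

    int sequenceCalls() {
        return sequenceCalls;
    }
}
```

With new HiLoSketch(1, 9), the first nine calls to next() return 11 through 19 after a single simulated sequence call; the tenth call fetches hi value 2 and returns 21.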

To summarize, our recommendations on identifier generator strategies are as follows:

  • In general, we prefer pre-insert generation strategies that produce identifier values independently before INSERT.
  • Use enhanced-sequence, which uses a native database sequence when supported and otherwise falls back to an extra database table with a single column and row, emulating a sequence.

We assume from now on that you’ve added identifier properties to the entity classes of your domain model and that after you complete the basic mapping of each entity and its identifier property, you continue to map the value-typed properties of the entities. We talk about value-type mappings in the next chapter. Read on for some special options that can simplify and enhance your class mappings.

4.3. Entity-mapping options

You’ve now mapped a persistent class with @Entity, using defaults for all other settings, such as the mapped SQL table name. The following section explores some class-level options and how you control them:

  • Naming defaults and strategies
  • Dynamic SQL generation
  • Entity mutability

These are options; you can skip this section and come back later when you have to deal with a specific problem.

4.3.1. Controlling names

Let’s first talk about the naming of entity classes and tables. If you only specify @Entity on the persistence-capable class, the default mapped table name is the same as the class name. Note that we write SQL artifact names in UPPERCASE to make them easier to distinguish—SQL is actually case insensitive. So the Java entity class Item maps to the ITEM table. You can override the table name with the JPA @Table annotation, as shown next.

Listing 4.3. @Table annotation overrides the mapped table name

Path: /model/src/main/java/org/jpwh/model/simple/User.java

@Entity
@Table(name = "USERS")
public class User implements Serializable {
    // ...
}

The User entity would map to the USER table; this is a reserved keyword in most SQL DBMSs. You can’t have a table with that name, so you instead map it to USERS. The @javax.persistence.Table annotation also has catalog and schema options, if your database layout requires these as naming prefixes.

If you really have to, quoting allows you to use reserved SQL names and even work with case-sensitive names.

Quoting SQL identifiers

From time to time, especially in legacy databases, you’ll encounter identifiers with strange characters or whitespace, or wish to force case sensitivity. Or, as in the previous example, the automatic mapping of a class or property would require a table or column name that is a reserved keyword.

Hibernate 5 knows the reserved keywords of your DBMS through the configured database dialect and can automatically put quotes around such strings when generating SQL. You can enable this automatic quoting with hibernate.auto_quote_keyword=true in your persistence unit configuration. If you’re using an older version of Hibernate, or you find that the dialect’s information is incomplete, you must still apply quotes on names manually in your mappings if there is a conflict with a keyword.

If you quote a table or column name in your mapping with backticks, Hibernate always quotes this identifier in the generated SQL. This still works in the latest versions of Hibernate, but JPA 2.0 standardized this functionality as delimited identifiers with double quotes.

This is the Hibernate-only quoting with backticks, modifying the previous example:

@Table(name = "`USER`")

To be JPA-compliant, you also have to escape the quotes in the string:

@Table(name = "\"USER\"")

Either way works fine with Hibernate. It knows the native quote character of your dialect and generates SQL accordingly: [USER] for MS SQL Server, `USER` for MySQL, "USER" for H2, and so on.
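
To make the translation step concrete, here is a toy version of dialect-aware quoting in plain Java. It is only a sketch under stated assumptions, not Hibernate's implementation: the class QuotingSketch, its Dialect enum, and the render() method are hypothetical names introduced for illustration, and only three dialects are modeled.

```java
// Illustrative only: a toy version of dialect-aware quoting, not Hibernate code.
final class QuotingSketch {

    // Hypothetical enum of dialects with their native open/close quote characters
    enum Dialect {
        SQL_SERVER('[', ']'),
        MYSQL('`', '`'),
        H2('"', '"');

        final char open;
        final char close;

        Dialect(char open, char close) {
            this.open = open;
            this.close = close;
        }
    }

    // Accepts a mapping name quoted with backticks (Hibernate style) or
    // double quotes (JPA style) and renders it with the dialect's own quotes.
    static String render(String mappedName, Dialect dialect) {
        boolean quoted = mappedName.length() >= 2
            && ((mappedName.startsWith("`") && mappedName.endsWith("`"))
             || (mappedName.startsWith("\"") && mappedName.endsWith("\"")));
        if (!quoted) {
            return mappedName;   // unquoted names pass through unchanged
        }
        String bare = mappedName.substring(1, mappedName.length() - 1);
        return dialect.open + bare + dialect.close;
    }
}
```

Both mapping styles lead to the same result in this sketch: the quote characters in the mapping only mark the name as quoted, and the target dialect decides which characters appear in the SQL.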

If you have to quote all SQL identifiers, create an orm.xml file and add the setting <delimited-identifiers/> to its <persistence-unit-defaults> section, as shown in listing 3.8. Hibernate then enforces quoted identifiers everywhere.

You should consider renaming tables or columns with reserved keyword names whenever possible. Ad hoc SQL queries are difficult to write in an SQL console if you have to quote and escape everything properly by hand.

Next, you’ll see how Hibernate can help when you encounter organizations with strict conventions for database table and column names.

Implementing naming conventions
Hibernate Feature

Hibernate provides a feature that allows you to enforce naming standards automatically. Suppose that all table names in CaveatEmptor should follow the pattern CE_<table name>. One solution is to manually specify an @Table annotation on all entity classes. This approach is time-consuming and easily forgotten. Instead, you can implement Hibernate’s PhysicalNamingStrategy interface or override an existing implementation, as in the following listing.

Listing 4.4. PhysicalNamingStrategy, overriding default naming conventions

Path: /shared/src/main/java/org/jpwh/shared/CENamingStrategy.java

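import org.hibernate.boot.model.naming.Identifier;
import org.hibernate.boot.model.naming.PhysicalNamingStrategyStandardImpl;
import org.hibernate.engine.jdbc.env.spi.JdbcEnvironment;

public class CENamingStrategy extends PhysicalNamingStrategyStandardImpl {

    // Prepend CE_ to every table name, preserving the quoting flag
    @Override
    public Identifier toPhysicalTableName(Identifier name,
                                          JdbcEnvironment context) {
        return new Identifier("CE_" + name.getText(), name.isQuoted());
    }
}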

The overridden method toPhysicalTableName() prepends CE_ to all generated table names in your schema. Look at the Javadoc of the PhysicalNamingStrategy interface; it offers methods for custom naming of columns, sequences, and other artifacts.

You have to enable the naming-strategy implementation in persistence.xml:

<persistence-unit name="CaveatEmptorPU">
    ...

    <properties>
        <property name="hibernate.physical_naming_strategy"
                  value="org.jpwh.shared.CENamingStrategy"/>
    </properties>
</persistence-unit>

A second option for naming customization is ImplicitNamingStrategy. Whereas the physical naming strategy acts at the lowest level, when schema artifact names are ultimately produced, the implicit-naming strategy is called before. If you map an entity class and don’t have an @Table annotation with an explicit name, the implicit-naming strategy implementation is asked what the table name should be. This is based on factors such as the entity name and class name. Hibernate ships with several strategies to implement legacy- or JPA-compliant default names. The default strategy is ImplicitNamingStrategyJpaCompliantImpl.
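
The order in which the two strategies are consulted can be sketched without any Hibernate classes. The following is an illustration only: NamingPipelineSketch and its methods are hypothetical, the implicit rule is assumed to be the uppercased entity name (matching how SQL names are written in this chapter), and the physical rule is the CE_ prefix from the CaveatEmptor convention.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical sketch of the implicit-then-physical naming order; not Hibernate code.
final class NamingPipelineSketch {

    // Implicit step: only consulted when the mapping gives no explicit table name.
    // Assumption for this sketch: the default is the uppercased entity name.
    static String implicitTableName(String entityName) {
        return entityName.toUpperCase(Locale.ROOT);
    }

    // Physical step: applied last, to explicit and implicit names alike
    static String physicalTableName(String logicalName) {
        return "CE_" + logicalName;
    }

    // Resolves the final table name for an entity with an optional explicit name
    static String resolve(String entityName, Optional<String> explicitTableName) {
        String logical = explicitTableName
            .orElseGet(() -> implicitTableName(entityName));
        return physicalTableName(logical);
    }
}
```

In this sketch an Item entity without an explicit table name resolves to CE_ITEM, whereas a User entity with the explicit name USERS skips the implicit step and resolves to CE_USERS.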

Let’s have a quick look at another related issue, the naming of entities for queries.

Naming entities for querying

By default, all entity names are automatically imported into the namespace of the query engine. In other words, you can use short class names without a package prefix in JPA query strings, which is convenient:

List result = em.createQuery("select i from Item i")
                .getResultList();

This only works when you have one Item class in your persistence unit. If you add another Item class in a different package, you should rename one of them for JPA if you want to continue using the short form in queries:

package my.other.model;
@javax.persistence.Entity(name = "AuctionItem")
public class Item {
    // ...
}

The short query form is now select i from AuctionItem i for the Item class in the my.other.model package. Thus you resolve the naming conflict with another Item class in another package. Of course, you can always use fully qualified long names with the package prefix.
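
The reason two Item classes conflict is easier to see with a toy registry that maps short entity names to classes. This is not how Hibernate is implemented; EntityNameRegistrySketch, its methods, and the org.example package are assumptions made for illustration. The sketch treats a second registration under an already used name as an error, and an explicit entity name as the way around it.

```java
import java.util.HashMap;
import java.util.Map;

// Toy registry of entity names; hypothetical, for illustration only.
final class EntityNameRegistrySketch {

    private final Map<String, String> classNameByEntityName = new HashMap<>();

    // Registers a class under its explicit entity name or, if that is null,
    // under its simple class name without the package prefix.
    void register(String fullyQualifiedClassName, String explicitEntityName) {
        String simpleName = fullyQualifiedClassName
            .substring(fullyQualifiedClassName.lastIndexOf('.') + 1);
        String entityName =
            explicitEntityName != null ? explicitEntityName : simpleName;
        String previous =
            classNameByEntityName.putIfAbsent(entityName, fullyQualifiedClassName);
        if (previous != null) {
            throw new IllegalStateException(
                "Entity name '" + entityName + "' already used by " + previous);
        }
    }

    // Returns the class name registered for a short entity name, or null
    String resolve(String entityName) {
        return classNameByEntityName.get(entityName);
    }
}
```

Registering org.example.Item and then my.other.model.Item without an explicit name fails in this sketch; registering the second class as AuctionItem succeeds, and both short names can then be resolved.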

This completes our tour of the naming options in Hibernate. Next, we discuss how Hibernate generates the SQL that contains these names.

4.3.2. Dynamic SQL generation

Hibernate Feature

By default, Hibernate creates SQL statements for each persistent class when the persistence unit is created, on startup. These statements are simple create, read, update, and delete (CRUD) operations for reading a single row, deleting a row, and so on. It’s cheaper to store these in memory up front, instead of generating SQL strings every time such a simple query has to be executed at runtime. In addition, prepared statement caching at the JDBC level is much more efficient if there are fewer statements.

How can Hibernate create an UPDATE statement on startup? After all, the columns to be updated aren’t known at this time. The answer is that the generated SQL statement updates all columns, and if the value of a particular column isn’t modified, the statement sets it to its old value.

In some situations, such as a legacy table with hundreds of columns where the SQL statements will be large for even the simplest operations (say, only one column needs updating), you should disable this startup SQL generation and switch to dynamic statements generated at runtime. An extremely large number of entities can also impact startup time, because Hibernate has to generate all SQL statements for CRUD up front. Memory consumption for this query statement cache will also be high if a dozen statements must be cached for thousands of entities. This can be an issue in virtual environments with memory limitations, or on low-power devices.

To disable generation of INSERT and UPDATE SQL statements on startup, you need native Hibernate annotations:

@Entity
@org.hibernate.annotations.DynamicInsert
@org.hibernate.annotations.DynamicUpdate
public class Item {
    // ...
}

By enabling dynamic insertion and updates, you tell Hibernate to produce the SQL strings when needed, not up front. The UPDATE will only contain columns with updated values, and the INSERT will only contain columns whose values aren’t null.
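
The difference between the two approaches for UPDATE can be sketched with plain string building. This is a simplified illustration, not the SQL Hibernate actually generates: UpdateSqlSketch and its methods are hypothetical, the identifier column is assumed to be named ID, and state is modeled as a map from column name to value.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.StringJoiner;

// Hypothetical sketch contrasting startup-time and dynamic UPDATE statements.
final class UpdateSqlSketch {

    // Static form: built once, sets every column whether or not it changed
    static String staticUpdate(String table, Iterable<String> columns) {
        StringJoiner set = new StringJoiner(", ");
        for (String column : columns) {
            set.add(column + " = ?");
        }
        return "update " + table + " set " + set + " where ID = ?";
    }

    // Dynamic form: built per operation, sets only columns whose value differs
    // from the loaded state; empty if nothing changed and no UPDATE is needed.
    static Optional<String> dynamicUpdate(String table,
                                          Map<String, Object> loaded,
                                          Map<String, Object> current) {
        StringJoiner set = new StringJoiner(", ");
        for (Map.Entry<String, Object> entry : current.entrySet()) {
            if (!Objects.equals(entry.getValue(), loaded.get(entry.getKey()))) {
                set.add(entry.getKey() + " = ?");
            }
        }
        if (set.length() == 0) {
            return Optional.empty();
        }
        return Optional.of("update " + table + " set " + set + " where ID = ?");
    }
}
```

For a table with hundreds of columns, the static statement carries a placeholder for every one of them, whereas the dynamic statement for a single changed column stays short. The price is that the string is built for each operation instead of once at startup.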

We talk again about SQL generation and customizing SQL in chapter 17. Sometimes you can avoid generating an UPDATE statement altogether, if your entity is immutable.

4.3.3. Making an entity immutable

Hibernate Feature

Instances of a particular class may be immutable. For example, in CaveatEmptor, a Bid made for an item is immutable. Hence, Hibernate never needs to execute UPDATE statements on the BID table. Hibernate can also make a few other optimizations, such as avoiding dirty checking, if you map an immutable class as shown in the next example. Here, the Bid class is immutable and instances are never modified:

@Entity
@org.hibernate.annotations.Immutable
public class Bid {
    // ...
}

A POJO is immutable if no public setter methods for any properties of the class are exposed—all values are set in the constructor. Hibernate should access the fields directly when loading and storing instances. We talked about this earlier in this chapter: if the @Id annotation is on a field, Hibernate will access the fields directly, and you are free to design your getter and setter methods as you see fit. Also, remember that not all frameworks work with POJOs without setter methods; JSF, for example, doesn’t access fields directly to populate an instance.
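
As a plain-Java sketch of that design, the following class exposes a constructor and a getter but no setter. The class name BidSketch and its amount property are hypothetical, and the mapping annotations (@Entity, @Immutable, @Id on a field) are left out so the example compiles without JPA on the classpath.

```java
import java.math.BigDecimal;
import java.util.Objects;

// Immutable-by-design POJO sketch; mapping annotations intentionally omitted.
class BidSketch {

    // With field access, a persistence provider could populate this directly
    protected BigDecimal amount;   // hypothetical property

    // No-argument constructor, as JPA requires for entity classes
    protected BidSketch() {
    }

    // All values are set here; there are no public setter methods
    public BidSketch(BigDecimal amount) {
        this.amount = Objects.requireNonNull(amount, "amount");
    }

    public BigDecimal getAmount() {
        return amount;
    }
}
```

Application code can create a BidSketch and read its amount, but it has no public way to change the value afterward.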

When you can’t create a view in your database schema, you can map an immutable entity class to an SQL SELECT query.

4.3.4. Mapping an entity to a subselect

Hibernate Feature

Sometimes your DBA won’t allow you to change the database schema; even adding a new view might not be possible. Let’s say you want to create a view that contains the identifier of an auction Item and the number of bids made for that item.

Using a Hibernate annotation, you can create an application-level view, a read-only entity class mapped to an SQL SELECT:

Path: /model/src/main/java/org/jpwh/model/advanced/ItemBidSummary.java

When an instance of ItemBidSummary is loaded, Hibernate executes your custom SQL SELECT as a subselect:

Path: /examples/src/test/java/org/jpwh/test/advanced/MappedSubselect.java

ItemBidSummary itemBidSummary = em.find(ItemBidSummary.class, ITEM_ID);
// select * from (
//      select i.ID as ITEMID, i.ITEM_NAME as NAME, ...
// ) where ITEMID = ?

You should list all table names referenced in your SELECT in the @org.hibernate.annotations.Synchronize annotation. (At the time of writing, Hibernate has a bug tracked under issue HHH-8430[1] that makes the synchronized table names case sensitive.) Hibernate will then know it has to flush modifications of Item and Bid instances before it executes a query against ItemBidSummary:

Path: /examples/src/test/java/org/jpwh/test/advanced/MappedSubselect.java

Note that Hibernate doesn’t flush automatically before a find() operation, only before a Query is executed, if necessary. Hibernate detects that the modified Item will affect the result of the query, because the ITEM table is synchronized with ItemBidSummary. Hence, a flush and the UPDATE of the ITEM row are necessary to avoid the query returning stale data.
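
The flush decision can be modeled with a small toy class. It is a sketch of the reasoning only, not Hibernate's persistence context: AutoFlushSketch and its methods are hypothetical, and tables are represented as plain strings.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Toy model of the flush decision; hypothetical, not Hibernate code.
final class AutoFlushSketch {

    private final Set<String> dirtyTables = new HashSet<>(); // unflushed changes
    private int flushCount = 0;

    // Records that an instance mapped to this table was modified in memory
    void modify(String table) {
        dirtyTables.add(table);
    }

    // Lookup by identifier: this sketch never flushes here
    void find() {
        // intentionally no flush
    }

    // Query: flush first if any synchronized table has pending changes
    void query(Set<String> synchronizedTables) {
        if (!Collections.disjoint(dirtyTables, synchronizedTables)) {
            flush();
        }
    }

    private void flush() {
        dirtyTables.clear();   // pending changes are written out
        flushCount++;
    }

    int flushCount() {
        return flushCount;
    }
}
```

After modify("ITEM"), a find() leaves the change pending in this sketch, whereas a query synchronized with ITEM and BID triggers exactly one flush; a second identical query finds nothing pending and doesn't flush again.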

4.4. Summary

  • Entities are the coarser-grained classes of your system. Their instances have an independent life cycle and their own identity, and many other instances can reference them.
  • Value types, on the other hand, are dependent on a particular entity class. A value type instance is bound to its owning entity instance, and only one entity instance can reference it—it has no individual identity.
  • We looked at Java identity, object equality, and database identity, and at what makes good primary keys. You learned which generators for primary key values Hibernate provides out of the box, and how to use and extend this identifier system.
  • We discussed some useful class mapping options, such as naming strategies and dynamic SQL generation.