© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
L. Jungmann et al., Pro Jakarta Persistence in Jakarta EE 10, https://doi.org/10.1007/978-1-4842-7443-9_4

4. Object-Relational Mapping

Lukas Jungmann (1), Mike Keith (2), Merrick Schincariol (3) and Massimo Nardone (4)
(1) Prague, Czech Republic
(2) Ottawa, ON, Canada
(3) Almonte, ON, Canada
(4) Helsinki, Finland

The largest part of an API that persists objects to a relational database ends up being the object-relational mapping (ORM) component. The topic of ORM usually includes everything from how the object state is mapped to the database columns to how to issue queries across the objects. We are focusing this chapter primarily on how to define and map entity state to the database, emphasizing the simple manner in which it can be done.

This chapter introduces the basics of mapping fields to database columns and then goes on to show how to map and automatically generate entity identifiers. We go into some detail about different kinds of relationships and illustrate how they are mapped from the domain model to the data model.

The most important ORM features are
  • Idiomatic persistence: Persistent classes are written as ordinary object-oriented Java classes

  • High performance: It supports fetching and locking techniques

  • Reliability: It provides a stable foundation for Jakarta Persistence programmers

Figure 4-1 shows the Jakarta Persistence ORM architecture.
Figure 4-1

Jakarta Persistence ORM architecture

Persistence Annotations

We have shown in previous chapters how annotations have been used extensively both in the Enterprise Beans and Jakarta Persistence specifications. We discuss persistence and mapping metadata in significant detail, and because we use annotations to explain the concepts, it is worth reviewing a few things about the annotations before we get started.

Persistence annotations can be applied at three different levels: class, method, and field. To annotate any of these levels, the annotation must be placed in front of the code definition of the artifact being annotated. In some cases, we put them on the same line just before the class, method, or field; in other cases, we put them on the line above. The choice is based completely on the preferences of the person applying the annotations, and we think it makes sense to do one thing in some cases and the other in other cases. It depends on how long the annotation is and what the most readable format seems to be.

The Jakarta Persistence annotations were designed to be readable, easy to specify, and flexible enough to allow different combinations of metadata. Most annotations are specified as siblings instead of being nested inside each other, meaning that multiple annotations can annotate the same class, field, or property instead of having annotations embedded within other annotations. As with all trade-offs, the piper must be paid, however, and the cost of flexibility is that many possible permutations of top-level metadata will be syntactically correct but semantically invalid. The compiler will be of no use, but the provider runtime will often do some basic checking for improper annotation groupings. The nature of annotations, however, is that when they are unexpected, they will often just not get noticed at all. This is worth remembering when attempting to understand behavior that might not match what you thought you specified in the annotations. It could be that one or more of the annotations are being ignored.

The mapping annotations fall into one of two categories: logical annotations and physical annotations. The annotations in the logical group are those that describe the entity model from an object modeling view. They are tightly bound to the domain model and are the sort of metadata that you might want to specify in UML or any other object modeling language or framework. The physical annotations relate to the concrete data model in the database. They deal with tables, columns, constraints, and other database-level artifacts that the object model might never be aware of otherwise.

We use both types of annotations throughout the examples to demonstrate the mapping metadata. Understanding and being able to distinguish between these two levels of metadata will help you make decisions about where to declare metadata, and where to use annotations and XML. As you will see in Chapter 13, there are XML equivalents to all the mapping annotations described in this chapter, giving you the freedom to use the approach that best suits your development needs.

Accessing Entity State

The mapped state of an entity must be accessible to the provider at runtime, so that when it comes time to write the data out, it can be obtained from the entity instance and stored in the database. Similarly, when the state is loaded from the database, the provider runtime must be able to insert it into a new entity instance. The way the state is accessed in the entity is called the access mode.

In Chapter 2, you learned that there are two different ways to specify persistent entity state: you can either annotate the fields or annotate the JavaBean-style properties. The mechanism that you use to designate the persistent state is the same as the access mode that the provider uses to access that state. If you annotate fields, the provider will get and set the fields of the entity using reflection. If the annotations are set on the getter methods of properties, those getter and setter methods will be invoked by the provider to access and set the state.

Field Access

Annotating the fields of the entity will cause the provider to use field access to get and set the state of the entity. Getter and setter methods might or might not be present, but if they are present, they are ignored by the provider. All fields must be declared as either protected, package, or private. Public fields are disallowed because it would open up the state fields to access by any unprotected class in the VM. Doing so is not just an obviously bad practice but could also defeat the provider implementation. Of course, the other qualifiers do not prevent classes within the same package or hierarchy from doing the same thing, but there is an obvious trade-off between what should be constrained and what should be recommended. Other classes must use the methods of an entity in order to access its persistent state, and even the entity class itself should only really manipulate the fields directly during initialization.

The example in Listing 4-1 shows the Employee entity being mapped using field access. The @Id annotation indicates not only that the id field is the persistent identifier or primary key for the entity but also that field access should be assumed. The name and salary fields are then defaulted to being persistent, and they get mapped to columns of the same name.
@Entity
public class Employee {
    @Id private long id;
    private String name;
    private long salary;
    public long getId() { return id; }
    public void setId(long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public long getSalary() { return salary; }
    public void setSalary(long salary) { this.salary = salary; }
}
Listing 4-1

Using Field Access
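To make the idea of field access more concrete, the following is a minimal sketch of how state could be read and written reflectively without ever calling a getter or setter. It is not how any particular provider is implemented. The FieldAccessEmployee and FieldAccessSketch classes are hypothetical stand-ins, and the annotations are left out so that the sketch compiles without a persistence provider on the classpath.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for the Employee entity (annotations omitted).
class FieldAccessEmployee {
    private long id;
    private String name;
    private long salary;

    // The getter changes the value on purpose so we can see it is never used below.
    public String getName() { return name == null ? null : name.toUpperCase(); }
}

class FieldAccessSketch {
    // Reads a private field directly, bypassing any getter method.
    static Object readField(Object entity, String fieldName) {
        try {
            Field field = entity.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            return field.get(entity);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes a private field directly, bypassing any setter method.
    static void writeField(Object entity, String fieldName, Object value) {
        try {
            Field field = entity.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            field.set(entity, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FieldAccessEmployee emp = new FieldAccessEmployee();
        writeField(emp, "name", "Ada");
        writeField(emp, "salary", 5000L);
        System.out.println(readField(emp, "name"));   // Ada (the raw field, not the getter result)
        System.out.println(readField(emp, "salary")); // 5000
    }
}
```

Because the state is obtained from the fields themselves, whatever logic the getter and setter methods contain has no effect on what would be stored.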

Property Access

When property access mode is used, the same contract as for JavaBeans applies, and there must be getter and setter methods for the persistent properties. The type of property is determined by the return type of the getter method and must be the same as the type of the single parameter passed into the setter method. Both methods must have either public or protected visibility. The mapping annotations for a property must be on the getter method.

In Listing 4-2, the Employee class has an @Id annotation on the getId() getter method, so the provider will use property access to get and set the state of the entity. The name and salary properties will be made persistent by virtue of the getter and setter methods that exist for them, and will be mapped to NAME and SALARY columns, respectively. Note that the salary property is backed by the wage field, which does not share the same name. This goes unnoticed by the provider because by specifying property access, we are telling the provider to ignore the entity fields and use only the getter and setter methods for naming.
@Entity
public class Employee {
    private long id;
    private String name;
    private long wage;
    @Id public long getId() { return id; }
    public void setId(long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public long getSalary() { return wage; }
    public void setSalary(long salary) { this.wage = salary; }
}
Listing 4-2

Using Property Access
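The naming consequence of property access can be illustrated with a small reflective sketch. It discovers properties purely from getter and setter pairs, so the salary property is found even though its backing field is called wage. This is only an illustration under simplifying assumptions (it ignores boolean "is" getters and inherited methods), and PropertyAccessEmployee and PropertyAccessSketch are hypothetical names.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the Employee entity (annotations omitted).
class PropertyAccessEmployee {
    private long id;
    private String name;
    private long wage; // backing field deliberately named differently from the property

    public long getId() { return id; }
    public void setId(long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public long getSalary() { return wage; }
    public void setSalary(long salary) { this.wage = salary; }
}

class PropertyAccessSketch {
    // Returns property name -> value for every getXxx() that has a matching setXxx().
    static Map<String, Object> readProperties(Object entity) {
        Map<String, Object> state = new TreeMap<>();
        Class<?> type = entity.getClass();
        for (Method getter : type.getDeclaredMethods()) {
            String methodName = getter.getName();
            if (!methodName.startsWith("get") || methodName.length() <= 3
                    || getter.getParameterCount() != 0) {
                continue;
            }
            try {
                // The setter must accept the getter's return type.
                type.getDeclaredMethod("set" + methodName.substring(3), getter.getReturnType());
                String property = Character.toLowerCase(methodName.charAt(3)) + methodName.substring(4);
                state.put(property, getter.invoke(entity));
            } catch (NoSuchMethodException e) {
                // No matching setter: not treated as a persistent property in this sketch.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return state;
    }

    public static void main(String[] args) {
        PropertyAccessEmployee emp = new PropertyAccessEmployee();
        emp.setName("Ada");
        emp.setSalary(4200L);
        System.out.println(readProperties(emp)); // {id=0, name=Ada, salary=4200}
    }
}
```

The wage field never appears in the result, which matches the point made above: with property access, only the accessor methods determine the names.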

Mixed Access

It is also possible to combine field access with property access within the same entity hierarchy, or even within the same entity. This will not be a very common occurrence, but can be useful, for example, when an entity subclass is added to an existing hierarchy that uses a different access type. Adding an @Access annotation with a specified access mode on the subclass entity will cause the default access type to be overridden for that entity subclass.

The @Access annotation is also useful when you need to perform a simple transformation to the data when reading from or writing to the database. Usually you will want to access the data through field access, but in this case you will define a getter/setter method pair to perform the transformation and use property access for that one attribute. In general, there are three essential steps to add a persistent field or property to be accessed differently from the default access mode for that entity.

Consider an Employee entity that has a default access mode of FIELD, but the database column stores the area code as part of the phone number, and we only want to store the area code in the entity phoneNum field if it is not a local number. We can add a persistent property that transforms it accordingly on reads and writes.

The first thing that must be done is to explicitly mark the default access mode for the class by annotating it with the @Access annotation and indicating the access type. Unless this is done, the behavior is undefined when both fields and properties are annotated. We would tag our Employee entity as having FIELD access:
@Entity
@Access(AccessType.FIELD)
public class Employee { ... }
The next step is to annotate the additional field or property with the @Access annotation, but this time specifying the opposite access type from what was specified at the class level. It might seem a little redundant, for example, to specify the access type of AccessType.PROPERTY on a persistent property because it is obvious by looking at it that it is a property, but doing so indicates that what you are doing is not an oversight but a conscious exception to the default case:
@Access(AccessType.PROPERTY) @Column(name="PHONE")
protected String getPhoneNumberForDb() { ... }
The final point to remember is that the corresponding field or property to the one being made persistent must be marked as transient so that the default accessing rules do not cause the same state to be persisted twice. For example, because we are adding a persistent property to an entity for which the default access type is through fields, the field in which the persistent property state is being stored in the entity must be annotated with @Transient:
@Transient private String phoneNum;
Listing 4-3 shows the complete Employee entity class annotated to use property access for only one property.
@Entity
@Access(AccessType.FIELD)
public class Employee {
    public static final String LOCAL_AREA_CODE = "613";
    @Id private long id;
    @Transient private String phoneNum;
    ...
    public long getId() { return id; }
    public void setId(long id) { this.id = id; }
    public String getPhoneNumber() { return phoneNum; }
    public void setPhoneNumber(String num) { this.phoneNum = num; }
    @Access(AccessType.PROPERTY) @Column(name="PHONE")
    protected String getPhoneNumberForDb() {
        // Guard against an employee with no phone number
        if (phoneNum == null || phoneNum.length() == 10)
            return phoneNum;
        else
            return LOCAL_AREA_CODE + phoneNum;
    }
    protected void setPhoneNumberForDb(String num) {
        // A NULL column value is passed through unchanged
        if (num != null && num.startsWith(LOCAL_AREA_CODE))
            phoneNum = num.substring(3);
        else
            phoneNum = num;
    }
    ...
}
Listing 4-3

Using Combined Access
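The transformation performed by the property accessors can be tried out in isolation. The following sketch copies the logic into two static methods so that the round trip between the entity value and the column value is easy to follow. The class name and the toDb and fromDb method names are hypothetical, and passing null through unchanged is a choice made for this sketch.

```java
class PhoneNumberTransformSketch {
    static final String LOCAL_AREA_CODE = "613";

    // Entity value -> column value: the column always holds the area code.
    static String toDb(String phoneNum) {
        if (phoneNum == null || phoneNum.length() == 10) {
            return phoneNum;
        }
        return LOCAL_AREA_CODE + phoneNum;
    }

    // Column value -> entity value: local numbers are stored without the area code.
    static String fromDb(String dbValue) {
        if (dbValue != null && dbValue.startsWith(LOCAL_AREA_CODE)) {
            return dbValue.substring(3);
        }
        return dbValue;
    }

    public static void main(String[] args) {
        System.out.println(toDb("5551234"));      // 6135551234
        System.out.println(toDb("4165551234"));   // 4165551234
        System.out.println(fromDb("6135551234")); // 5551234
        System.out.println(fromDb("4165551234")); // 4165551234
    }
}
```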

Mapping to a Table

You saw in Chapter 2 that in the simplest case, mapping an entity to a table does not need any mapping annotations at all. Only the @Entity and @Id annotations need to be specified to create and map an entity to a database table.

In those cases, the default table name, which was just the unqualified name of the entity class, was perfectly suitable. If it happens that the default table name is not the name that you like, or if a suitable table that contains the state already exists in your database with a different name, you must specify the name of the table. You do this by annotating the entity class with the @Table annotation and including the name of the table using the name element. Many databases have terse names for tables. Listing 4-4 shows an entity that is mapped to a table that has a name that’s different from its class name.
@Entity
@Table(name="EMP")
public class Employee { ... }
Listing 4-4

Overriding the Default Table Name

Tip

Default names are not specified to be either uppercase or lowercase. Most databases are not case-sensitive, so it won’t generally matter whether a vendor uses the case of the entity name or converts it to uppercase. In Chapter 10, we discuss how to delimit database identifiers when the database is set to be case-sensitive.

The @Table annotation provides the ability to not only name the table that the entity state is being stored in but also to name a database schema or catalog. The schema name is commonly used to differentiate one set of tables from another and is indicated by using the schema element. Listing 4-5 shows an Employee entity that is mapped to the EMP table in the HR schema.
@Entity
@Table(name="EMP", schema="HR")
public class Employee { ... }
Listing 4-5

Setting a Schema

When specified, the schema name will be prepended to the table name when the persistence provider goes to the database to access the table. In this case, the HR schema will be prepended to the EMP table each time the table is accessed.

Tip

Some vendors might allow the schema to be included in the name element of the table without having to specify the schema element, such as in @Table(name="HR.EMP"). Support for inlining the name of the schema with the table name is nonstandard.

Some databases support the notion of a catalog. For these databases, the catalog element of the @Table annotation can be specified. Listing 4-6 shows a catalog being explicitly set for the EMP table.
@Entity
@Table(name="EMP", catalog="HR")
public class Employee { ... }
Listing 4-6

Setting a Catalog

Mapping Simple Types

Simple Java types are mapped as part of the immediate state of an entity in its fields or properties. The list of persistable types is quite lengthy and includes pretty much every built-in type that you would want to persist. They include the following:
  • Primitive Java types: byte, int, short, long, boolean, char, float, and double

  • Wrapper classes of primitive Java types: Byte, Integer, Short, Long, Boolean, Character, Float, and Double

  • Byte and character array types: byte[], Byte[], char[], and Character[]

  • Large numeric types: java.math.BigInteger and java.math.BigDecimal

  • Strings: java.lang.String

  • Java temporal types: java.util.Date and java.util.Calendar

  • JDBC temporal types: java.sql.Date, java.sql.Time, and java.sql.Timestamp

  • Enumerated types: Any system or user-defined enumerated type

  • Serializable objects: Any system or user-defined serializable type

Sometimes the type of the database column being mapped to is not exactly the same as the Java type. In almost all cases, the provider runtime can convert the type returned by JDBC into the correct Java type of the attribute. If the type from the JDBC layer cannot be converted to the Java type of the field or property, an exception will normally be thrown, although it is not guaranteed.

Tip

When the persistent type does not match the JDBC type, some providers might choose to take proprietary action or make a best guess to convert between the two. In other cases, the JDBC driver might be performing the conversion on its own.

When persisting a field or property, the provider looks at the type and checks whether it is one of the persistable types listed earlier. If it is on the list, the provider will persist it using the appropriate JDBC type and pass it through to the JDBC driver. If it is not on the list and is not serializable, the result is unspecified. The provider might choose to throw an exception or just try to pass the object through to JDBC. You will see in Chapter 10 how converters can be used to extend the list of types that can be persisted in Jakarta Persistence.

An optional @Basic annotation can be placed on a field or property to explicitly mark it as being persistent. This annotation is mostly for documentation purposes and is not required for the field or property to be persistent. If it is not there, then it is implicitly assumed in the absence of any other mapping annotation. Because of the name of this annotation, mappings of simple types are called basic mappings, whether the @Basic annotation is actually present or is just being assumed.

Note

Now that you have seen how you can persist either fields or properties and how they are virtually equivalent in terms of persistence, we will just call them attributes. An attribute is a field or property of a class, and we will use the term attribute from now on to avoid having to continually refer to fields or properties in specific terms.

Column Mappings

The @Basic annotation (or assumed basic mapping in its absence) can be thought of as a logical indication that a given attribute is persistent. The physical annotation that is the companion annotation to the basic mapping is the @Column annotation. Specifying @Column on the attribute indicates specific characteristics of the physical database column that the object model is less concerned about. In fact, the object model might never even need to know to which column it is mapped, and the column name and physical mapping metadata can be located in a separate XML file.

A number of annotation elements can be specified as part of @Column, but most of them apply only to schema generation and are covered later in the book. The only one that is of consequence is the name element, which is just a string that specifies the name of the column that the attribute has been mapped to. This is used when the default column name is not appropriate or does not apply to the schema being used. You can think of the name element of the @Column annotation as a means of overriding the default column name that would have otherwise been applied. The example in Listing 4-7 shows how to override the default column name for an attribute.
@Entity
public class Employee {
    @Id
    @Column(name="EMP_ID")
    private long id;
    private String name;
    @Column(name="SAL")
    private long salary;
    @Column(name="COMM")
    private String comments;
    // ...
}
Listing 4-7

Mapping Attributes to Columns

To put these annotations in context, let’s look at the full table mapping represented by this entity. The first thing to notice is that no @Table annotation exists on the class, so the default table name of EMPLOYEE will be applied to it.

Next, note that @Column can be used with @Id mappings as well as with basic mappings. The id field is being overridden to map to the EMP_ID column instead of the default ID column. The name field is not annotated with @Column, so the default column name NAME would be used to store and retrieve the employee name. The salary and comments fields, however, are annotated to map to the SAL and COMM columns, respectively. The Employee entity is therefore mapped to the table shown in Figure 4-2.
Figure 4-2

EMPLOYEE entity table
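The column-naming rule just described can be sketched in a few lines of plain Java: use the overriding name when one is given, otherwise fall back to a default derived from the attribute name. The ColumnName annotation here is a hypothetical stand-in for @Column reduced to its name element, and uppercasing the default is an assumption that follows the examples in this chapter rather than anything the specification requires.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for @Column, keeping only the name element.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface ColumnName {
    String value();
}

// Mirrors the shape of the Employee entity in Listing 4-7.
class ColumnMappedEmployee {
    @ColumnName("EMP_ID") private long id;
    private String name;
    @ColumnName("SAL") private long salary;
    @ColumnName("COMM") private String comments;
}

class ColumnNamingSketch {
    // Returns attribute name -> column name, sorted by attribute name.
    static Map<String, String> columnNames(Class<?> entityType) {
        Map<String, String> columns = new TreeMap<>();
        for (Field field : entityType.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler- or tool-generated fields
            }
            ColumnName override = field.getAnnotation(ColumnName.class);
            String column = (override != null)
                    ? override.value()
                    : field.getName().toUpperCase(Locale.ROOT); // assumed default
            columns.put(field.getName(), column);
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(columnNames(ColumnMappedEmployee.class));
        // {comments=COMM, id=EMP_ID, name=NAME, salary=SAL}
    }
}
```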

Lazy Fetching

On occasion, it will be known ahead of time that certain portions of an entity will be seldom accessed. In these situations, you can optimize the performance when retrieving the entity by fetching only the data that you expect to be frequently accessed; the remainder of the data can be fetched only when or if it is required. There are many names for this kind of feature, including lazy loading, deferred loading, lazy fetching, on-demand fetching, just-in-time reading, indirection, and others. They all mean pretty much the same thing, which is just that some data might not be loaded when the object is initially read from the database, but will be fetched only when referenced or accessed.

The fetch type of a basic mapping can be configured to be lazily or eagerly loaded by specifying the fetch element in the corresponding @Basic annotation. The FetchType enumerated type defines the values for this element, which can be either EAGER or LAZY. Setting the fetch type of a basic mapping to LAZY means that the provider might defer loading the state for that attribute until it is referenced. The default is to load all basic mappings eagerly. Listing 4-8 shows an example of overriding a basic mapping to be lazily loaded.
@Entity
public class Employee {
    // ...
    @Basic(fetch=FetchType.LAZY)
    @Column(name="COMM")
    private String comments;
    // ...
}
Listing 4-8

Lazy Field Loading

We are assuming in this example that applications will seldom access the comments in an employee record, so we mark it as being lazily fetched. Note that in this case, the @Basic annotation is not only present for documentation purposes but is also required in order to specify the fetch type for the field. Configuring the comments field to be fetched lazily will allow an Employee instance returned from a query to have the comments field empty. The application does not have to do anything special to get it, however. By simply accessing the comments field, it will be transparently read and filled in by the provider if it was not already loaded.
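The general load-on-first-access idea can be sketched independently of any provider. The LazyValue class below is a hypothetical holder that calls a loader the first time its value is requested and caches the result afterward. Real providers typically achieve the same effect transparently, for example through bytecode enhancement, so this is an illustration of the concept only.

```java
import java.util.function.Supplier;

// Hypothetical holder: loads its value on first access, then caches it.
class LazyValue<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    LazyValue(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        if (!loaded) {
            value = loader.get(); // stands in for the deferred database read
            loaded = true;
        }
        return value;
    }

    boolean isLoaded() {
        return loaded;
    }
}

class LazyFetchSketch {
    static int loads = 0; // counts simulated database reads

    public static void main(String[] args) {
        LazyValue<String> comments = new LazyValue<>(() -> {
            loads++;
            return "Promoted in 2021";
        });
        System.out.println(comments.isLoaded()); // false
        System.out.println(comments.get());      // Promoted in 2021
        System.out.println(comments.get());      // Promoted in 2021
        System.out.println(loads);               // 1
    }
}
```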

Before you use this feature, you should be aware of a few pertinent points about lazy attribute fetching. First and foremost, the directive to lazily fetch an attribute is meant only to be a hint to the persistence provider to help the application achieve better performance. The provider is not required to respect the request because the behavior of the entity is not compromised if the provider goes ahead and loads the attribute. The converse is not true, though, because specifying that an attribute be eagerly fetched might be critical to being able to access the entity state once the entity is detached from the persistence context. We discuss detachment more in Chapter 6 and explore the connection between lazy loading and detachment.

Second, on the surface it might appear that this is a good idea for certain attributes of an entity, but in practice it is almost never a good idea to lazily fetch simple types. There is little to be gained in returning only part of a database row unless you are certain that the state will not be accessed in the entity later on. The only times when lazy loading of a basic mapping should be considered are when there are many columns in a table (e.g., dozens or hundreds) or when the columns are large (e.g., very large character strings or byte strings). It could take significant resources to load the data, and not loading it could save quite a lot of effort, time, and resources. Unless one of these two cases applies, lazily fetching a subset of object attributes will usually end up being more expensive than eagerly fetching them.

Lazy fetching is quite relevant when it comes to relationship mappings, though, so we discuss this topic later in the chapter.

Large Objects

A common database term for a character or byte-based object that can be very large (up to the gigabyte range) is a large object, or LOB for short. Database columns that can store these types of large objects require special JDBC calls to be accessed from Java. To signal to the provider that it should use the LOB methods when passing and retrieving this data to and from the JDBC driver, an additional annotation must be added to the basic mapping. The @Lob annotation acts as the marker annotation to fulfill this purpose and might appear in conjunction with the @Basic annotation, or it might appear when @Basic is absent and implicitly assumed to be on the mapping.

Because the @Lob annotation is really just qualifying the basic mapping, it can also be accompanied by a @Column annotation when the name of the LOB column needs to be overridden from the assumed default name.

LOBs come in two flavors in the database: character large objects, called CLOBs, and binary large objects, or BLOBs. As their names imply, a CLOB column holds a large character sequence, and a BLOB column can store a large byte sequence. The Java types mapped to BLOB columns are byte[], Byte[], and Serializable types, while char[], Character[], and String objects are mapped to CLOB columns. The provider is responsible for making this distinction based on the type of the attribute being mapped.
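The type-based choice between the two kinds of LOB can be written down as a simple decision function. This is only a sketch of the rule stated above, not provider code, and the LobKindSketch class and lobKindFor method are hypothetical names. Note that the character types have to be tested first, because String and arrays are themselves serializable in Java.

```java
import java.io.Serializable;

class LobKindSketch {
    // Returns "CLOB" or "BLOB" for a Java attribute type, following the rule in the text.
    static String lobKindFor(Class<?> attributeType) {
        // Character-based types first: String and char[] would also pass the Serializable test.
        if (attributeType == String.class
                || attributeType == char[].class
                || attributeType == Character[].class) {
            return "CLOB";
        }
        if (attributeType == byte[].class
                || attributeType == Byte[].class
                || Serializable.class.isAssignableFrom(attributeType)) {
            return "BLOB";
        }
        throw new IllegalArgumentException("Not a LOB-mappable type: " + attributeType.getName());
    }

    public static void main(String[] args) {
        System.out.println(lobKindFor(byte[].class));              // BLOB
        System.out.println(lobKindFor(String.class));              // CLOB
        System.out.println(lobKindFor(java.util.ArrayList.class)); // BLOB
    }
}
```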

An example of mapping an image to a BLOB column is shown in Listing 4-9. Here, the PIC column is assumed to be a BLOB column to store the employee picture that is in the picture field. We have also marked this field to be loaded lazily, a common practice applied to LOBs that do not get referenced often.
@Entity
public class Employee {
    @Id
    private long id;
    @Basic(fetch=FetchType.LAZY)
    @Lob @Column(name="PIC")
    private byte[] picture;
    // ...
}
Listing 4-9

Mapping a BLOB Column

Enumerated Types

Another of the simple types that might be treated specially is the enumerated type. The values of an enumerated type are constants that can be handled differently depending on the application needs.

As with enumerated types in other languages, the values of an enumerated type in Java have an implicit ordinal assignment that is determined by the order in which they were declared. This ordinal cannot be modified at runtime and can be used to represent and store the values of the enumerated type in the database. Interpreting the values as ordinals is the default way that providers will map enumerated types to the database, and the provider will assume that the database column is an integer type.

Consider the following enumerated type:
public enum EmployeeType {
    FULL_TIME_EMPLOYEE,
    PART_TIME_EMPLOYEE,
    CONTRACT_EMPLOYEE
}
The ordinals assigned to the values of this enumerated type at compile time would be 0 for FULL_TIME_EMPLOYEE, 1 for PART_TIME_EMPLOYEE, and 2 for CONTRACT_EMPLOYEE. In Listing 4-10, we define a persistent field of this type.
@Entity
public class Employee {
    @Id private long id;
    private EmployeeType type;
    // ...
}
Listing 4-10

Mapping an Enumerated Type Using Ordinals

You can see that mapping EmployeeType is trivially easy to the point where you don’t actually have to do anything at all. The defaults are applied, and everything will just work. The type field will get mapped to an integer TYPE column, and all full-time employees will have an ordinal of 0 assigned to them. Similarly, the other employees will have their types stored in the TYPE column accordingly.

If an enumerated type changes, however, then we have a problem. The persisted ordinal data in the database will no longer apply to the correct value. In this example, if the company benefits policy changed and we started giving additional benefits to part-time employees who worked more than 20 hours per week, we would want to differentiate between the two types of part-time employees. By adding a PART_TIME_BENEFITS_EMPLOYEE value after PART_TIME_EMPLOYEE, we would be causing a new ordinal assignment to occur, where our new value would get assigned the ordinal of 2 and CONTRACT_EMPLOYEE would get 3. This would have the effect of causing all the contract employees on record to suddenly become part-time employees with benefits, clearly not the result that we were hoping for.
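The ordinal shift is easy to reproduce in plain Java. In the sketch below, EmployeeTypeV1 and EmployeeTypeV2 are hypothetical names for the enumerated type before and after the new value is inserted. An ordinal written with the old definition is read back with the new one, the way a stored integer column value would be.

```java
// The enumerated type as originally declared.
enum EmployeeTypeV1 {
    FULL_TIME_EMPLOYEE,
    PART_TIME_EMPLOYEE,
    CONTRACT_EMPLOYEE
}

// The same type after inserting a value in the middle.
enum EmployeeTypeV2 {
    FULL_TIME_EMPLOYEE,
    PART_TIME_EMPLOYEE,
    PART_TIME_BENEFITS_EMPLOYEE,
    CONTRACT_EMPLOYEE
}

class OrdinalShiftSketch {
    // Interprets a stored ordinal using the new declaration order.
    static EmployeeTypeV2 readWithNewEnum(int storedOrdinal) {
        return EmployeeTypeV2.values()[storedOrdinal];
    }

    public static void main(String[] args) {
        int stored = EmployeeTypeV1.CONTRACT_EMPLOYEE.ordinal();
        System.out.println(stored);                  // 2
        System.out.println(readWithNewEnum(stored)); // PART_TIME_BENEFITS_EMPLOYEE
    }
}
```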

We could go through the database and adjust all the Employee entities to have their correct type, but if the employee type is used elsewhere, then we would need to make sure that they were all fixed as well. This is not a good maintenance situation to be in.

A better solution would be to store the name of the value as a string instead of storing the ordinal. This would isolate us from any changes in declaration and allow us to add new types without having to worry about the existing data. We can do this by adding an @Enumerated annotation on the attribute and specifying a value of STRING.

The @Enumerated annotation actually allows an EnumType to be specified, and the EnumType is itself an enumerated type that defines values of ORDINAL and STRING. While it is somewhat ironic that an enumerated type is being used to indicate how the provider should represent enumerated types, it is wholly appropriate. Because the default value of @Enumerated is ORDINAL, specifying @Enumerated(EnumType.ORDINAL) is useful only when you want to make this mapping explicit.

In Listing 4-11, we are storing strings for the enumerated values. Now the TYPE column must be a string-based type, and all of the full-time employees will have the string FULL_TIME_EMPLOYEE stored in their corresponding TYPE column.
@Entity
public class Employee {
    @Id
    private long id;
    @Enumerated(EnumType.STRING)
    private EmployeeType type;
    // ...
}
Listing 4-11

Mapping an Enumerated Type Using Strings

Note that using strings will solve the problem of inserting additional values in the middle of the enumerated type, but it will leave the data vulnerable to changes in the names of the values. For instance, if we wanted to change PART_TIME_EMPLOYEE to PT_EMPLOYEE, then we would be in trouble. This is a less likely problem, though, because changing the names of an enumerated type would cause all the code that uses the enumerated type to have to change also. This would be a bigger bother than reassigning values in a database column.
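Both sides of this trade-off can be checked with Enum.valueOf, which resolves a value by name in the same way a stored string would have to be resolved. The nested enumerated types in this sketch are hypothetical snapshots of the type at different points in time. A stored name survives an insertion in the middle, but it can no longer be resolved once the value is renamed.

```java
class EnumNameSketch {
    // After inserting a new value in the middle.
    enum AfterInsert {
        FULL_TIME_EMPLOYEE, PART_TIME_EMPLOYEE, PART_TIME_BENEFITS_EMPLOYEE, CONTRACT_EMPLOYEE
    }

    // After renaming PART_TIME_EMPLOYEE to PT_EMPLOYEE.
    enum AfterRename {
        FULL_TIME_EMPLOYEE, PT_EMPLOYEE, CONTRACT_EMPLOYEE
    }

    // Resolves a stored name against the given enumerated type.
    static <E extends Enum<E>> E read(Class<E> type, String storedName) {
        return Enum.valueOf(type, storedName);
    }

    // Reports whether a stored name still matches one of the declared values.
    static <E extends Enum<E>> boolean isReadable(Class<E> type, String storedName) {
        try {
            Enum.valueOf(type, storedName);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(read(AfterInsert.class, "CONTRACT_EMPLOYEE"));        // CONTRACT_EMPLOYEE
        System.out.println(isReadable(AfterRename.class, "PART_TIME_EMPLOYEE")); // false
    }
}
```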

In general, storing the ordinal is the best and most efficient way to store enumerated types as long as the likelihood of additional values being inserted in the middle is not high. New values could still be added on the end of the type without any negative consequences.

One final note about enumerated types is that they are defined quite flexibly in Java. In fact, it is even possible to have values that contain state. There is currently no support within Jakarta Persistence for mapping state contained within enumerated values. Neither is there support for the compromise position between STRING and ORDINAL of explicitly mapping each enumerated value to a dedicated numeric value different from its compiler-assigned ordinal value. More extensive enumerated support is being considered for future releases.

Temporal Types

Temporal types are the set of time-based types that can be used in persistent state mappings. The list of supported temporal types includes the three java.sql types—java.sql.Date, java.sql.Time, and java.sql.Timestamp—and the two java.util types, java.util.Date and java.util.Calendar.

The java.sql types are completely hassle-free. They act just like any other simple mapping type and do not need any special consideration. The two java.util types need additional metadata, however, to indicate which of the JDBC java.sql types to use when communicating with the JDBC driver. This is done by annotating them with the @Temporal annotation and specifying the JDBC type as a value of the TemporalType enumerated type. There are three enumerated values of DATE, TIME, and TIMESTAMP to represent each of the java.sql types.

Listing 4-12 shows how java.util.Date and java.util.Calendar can be mapped to date columns in the database.
@Entity
public class Employee {
    @Id
    private long id;
    @Temporal(TemporalType.DATE)
    private Calendar dob;
    @Temporal(TemporalType.DATE)
    @Column(name="S_DATE")
    private Date startDate;
    // ...
}
Listing 4-12

Mapping Temporal Types

Like the other varieties of basic mappings, the @Column annotation can be used to override the default column name.
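To make the three choices concrete, the following sketch shows the java.sql type that each temporal kind corresponds to for the same java.util.Date value. It is an illustration only: the Kind enum is a local stand-in for TemporalType, the helper names are hypothetical, and a real provider performs this binding internally through the JDBC driver.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.Date;

final class TemporalKindSketch {
    // Local stand-in mirroring DATE, TIME, and TIMESTAMP; not the real TemporalType.
    enum Kind { DATE, TIME, TIMESTAMP }

    // Wrap the java.util.Date in the java.sql type that matches the chosen kind.
    static Date toJdbcValue(Kind kind, Date value) {
        long millis = value.getTime();
        switch (kind) {
            case DATE:
                return new java.sql.Date(millis);      // presented as a date only
            case TIME:
                return new java.sql.Time(millis);      // presented as a time of day only
            default:
                return new java.sql.Timestamp(millis); // date and time together
        }
    }

    // A fixed sample instant in the default time zone (hypothetical value).
    static Date sampleDate() {
        return Date.from(LocalDateTime.of(2022, 1, 15, 9, 30, 0)
                .atZone(ZoneId.systemDefault()).toInstant());
    }

    public static void main(String[] args) {
        for (Kind kind : Kind.values()) {
            System.out.println(kind + " -> " + toJdbcValue(kind, sampleDate()));
        }
    }
}
```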

Transient State

Attributes that are part of a persistent entity but not intended to be persistent can either be modified with the transient modifier in Java or be annotated with the @Transient annotation. If either is specified, the provider runtime will not apply its default mapping rules to the attribute on which it was specified.

Transient fields are used for various reasons. One might be the case earlier on in the chapter when we mixed the access mode and didn’t want to persist the same state twice. Another might be when you want to cache some in-memory state that you don’t want to have to recompute, rediscover, or reinitialize. For example, in Listing 4-13 we are using a transient field to save the correct locale-specific word for Employee so that we print it correctly wherever it is being displayed. We have used the transient modifier instead of the @Transient annotation so that if the Employee gets serialized from one VM to another, then the translated name will get reinitialized to correspond to the locale of the new VM. In cases where the nonpersistent value should be retained across serialization, the annotation should be used instead of the modifier.
@Entity
public class Employee {
    @Id private long id;
    private String name;
    private long salary;
    transient private String translatedName;
    // ...
    public String toString() {
        if (translatedName == null) {
            translatedName =
                ResourceBundle.getBundle("EmpResources").getString("Employee");
        }
        return translatedName + ": " + id + " " + name;
    }
}
Listing 4-13

Using a Transient Field
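The serialization difference between the two approaches can be checked with plain Java serialization. In the sketch below, the Employee class is a simplified, non-entity stand-in, and the cachedLabel field is a hypothetical example of a field that would carry only the @Transient annotation: it is ignored by the provider but, having no transient modifier, is still serialized.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class TransientFieldSketch {
    // Simplified stand-in for the entity; no persistence provider is involved.
    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        long id = 42;
        transient String translatedName; // Java modifier: dropped during serialization
        String cachedLabel;              // no modifier: retained during serialization
    }

    // Serialize and deserialize in memory, as if sending the object to another VM.
    static Employee roundTrip(Employee original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Employee) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Employee emp = new Employee();
        emp.translatedName = "Mitarbeiter";
        emp.cachedLabel = "kept";
        Employee copy = roundTrip(emp);
        System.out.println(copy.translatedName); // null, so it would be reinitialized
        System.out.println(copy.cachedLabel);    // kept
    }
}
```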

Mapping the Primary Key

Every entity that is mapped to a relational database must have a mapping to a primary key in the table. You have already learned the basics of how the @Id annotation indicates the identifier of the entity. In this section, you explore simple identifiers and primary keys in a little more depth and learn how you can let the persistence provider generate unique identifier values.

Note

When an entity identifier is composed of only a single attribute, it's called a simple identifier.

Overriding the Primary Key Column

The same defaulting rules apply to ID mappings as to basic mappings, which is that the name of the column is assumed to be the same as the name of the attribute. Just as with basic mappings, the @Column annotation can be used to override the column name that the ID attribute is mapped to.

Primary keys are assumed to be insertable, but not nullable or updatable. When overriding a primary key column, the nullable and updatable elements should not be overridden. Only in the very specific circumstance of mapping the same column to multiple fields/relationships (as described in Chapter 10) should the insertable element be set to false.

Primary Key Types

Except for its special significance in designating the mapping to the primary key column, an ID mapping is almost the same as the basic mapping. The other main difference is that ID mappings are generally restricted to the following types:
  • Primitive Java types: byte, int, short, long, and char

  • Wrapper classes of primitive Java types: Byte, Integer, Short, Long, and Character

  • String: java.lang.String

  • Large numeric type: java.math.BigInteger

  • Temporal types: java.util.Date and java.sql.Date

Floating point types such as float and double are also permitted, as well as the Float and Double wrapper classes and java.math.BigDecimal, but they are discouraged because of the nature of rounding error and the untrustworthiness of equality comparisons (the == operator and the equals() method) when applied to them. Using floating types for primary keys is a risky endeavor and is definitely not recommended.
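The equality problem is easy to reproduce in plain Java. The sketch below uses hypothetical key values; it shows two double values that should represent the same key failing to match and two BigDecimal values that are numerically equal but not equal according to equals().

```java
import java.math.BigDecimal;

final class FloatingKeySketch {
    // A computed key and a literal key that are mathematically equal.
    static boolean doubleKeysMatch() {
        Double computed = 0.1 + 0.2; // rounding error makes this differ from 0.3
        Double literal = 0.3;
        return computed.equals(literal);
    }

    // BigDecimal.equals() also compares scale, so 1.0 and 1.00 are different keys.
    static boolean decimalKeysMatch() {
        return new BigDecimal("1.0").equals(new BigDecimal("1.00"));
    }

    // compareTo() ignores scale, but it is not what identity lookups rely on.
    static boolean decimalKeysCompareEqual() {
        return new BigDecimal("1.0").compareTo(new BigDecimal("1.00")) == 0;
    }

    public static void main(String[] args) {
        System.out.println(doubleKeysMatch());         // false
        System.out.println(decimalKeysMatch());        // false
        System.out.println(decimalKeysCompareEqual()); // true
    }
}
```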

Identifier Generation

Sometimes applications do not want to be bothered with trying to define and ensure uniqueness in some aspect of their domain model and are content to let the identifier values be automatically generated for them. This is called ID generation and is specified by the @GeneratedValue annotation.

When ID generation is enabled, the persistence provider will generate an identifier value for every instance of that entity type. Once the identifier value is obtained, the provider will insert it into the newly persisted entity; however, depending on the way it is generated, it might not actually be present in the object until the entity has been inserted in the database. In other words, the application cannot rely on being able to access the identifier until after either a flush has occurred or the transaction has completed.

Applications can choose one of the four different ID generation strategies by specifying a strategy in the strategy element. The value can be any one of the AUTO, TABLE, SEQUENCE, or IDENTITY values of the GenerationType enumerated type.

Table and sequence generators can be specifically defined and then reused by multiple entity classes. These generators are named and are globally accessible to all the entities in the persistence unit.

Automatic ID Generation

If an application does not care what kind of generation is used by the provider but wants generation to occur, it can specify a strategy of AUTO. This means that the provider will use whatever strategy it wants to generate identifiers. Listing 4-14 shows an example of using automatic ID generation. This will cause an identifier value to be created by the provider and inserted into the id field of each Employee entity that gets persisted.

Tip

It is not explicitly required that the entity identifier field be an integral type, but it is typically the only type that AUTO will create. We recommend that long be used to accommodate the full extent of the generated identifier domain.

@Entity
public class Employee {
    @Id @GeneratedValue(strategy=GenerationType.AUTO)
    private long id;
    // ...
}
Listing 4-14

Using Auto ID Generation

There is a catch to using AUTO, though. The provider gets to pick its own strategy to store the identifiers, but it needs to have some kind of persistent resource in order to do so. For example, if it chooses a table-based strategy, it needs to create a table; if it chooses a sequence-based strategy, it needs to create a sequence. The provider can’t always rely on the database connection that it obtains from the server to have permissions to create a table in the database. This is normally a privileged operation that is often restricted to the DBA. There will need to be some kind of creation phase or schema generation to cause the resource to be created before the AUTO strategy is able to function.

The AUTO mode is really a generation strategy for development or prototyping. It works well as a means of getting you up and running more quickly when the database schema is being generated. In any other situation, it would be better to use one of the other generation strategies discussed in the later sections.

ID Generation Using a Table

The most flexible and portable way to generate identifiers is to use a database table. Not only will it port to different databases but it also allows for storing multiple different identifier sequences for different entities within the same table.

An ID generation table should have two columns. The first column is a string type used to identify the particular generator sequence. It is the primary key for all the generators in the table. The second column is an integral type that stores the actual ID sequence that is being generated. The value stored in this column is the last identifier that was allocated in the sequence. Each defined generator represents a row in the table.

The easiest way to use a table to generate identifiers is to simply specify the generation strategy to be TABLE in the strategy element:
@Id @GeneratedValue(strategy=GenerationType.TABLE)
private long id;

Because the generation strategy is indicated but no generator has been specified, the provider will assume a table of its own choosing. If schema generation is used, it will be created; if not, the default table assumed by the provider must be known and must exist in the database.

A more explicit approach would be to actually specify the table that is to be used for ID storage. This is done by defining a table generator that, contrary to what its name implies, does not actually generate tables. Rather, it is an identifier generator that uses a table to store the identifier values. We can define one by using a @TableGenerator annotation and then refer to it by name in the @GeneratedValue annotation:
@TableGenerator(name="Emp_Gen")
@Id @GeneratedValue(generator="Emp_Gen")
private long id;

Although we are showing the @TableGenerator annotating the identifier attribute, it can actually be defined on any attribute or class. Regardless of where it is defined, it will be available to the entire persistence unit. A good practice would be to define it locally on the ID attribute if only one class is using it but to define it in XML, as described in Chapter 13, if it will be used for multiple classes.

The name element globally names the generator, allowing us to reference it in the generator element of the @GeneratedValue annotation. This is functionally equivalent to the previous example where we simply said that we wanted to use table generation but did not specify the generator. Now we are specifying the name of the generator but not supplying any of the generator details, leaving them to be defaulted by the provider.

A further qualifying approach would be to specify the table details, as in the following:
@TableGenerator(name="Emp_Gen",
    table="ID_GEN",
    pkColumnName="GEN_NAME",
    valueColumnName="GEN_VAL")

We have included some additional elements after the name of the generator. Following the name are three elements—table, pkColumnName, and valueColumnName—that define the actual table that stores the identifiers for Emp_Gen.

The table element just indicates the name of the table. The pkColumnName element is the name of the primary key column in the table that uniquely identifies the generator, and the valueColumnName element is the name of the column that stores the actual ID sequence value being generated. In this case, the table is named ID_GEN, the name of the primary key column (the column that stores the generator names) is named GEN_NAME, and the column that stores the ID sequence values is named GEN_VAL.

The name of the generator becomes the value stored in the pkColumnName column for that row and is used by the provider to look up the generator to obtain its last allocated value.

In our example, we named our generator Emp_Gen so our table would look like the one in Figure 4-3.
Figure 4-3

Table for identifier generation

Note that the last allocated Employee identifier is 0, which tells us that no identifiers have been generated yet. An initialValue element representing the last allocated identifier can be specified as part of the generator definition, but the default setting of 0 will suffice in almost every case. This setting is used only during schema generation when the table is created. During subsequent executions, the provider will read the contents of the value column to determine the next identifier to give out.

To avoid updating the row for every single identifier that gets requested, an allocation size is used. This will cause the provider to preallocate a block of identifiers and then give out identifiers from memory as requested until the block is used up. Once this block is used up, the next request for an identifier triggers another block of identifiers to be preallocated, and the identifier value is incremented by the allocation size. By default, the allocation size is set to 50. This value can be overridden to be larger or smaller through the use of the allocationSize element when defining the generator.
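The block allocation described above can be sketched in plain Java. This is an in-memory simulation under stated assumptions, not any provider's actual implementation: the map stands in for the generator table, the class and member names are hypothetical, and we assume the stored value is the last identifier allocated, so each new block starts at the stored value plus one.

```java
import java.util.HashMap;
import java.util.Map;

final class TableGeneratorSketch {
    // Stand-in for the generator table: generator name -> last allocated value.
    private final Map<String, Long> generatorTable = new HashMap<>();
    private final String generatorName;
    private final int allocationSize;
    private long next = 1;     // next identifier to hand out from memory
    private long blockEnd = 0; // last identifier of the current block (empty at start)
    int tableUpdates;          // how many times the table row was updated

    TableGeneratorSketch(String generatorName, long initialValue, int allocationSize) {
        this.generatorName = generatorName;
        this.allocationSize = allocationSize;
        generatorTable.put(generatorName, initialValue);
    }

    long nextId() {
        if (next > blockEnd) {
            // Block used up: reserve another block with a single row update.
            long lastAllocated = generatorTable.get(generatorName);
            blockEnd = lastAllocated + allocationSize;
            generatorTable.put(generatorName, blockEnd);
            next = lastAllocated + 1;
            tableUpdates++;
        }
        return next++;
    }

    long storedValue() {
        return generatorTable.get(generatorName);
    }

    public static void main(String[] args) {
        TableGeneratorSketch gen = new TableGeneratorSketch("Emp_Gen", 0, 50);
        for (int i = 0; i < 51; i++) {
            gen.nextId();
        }
        // 51 identifiers were handed out with only two row updates.
        System.out.println(gen.tableUpdates + " updates, stored value " + gen.storedValue());
    }
}
```

With the default allocation size of 50, the row is touched once per 50 identifiers instead of once per identifier, which is the point of preallocation.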

Tip

The provider might allocate identifiers within the same transaction as the entity being persisted or in a separate transaction. It is not specified, but you should check your provider documentation to see how it can avoid the risk of deadlock when concurrent threads are creating entities and locking resources.

Listing 4-15 shows an example of defining a second generator to be used for Address entities but that uses the same ID_GEN table to store the identifier sequence. In this case, we are actually explicitly dictating the value we are storing in the identifier table’s primary key column by specifying the pkColumnValue element. This element allows the name of the generator to be different from the column value, although doing so is rarely needed. The example shows an Address ID generator named Address_Gen but then defines the value stored in the table for Address ID generation as Addr_Gen. The generator also sets the initial value to 10000 and the allocation size to 100.
@TableGenerator(name="Address_Gen",
    table="ID_GEN",
    pkColumnName="GEN_NAME",
    valueColumnName="GEN_VAL",
    pkColumnValue="Addr_Gen",
    initialValue=10000,
    allocationSize=100)
@Id @GeneratedValue(generator="Address_Gen")
private long id;
Listing 4-15

Using Table ID Generation

If both Emp_Gen and Address_Gen generators were defined, then on application startup the ID_GEN table should look like Figure 4-4. As the application allocates identifiers, the values stored in the GEN_VAL column will increase.
Figure 4-4

Table for generating address and employee identifiers

If you haven’t used the automatic schema generation feature (discussed in Chapter 14), the table must already exist or be created in the database through some other means and be configured to be in this state when the application starts up for the first time. The following SQL could be applied to create and initialize this table:
CREATE TABLE id_gen (
    gen_name VARCHAR(80),
    gen_val INTEGER,
    CONSTRAINT pk_id_gen
        PRIMARY KEY (gen_name)
);
INSERT INTO id_gen (gen_name, gen_val) VALUES ('Emp_Gen', 0);
INSERT INTO id_gen (gen_name, gen_val) VALUES ('Addr_Gen', 10000);

ID Generation Using a Database Sequence

Many databases support an internal mechanism for ID generation called sequences. A database sequence can be used to generate identifiers when the underlying database supports them.

As you saw with table generators, if it is known that a database sequence should be used for generating identifiers, and you are not concerned that it be any particular sequence, specifying the generator type alone should be sufficient:
@Id @GeneratedValue(strategy=GenerationType.SEQUENCE)
private long id;
In this case, no generator is named, so the provider will use a default sequence object of its own choosing. Note that if multiple sequence generators are defined but not named, it is not specified whether they use the same default sequence or different ones. The only difference between using one sequence for multiple entity types and using one for each entity would be the ordering of the sequence numbers and possible contention on the sequence. The safer route would be to define a named sequence generator and refer to it in the @GeneratedValue annotation:
@SequenceGenerator(name="Emp_Gen", sequenceName="Emp_Seq")
@Id @GeneratedValue(generator="Emp_Gen")
private long id;
Unless schema generation is enabled, it would require that the sequence be defined and already exist. The SQL to create such a sequence would be as follows:
CREATE SEQUENCE Emp_Seq
    MINVALUE 1
    START WITH 1
    INCREMENT BY 50

The initial value and allocation size can also be used in sequence generators and would need to be reflected in the SQL to create the sequence. Note that the default allocation size is 50, just as it is with table generators. If schema generation is not being used, and the sequence is being manually created, the INCREMENT BY clause would need to be configured to match the allocationSize element or default allocation size of the corresponding @SequenceGenerator annotation.
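The reason the two settings must agree can be illustrated with a small simulation. The sketch below assumes one particular scheme, in which every value fetched from the sequence is treated as the first identifier of a block of allocationSize identifiers; providers differ in how they interpret the fetched value, so the class and method names here are hypothetical and the details should be checked against the provider documentation.

```java
final class SequenceBlockSketch {
    // Stand-in for a database sequence created with START WITH and INCREMENT BY.
    static final class DbSequence {
        private long nextValue;
        private final long incrementBy;

        DbSequence(long startWith, long incrementBy) {
            this.nextValue = startWith;
            this.incrementBy = incrementBy;
        }

        long nextval() {
            long value = nextValue;
            nextValue += incrementBy;
            return value;
        }
    }

    // True if two consecutive blocks would contain some of the same identifiers.
    static boolean blocksOverlap(long incrementBy, int allocationSize) {
        DbSequence sequence = new DbSequence(1, incrementBy);
        long firstBlockStart = sequence.nextval();
        long firstBlockEnd = firstBlockStart + allocationSize - 1;
        long secondBlockStart = sequence.nextval();
        return secondBlockStart <= firstBlockEnd;
    }

    public static void main(String[] args) {
        System.out.println(blocksOverlap(50, 50)); // false: blocks 1..50 and 51..100
        System.out.println(blocksOverlap(1, 50));  // true: blocks 1..50 and 2..51 collide
    }
}
```

Under this assumed scheme, a sequence that increments by 1 while the generator preallocates 50 identifiers would eventually hand out duplicates, which is why the INCREMENT BY clause has to match the allocation size.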

ID Generation Using Database Identity

Some databases support a primary key identity column, sometimes referred to as an autonumber column. Whenever a row is inserted into the table, the identity column will get a unique identifier assigned to it. It can be used to generate the identifiers for objects, but once again is available only when the underlying database supports it. Identity is often used when database sequences are not supported by the database or because a legacy schema has already defined the table to use identity columns. Identity columns are generally less efficient for object-relational identifier generation because their values cannot be allocated in blocks and because the identifier is not available until after commit time.

To indicate that IDENTITY generation should occur, the @GeneratedValue annotation should specify a generation strategy of IDENTITY. This will indicate to the provider that it must reread the inserted row from the table after an insert has occurred. This will allow it to obtain the newly generated identifier from the database and put it into the in-memory entity that was just persisted:
@Id @GeneratedValue(strategy=GenerationType.IDENTITY)
private long id;

There is no generator annotation for IDENTITY because it must be defined as part of the database schema definition for the primary key column of the entity. Because each entity primary key column defines its own identity characteristic, IDENTITY generation cannot be shared across multiple entity types.

Another difference, hinted at earlier, between using IDENTITY and other ID generation strategies is that the identifier will not be accessible until after the insert has occurred. Although no guarantee is made about the accessibility of the identifier before the transaction has completed, it is at least possible for other types of generation to eagerly allocate the identifier. But when using identity, it is the action of inserting that causes the identifier to be generated. It would be impossible for the identifier to be available before the entity is inserted into the database, and because insertion of entities is most often deferred until commit time, the identifier would not be available until after the transaction has been committed.
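The timing can be pictured with a small in-memory simulation. It is only a sketch: IdentityTable and DeferredContext are hypothetical stand-ins for a table with an identity column and for a persistence context that defers inserts, and the simplified Employee class is not an entity.

```java
import java.util.ArrayList;
import java.util.List;

final class IdentityTimingSketch {
    static final class Employee {
        long id; // stays 0 until the row is inserted
        final String name;

        Employee(String name) {
            this.name = name;
        }
    }

    // Stand-in for a table with an identity column: the insert creates the identifier.
    static final class IdentityTable {
        private long lastId;
        private final List<Employee> rows = new ArrayList<>();

        void insert(Employee employee) {
            employee.id = ++lastId;
            rows.add(employee);
        }
    }

    // Stand-in for a persistence context that defers inserts until flush or commit.
    static final class DeferredContext {
        private final IdentityTable table = new IdentityTable();
        private final List<Employee> pending = new ArrayList<>();

        void persist(Employee employee) {
            pending.add(employee); // nothing is sent to the table yet
        }

        void flush() {
            for (Employee employee : pending) {
                table.insert(employee);
            }
            pending.clear();
        }
    }

    public static void main(String[] args) {
        DeferredContext context = new DeferredContext();
        Employee emp = new Employee("Ann");
        context.persist(emp);
        System.out.println(emp.id); // 0: no identifier before the insert
        context.flush();
        System.out.println(emp.id); // 1: assigned by the insert
    }
}
```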

Tip

If you use IDENTITY, make sure you are aware of what your persistence provider is doing and that it matches your requirements. Some providers eagerly insert (when the persist method is invoked) entities that are configured to use IDENTITY ID generation, instead of waiting until commit time. This will allow the ID to be available immediately, at the expense of premature locking and reduced concurrency. Some providers even have an option that allows you to configure which approach is used.

Relationships

If entities contained only simple persistent state, the business of object-relational mapping would be a trivial one, indeed. Most entities need to be able to reference, or have relationships with, other entities. This is what produces the domain model graphs that are common in business applications.

In the following sections, we explore the different kinds of relationships that can exist and show how to define and map them using Jakarta Persistence mapping metadata.

Relationship Concepts

Before we go off and start mapping relationships, let’s take a quick tour through some of the basic relationship concepts and terminology. Having a firm grasp on these concepts will make it easier to understand the remainder of the relationship mapping sections.

Roles

There is an old adage that says every story has three sides: yours, mine, and the truth. Relationships are kind of the same in that there are three different perspectives. The first is the view from one side of the relationship, the second is from the other side, and the third is from a global perspective that knows about both sides. The “sides” are called roles. In every relationship there are two entities that are related to one another, and each entity is said to play a role in the relationship.

Relationships are everywhere, so examples are not hard to come by. An employee has a relationship to the department that he or she works in. The Employee entity plays the role of working in the department, while the Department entity plays the role of having an employee working in it.

Of course, the role a given entity is playing differs according to the relationship, and an entity might be participating in many different relationships with many different entities. We can conclude, therefore, that any entity might be playing a number of different roles in any given model. If we think of an Employee entity, we realize that it does, in fact, play other roles in other relationships, such as the role of working for a manager in its relationship with another Employee entity, working on a project in its relationship with the Project entity, and so forth. Although there are no metadata requirements to declare the role an entity is playing, roles are nevertheless still helpful as a means of understanding the nature and structure of relationships.

Directionality

In order to have relationships at all, there has to be a way to create, remove, and maintain them. The basic way this is done is by an entity having a relationship attribute that refers to its related entity in a way that identifies it as playing the other role of the relationship. It is often the case that the other entity, in turn, has an attribute that points back to the original entity. When each entity points to the other, the relationship is bidirectional. If only one entity has a pointer to the other, the relationship is said to be unidirectional.

A relationship from an Employee to the Project that they work on would be bidirectional. The Employee should know its Project, and the Project should point to the Employee working on it. A UML model of this relationship is shown in Figure 4-5. The arrows going in both directions indicate the bidirectionality of the relationship.
Figure 4-5

Employee and Project in a bidirectional relationship

An Employee and its Address would likely be modeled as a unidirectional relationship because the Address is not expected to ever need to know its resident. If it did, of course, then it would need to become a bidirectional relationship. Figure 4-6 shows this relationship. Because the relationship is unidirectional, the arrow points from the Employee to the Address.
Figure 4-6

Employee in a unidirectional relationship with Address

As you will see later in the chapter, although they both share the same concept of directionality, the object and data models each see it a little differently because of the paradigm difference. In some cases, unidirectional relationships in the object model can pose a problem in the database model.

We can use the directionality of a relationship to help describe and explain a model, but when it comes to actually discussing it in concrete terms, it makes sense to think of every bidirectional relationship as a pair of unidirectional relationships. Instead of having a single bidirectional relationship of an Employee working on a Project, we would have one unidirectional “project” relationship where the Employee points to the Project they work on and another unidirectional “worker” relationship where the Project points to the Employee that works on it. Each of these relationships has an entity that plays the source, or referring, role and an entity that plays the target, or referred-to, role. The beauty of this is that we can use the same terms no matter which relationship we are talking about and no matter what roles are in the relationship. Figure 4-7 shows how the two relationships have source and target entities, and how from each relationship perspective, the source and target entities are different.
Figure 4-7

Unidirectional relationships between Employee and Project
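In plain Java terms, the pair of unidirectional relationships is simply two reference attributes, one in each class. The following sketch uses simplified, non-entity classes; the attribute names project and worker follow the relationship names used above, the address attribute follows the earlier Address example, and the assign helper is hypothetical.

```java
final class DirectionalitySketch {
    static final class Address {
        final String street; // no reference back to Employee: unidirectional

        Address(String street) {
            this.street = street;
        }
    }

    static final class Project {
        final String name;
        Employee worker; // "worker" relationship: Project is the source, Employee the target

        Project(String name) {
            this.name = name;
        }
    }

    static final class Employee {
        final String name;
        Project project; // "project" relationship: Employee is the source, Project the target
        Address address; // unidirectional: only Employee knows the Address

        Employee(String name) {
            this.name = name;
        }
    }

    // Setting both references is what makes the pair behave as one bidirectional relationship.
    static void assign(Employee employee, Project project) {
        employee.project = project;
        project.worker = employee;
    }

    public static void main(String[] args) {
        Employee emp = new Employee("Ann");
        Project proj = new Project("Payroll");
        assign(emp, proj);
        System.out.println(emp.project.name + " <-> " + proj.worker.name);
    }
}
```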

Cardinality

It isn’t very often that a project has only a single employee working on it. We would like to be able to capture the aspect of how many entities exist on each side of the same relationship instance. This is called the cardinality of the relationship. Each role in a relationship will have its own cardinality, which indicates whether there can be only one instance of the entity or many instances.

In our Employee and Department example, we might first say that one employee works in one department, so the cardinality of both sides would be one. But chances are that more than one employee works in the department, so we would make the relationship have a many cardinality on the Employee or source side, meaning that many Employee instances could each point to the same Department. The target or Department side would keep its cardinality of one. Figure 4-8 shows this many-to-one relationship. The “many” side is marked with an asterisk (*).
Figure 4-8

Unidirectional many-to-one relationship

In our Employee and Project example, we have a bidirectional relationship, or two relationship directions. If an employee can work on multiple projects, and a project can have multiple employees working on it, then we would end up with cardinalities of “many” on the sources and targets of both directions. Figure 4-9 shows the UML diagram of this relationship.
Figure 4-9

Bidirectional many-to-many relationship

As the saying goes, a picture is worth a thousand words, and describing these relationships in text is quite a lot harder than showing a picture. In words, though, this picture indicates the following:
  • Each employee can work on a number of projects.

  • Many employees can work on the same project.

  • Each project can have a number of employees working on it.

  • Many projects can have the same employee working on them.

Implicit in this model is the fact that there can be sharing of Employee and Project instances across multiple relationship instances.
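The sharing of instances can be seen in a small object-model sketch. The classes below are simplified, non-entity stand-ins, and the assign helper and sample names are hypothetical; the point is only that the same Employee and Project objects appear in more than one collection.

```java
import java.util.ArrayList;
import java.util.List;

final class ManyToManySketch {
    static final class Employee {
        final String name;
        final List<Project> projects = new ArrayList<>(); // "many" projects per employee

        Employee(String name) {
            this.name = name;
        }
    }

    static final class Project {
        final String name;
        final List<Employee> employees = new ArrayList<>(); // "many" employees per project

        Project(String name) {
            this.name = name;
        }
    }

    // Record one relationship instance in both directions.
    static void assign(Employee employee, Project project) {
        employee.projects.add(project);
        project.employees.add(employee);
    }

    public static void main(String[] args) {
        Employee ann = new Employee("Ann");
        Employee bob = new Employee("Bob");
        Project payroll = new Project("Payroll");
        Project billing = new Project("Billing");
        assign(ann, payroll);
        assign(ann, billing);
        assign(bob, payroll);
        // The same Project instance is shared by both employees.
        System.out.println(ann.projects.get(0) == bob.projects.get(0));
    }
}
```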

Ordinality

A role can be further specified by determining whether or not it might be present at all. This is called the ordinality, and it serves to show whether the target entity needs to be specified when the source entity is created. Because the ordinality is really just a Boolean value, it is also referred to as the optionality of the relationship.

In cardinality terms, ordinality would be indicated by the cardinality being a range instead of a simple value, and the range would begin with 0 or 1 depending on the ordinality. It is simpler, though, to merely state that the relationship is either optional or mandatory. If optional, the target might not be present; if mandatory, a source entity without a reference to its associated target entity is in an invalid state.

Mappings Overview

Now that you know enough theory and have the conceptual background to be able to discuss relationships, we can go on to explaining and using relationship mappings.

Each one of the mappings is named for the cardinality of the source and target roles. As shown in the previous sections, a bidirectional relationship can be viewed as a pair of unidirectional mappings. Each of these mappings is really a unidirectional relationship mapping, and if we take the cardinalities of the source and target of the relationship and combine them together in that order, permuting them with the two possible values of “one” and “many,” we end up with the following names given to the mappings:
  • Many-to-one

  • One-to-one

  • One-to-many

  • Many-to-many

These mapping names are also the names of the annotations that are used to indicate the relationship types on the attributes that are being mapped. They are the basis for the logical relationship annotations, and they contribute to the object modeling aspects of the entity. Like basic mappings, relationship mappings can be applied to either fields or properties of the entity.

Single-Valued Associations

An association from an entity instance to another entity instance (where the cardinality of the target is “one”) is called a single-valued association. The many-to-one and one-to-one relationship mappings fall into this category because the source entity refers to at most one target entity. We discuss these relationships and some of their variants first.

Many-to-One Mappings

In our cardinality discussion of the Employee and Department relationship (shown in Figure 4-8), we first thought of an employee working in a department, so we just assumed that it was a one-to-one relationship. However, when we realized that more than one employee works in the same department, we changed it to a many-to-one relationship mapping. It turns out that many-to-one is the most common mapping and is the one that is normally used when creating an association to an entity.

Figure 4-10 shows a many-to-one relationship between Employee and Department. Employee is the “many” side and the source of the relationship, and Department is the “one” side and the target. Once again, because the arrow points in only one direction, from Employee to Department, the relationship is unidirectional. Note that in UML, the source class has an implicit attribute of the target class type if it can be navigated to. For example, Employee has an attribute called department that will contain a reference to a single Department instance. The actual attribute is not shown in the Employee class but is implied by the presence of the relationship arrow.
Figure 4-10

Many-to-one relationship from Employee to Department

A many-to-one mapping is defined by annotating the attribute in the source entity (the attribute that refers to the target entity) with the @ManyToOne annotation. Listing 4-16 shows how the @ManyToOne annotation is used to map this relationship. The department field in Employee is the source attribute that is annotated.
@Entity
public class Employee {
    // ...
    @ManyToOne
    private Department department;
    // ...
}
Listing 4-16

Many-to-One Relationship from Employee to Department

We have included only the bits of the class that are relevant to our discussion, but you can see from the previous example that the code was rather anticlimactic. A single annotation was all that was required to map the relationship, and it turned out to be quite dull, really. Of course, when it comes to configuration, dull is beautiful.

The same kinds of attribute flexibility and modifier requirements that were described for basic mappings also apply to relationship mappings. The annotation can be present on either the field or property, depending on the strategy used for the entity.

Using Join Columns

In the database, a relationship mapping means that one table has a reference to another table. The database term for a column that refers to a key (usually the primary key) in another table is a foreign key column. In Jakarta Persistence, they’re called join columns, and the @JoinColumn annotation is the primary annotation used to configure these types of columns.

Note

Later in the chapter, we talk about join columns that are present in other tables called join tables. In Chapter 10, we cover a more advanced case of using a join table for single-valued associations.

Consider the EMPLOYEE and DEPARTMENT tables shown in Figure 4-11 that correspond to the Employee and Department entities. The EMPLOYEE table has a foreign key column named DEPT_ID that references the DEPARTMENT table. From the perspective of the entity relationship, DEPT_ID is the join column that associates the Employee and Department entities.
Figure 4-11

EMPLOYEE and DEPARTMENT tables

In almost every relationship, independent of source and target sides, one of the two sides will have the join column in its table. That side is called the owning side or the owner of the relationship. The side that does not have the join column is called the nonowning or inverse side.

Ownership is important for mapping because the physical annotations that define the mappings to the columns in the database (e.g., @JoinColumn) are always defined on the owning side of the relationship. If they are not there, the values are defaulted from the perspective of the attribute on the owning side.

Note

Although we have described the owning side as being determined by the data schema, the object model must indicate the owning side through the use of the relationship mapping annotations. The absence of the mappedBy element in the mapping annotation implies ownership of the relationship, while the presence of the mappedBy element means the entity is on the inverse side of the relationship. The mappedBy element is described in subsequent sections.

Many-to-one mappings are always on the owning side of a relationship, so if there is a @JoinColumn to be found in the relationship that has a many-to-one side, that is where it will be located. To specify the name of the join column, the name element is used. For example, the @JoinColumn(name="DEPT_ID") annotation means that the DEPT_ID column in the source entity table is the foreign key to the target entity table, whatever the target entity of the relationship happens to be.

If no @JoinColumn annotation accompanies the many-to-one mapping, a default column name will be assumed. The name that is used as the default is formed from a combination of both the source and target entities. It is the name of the relationship attribute in the source entity, which is department in our example, plus an underscore character (_), plus the name of the primary key column of the target entity. So if the Department entity were mapped to a table that had a primary key column named ID, the join column in the EMPLOYEE table would be assumed to be named DEPARTMENT_ID. If this is not actually the name of the column, the @JoinColumn annotation must be defined to override the default.
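The defaulting rule can be sketched in a few lines of plain Java. The JoinColumnDefaults class and its method are hypothetical names used only for illustration; they are not part of Jakarta Persistence. The sketch also assumes uppercase column names, as in the figures of this chapter, whereas a real provider may apply its own case conventions.

```java
import java.util.Locale;

// Hypothetical helper, not part of Jakarta Persistence: it only mirrors the
// default join column naming rule described in the text.
class JoinColumnDefaults {

    // <relationship attribute name> + "_" + <target primary key column name>
    static String defaultJoinColumn(String attributeName, String targetPkColumn) {
        // Uppercasing is an assumption made to match the column names in the figures.
        return (attributeName + "_" + targetPkColumn).toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Employee.department refers to a Department table whose primary key column is ID.
        System.out.println(defaultJoinColumn("department", "ID")); // DEPARTMENT_ID
    }
}
```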

Going back to Figure 4-11, the foreign key column is named DEPT_ID instead of the defaulted DEPARTMENT_ID column name. Listing 4-17 shows the @JoinColumn annotation being used to override the join column name to be DEPT_ID.
@Entity
public class Employee {
    @Id private long id;
    @ManyToOne
    @JoinColumn(name="DEPT_ID")
    private Department department;
    // ...
}
Listing 4-17

Many-to-One Relationship Overriding the Join Column

Annotations allow us to specify @JoinColumn on either the same line as @ManyToOne or on a separate line, above or below it. By convention, the logical mapping should appear first, followed by the physical mapping. This makes the object model clear because the physical part is less important to the object model.

One-to-One Mappings

If only one employee could work in a department, we would be back to the one-to-one association again. A more realistic example of a one-to-one association, however, would be an employee who has a parking space. Assuming that every employee got assigned his or her own parking space, we would create a one-to-one relationship from Employee to ParkingSpace. Figure 4-12 shows this relationship.
Figure 4-12

One-to-one relationship from Employee to ParkingSpace

We define the mapping in a similar way to the way we define a many-to-one mapping, except that we use the @OneToOne annotation instead of a @ManyToOne annotation on the parkingSpace attribute. Just as with a many-to-one mapping, the one-to-one mapping has a join column in the database and needs to override the name of the column in a @JoinColumn annotation when the default name does not apply. The default name is composed the same way as for many-to-one mappings using the name of the source attribute and the target primary key column name.

Figure 4-13 shows the tables mapped by the Employee and ParkingSpace entities. The foreign key column in the EMPLOYEE table is named PSPACE_ID and refers to the PARKING_SPACE table.
Figure 4-13

EMPLOYEE and PARKING_SPACE tables

As it turns out, one-to-one mappings are almost the same as many-to-one mappings except that only one instance of the source entity can refer to the same target entity instance. In other words, the target entity instance is not shared among the source entity instances. In the database, this equates to having a uniqueness constraint on the source foreign key column (i.e., the foreign key column in the source entity table). If there were more than one foreign key value that was the same, it would contravene the rule that no more than one source entity instance can refer to the same target entity instance.
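To make the uniqueness rule concrete, the following plain Java sketch treats the values of a foreign key column as a list and checks whether any value repeats. The class name, method name, and sample values are all made up for illustration. In a real schema, this rule would normally be enforced by a database constraint, not by application code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: models the values of a foreign key column as a list and
// checks the uniqueness rule that separates one-to-one from many-to-one.
class ForeignKeyUniqueness {

    // Returns true when no two source rows hold the same foreign key value.
    static boolean isUnique(List<Long> foreignKeyValues) {
        Set<Long> seen = new HashSet<>();
        for (Long value : foreignKeyValues) {
            if (!seen.add(value)) {
                return false; // a second source row refers to the same target row
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // DEPT_ID values: several employees may share department 10 (many-to-one).
        System.out.println(isUnique(List.of(10L, 10L, 20L))); // false
        // PSPACE_ID values: each parking space is used by at most one employee.
        System.out.println(isUnique(List.of(1L, 2L, 3L)));    // true
    }
}
```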

Listing 4-18 shows the mapping for this relationship. The @JoinColumn annotation has been used to override the default join column name of PARKINGSPACE_ID to be PSPACE_ID.
@Entity
public class Employee {
    @Id private long id;
    private String name;
    @OneToOne
    @JoinColumn(name="PSPACE_ID")
    private ParkingSpace parkingSpace;
    // ...
}
Listing 4-18

One-to-One Relationship from Employee to ParkingSpace

Bidirectional One-to-One Mappings

The target entity of the one-to-one often has a relationship back to the source entity; for example, ParkingSpace has a reference back to the Employee that uses it. When this is the case, it is called a bidirectional one-to-one relationship. As you saw previously, we actually have two separate one-to-one mappings, one in each direction, but the combination of the two is called a bidirectional one-to-one relationship. To make our existing one-to-one employee and parking space example bidirectional, we need only change the ParkingSpace to point back to the Employee. Figure 4-14 shows the bidirectional relationship.
Figure 4-14

One-to-one relationship between Employee and ParkingSpace

You already learned that the entity table that contains the join column determines the entity that is the owner of the relationship. In a bidirectional one-to-one relationship, both the mappings are one-to-one mappings, and either side can be the owner, so the join column might end up being on one side or the other. This would normally be a data modeling decision, not a Java programming decision, and it would likely be decided based on the most frequent direction of traversal.

Consider the ParkingSpace entity class shown in Listing 4-19. This example assumes the table mapping shown in Figure 4-13, and it assumes that Employee is the owning side of the relationship. We now have to add a reference from ParkingSpace back to Employee. This is achieved by adding the @OneToOne relationship annotation on the employee field. As part of the annotation, we must add a mappedBy element to indicate that the owning side is the Employee, not the ParkingSpace. Because ParkingSpace is the inverse side of the relationship, it does not have to supply the join column information.
@Entity
public class ParkingSpace {
    @Id private long id;
    private int lot;
    private String location;
    @OneToOne(mappedBy="parkingSpace")
    private Employee employee;
    // ...
}
Listing 4-19

Inverse Side of a Bidirectional One-to-One Relationship

The mappedBy element in the one-to-one mapping of the employee attribute of ParkingSpace is needed to refer to the parkingSpace attribute in the Employee class. The value of mappedBy is the name of the attribute in the owning entity that points back to the inverse entity.

The two rules, then, for bidirectional one-to-one associations are the following:
  • The @JoinColumn annotation goes on the mapping of the entity that is mapped to the table containing the join column, or the owner of the relationship. This might be on either side of the association.

  • The mappedBy element should be specified in the @OneToOne annotation in the entity that does not define a join column, or the inverse side of the relationship.

It would not be legal to have a bidirectional association that had mappedBy on both sides, just as it would be incorrect to not have it on either side. The difference is that if it were absent on both sides of the relationship, the provider would treat each side as an independent unidirectional relationship. This would be fine except that it would assume that each side was the owner and that each had a join column.
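The ownership rule can be checked mechanically. The sketch below is only an approximation of how a tool might read the metadata: it declares a local stand-in for the @OneToOne annotation so that it compiles without a persistence provider on the classpath, and the OwnershipCheck helper is a hypothetical name. With the real jakarta.persistence.OneToOne annotation, the same reflection calls would apply.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical helper: a mapping owns the relationship when mappedBy is absent.
class OwnershipCheck {

    static boolean isOwningSide(Class<?> entity, String attribute) {
        try {
            OneToOne mapping = entity.getDeclaredField(attribute).getAnnotation(OneToOne.class);
            return mapping != null && mapping.mappedBy().isEmpty();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No attribute named " + attribute, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isOwningSide(Employee.class, "parkingSpace")); // true
        System.out.println(isOwningSide(ParkingSpace.class, "employee")); // false
    }
}

// Local stand-in for jakarta.persistence.OneToOne; only mappedBy is modeled.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface OneToOne {
    String mappedBy() default "";
}

class Employee {
    @OneToOne // no mappedBy: owning side, its table holds the join column
    ParkingSpace parkingSpace;
}

class ParkingSpace {
    @OneToOne(mappedBy = "parkingSpace") // inverse side, names the owning attribute
    Employee employee;
}
```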

Bidirectional many-to-one relationships are explained later as part of the discussion of multivalued bidirectional associations.

Collection-Valued Associations

When the source entity references one or more target entity instances, a many-valued association or associated collection is used. Both the one-to-many and many-to-many mappings fit the criteria of having many target entities, and although the one-to-many association is the most frequently used, many-to-many mappings are useful as well when there is sharing in both directions.

One-to-Many Mappings

When an entity is associated with a Collection of other entities, it is most often in the form of a one-to-many mapping. For example, a department would normally have a number of employees. Figure 4-15 shows the Employee and Department relationship that we showed earlier in the section “Many-to-One Mappings,” only this time the relationship is bidirectional in nature.
Figure 4-15

Bidirectional Employee and Department relationship

As mentioned earlier, when a relationship is bidirectional, there are actually two mappings, one for each direction. A bidirectional one-to-many relationship always implies a many-to-one mapping back to the source, so in our Employee and Department example, there is a one-to-many mapping from Department to Employee and a many-to-one mapping from Employee back to Department. We could just as easily say that the relationship is bidirectional many-to-one if we were looking at it from the Employee perspective. They are equivalent because bidirectional many-to-one relationships imply a one-to-many mapping back from the target to source, and vice versa.

When a source entity has an arbitrary number of target entities stored in its collection, there is no scalable way to store those references in the database table that it maps to. How would it store an arbitrary number of foreign keys in a single row? Instead, it must let the tables of the entities in the collection have foreign keys back to the source entity table. This is why the one-to-many association is almost always bidirectional and the “one” side is not normally the owning side.

Furthermore, if the target entity tables have foreign keys that point back to the source entity table, the target entities should have many-to-one associations back to the source entity object. Having a foreign key in a table for which there is no association in the corresponding entity object model is not being true to the data model. It is nonetheless still possible to configure, though.

Let’s look at a concrete example of a one-to-many mapping based on the Employee and Department example shown in Figure 4-15. The tables for this relationship are exactly the same as those shown in Figure 4-11, which showed a many-to-one relationship. The only difference between the many-to-one example and this one is that we are now implementing the inverse side of the relationship. Because Employee has the join column and is the owner of the relationship, the Employee class is unchanged from Listing 4-17, which maps the DEPT_ID join column.

On the Department side of the relationship, we need to map the employees collection of Employee entities as a one-to-many association using the @OneToMany annotation. Listing 4-20 shows the Department class that uses this annotation. Note that because this is the inverse side of the relationship, we need to include the mappedBy element, just as we did in the bidirectional one-to-one relationship example.
@Entity
public class Department {
    @Id private long id;
    private String name;
    @OneToMany(mappedBy="department")
    private Collection<Employee> employees;
    // ...
}
Listing 4-20

One-to-Many Relationship

There are a couple of noteworthy points to mention about this class. The first is that a generic type-parameterized Collection is being used to store the Employee entities. This provides the strict typing that guarantees that only objects of type Employee will exist in the Collection. This is quite useful because it not only provides compile-time checking of our code but also saves us from having to perform cast operations when we retrieve the Employee instances from the collection.

Jakarta Persistence assumes the availability of generics; however, it is still perfectly acceptable to use a Collection that is not type-parameterized. We might just as well have defined the Department class without using generics but defining only a simple Collection type, as we would have done in releases of standard Java previous to Java SE 5 (except for JDK 1.0 or 1.1, when java.util.Collection was not even standardized!). If we did, we would need to specify the type of entity that will be stored in the Collection that is needed by the persistence provider. The code is shown in Listing 4-21 and looks almost identical, except for the targetEntity element that indicates the entity type.
@Entity
public class Department {
    @Id private long id;
    private String name;
    @OneToMany(targetEntity=Employee.class, mappedBy="department")
    private Collection employees;
    // ...
}
Listing 4-21

Using targetEntity
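The practical difference between the two declarations shows up in the code that reads the collection. The following plain Java sketch uses a simplified, unmapped Employee class and hypothetical method names to contrast the two styles. It involves no persistence at all and is only meant to show where the cast appears.

```java
import java.util.ArrayList;
import java.util.Collection;

// Illustrative only: contrasts reading from a type-parameterized collection
// with reading from a raw one. Employee here is a plain class, not an entity.
class CollectionTypingSketch {

    static final class Employee {
        final String name;
        Employee(String name) { this.name = name; }
    }

    // The element type is known to the compiler, so no cast is needed.
    static String firstNameTyped(Collection<Employee> employees) {
        return employees.iterator().next().name;
    }

    // A raw collection yields Object, so the caller must cast each element.
    @SuppressWarnings("rawtypes")
    static String firstNameRaw(Collection employees) {
        return ((Employee) employees.iterator().next()).name;
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    public static void main(String[] args) {
        Collection<Employee> typed = new ArrayList<>();
        typed.add(new Employee("Ana"));
        Collection raw = new ArrayList();
        raw.add(new Employee("Ben"));
        System.out.println(firstNameTyped(typed)); // Ana
        System.out.println(firstNameRaw(raw));     // Ben
    }
}
```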

There are two important points to remember when defining bidirectional one-to-many (or many-to-one) relationships:
  • The many-to-one side should be the owning side, so the join column should be defined on that side.

  • The one-to-many mapping should be the inverse side, so the mappedBy element should be used.

Failing to specify the mappedBy element in the @OneToMany annotation will cause the provider to treat it as a unidirectional one-to-many relationship that is defined to use a join table (described later). This is an easy mistake to make and should be the first thing you look for if you see a missing table error with a name that has two entity names concatenated together.

Many-to-Many Mappings

When one or more entities are associated with a Collection of other entities, and the entities have overlapping associations with the same target entities, we must model it as a many-to-many relationship. Each of the entities on each side of the relationship will have a collection-valued association that contains entities of the target type. Figure 4-16 shows a many-to-many relationship between Employee and Project. Each employee can work on multiple projects, and each project can be worked on by multiple employees.
Figure 4-16

Bidirectional many-to-many relationship

A many-to-many mapping is expressed on both the source and target entities as a @ManyToMany annotation on the collection attributes. For example, in Listing 4-22 the Employee has a projects attribute that has been annotated with @ManyToMany. Likewise, the Project entity has an employees attribute that has also been annotated with @ManyToMany.
@Entity
public class Employee {
    @Id private long id;
    private String name;
    @ManyToMany
    private Collection<Project> projects;
    // ...
}
@Entity
public class Project {
    @Id private long id;
    private String name;
    @ManyToMany(mappedBy="projects")
    private Collection<Employee> employees;
    // ...
}
Listing 4-22

Many-to-Many Relationship Between Employee and Project

There are some important differences between this many-to-many relationship and the one-to-many relationship discussed earlier. The first is a mathematical inevitability: when a many-to-many relationship is bidirectional, both sides of the relationship are many-to-many mappings.

The second difference is that there are no join columns on either side of the relationship. You will see in the next section that the only way to implement a many-to-many relationship is with a separate join table. The consequence of not having any join columns in either of the entity tables is that there is no way to determine which side is the owner of the relationship. Because every bidirectional relationship has to have both an owning side and an inverse side, we must pick one of the two entities to be the owner. In this example, we picked Employee to be owner of the relationship, but we could have just as easily picked Project instead. As in every other bidirectional relationship, the inverse side must use the mappedBy element to identify the owning attribute.

Note that no matter which side is designated as the owner, the other side should include the mappedBy element; otherwise, the provider will think that both sides are the owner and that the mappings are separate unidirectional relationships.

Using Join Tables

Because the multiplicity of both sides of a many-to-many relationship is plural, neither of the two entity tables can store an unlimited set of foreign key values in a single entity row. We must use a third table to associate the two entity types. This association table is called a join table, and each many-to-many relationship must have one. They might be used for the other relationship types as well, but are not required and are therefore less common.

A join table consists simply of two foreign key or join columns to refer to each of the two entity types in the relationship. A collection of entities is then mapped as multiple rows in the table, each of which associates one entity with another. The set of rows that contain a given entity identifier in the source foreign key column represents the collection of entities related to that given entity.

Figure 4-17 shows the EMPLOYEE and PROJECT tables for the Employee and Project entities and the EMP_PROJ join table that associates them. The EMP_PROJ table contains only foreign key columns that make up its compound primary key. The EMP_ID column refers to the EMPLOYEE primary key, while the PROJ_ID column refers to the PROJECT primary key.
Figure 4-17

Join table for a many-to-many relationship
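The way a set of join table rows represents a collection can be imitated with ordinary Java objects. In the sketch below, each Row stands for one row of EMP_PROJ. The class names, method name, and sample identifiers are invented for illustration, and a persistence provider would of course issue SQL instead of looping over a list.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: an in-memory picture of a join table such as EMP_PROJ.
class JoinTableSketch {

    // One row of the join table: two foreign key values and nothing else.
    static final class Row {
        final long empId;
        final long projId;
        Row(long empId, long projId) {
            this.empId = empId;
            this.projId = projId;
        }
    }

    // The rows whose EMP_ID matches make up that employee's project collection.
    static List<Long> projectIdsFor(long empId, List<Row> empProj) {
        List<Long> result = new ArrayList<>();
        for (Row row : empProj) {
            if (row.empId == empId) {
                result.add(row.projId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Employee 1 works on projects 10 and 20; employee 2 shares project 10.
        List<Row> empProj = List.of(new Row(1, 10), new Row(1, 20), new Row(2, 10));
        System.out.println(projectIdsFor(1, empProj)); // [10, 20]
    }
}
```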

In order to map the tables described in Figure 4-17, we need to add some metadata to the Employee class that we have designated as the owner of the relationship. Listing 4-23 shows the many-to-many relationship with the accompanying join table annotations.
@Entity
public class Employee {
    @Id private long id;
    private String name;
    @ManyToMany
    @JoinTable(name="EMP_PROJ",
          joinColumns=@JoinColumn(name="EMP_ID"),
          inverseJoinColumns=@JoinColumn(name="PROJ_ID"))
    private Collection<Project> projects;
    // ...
}
Listing 4-23

Using a Join Table

The @JoinTable annotation is used to configure the join table for the relationship. The two join columns in the join table are distinguished by means of the owning and inverse sides. The join column to the owning side is described in the joinColumns element, while the join column to the inverse side is specified by the inverseJoinColumns element. You can see from Listing 4-23 that the values of these elements are actually @JoinColumn annotations embedded within the @JoinTable annotation. This provides the ability to declare all of the information about the join columns within the table that defines them. The names are plural for times when there might be multiple columns for each foreign key (either the owning entity or the inverse entity has a multipart primary key). This more complicated case is discussed in Chapter 10.

In our example, we fully specified the names of the join table and its columns because this is the most common case. But if we were generating the database schema from the entities, we would not actually need to specify this information. We could have relied on the default values that would be assumed and used when the persistence provider generates the table for us. When no @JoinTable annotation is present on the owning side, then a default join table named <Owner>_<Inverse> is assumed, where <Owner> is the name of the owning entity and <Inverse> is the name of the inverse or nonowning entity. Of course, the choice of owner is left to the developer, so these defaults will apply according to the way the relationship is mapped and whichever entity is designated as the owning side.

The join columns will be defaulted according to the join column defaulting rules that were previously described in the section “Using Join Columns.” The default name of the join column that points to the owning entity is the name of the attribute on the inverse entity that points to the owning entity, appended by an underscore and the name of the primary key column of the owning entity table. So in our example, the Employee is the owning entity, and the Project has an employees attribute that contains the collection of Employee instances. The Employee entity maps to the EMPLOYEE table and has a primary key column of ID, so the defaulted name of the join column to the owning entity would be EMPLOYEES_ID. The inverse join column would likewise default to PROJECTS_ID.

It is fairly clear that the defaulted names of a join table and the join columns within it are not likely to match up with an existing table. This is why we mentioned that the defaults are really useful only if the database schema being mapped to was generated by the provider.
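As a rough illustration of these defaults for the bidirectional Employee and Project example, the following plain Java sketch derives the three names. JoinTableDefaults and its methods are hypothetical helpers, not provider API, and the uppercase result is an assumption that follows the naming used in this chapter.

```java
import java.util.Locale;

// Hypothetical helper, not part of Jakarta Persistence: it mirrors the
// default join table and join column names described in the text.
class JoinTableDefaults {

    // <Owner>_<Inverse>, built from the owning and inverse entity names.
    static String defaultJoinTable(String ownerEntity, String inverseEntity) {
        return (ownerEntity + "_" + inverseEntity).toUpperCase(Locale.ROOT);
    }

    // <attribute pointing at the referenced entity>_<its primary key column>
    static String defaultJoinColumn(String attributeName, String pkColumn) {
        return (attributeName + "_" + pkColumn).toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Employee is the owner; Project is the inverse side.
        System.out.println(defaultJoinTable("Employee", "Project")); // EMPLOYEE_PROJECT
        // Project.employees points at the owner, whose primary key column is ID.
        System.out.println(defaultJoinColumn("employees", "ID"));    // EMPLOYEES_ID
        // Employee.projects points at the inverse entity, assumed to use ID as well.
        System.out.println(defaultJoinColumn("projects", "ID"));     // PROJECTS_ID
    }
}
```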

Unidirectional Collection Mappings

When an entity has a one-to-many mapping to a target entity, but the @OneToMany annotation does not include the mappedBy element, it is assumed to be in a unidirectional relationship with the target entity. This means that the target entity does not have a many-to-one mapping back to the source entity. Figure 4-18 shows a unidirectional one-to-many association between Employee and Phone.
Figure 4-18

Unidirectional one-to-many relationship

Consider the data model in Figure 4-19. There is no join column to store the association back from Phone to Employee. Therefore, we have used a join table to associate the Phone entity with the Employee entity.
Figure 4-19

Join table for a unidirectional one-to-many relationship

Similarly, when one side of a many-to-many relationship does not have a mapping to the other, it is a unidirectional relationship. The join table must still be used; the only difference is that only one of the two entity types actually uses the table to load its related entities or updates it to store additional entity associations.

In both of these unidirectional collection-valued cases, the source code is similar to the earlier examples, but there is no attribute in the target entity to reference the source entity, and the mappedBy element will not be present in the @OneToMany annotation on the source entity. The join table must now be specified as part of the mapping. Listing 4-24 shows Employee with a one-to-many relationship to Phone using a join table.
@Entity
public class Employee {
    @Id private long id;
    private String name;
    @OneToMany
    @JoinTable(name="EMP_PHONE",
          joinColumns=@JoinColumn(name="EMP_ID"),
          inverseJoinColumns=@JoinColumn(name="PHONE_ID"))
    private Collection<Phone> phones;
    // ...
}
Listing 4-24

Unidirectional One-to-Many Relationship

Note that when generating the schema, default naming for the join columns is slightly different in the unidirectional case because there is no inverse attribute. The name of the join table would default to EMPLOYEE_PHONE and would have a join column named EMPLOYEE_ID after the name of the Employee entity and its primary key column. The inverse join column would be named PHONES_ID, which is the concatenation of the phones attribute in the Employee entity and the ID primary key column of the PHONE table.

Lazy Relationships

Previous sections showed how to configure an attribute to be loaded when it got accessed and not necessarily before. You learned that lazy loading at the attribute level is not normally very beneficial.

At the relationship level, however, lazy loading can be a big boon to enhancing performance. It can reduce the amount of SQL that gets executed, and speed up queries and object loading considerably.

The fetch mode can be specified on any of the four relationship mapping types. When not specified on a single-valued relationship, the related object is guaranteed to be loaded eagerly. Collection-valued relationships default to be lazily loaded, but because lazy loading is only a hint to the provider, they can be loaded eagerly if the provider decides to do so.

In bidirectional relationship cases, the fetch mode might be lazy on one side but eager on the other. This kind of configuration is actually quite common because relationships are often accessed in different ways depending on the direction from which navigation occurs.

An example of overriding the default fetch mode is if we don’t want to load the ParkingSpace for an Employee every time we load the Employee. Listing 4-25 shows the parkingSpace attribute configured to use lazy loading.
@Entity
public class Employee {
    @Id private long id;
    @OneToOne(fetch=FetchType.LAZY)
    private ParkingSpace parkingSpace;
    // ...
}
Listing 4-25

Changing the Fetch Mode on a Relationship

Tip

A relationship that is specified or defaulted to be lazily loaded might or might not cause the related object to be loaded when the getter method is used to access the object. The object might be a proxy, so it might take actually invoking a method on it to cause it to be faulted in.
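The idea of faulting in an object on first use can be pictured with a small holder class. This is a deliberately simplified sketch: the Lazy class is our own invention, and it does not reproduce the proxy or weaving techniques that providers actually use. It only shows that the load is deferred until the value is needed and then happens once.

```java
import java.util.function.Supplier;

// Illustrative only: a minimal stand-in for a lazily loaded reference.
class LazyReferenceSketch {

    static final class Lazy<T> {
        private final Supplier<T> loader;
        private T value;
        private boolean loaded;

        Lazy(Supplier<T> loader) {
            this.loader = loader;
        }

        // The loader runs on the first call only; later calls reuse the value.
        T get() {
            if (!loaded) {
                value = loader.get();
                loaded = true;
            }
            return value;
        }

        boolean isLoaded() {
            return loaded;
        }
    }

    public static void main(String[] args) {
        int[] loads = {0}; // counts how often the simulated query runs
        Lazy<String> parkingSpace = new Lazy<>(() -> {
            loads[0]++;
            return "Lot 3, space 42"; // made-up data standing in for a database row
        });
        System.out.println(loads[0]); // 0: holding the reference loads nothing
        parkingSpace.get();
        parkingSpace.get();
        System.out.println(loads[0]); // 1: loaded on first access, then reused
    }
}
```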

Embedded Objects

An embedded object is one that is dependent on an entity for its identity. It has no identity of its own, but is merely part of the entity state that has been carved off and stored in a separate Java object hanging off of the entity. In Java, embedded objects appear similar to relationships in that they are referenced by an entity and appear in the Java sense to be the target of an association. In the database, however, the state of the embedded object is stored with the rest of the entity state in the database row, with no distinction between the state in the Java entity and that in its embedded object.

Tip

Although embedded objects are referenced by the entities that own them, they are not said to be in relationships with the entities. The term relationship can only be applied when both sides are entities.

If the database row contains all the data for both the entity and its embedded object, why have such an object anyway? Why not just define the fields of the entity to reference all its persistence state instead of splitting it up into one or more subobjects that are second-class persistent objects dependent on the entity for their existence?

This brings us back to the object-relational impedance mismatch we talked about in Chapter 1. Because the database record contains more than one logical type, it makes sense to make that separation explicit in the object model of the application even though the physical representation is different. You could almost say that the embedded object is a more natural representation of the domain concept than a simple collection of attributes on the entity. Furthermore, once you have identified a grouping of entity state that makes up an embedded object, you can share the same embedded object type with other entities that also have the same internal representation.1

An example of such reuse is address information. Figure 4-20 shows an EMPLOYEE table that contains a mixture of basic employee information as well as columns that correspond to the home address of the employee.
Figure 4-20

EMPLOYEE table with embedded address information

The STREET, CITY, STATE, and ZIP_CODE columns combine logically to form the address. In the object model, this is an excellent candidate to be abstracted into a separate Address embedded type instead of listing each attribute on the entity class. The entity class would then simply have an address attribute pointing to an embedded object of type Address. Figure 4-21 shows how Employee and Address relate to each other. The UML composition association is used to denote that the Employee wholly owns the Address and that an instance of Address cannot be shared by any object other than the Employee instance that owns it.
Figure 4-21

Employee and embedded Address

With this representation, not only is the address information neatly encapsulated within an object but if another entity such as Company also has address information, it can also have an attribute that points to its own embedded Address object. We describe this scenario in the next section.

An embedded type is marked as such by adding the @Embeddable annotation to the class definition. This annotation serves to distinguish the class from other regular Java types. Once a class has been designated as embeddable, then its fields and properties will be persistable as part of an entity. We might also want to define the access type of the embeddable object so it is accessed the same way regardless of which entity it is embedded in. Listing 4-26 shows the definition of the Address embedded type.
@Embeddable @Access(AccessType.FIELD)
public class Address {
    private String street;
    private String city;
    private String state;
    @Column(name="ZIP_CODE")
    private String zip;
    // ...
}
Listing 4-26

Embeddable Address Type

To use this class in an entity, the entity needs to have only an attribute of the embeddable type. The attribute is optionally annotated with the @Embedded annotation to indicate that it is an embedded mapping. Listing 4-27 shows the Employee class using an embedded Address object.
@Entity
public class Employee {
    @Id private long id;
    private String name;
    private long salary;
    @Embedded private Address address;
    // ...
}
Listing 4-27

Using an Embedded Object

When the provider persists an instance of Employee, it will access the attributes of the Address object just as if they were present on the entity instance itself. Column mappings on the Address type really pertain to columns on the EMPLOYEE table, even though they are listed in a different type.
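One way to picture this is to flatten an entity and its embedded object into a single row by hand. The sketch below uses plain, unannotated classes and a hypothetical toRow method. The ID, NAME, and SALARY column names are assumed defaults, while STREET, CITY, STATE, and ZIP_CODE follow the mappings of the Address type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: shows entity state and embedded state landing in one row.
class EmbeddedRowSketch {

    static final class Address {
        final String street, city, state, zip;
        Address(String street, String city, String state, String zip) {
            this.street = street;
            this.city = city;
            this.state = state;
            this.zip = zip;
        }
    }

    static final class Employee {
        final long id;
        final String name;
        final long salary;
        final Address address;
        Employee(long id, String name, long salary, Address address) {
            this.id = id;
            this.name = name;
            this.salary = salary;
            this.address = address;
        }
    }

    // Builds a single EMPLOYEE row; the Address columns sit beside the others.
    static Map<String, Object> toRow(Employee e) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", e.id);
        row.put("NAME", e.name);
        row.put("SALARY", e.salary);
        row.put("STREET", e.address.street);
        row.put("CITY", e.address.city);
        row.put("STATE", e.address.state);
        row.put("ZIP_CODE", e.address.zip);
        return row;
    }

    public static void main(String[] args) {
        // Sample values are made up for the example.
        Employee e = new Employee(1, "Ana", 50000, new Address("1 Main St", "Springfield", "NY", "12345"));
        System.out.println(toRow(e).keySet()); // [ID, NAME, SALARY, STREET, CITY, STATE, ZIP_CODE]
    }
}
```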

The decision to use embedded objects or entities depends on whether you think you will ever need to create relationships to them or from them. Embedded objects are not meant to be entities, and as soon as you start to treat them as entities, you should probably make them first-class entities instead of embedded objects if the data model permits it.

Tip

It is not portable to define embedded objects as part of inheritance hierarchies. Once they begin to extend one another, the complexity of embedding them increases, and the value-to-cost ratio decreases.

Before we got to our example, we mentioned that an Address class could be reused in both Employee and Company entities. Ideally we would like the representation shown in Figure 4-22. Even though both the Employee and Company classes contain an Address, this is not a problem because each instance of Address will be used by only a single Employee or Company instance.
Figure 4-22

Address shared by two entities

Given that the column mappings of the Address embedded type apply to the columns of the containing entity, you might be wondering how sharing could be possible if the two entity tables have different column names for the same fields. Figure 4-23 demonstrates this problem. The COMPANY table matches the default and mapped attributes of the Address type defined earlier, but the EMPLOYEE table in this example has been changed to match the address requirements of a person living in Canada. We need a way for an entity to map the embedded object according to its own entity table needs, and we have one in the @AttributeOverride annotation.
Figure 4-23

EMPLOYEE and COMPANY tables

We use an @AttributeOverride annotation for each attribute of the embedded object that we want to override in the entity. We annotate the embedded field or property in the entity and specify in the name element the field or property in the embedded object that we are overriding. The column element allows us to specify the column that the attribute is being mapped to in the entity table. We indicate this in the form of a nested @Column annotation. If we are overriding multiple fields or properties, we can use the plural @AttributeOverrides annotation and nest multiple @AttributeOverride annotations inside of it. Note that because the @AttributeOverride annotation is @Repeatable, use of the @AttributeOverrides annotation is not mandatory.

Listing 4-28 shows an example of using Address in both Employee and Company. The Company entity uses the Address type without change, but the Employee entity specifies two attribute overrides to map the state and zip attributes of the Address to the PROVINCE and POSTAL_CODE columns of the EMPLOYEE table.
@Entity
public class Employee {
    @Id private long id;
    private String name;
    private long salary;
    @Embedded
    @AttributeOverride(name="state", column=@Column(name="PROVINCE"))
    @AttributeOverride(name="zip", column=@Column(name="POSTAL_CODE"))
    private Address address;
    // ...
}
@Entity
public class Company {
    @Id private String name;
    @Embedded
    private Address address;
    // ...
}
Listing 4-28

Reusing an Embedded Object in Multiple Entities
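
To make the override rule concrete, the following is a minimal, self-contained sketch of how the effective column names of an embedded object could be resolved for each containing entity. It is an illustration only, not how a persistence provider is implemented. The annotations are declared locally as stand-ins so that the sketch compiles without a Jakarta Persistence library on the classpath; a real application would import them from jakarta.persistence. The Address fields, the AttributeOverrideSketch class, the embeddedColumns helper, and the assumption that an unmapped attribute defaults to its name in upper case are all hypothetical choices made for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class AttributeOverrideSketch {

    // Local stand-ins for the jakarta.persistence annotations (assumption:
    // reduced to the elements this sketch needs).
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    @interface Column { String name(); }

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    @interface Embedded {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    @Repeatable(AttributeOverrides.class)
    @interface AttributeOverride { String name(); Column column(); }

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    @interface AttributeOverrides { AttributeOverride[] value(); }

    // Hypothetical embeddable: three defaulted attributes and one mapped one.
    static class Address {
        String street;
        String city;
        String state;
        @Column(name = "ZIP_CODE") String zip;
    }

    // Overrides two of the embedded attributes for its own table.
    static class Employee {
        long id;
        @Embedded
        @AttributeOverride(name = "state", column = @Column(name = "PROVINCE"))
        @AttributeOverride(name = "zip", column = @Column(name = "POSTAL_CODE"))
        Address address;
    }

    // Uses the embeddable's own mappings unchanged.
    static class Company {
        String name;
        @Embedded Address address;
    }

    // Returns attribute name -> effective column name for every embedded
    // attribute of the given class, sorted by attribute name.
    static Map<String, String> embeddedColumns(Class<?> entity) {
        Map<String, String> result = new TreeMap<>();
        for (Field field : entity.getDeclaredFields()) {
            if (!field.isAnnotationPresent(Embedded.class)) {
                continue;
            }
            // getAnnotationsByType sees both repeated annotations and the
            // explicit @AttributeOverrides container form.
            Map<String, String> overrides = new HashMap<>();
            for (AttributeOverride o : field.getAnnotationsByType(AttributeOverride.class)) {
                overrides.put(o.name(), o.column().name());
            }
            for (Field attr : field.getType().getDeclaredFields()) {
                if (attr.isSynthetic()) {
                    continue;
                }
                Column column = attr.getAnnotation(Column.class);
                // Assumption: an unmapped attribute defaults to its upper-cased name.
                String mapped = (column != null) ? column.name() : attr.getName().toUpperCase();
                result.put(attr.getName(), overrides.getOrDefault(attr.getName(), mapped));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(embeddedColumns(Employee.class));
        System.out.println(embeddedColumns(Company.class));
    }
}
```

In this sketch, the entity's override wins when one is present; otherwise the mapping declared in the embeddable applies, and failing that the default name is used. Employee therefore resolves state and zip to PROVINCE and POSTAL_CODE, while Company keeps STATE and ZIP_CODE from the same Address type.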

Summary

Mapping objects to relational databases is of critical importance to persistence applications. Dealing with the impedance mismatch requires a sophisticated suite of metadata. Jakarta Persistence not only provides this metadata but also facilitates easy and convenient development.

In this chapter, we went through the process of mapping entity state that included simple Java types, large objects, enumerated types, and temporal types. We also used the metadata to do meet-in-the-middle mapping to specific table names and columns.

We explained how identifiers are generated and described four different strategies of generation. You saw the different strategies in action and learned how to differentiate them from each other.

We then reviewed some of the relationship concepts and applied them to object-relational mapping metadata. We used join columns and join tables to map single-valued and collection-valued associations and went over some examples. We also discussed special types of objects called embeddables that are mapped but do not have identifiers and can exist only within persistent entities.

The next chapter discusses more of the intricacies of mapping collection-valued relationships, as well as how to map collections of nonentity objects. We delve into the different Collection types and the ways that these types can be used and mapped, and see how they affect the database tables that are being mapped to.
