8.3. Improving schema DDL

Customizing the DDL in your Hibernate application is something you'll usually consider only when you generate the database schema with Hibernate's toolset. If a schema already exists, such customizations won't affect the runtime behavior of Hibernate.

You can export DDL to a text file or execute it directly on your database whenever you run your integration tests. Because DDL is mostly vendor-specific, every option you put in your mapping metadata has the potential to bind the metadata to a particular database product—keep this in mind when applying the following features.

We separate DDL customization into two categories:

  • Naming automatically generated database objects, such as tables, columns, and constraints explicitly in mapping metadata, instead of relying on the automatic naming derived from the Java class and property names by Hibernate. We already discussed the built-in mechanism and options for quoting and extending names in chapter 4, section 4.3.5, "Quoting SQL identifiers." We next look at other options you can enable to beautify your generated DDL scripts.

  • Handling additional database objects, such as indexes, constraints, and stored procedures in your mapping metadata. Earlier in this chapter, you added arbitrary CREATE and DROP statements to XML mapping files with the <database-object> element. You can also enable the creation of indexes and constraints with additional mapping elements inside the regular class and property mappings.

8.3.1. Custom SQL names and datatypes

In listing 8.1, you add attributes and elements to the mapping of the Item class.

Listing 8-1. Additional elements in the Item mapping for hbm2ddl
<class name="Item" table="ITEMS">

    <id name="id" type="string">
        <column name="ITEM_ID" sql-type="char(32)"/> 
<generator class="uuid"/> </id> <property name="initialPrice" type="big_decimal"> <column name="INIT_PRICE"
not-null="true" precision="10" scale="2"/> </property> <property name="description" type="string" column="ITM_DESCRIPTION" length="4000"/>
<set name="categories" table="CATEGORY_ITEM" cascade="none"> <key> <column name="ITEM_ID" sql-type="char(32)"/>
</key> <many-to-many class="Category"> <column name="CATEGORY_ID" sql-type="char(32)"/> </many-to-many> </set> ... </class>

❶ The hbm2ddl exporter generates a VARCHAR typed column if a property (even the identifier property) is of mapping type string. You know that the identifier generator uuid always generates 32-character strings; therefore you switch to a CHAR SQL type and also set its size fixed at 32 characters. The <column> element is required for this declaration, because no attribute supports the SQL datatype on the <id> element.

❷ For decimal types, you can declare the precision and scale. This example creates the column as INIT_PRICE number(10,2) on an Oracle dialect; however, for databases that don't support types with decimal precision, a simple INIT_PRICE numeric (this is in HSQL) is produced.

❸ For the description field, you add DDL attributes on the <property> element instead of a nested <column> element. The DESCRIPTION column is generated as VARCHAR(4000)—a limitation of a variable character field in an Oracle database (in Oracle, it would be VARCHAR2(4000) in the DDL, but the dialect takes care of this).

❹ A <column> element can also be used to declare the foreign key fields in an association mapping. Otherwise, the columns of your association table CATEGORY_ITEM would be VARCHAR(32) instead of the more appropriate CHAR(32) type.

The same customization is possible in annotations, see listing 8.2.

Listing 8-2. Additional annotations for customization of DDL export
@Entity
@Table(name = "ITEMS")
public class Item {

    @Id
    @Column(name = "ITEM_ID", columnDefinition = "char(32)")
    @GeneratedValue(generator = "hibernate-uuid.hex")
    @org.hibernate.annotations.GenericGenerator(
            name = "hibernate-uuid.hex",
            strategy = "uuid.hex"
    )
    Private String id;

    @Column(name = "INIT_PRICE", nullable = false,
            precision = 10, scale = 2)
    BigDecimal initialPrice;

    @Column(name = "ITM_DESCRIPTION", length = 4000)
    Private String description;

    @ManyToMany
    @JoinTable(
        name = "CATEGORY_ITEM",
        joinColumns =
            { @JoinColumn(name = "ITEM_ID",
                          columnDefinition = "char(32)")
            },
        inverseJoinColumns =

            { @JoinColumn(name = "CATEGORY_ID",
                          columnDefinition = "char(32)")
            }
    )
    Private Set<Category> categories = new HashSet<Category>();

    ...
}

You have to use one Hibernate extension to name the nonstandard identifier generator. All other customizations of the generated SQL DDL are done with annotations of the JPA specification. One attribute deserves special attention: The columnDefinition isn't the same as sql-type in a Hibernate mapping file. It's more flexible: The JPA persistence provider appends the whole string after the column name in the CREATE TABLE statement, as in ITEM_ID char(32).

Customization of names and data types is the absolute minimum you should consider. We recommend that you always improve the quality of your database schema (and ultimately, the quality of the data that is stored) with the appropriate integrity rules.

8.3.2. Ensuring data consistency

Integrity rules are an important part of your database schema. The most important responsibility of your database is to protect your information and to guarantee that it's never in an inconsistent state. This is called consistency, and it's part of the ACID criteria commonly applied to transactional database management systems.

Rules are part of your business logic, so you usually have a mix of business-related rules implemented in your application code and in your database. Your application is written so as to avoid any violation of the database rules. However, it's the job of the database management system to never allow any false (in the business logic sense) information to be stored permanently—for example, if one of the applications accessing the database has bugs. Systems that ensure integrity only in application code are prone to data corruption and often degrade the quality of the database over time. Keep in mind that the primary purpose of most business applications is to produce valuable business data in the long run.

In contrast to ensuring data consistency in procedural (or object-oriented) application code, database-management systems allow you to implement integrity rules declaratively as part of your data schema. The advantages of declarative rules include fewer possible errors in code and a chance for the database-management system to optimize data access.

We identify four levels of rules:

  • Domain constraint—A domain is (loosely speaking, and in the database world) a datatype in a database. Hence, a domain constraint defines the range of possible values a particular datatype can handle. For example, an int datatype is usable for integer values. A char datatype can hold character strings: for example, all characters defined in ASCII. Because we mostly use datatypes that are built in to the database management system, we rely on the domain constraints as defined by the vendor. If you create user-defined datatypes (UDT), you'll have to define their constraints. If they're supported by your SQL database, you can use the (limited) support for custom domains to add additional constraints for particular datatypes.

  • Column constraint—Restricting a column to hold values of a particular domain is equivalent to adding a column constraint. For example, you declare in DDL that the INITIAL_PRICE column holds values of the domain MONEY, which internally uses the datatype number(10,2). You use the datatype directly most of the time, without defining a domain first. A special column constraint in an SQL database is NOT NULL.

  • Table constraint—An integrity rule that applies to a single row or several rows is a table constraint. A typical declarative table constraints is UNIQUE (all rows are checked for duplicate values). A sample rule affecting only a single row is "end date of an auction must be later than the start date."

  • Database constraint—If a rule applies to more than one table, it has database scope. You should already be familiar with the most common database constraint, the foreign key. This rule guarantees the integrity of references between rows, usually in separate tables, but not always (self-referencing foreign key constraints aren't uncommon).

Most (if not all) SQL database-management systems support the mentioned levels of constraints and the most important options in each. In addition to simple keywords, such as NOT NULL and UNIQUE, you can usually also declare more complex rules with the CHECK constraint that applies an arbitrary SQL expression. Still, integrity constraints are one of the weak areas in the SQL standard, and solutions from vendors can differ significantly.

Furthermore, nondeclarative and procedural constraints are possible with database triggers that intercept data-modification operations. A trigger can then implement the constraint procedure directly or call an existing stored procedure. Like DDL for stored procedures, you can add trigger declarations to your Hibernate mapping metadata with the <database-object> element for inclusion in the generated DDL.

Finally, integrity constraints can be checked immediately when a data-modification statement is executed, or the check can be deferred until the end of a transaction. The violation response in SQL databases is usually rejection, without any possibility of customization.

We now have a closer look at the implementation of integrity constraints.

8.3.3. Adding domains and column constraints

The SQL standard includes domains, which, unfortunately, not only are rather limited but also are often not supported by the DBMS. If your system supports SQL domains, you can use them to add constraints to datatypes:

create domain EMAILADDRESS as varchar
  constraint DOMAIN_EMAILADDRESS
  check ( IS_EMAILADDRESS(value) );

You can now use this domain identifier as a column type when creating a table:

create table USERS (
    ...
    USER_EMAIL EMAILADDRESS(255) not null,
    ...
);

The (relatively minor) advantage of domains in SQL is the abstraction of common constraints into a single location. Domain constraints are always checked immediately when data is inserted and modified. To complete the previous example, you also have to write the stored function IS_EMAILADDRESS (you can find many regular expressions to do this on the Web). Adding the new domain in a Hibernate mapping is simple as an sql-type:

<property name="email" type="string">
    <column name="USER_EMAIL"
            length="255"
            not-null="true"
            sql-type="EMAILADDRESS"/>
</property>

With annotations, declare your own columnDefinition:

@Column(name = "USER_EMAIL", length = 255,
        columnDefinition = "EMAILADDRESS(255) not null")
String email;

If you want to create and drop the domain declaration automatically with the rest of your schema, put it into a <database-object> mapping.

SQL supports additional column constraints. For example, the business rules allow only alphanumeric characters in user login names:

create table USERS (
    ...
    USERNAME varchar(16) not null
        check(regexp_like(USERNAME,'^[[:alpha:]]+$')),
    ...
);

You may not be able to use this expression in your DBMS unless it supports regular expressions. Single-column check constraints are declared in Hibernate mappings on the <column> mapping element:

<property name="username" type="string">
    <column name="USERNAME"
            length="16"
            not-null="true"
            check="regexp_like(USERNAME,'^[[:alpha:]]+$')"/>
</property>

Check constraints in annotations are available only as a Hibernate extension:

@Column(name = "USERNAME", length = 16,
        nullable = false, unique = true)
@org.hibernate.annotations.Check(
    constraints = "regexp_like(USERNAME,'^[[:alpha:]]+$')"
)
private String username;

Note that you have a choice: Creating and using a domain or adding a single-column constraint has the same effect. In the long run, domains are usually easier to maintain and more likely to avoid duplication.

Let's look at the next level of rules: single and multirow table constraints.

8.3.4. Table-level constraints

Imagine that you want to guarantee that a CaveatEmptor auction can't end before it started. You write the application code to prevent users from setting the startDate and endDate properties on an Item to wrong values. You can do this in the user interface or in the setter methods of the properties. In the database schema, you add a single-row table constraint:

create table ITEM (
    ...
    START_DATE timestamp not null,

    END_DATE timestamp not null,
    ...
    check (START_DATE < END_DATE)
);

Table constraints are appended in the CREATE TABLE DDL and can contain arbitrary SQL expressions. You include the constraint expression in your Hibernate mapping file on the <class> mapping element:

<class name="Item"
       table="ITEM"
       check="START_DATE &lt; END_DATE">

Note that the < character must be escaped as &lt; in XML. With annotations, you need to add a Hibernate extension annotation to declare check constraints:

@Entity
@org.hibernate.annotations.Check(
    constraints = "START_DATE < END_DATE"
)
public class Item { ... }

Multirow table constraints can be implemented with more complex expressions. You may need a subselect in the expression to do this, which may not be supported in your DBMS—check your product documentation first. However, there are common multirow table constraints you can add directly as attributes in Hibernate mappings. For example, you identify the login name of a User as unique in the system:

<property name="username" type="string">
    <column name="USERNAME"
            length="16"
            not-null="true"
            check="regexp_like(USERNAME,'^[[:alpha:]]+$')"
            unique="true"/>
</property>

Unique constraint declaration is also possible in annotation metadata:

@Column(name = "USERNAME", length = 16, nullable = false,
        unique = true)
@org.hibernate.annotations.Check(
    constraints = "regexp_like(USERNAME,'^[[:alpha:]]+$')"
)
private String username;

And, of course, you can do this in JPA XML descriptors (there is no check constraint, however):

<entity class="auction.model.User" access="FIELD">
    <attributes>
        ...
        <basic name="username">
            <column name="USERNAME"
                    length="16"
                    nullable="false"
                    unique="true"/>
        </basic>
    </attributes>
</entity>

The exported DDL includes the unique constraint:

create table USERS (
    ...
    USERNAME varchar(16) not null unique
        check(regexp_like(USERNAME,'^[[:alpha:]]+$')),
    ...
);

A unique constraint can also span several columns. For example, CaveatEmptor supports a tree of nested Category objects. One of the business rules says that a particular category can't have the same name as any of its siblings. Hence, you need a multicolumn multirow constraint that guarantees this uniqueness:

<class name="Category" table="CATEGORY">
    ...
    <property name="name">
      <column name="CAT_NAME"
              unique-key="unique_siblings"/>
    </property>

    <many-to-one name="parent" class="Category">
      <column name="PARENT_CATEGORY_ID"
              unique-key="unique_siblings"/>
    </many-to-one>
    ...
</class>

You assign an identifier to the constraint with the unique-key attribute so you can refer to it several times in one class mapping and group columns for the same constraint. However, the identifier isn't used in the DDL to name the constraint:

create table CATEGORY (
    ...
    CAT_NAME varchar(255) not null,
    PARENT_CATEGORY_ID integer,
    ...
    unique (CAT_NAME, PARENT_CATEGORY_ID)
);

If you want to create a unique constraint with annotations that spans several columns, you need to declare it on the entity, not on a single column:

@Entity
@Table(name = "CATEGORY",
   uniqueConstraints = {
        @UniqueConstraint(columnNames =
            {"CAT_NAME", "PARENT_CATEGORY_ID"} )
   }
)
public class Category { ... }

With JPA XML descriptors, multicolumn constraints are as follows:

<entity class="Category" access="FIELD">
    <table name="CATEGORY">
        <unique-constraint>
            <column-name>CAT_NAME</column-name>
            <column-name>PARENT_CATEGORY_ID</column-name>
        </unique-constraint>
    </table>
...

Completely custom constraints, including an identifier for the database catalog, can be added to your DDL with the <database-object> element:

<database-object>
    <create>
        alter table CATEGORY add constraint UNIQUE_SIBLINGS
            unique (CAT_NAME, PARENT_CATEGORY_ID);
    </create>
    <drop>
        drop constraint UNIQUE_SIBLINGS
    </drop>
</database-object>

This functionality isn't available in annotations. Note that you can add a Hibernate XML metadata file with all your custom database DDL objects in your annotation-based application.

Finally, the last category of constraints includes database-wide rules that span several tables.

8.3.5. Database constraints

You can create a rule that spans several tables with a join in a subselect in any check expression. Instead of referring only to the table on which the constraint is declared, you may query (usually for the existence or nonexistence of a particular piece of information) a different table.

Another technique to create a database-wide constraint uses custom triggers that run on insertion or update of rows in particular tables. This is a procedural approach that has the already-mentioned disadvantages but is inherently flexible.

By far the most common rules that span several tables are referential integrity rules. They're widely known as foreign keys, which are a combination of two things: a key value copy from a related row and a constraint that guarantees that the referenced value exists. Hibernate creates foreign key constraints automatically for all foreign key columns in association mappings. If you check the DDL produced by Hibernate, you may notice that these constraints also have automatically generated database identifiers—names that aren't easy to read and that make debugging more difficult:

alter table ITEM add constraint FK1FF7F1F09FA3CB90
    foreign key (SELLER_ID) references USERS;

This DDL declares the foreign key constraint for the SELLER_ID column in the ITEM table. It references the primary key column of the USERS table. You can customize the name of the constraint in the <many-to-one> mapping of the Item class with the foreign-key attribute:

<many-to-one name="seller"
         class="User"
         column="SELLER_ID"
         foreign-key="FK_SELLER_ID"/>

With annotations, use a Hibernate extension:

@ManyToOne
@JoinColumn(name = "SELLER_ID")
@org.hibernate.annotations.ForeignKey(name = "FK_SELLER_ID")
private User seller;

And a special syntax is required for foreign keys created for a many-to-many association:

@ManyToMany
@JoinTable(...)
@org.hibernate.annotations.ForeignKey(
    name = "FK_CATEGORY_ID",
    inverseName = "FK_ITEM_ID"
)
private Set<Category> categories...

If you want to automatically generate DDL that isn't distinguishable from what a human DBA would write, customize all your foreign key constraints in all your mapping metadata. Not only is this good practice, but it also helps significantly when you have to read exception messages. Note that the hbm2ddl exporter considers constraint names only for foreign keys that have been set on the noninverse side of a bidirectional association mapping.

Foreign key constraints also have features in SQL that your legacy schema may already utilize. Instead of immediately rejecting a modification of data that would violate a foreign key constraint, an SQL database can CASCADE the change to the referencing rows. For example, if a row that is considered a parent is deleted, all child rows with a foreign key constraint on the primary key of the parent row may be deleted as well. If you have or want to use these database-level cascading options, enable them in your foreign key mapping:

<class name="Item" table="ITEM">
    ...
    <set name="bids" cascade="save-update, delete">
        <key column="ITEM_ID" on-delete="cascade"/>
        <one-to-many class="Bid"/>
    </set>

</class>

Hibernate now creates and relies on a database-level ON CASCADE DELETE option of the foreign key constraint, instead of executing many individual DELETE statements when an Item instance is deleted and all bids have to be removed. Be aware that this feature bypasses Hibernate's usual optimistic locking strategy for versioned data!

Finally, unrelated to integrity rules translated from business logic, database performance optimization is also part of your typical DDL customization effort.

8.3.6. Creating indexes

Indexes are a key feature when optimizing the performance of a database application. The query optimizer in a database-management system can use indexes to avoid excessive scans of the data tables. Because they're relevant only in the physical implementation of a database, indexes aren't part of the SQL standard, and the DDL and available indexing options are specific for a particular product. However, the most common DDL for typical indexes can be embedded in a Hibernate mapping (that is, without the generic <database-object> element).

Many queries in CaveatEmptor will probably involve the endDate property of an auction Item. You can speed up these queries by creating an index for the column of this property:

<property   name="endDate"
            column="END_DATE"

            type="timestamp"
            index="IDX_END_DATE"/>

The automatically produced DDL now includes an additional statement:

create index IDX_END_DATE on ITEM (END_DATE);

The same functionality is available with annotations, as a Hibernate extension:

@Column(name = "END_DATE", nullable = false, updatable = false)
@org.hibernate.annotations.Index(name = "IDX_END_DATE")
private Date endDate;

You can create a multicolumn index by setting the same identifier on several property (or column) mappings. Any other index option, such as UNIQUE INDEX (which creates an additional multirow table-level constraint), the indexing method (common are btree, hash, and binary), and any storage clause (for example, to create the index in a separate tablespace) can be set only in completely custom DDL with <database-object>.

A multicolumn index with annotations is defined at the entity level, with a custom Hibernate annotation that applies additional attributes to table mapping:

@Entity
@Table(name="ITEMS")
@org.hibernate.annotations.Table(
  appliesTo = "ITEMS",  indexes =
    @org.hibernate.annotations.Index(
     name = "IDX_INITIAL_PRICE",
     columnNames = { "INITIAL_PRICE", "INITIAL_PRICE_CURRENCY" }
    )
)
public class Item { ... }

Note that @org.hibernate.annotations.Table isn't a replacement for @javax. perisistence.Table, so if you need to override the default table name, you still need the regular @Table.

We recommend that you get the excellent book SQL Tuning by Dan Tow (Tow, 2003) if you want to learn efficient database-optimization techniques and especially how indexes can get you closer to the best-performing execution plan for your queries.

One mapping we have shown a few times in this chapter is <database-object>. It has some other options that we haven't discussed yet.

8.3.7. Adding auxiliary DDL

Hibernate creates the basic DDL for your tables and constraints automatically; it even creates sequences if you have a particular identifier generator. However, there is some DDL that Hibernate can't create automatically. This includes all kinds of highly vendor-specific performance options and any other DDL that is relevant only for the physical storage of data (tablespaces, for example).

One reason this kind of DDL has no mapping elements or annotations is that there are too many variations and possibilities—nobody can or wants to maintain more than 25 database dialects with all the possible combinations of DDL. A second, much more important reason, is that you should always ask your DBA to finalize your database schema. For example, did you know that indexes on foreign key columns can hurt performance in some situations and therefore aren't automatically generated by Hibernate? We recommend that DBAs get involved early and verify the automatically generated DDL from Hibernate.

A common process, if you're starting with a new application and new database, is to generate DDL with Hibernate automatically during development; database performance concerns shouldn't and usually don't play an important role at that stage. At the same time (or later, during testing), a professional DBA verifies and optimizes the SQL DDL and creates the final database schema. You can export the DDL into a text file and hand it to your DBA.

Or—and this is the option you've seen a few times already—you can add customized DDL statements to your mapping metadata:

<database-object>
    <create>
        [CREATE statement]
    </create>
    <drop>
        [DROP statement]
    </drop>

    <dialect-scope name="org.hibernate.dialect.Oracle9Dialect"/>
    <dialect-scope name="org.hibernate.dialect.OracleDialect"/>

</database-object>

The <dialect-scope> elements restrict the custom CREATE or DROP statements to a particular set of configured database dialects, which is useful if you're deploying on several systems and need different customizations.

If you need more programmatic control over the generated DDL, implement the AuxiliaryDatabaseObject interface. Hibernate comes bundled with a convenience implementation that you can subclass; you can then override methods selectively:

package auction.persistence;

import org.hibernate.mapping.*;

import org.hibernate.dialect.Dialect;
import org.hibernate.engine.Mapping;

public class CustomDDLExtension
    extends AbstractAuxiliaryDatabaseObject {

    public CustomDDLExtension() {
        addDialectScope("org.hibernate.dialect.Oracle9Dialect");
    }

    public String sqlCreateString(Dialect dialect,
                                  Mapping mapping,
                                  String defaultCatalog,
                                  String defaultSchema) {
        return "[CREATE statement]";
    }

    public String sqlDropString(Dialect dialect,
                                String defaultCatalog,
                                String defaultSchema) {

        return "[DROP statement]";
    }
}

You can add dialect scopes programmatically and even access some mapping information in the sqlCreateString() and sqlDropString() methods. This gives you a lot of flexibility regarding how you create and write your DDL statements. You have to enable this custom class in your mapping metadata:

<database-object>
    <definition class="auction.persistence.CustomDDLExtension"/>
    <dialect-scope name="org.hibernate.dialect.OracleDialect"/>
</database-object>

Additional dialect scopes are cumulative; the previous examples all apply to two dialects.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.149.253.239