7.1. Object Relational Mismatch

This section of the chapter walks through a simple example and brings to light a few of the common differences between the object world and the relational world.

Let's assume that your application keeps records for the employees of a company. Each employee is a person who has a few attributes such as name, ID, role, manager's name, and contact information. Each person's contact information could include his or her email, phone, fax, and instant messaging ID.

For starters, let's create the database schema and the object model for the "employee" entity and its associated information. The first problem that you will immediately encounter is that of granularity.

7.1.1. Coarse-Grained or Fine-Grained

Following object-oriented principles, you may end up creating three objects, namely Employee, ContactInfo, and Role, with definitions like this:

public class Employee
{
    private int employeeId;
    private String name;
    private Set roles;
    private String managerName;
    private ContactInfo contactInfo;

    //accessor methods

}

public class ContactInfo
{

    private String email;
    private String phone;
    private String fax;
    private String im;

    //accessor methods

}

public class Role
{
    private int roleId;
    private String description;
    private String businessUnit;

    //accessor methods

}

On the other hand, you will create two tables to keep the relational schema optimized. The Data Definition Language (DDL) statements that create the two tables are as follows:

create table employees (
    id INT NOT NULL PRIMARY KEY,
    name VARCHAR(100),
    manager_name VARCHAR(100),
    email VARCHAR(75),
    phone VARCHAR(75),
    fax VARCHAR(75),
    im VARCHAR(75)
);

create table roles (
    role_id INT NOT NULL PRIMARY KEY,
    description VARCHAR(100),
    business_unit VARCHAR(100),
    employee_id INT NOT NULL,
    FOREIGN KEY (employee_id) REFERENCES employees(id)
);

These DDL statements are for MySQL. Other databases may have a slightly different syntax.

Immediately, you will notice that the contact information is stored differently under the two schemes. In the object-oriented world, the contact information is encapsulated in a separate class, which can be considered finer grained than the Employee class. In the relational world, attributes of the contact information are stored as columns within the employees table. Why is this so?

Data is selected and manipulated in the relational world using SQL. SQL selects are generally more efficient when the data resides in a single table than when it is spread across multiple tables. Fetching data from multiple tables in one query requires join operations, which can be expensive.

This is one point of difference between object representations and relational tables. Objects can represent data at varying levels of granularity. For example, the Employee and ContactInfo classes contain data at different levels of granularity. Relational tables, on the other hand, have no notion of granularity: each table is simply made up of its constituent columns. The only deviation from this rule comes in the form of custom data types or user-defined types, which, by the way, are not supported by all relational databases.

Using user-defined types, one could define a complex data type for the contact information. Then, instead of four separate columns, you could keep the email, phone, fax, and instant messaging information in a single column. However, custom types compromise portability and performance, so sticking to multiple columns appears to be the best choice.
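The granularity gap can be made concrete with a minimal sketch, assuming a row of the employees table arrives as a plain map of column names to values. The EmployeeRowMapper class and its methods are hypothetical names introduced only for this illustration; the classes are trimmed-down versions of the ones shown earlier. One flat row becomes two objects, and the two objects flatten back into one row.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Fine-grained value object: it exists only in the object model.
class ContactInfo {
    final String email;
    final String phone;
    final String fax;
    final String im;

    ContactInfo(String email, String phone, String fax, String im) {
        this.email = email;
        this.phone = phone;
        this.fax = fax;
        this.im = im;
    }
}

// Coarse-grained entity that owns a ContactInfo.
class Employee {
    final int employeeId;
    final String name;
    final String managerName;
    final ContactInfo contactInfo;

    Employee(int employeeId, String name, String managerName, ContactInfo contactInfo) {
        this.employeeId = employeeId;
        this.name = name;
        this.managerName = managerName;
        this.contactInfo = contactInfo;
    }
}

// Hypothetical helper, not part of any library.
class EmployeeRowMapper {

    // Builds two objects from the columns of a single employees row.
    static Employee fromRow(Map<String, Object> row) {
        ContactInfo contact = new ContactInfo(
                (String) row.get("email"),
                (String) row.get("phone"),
                (String) row.get("fax"),
                (String) row.get("im"));
        return new Employee(
                (Integer) row.get("id"),
                (String) row.get("name"),
                (String) row.get("manager_name"),
                contact);
    }

    // Flattens the two objects back into the columns of a single row.
    static Map<String, Object> toRow(Employee employee) {
        Map<String, Object> row = new LinkedHashMap<String, Object>();
        row.put("id", employee.employeeId);
        row.put("name", employee.name);
        row.put("manager_name", employee.managerName);
        row.put("email", employee.contactInfo.email);
        row.put("phone", employee.contactInfo.phone);
        row.put("fax", employee.contactInfo.fax);
        row.put("im", employee.contactInfo.im);
        return row;
    }
}
```

Nothing in the row says where Employee ends and ContactInfo begins; that boundary lives only in the mapping code.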

In addition to differences in terms of granularity, there are some major points of divergence when it comes to object-oriented concepts like inheritance and polymorphism.

7.1.2. The Cases of Inheritance and Polymorphism

So far our model does not talk about inheritance and polymorphism, which are central concepts in the object-oriented paradigm. Now, I will introduce these ideas into the existing model for the employee records management application.

Say that there are two subtypes of the Role type: LineResource and SupportStaff. LineResource represents all those employees who are directly involved in the production and delivery of the company's products and services. Every other supporting member is classified as SupportStaff.

Using objects, it's elementary to depict this relationship, as both LineResource and SupportStaff extend from the Role class. These extensions can contain additional attributes and override standard behavior. The employee record management program methods can work with the Role super class and a specific subclass can be wired in depending on the context or business rules, thereby incorporating polymorphic behavior.
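As a rough sketch of what this looks like in code, assume the Role class gains a category() method that subclasses override. That method and the RoleReport class are hypothetical additions for illustration only and do not appear in the model defined earlier.

```java
// Base type from the employee model, reduced to two attributes.
class Role {
    private final int roleId;
    private final String description;

    Role(int roleId, String description) {
        this.roleId = roleId;
        this.description = description;
    }

    int getRoleId() {
        return roleId;
    }

    String getDescription() {
        return description;
    }

    // Hypothetical behavior, present only so that subclasses can override it.
    String category() {
        return "General";
    }
}

class LineResource extends Role {
    LineResource(int roleId, String description) {
        super(roleId, description);
    }

    @Override
    String category() {
        return "Line";
    }
}

class SupportStaff extends Role {
    SupportStaff(int roleId, String description) {
        super(roleId, description);
    }

    @Override
    String category() {
        return "Support";
    }
}

// Hypothetical client code written against the Role supertype only.
class RoleReport {
    static String describe(Role role) {
        // The runtime type of 'role' decides which category() runs.
        return role.getDescription() + " [" + role.category() + "]";
    }
}
```

Calling RoleReport.describe with a LineResource or a SupportStaff runs the subclass version of category() even though describe only knows about Role.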

These ideas of inheritance and polymorphism get fuzzy and complex when applied to the relational databases. First, relational systems have no notion of hierarchy among tables. There are no super- and subtables in a relational database management system. Views or virtual tables can pretend to create such a hierarchy but they aren't the same, because they don't represent a type and its subtype. In general, the notion of type has no relevance in the world of the relational tables.

Second, polymorphic behavior implementation implies the existence of a foreign key relationship with multiple subtypes in the same place. That is, one foreign key will need to define a constraint that holds for the Role type and its subtypes LineResource and SupportStaff. Such foreign key semantics are not supported by relational systems. At best, one could use a database procedure to impose such a foreign key constraint.

So, there are some major disconnects when it comes to inheritance and polymorphism. This problem is pervasive, even in the context of defining associations between the entities and navigating from one to the other.

7.1.3. Entity Relationships

Entities are related to each other through defined associations and foreign key constraints. When you access data, you traverse these interrelationships to get the entire relevant data set.

Taking the case of employees and their roles, one can safely assume that a many-to-many relationship could exist between the two. An employee could wear multiple hats and so be associated with multiple roles, and many employees could be classified under the same role. (I am not restricting to only two subtypes of a Role, as defined in the last section. That was just an example. Role types could be structured in many different ways, including classifications based on business functions and hierarchy.)

The many-to-many cardinality in the relationship poses no complications in the world of objects. All one needs to do is to use collections instead of individual entities when referring to the association. For example, the Employee class could access its roles using the following methods:

public Set getRoles() {
    return roles;
}

public void setRoles(Set roles) {
    this.roles = roles;
}

For Java 5 and above, which support generics, these methods would work with a Set<Role> type.
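A minimal sketch of the generic version follows, assuming a Role class reduced to its identifier for brevity. With Set<Role>, callers can iterate over the roles without casting.

```java
import java.util.HashSet;
import java.util.Set;

// Reduced Role class; only the identifier is kept for this illustration.
class Role {
    private final int roleId;

    Role(int roleId) {
        this.roleId = roleId;
    }

    int getRoleId() {
        return roleId;
    }
}

class Employee {
    // The element type is now part of the declaration.
    private Set<Role> roles = new HashSet<Role>();

    public Set<Role> getRoles() {
        return roles;
    }

    public void setRoles(Set<Role> roles) {
        this.roles = roles;
    }
}
```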

Tables representing one-to-one and one-to-many (or many-to-one when viewed from the other side) relationships simply involve appropriate foreign key definitions. However, things get tricky when many-to-many associations exist. The foreign key constraint method isn't appropriate. In relational databases, a many-to-many cardinality demands the creation of a new table, called the association or the link table, which maps the two entities. So, a many-to-many employees-to-roles association will imply creation of a table on the following lines:

create table employees_roles (
    id INT NOT NULL,
    role_id INT NOT NULL,
    FOREIGN KEY (id) REFERENCES employees(id),
    FOREIGN KEY (role_id) REFERENCES roles(role_id)
);

Such tables have no counterparts in the object world because objects don't treat many-to-many relationships any differently from one-to-many relationships.

So far, I have always started with the object viewpoint and then shown how the particular idea translates to the world of relational tables. This may have given you the false impression that the object viewpoint is more flexible and robust than its relational counterpart. However, this is not true at all and that is why I wanted to state it explicitly here. Relational databases serve well in more than one way and are a time-tested preferred way of storing and accessing data.

One place where they distinctly outperform the object approach is in accessing associated or networked entities. Let's consider the employees and roles association once again. Using SQL, one could retrieve all the associated roles of a given employee whose ID is 1, with a single query, like so:

select er.id, er.role_id from employees_roles er, employees e
where er.id = e.id
and e.id = 1;

In the object-oriented approach, to get the same data set you would first invoke anEmployee.getRoles(). This gives you the set of all roles associated with an Employee. You would then iterate over this set and call aRole.getRoleId() on each member to collect the role IDs. If each role is loaded from the database only when it is first accessed, an employee with n roles costs n + 1 selects: one for the collection and one for each role. Compare this with the single SQL select statement and you can immediately see how traversing the object network may be inefficient when retrieving associations that have one-to-many or many-to-many cardinality.
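The difference in the number of selects can be illustrated with a small simulation. This is only a sketch under stated assumptions: FakeDatabase is an in-memory stand-in that counts the selects it is asked to perform, each role is assumed to be loaded by its own select during traversal, and every class and method name here is hypothetical. It fetches role descriptions rather than IDs so that each role row really has to be read.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a database; it only counts the selects issued.
class FakeDatabase {
    int selectCount = 0;

    // employee id -> role ids (plays the part of the employees_roles table)
    private final Map<Integer, List<Integer>> employeesRoles =
            new LinkedHashMap<Integer, List<Integer>>();
    // role id -> description (plays the part of the roles table)
    private final Map<Integer, String> roles = new LinkedHashMap<Integer, String>();

    void addRole(int employeeId, int roleId, String description) {
        List<Integer> ids = employeesRoles.get(employeeId);
        if (ids == null) {
            ids = new ArrayList<Integer>();
            employeesRoles.put(employeeId, ids);
        }
        ids.add(roleId);
        roles.put(roleId, description);
    }

    // One select against the link table.
    List<Integer> selectRoleIds(int employeeId) {
        selectCount++;
        List<Integer> ids = employeesRoles.get(employeeId);
        return ids == null ? new ArrayList<Integer>() : new ArrayList<Integer>(ids);
    }

    // One select for a single role row.
    String selectRoleDescription(int roleId) {
        selectCount++;
        return roles.get(roleId);
    }

    // One select that joins the link table with the roles table.
    List<String> selectRoleDescriptionsJoined(int employeeId) {
        selectCount++;
        List<String> result = new ArrayList<String>();
        List<Integer> ids = employeesRoles.get(employeeId);
        if (ids != null) {
            for (Integer id : ids) {
                result.add(roles.get(id));
            }
        }
        return result;
    }
}

class RoleLoader {
    // Object-style traversal: 1 select for the collection, then 1 per role.
    static List<String> loadByTraversal(FakeDatabase db, int employeeId) {
        List<String> descriptions = new ArrayList<String>();
        for (Integer roleId : db.selectRoleIds(employeeId)) {
            descriptions.add(db.selectRoleDescription(roleId));
        }
        return descriptions;
    }

    // SQL-style retrieval: a single joined select.
    static List<String> loadByJoin(FakeDatabase db, int employeeId) {
        return db.selectRoleDescriptionsJoined(employeeId);
    }
}
```

For an employee with three roles, loadByTraversal increments selectCount four times, while loadByJoin increments it once and returns the same descriptions.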

This inefficiency is amplified because of the additional overhead that object orientation imposes at the time the association is defined. In the earlier example, the relationship between employees and roles could be bidirectional, meaning that you will not only query all roles for an employee but may also query all employees that map to a role. In terms of implementation, this means accessor methods will be defined in an Employee to fetch roles, and similar counterparts would be defined in a Role to fetch all employees. You saw the getter and setter for roles in an Employee class earlier in this section. The getter and setter used in a Role class to fetch the set of employees might look like this:

public Set getEmployees() {
    return employees;
}

public void setEmployees(Set employees) {
    this.employees = employees;
}
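Because the association is held on both sides, application code has to keep the two sets consistent. The following is a minimal sketch of one way to do that, using the generic collections available from Java 5 onward. The addRole helper is a hypothetical addition and is not part of the classes shown earlier.

```java
import java.util.HashSet;
import java.util.Set;

class Employee {
    private final Set<Role> roles = new HashSet<Role>();

    Set<Role> getRoles() {
        return roles;
    }

    // Hypothetical convenience method: updates both ends of the association.
    void addRole(Role role) {
        roles.add(role);
        role.getEmployees().add(this);
    }
}

class Role {
    private final Set<Employee> employees = new HashSet<Employee>();

    Set<Employee> getEmployees() {
        return employees;
    }
}
```

If only one side were updated, anEmployee.getRoles() and aRole.getEmployees() would disagree about the same relationship.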

In SQL, you do not need to define the same relationship twice. Apart from these differences, the two sides have another major point of difference and that relates to the way they uniquely identify a record.

7.1.4. What Defines Identity?

In relational databases, a row in a table is uniquely identified by its primary key. A simple primary key maps to a single column, whereas a composite primary key involves a combination of multiple columns.

Rows in a database table map to object instances. Object instances have two notions of identity, which are based on either of the following:

  • Reference

  • Value

When evaluated in terms of "reference," two objects are the same when they both point to the same instance. Such objects return "true" when compared using the "==" operator.

When evaluated in terms of "value," two objects are the same when the values of their attributes are identical. Usually, the equals and the hashCode methods implement the notion of equality by value.

The pertinent question, then, is which notion of object identity accurately represents the primary key-based identity that the relational model adheres to. The short answer is that neither reference-based nor value-based object identity maps to primary key-based identity.
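A small sketch shows where the three notions part ways. It assumes an Employee reduced to an ID and a name, with equals and hashCode comparing all attributes. The IdentityCheck.sameRow helper is a hypothetical name that stands in for a primary key comparison.

```java
import java.util.Objects;

class Employee {
    private final int employeeId;
    private String name;

    Employee(int employeeId, String name) {
        this.employeeId = employeeId;
        this.name = name;
    }

    int getEmployeeId() {
        return employeeId;
    }

    String getName() {
        return name;
    }

    void setName(String name) {
        this.name = name;
    }

    // Value-based equality: every attribute must match.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Employee)) {
            return false;
        }
        Employee that = (Employee) other;
        return employeeId == that.employeeId && Objects.equals(name, that.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(employeeId, name);
    }
}

// Hypothetical helper that mimics primary key-based identity.
class IdentityCheck {
    // Two objects represent the same row when their key values match,
    // regardless of what the other attributes currently hold.
    static boolean sameRow(Employee a, Employee b) {
        return a.getEmployeeId() == b.getEmployeeId();
    }
}
```

Two Employee objects created separately for the row with ID 1 are not the same by reference. Once one of them has its name changed, they are not the same by value either, yet sameRow still reports that they represent the same row.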

However, ORM tools like Hibernate provide a way to accommodate the differences and facilitate the interaction between the two models. ORM tools also reduce the problems arising out of the discrepancies stated earlier in this chapter.

While it's very tempting to indulge further in the discussion on object relational mismatch, you should have enough information to understand the differences between the two paradigms.

It's time now to explore the Java Persistence API and Hibernate and see how these options help objects interact successfully with relational tables.
