Chapter 9. The Database Layer

At its core, Drupal is a tool for creating, maintaining, and displaying information stored in a database. Drupal database management is a crucial part of the entire system, as all aspects of a site are stored in the database.

It's pretty important, then, that the parts of Drupal that touch the database be pretty solid and efficient.

Prior to version 7 of Drupal, the system was pretty much locked into a single database product—MySQL. There were some successful implementations of Postgres/SQL, but virtually all development was done against MySQL and the vast majority of current implementations use MySQL.

In many ways, using Drupal with PHP is similar to the way we created our classic ASP programs. Either we'd write our own data abstraction layer or, more likely, we'd just create direct SQL statements to the database. I remember my first ASP database application—where the underlying database was Microsoft Access. I connected using the ASP ADODB.Connection object. If you started that far back, perhaps you'll recognize the code in Listing 9-1.

Example 9.1. Classic ASP Code Using the ADODB Object

<%@ LANGUAGE = "VBScript" %>
<XMLNEWS>
<%
set objConnection = Server.CreateObject("ADODB.Connection")
DBFile = Server.MapPath("dataNewsfeed.mdb")
objConnection.Open("driver={Microsoft Access Driver (*.mdb)}; DBQ=" & DBFile)
set RS = objConnection.Execute("SELECT * FROM News " &
    "WHERE Date > #" & DaysAgo(5) & "# " &
    "ORDER BY Date DESC, Location, Headline")
do until RS.EOF
    %>
    <ITEM DATE="<%=RS("Date")%>" LOCATION="<%=RS("Location")%>">
        <URL><%=RS("Url")%></URL>
        <HEADLINE><%=RS("Headline")%></HEADLINE>
        <ABSTRACT><%=RS("Abstract")%></ABSTRACT>
    </ITEM>
    <%
    RS.MoveNext
Loop
objConnection.Close
%>
</XMLNEWS>

This code created an XML document with news information, sort of a precursor to RSS. Notice that the database connection is expressed as a connection to a physical file, and that the SQL select statement is concatenated into a string that is passed to the connection via the execute method.

There are substantial security risks using this approach, the most common of which is SQL injection. SQL injection is the attempt to exploit common shortcuts that programmers take with their code for malicious purposes. The problem is illustrated in Figure 9-1.

Sometimes, SQL injection can be used for good (Source gizmodo.com)

Figure 9.1. Sometimes, SQL injection can be used for good (Source gizmodo.com)

Someone looking for exploits would assume that a programmer did not escape any table names, and possibly did not sanitize the input.

In case you're not familiar with how this works, I'll explain. In this license plate example, the end of the banner, the area under the driver's-side headlight, is, I assume, partially obscured. I expect it says TABLE LICENCE; -- (probably using the British spelling).

Someone looking to exploit this vulnerability could make an educated guess that the query is something like:

SELECT registration-status FROM licence WHERE plate-num = '%s';

The string variable, %s, would be inserted from the OCR performed by the traffic scanner. So the result would be disastrous for the police:

SELECT registration-status FROM licence WHERE plate-num = 'ZU 066', 0, 0); DROP DATABASE TABLE LICENCE; --'

This is a sort of humorous example, and the owner definitely gets extra nerd points for this one. But the point is that SQL injection attacks are very common, and are easy to prevent if you just do a little sanitizing.

Tip

Better yet, don't use direct SQL calls at all! I'll discuss this later in the chapter.

The ASP code shown above is pretty primitive by today's standards, but Drupal code before version 7 wasn't much more advanced. Fortunately, things have changed.

Data Abstraction

One of the key goals of Drupal 7 was to abstract the database layer so other database engines could be used without major rewrites of core and contributed modules.

If you are a .NET programmer, you're probably familiar with ADO.NET, a facility that provides an abstraction between your application and your data. It works by establishing an object that represents the data, along with whatever metadata is required to understand the data itself. Getting data out of your database becomes a simple task of establishing a connection to the database, including a connection string, and then executing a set of methods against a DataTable object, which is an abstraction of the underlying database tables or views.

In the Drupal world, the corollary to ADO.NET is PDO, which stands for PHP Data Objects. PDO is a data access object that separates the physical database from the application that uses it. The PDO object provides methods for managing transactions, reporting error conditions, and an interface for preparing and executing statements.

It is important to note, however, that PDO is not a database abstraction layer. PDO does not rewrite any SQL statements or provide emulation for features that might be missing in a particular database implementation. For that, you need to write an abstraction layer that sits on top of PDO. And that's exactly what has been done for Drupal 7.

Note

Versions of Drupal prior to version 7 had a set of functions that were referred to as a data abstraction layer. This was mostly a set of routines to clean up and rewrite SQL statements to normalize for certain installation variables and provide some level of protection against SQL injection. However, this abstraction layer did not provide an easy path to alternative database engines.

The Database API

When you did the initial configuration of your site, the setup program modified Drupal's initialization file, settings.php. One of the things you did was provide the name of your database and the logon credentials. This is the first step in accessing the database from your Drupal installation.

The database connection information is contained in the $databases array, which is loaded from settings.php when the bootstrapper starts your page. The array is represented in Listing 9-2.

Example 9.2. The $databases Array in settings.php

$databases = array (
  'default' => array (
    'default' => array (
'driver' => 'mysql',
      'database' => 'drupal7',
      'username' => 'brian',
      'password' => 'super-secret-password',
      'host' => 'localhost',
      'port' => '',
      'prefix' => '',
    ),
  ),
);

This is the default structure for the LAMP stack, which includes the MySQL database engine. As you can probably guess, it's possible to define more database instances in the settings file. For example, you might want to set up a master/slave configuration in order to provide some level of high availability or scaling. In this case, your $databases array might look like the code in Listing 9-3.

Example 9.3. Defining a Master/Slave Configuration

$databases['default']['default'] = array(
  'driver' => 'mysql',
  'database' => 'drupal7master',
  'username' => 'username',
  'password' => 'super-secret-master',
  'host' => 'dbserver1',
);
$databases['default']['slave'][] = array(
  'driver' => 'mysql',
  'database' => 'drupal7firstslave',
  'username' => 'firstslave',
  'password' => 'super-secret-slave-one',
  'host' => 'dbserver2',
);
$databases['default']['slave'][] = array(
  'driver' => 'mysql',
  'database' => 'drupal7secondslave',
  'username' => 'secondslave',
  'password' => 'super-secret-slave-two',
  'host' => 'dbserver3',
);

Both of these cases use the mysql driver. In Appendix A, I'll show you how to use Microsoft SQL Server as a database engine instead of MySQL.

In the next section, I'll describe the Drupal 7 database layer from the programmer's perspective.

Programming Against the Database API

In this section, I will cover the db_query() function, and then show an alternate technique for accessing data in a database-agnostic way. This assumes your database is configured and connected, and that the Drupal databases have been created using the installation scripts. You should have a nice clean environment ready to develop modules against.

There are two basic ways to select data from the database: static and dynamic.

Static Queries

A static query is a holdover from the days before Drupal 7's database abstraction layer. I recommend you avoid using static queries for reasons I'll mention later, but since there are plenty of static queries in the Drupal 7 core and contributed modules, let's take a look at them here.

The quickest way to get data out of a table is to use the db_query() function and pass a SQL string directly to it, then process the results that come back. This is shown in Listing 9-4.

Example 9.4. The db_query() Function

$result = db_query("SELECT nid AS my_nid, title FROM {node}");
  foreach ($result as $record) {
    $nid = $record->my_nid;
    $title = $record->title;
  }

The result is an object that contains information about the query, including the records that were retrieved from the database.

Note

Drupal has already connected to the database, so the db_query() function has all of the information it needs to get the data you desire. All that's required is a SQL query and you can then process the records.

To get the most out of the db_query() function, you should be aware of some its features. Some of these are optional, but others are required if your code is to have maximum security and extensibility.

Prefixing

In the db_query() function, all table names must be wrapped in curly braces. This is required so that Drupal can add a prefix string to the table name if it needs to. It's possible to share a single database instance with several Drupal installations. Insuch cases, you can establish a prefix that will be placed on each database table to distinguish which instance uses which set of tables. In your SELECT statement, then, you just need to put the generic name of the table, and if you add curly braces, the db_query() function will apply the appropriate table prefix for your site.

Placeholders

Placeholders are abstractions you can place in your query string to indicate the value of variables that will be replaced when the query string is parsed by the db_query() function. This abstraction allows the database to differentiate between raw SQL syntax and values provided by the user, thereby avoiding SQL injection.

Placeholders are set apart from the query by prefixing a string with a colon. The value of all placeholders must then be defined by creating a keyed array with their values, as shown in Listing 9-5.

Example 9.5. Placeholders in Action

global $user;
$uid = $user->uid;
$result = db_query (
'SELECT n.nid, n.title, n.created '
  . 'FROM {node} n '
  . 'WHERE n.uid = :uid AND n.created > :created'
  , array(
    ':uid' => $uid,
    ':created' => REQUEST_TIME - 3600,
    )
  );

  foreach ($result as $record) {
    $nid = $record->nid;
    $title = $record->title;
    $created = $record->created;
  }

This returns all nodes that have been created in the past hour (3,600 seconds). The REQUEST_TIME variable is a handy shortcut for the time the page is being processed (i.e., now).

Here are some important safety tips when using placeholders:

  • You can have virtually any number of placeholders in your query, but make sure they each have a unique name and are given a value in the array.

  • The order of the placeholder strings declared in the query need not match the order in which they are defined in the array.

  • Placeholders beginning with "db_" are reserved by Drupal for its internal use.

  • Placeholders should not be escaped or quoted regardless of their type. Since they are passed to the database separately, the database server can deal with them intelligently.

  • You can specify a placeholder that consists of a simple array that will be expanded into a comma-separated list. Listing 9-6 illustrates this feature.

Example 9.6. Using Arrays in a Placeholder

$result = db_query (
  'SELECT title, created '
  . 'FROM {node} '
  . 'WHERE uid IN (:uids)'
  , array(
      ':uids' => array(1, 3, 5, 7, 11),
      )
    );

foreach ($result as $record) {
  $title = $record->title;
  $created = $record->created;
}

Fetching Result Sets

The most common technique for accessing the record set returned by the query is by using the foreach() loop as in the preceding examples. There are other ways to fetch the records, though, depending on your situation. And there are three different functions to explicitly fetch the next record, depending on how you want to deal with the data.

  • $record = $result->fetch(); loads a record object just as if you had used the foreach() loop technique.

  • $record = $result->fetchObject(); loads the record into a stdClass object.

  • $record = $result->fetchAssoc(); loads the record as an associative array.

  • $record = $result->fetchField($column_index); loads just the field with the corresponding column number.

  • $records = $result->fetchAll(); loads all records into an array of stdClass objects.

  • $record_count = $result->rowCount(); returns the number of rows in the record set.

Be sure to check that the return value from the fetch functions is not FALSE before proceeding, which indicates you've reached the end of the record set.

One technique you'll see as you explore the Drupal core code is a shortcut made possible by PHP's support for chaining method calls. Listing 9-7 shows an example of how to access just a single field by querying the database and chaining the methods.

Example 9.7. Chaining Method Calls

// Get a keyed array of all nodes
$nodes = db_query("SELECT nid, title FROM {node}")->fetchAllKeyed();

// Extract just the title of a node
$title = db_query(
  'SELECT title FROM {node} '
  . 'WHERE nid = :nid',
  array(':nid' => $nid))->fetchField();

Now, why don't I like static queries? Because they send SQL statements from your module to the database. The designers of Drupal's database layer have done a pretty good job of sanitizing the database inputs using the escaped fields, so there's not much to worry about security-wise. But there is the issue of cross-database portability.

Consider a simple query to select the first ten records. For MySQL, the LIMIT keyword is used in the predicate:

SELECT title FROM table LIMIT 0, 10

For SQL Server, the TOP keyword is used:

SELECT TOP 10 title FROM table

If you are accustomed to working with MySQL, your query might look like the example in Listing 9-8.

Example 9.8. Using the LIMIT Keyword

$nodes = db_query('SELECT title FROM {node} LIMIT 0, 10')->fetchObject();

But what would happen if this query was executed against an underlying SQL Server database? It would break.

For this reason, the database layer has a range query function that can be used to act as a layer between your code and the database engine. The syntax for using the query is shown in Listing 9-9.

Example 9.9. Using the db_query_range Function

$result = db_query_range('SELECT title FROM {node}', 0, 10);

Drupal 7 encapsulates this functionality, along with much more, into a more abstract query syntax, dynamic queries, which I'll describe in the next section.

Tip

To maximize the portability of your code, use dynamic queries instead of static queries.

Dynamic Queries

These queries are built dynamically by Drupal and can be optimized for the underlying database engine. They eliminate the problem of cross-database incompatibility and truly make your code database-agnostic. This is important if you are moving to a database other than MySQL.

It is so important that the use of static queries for anything except SELECT queries is deprecated. Use them and some members of the community will be frustrated when you code doesn't workas it should for them.

Dynamic Select Query

You can create dynamic queries for SELECT, INSERT, UPDATE, DELETE and MERGE queries. The API has been written to allow as much reuse as possible between the query types, so let's first take a look at the methods available for SELECT.

Tip

There are a number of specialized functions for retrieving many common pieces of information from the database. For example, the user_load() function gets information about a particular user while the node_load() function gets everything you ever need to know about a particular piece of managed content. I suggest you use these built-in functions if they are available rather than building a db_select() function to do the same thing.

Now that I've said that, you'll see in my examples below that I do exactly what I just told you not to do. The reason is that the tables are available and filled with data on a freshly installed system so I can illustrate these database functions.

Before doing any query, check to be sure that there is not already a specialized function that provides what you need.

Simple Select Query

Creating a dynamic query for a SELECT requires providing the table name and some optional parameters as shown in Listing 9-10.

Example 9.10. Dynamic Query for SELECT

$query = db_select('node', 'n'),
$query
  ->condition('n.uid', 0, '<>')
  ->fields('n', array('nid', 'title', 'type', 'created'))
  ->range(0, 5);
$result = $query->execute();

foreach ($result as $record) {
  $title = $record->title;
  $type = $record->type;
  $created = $record->created;
  print($title . ' (' . $type . ') created: '
    . date("D, j M, Y a\t G:i", $created) . '<br>'),
}

This gives us the output shown in Figure 9-2.

Simple SELECT using a dynamic query

Figure 9.2. Simple SELECT using a dynamic query

The db_select() function is where we indicate the table we are using. There is no need to wrap the name in curly braces as in the preceding static query because the function already knows that's the table name. The second parameter is an alias for the table.

The condition property is pretty self-explanatory; I want nodesonly where the user ID is not zero.

The fields property lists the fields I want returned, and starts with the alias declared in the db_select() function.

Finally, the range property limits the record set to the first five records.

This simple example will produce different SELECT statements depending on the database engine. For MySQL, the query will look like this:

SELECT n.nid, n.title, n.type, n.created FROM node n WHERE n.uid <> 0 LIMIT 0, 5;

For SQL Server, it will generate a different SELECT statement:

SELECT TOP 5 n.nid, n.title, n.type, n.created FROM node n WHERE n.uid <> 0;

So we've solved the problem of cross-database incompatibility and made our code easier to read, test, and debug at the same time.

Joins

When you join two or more tables, a number of properties can define how the join is to be done.

  • The innerJoin() property does an inner join of two tables. The join() property is a convenience property that does the same thing.

  • The leftJoin() property does a left outer join of two tables.

  • The rightJoin() property does a right outer join of two tables.

Consider the task of getting information out of the node and user tables, which requires doing a join of two tables. Listing 9-11 shows how to do this with a static query.

Example 9.11. Static Query with a Join

$result = db_query('SELECT u.name, u.mail n.title, n.type '
  . 'FROM {node} n '
  . 'INNER JOIN {users} u on u.uid = n.uid'
  . 'WHERE u.uid=:uid'.
  array(':uid' => $user->uid));

Listing 9-12 shows the same thing with a dynamic query.

Example 9.12. Dynamic Query with a Join

$query = db_select('node', 'n'),
$query
  ->condition('u.uid', $user->uid, '=')
  ->fields('u', array('name', 'mail'))
  ->fields('n', array('title', 'type'))
  ->join('users', 'u', 'u.uid = n.uid'),
$result = $query->execute();

This gives us the output shown in Figure 9-3.

Output from a dynamic query with join

Figure 9.3. Output from a dynamic query with join

There are other combinations and permutations of the joins that you can read about in the Drupal documentation, but this should be enough to get you started.

Fields

In the code in Listing 9-12, you'll notice the fields() property. This property is used to indicate the columns you want returned from your query. The alias is specified first, followed by an array of field names corresponding to the column names in your table.

Note

Calling fields() without any field names will result in a SELECT * clause.

Notice that Listing 9-12 has two fields() methods. This is necessary because the query defines two different aliases. There is no room, however, to create field aliases using this syntax. If you want to have field aliases, you can use the addField() method for each field as shown in Listing 9-13.

Example 9.13. Creating Field Aliases Using the addField() Method

global $user;
$query = db_select('node', 'n'),
$query
  ->condition('u.uid', $user->uid, '=')
  ->join('users', 'u', 'u.uid = n.uid'),
$query->addField('u', 'name', 'user_name'),
$query->addField('u', 'mail', 'user_mail'),
$query->addField('n', 'title', 'node_title'),
$query->addField('n', 'type', 'node_type'),

$result = $query->execute();

foreach ($result as $record) {
  $name = $record->user_name;
  $mail = $record->user_mail;
  $title = $record->node_title;
  $type = $record->node_type;
  print($title . ' (' . $type . ') , by ' . $name . ' (' . $mail . ')' . '<br>'),
}

This results in the same output as the previous example, and builds a SELECT statement as shown in Listing 9-14.

Example 9.14. SELECT Statement Created Using addField()

SELECT
  u.name AS user_name,
  u.mail AS user_mail,
  n.title AS node_title,
  n.type AS node_type
FROM
  {node} n
INNER JOIN
  {users} u ON u.uid = n.uid
WHERE
  (u.uid = :db_condition_placeholder_0)

Note

The query placeholder :db_condition_placeholder_0 in Listing 9-14 will be filled in by the execute() method.

Besides being able to assign aliases to fields, the addField() function is handy if you are building a query using some looping construct. Before this capability, you had to use the string concatenation method to build up a list of fields.

The Distinct Property

If you wish to limit your record set to a distinct set of values, you can use the distinct() property as shown in Listing 9-15.

Example 9.15. Setting the distinct Flag

$query = db_select('node', 'n'),
  $query
    ->fields('n', array('title'));
    $query->distinct();
    $result = $query->execute();

Expressions

Expressions can be added to a query if you need to express something that is not otherwise available, as Listing 9-16 shows.

Example 9.16. Using Expressions to Add Functionality to Your Query

$query = db_select('node', 'n'),
$query
  ->condition('n.created', time() - (14 * 86400), '>')
  ->fields('n', array('title', 'created', 'type'));
$query->addExpression('created - :offset', 'offset_by_an_hour', array(':offset' => 3600));
$result = $query->execute();

foreach ($result as $record) {
  $name = $record->title;
  $type = $record->type;
  $created = $record->created;
  $offset = $record->offset_by_an_hour;
  print($name . ' (' . $type . ')'
    . ' created: ' . date("D, j M, Y a\t G:i", $created)
    . ' hour before: ' . date("D, j M, Y a\t G:i", $offset)
    . '<br>'),
}

This inserts a computed field that is the value of the created timestamp minus one hour, and returns all nodes created in the past fortnight (14 * 86400). (See Figure 9-4.)

Output from the query that used an expression

Figure 9.4. Output from the query that used an expression

Warning

Since you can insert expressions that are SQL functions, it's possible to use a function that is available on one database engine but not on another. Using this feature may therefore cause your application to be platform-dependent.

The other side of this story is that you will also be depending on other programmers who create or modify core or contributed modules. If someone does something that makeshermodule platform-dependent, and doesn't see that it fails on your platform, you should file a bug report as soon as you find the problem. This will do at least two things: First, it will notify the programmer that she did something that broke your system. Second, it is a form of reporting that other members of the community will see and it will help them realize the importance of being platform-independent.

This is especially important for readers of this book, who might be using databases other than the default MySQL engine.

Ordering

You use the orderBy() method to specify the collation order of a record set. Listing 9-17 shows an example.

Example 9.17. Ordering a Record Set

$query = db_select('node', 'n')
  ->condition('n.created', time() - (14 * 86400), '>'),
$query->fields('n', array('title', 'type', 'created'));
$query->orderBy('title', 'ASC'),
$result = $query->execute();

foreach ($result as $record) {
  $name = $record->title;
  $type = $record->type;
  $created = $record->created;
  print($name . ' (' . $type . ')'
. ' created: ' . date("D, j M, Y a\t G:i", $created)
    . '<br>'),
}

This creates the output shown in Figure 9-5.

Ordering a result set

Figure 9.5. Ordering a result set

You can also specify a random ordering by using the orderRandom() method. It is possible to chain ordering and random ordering together, as shown in Listing 9-18.

Example 9.18. Chaining Normal and Random Ordering

$query = db_select('block', 'b'),
$query
  ->condition('b.region', −1, '<>')
  ->fields('b', array('region', 'module', 'delta'));
$query->orderBy('module')->orderRandom();
$result = $query->execute();

foreach ($result as $record) {
  $region = $record->region;
  $module = $record->module;
  $delta = $record->delta;
  print($region . ', ' . $module . ', ' . $delta . '<br>'),
}

This orders first by region and then randomly orders the records when there are two or more records with the same region, as shown in Figure 9-6.

Output from the random order method

Figure 9.6. Output from the random order method

Some notes on the orderBy() method:

  • The second parameter may be "ASC" or "DESC" to indicate ascending or descending order. If the parameter is not present, the default is "ASC".

  • If you are using an aliased field name, the field name in the orderBy() method should be the alias, not the native column name. You can set an aliased field name by using the addField() method as described above.

  • To order by multiple fields, call orderBy() multiple times in the order desired.

Grouping

Use the groupBy() method to group by a given field. This is demonstrated in Listing 9-19.

Example 9.19. Using the groupBy() Method

$query = db_select('block', 'b'),
$query
  ->condition('b.region', −1, '<>')
  ->fields('b', array('region', 'module', 'delta'));
$query->addExpression('COUNT(region)', 'count'),
$query->orderBy('region'),
$query->orderBy('module'),
$query->groupBy('region'),
$result = $query->execute();
foreach ($result as $record) {
  $region = $record->region;
  $module = $record->module;
  $delta = $record->delta;
  $count = $record->count;
  print('(' . $count . ') ' . $region . ', ' . $module . ', ' . $delta . '<br>'),
}

Notice that this example also uses the addExpression() method to add a count of records that are consumed in each group. The output of this code fragment is shown in Figure 9-7.

Output from the groupBy() example

Figure 9.7. Output from the groupBy() example

Some notes on the groupBy() method:

  • If you are using an aliased field name, the field name in the groupBy() method should be the alias, not the native column name. You can set an aliased field name using the addField() method as described above.

  • To order by multiple fields, call groupBy() multiple times in the order desired.

Ranges and Limits

The range() property can be used to limit the query to a certain number of records. It is called with two parameters that indicate the record to start with and the number of records to return from that point. An example is shown in Listing 9-20.

Example 9.20. Using the range() Property

$query = db_select('role', 'r'),
$query
  //->condition('p.module', 'node', '=')
  ->condition('r.name', 'administrator', '<>')
  ->fields('r', array('name'))
  ->fields('p', array('permission', 'module'))
  ->join('role_permission', 'p', 'r.rid = p.rid')
;
$query->orderRandom();
//$query->orderBy('name'),
//$query->orderBy('module'),
//$query->orderBy('permission'),
$query->range(5, 8);
$result = $query->execute();

foreach ($result as $record) {
  print($record->name . ', ' . $record->module . ', ' . $record->permission . '<br>'),
}

This limits the query to return eight records, starting with the fifth one. The output is shown in Figure 9-8.

Output from a range-limited query

Figure 9.8. Output from a range-limited query

Some notes on the range() method:

  • To return just the first n records, set the first parameter to 0: ->range(0,10).

  • Calling range() subsequent times will overwrite previously set values.

  • Calling range() with no parameters will remove any range restrictions from the query.

Counting

The countQuery() method returns the number of rows returned by any query. Listing 9-21 shows an example.

Example 9.21. Counting the Records Returned from a Query

$query = db_select('role_permission', 'p'),
$query
  ->condition('p.module', 'node', '=')
  ->fields('p', array('permission'))
;
$num_rows = $query->countQuery()->execute()->fetchField();

This outputs what you'd expect, as Figure 9-9 shows.

Counting rows

Figure 9.9. Counting rows

Query Alteration

Another advantage of dynamic queries over static queries is that other modules can alter the query before it is executed against the database. Since the query is expressed in a Drupal-smart form instead of in raw SQL, we can tag the query with metadata that lets us identify a query that another module might be interested in.

This tagging is done with the addTag() method, as shown in Listing 9-22.

Example 9.22. The addTag() Method Provides Information for Oher Modules

$query = db_select('role', 'r'),
$query
  ->condition('p.module', 'node', '=')
  ->condition('r.name', 'administrator', '=')
  ->fields('r', array('name'))
  ->fields('p', array('permission', 'module'))
  ->join('role_permission', 'p', 'r.rid = p.rid')
;
$query->orderBy('name'),
$query->orderBy('module'),
$query->orderBy('permission'),
$query->addTag('get_permissions'),
$result = $query->execute();

foreach ($result as $record) {
  print($record->name . ', ' . $record->module . ', ' . $record->permission . '<br>'),
}

The output, shown in Figure 9-10, displays all node permissions granted to administrator. The query is sorted by name, then module, then permission, all ascending.

A query before altering

Figure 9.10. A query before altering

Notice that the tag, 'get_permissions' was added to this query. Tagging the query just marks the query with some descriptive tags and doesn't cause any immediate processing. However, it is possible to create a hook that will be invoked when a tagged query is encountered.

You can put any tag on your query but there are some conventions used in the core modules:

  • node_access: This query should have node access restrictions placed on it.

  • translatable: This query should have translatable columns.

  • term_access: This query should have taxonomy term-based restrictions placed on it.

  • views: This query is generated by the views module.

All dynamic queries are passed through the query alter hook when the execute() method is called and just before the query string is compiled. That means modules have the opportunity to alter the query before it goes to the database.

The format for the hook is module_name_query_tag_name_alter, where module_name is the name of our module, and tag_name is the name of the tag we added. The hook function that captures the query is shown in Listing 9-23.

Example 9.23. A Query Alter Hook for a Tagged Query

function weather_info_query_get_permissions_alter(QueryAlterableInterface $query) {
  $order =& $query->getOrderBy();
  $order['permission'] = 'DESC';
}

This function will modify the ordering parameters, changing the order of 'permission' from 'ASC' to 'DESC'. Notice that the query is passed as a type of QueryAlterableInterface. Even though it is not passed by reference, it acts as though it is because of the way PHP 5 handles objects. It is essential to grab the properties you are looking for using the reference assignment (=&) so that anything that's altered affects the original query string.

Note

Be sure you don't execute the query inside the query alter hook, as this will cause an infinite loop.

Table 9-1 shows the properties that are available to the query alter hook.

Table 9.1. Properties That Can Be Modified on the Select Query Object

Artifact

Description

conditions();

Returns a reference to the conditions in the WHERE clause for this query.

getExpressions();

Returns a reference to the expressions array for this query.

getFields();

Returns a reference to the fields array for this query.

getGroupBy();

Returns a reference to the group-by array for this query.

getOrderBy();

Returns a reference to the order by array for this query.

getTables();

Returns a reference to the tables array for this query.

getUnion();

Returns a reference to the union queries for this query. This includes queries for UNION, UNIONALL, and UNION DISTINCT.

havingConditions();

Helper function to build most common HAVING conditional clauses.

There is no need to take any action inside the query alter hook. As soon as the function is completed, the query engine will look for any other alter hooks that need to be executed, and then it will compile the query string and send it to the database engine.

The result of our query, having been created in our module and altered to change the sort order, is shown in Figure 9-11.

The result of an altered query

Figure 9.11. The result of an altered query

In sum, the dynamic query model provides all of the functionality of the static query model, with the added benefit of being cross-database compatible. I believe it's also easier to read and maintain due to its self-documenting mode of expression.

Many of the methods that are available for SELECT queries are also available for other queries, as we will see in the next sections.

Dynamic Insert Query

Inserting records into a database can be done with the db_insert() function, which uses syntax similar to the db_select() function described earlier.

Tip

There are built-in methods that are designed to insert information into certain database tables. For example, the node_save() function writes a node to the database and, if necessary, updates several tables that form the basis of the node hierarchy. If you create a record in the node table and don't update the rest of the tables, you could end up with an unstable system.

The dynamic insert query method should be used with caution, and you should know the interdependencies of all of the data you are inserting before you use this function.

In the following sections, I will be using a table we created using the Schema API, which is described in Chapter 10. Figure 9-12 shows the layout of the table and some sample data.

Sample database table to illustrate the database API

Figure 9.12. Sample database table to illustrate the database API

Using the db_insert() function in its basic form requires just specifying the name of the table and the fields and values you want to insert. Listing 9-24 shows an example of the db_insert() function.

Example 9.24. The db_insert() Function Provides a Database-Agnostic Insert

$fields = array(
  'location' => 'San Luis Obispo',
  'language' => 'en',
  'units' => 'R',
  'uid' => 1,
  'timestamp' => REQUEST_TIME,
  );
$hid = db_insert('weather_info')
  ->fields($fields)
  ->execute();

Notice the $fields variable. This is a keyed array with the key being the name of the column in the table and the value being the value you want to assign to the column in the new row. Executing the db_insert() function will return, by default, the value of the primary key of the inserted row. This can be changed by setting the return value of the $options parameter, as shown in Listing 9-25.

Example 9.25. Changing the Options on an Insert

$fields = array(
  'location' => 'San Luis Obispo',
  'language' => 'en',
  'units' => 'R',
  'uid' => 1,
  'timestamp' => REQUEST_TIME,
  );
$options = array('return' => Database::RETURN_STATEMENT);
$statement = db_insert('weather_info', $options)
  ->fields($fields)
  ->execute();

This example returns the statement that was created by the database abstraction layer instead of the value of the inserted key. The options available are:

  • target: This is the database target against which to execute a query. Valid values are "default" or "slave". This is used if you are running a high-availability configuration with more than one database.

  • fetch: This element controls how rows from a result set will be returned. Legal values include PDO::FETCH_ASSOC, PDO::FETCH_BOTH, PDO::FETCH_OBJ, PDO::FETCH_NUM, or a string representing the name of a class. If a string is specified, each record will be fetched into a new object of that class. The behavior of all other values is defined by PDO. See http://www.php.net/PDOStatement-fetch for more information.

  • return: Depending on the type of query, different return values may be meaningful. This directive instructs the system which type of return value is desired. The system generally sets the correct value automatically, so it is rare that a module developer will ever need to specify this value. Setting the value incorrectly would likely lead to unpredictable results or fatal errors. Legal values include:

    • Database::RETURN_STATEMENT: Returns the prepared statement object for the query. This is usually meaningful only for SELECT queries, where the statement object is how one accesses the result set returned by the query.

    • Database::RETURN_AFFECTED: Returns the number of rows affected by an UPDATE or DELETE query. Note that this means the number of rows that actually changed, not the number of rows matched by the WHERE clause.

    • Database::RETURN_INSERT_ID: Returns the sequence ID (primary key) created by an INSERT statement on a table that contains a serial column.

    • Database::RETURN_NULL:Returnsnothing, as there is no meaningful value to return. This is the case for INSERT queries on tables that don't contain a serial column.

  • throw_exception: By default, the database system will catch any error on a query as an exception, log it, and then rethrow it so that code further up the call chain can take an appropriate action. To suppress that behavior and return NULL on failure, set this option to FALSE.

Like db_select(), the db_insert() function can be specified in a more compact form, eliminating the need to hold certain values as variables. This is shown in Listing 9-26.

Example 9.26. Compact Form for the db_insert() Function

$returned = db_insert('weather_info', array(
    'return' => Database::RETURN_STATEMENT
  ))
  ->fields(array(
    'location' => 'San Luis Obispo',
    'language' => 'en',
    'units' => 'R',
    'uid' => 1,
    'timestamp' => REQUEST_TIME,
    ))
  ->execute();

Multi-Insert Form

The db_insert() function can be executed with more than one record being inserted at a time. To do this, you must use a "degenerate" form of the query. Listing 9-27 shows the form using the query we've been working with.

Example 9.27. The Degenerate Form of db_insert()

$hid = db_insert('weather_info')
  ->fields(array(
      'location',
      'language',
      'units',
      'uid',
      'timestamp'))
  ->values(array(
      'location' => 'San Luis Obispo',
      'language' => 'en',
      'units' => 'R',
      'uid' => 1,
      'timestamp' => REQUEST_TIME,
      ))
  ->execute();

Notice that the fields and values are specified separately. By using this form, we can specify multiple sets of values and then execute them all at once, as shown in Listing 9-28.

Example 9.28. Executing a Multiple-Record Insert

$hid = db_insert('weather_info')
  ->fields(array(
      'location',
      'language',
      'units',
'uid',
      'timestamp'
      ))
  ->values(array(
'San Luis Obispo',
      'en',
      'R',
      1,
      REQUEST_TIME,
      ))
  ->values(array(
      'Cupertino',
      'en',
      'F',
      1,
      REQUEST_TIME,
      ))
  ->values(array(
      'San Diego',
      'es',
      'C',
      1,
      REQUEST_TIME,
      ))
  ->execute();

Another way to execute this multi-record insert would be to create an array and then iterate over it using the foreach() control structure, as shown in Listing 9-29.

Example 9.29. Processing an Array of Values

$values = array(
    array('San Luis Obispo', 'en', 'R', 1, REQUEST_TIME),
    array('Cupertino', 'en', 'F', 1, REQUEST_TIME),
    array('San Diego', 'es', 'C', 1, REQUEST_TIME),
    );

  $query = db_insert('weather_info')
    ->fields(array(
       'location',
       'language',
       'units',
       'uid',
       'timestamp'
       ));
  foreach($values as $record) {
    $query->values($record);
  }
  $hid = $query->execute();
  print $hid;

This method is used often in the core as well as in many contributed modules, so you should feel comfortable with it. Note that if you are using the multi-insert technique, the return value is undefined and should not be trusted as it may vary depending on the database driver.

Not only is this method a bit more straightforward than executing three separate inserts, it is potentially safer and more efficient. It is up to the database driver to determine the best way to do this insert. For example, multiple insert statements could be executed together inside a transaction for greater data integrity and speed. Or, if the database engine supports batch inserts, the driver might use that approach instead.

Dynamic Update Query

You can use the db_update() function to update data in a table. Listing 9-30 shows an example.

Example 9.30. The db_update() Function

$rows_updated = db_update('weather_info')
  ->fields(array(
      'language' => 'EN'
      ))
  ->condition('language', 'en', '=')
  ->execute();

This function generates the following SQL query:

UPDATE weather_info SET language='EN' WHERE language='en';

The value returned is the number of rows that are affected, which is not necessarily the same as the number of rows that are actually changed. For example, if there are three rows where the language is 'en' and two where the language is 'EN', the two uppercased rows will be matched, but since they didn't change, they will not be counted in the return value. The function will return a 3.

Dynamic Delete Query

The db_delete() function works similarly to the db_update() function in that it requires a condition and returns the number of records affected. An example is shown in Listing 9-31.

Example 9.31. The db_delete() Function

$rows_deleted = db_update('weather_info')
  ->condition('location', 'San Luis Obispo', '=')
  ->execute();

This returns the number of records that were deleted.

Dynamic Merge Query

Merge queries are a special type of hybrid query. A merge query is a combination of an insert and an update. If a given condition is met, usually a row with the primary key, an update query is run. If the specified key is not found, an insert query is run. Although there is a formal specification for merge in the SQL standard, many databases don't implement the feature, so it is the job of the database driver to encapsulate the logic for the Drupal db_merge() function.

The syntax is similar to the other dynamic query functions. Listing 9-32 shows an example.

Example 9.32. A Dynamic Merge Query Using db_merge()

$records_merged = db_merge('weather_info')
  ->key(array('location' => 'San Diego'))
  ->fields(array(
      'location' => 'San Diego, CA',
      'language' => 'es',
      ))
  ->execute();

If the merge function finds the location 'San Diego, CA' in one or more of the records, those records will be updated with a new location and language. Otherwise, the function will attempt to insert a new row. Note, however, that if the merge function needs to insert a new row, it will fail. Why? Because the table has fields that don't have a default value and aren't specified in the fields array. To fix that, we need to specify two different sets of fields for the two different conditions that the merge query might encounter. An example of this is shown in Listing 9-33.

Example 9.33. Adding the updateFields() Object

$records_merged = db_merge('weather_info')
  ->key(array('location' => 'San Francisco, CA'))
  ->fields(array(
      'location' => 'San Francisco, CA',
      'language' => 'us',
      'units' => 'F',
      'uid' => 1,
      'timestamp' => REQUEST_TIME,
      ))
  ->updateFields(array(
      'language' => 'es',
      ))
  ->execute();

In this case, if the key is not found, the merge function will insert a new record using the fields() array. If the record does exist in the database, only the language field will be updated. Note that if there is more than one record that matches the key condition, each of the rows will be updated.

Tip

When you are using this form of the db_merge() function, be sure to specify the updateFields() array after the fields() array.

Expressions

Expressions can be used to do some computation when a record is inserted. An example is shown in Listing 9-34.

Example 9.34. Using Expressions on the db_merge() Function

db_merge('search_index')
  ->key(array(
    'word' => $word,
    'sid' => $sid,
    'type' => $module,
  ))
  ->fields(array('score' => $score))
  ->expression('score', 'score + :score', array(':score' => $score))
  ->execute();

This is commonly used where there is some sort of counter that keeps track of similar records in the table. The example in Listing 9-34 is used by the search function. In this case, if the word doesn't exist in the search_index table, it is added with the precalculated score value. If it is already there, the score value is incremented by the score value.

Conditional Clauses

The db_select(), db_update(), and db_delete() functions can make use of a conditional clause that is eventually turned into the appropriate conditional clause for the target database server.

There are two basic ways of specifying conditions in your query:

  • ->condition(): This specifies a condition that defines a field and a value with an operator that says what the relationship is between the two. This is the preferred method because of its abstract nature. A simple example is:

    $query->condition('timestamp', REQUEST_TIME - 3600, '<'),
  • ->where(): For more complex clauses, the where method allows arbitrary SQL as a conditional fragment. This can contain a string that will be placed following a SQL WHERE statement, or it can contain string replacements, as shown below:

    $query->where("$alias.name <> test.name");
    $query->where('age > :age', array(':age' => 26));

    This form is deprecated, as it could make your code incompatible with other database engines.

In addition to the simple comparison operators (=, <, >, <=, >=), there are operators that require an array, as shown in Listing 9-35.

Example 9.35. Array Operators for the condition() Property

$query->condition('name',
  array(
    'location', 'language', 'unit',
    ),
  'IN'
);
// Becomes: name IN ('location', 'language', 'unit')


$query->condition('timestamp',
  array(REQUEST_TIME – 7200, REQUEST_TIME - 3600),
  'BETWEEN'
);
// Becomes: timestamp BETWEEN 1291305433 AND 1291309033

Nested Conditionals

Conditions can also be nested by specifying multiple condition() properties. This is shown in Listing 9-36.

Example 9.36. Conditions Can Be Nested

$query
  ->condition('field1', array(1, 2), 'IN')
  ->condition(db_or()->condition('field2', 5)->condition('field3', 6));
// Becomes: (field1 IN (1, 2) AND (field2 = 5 OR field3 = 6))

The default when specifying multiple conditions is to "AND" them together. We see this with the two conditions that are on separate lines. The db_or() function also takes multiple condition() properties but returns a SQL string set to "OR" all included conditions together. Combining that with the first condition gives us the desired conditional string.

Null Values

There are two methods for specifying the "nullness" of a field in the query, as shown in Listing 9-37.

Example 9.37. Determining Nullness

$query->isNull('language'),
// Becomes (language IS NULL)

$query->isNotNull('language'),
// Becomes: (language IS NOT NULL)

Error Handling

The database abstraction layer sits on top of Drupal's error-handling framework. This means errors that occur while accessing the database will be displayed and logged, depending on the site-wide settings for error logging.

If an error is detected by the database layer, it is thrown back up to the calling function until someone catches it. You can use try...catch structures to deal with problems in an intelligent way. Listing 9-38 shows an example.

Example 9.38. Trapping Errors Using try...catch

try {
  $records_merged = db_merge('weather_info')
    ->key(array('location' => 'San Francisco, ZA'))
    ->fields(array(
        'location' => 'San Francisco, CA',
        'language' => 'us',
        ))
    ->updateFields(array(
        'language' => 'es',
        ))
    ->execute();
}
catch(exception $err) {
  drupal_set_message("Whoops! Trouble merging data: $err", 'error'),
}

This sort of logic doesn't do much; I include it here just as an example. Using try...catch to handle errors does have an impact, however, if you are using transactions.

Transactions

The Drupal database abstraction layer supports transactions and includes a provision for creating a transparent fallback for databases that don't support transactions. Transactions can get quite complicated in certain situations, so transaction-processing behavior varies among databases.

In order to deal with nested transactions, the database abstraction layer sets up a framework using constructors and destructors. This is necessary because one PHP function might start a transaction and call another function that also starts a transaction. If the inner function were to end the transaction, it might unwittingly end the calling function's transaction.

Transactions are started using the db_transaction() object. They are committed once in the scope in which the db_transaction() object is created, and can be rolled back if necessary. Once the db_transaction() object goes out of scope, the transaction is committed and is torn down. Listing 9-39 shows an example using a try...catch block.

Example 9.39. Transactions in Action

$txn = db_transaction();

try {
  $records_merged = db_merge('weather_info')
    ->key(array('location' => 'San Francisco, ZA'))
    ->fields(array(
        'location' => 'San Francisco, CA',
        'language' => 'us',
        ))
    ->updateFields(array(
        'language' => 'es',
        ))
    ->execute();
}

catch(exception $err) {

  drupal_set_message('Whoops! Trouble merging data: ' .
    $err->xdebug_message, 'error'),
  $txn->rollback();
}

The catch will result in the message being sent to the screen, as shown in Figure 9-13.

Error caught and displayed

Figure 9.13. Error caught and displayed

When $txn is destroyed, the transaction will be committed. If your transaction is nested inside of another transaction, Drupal will track each transaction and only commit the outermost transaction when the last transaction object goes out of scope, that is, when all relevant queries complete successfully.

Summary

Drupal grew up using the LAMP stack, which utilizes MySQL as its database engine. The Drupal core, as well as most contributed and private modules, assumed that the database was MySQL and there are many instances of SQL statements written directly for that platform.

With version 7 of Drupal, the emphasis is on writing to an abstracted layer instead of assuming MySQL as the underlying data engine. Drupal's database layer now provides a database-agnostic way to manage content.

Even with this abstracted layer, however, you will surely find some code examples that write directly to a database, assuming it is MySQL. This may cause modules to break when they are deployed to a platform that does not use MySQL.

Keep this in mind as you develop your database interactions. Avoid embedding any SQL at all, but if you must, check to make sure that it is compatible with all database engines the code is likely to be deployed to.

In the next chapter, we will see what is required to deliver your module and provide updates as it goes through its lifecycle.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.191.210.205