Hour 14. Accessing Data from MongoDB in PHP Applications


What You’ll Learn in This Hour:

Paging documents from a large dataset in PHP

Limiting which fields are returned from documents in PHP

Using methods to generate lists of distinct field values in documents from a collection in PHP

Implementing grouping from PHP to group documents and build a return dataset

Applying an aggregation pipeline to build a dataset from documents in a collection from PHP


In this hour, you continue your exploration of the PHP MongoDB driver and how to use it to retrieve data in your PHP applications. This hour takes you through the process of limiting the results returned to a PHP application by limiting the number of documents returned, limiting the fields returned, and paging through large sets.

Then this hour covers how to implement distinct, grouping, and aggregation operations from a PHP application. These operations enable you to process data on the server before returning it to the PHP application, reducing the amount of data sent and the work required in the application.

Limiting Result Sets Using PHP

On larger databases with more complex documents, you often want to limit what is returned by requests to reduce the load on the network and the memory used on both the server and the client. You have three ways to limit the result sets that match a specific query. You can accept only a limited number of documents, you can limit the fields that are returned, or you can page the results and retrieve them in chunks.

Limiting Results by Size in PHP

The simplest method of limiting the amount of data returned in a find() or other query request is to use the limit() method on the MongoCursor object returned by the find() operation. The limit() method limits only the cursor so that it returns a fixed number of items. This can save you from accidentally retrieving more objects than your application can handle.

For example, the following PHP code displays only the first 10 documents in a collection, even though there could be thousands:

$cursor = $wordsColl->find();
$cursor->limit(10);
while($cursor->hasNext()){
  $word = $cursor->getNext();
  print_r($word);
}

Limiting Fields Returned in Objects in PHP

Another extremely effective method of limiting the resulting data when retrieving documents is to limit which fields are returned. Documents might have a lot of fields that are useful in some circumstances but not others. Consider which fields to include when retrieving documents from the MongoDB server and request only the ones necessary.

To limit the fields returned from the server in a find() operation on a MongoCollection object, you can use the fields parameter, which is an Array that contains the field names, each with a value of true to include the field or false to exclude it.

For example, to exclude the fields stats, value, and comments when returning documents, you would use null for the query object because you are finding all objects, and you would include the following fields object:

$fields = array('stats' => false, 'value' => false, 'comments' => false);
$cursor = $myColl->find(null, $fields);

Including just a few fields is often easier. For example, if you want to include only the word and size fields of documents in which the first field equals t, you would use

$query = array('first' => 't');
$fields = array('word' => true, 'size' => true);
$cursor = $myColl->find($query, $fields);
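When the list of fields comes from configuration or user input, it can be convenient to build the fields array from a plain list of names. The following is a minimal sketch of that idea. The buildFields() function is a hypothetical helper, not part of the PHP MongoDB driver, and the find() call is shown only as a comment because it assumes a live $myColl collection:

```php
<?php
// Hypothetical helper (not part of the MongoDB driver): builds the
// fields array expected by find() from a plain list of field names.
function buildFields(array $names, $include = true)
{
    // array_fill_keys maps every name to the same true/false flag.
    return array_fill_keys($names, (bool) $include);
}

// Include only word and size.
$include = buildFields(array('word', 'size'));

// Exclude stats, value, and comments.
$exclude = buildFields(array('stats', 'value', 'comments'), false);

// With a live collection, either array would be passed as the
// second argument to find(), for example:
// $cursor = $myColl->find(array('first' => 't'), $include);
```

Keeping the include and exclude cases in separate arrays matches the two examples above, each of which uses only one kind of flag.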

Paging Results in PHP

A common method of reducing the number of documents returned is to use paging. Paging involves specifying a number of documents to skip in the matching set, as well as specifying a limit on the documents returned. The skip value is then incremented each time by the number of documents returned the previous time.

To implement paging on a set of documents, you need to implement the limit() and skip() methods on the MongoCursor object. The skip() method enables you to specify a number of documents to skip before returning documents.

By incrementing the value used in the skip() method by the size used in limit() each time you get another set of documents, you can effectively page through the dataset.

For example, the following statements find documents 11–20:

$cursor = $collection->find();
$cursor->limit(10);
$cursor->skip(10);

Always include a sort option when paging data, to ensure that the order of documents is the same.
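The skip and limit arithmetic can be wrapped in a small function so that the application works with page numbers instead of raw offsets. This is only a sketch: pageWindow() is a hypothetical helper, and the cursor calls are left as comments because they assume an existing $collection object and that sorting on the word field gives a stable order in your data:

```php
<?php
// Hypothetical helper: converts a 1-based page number into the
// skip and limit values used on the cursor.
function pageWindow($page, $pageSize)
{
    $page = max(1, (int) $page);   // treat anything below 1 as page 1
    return array(
        'skip'  => ($page - 1) * $pageSize,
        'limit' => $pageSize,
    );
}

$window = pageWindow(2, 10);   // page 2 covers documents 11-20

// With a live collection ($collection is assumed to exist):
// $cursor = $collection->find();
// $cursor->sort(array('word' => 1));   // same order on every page
// $cursor->skip($window['skip']);
// $cursor->limit($window['limit']);
```

Calling pageWindow(3, 10) gives a skip value of 20, which corresponds to documents 21 through 30.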

Finding Distinct Field Values in PHP

A useful query against a MongoDB collection is to get a list of the distinct values for a single field in a set of documents. Distinct means that even though thousands of documents exist, you want to know only the unique values.

The distinct() method on MongoCollection objects enables you to find a list of distinct values for a specific field. The syntax for the distinct() method follows:

distinct(key, [query])

The key parameter is the string value of the field name you want to get values for. You can specify subdocuments using the dot syntax, such as stats.count. The query parameter is an object with standard query options to limit the documents used to evaluate for distinct field values.

For example, to find the distinct last names of users over 65 in a collection that has documents with first, last, and age fields, you would use the following operation:

$query = array('age' =>
    array('$gt' => 65));
$lastNames = $myCollection->distinct('last', $query);

The distinct() method returns an array with the distinct values for the field specified. For example:

["Smith", "Jones", ...]
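To make the semantics of distinct() concrete, the following sketch performs the equivalent work in plain PHP on a small in-memory array. The sample documents are invented for illustration. On a real collection you would let the server do this work rather than pulling every document into the application:

```php
<?php
// Sample documents standing in for a collection (invented data).
$users = array(
    array('first' => 'Ann',  'last' => 'Smith', 'age' => 70),
    array('first' => 'Bob',  'last' => 'Jones', 'age' => 68),
    array('first' => 'Cara', 'last' => 'Smith', 'age' => 81),
    array('first' => 'Dan',  'last' => 'Young', 'age' => 40),
);

// Equivalent of the query array('age' => array('$gt' => 65)).
$matching = array_filter($users, function ($doc) {
    return $doc['age'] > 65;
});

// Equivalent of distinct('last', $query): keep each value only once.
$lastNames = array_values(array_unique(array_column($matching, 'last')));

print_r($lastNames);
```

Here Smith appears twice among the matching documents but only once in the result, and Young is left out because that document does not match the query.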

Grouping Results of Find Operations in PHP Applications

When performing operations on large datasets in PHP, it is often useful to group the results based on the distinct values of one or more fields in a document. You could do this in code after retrieving the documents, but it is much more efficient to have the MongoDB server do it for you as part of a single request that is already iterating through the documents.

In PHP, to group the results of a query, you can use the group() method on the MongoCollection object. The group request collects all the documents that match a query, adds a group object to an array based on distinct values of a set of keys, performs operations on the group objects, and returns the array of group objects.

The syntax for the group() method follows:

group(key, initial, reduce, [options])

The key and initial parameters are Arrays that define the fields to group on and the initial values of each group object. The reduce parameter is a String that contains a JavaScript function that is run on the server to reduce the documents into the group objects. The optional options parameter is an Array that can contain a condition element, which holds the query that limits the documents being grouped, and a finalize element, which holds a string form of a JavaScript function that finalizes each group object. See Hour 9, “Utilizing the Power of Grouping, Aggregation, and Map Reduce,” for more information on these parameters.

To illustrate, the following code implements a basic grouping by generating the key, cond, and initial objects and passing in a reduce function as a string:

$key = array('first' => true);
$cond = array('last' => 'a', 'size' => 5);
$initial = array('count' => 0);
$reduce = "function (obj, prev) { prev.count++; }";
$options = array('condition' => $cond);
$results = $collection->group($key, $initial, $reduce, $options);

The result from the group() method is an Array object that contains the grouped results in an element named retval. The retval element is a list of the aggregated results. To illustrate, the following code accesses and displays the items in the grouped results one at a time:

foreach($results['retval'] as $idx => $result){
    print_r(json_encode($result)." ");
}
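If it is unclear what the server does with the key, condition, initial, and reduce values, the following plain PHP sketch imitates the same steps on a small in-memory array. The sample words are invented for illustration, and the loop is only a model of the behavior described above, not the driver's implementation:

```php
<?php
// Sample documents standing in for the words collection (invented data).
$words = array(
    array('word' => 'alpha', 'first' => 'a', 'last' => 'a', 'size' => 5),
    array('word' => 'aroma', 'first' => 'a', 'last' => 'a', 'size' => 5),
    array('word' => 'comma', 'first' => 'c', 'last' => 'a', 'size' => 5),
    array('word' => 'cat',   'first' => 'c', 'last' => 't', 'size' => 3),
);

$groups = array();
foreach ($words as $doc) {
    // Condition: array('last' => 'a', 'size' => 5)
    if ($doc['last'] !== 'a' || $doc['size'] !== 5) {
        continue;
    }
    // Key: one group object per distinct value of the first field.
    $key = $doc['first'];
    if (!isset($groups[$key])) {
        // Initial: array('count' => 0)
        $groups[$key] = array('first' => $key, 'count' => 0);
    }
    // Reduce: prev.count++
    $groups[$key]['count']++;
}

// Plays the role of the retval element in the group() result.
$retval = array_values($groups);
print_r($retval);
```

The word cat is filtered out by the condition, so the result holds one group for a with a count of 2 and one group for c with a count of 1.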

Using Aggregation to Manipulate the Data During Requests from PHP Applications

Another valuable tool when working with MongoDB in PHP applications is the aggregation framework. The MongoCollection object provides the aggregate() method to perform aggregation operations on data. The syntax for the aggregate() method follows:

aggregate(operator, [operator, ...])

The operator parameter is one or more operator objects that together form the pipeline for aggregating the results. Each operator object is an Array built with one of the aggregation operators. Hour 9 described the aggregation operators, so you should already be familiar with them.

As an example, the following code creates $group and $limit operators. The $group operator uses the value of the word field as the _id of each group and adds an average field that uses the $avg operator to average a field named size. Notice that field names used as values must be prefixed with $ in aggregation operations:

$group = array('$group' =>
           array('_id' => '$word',
                 'average' => array('$avg' => '$size')));
$limit = array('$limit' => 10);
$result = $collection->aggregate($group, $limit);

The result from the aggregate() method is an Array object that contains the aggregation results in an element named result. The result is a list of the aggregated results. To illustrate, the following code accesses and displays the items in the aggregated results one at a time:

foreach($result['result'] as $idx => $item){
   print_r(json_encode($item)." ");
}
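Because each pipeline stage is an ordinary PHP array, you can build a pipeline and inspect it before sending it to the server. The following sketch assembles a three-stage pipeline and prints it with json_encode() so that it can be compared with the shell syntax from Hour 9. The choice of stages and the count field name are assumptions made for illustration, and the aggregate() call is commented out because it assumes an existing $collection object:

```php
<?php
// Stage 1: keep only documents whose first field equals t.
$match = array('$match' => array('first' => 't'));

// Stage 2: group by the size field and count the documents per group.
$group = array('$group' => array(
    '_id'   => '$size',
    'count' => array('$sum' => 1),
));

// Stage 3: largest groups first.
$sort = array('$sort' => array('count' => -1));

$pipeline = array($match, $group, $sort);

// Print the pipeline in a form that resembles the shell syntax.
echo json_encode($pipeline), "\n";

// With a live collection ($collection is assumed to exist):
// $result = $collection->aggregate($pipeline);
```

Single-quoted strings are used throughout so that PHP does not try to interpolate names such as $size as variables.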

Summary

In this hour, you learned how to use additional methods on the MongoCollection and MongoCursor objects. You learned that the limit() method can reduce the documents the cursor retrieves and that using limit() and skip() enables you to page through a large dataset. Using a fields parameter on the find() method enables you to reduce the number of fields returned from the database.

This hour also covered applying the distinct(), group(), and aggregate() methods on the MongoCollection object to perform data gathering operations from a PHP application. These operations enable you to process data on the server before returning it to the PHP application, reducing the amount of data sent and the work required in the application.

Q&A

Q. Will the PHP MongoDB driver objects throw exceptions when they encounter errors?

A. Yes. The PHP MongoDB driver includes several exception objects, such as MongoException and MongoCursorException, that can be thrown if errors occur inside the driver code.

Q. Is there a way to convert PHP Array objects to and from BSON objects?

A. Yes. The PHP MongoDB driver provides the bson_encode(array) and bson_decode(BSON) functions to encode arrays as BSON and to decode BSON back into arrays.

Workshop

The workshop consists of a set of questions and answers designed to solidify your understanding of the material covered in this hour. Try answering the questions before looking at the answers.

Quiz

1. How do you get the documents 21–30 represented by a MongoCursor object in PHP?

2. How do you find the different values for a specific field on documents in a collection in a PHP application?

3. How do you return the first 10 documents from a collection in PHP?

4. How do you prevent a specific field from being returned from a database query in PHP?

Quiz Answers

1. Call limit(10) and skip(20) on the MongoCursor object.

2. Use the distinct() method on the MongoCollection object.

3. Use the limit(10) method on the MongoCursor object.

4. Set the field to false in the fields object passed to the find() method.

Exercises

1. Write a new PHP application that finds words in the example dataset that start with n, sorts them by length descending, and then displays the top five.

2. Extend the PHPAggregate.php file to include a function that performs an aggregate that matches the words with a length of 4, limits it to only five documents, and finally projects the word as the _id field and displays the stats. The matching MongoDB shell aggregation would look similar to the following:

{$match: {size:4}},
{$limit: 5},
{$project: {_id:"$word", stats:1}}
