Hour 7. Additional Data-Finding Operations Using the MongoDB Shell


What You’ll Learn in This Hour:

Getting a count of documents that match a find operation

Returning documents in a specific sorted order

Limiting the number of documents returned

Paging through a large dataset

Reducing the number of fields returned in documents

Finding distinct values for fields in a dataset


In the previous hour, you began retrieving documents from the MongoDB database. You also limited the documents to a specific set using query parameters that matched specific values of document fields. The MongoDB shell provides several other methods of controlling the documents that are returned, enabling you to make your database request more organized and optimized.

In this hour, you learn how to use additional methods on the Collection and Cursor objects to control how you retrieve information from a MongoDB database. For instance, counting the documents before retrieval helps you know the size of the dataset before you begin processing it. In addition, you can limit the amount of data returned by limiting the number of documents returned and limiting which fields get returned in the document.

Counting Documents

When accessing document sets in MongoDB, you might want to get only a count before deciding whether to retrieve the documents themselves. A count places far less load on the MongoDB server and client because the actual documents do not need to be transferred.

When performing operations on the resulting set of documents from a find(), you also should be aware of how many documents you will be dealing with, especially in larger environments. Sometimes all you want is a count. For example, if you need to know how many users are configured in your application, you could just count the number of documents in the users collection.

The count() method on the Cursor object gives you a simple count of documents that are represented. For example, the following code uses the find() method to get a Cursor object; then it uses the count() method to get the number of items:

cursor = wordsColl.find({first: {$in: ['a', 'b', 'c']}});
itemCount = cursor.count();

The value of itemCount is the number of words that match the find() query.
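
The count-before-retrieve pattern described in this section can be sketched as follows. The helper name fetchIfSmall, the maxDocs threshold, and the in-memory stand-in collection are all assumptions made for illustration so that the snippet runs without a server; they are not part of the MongoDB shell.

```javascript
// Hypothetical helper: retrieve documents only when the match count is small enough.
// `coll` is anything with a shell-style find() that returns a cursor
// offering count() and toArray().
function fetchIfSmall(coll, query, maxDocs) {
  var total = coll.find(query).count();   // only a number is returned here
  if (total > maxDocs) {
    return { total: total, docs: null };  // too large: let the caller narrow the query
  }
  return { total: total, docs: coll.find(query).toArray() };
}

// In-memory stand-in for a collection (an assumption, not a MongoDB API).
// It understands only simple {field: value} equality queries.
function fakeCollection(docs) {
  return {
    find: function (query) {
      var matches = docs.filter(function (doc) {
        return Object.keys(query).every(function (key) {
          return doc[key] === query[key];
        });
      });
      return {
        count: function () { return matches.length; },
        toArray: function () { return matches.slice(); }
      };
    }
  };
}

var demoColl = fakeCollection([
  { word: "apple", first: "a" },
  { word: "angle", first: "a" },
  { word: "berry", first: "b" }
]);
var small = fetchIfSmall(demoColl, { first: "b" }, 1);  // within the threshold: documents fetched
var large = fetchIfSmall(demoColl, { first: "a" }, 1);  // over the threshold: count only
```

In the shell, you would pass a real collection such as wordsColl in place of demoColl.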

Sorting Result Sets

An important aspect of retrieving documents from a MongoDB database is the capability to get them in a sorted order. This is especially helpful if you are retrieving only a certain number, such as the top 10, or if you are paging the requests. Sorting lets you specify the order and direction for one or more fields in the document.

The sort() method on the Cursor object enables you to specify fields used to sort the documents represented in the cursor and return them in that order. The sort() method accepts an object that uses the field name as a property name and specifies the sort order as the value of 1 for ascending and -1 for descending.

For example, to sort on the name field in ascending order, you would use

myCollection.find().sort({name:1});

To sort on more than one field, list the fields in a single sort object in order of precedence. Chaining a second sort() call does not add a secondary sort key; the last call replaces the earlier one. For example, to sort on the name field descending first and then the value field ascending, you could use

myCollection.find().sort({name:-1, value:1});
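
To make the meaning of a sort object with several fields concrete, here is a minimal in-memory sketch. The function comparatorFrom is a hypothetical name, and the model compares only simple strings and numbers; it does not reproduce how MongoDB orders values of different types.

```javascript
// Build an Array.prototype.sort comparator from a sort object such as
// {name: -1, value: 1}. Fields are compared in the order they are listed.
function comparatorFrom(sortSpec) {
  var fields = Object.keys(sortSpec);
  return function (a, b) {
    for (var i = 0; i < fields.length; i++) {
      var field = fields[i];
      var direction = sortSpec[field];     // 1 = ascending, -1 = descending
      if (a[field] < b[field]) { return -direction; }
      if (a[field] > b[field]) { return direction; }
    }
    return 0;                              // equal on every listed field
  };
}

var docs = [
  { name: "beta",  value: 2 },
  { name: "alpha", value: 9 },
  { name: "beta",  value: 1 }
];

// name descending first, then value ascending among documents with equal names
var sorted = docs.slice().sort(comparatorFrom({ name: -1, value: 1 }));
var order = sorted.map(function (d) { return d.name + ":" + d.value; });
// order is ["beta:1", "beta:2", "alpha:9"]
```

The second field matters only when two documents tie on the first, which is why the two beta documents are ordered by value.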

Limiting Result Sets

When finding documents on larger systems with more complex documents, you often want to limit what is being returned to reduce the impact on the network, memory on both server and client, and so on. You have three ways to limit the result sets that match a specific query. You can accept only a limited number of documents, you can limit the fields that are returned, or you can page the results and get them in chunks.

Limiting Results by Size

The simplest method of limiting the amount of data returned in a find() or other query request is to use the limit() method on the Cursor object returned by the find() operation. The limit() method caps the number of items the cursor returns. This saves you from accidentally retrieving more objects than your application can handle.

For example, the following code displays only the first 10 documents in a collection, even though there could be thousands of matches:

cursor = wordsColl.find();
cursor = cursor.limit(10);
cursor.forEach(function(word){
  printjson(word);
});

Limiting Fields Returned in Objects

Another extremely effective method of limiting the resulting data when retrieving documents is to limit which fields are returned. Documents might have a lot of different fields that are useful in some circumstance but not in others. Consider which fields should be included when retrieving documents from the MongoDB server and request only the ones necessary.

To limit the fields returned from the server in a find() operation, you can use the projection parameter of find(). The projection parameter is a JavaScript object whose property names are the field names. Set a field's value to 1/true to include it or to 0/false to exclude it. You cannot mix includes and excludes in the same projection, with one exception: the _id field is returned by default and can be excluded (_id:0) alongside included fields.

For example, to exclude the stats, value, and comments fields when returning documents in which the first field equals t, you would use the following projection:

wordsColl.find({first:"t"}, {stats:false, value:false, comments:false});

Including just a few fields often is easier. For example, if you want to include only the word and size fields of documents in which the first field equals t, you would use

wordsColl.find({first:"t"}, {word:1, size:1});
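
The difference between include and exclude projections can be illustrated with a small in-memory sketch. The project function below is a hypothetical stand-in that handles top-level fields only and does not reject mixed projections as the server would; the sample document is invented for the demonstration.

```javascript
// Apply a find()-style projection to one plain object (top-level fields only).
function project(doc, projection) {
  var keys = Object.keys(projection);
  // Inclusion mode if any field other than _id is set to 1/true.
  var including = keys.some(function (k) {
    return k !== "_id" && Boolean(projection[k]);
  });
  var result = {};
  Object.keys(doc).forEach(function (field) {
    var listed = Object.prototype.hasOwnProperty.call(projection, field);
    var keep;
    if (field === "_id") {
      keep = !listed || Boolean(projection._id);    // _id stays unless set to 0/false
    } else if (including) {
      keep = listed && Boolean(projection[field]);  // inclusion mode: keep listed fields
    } else {
      keep = !listed;                               // exclusion mode: drop listed fields
    }
    if (keep) { result[field] = doc[field]; }
  });
  return result;
}

var doc = { _id: 1, word: "the", first: "t", size: 3, stats: { vowels: 1 } };
var included = project(doc, { word: 1, size: 1 });   // { _id: 1, word: "the", size: 3 }
var excluded = project(doc, { stats: false });       // every field except stats
var noId = project(doc, { word: 1, _id: 0 });        // { word: "the" }
```

Note that the include projection still carries _id, which is why excluding it explicitly is the one permitted mix of 0 and 1 values.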

Paging Results

A common method of reducing the number of documents returned is using paging. Paging involves specifying a number of documents to skip in the matching set and also limiting the documents returned. The skip value is incremented each time by the amount returned the previous time.

To implement paging on a set of documents, you use the limit() and skip() methods on the Cursor object. The skip() method enables you to specify a number of documents to skip before returning documents.

By incrementing the value used in the skip() method by the size used in limit() each time you get another set of documents, you can effectively page through the dataset.

When paging through large datasets, especially through requests from a website, you need to use a find() operation to build a cursor for each page. It is not a good idea to leave the cursor open for long periods of time waiting for subsequent web requests that might never come.

For example, the following statements find documents 1–10, then 11–20, and finally 21–30:

// Page 1: documents 1-10
cursor = collection.find().sort({name:1});
cursor.limit(10);
cursor.skip(0);
// Page 2: documents 11-20
cursor = collection.find().sort({name:1});
cursor.limit(10);
cursor.skip(10);
// Page 3: documents 21-30
cursor = collection.find().sort({name:1});
cursor.limit(10);
cursor.skip(20);

Always include a sort option when paging data, to ensure that the order is always the same.
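
The skip arithmetic behind paging can be sketched as follows. The helper names pageWindow and fetchPage are hypothetical, and an in-memory array stands in for the collection so that the snippet runs without a server.

```javascript
// Hypothetical helper: convert a 1-based page number into skip/limit values.
function pageWindow(pageNumber, pageSize) {
  return { skip: (pageNumber - 1) * pageSize, limit: pageSize };
}

// In-memory stand-in for collection.find().sort({name:1}).skip(s).limit(n).
function fetchPage(docs, pageNumber, pageSize) {
  var w = pageWindow(pageNumber, pageSize);
  var sorted = docs.slice().sort(function (a, b) {
    return a.name < b.name ? -1 : (a.name > b.name ? 1 : 0);
  });
  return sorted.slice(w.skip, w.skip + w.limit);
}

// 25 sample documents, deliberately inserted in reverse order
var people = [];
for (var i = 25; i >= 1; i--) {
  people.push({ name: "user" + (i < 10 ? "0" + i : i) });
}
var page1 = fetchPage(people, 1, 10);   // user01 through user10
var page3 = fetchPage(people, 3, 10);   // user21 through user25 (a short last page)
```

Against a real collection, the same values would be passed to the cursor, for example collection.find().sort({name:1}).skip(w.skip).limit(w.limit), with a fresh find() for each page.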

Finding Distinct Field Values

A useful query against a MongoDB collection is to get a list of the distinct values for a single field in a set of documents. Distinct means that even though thousands of documents exist, you want to know only the unique values.

The distinct() method on Collection objects enables you to find a list of distinct values for a specific field. Consider the syntax for the distinct() method:

distinct(key, [query])

The key parameter is a string that names the field you want to get values for. You can specify fields in subdocuments using the dot syntax, such as with stats.count. The optional query parameter is an object with standard query options that limits which documents are evaluated for distinct field values.

For example, to find the distinct last names of users over 65 in a collection that has documents with first, last, and age fields, you would use the following operation:

lastNames = myUsers.distinct('last', { age: { $gt: 65 } });

The distinct() method returns an array with the distinct values for the field specified. For example:

["Smith", "Jones", ...]
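
What distinct() computes can be modeled in memory as shown next. This is only a sketch under stated assumptions: distinctValues and getPath are hypothetical names, a predicate function stands in for the query object, the sample users are invented, and only simple scalar field values are handled.

```javascript
// Read a possibly nested field using dot syntax, e.g. "stats.count".
function getPath(doc, key) {
  return key.split(".").reduce(function (value, part) {
    return value == null ? undefined : value[part];
  }, doc);
}

// In-memory model of distinct(key, query); `matches` is a predicate function
// standing in for the query object.
function distinctValues(docs, key, matches) {
  var seen = [];
  docs.forEach(function (doc) {
    if (matches && !matches(doc)) { return; }   // skip documents the query excludes
    var value = getPath(doc, key);
    if (value !== undefined && seen.indexOf(value) === -1) {
      seen.push(value);                         // record each value only once
    }
  });
  return seen;
}

var users = [
  { first: "Ann",  last: "Smith", age: 70 },
  { first: "Bob",  last: "Jones", age: 81 },
  { first: "Cara", last: "Smith", age: 66 },
  { first: "Dev",  last: "Patel", age: 30 }
];
var lastNames = distinctValues(users, "last", function (u) { return u.age > 65; });
var counts = distinctValues(
  [{ stats: { count: 1 } }, { stats: { count: 1 } }, { stats: { count: 2 } }],
  "stats.count"
);
```

Here lastNames holds Smith and Jones once each, even though two matching users share the name Smith, and Patel is left out because that user does not match the age condition.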

Summary

In this hour, you learned how to use additional methods on the Collection and Cursor objects to control how you retrieve information from a MongoDB database. You learned that the limit() method can reduce the documents the cursor returns and that, using limit() and skip(), you can page through a large dataset. You also got a chance to sort documents returned in various orders.

In addition, you learned that being able to count documents before you actually retrieve them helps you know the size of the dataset before you begin processing it. The final section covered using distinct() to find distinct field values for a dataset.

Q&A

Q. How long is a cursor left open on the MongoDB server?

A. The MongoDB server automatically removes the cursor when you have read the last document from it. However, you can create a tailable cursor that remains open. Typically, tailable cursors are used for capped collections in which you want to retrieve new items as they are added to the collection.

Q. Does the MongoDB cursor time out?

A. Yes. By default, the server closes a cursor after 10 minutes of inactivity. You can prevent this by setting the noTimeout flag with the addOption() method on the Cursor object. For example:

cursor = db.inventory.find().addOption(DBQuery.Option.noTimeout);

Workshop

The workshop consists of a set of questions and answers designed to solidify your understanding of the material covered in this hour. Try answering the questions before looking at the answers.

Quiz

1. How do you limit the number of documents returned by a cursor to 10?

2. What method would you use to get distinct values for the age field in the documents in a collection?

3. How do you sort a collection based on the name field in descending order?

4. True or false: You cannot skip documents when you are reading from a cursor.

Quiz Answers

1. Use limit(10) on the Cursor object.

2. Use distinct('age') on the Collection object.

3. sort({name:-1})

4. False. You can use skip() to skip documents in a Cursor.

Exercises

1. Create a MongoDB shell script that finds the distinct sizes of words in the example dataset that begin with q and end with y.

2. Create a MongoDB shell script that finds words that have a size larger than 10, and then get the distinct first letter for them.

3. Create a MongoDB shell script that returns a list of words beginning with q, sorted by size.
