Hour 16. Implementing MongoDB in Python Applications


What You’ll Learn in This Hour:

Using the Python Database objects to access the MongoDB database

Implementing the Python MongoDB driver in Python applications

Connecting to the MongoDB database in Python applications

Using methods to find and retrieve documents in Python applications

Sorting documents in a cursor before retrieving them in Python applications


This hour introduces you to implementing MongoDB in Python applications. To access and use MongoDB in Python applications, you first need to implement the Python MongoDB driver. The Python MongoDB driver is a library that provides the necessary objects and functionality to access a MongoDB server from your Python applications.

These objects are similar to the objects you have already been working with in the MongoDB shell. The examples in this hour and the following one assume that you already understand the structure of the Database object and of requests. If you have not already gone through Hours 5–9, you should do so before continuing with this one.

The following sections describe the objects you deal with in Python to access the MongoDB server, databases, collections, and documents. You also implement the Python MongoDB driver and begin accessing documents in the example collection.

Understanding MongoDB Driver Objects in Python

The Python MongoDB driver provides several objects that enable you to connect to the MongoDB database and find and manipulate documents in collections. These objects represent the connection, database, collection, cursor, and documents on the MongoDB server and provide the necessary functionality to integrate data from a MongoDB database into your Python applications.

The following sections describe how each of these objects is created and used in Python.

Understanding the Python MongoClient Object

The Python MongoClient object provides the functionality to connect to the MongoDB server and access databases. The first step you take in implementing MongoDB in your Python applications is to create an instance of the MongoClient object. Then you can use the object to get the database, set the write concern, and perform other operations (see Table 16.1).

TABLE 16.1 Methods Available on the MongoClient Object in Python

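To create an instance of the MongoClient object, you import MongoClient from the pymongo module and call MongoClient() with the appropriate options. Python has no new keyword, so the class is called directly. The most basic form connects to localhost on the default port, 27017:

from pymongo import MongoClient
mongo = MongoClient("mongodb://localhost:27017/")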

You can also use a connection string that uses this format:

mongodb://username:password@host:port/database?options

For example, to connect to the words database on host 1.1.1.1 on port 8888 with username test and password myPass, you would use the following:

mongo = MongoClient("mongodb://test:myPass@1.1.1.1:8888/words")
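When the parts of the connection string come from configuration rather than a literal, it can help to assemble the string in one place. The following is a minimal sketch, not part of the driver: build_mongo_uri() and connect() are hypothetical helper names, and the escaping step assumes that usernames and passwords may contain characters such as @ or : that would otherwise break the URI.

```python
from urllib.parse import quote_plus


def build_mongo_uri(username, password, host, port, database):
    """Assemble a mongodb:// connection string from its parts.

    Hypothetical helper for illustration. The username and password are
    percent-escaped so that characters such as '@' or ':' do not break
    the URI.
    """
    return "mongodb://{0}:{1}@{2}:{3}/{4}".format(
        quote_plus(username), quote_plus(password), host, port, database)


def connect(uri):
    """Create a MongoClient for the given URI (requires pymongo)."""
    # Imported here so build_mongo_uri() can be used without the driver.
    from pymongo import MongoClient
    return MongoClient(uri)


# Same values as the example in the text.
uri = build_mongo_uri("test", "myPass", "1.1.1.1", 8888, "words")
print(uri)  # mongodb://test:myPass@1.1.1.1:8888/words
```

Calling connect(uri) then returns the MongoClient object described in this section.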

After you have created an instance of the MongoClient object, you can use the methods in Table 16.1 to get the database and set options.

Understanding the Python Database Object

The Python Database object provides the functionality to authenticate, access users, and access and manipulate collections. The databases associated with a MongoClient are stored as part of the MongoClient object’s internal dictionary. The simplest method of getting an instance of a Database object is to access it directly by name on the MongoClient object. For example, the following gets a Database object for the words database:

mongo = MongoClient("mongodb://localhost:27017/")
db = mongo["words"]

After you have created an instance of the Database object, you can use the object to access the database. Table 16.2 lists the more common methods available on the Database object.

TABLE 16.2 Methods Available on the Database Object in Python

Understanding the Python Collection Object

The Python Collection object provides the functionality to access and manipulate documents in a collection. The collections associated with a Database are stored as part of the Database object’s internal dictionary. The simplest method of getting an instance of a Collection object is to access it directly by name on the Database object. For example, the following gets a Collection object for the word_stats collection on the words database:

mongo = MongoClient("mongodb://localhost:27017/")
db = mongo["words"]
collection = db["word_stats"]

After you have created an instance of the Collection object, you can use the object to access the collection. Table 16.3 lists the more common methods available on the Collection object.

TABLE 16.3 Methods Available on the Collection Object in Python

Understanding the Python Cursor Object

The Python Cursor object represents a set of documents on the MongoDB server. Typically, a Cursor object is returned when you query the collection using a find operation. Instead of returning the full set of document objects to the Python application, a Cursor object is returned, enabling you to access the documents in a controlled manner from Python.

The Cursor object uses an index to iterate through documents on the server. The cursor pulls down documents from the server in batches. During iteration, when the index passes the end of the current batch, a new batch of documents is retrieved from the server.
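The batching behavior can be pictured with a small pure-Python model. This is only a toy sketch to illustrate the idea of an index moving through client-side batches; it is not how the driver is actually implemented, and the class name, batch size, and sample words are all made up for the example.

```python
class BatchedCursorSketch:
    """Toy model of batched iteration; not PyMongo's actual implementation."""

    def __init__(self, server_docs, batch_size=3):
        self._server_docs = server_docs  # stands in for documents on the server
        self._batch_size = batch_size
        self._fetched = 0                # documents pulled from the "server" so far
        self._batch = []                 # documents currently held on the client
        self._index = 0                  # position inside the current batch
        self.round_trips = 0             # number of simulated server requests

    def _fetch_next_batch(self):
        start = self._fetched
        self._batch = self._server_docs[start:start + self._batch_size]
        self._fetched += len(self._batch)
        self._index = 0
        self.round_trips += 1

    def __iter__(self):
        return self

    def __next__(self):
        # When the index passes the end of the current batch, get another one.
        if self._index >= len(self._batch):
            if self._fetched >= len(self._server_docs):
                raise StopIteration
            self._fetch_next_batch()
        doc = self._batch[self._index]
        self._index += 1
        return doc


docs = [{"word": w} for w in ["the", "be", "to", "of", "and", "a", "in"]]
cursor = BatchedCursorSketch(docs, batch_size=3)
words = [doc["word"] for doc in cursor]
print(words, cursor.round_trips)
```

With seven documents and a batch size of three, the model makes three simulated requests, which is the trade-off batching is meant to give: fewer round trips than one per document, without loading everything at once.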

The following code shows an example of getting an instance of the Cursor object using a find operation:

mongo = MongoClient("mongodb://localhost:27017/")
db = mongo['words']
collection = db['word_stats']
cursor = collection.find()

After you have created an instance of the Cursor object, you can use the object to access documents in the collection. Table 16.4 lists the more common methods available on the Cursor object.

TABLE 16.4 Methods Available on the Cursor Object in Python

Understanding the Python Dictionary Objects Used as Parameters and Documents

As you saw in the MongoDB shell hours, most of the database, collection, and cursor operations accept objects as parameters. These objects define things such as query, sort, aggregation, and other operators. In addition, documents are returned from the database as objects.

In the MongoDB shell, these are JavaScript objects. In Python, however, the objects that represent documents and request parameters are Dictionary objects. When the server returns a document from a cursor or request, it arrives as a Dictionary object whose keys match the fields in the document. For objects that you use as parameters to requests, you also use a Dictionary object.

Dictionary objects are built using the standard Python syntax:

myDict = {key: value, ...}
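As a concrete illustration, the following sketch shows what a document might look like as a Dictionary object and how its fields are read. The field names and values are assumptions made up for the example, not taken from the actual word_stats collection.

```python
# A document as it might come back from a collection of word statistics.
# All field names and values here are assumptions for illustration.
doc = {
    "word": "python",
    "first": "p",
    "last": "n",
    "size": 6,
    "stats": {"vowels": 1, "consonants": 5},    # subdocument -> nested dict
    "letters": ["p", "y", "t", "h", "o", "n"],  # array -> list
}

size = doc["size"]                        # direct key access
vowels = doc["stats"]["vowels"]           # field inside a subdocument
missing = doc.get("category", "unknown")  # get() avoids a KeyError for absent fields

print(size, vowels, missing)
```

The same nesting rules apply to dictionaries you build yourself to pass as request parameters.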

Setting the Write Concern and Other Request Options

Database operations that involve writing data to the database use a write concern that defines how to verify database writes before returning. As you likely noticed in the previous sections, several objects have a write_concern property that you can set to a Dictionary object that defines the write concern options. These options enable you to configure the write concern, timeout, and other options that best fit your application.

The following list describes some of the options you can set in the write_concern Dictionary object:

w: Sets the write concern value: 1 for acknowledged, 0 for unacknowledged, or 'majority' for acknowledgment by a majority of replica set members.

j: Set to True or False to enable or disable journal acknowledgment.

wtimeout: Amount of time (in milliseconds) to wait for write concern acknowledgment.

fsync: When True, forces the database to fsync all files before returning.

As an example, the following illustrates using a basic options Dictionary object in Python:

collection.write_concern = {'w' : 1, 'j' : True, 'wtimeout': 10000, 'fsync': True}
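In PyMongo 3.0 and later, the write_concern property is read-only and can no longer be assigned. Instead, you request a copy of the collection with the write concern you want. The following is a minimal sketch of that approach, assuming PyMongo 3.0 or later is installed; build_write_concern_options() and with_write_concern() are hypothetical helper names, and the sketch leaves out the fsync option.

```python
def build_write_concern_options(w=1, j=None, wtimeout=None):
    """Collect write concern settings into a dict, skipping unset options."""
    options = {"w": w}
    if j is not None:
        options["j"] = j
    if wtimeout is not None:
        options["wtimeout"] = wtimeout
    return options


def with_write_concern(collection, options):
    """Return a copy of the collection that uses the given write concern.

    Requires PyMongo 3.0 or later. The original collection is unchanged.
    """
    # Imported here so the options helper works without the driver installed.
    from pymongo.write_concern import WriteConcern
    return collection.with_options(write_concern=WriteConcern(**options))


options = build_write_concern_options(w=1, j=True, wtimeout=10000)
print(options)
```

You would then call with_write_concern(collection, options) and perform writes on the returned Collection object.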

Finding Documents Using Python

A common task in Python applications is to find one or more documents that you need to use in your application. Finding documents in Python is similar to finding them using the MongoDB shell. You can get one document or many, and you can use queries to limit which documents are returned.

The following sections discuss using the Python objects to find and retrieve documents from a MongoDB collection.

Getting the Documents from MongoDB Using Python

The Collection object provides the find() and find_one() methods, similar to what you saw in the MongoDB shell. These methods find a single document or multiple documents.

When you call find_one(), the server returns a single document as a Dictionary object. You can then use the object in your application as needed. For example:

doc = myColl.find_one()
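Note that find_one() returns None when no document matches, so it is worth checking the result before reading fields from it. A minimal sketch follows; describe_first() is a hypothetical helper, and the word and size field names are assumptions about the documents in the collection.

```python
def describe_first(collection, query=None):
    """Return a short description of the first matching document.

    `collection` is expected to be a Collection object; the 'word' and
    'size' fields are assumed to exist in its documents.
    """
    doc = collection.find_one(query)
    if doc is None:
        # find_one() returns None rather than raising when nothing matches.
        return "no matching document"
    return "found {0} ({1} letters)".format(doc.get("word"), doc.get("size"))
```

For example, describe_first(myColl, {'size': 5}) would describe one five-letter word if the collection contains any.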

The find() method on the Collection object returns a Cursor object that only represents the documents found and does not initially retrieve them. The Cursor object can be iterated in a few different ways.

You can use a for loop, which ends automatically when the cursor has no more documents. For example:

cursor = myColl.find()
for doc in cursor:
  print(doc)

Although a cursor is not a list, the Cursor object supports Python index and slice syntax, so you can use a slice to access a portion of the cursor. As with any Python slice, the end position is excluded. For example, the following finds all documents and then displays the documents at positions 5 through 9:

cursor = collection.find()
subset = cursor[5:10]
for doc in subset:
  print(doc)
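The same range can also be requested explicitly with the skip() and limit() methods on the Cursor object. The sketch below is one way to wrap that; fetch_range() is a hypothetical helper name, and it assumes non-negative positions.

```python
def fetch_range(collection, start, stop):
    """Return the documents at positions start..stop-1 as a list."""
    if start < 0 or stop < start:
        raise ValueError("expected 0 <= start <= stop")
    limit = stop - start
    if limit == 0:
        # A limit of 0 means "no limit" to MongoDB, so return early instead.
        return []
    return list(collection.find().skip(start).limit(limit))
```

Calling fetch_range(collection, 5, 10) retrieves five documents, skipping the first five in the result set.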

Finding Specific Documents from MongoDB Using Python

Generally, you do not want to retrieve all documents in a collection from the server. The find() and find_one() methods enable you to send a query object to the server that limits documents in the same way you saw with the MongoDB shell.

To build the query object, you can use the Dictionary object described earlier. For fields in the object that require subobjects, you can create a sub Dictionary object. For other types, such as integers, strings, and arrays, use the Python equivalent.

For example, to create a query object that finds words with size=5, you would use

query = {'size' : 5}
myColl.find(query)

However, to create a query object that finds words with size>5, you would need to use

query = {'size' :
    {'$gt' : 5}}
myColl.find(query)

To create a query object that finds words with a first letter of x, y, or z, you would need to use a list of strings. For example:

query = {'first' :
    {'$in' : ["x", "y", "z"]}}
myColl.find(query)
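Because query objects are ordinary dictionaries, they can be built by small functions and combined. The following sketch assumes the size and first fields used in the examples above; the helper names are hypothetical. Conditions on different fields placed in the same query object must all match.

```python
def size_greater_than(n):
    """Query fragment: size field greater than n."""
    return {"size": {"$gt": n}}


def first_letter_in(letters):
    """Query fragment: first field is one of the given letters."""
    return {"first": {"$in": list(letters)}}


def combine(*conditions):
    """Merge conditions on different fields into one query object."""
    query = {}
    for condition in conditions:
        for field, clause in condition.items():
            if field in query:
                # Silently overwriting would drop a condition, so refuse.
                raise ValueError("duplicate condition on field: " + field)
            query[field] = clause
    return query


query = combine(size_greater_than(5), first_letter_in("xyz"))
print(query)
```

The resulting dictionary can be passed to find() or find_one() like any of the hand-written queries above.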

You should be able to use these techniques to build any type of query object you need—not only for find operations, but for others that enable you to use a query object.

Counting Documents in Python

When accessing document sets in MongoDB from Python, you might want to get only a count before deciding whether to retrieve a set of documents. Performing a count is much less intensive on the MongoDB server and client because the actual documents do not need to be transferred.

The count() method on the Cursor object enables you to get a simple count of documents that are represented. For example, the following code uses the find() method to get a Cursor object and then uses the count() method to get the number of items:

cursor = wordsColl.find()
itemCount = cursor.count()

The value of itemCount is the number of words that match the find() operation.
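In newer versions of the driver, the count() method on the Cursor object is deprecated (PyMongo 3.7) and later removed (PyMongo 4.0). The replacement is the count_documents() method on the Collection object, which takes the query directly. A minimal sketch, assuming PyMongo 3.7 or later; count_matching() is a hypothetical helper name.

```python
def count_matching(collection, query=None):
    """Count the documents matching query without retrieving them.

    Uses Collection.count_documents(), available in PyMongo 3.7 and later.
    An empty filter counts every document in the collection.
    """
    return collection.count_documents(query or {})
```

For example, count_matching(wordsColl, {'size': 5}) returns the number of five-letter words, assuming the size field used earlier in this hour.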

Sorting Result Sets in Python

An important aspect of retrieving documents from a MongoDB database is the capability to get them in a sorted order. This is especially helpful if you are retrieving only a certain number of results, such as the top 10, or if you are paging the requests. The Python MongoDB driver lets you specify the sort order and direction of one or more fields in the document.

The sort() method on the Cursor object enables you to specify fields to sort the documents represented in the cursor and return them in that order. The sort() method accepts a list of tuples, each of which provides a (key, order) pair. The key is the field name to sort on, and order is 1 for ascending and -1 for descending.

For example, to sort on the name field in ascending order, you would use

sorter = [('name', 1)]
cursor = myCollection.find()
cursor.sort(sorter)

You can use multiple fields in the list passed to the sort() method, and the documents will be sorted on those fields in the order listed. For example, to sort on the name field ascending first and then the value field descending, you would use

sorter = [('name', 1), ('value', -1)]
cursor = myCollection.find()
cursor.sort(sorter)

Applying sort() a second time to the same cursor does not add a second sort level. Only the last sort() applied to the cursor has any effect. In the following, the documents end up sorted on the value field alone:

sorter1 = [('name', 1)]
sorter2 = [('value', -1)]
cursor = myCollection.find()
cursor = cursor.sort(sorter1)
cursor.sort(sorter2)
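Sorting is often combined with a limit to retrieve only the top results, as mentioned at the start of this section. The following is a minimal sketch of that pattern; top_documents() is a hypothetical helper name, and it relies on the sort() and limit() methods of the Cursor object returning a cursor that can be chained.

```python
def top_documents(collection, sort_spec, limit, query=None):
    """Return the first `limit` matching documents in sorted order.

    sort_spec is a list of (key, order) tuples, where order is 1 for
    ascending and -1 for descending.
    """
    cursor = collection.find(query or {}).sort(sort_spec).limit(limit)
    return list(cursor)
```

For example, top_documents(myCollection, [('name', 1), ('value', -1)], 10) would return the first 10 documents ordered by name and then by descending value.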

Summary

In this hour, you looked at the objects the Python MongoDB driver provides. These objects represent the connection, database, collection, cursor, and documents and provide functionality to access MongoDB from your Python applications.

You also implemented the Python MongoDB driver and created a basic Python MongoDB application to connect to the database. Then you learned how to use the Collection and Cursor objects to find and retrieve documents. Finally, you learned how to count and sort documents represented by the cursor before retrieving them.

Q&A

Q. Are there additional Python objects not discussed in this hour?

A. Yes. This hour covers the major objects you need to know about. However, the Python MongoDB driver has a lot more supporting objects and functions. You can find the documentation at http://api.mongodb.org/python/current/api/index.html.

Q. Which versions of Python support implementing MongoDB?

A. It depends on your platform. Most platforms support version 2.5 and above for both 32- and 64-bit versions of Python.

Workshop

The workshop consists of a set of questions and answers designed to solidify your understanding of the material covered in this hour. Try answering the questions before looking at the answers.

Quiz

1. How do you control which documents a find() operation returns?

2. How do you sort documents based on the name field in ascending order?

3. How do you get the value of a field in a Dictionary object that represents a document?

4. True or false: The find_one() method returns a Cursor object.

Quiz Answers

1. Create a Dictionary query object that defines a query filter.

2. Create a parameter called [('name', 1)] and pass it to the sort() method.

3. Use the get(fieldName) method.

4. False. It returns a Dictionary object representing the document.

Exercises

1. Extend the PythonFindSort.py file to include a method that sorts documents first by size in descending order and then by last letter, also in descending order.

2. Extend the PythonFindSpecific.py file to find words that start with a and end with e.
