What You’ll Learn in This Hour:
Inserting new documents into a collection from Python
Removing documents from a collection in Python
Getting, manipulating, and saving a single document in a collection from Python
Updating documents in a collection from Python
Performing an upsert operation from Python
This hour continues last hour’s exploration of the Python MongoDB driver and how to use it to add, manipulate, and remove documents from a collection in your Python applications. You have several methods for changing data in a collection. You can insert new documents, update existing documents using update or save, remove old documents, and apply an upsert (which tries to update documents and, if it finds none, inserts a new one).
The following sections cover the various methods on the Python Collection
object that enable you to manipulate data in the collection. You see how to insert, delete, save, and update documents in a collection from your Python application.
An important task when interacting with MongoDB databases from Python is inserting documents into collections. To insert a document, you first create a Dictionary object that represents the document you want to store. The insert operation sends the Dictionary object to the MongoDB server as BSON, which is then stored in the collection.
When you have a Dictionary version of your new document, you can store it in the MongoDB database using the insert() method on an instance of the Collection object that is connected to the database. The following shows the syntax for the insert() method, where the doc parameter can be a single document object or a list of document objects:
insert(doc)
For example, the following inserts a single document into a collection:
doc1 = {'name' : 'Fred'}
result = myColl.insert(doc1)
To insert multiple documents into your collection, you can pass a list of Dictionary objects to the insert() method on the Collection object. For example:
doc2 = {'name' : 'George'}
doc3 = {'name' : 'Ron'}
result = myColl.insert([doc2, doc3])
Notice that the insert()
method returns a result
object that contains the object ID(s) of the new objects inserted into the database.
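The two calls above can be combined into one small helper. The following is a minimal sketch, assuming coll is a Collection object that still supports the legacy insert() method described in this hour; the helper name insert_names and its arguments are hypothetical and used only for illustration.

```python
def insert_names(coll, first_name, other_names):
    """Insert one document, then a batch, and return all new object IDs.

    `coll` is assumed to be a Collection object that provides the legacy
    insert() method; `insert_names` itself is a hypothetical helper.
    """
    # A single Dictionary: insert() returns that document's object ID.
    single_id = coll.insert({'name': first_name})

    batch_ids = []
    if other_names:
        # A list of Dictionaries: insert() returns a list of object IDs.
        # The call is skipped for an empty list, because the driver may
        # reject an empty batch.
        batch_ids = coll.insert([{'name': n} for n in other_names])

    return [single_id] + list(batch_ids)
```

Calling insert_names(myColl, 'Fred', ['George', 'Ron']) would return the three new object IDs in insertion order. Note that PyMongo 3.0 deprecated insert() in favor of insert_one() and insert_many(), and PyMongo 4.0 removed it, so this sketch applies only to driver versions that still provide the legacy method.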
Sometimes you need to delete documents from your MongoDB collection from Python, to keep space consumption down, improve performance, and keep things clean. The remove()
method on Collection
objects makes it simple to delete documents from a collection. The syntax for the remove()
method follows:
remove([query])
The query parameter is a Dictionary object that identifies which document(s) you want to delete. The request matches the fields and values in the query with the fields and values of each document, and only the documents that match the query are deleted. If no query is provided, all the documents in the collection are deleted.
For example, to delete all documents in the word_stats collection, you would use
collection = myDB['word_stats']
results = collection.remove()
The following code deletes all words that start with a from the word_stats collection:
collection = myDB['word_stats']
query = {'first' : 'a'}
collection.remove(query)
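Because calling remove() without a query deletes every document, it can be worth guarding against an accidentally empty query. The following is a minimal sketch of such a guard, assuming coll is a Collection object with the legacy remove() method; the helper name remove_matching is hypothetical.

```python
def remove_matching(coll, query):
    """Remove the documents that match `query`, refusing an empty query.

    `coll` is assumed to be a Collection object that provides the legacy
    remove() method; `remove_matching` is a hypothetical helper.
    """
    if not query:
        # None or {} would match every document in the collection.
        raise ValueError('refusing to remove all documents; '
                         'pass a non-empty query')
    # Pass the result of remove() back to the caller unchanged.
    return coll.remove(query)
```

With this helper, remove_matching(collection, {'first': 'a'}) behaves like the example above, whereas remove_matching(collection, {}) raises an error instead of emptying the collection.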
A convenient method of updating objects in the database is to use the save()
method on Collection
objects. The save
method accepts a Dictionary
object as a parameter and saves it to the database. If the document already exists in the database, it is updated with the new values. If the document does not already exist in the database, a new document is created.
The following shows the syntax of the save()
method, where the doc
parameter is the Dictionary
object representing the document to be saved to the collection:
save(doc)
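A common pattern with save() is to fetch a document, change it locally, and write it back. The following is a minimal sketch of that pattern, assuming coll is a Collection object with find_one() and the legacy save() method. The helper name add_category and the word and category field names are assumptions made for illustration.

```python
def add_category(coll, word, category):
    """Fetch one document, change it locally, and save it back.

    `coll` is assumed to be a Collection object that provides find_one()
    and the legacy save() method. The 'word' and 'category' field names
    are assumptions for this sketch.
    """
    doc = coll.find_one({'word': word})
    if doc is None:
        # Nothing to change; the caller decides whether to insert instead.
        return None

    doc['category'] = category
    # The fetched Dictionary carries its _id, so save() updates the
    # existing document rather than creating a new one.
    coll.save(doc)
    return doc
```

If you pass save() a Dictionary that has no _id field, the document is inserted as a new one, which matches the create-if-missing behavior described above.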
After you have inserted objects into a collection, you might need to update them often from Python as data changes. The update()
method on the Collection
object enables you to update documents in a collection. The update
method is versatile yet fairly easy to implement. The following shows the syntax for the update()
method:
update(query, update, [upsert], [manipulate], [safe], [multi])
The query
parameter is a Dictionary
object that identifies which document(s) you want to change. The request matches the properties and values in the query with the fields and values of the object, and only those that match the query are updated. The update
parameter is a Dictionary
object that specifies the changes to make to the documents that match the query. Hour 8, “Manipulating MongoDB Documents in a Collection,” describes the update operators used in this object.
The other parameters you need to understand for basic update()
operations are the upsert
and multi
parameters. The upsert
parameter is a Boolean that determines whether to do an upsert
operation. If it is True
and no documents match the query, a new document is inserted into the collection. The multi
parameter is also a Boolean. When True
, the update
operation is applied to all documents that match the query; otherwise, only the first document is updated.
For example, the following changes the category
field value to Old
for items in the collection in which category
currently is New
. With upsert
set to False
, new documents are not created even if no documents have a category of New
; with multi
set to True
, all documents that match are updated:
query = {'category' : 'New'}
update = {'$set' : {'category' : 'Old'}}
myColl.update(query, update, upsert=False, multi=True)
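To make the effect of the multi parameter easier to see, the example above can be wrapped in a helper that exposes it as an argument. This is a minimal sketch, assuming coll is a Collection object with the legacy update() method; the helper name recategorize and the all_matches argument are hypothetical.

```python
def recategorize(coll, old, new, all_matches=True):
    """Change the category field from `old` to `new`.

    `coll` is assumed to be a Collection object that provides the legacy
    update() method; `recategorize` is a hypothetical helper.
    """
    query = {'category': old}
    change = {'$set': {'category': new}}
    # upsert=False: never create a document when nothing matches.
    # multi=all_matches: True updates every matching document,
    # False updates only the first match.
    return coll.update(query, change, upsert=False, multi=all_matches)
```

Calling recategorize(myColl, 'New', 'Old') reproduces the example above, and recategorize(myColl, 'New', 'Old', all_matches=False) would limit the change to the first matching document.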
Another way to use the update() method on the Collection object in Python is to perform an upsert operation. An upsert first tries to update documents in the collection. If no documents match the query, a new document is built from the fields in the query and the update operators (such as $set) and added to the collection. The following shows the syntax for the update() method:
update(query, update, [upsert], [manipulate], [safe], [multi])
The query
parameter identifies which document(s) you want to change. The update
parameter is a Dictionary
object that specifies the changes to make to the documents that match the query. For upsert operations, the upsert
parameter should be set to True
and the multi
parameter should be set to False
.
For example, the following performs an upsert on a document with name=myDoc
. The $set
operator defines the fields used to create or update the document. With upsert
set to True
, if the document is not found, it is created; otherwise, it is updated:
query = {'name' : 'myDoc'}
update = { '$set' : { 'name' : 'myDoc', 'number' : 5, 'score' : '10'}}
results = collection.update(query, update, upsert=True, multi=False)
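The same upsert can be packaged as a small create-or-update helper. The following is a minimal sketch, assuming coll is a Collection object with the legacy update() method; the helper name set_score is hypothetical, and the name and score fields follow the example above.

```python
def set_score(coll, name, score):
    """Create or update the document identified by `name`.

    `coll` is assumed to be a Collection object that provides the legacy
    update() method; `set_score` is a hypothetical helper.
    """
    query = {'name': name}
    # Only 'score' is listed under $set. When the upsert has to insert,
    # MongoDB also copies the equality fields from the query (here
    # 'name') into the new document, so repeating it is optional.
    change = {'$set': {'score': score}}
    # multi=False: an upsert targets a single document.
    return coll.update(query, change, upsert=True, multi=False)
```

Calling set_score(collection, 'myDoc', 10) twice would create the document on the first call and update the same document on the second, so the helper can be repeated safely.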
In this hour, you used the Python MongoDB driver to add, manipulate, and remove documents from a collection in your Python applications. You used several different methods on the Collection
object to change data in a collection.
The insert()
method adds new documents. The remove()
method deletes documents. The save()
method updates a single document, or inserts it if it does not already exist.
You can use the update()
method in multiple ways. You can have it update a single document or multiple documents. You can also apply the upsert option to insert new documents into the collection if none matches.
Q. Is there a way to programmatically execute MongoDB commands from a Python application?
A. Yes. The Python MongoDB driver provides the command()
method on the Database
object that accepts a command that will be executed on the MongoDB server.
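As a minimal sketch of this, the helpers below run two standard server commands through command(). They assume db is a Database object from the Python MongoDB driver; the helper names server_is_alive and collection_stats are hypothetical.

```python
def server_is_alive(db):
    """Return True if the server acknowledges the 'ping' command.

    `db` is assumed to be a Database object from the Python MongoDB
    driver; `server_is_alive` is a hypothetical helper.
    """
    reply = db.command('ping')
    # A successful command reply carries an 'ok' field equal to 1.
    return reply.get('ok') == 1


def collection_stats(db, coll_name):
    """Run the 'collstats' command for one collection and return the reply."""
    # The first argument is the command name, the second is its value.
    return db.command('collstats', coll_name)
```

For example, collection_stats(myDB, 'word_stats') would return a Dictionary of statistics for the word_stats collection, provided the server supports that command.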
Q. Is there a way to convert Python Dictionary objects to and from BSON objects?
A. Yes. The Python MongoDB driver provides the bson.BSON class, which has encode() and decode() methods. BSON.encode(dict) converts a dictionary to a BSON object, and calling decode() on a BSON object converts it back to a dictionary.
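A round trip through BSON can be sketched as follows. This minimal example assumes the bson package that ships with the Python MongoDB driver is installed; the helper name round_trip is hypothetical.

```python
def round_trip(doc):
    """Encode a Dictionary to BSON and decode it back to a Dictionary.

    Assumes the bson package bundled with the Python MongoDB driver;
    `round_trip` is a hypothetical helper.
    """
    # Imported here so that this module still loads where the driver
    # is not installed.
    from bson import BSON

    encoded = BSON.encode(doc)   # Dictionary -> BSON object
    return encoded.decode()      # BSON object -> Dictionary
```

For a simple document such as {'name': 'Fred', 'number': 5}, the Dictionary that comes back should compare equal to the one that went in.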
The workshop consists of a set of questions and answers designed to solidify your understanding of the material covered in this hour. Try answering the questions before looking at the answers.
1. Which operation would you use from a Python application to create a new document if one does not exist?
2. How do you limit the update()
method to only a single document?
3. True or false: You can use the save()
method on Collection
objects only to save existing documents.
4. What type of parameter defines the fields to update in the update()
method of the Collection
object?
1. The update()
method on the Collection
object, with upsert
set to True
.
2. Set the multi
parameter to False
.
3. False. save()
adds the document if it doesn’t exist.
4. It is a Dictionary
object that contains MongoDB update operators as fields.
1. Find a new word that you want to add to the word_stats collection in the example dataset. Write a new Python application similar to the PythonDocAdd.py file to add that word.
2. Create a new Python application that uses the update() method to update all words with a first letter of e and add the category of eWords to them.