Chapter 3. Creating, Updating, and Deleting Documents

This chapter covers the basics of moving data into and out of the database, including the following:

  • Adding new documents to a collection

  • Removing documents from a collection

  • Updating existing documents

  • Choosing the correct level of safety versus speed for all of these operations

Inserting Documents

Inserts are the basic method for adding data to MongoDB. To insert a single document, use the collection’s insertOne method:

> db.movies.insertOne({"title" : "Stand by Me"})

insertOne will add an "_id" key to the document (if you do not supply one) and store the document in MongoDB.

insertMany

If you need to insert multiple documents into a collection, you can use insertMany. This method enables you to pass an array of documents to the database. This is far more efficient because your code will not make a round trip to the database for each document inserted, but will insert them in bulk.

In the shell, you can try this out as follows:

> db.movies.drop()
true
> db.movies.insertMany([{"title" : "Ghostbusters"},
...                        {"title" : "E.T."},
...                        {"title" : "Blade Runner"}]);
{
      "acknowledged" : true,
       "insertedIds" : [
           ObjectId("572630ba11722fac4b6b4996"),
           ObjectId("572630ba11722fac4b6b4997"),
           ObjectId("572630ba11722fac4b6b4998")
       ]
}
> db.movies.find()
{ "_id" : ObjectId("572630ba11722fac4b6b4996"), "title" : "Ghostbusters" }
{ "_id" : ObjectId("572630ba11722fac4b6b4997"), "title" : "E.T." }
{ "_id" : ObjectId("572630ba11722fac4b6b4998"), "title" : "Blade Runner" }

Sending dozens, hundreds, or even thousands of documents at a time can make inserts significantly faster.

insertMany is useful if you are inserting multiple documents into a single collection. If you are just importing raw data (e.g., from a data feed or MySQL), there are command-line tools like mongoimport that can be used instead of a batch insert. On the other hand, it is often handy to munge data before saving it to MongoDB (converting dates to the date type or adding a custom "_id", for example). In such cases insertMany can be used for importing data, as well.

Current versions of MongoDB do not accept messages longer than 48 MB, so there is a limit to how much can be inserted in a single batch insert. If you attempt to insert more than 48 MB, many drivers will split up the batch insert into multiple 48 MB batch inserts. Check your driver documentation for details.
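
If your driver does not split oversized batches for you, one option is to split the array yourself before calling insertMany. The following is a minimal sketch in plain JavaScript. The helper name chunkDocuments is hypothetical, and it counts documents rather than bytes, so the batch size you pick is an assumption that only approximates the message-size limit:

```javascript
// Hypothetical helper: split an array of documents into batches of at
// most batchSize documents each. The batch size is a document count, not
// a byte size, so it only approximates the server's message-size limit.
function chunkDocuments(docs, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < docs.length; i += batchSize) {
    batches.push(docs.slice(i, i + batchSize));
  }
  return batches;
}

const titles = ["Ghostbusters", "E.T.", "Blade Runner", "Alien", "Jaws"];
const batches = chunkDocuments(titles.map((title) => ({ title })), 2);
console.log(batches.map((batch) => batch.length)); // [ 2, 2, 1 ]
```

Each batch could then be passed to insertMany in turn.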

When performing a bulk insert using insertMany, if a document halfway through the array produces an error of some type, what happens depends on whether you have opted for ordered or unordered operations. As the second parameter to insertMany you may specify an options document. Specify true for the key "ordered" in the options document to ensure documents are inserted in the order they are provided. Specify false and MongoDB may reorder the inserts to increase performance. Ordered inserts are the default if no ordering is specified. For ordered inserts, the array passed to insertMany defines the insertion order. If a document produces an insertion error, no documents beyond that point in the array will be inserted. For unordered inserts, MongoDB will attempt to insert all documents, regardless of whether some insertions produce errors.

In this example, because ordered inserts are the default, only the first two documents will be inserted. The third document will produce an error, because you cannot insert two documents with the same "_id":

> db.movies.insertMany([
    ... {"_id" : 0, "title" : "Top Gun"},
    ... {"_id" : 1, "title" : "Back to the Future"},
    ... {"_id" : 1, "title" : "Gremlins"},
    ... {"_id" : 2, "title" : "Aliens"}])
2019-04-22T12:27:57.278-0400 E QUERY    [js] BulkWriteError: write 
error at item 2 in bulk operation :
BulkWriteError({
    "writeErrors" : [
        {
            "index" : 2,
            "code" : 11000,
            "errmsg" : "E11000 duplicate key error collection: 
            test.movies index: _id_ dup key: { _id: 1.0 }",
            "op" : {
                "_id" : 1,
                "title" : "Gremlins"
            }
        }
    ],
    "writeConcernErrors" : [ ],
    "nInserted" : 2,
    "nUpserted" : 0,
    "nMatched" : 0,
    "nModified" : 0,
    "nRemoved" : 0,
    "upserted" : [ ]
})
BulkWriteError@src/mongo/shell/bulk_api.js:367:48
BulkWriteResult/this.toError@src/mongo/shell/bulk_api.js:332:24
Bulk/this.execute@src/mongo/shell/bulk_api.js:1186:23
DBCollection.prototype.insertMany@src/mongo/shell/crud_api.js:314:5
@(shell):1:1

If instead we specify unordered inserts, the first, second, and fourth documents in the array are inserted. The only insert that fails is the third document, again because of a duplicate "_id" error:

> db.movies.insertMany([
... {"_id" : 3, "title" : "Sixteen Candles"},
... {"_id" : 4, "title" : "The Terminator"},
... {"_id" : 4, "title" : "The Princess Bride"},
... {"_id" : 5, "title" : "Scarface"}],
... {"ordered" : false})
2019-05-01T17:02:25.511-0400 E QUERY    [thread1] BulkWriteError: write
error at item 2 in bulk operation :
BulkWriteError({
  "writeErrors" : [
    {
      "index" : 2,
      "code" : 11000,
      "errmsg" : "E11000 duplicate key error index: test.movies.$_id_
      dup key: { : 4.0 }",
      "op" : {
        "_id" : 4,
        "title" : "The Princess Bride"
      }
    }
  ],
  "writeConcernErrors" : [ ],
  "nInserted" : 3,
  "nUpserted" : 0,
  "nMatched" : 0,
  "nModified" : 0,
  "nRemoved" : 0,
  "upserted" : [ ]
})
BulkWriteError@src/mongo/shell/bulk_api.js:367:48
BulkWriteResult/this.toError@src/mongo/shell/bulk_api.js:332:24
Bulk/this.execute@src/mongo/shell/bulk_api.js:1186:23
DBCollection.prototype.insertMany@src/mongo/shell/crud_api.js:314:5
@(shell):1:1
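
The difference between the two modes can be sketched in plain JavaScript. This is an in-memory model only, not the server's implementation: the function simulateInsertMany is hypothetical, a Map stands in for the collection, and the only error modeled is a duplicate "_id":

```javascript
// In-memory model of insertMany error handling. This is NOT the server's
// implementation; it only mimics the ordered/unordered rules described above.
function simulateInsertMany(collection, docs, options = {}) {
  const ordered = options.ordered !== false; // ordered is the default
  const result = { nInserted: 0, writeErrors: [] };
  for (let index = 0; index < docs.length; index++) {
    const doc = docs[index];
    if (collection.has(doc._id)) {
      // Duplicate "_id": record the error (code 11000 in the shell output).
      result.writeErrors.push({ index, code: 11000, op: doc });
      if (ordered) break; // ordered inserts stop at the first error
      continue;           // unordered inserts keep going
    }
    collection.set(doc._id, doc);
    result.nInserted++;
  }
  return result;
}

const docs = [
  { _id: 0, title: "Top Gun" },
  { _id: 1, title: "Back to the Future" },
  { _id: 1, title: "Gremlins" },
  { _id: 2, title: "Aliens" },
];
const orderedResult = simulateInsertMany(new Map(), docs);
const unorderedResult = simulateInsertMany(new Map(), docs, { ordered: false });
console.log(orderedResult.nInserted, unorderedResult.nInserted); // 2 3
```

In both runs the single write error is reported at index 2, matching the "index" field in the shell output.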

If you study these examples closely, you might note that the output of these two calls to insertMany hints that bulk writes might support operations other than inserts. While insertMany itself supports only inserts, MongoDB does provide a Bulk Write API that enables you to batch together a number of operations of different types in one call. While that is beyond the scope of this chapter, you can read about the Bulk Write API in the MongoDB documentation.

Insert Validation

MongoDB does minimal checks on data being inserted: it checks the document’s basic structure and adds an "_id" field if one does not exist. One of the basic structure checks is size: all documents must be smaller than 16 MB. This is a somewhat arbitrary limit (and may be raised in the future); it is mostly intended to prevent bad schema design and ensure consistent performance. To see the Binary JSON (BSON) size, in bytes, of the document doc, run Object.bsonsize(doc) from the shell.

To give you an idea of how much data 16 MB is, the entire text of War and Peace is just 3.14 MB.

These minimal checks also mean that it is fairly easy to insert invalid data (if you are trying to). Thus, you should only allow trusted sources, such as your application servers, to connect to the database. All of the MongoDB drivers for major languages (and most of the minor ones, too) do check for a variety of invalid data (documents that are too large, contain non-UTF-8 strings, or use unrecognized types) before sending anything to the database.

insert

In versions of MongoDB prior to 3.0, insert was the primary method for inserting documents into MongoDB. MongoDB drivers introduced a new CRUD API at the same time as the MongoDB 3.0 server release. As of MongoDB 3.2 the mongo shell also supports this API, which includes insertOne and insertMany as well as several other methods. The goal of the current CRUD API is to make the semantics of all CRUD operations consistent and clear across the drivers and the shell. While methods such as insert are still supported for backward compatibility, they should not be used in applications going forward. You should instead prefer insertOne and insertMany for creating documents.

Removing Documents

Now that there’s data in our database, let’s delete it. The CRUD API provides deleteOne and deleteMany for this purpose. Both of these methods take a filter document as their first parameter. The filter specifies a set of criteria to match against in removing documents. To delete the document with the "_id" value of 4, we use deleteOne in the mongo shell as illustrated here:

> db.movies.find()
{ "_id" : 0, "title" : "Top Gun"}
{ "_id" : 1, "title" : "Back to the Future"}
{ "_id" : 3, "title" : "Sixteen Candles"}
{ "_id" : 4, "title" : "The Terminator"}
{ "_id" : 5, "title" : "Scarface"}
> db.movies.deleteOne({"_id" : 4})
{ "acknowledged" : true, "deletedCount" : 1 }
> db.movies.find()
{ "_id" : 0, "title" : "Top Gun"}
{ "_id" : 1, "title" : "Back to the Future"}
{ "_id" : 3, "title" : "Sixteen Candles"}
{ "_id" : 5, "title" : "Scarface"}

In this example, we used a filter that could only match one document since "_id" values are unique in a collection. However, we can also specify a filter that matches multiple documents in a collection. In this case, deleteOne will delete the first document found that matches the filter. Which document is found first depends on several factors, including the order in which the documents were inserted, what updates were made to the documents (for some storage engines), and what indexes are specified. As with any database operation, be sure you know what effect your use of deleteOne will have on your data.

To delete all the documents that match a filter, use deleteMany:

> db.movies.find()
{ "_id" : 0, "title" : "Top Gun", "year" : 1986 }
{ "_id" : 1, "title" : "Back to the Future", "year" : 1985 }
{ "_id" : 3, "title" : "Sixteen Candles", "year" : 1984 }
{ "_id" : 4, "title" : "The Terminator", "year" : 1984 }
{ "_id" : 5, "title" : "Scarface", "year" : 1983 }
> db.movies.deleteMany({"year" : 1984})
{ "acknowledged" : true, "deletedCount" : 2 }
> db.movies.find()
{ "_id" : 0, "title" : "Top Gun", "year" : 1986 }
{ "_id" : 1, "title" : "Back to the Future", "year" : 1985 }
{ "_id" : 5, "title" : "Scarface", "year" : 1983 }
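
The contrast between the two methods can be modeled in a few lines of plain JavaScript. This is a sketch under simplifying assumptions: the functions below are hypothetical stand-ins that work on an in-memory array, the filter supports only exact equality, and "first" simply means first in array order (on a real server it depends on the factors described earlier):

```javascript
// Simplified matcher: every key in the filter must equal the document's value.
function matches(doc, filter) {
  return Object.keys(filter).every((key) => doc[key] === filter[key]);
}

// Model of deleteOne: remove only the first matching document.
function deleteOne(docs, filter) {
  const index = docs.findIndex((doc) => matches(doc, filter));
  if (index !== -1) docs.splice(index, 1);
  return { deletedCount: index === -1 ? 0 : 1 };
}

// Model of deleteMany: remove every matching document.
function deleteMany(docs, filter) {
  const before = docs.length;
  const kept = docs.filter((doc) => !matches(doc, filter));
  docs.splice(0, docs.length, ...kept);
  return { deletedCount: before - kept.length };
}

const movies = [
  { _id: 3, title: "Sixteen Candles", year: 1984 },
  { _id: 4, title: "The Terminator", year: 1984 },
  { _id: 5, title: "Scarface", year: 1983 },
];
const forDeleteOne = movies.map((movie) => ({ ...movie }));
const forDeleteMany = movies.map((movie) => ({ ...movie }));
console.log(deleteOne(forDeleteOne, { year: 1984 }).deletedCount);   // 1
console.log(deleteMany(forDeleteMany, { year: 1984 }).deletedCount); // 2
```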

As a more realistic use case, suppose you want to remove every user from the mailing.list collection where the value for "opt-out" is true:

> db.mailing.list.deleteMany({"opt-out" : true})

In versions of MongoDB prior to 3.0, remove was the primary method for deleting documents. MongoDB drivers introduced the deleteOne and deleteMany methods at the same time as the MongoDB 3.0 server release, and the shell began supporting these methods in MongoDB 3.2. While remove is still supported for backward compatibility, you should use deleteOne and deleteMany in your applications. The current CRUD API provides a cleaner set of semantics and, especially for multidocument operations, helps application developers avoid a couple of common pitfalls with the previous API.

drop

It is possible to use deleteMany to remove all documents in a collection:

> db.movies.find()
{ "_id" : 0, "title" : "Top Gun", "year" : 1986 }
{ "_id" : 1, "title" : "Back to the Future", "year" : 1985 }
{ "_id" : 3, "title" : "Sixteen Candles", "year" : 1984 }
{ "_id" : 4, "title" : "The Terminator", "year" : 1984 }
{ "_id" : 5, "title" : "Scarface", "year" : 1983 }
> db.movies.deleteMany({})
{ "acknowledged" : true, "deletedCount" : 5 }
> db.movies.find()

Removing documents is usually a fairly quick operation. However, if you want to clear an entire collection, it is faster to drop it:

> db.movies.drop()
true

After dropping the collection, recreate any indexes on the new, empty collection.

Once data has been removed, it is gone forever. There is no way to undo a delete or drop operation or recover deleted documents, except, of course, by restoring a previously backed up version of the data. See Chapter 23 for a detailed discussion of MongoDB backup and restore.

Updating Documents

Once a document is stored in the database, it can be changed using one of several update methods: updateOne, updateMany, and replaceOne. updateOne and updateMany each take a filter document as their first parameter and a modifier document, which describes changes to make, as the second parameter. replaceOne also takes a filter as the first parameter, but as the second parameter replaceOne expects a document with which it will replace the document matching the filter.

Updating a document is atomic: if two updates happen at the same time, whichever one reaches the server first will be applied, and then the next one will be applied. Thus, conflicting updates can safely be sent in rapid-fire succession without any documents being corrupted: the last update will “win.” The Document Versioning pattern (see “Schema Design Patterns”) is worth considering if you don’t want the default behavior.

Document Replacement

replaceOne fully replaces a matching document with a new one. This can be useful to do a dramatic schema migration (see Chapter 9 for schema migration strategies). For example, suppose we are making major changes to a user document, which looks like the following:

{
    "_id" : ObjectId("4b2b9f67a1f631733d917a7a"),
    "name" : "joe",
    "friends" : 32,
    "enemies" : 2
}

We want to move the "friends" and "enemies" fields to a "relationships" subdocument. We can change the structure of the document in the shell and then replace the database’s version with a replaceOne:

> var joe = db.users.findOne({"name" : "joe"});
> joe.relationships = {"friends" : joe.friends, "enemies" : joe.enemies};
{
    "friends" : 32,
    "enemies" : 2
}
> joe.username = joe.name;
"joe"
> delete joe.friends;
true
> delete joe.enemies;
true
> delete joe.name;
true
> db.users.replaceOne({"name" : "joe"}, joe);

Now, doing a findOne shows that the structure of the document has been updated:

{
    "_id" : ObjectId("4b2b9f67a1f631733d917a7a"),
    "username" : "joe",
    "relationships" : {
        "friends" : 32,
        "enemies" : 2
    }
}
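
When the same restructuring has to be applied to many documents, it may be convenient to express it as a function that builds the new document instead of mutating the old one. The following is a minimal sketch in plain JavaScript; the name restructureUser is hypothetical, and a plain string stands in for the ObjectId:

```javascript
// Build the restructured user document without mutating the original.
// "name" becomes "username"; "friends" and "enemies" move into a
// "relationships" subdocument; every other key is copied unchanged.
function restructureUser(user) {
  const { name, friends, enemies, ...rest } = user;
  return {
    ...rest,
    username: name,
    relationships: { friends, enemies },
  };
}

const joe = {
  _id: "4b2b9f67a1f631733d917a7a", // string stand-in for the ObjectId
  name: "joe",
  friends: 32,
  enemies: 2,
};
const restructured = restructureUser(joe);
console.log(restructured);
```

The result could then be passed as the second argument to replaceOne, with a filter on "_id".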

A common mistake is matching more than one document with the criteria and then creating a duplicate "_id" value with the second parameter. The database will throw an error for this, and no documents will be updated.

For example, suppose we create several documents with the same value for "name", but we don’t realize it:

> db.people.find()
{"_id" : ObjectId("4b2b9f67a1f631733d917a7b"), "name" : "joe", "age" : 65}
{"_id" : ObjectId("4b2b9f67a1f631733d917a7c"), "name" : "joe", "age" : 20}
{"_id" : ObjectId("4b2b9f67a1f631733d917a7d"), "name" : "joe", "age" : 49}

Now, if it’s Joe #2’s birthday, we want to increment the value of his "age" key, so we might say this:

> joe = db.people.findOne({"name" : "joe", "age" : 20});
{
    "_id" : ObjectId("4b2b9f67a1f631733d917a7c"),
    "name" : "joe",
    "age" : 20
}
> joe.age++;
> db.people.replaceOne({"name" : "joe"}, joe);
E11001 duplicate key on update

What happened? When you do the update, the database will look for a document matching {"name" : "joe"}. The first one it finds will be the 65-year-old Joe. It will attempt to replace that document with the one in the joe variable, but there’s already a document in this collection with the same "_id". Thus, the update will fail, because "_id" values must be unique. The best way to avoid this situation is to make sure that your update always specifies a unique document, perhaps by matching on a key like "_id". For the preceding example, this would be the correct update to use:

> db.people.replaceOne({"_id" : ObjectId("4b2b9f67a1f631733d917a7c")}, joe)

Using "_id" for the filter will also be efficient since "_id" values form the basis for the primary index of a collection. We’ll cover primary and secondary indexes and how indexing affects updates and other operations more in Chapter 5.
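
The failure and its fix can be reproduced with a small in-memory model. This is a sketch, not the server's logic: the replaceOne function below is a hypothetical stand-in, the "_id" values are shortened strings, and the model simply rejects any replacement whose "_id" differs from that of the matched document (the server's actual check and error message may differ):

```javascript
// Model of replaceOne: replace the first document that matches the filter.
// Assumption: a replacement carrying a different "_id" is rejected.
function replaceOne(docs, filter, replacement) {
  const index = docs.findIndex((doc) =>
    Object.keys(filter).every((key) => doc[key] === filter[key]));
  if (index === -1) return { matchedCount: 0, modifiedCount: 0 };
  if ("_id" in replacement && replacement._id !== docs[index]._id) {
    throw new Error("replacement _id differs from the matched document");
  }
  docs[index] = { ...replacement, _id: docs[index]._id };
  return { matchedCount: 1, modifiedCount: 1 };
}

const people = [
  { _id: "7b", name: "joe", age: 65 },
  { _id: "7c", name: "joe", age: 20 },
  { _id: "7d", name: "joe", age: 49 },
];
const joe = { ...people[1], age: 21 }; // Joe #2 after his birthday

let failedByName = false;
try {
  replaceOne(people, { name: "joe" }, joe); // matches the 65-year-old first
} catch (error) {
  failedByName = true;
}
const byId = replaceOne(people, { _id: "7c" }, joe); // matches exactly one
console.log(failedByName, byId.modifiedCount, people[1].age); // true 1 21
```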

Using Update Operators

Usually only certain portions of a document need to be updated. You can update specific fields in a document using atomic update operators. Update operators are special keys that can be used to specify complex update operations, such as altering, adding, or removing keys, and even manipulating arrays and embedded documents.

Suppose we’re keeping website analytics in a collection and want to increment a counter each time someone visits a page. We can use update operators to do this increment atomically. Each URL and its number of page views is stored in a document that looks like this:

{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "url" : "www.example.com",
    "pageviews" : 52
}

Every time someone visits a page, we can find the page by its URL and use the "$inc" modifier to increment the value of the "pageviews" key:

> db.analytics.updateOne({"url" : "www.example.com"},
... {"$inc" : {"pageviews" : 1}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }

Now, if we do a findOne, we see that "pageviews" has increased by one:

> db.analytics.findOne()
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "url" : "www.example.com",
    "pageviews" : 53
}

When using operators, the value of "_id" cannot be changed. (Whole-document replacement cannot change it either: the replacement document must keep the same "_id" value or omit the field.) Values for any other key, including other uniquely indexed keys, can be modified.

Getting started with the “$set” modifier

"$set" sets the value of a field. If the field does not yet exist, it will be created. This can be handy for updating schemas or adding user-defined keys. For example, suppose you have a simple user profile stored as a document that looks something like the following:

> db.users.findOne()
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "name" : "joe",
    "age" : 30,
    "sex" : "male",
    "location" : "Wisconsin"
}

This is a pretty bare-bones user profile. If the user wanted to store his favorite book in his profile, he could add it using "$set":

> db.users.updateOne({"_id" : ObjectId("4b253b067525f35f94b60a31")},
... {"$set" : {"favorite book" : "War and Peace"}})

Now the document will have a "favorite book" key:

> db.users.findOne()
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "name" : "joe",
    "age" : 30,
    "sex" : "male",
    "location" : "Wisconsin",
    "favorite book" : "War and Peace"
}

If the user decides that he actually enjoys a different book, "$set" can be used again to change the value:

> db.users.updateOne({"name" : "joe"},
... {"$set" : {"favorite book" : "Green Eggs and Ham"}})

"$set" can even change the type of the key it modifies. For instance, if our fickle user decides that he actually likes quite a few books, he can change the value of the "favorite book" key into an array:

> db.users.updateOne({"name" : "joe"},
... {"$set" : {"favorite book" :
...     ["Cat's Cradle", "Foundation Trilogy", "Ender's Game"]}})

If the user realizes that he actually doesn’t like reading, he can remove the key altogether with "$unset":

> db.users.updateOne({"name" : "joe"},
... {"$unset" : {"favorite book" : 1}})

Now the document will be the same as it was at the beginning of this example.

You can also use "$set" to reach in and change embedded documents:

> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "title" : "A Blog Post",
    "content" : "...",
    "author" : {
        "name" : "joe",
        "email" : "[email protected]"
    }
}
> db.blog.posts.updateOne({"author.name" : "joe"},
... {"$set" : {"author.name" : "joe schmoe"}})

> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "title" : "A Blog Post",
    "content" : "...",
    "author" : {
        "name" : "joe schmoe",
        "email" : "[email protected]"
    }
}
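
To make the dot notation concrete, here is a rough model of what "$set" and "$unset" do to a document, written in plain JavaScript. The helpers setPath and unsetPath are hypothetical and operate on an ordinary in-memory object; they illustrate the path-walking idea only and ignore arrays and other server-side rules:

```javascript
// Model of "$set" with dot notation: walk the path, creating embedded
// documents for missing keys, and assign the value at the last key.
function setPath(doc, path, value) {
  const keys = path.split(".");
  let target = doc;
  for (const key of keys.slice(0, -1)) {
    if (target[key] === undefined) {
      target[key] = {};
    } else if (typeof target[key] !== "object" || target[key] === null) {
      throw new Error(`cannot create a field inside non-document "${key}"`);
    }
    target = target[key];
  }
  target[keys[keys.length - 1]] = value;
  return doc;
}

// Model of "$unset": remove the key at the end of the path, if present.
function unsetPath(doc, path) {
  const keys = path.split(".");
  let target = doc;
  for (const key of keys.slice(0, -1)) {
    if (typeof target[key] !== "object" || target[key] === null) return doc;
    target = target[key];
  }
  delete target[keys[keys.length - 1]];
  return doc;
}

const post = { title: "A Blog Post", author: { name: "joe" } };
setPath(post, "author.name", "joe schmoe");      // change an embedded key
setPath(post, "favorite book", "War and Peace"); // create a new key
unsetPath(post, "favorite book");                // and remove it again
console.log(post.author.name); // joe schmoe
```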

You must always use a $-modifier for adding, changing, or removing keys. A common error people make when starting out is to try to set the value of a key to some other value by doing an update that resembles this:

> db.blog.posts.updateOne({"author.name" : "joe"}, 
... {"author.name" : "joe schmoe"})

This will result in an error. The update document must contain update operators. Previous versions of the CRUD API did not catch this type of error. Earlier update methods would simply complete a whole document replacement in such situations. It is this type of pitfall that led to the creation of a new CRUD API.

Incrementing and decrementing

The "$inc" operator can be used to change the value for an existing key or to create a new key if it does not already exist. It’s useful for updating analytics, karma, votes, or anything else that has a changeable, numeric value.

Suppose we are creating a game collection where we want to save games and update scores as they change. When a user starts playing, say, a game of pinball, we can insert a document that identifies the game by name and the user playing it:

> db.games.insertOne({"game" : "pinball", "user" : "joe"})

When the ball hits a bumper, the game should increment the player’s score. Since points in pinball are given out pretty freely, let’s say that the base unit of points a player can earn is 50. We can use the "$inc" modifier to add 50 to the player’s score:

> db.games.updateOne({"game" : "pinball", "user" : "joe"},
... {"$inc" : {"score" : 50}})

If we look at the document after this update, we’ll see the following:

> db.games.findOne()
{
     "_id" : ObjectId("4b2d75476cc613d5ee930164"),
     "game" : "pinball",
     "user" : "joe",
     "score" : 50
}

The "score" key did not already exist, so it was created by "$inc" and set to the increment amount: 50.

If the ball lands in a “bonus” slot, we want to add 10,000 to the score. We can do this by passing a different value to "$inc":

> db.games.updateOne({"game" : "pinball", "user" : "joe"},
... {"$inc" : {"score" : 10000}})

Now if we look at the game, we’ll see the following:

> db.games.findOne()
{
     "_id" : ObjectId("4b2d75476cc613d5ee930164"),
     "game" : "pinball",
     "user" : "joe",
     "score" : 10050
}

The "score" key existed and had a numeric value, so the server added 10,000 to it.

"$inc" is similar to "$set", but it is designed for incrementing (and decrementing) numbers. "$inc" can be used only on values of type integer, long, double, or decimal. If it is used on any other type of value, it will fail. This includes types that many languages will automatically cast into numbers, like nulls, booleans, or strings of numeric characters:

> db.strcounts.insert({"count" : "1"})
WriteResult({ "nInserted" : 1 })
> db.strcounts.update({}, {"$inc" : {"count" : 1}})
WriteResult({
  "nMatched" : 0,
  "nUpserted" : 0,
  "nModified" : 0,
  "writeError" : {
    "code" : 16837,
    "errmsg" : "Cannot apply $inc to a value of non-numeric type.
    {_id: ObjectId('5726c0d36855a935cb57a659')} has the field 'count' of
    non-numeric type String"
  }
})

Also, the value of the "$inc" key must be a number. You cannot increment by a string, array, or other nonnumeric value. Doing so will give a “Modifier "$inc" allowed for numbers only” error message. To modify other types, use "$set" or one of the following array operators.
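
The three cases described in this section (missing key, numeric key, non-numeric key) can be summarized in a short plain JavaScript model. The function applyInc is hypothetical and works on an in-memory object; its error messages are illustrative and are not the server's:

```javascript
// Model of "$inc": create the key if it is missing, add to it if it is
// numeric, and fail for any other type (no implicit casting).
function applyInc(doc, field, amount) {
  if (typeof amount !== "number") {
    throw new TypeError("increment amount must be a number");
  }
  if (!(field in doc)) {
    doc[field] = amount; // key did not exist: set it to the increment
  } else if (typeof doc[field] === "number") {
    doc[field] += amount; // key is numeric: add to it
  } else {
    throw new TypeError(`cannot increment non-numeric field "${field}"`);
  }
  return doc;
}

const game = { game: "pinball", user: "joe" };
applyInc(game, "score", 50);    // creates "score"
applyInc(game, "score", 10000); // adds to the existing value
console.log(game.score); // 10050
```

Passing a negative amount decrements the value in the same way.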

Array operators

An extensive class of update operators exists for manipulating arrays. Arrays are common and powerful data structures: not only are they lists that can be referenced by index, but they can also double as sets.

Adding elements

"$push" adds elements to the end of an array if the array exists and creates a new array if it does not. For example, suppose that we are storing blog posts and want to add a "comments" key containing an array. We can push a comment onto the nonexistent "comments" array, which will create the array and add the comment:

> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "title" : "A blog post",
    "content" : "..."
}
> db.blog.posts.updateOne({"title" : "A blog post"},
... {"$push" : {"comments" :
...     {"name" : "joe", "email" : "[email protected]",
...     "content" : "nice post."}}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "title" : "A blog post",
    "content" : "...",
    "comments" : [
        {
            "name" : "joe",
            "email" : "[email protected]",
            "content" : "nice post."
        }
    ]
}

Now, if we want to add another comment, we can simply use "$push" again:

> db.blog.posts.updateOne({"title" : "A blog post"},
... {"$push" : {"comments" :
...     {"name" : "bob", "email" : "[email protected]",
...     "content" : "good post."}}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "title" : "A blog post",
    "content" : "...",
    "comments" : [
        {
            "name" : "joe",
            "email" : "[email protected]",
            "content" : "nice post."
        },
        {
            "name" : "bob",
            "email" : "[email protected]",
            "content" : "good post."
        }
    ]
}

This is the “simple” form of "$push", but you can use it for more complex array operations as well. The MongoDB query language provides modifiers for some operators, including "$push". You can push multiple values in one operation using the "$each" modifier for "$push":

> db.stock.ticker.updateOne({"_id" : "GOOG"},
... {"$push" : {"hourly" : {"$each" : [562.776, 562.790, 559.123]}}})

This would push three new elements onto the array.

If you want to cap the length of an array, you can use the "$slice" modifier with "$push" to prevent it from growing beyond a certain size, effectively making a “top N” list of items:

> db.movies.updateOne({"genre" : "horror"},
... {"$push" : {"top10" : {"$each" : ["Nightmare on Elm Street", "Saw"],
...                        "$slice" : -10}}})

This example limits the array to the last 10 elements pushed.

If the array is smaller than 10 elements (after the push), all elements will be kept. If the array is larger than 10 elements, only the last 10 elements will be kept. Thus, "$slice" can be used to create a queue in a document.

Finally, you can apply the "$sort" modifier to "$push" operations before trimming:

> db.movies.updateOne({"genre" : "horror"},
... {"$push" : {"top10" : {"$each" : [{"name" : "Nightmare on Elm Street",
...                                    "rating" : 6.6},
...                                   {"name" : "Saw", "rating" : 4.3}],
...                        "$slice" : -10,
...                        "$sort" : {"rating" : -1}}}})

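This will sort all of the objects in the array by their "rating" field in descending order and then trim the array to 10 elements. Because "$slice" is -10 here, the last 10 elements of the sorted array (the lowest rated) are kept; a positive value of 10 would keep the first 10 instead. Note that you must include "$each"; you cannot just "$slice" or "$sort" an array with "$push".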
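
The way the modifiers combine (append the new values, then sort, then slice) can be modeled in plain JavaScript. This is a sketch only: pushWithModifiers is a hypothetical function, it sorts on a single numeric field, and "Movie A" and "Movie B" with their ratings are made-up sample data:

```javascript
// Model of "$push" with modifiers: append the "$each" values, then sort
// (if requested), then slice. Sorting handles one numeric field only.
function pushWithModifiers(array, { each, sort, slice }) {
  let result = array.concat(each);
  if (sort) {
    const [[field, direction]] = Object.entries(sort);
    result.sort((a, b) => direction * (a[field] - b[field]));
  }
  if (slice !== undefined) {
    // Negative slice keeps the last |slice| elements; positive keeps the first.
    result = slice < 0 ? result.slice(slice) : result.slice(0, slice);
  }
  return result;
}

const current = [
  { name: "Movie A", rating: 9 }, // made-up sample data
  { name: "Movie B", rating: 8 },
];
const pushed = [
  { name: "Nightmare on Elm Street", rating: 6.6 },
  { name: "Saw", rating: 4.3 },
];
const firstThree = pushWithModifiers(current, { each: pushed, sort: { rating: -1 }, slice: 3 });
const lastThree = pushWithModifiers(current, { each: pushed, sort: { rating: -1 }, slice: -3 });
console.log(firstThree.map((movie) => movie.name)); // Movie A, Movie B, Nightmare on Elm Street
console.log(lastThree.map((movie) => movie.name));  // Movie B, Nightmare on Elm Street, Saw
```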

Using arrays as sets

You might want to treat an array as a set, only adding values if they are not present. This can be done using "$ne" in the query document. For example, to push an author onto a list of citations, but only if they aren’t already there, use the following:

> db.papers.updateOne({"authors cited" : {"$ne" : "Richie"}},
... {$push : {"authors cited" : "Richie"}})

This can also be done with "$addToSet", which is useful for cases where "$ne" won’t work or where "$addToSet" describes what is happening better.

For example, suppose you have a document that represents a user. You might have a set of email addresses that they have added:

> db.users.findOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")})
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "username" : "joe",
    "emails" : [
        "[email protected]",
        "[email protected]",
        "[email protected]"
    ]
}

When adding another address, you can use "$addToSet" to prevent duplicates:

> db.users.updateOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")},
... {"$addToSet" : {"emails" : "[email protected]"}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 0 }
> db.users.findOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")})
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "username" : "joe",
    "emails" : [
        "[email protected]",
        "[email protected]",
        "[email protected]"
    ]
}
> db.users.updateOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")},
... {"$addToSet" : {"emails" : "[email protected]"}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
> db.users.findOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")})
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "username" : "joe",
    "emails" : [
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]"
    ]
}

You can also use "$addToSet" in conjunction with "$each" to add multiple unique values, which cannot be done with the "$ne"/"$push" combination. For instance, you could use these operators if the user wanted to add more than one email address:

> db.users.updateOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")}, 
... {"$addToSet" : {"emails" : {"$each" :
...     ["[email protected]", "[email protected]", "[email protected]"]}}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
> db.users.findOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")})
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "username" : "joe",
    "emails" : [
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]"
    ]
}
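
The set-like behavior can be modeled in plain JavaScript as follows. This is a sketch: addToSet here is a hypothetical function, the addresses are placeholder values, and equality is checked with Array.prototype.includes, which is only adequate for primitive values such as strings:

```javascript
// Model of "$addToSet" with "$each": append each value only if an equal
// value is not already present. Returns a new array and a modified flag.
function addToSet(array, values) {
  const result = array.slice();
  let modified = false;
  for (const value of values) {
    if (!result.includes(value)) {
      result.push(value);
      modified = true;
    }
  }
  return { result, modified };
}

const emails = ["a@example.com", "b@example.com"]; // placeholder addresses
const duplicate = addToSet(emails, ["b@example.com"]);
const mixed = addToSet(emails, ["b@example.com", "c@example.com", "c@example.com"]);
console.log(duplicate.modified, duplicate.result.length); // false 2
console.log(mixed.modified, mixed.result.length);         // true 3
```

The modified flag plays the role of "modifiedCount" in the shell output: it stays false when every value was already present.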

Removing elements

There are a few ways to remove elements from an array. If you want to treat the array like a queue or a stack, you can use "$pop", which can remove elements from either end. {"$pop" : {"key" : 1}} removes an element from the end of the array. {"$pop" : {"key" : -1}} removes it from the beginning.

Sometimes an element should be removed based on specific criteria, rather than its position in the array. "$pull" is used to remove elements of an array that match the given criteria. For example, suppose we have a list of things that need to be done, but not in any specific order:

> db.lists.insertOne({"todo" : ["dishes", "laundry", "dry cleaning"]})

If we do the laundry first, we can remove it from the list with the following:

> db.lists.updateOne({}, {"$pull" : {"todo" : "laundry"}})

Now if we do a find, we’ll see that there are only two elements remaining in the array:

> db.lists.findOne()
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "todo" : [
        "dishes",
        "dry cleaning"
    ]
}

Pulling removes all matching elements, not just a single match. If you have an array that looks like [1, 1, 2, 1] and pull 1, you’ll end up with a single-element array, [2].

Array operators can be used only on keys with array values. For example, you cannot push onto an integer or pop off of a string. Use "$set" or "$inc" to modify scalar values.
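
For comparison, here is how the two removal operators behave when modeled on plain JavaScript arrays. The functions pop and pull are hypothetical stand-ins that return new arrays; pull handles only the simple case of matching a plain value:

```javascript
// Model of "$pop": 1 removes the last element, -1 removes the first.
function pop(array, direction) {
  return direction === -1 ? array.slice(1) : array.slice(0, -1);
}

// Model of "$pull" with a plain value: remove every equal element.
function pull(array, value) {
  return array.filter((element) => element !== value);
}

const todo = ["dishes", "laundry", "dry cleaning"];
console.log(pop(todo, 1));           // [ 'dishes', 'laundry' ]
console.log(pop(todo, -1));          // [ 'laundry', 'dry cleaning' ]
console.log(pull(todo, "laundry"));  // [ 'dishes', 'dry cleaning' ]
console.log(pull([1, 1, 2, 1], 1));  // [ 2 ]
```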

Positional array modifications

Array manipulation becomes a little trickier when you have multiple values in an array and want to modify some of them. There are two ways to manipulate values in arrays: by position or by using the positional operator (the $ character).

Arrays use 0-based indexing, and elements can be selected as though their index were a document key. For example, suppose we have a document containing an array with a few embedded documents, such as a blog post with comments:

> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b329a216cc613d5ee930192"),
    "content" : "...",
    "comments" : [
        {
            "comment" : "good post",
            "author" : "John",
            "votes" : 0
        },
        {
            "comment" : "i thought it was too short",
            "author" : "Claire",
            "votes" : 3
        },
        {
            "comment" : "free watches",
            "author" : "Alice",
            "votes" : -5
        },
        {
            "comment" : "vacation getaways",
            "author" : "Lynn",
            "votes" : -7
        }
    ]
}

If we want to increment the number of votes for the first comment, we can say the following:

> db.blog.posts.updateOne({"_id" : post_id},
... {"$inc" : {"comments.0.votes" : 1}})
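
To see how a dotted path with a numeric index addresses an array element, here is a small plain-JavaScript sketch. It only models the path lookup on an in-memory object; incPath is a hypothetical helper, and the sample document is a trimmed copy of the post shown earlier:

```javascript
// Model of how a dotted path such as "comments.0.votes" addresses a value.
// incPath is a hypothetical helper that mimics "$inc" on one in-memory document.
function incPath(doc, path, amount) {
    var keys = path.split(".");
    var target = doc;
    // walk to the parent of the final key; "0" indexes into the array
    for (var i = 0; i < keys.length - 1; i++) {
        target = target[keys[i]];
    }
    var last = keys[keys.length - 1];
    // treat a missing field as 0 before adding the amount
    target[last] = (target[last] || 0) + amount;
    return doc;
}

var post = {
    "content" : "...",
    "comments" : [
        {"comment" : "good post", "author" : "John", "votes" : 0},
        {"comment" : "i thought it was too short", "author" : "Claire", "votes" : 3}
    ]
};

incPath(post, "comments.0.votes", 1);
console.log(post.comments[0].votes);  // 1
console.log(post.comments[1].votes);  // 3
```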

In many cases, though, we don’t know what index of the array to modify without querying for the document first and examining it. To get around this, MongoDB has a positional operator, $, that figures out which element of the array the query document matched and updates that element. For example, if we have a user named John who updates his name to Jim, we can replace it in the comments by using the positional operator:

> db.blog.posts.updateOne({"comments.author" : "John"},
... {"$set" : {"comments.$.author" : "Jim"}})

The positional operator updates only the first match. Thus, if John had left more than one comment, his name would be changed only for the first comment he left.
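
The first-match-only rule can be sketched in plain JavaScript. This is a model of the behavior, not of MongoDB’s implementation; setFirstMatch is a hypothetical helper, and the third comment is made-up data added so that John appears twice:

```javascript
// Model of {"$set" : {"comments.$.author" : ...}}: only the first array
// element matched by the query is changed. setFirstMatch is hypothetical.
function setFirstMatch(array, matches, field, value) {
    var index = array.findIndex(matches);
    if (index !== -1) {
        array[index][field] = value;
    }
    return index;  // position that "$" resolved to, or -1 if nothing matched
}

var comments = [
    {"comment" : "good post", "author" : "John"},
    {"comment" : "i thought it was too short", "author" : "Claire"},
    {"comment" : "one more thing", "author" : "John"}  // illustrative extra comment
];

var updatedIndex = setFirstMatch(
    comments,
    function (c) { return c.author === "John"; },
    "author",
    "Jim"
);

console.log(updatedIndex);        // 0
console.log(comments[0].author);  // Jim
console.log(comments[2].author);  // John
```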

Updates using array filters

MongoDB 3.6 introduced another option for updating individual array elements: arrayFilters. This option enables us to modify array elements matching particular criteria. For example, if we want to hide all comments with five or more down votes, we can do something like the following:

db.blog.posts.updateOne(
   {"_id" : post_id },
   { $set: { "comments.$[elem].hidden" : true } },
   {
     arrayFilters: [ { "elem.votes": { $lte: -5 } } ]
   }
)

This command defines elem as the identifier for each matching element in the "comments" array. For every comment identified by elem whose votes value is less than or equal to -5, the update adds a field called "hidden" to that comment’s embedded document and sets its value to true.
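
Unlike the positional operator, an array filter touches every element that satisfies its condition. A plain-JavaScript model of that behavior might look like the following; hideDownvoted is a hypothetical helper, and the data is a trimmed copy of the comments shown earlier:

```javascript
// Model of "comments.$[elem].hidden" with arrayFilters [{"elem.votes" : {$lte : -5}}].
// hideDownvoted is a hypothetical helper that works on an in-memory array.
function hideDownvoted(comments, threshold) {
    comments.forEach(function (elem) {
        // the filter condition: votes less than or equal to the threshold
        if (elem.votes <= threshold) {
            elem.hidden = true;
        }
    });
    return comments;
}

var comments = [
    {"author" : "John",   "votes" : 0},
    {"author" : "Claire", "votes" : 3},
    {"author" : "Alice",  "votes" : -5},
    {"author" : "Lynn",   "votes" : -7}
];

hideDownvoted(comments, -5);

var hiddenFlags = comments.map(function (c) { return c.hidden === true; });
console.log(hiddenFlags);  // [ false, false, true, true ]
```

Comments that do not satisfy the filter are left alone; they do not gain a "hidden" field at all.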

Upserts

An upsert is a special type of update. If no document is found that matches the filter, a new document will be created by combining the criteria and update documents. If a matching document is found, it will be updated normally. Upserts can be handy because they can eliminate the need to “seed” your collection: you can often have the same code create and update documents.

Let’s go back to our example that records the number of views for each page of a website. Without an upsert, we might try to find the URL and increment the number of views or create a new document if the URL doesn’t exist. If we were to write this out as a JavaScript program it might look something like the following:

// check if we have an entry for this page
blog = db.analytics.findOne({url : "/blog"})

// if we do, add one to the number of views and save
if (blog) {
  blog.pageviews++;
  db.analytics.save(blog);
}
// otherwise, create a new document for this page
else {
  db.analytics.insertOne({url : "/blog", pageviews : 1})
}

This means we are making a round trip to the database, plus sending an update or insert, every time someone visits a page. If we are running this code in multiple processes, we are also subject to a race condition where more than one document can be inserted for a given URL.
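
The race is easier to see when the two steps are pulled apart. The following plain-JavaScript sketch uses an in-memory array as a stand-in for the analytics collection (an assumption made purely for illustration) and interleaves two processes by hand; findPage and recordView are hypothetical helpers:

```javascript
// In-memory stand-in for the analytics collection (illustration only).
var analytics = [];

// Step 1 of the check-then-act code: look the page up.
function findPage(url) {
    var found = analytics.find(function (doc) { return doc.url === url; });
    return found || null;
}

// Step 2: increment or insert, based on what step 1 saw earlier.
function recordView(url, seen) {
    if (seen) {
        seen.pageviews++;
    } else {
        analytics.push({url : url, pageviews : 1});
    }
}

// Processes A and B both read before either one writes.
var seenByA = findPage("/blog");  // null: no document yet
var seenByB = findPage("/blog");  // null: still no document
recordView("/blog", seenByA);     // A inserts
recordView("/blog", seenByB);     // B inserts a duplicate

console.log(analytics.length);    // 2
```

Both processes acted on a stale read, so the collection ends up with two documents for the same URL, each claiming a single page view.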

We can eliminate the race condition and cut down the amount of code by just sending an upsert to the database (the third parameter to updateOne and updateMany is an options document that enables us to specify this):

> db.analytics.updateOne({"url" : "/blog"}, {"$inc" : {"pageviews" : 1}}, 
... {"upsert" : true})

This line does exactly what the previous code block does, except it’s faster and atomic! The new document is created by using the criteria document as a base and applying any modifier documents to it.

For example, if you do an upsert that matches on a key and increments the value of that key, the increment will be applied to the value taken from the match criteria:

> db.users.updateOne({"rep" : 25}, {"$inc" : {"rep" : 3}}, {"upsert" : true})
{
    "acknowledged" : true,
    "matchedCount" : 0,
    "modifiedCount" : 0,
    "upsertedId" : ObjectId("5727b2a7223502483c7f3acd")
}
> db.users.findOne({"_id" : ObjectId("5727b2a7223502483c7f3acd")} )
{ "_id" : ObjectId("5727b2a7223502483c7f3acd"), "rep" : 28 }

The upsert creates a new document with a "rep" of 25 and then increments that by 3, giving us a document where "rep" is 28. If the upsert option were not specified, {"rep" : 25} would not match any documents, so nothing would happen.

If we run the upsert again (with the criterion {"rep" : 25}), it will create another new document. This is because the criterion does not match the only document in the collection. (Its "rep" is 28.)
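
The same sequence can be traced with a small in-memory model. This is a sketch under simplifying assumptions: the filter is a plain equality match, "_id" values come from a counter rather than ObjectIds, and upsertInc is a hypothetical helper:

```javascript
var users = [];   // in-memory stand-in for the users collection
var nextId = 1;   // counter used in place of ObjectIds (an assumption)

// upsertInc is hypothetical: it mimics
// updateOne(filter, {"$inc" : increments}, {"upsert" : true})
// for filters that are simple equality matches.
function upsertInc(collection, filter, increments) {
    var keys = Object.keys(filter);
    var doc = collection.find(function (d) {
        return keys.every(function (k) { return d[k] === filter[k]; });
    });
    var inserted = false;
    if (!doc) {
        // no match: seed a new document from the criteria
        doc = Object.assign({"_id" : nextId++}, filter);
        collection.push(doc);
        inserted = true;
    }
    // apply the modifier to the matched or newly seeded document
    Object.keys(increments).forEach(function (k) {
        doc[k] = (doc[k] || 0) + increments[k];
    });
    return inserted;
}

var firstInserted = upsertInc(users, {"rep" : 25}, {"rep" : 3});
var secondInserted = upsertInc(users, {"rep" : 25}, {"rep" : 3});

console.log(firstInserted, secondInserted);                  // true true
console.log(users.map(function (u) { return u.rep; }));      // [ 28, 28 ]
```

The second call inserts again because the only existing document has a "rep" of 28 and therefore no longer matches {"rep" : 25}.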

Sometimes a field needs to be set when a document is created, but not changed on subsequent updates. This is what "$setOnInsert" is for. "$setOnInsert" is an operator that only sets the value of a field when the document is being inserted. Thus, we could do something like this:

> db.users.updateOne({}, {"$setOnInsert" : {"createdAt" : new Date()}},
... {"upsert" : true})
{
    "acknowledged" : true,
    "matchedCount" : 0,
    "modifiedCount" : 0,
    "upsertedId" : ObjectId("5727b4ac223502483c7f3ace")
}
> db.users.findOne()
{
    "_id" : ObjectId("5727b4ac223502483c7f3ace"),
    "createdAt" : ISODate("2016-05-02T20:12:28.640Z")
}

If we run this update again, it will match the existing document, nothing will be inserted, and so the "createdAt" field will not be changed:

> db.users.updateOne({}, {"$setOnInsert" : {"createdAt" : new Date()}},
... {"upsert" : true})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 0 }
> db.users.findOne()
{
    "_id" : ObjectId("5727b4ac223502483c7f3ace"),
    "createdAt" : ISODate("2016-05-02T20:12:28.640Z")
}

Note that you generally do not need to keep a "createdAt" field, as ObjectIds contain a timestamp of when the document was created. However, "$setOnInsert" can be useful for creating padding, initializing counters, and for collections that do not use ObjectIds.
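
The timestamp occupies the first four bytes of the ObjectId, stored as seconds since the Unix epoch. In the shell, ObjectId values expose a getTimestamp() method for reading it; the plain-JavaScript sketch below does the same decoding by hand on the hex string, using a hypothetical helper named objectIdToDate:

```javascript
// The first 4 bytes (8 hex characters) of an ObjectId hold seconds since the epoch.
// objectIdToDate is a hypothetical helper that works on the hex string alone.
function objectIdToDate(hex) {
    var seconds = parseInt(hex.substring(0, 8), 16);
    return new Date(seconds * 1000);
}

var created = objectIdToDate("5727b4ac223502483c7f3ace");
console.log(created.toISOString());  // 2016-05-02T20:12:28.000Z
```

The result agrees with the "createdAt" value stored above, apart from the milliseconds, which an ObjectId does not record.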

The save shell helper

save is a shell function that lets you insert a document if it doesn’t exist and update it if it does. It takes one argument: a document. If the document contains an "_id" key, save will do an upsert. Otherwise, it will do an insert. save is really just a convenience function so that programmers can quickly modify documents in the shell:

> var x = db.testcol.findOne()
> x.num = 42
42
> db.testcol.save(x)

Without save, the last line would have been more cumbersome:

db.testcol.replaceOne({"_id" : x._id}, x)

Updating Multiple Documents

So far in this chapter we have used updateOne to illustrate update operations. updateOne updates only the first document found that matches the filter criteria. If there are more matching documents, they will remain unchanged. To modify all of the documents matching a filter, use updateMany. updateMany follows the same semantics as updateOne and takes the same parameters. The key difference is in the number of documents that might be changed.

updateMany provides a powerful tool for performing schema migrations or rolling out new features to certain users. Suppose, for example, we want to give a gift to every user who has a birthday on a certain day. We can use updateMany to add a "gift" to their accounts. For example:

> db.users.insertMany([
... {birthday: "10/13/1978"},
... {birthday: "10/13/1978"},
... {birthday: "10/13/1978"}])
{
    "acknowledged" : true,
    "insertedIds" : [
        ObjectId("5727d6fc6855a935cb57a65b"),
        ObjectId("5727d6fc6855a935cb57a65c"),
        ObjectId("5727d6fc6855a935cb57a65d")
    ]
}
> db.users.updateMany({"birthday" : "10/13/1978"},
... {"$set" : {"gift" : "Happy Birthday!"}})
{ "acknowledged" : true, "matchedCount" : 3, "modifiedCount" : 3 }

The call to updateMany adds a "gift" field to each of the three documents we inserted into the users collection immediately before.
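
The difference between the two methods comes down to how many matching documents are changed. A plain-JavaScript model, using in-memory arrays and a hypothetical helper named setWhere, might look like this:

```javascript
// Build a fresh in-memory copy of the three users inserted above.
function makeUsers() {
    return [
        {"birthday" : "10/13/1978"},
        {"birthday" : "10/13/1978"},
        {"birthday" : "10/13/1978"}
    ];
}

// setWhere is hypothetical: a limit of 1 models updateOne,
// and a limit of Infinity models updateMany.
function setWhere(collection, field, value, changes, limit) {
    var modified = 0;
    collection.forEach(function (doc) {
        if (modified < limit && doc[field] === value) {
            Object.assign(doc, changes);
            modified++;
        }
    });
    return modified;
}

var gift = {"gift" : "Happy Birthday!"};
var modifiedOne = setWhere(makeUsers(), "birthday", "10/13/1978", gift, 1);
var modifiedMany = setWhere(makeUsers(), "birthday", "10/13/1978", gift, Infinity);

console.log(modifiedOne, modifiedMany);  // 1 3
```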

Returning Updated Documents

For some use cases it is important to return the modified document. In earlier versions of MongoDB, findAndModify was the method of choice in such situations. It is handy for manipulating queues and performing other operations that need get-and-set-style atomicity. However, findAndModify is prone to user error because it’s a complex method combining the functionality of three different types of operations: delete, replace, and update (including upserts).

MongoDB 3.2 introduced three new collection methods to the shell to accommodate the functionality of findAndModify, but with semantics that are easier to learn and remember: findOneAndDelete, findOneAndReplace, and findOneAndUpdate. The primary difference between these methods and, for example, updateOne is that they enable you to atomically get the value of a modified document. MongoDB 4.2 extended findOneAndUpdate to accept an aggregation pipeline for the update. The pipeline can consist of the following stages: $addFields and its alias $set, $project and its alias $unset, and $replaceRoot and its alias $replaceWith.

Suppose we have a collection of processes run in a certain order. Each is represented with a document that has the following form:

{
    "_id" : ObjectId(),
    "status" : "state",
    "priority" : N
}

"status" is a string that can be "READY", "RUNNING", or "DONE". We need to find the job with the highest priority in the "READY" state, run the process function, and then update the status to "DONE". We might try querying for the ready processes, sorting by priority, and updating the status of the highest-priority process to mark it as "RUNNING". Once we have processed it, we update the status to "DONE". This looks something like the following:

var cursor = db.processes.find({"status" : "READY"});
ps = cursor.sort({"priority" : -1}).limit(1).next();
db.processes.updateOne({"_id" : ps._id}, {"$set" : {"status" : "RUNNING"}});
do_something(ps);
db.processes.updateOne({"_id" : ps._id}, {"$set" : {"status" : "DONE"}});

This algorithm isn’t great because it is subject to a race condition. Suppose we have two threads running. If one thread (call it A) retrieved the document and another thread (call it B) retrieved the same document before A had updated its status to "RUNNING", then both threads would be running the same process. We can avoid this by checking the result as part of the update query, but this becomes complex:

var cursor = db.processes.find({"status" : "READY"});
cursor.sort({"priority" : -1}).limit(1);
// hasNext() is false once no "READY" process remains
while (cursor.hasNext()) {
    var ps = cursor.next();
    // claim the process only if it is still "READY"
    var result = db.processes.updateOne({"_id" : ps._id, "status" : "READY"},
                              {"$set" : {"status" : "RUNNING"}});

    if (result.modifiedCount === 1) {
        do_something(ps);
        db.processes.updateOne({"_id" : ps._id}, {"$set" : {"status" : "DONE"}});
        break;
    }
    // another thread claimed it first, so look for the next candidate
    cursor = db.processes.find({"status" : "READY"});
    cursor.sort({"priority" : -1}).limit(1);
}

Also, depending on timing, one thread may end up doing all the work while another thread uselessly trails it. Thread A could always grab the process, and then B would try to get the same process, fail, and leave A to do all the work.

Situations like this are perfect for findOneAndUpdate. findOneAndUpdate can return the item and update it in a single operation. In this case, it looks like the following:

> db.processes.findOneAndUpdate({"status" : "READY"},
... {"$set" : {"status" : "RUNNING"}},
... {"sort" : {"priority" : -1}})
{
    "_id" : ObjectId("4b3e7a18005cab32be6291f7"),
    "priority" : 1,
    "status" : "READY"
}

Notice that the status is still "READY" in the returned document because the findOneAndUpdate method defaults to returning the state of the document before it was modified. It will return the updated document if we set the "returnNewDocument" field in the options document to true. An options document is passed as the third parameter to findOneAndUpdate:

> db.processes.findOneAndUpdate({"status" : "READY"},
... {"$set" : {"status" : "RUNNING"}},
... {"sort" : {"priority" : -1},
...  "returnNewDocument": true})
{
    "_id" : ObjectId("4b3e7a18005cab32be6291f7"),
    "priority" : 1,
    "status" : "RUNNING"
}

Thus, the program becomes the following:

var ps = db.processes.findOneAndUpdate({"status" : "READY"},
                                       {"$set" : {"status" : "RUNNING"}},
                                       {"sort" : {"priority" : -1},
                                        "returnNewDocument": true})
do_something(ps)
db.processes.updateOne({"_id" : ps._id}, {"$set" : {"status" : "DONE"}})
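
To make the claim-in-one-step idea concrete, here is a plain-JavaScript model of the queue. It is a sketch only: the array stands in for the processes collection, the integer "_id" values are made up, and claimNext is a hypothetical helper that finds and updates in a single uninterrupted call, which is the property findOneAndUpdate provides on the server:

```javascript
// In-memory stand-in for the processes collection (illustrative data).
var processes = [
    {"_id" : 1, "status" : "READY", "priority" : 1},
    {"_id" : 2, "status" : "READY", "priority" : 5},
    {"_id" : 3, "status" : "DONE",  "priority" : 9}
];

// claimNext is hypothetical: it picks the highest-priority "READY" process,
// marks it "RUNNING", and returns a copy of the document before or after
// the change, depending on returnNewDocument.
function claimNext(collection, returnNewDocument) {
    var ready = collection.filter(function (d) { return d.status === "READY"; });
    if (ready.length === 0) {
        return null;  // nothing left to claim
    }
    ready.sort(function (a, b) { return b.priority - a.priority; });
    var doc = ready[0];
    var before = Object.assign({}, doc);
    doc.status = "RUNNING";
    return returnNewDocument ? Object.assign({}, doc) : before;
}

var a = claimNext(processes, true);
var b = claimNext(processes, true);
var c = claimNext(processes, true);

console.log(a._id, b._id, c);  // 2 1 null
```

Because finding and updating happen in one step, two workers can never claim the same process. The model also shows a case the short program above does not handle: when no document matches, findOneAndUpdate returns null, so real code should check for that before using the result.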

In addition to this one, there are two other methods you should be aware of. findOneAndReplace takes the same parameters and returns the document matching the filter either before or after the replacement, depending on the value of returnNewDocument. findOneAndDelete is similar except it does not take an update document as a parameter and has a subset of the options of the other two methods. findOneAndDelete returns the deleted document.
