This chapter covers the basics of moving data into and out of the database, including the following:
Adding new documents to a collection
Removing documents from a collection
Updating existing documents
Choosing the correct level of safety versus speed for all of these operations
Inserts are the basic method for adding data to MongoDB.
To insert a single document, use the collection’s insertOne method:
> db.movies.insertOne({"title" : "Stand by Me"})
insertOne will add an "_id" key to the document (if you do not supply one) and store the document in MongoDB.
If you need to insert multiple documents into a collection, you can use insertMany. This method enables you to pass an array of documents to the database. This is far more efficient because your code will not make a round trip to the database for each document inserted, but will insert them in bulk.
In the shell, you can try this out as follows:
> db.movies.drop()
true
> db.movies.insertMany([{"title" : "Ghostbusters"},
...                        {"title" : "E.T."},
...                        {"title" : "Blade Runner"}]);
{
    "acknowledged" : true,
    "insertedIds" : [
        ObjectId("572630ba11722fac4b6b4996"),
        ObjectId("572630ba11722fac4b6b4997"),
        ObjectId("572630ba11722fac4b6b4998")
    ]
}
> db.movies.find()
{ "_id" : ObjectId("572630ba11722fac4b6b4996"), "title" : "Ghostbusters" }
{ "_id" : ObjectId("572630ba11722fac4b6b4997"), "title" : "E.T." }
{ "_id" : ObjectId("572630ba11722fac4b6b4998"), "title" : "Blade Runner" }
Sending dozens, hundreds, or even thousands of documents at a time can make inserts significantly faster.
insertMany is useful if you are inserting multiple documents into a single collection. If you are just importing raw data (e.g., from a data feed or MySQL), there are command-line tools like mongoimport that can be used instead of a batch insert. On the other hand, it is often handy to munge data before saving it to MongoDB (converting dates to the date type or adding a custom "_id", for example). In such cases insertMany can be used for importing data, as well.
Current versions of MongoDB do not accept messages longer than 48 MB, so there is a limit to how much can be inserted in a single batch insert. If you attempt to insert more than 48 MB, many drivers will split up the batch insert into multiple 48 MB batch inserts. Check your driver documentation for details.
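As a rough illustration of the splitting described above, the following sketch groups documents into batches that stay under a size budget. The helper name splitIntoBatches is hypothetical, and the length of the JSON text is only a crude stand-in for a document's real BSON size; actual drivers measure BSON and may split differently.

```javascript
// Hypothetical helper: group documents into batches whose estimated size
// stays under maxBytes. JSON length is an assumption standing in for BSON size.
function splitIntoBatches(docs, maxBytes) {
  const batches = [];
  let current = [];
  let currentBytes = 0;
  for (const doc of docs) {
    const size = Buffer.byteLength(JSON.stringify(doc), "utf8");
    // Start a new batch when adding this document would exceed the budget.
    if (current.length > 0 && currentBytes + size > maxBytes) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(doc);
    currentBytes += size;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

const sampleDocs = [
  { title: "Ghostbusters" },
  { title: "E.T." },
  { title: "Blade Runner" },
];
// A tiny 45-byte budget is used here only to force a split.
const batches = splitIntoBatches(sampleDocs, 45);
console.log(batches.map((batch) => batch.length));
```

Each resulting batch could then be passed to its own insertMany call. A document that is larger than the budget on its own still ends up in a batch by itself, so a real implementation would also need to reject oversized documents.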
When performing a bulk insert using insertMany, if a document halfway through the array produces an error of some type, what happens depends on whether you have opted for ordered or unordered operations. As the second parameter to insertMany you may specify an options document. Specify true for the key "ordered" in the options document to ensure documents are inserted in the order they are provided. Specify false and MongoDB may reorder the inserts to increase performance. Ordered inserts are the default if no ordering is specified. For ordered inserts, the array passed to insertMany defines the insertion order. If a document produces an insertion error, no documents beyond that point in the array will be inserted. For unordered inserts, MongoDB will attempt to insert all documents, regardless of whether some insertions produce errors.
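The difference between the two modes can be modeled in a few lines of plain JavaScript. This is only an in-memory sketch of the semantics described above, not MongoDB's implementation; the function name simulateInsertMany and the shape of its return value are assumptions chosen to resemble the shell's output.

```javascript
// In-memory model of insertMany's error handling (illustrative only).
function simulateInsertMany(existingIds, docs, options = {}) {
  const ordered = options.ordered !== false; // ordered is the default
  const ids = new Set(existingIds);
  const inserted = [];
  const writeErrors = [];
  for (let index = 0; index < docs.length; index++) {
    const doc = docs[index];
    if (ids.has(doc._id)) {
      // 11000 is the duplicate key error code reported by the server.
      writeErrors.push({ index, code: 11000, op: doc });
      if (ordered) break; // ordered: stop at the first failure
      continue; // unordered: keep trying the remaining documents
    }
    ids.add(doc._id);
    inserted.push(doc);
  }
  return { nInserted: inserted.length, writeErrors };
}

const sampleBatch = [{ _id: "a" }, { _id: "b" }, { _id: "b" }, { _id: "c" }];
const orderedResult = simulateInsertMany([], sampleBatch);
const unorderedResult = simulateInsertMany([], sampleBatch, { ordered: false });
console.log(orderedResult.nInserted, unorderedResult.nInserted);
```

In both runs the duplicate at index 2 is reported as a write error; only the unordered run goes on to insert the document after it.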
In the following example, because ordered inserts are the default, only the first two documents will be inserted. The third document will produce an error, because you cannot insert two documents with the same "_id":
> db.movies.insertMany([
... {"_id" : 0, "title" : "Top Gun"},
... {"_id" : 1, "title" : "Back to the Future"},
... {"_id" : 1, "title" : "Gremlins"},
... {"_id" : 2, "title" : "Aliens"}])
2019-04-22T12:27:57.278-0400 E QUERY    [js] BulkWriteError: write error at item 2 in bulk operation :
BulkWriteError({
    "writeErrors" : [
        {
            "index" : 2,
            "code" : 11000,
            "errmsg" : "E11000 duplicate key error collection: test.movies index: _id_ dup key: { _id: 1.0 }",
            "op" : {
                "_id" : 1,
                "title" : "Gremlins"
            }
        }
    ],
    "writeConcernErrors" : [ ],
    "nInserted" : 2,
    "nUpserted" : 0,
    "nMatched" : 0,
    "nModified" : 0,
    "nRemoved" : 0,
    "upserted" : [ ]
})
BulkWriteError@src/mongo/shell/bulk_api.js:367:48
BulkWriteResult/this.toError@src/mongo/shell/bulk_api.js:332:24
Bulk/this.execute@src/mongo/shell/bulk_api.js:1186:23
DBCollection.prototype.insertMany@src/mongo/shell/crud_api.js:314:5
@(shell):1:1
If instead we specify unordered inserts, the first, second, and fourth documents in the array are inserted. The only insert that fails is the third document, again because of a duplicate "_id" error:
> db.movies.insertMany([
... {"_id" : 3, "title" : "Sixteen Candles"},
... {"_id" : 4, "title" : "The Terminator"},
... {"_id" : 4, "title" : "The Princess Bride"},
... {"_id" : 5, "title" : "Scarface"}],
... {"ordered" : false})
2019-05-01T17:02:25.511-0400 E QUERY    [thread1] BulkWriteError: write error at item 2 in bulk operation :
BulkWriteError({
    "writeErrors" : [
        {
            "index" : 2,
            "code" : 11000,
            "errmsg" : "E11000 duplicate key error index: test.movies.$_id_ dup key: { : 4.0 }",
            "op" : {
                "_id" : 4,
                "title" : "The Princess Bride"
            }
        }
    ],
    "writeConcernErrors" : [ ],
    "nInserted" : 3,
    "nUpserted" : 0,
    "nMatched" : 0,
    "nModified" : 0,
    "nRemoved" : 0,
    "upserted" : [ ]
})
BulkWriteError@src/mongo/shell/bulk_api.js:367:48
BulkWriteResult/this.toError@src/mongo/shell/bulk_api.js:332:24
Bulk/this.execute@src/mongo/shell/bulk_api.js:1186:23
DBCollection.prototype.insertMany@src/mongo/shell/crud_api.js:314:5
@(shell):1:1
If you study these examples closely, you might note that the output of these two calls to insertMany hints that other operations besides simply inserts might be supported for bulk writes. While insertMany does not support operations other than insert, MongoDB does support a Bulk Write API that enables you to batch together a number of operations of different types in one call. While that is beyond the scope of this chapter, you can read about the Bulk Write API in the MongoDB documentation.
MongoDB does minimal checks on data being inserted: it checks the document’s basic structure and adds an "_id" field if one does not exist. One of the basic structure checks is size: all documents must be smaller than 16 MB. This is a somewhat arbitrary limit (and may be raised in the future); it is mostly intended to prevent bad schema design and ensure consistent performance. To see the Binary JSON (BSON) size, in bytes, of the document doc, run Object.bsonsize(doc) from the shell.
To give you an idea of how much data 16 MB is, the entire text of War and Peace is just 3.14 MB.
These minimal checks also mean that it is fairly easy to insert invalid data (if you are trying to). Thus, you should only allow trusted sources, such as your application servers, to connect to the database. All of the MongoDB drivers for major languages (and most of the minor ones, too) do check for a variety of invalid data (documents that are too large, contain non-UTF-8 strings, or use unrecognized types) before sending anything to the database.
In versions of MongoDB prior to 3.0, insert was the primary method for inserting documents into MongoDB. MongoDB drivers introduced a new CRUD API at the same time as the MongoDB 3.0 server release. As of MongoDB 3.2 the mongo shell also supports this API, which includes insertOne and insertMany as well as several other methods. The goal of the current CRUD API is to make the semantics of all CRUD operations consistent and clear across the drivers and the shell. While methods such as insert are still supported for backward compatibility, they should not be used in applications going forward. You should instead prefer insertOne and insertMany for creating documents.
Now that there’s data in our database, let’s delete it. The CRUD API provides deleteOne and deleteMany for this purpose. Both of these methods take a filter document as their first parameter. The filter specifies a set of criteria to match against in removing documents. To delete the document with an "_id" value of 4, we use deleteOne in the mongo shell as illustrated here:
> db.movies.find()
{ "_id" : 0, "title" : "Top Gun" }
{ "_id" : 1, "title" : "Back to the Future" }
{ "_id" : 3, "title" : "Sixteen Candles" }
{ "_id" : 4, "title" : "The Terminator" }
{ "_id" : 5, "title" : "Scarface" }
> db.movies.deleteOne({"_id" : 4})
{ "acknowledged" : true, "deletedCount" : 1 }
> db.movies.find()
{ "_id" : 0, "title" : "Top Gun" }
{ "_id" : 1, "title" : "Back to the Future" }
{ "_id" : 3, "title" : "Sixteen Candles" }
{ "_id" : 5, "title" : "Scarface" }
In this example, we used a filter that could only match one document since "_id" values are unique in a collection. However, we can also specify a filter that matches multiple documents in a collection. In this case, deleteOne will delete the first document found that matches the filter. Which document is found first depends on several factors, including the order in which the documents were inserted, what updates were made to the documents (for some storage engines), and what indexes are specified. As with any database operation, be sure you know what effect your use of deleteOne will have on your data.
To delete all the documents that match a filter, use deleteMany:
> db.movies.find()
{ "_id" : 0, "title" : "Top Gun", "year" : 1986 }
{ "_id" : 1, "title" : "Back to the Future", "year" : 1985 }
{ "_id" : 3, "title" : "Sixteen Candles", "year" : 1984 }
{ "_id" : 4, "title" : "The Terminator", "year" : 1984 }
{ "_id" : 5, "title" : "Scarface", "year" : 1983 }
> db.movies.deleteMany({"year" : 1984})
{ "acknowledged" : true, "deletedCount" : 2 }
> db.movies.find()
{ "_id" : 0, "title" : "Top Gun", "year" : 1986 }
{ "_id" : 1, "title" : "Back to the Future", "year" : 1985 }
{ "_id" : 5, "title" : "Scarface", "year" : 1983 }
As a more realistic use case, suppose you want to remove every user from the mailing.list collection where the value for "opt-out" is true:
> db.mailing.list.deleteMany({"opt-out" : true})
In versions of MongoDB prior to 3.0, remove was the primary method for deleting documents. MongoDB drivers introduced the deleteOne and deleteMany methods at the same time as the MongoDB 3.0 server release, and the shell began supporting these methods in MongoDB 3.2. While remove is still supported for backward compatibility, you should use deleteOne and deleteMany in your applications. The current CRUD API provides a cleaner set of semantics and, especially for multidocument operations, helps application developers avoid a couple of common pitfalls with the previous API.
It is possible to use deleteMany to remove all documents in a collection:
> db.movies.find()
{ "_id" : 0, "title" : "Top Gun", "year" : 1986 }
{ "_id" : 1, "title" : "Back to the Future", "year" : 1985 }
{ "_id" : 3, "title" : "Sixteen Candles", "year" : 1984 }
{ "_id" : 4, "title" : "The Terminator", "year" : 1984 }
{ "_id" : 5, "title" : "Scarface", "year" : 1983 }
> db.movies.deleteMany({})
{ "acknowledged" : true, "deletedCount" : 5 }
> db.movies.find()
Removing documents is usually a fairly quick operation. However, if you want to clear an entire collection, it is faster to drop it:
> db.movies.drop()
true
and then recreate any indexes on the empty collection.
Once data has been removed, it is gone forever. There is no way to undo a delete or drop operation or recover deleted documents, except, of course, by restoring a previously backed up version of the data. See Chapter 23 for a detailed discussion of MongoDB backup and restore.
Once a document is stored in the database, it can be changed using one of several update methods: updateOne, updateMany, and replaceOne. updateOne and updateMany each take a filter document as their first parameter and a modifier document, which describes changes to make, as the second parameter. replaceOne also takes a filter as the first parameter, but as the second parameter replaceOne expects a document with which it will replace the document matching the filter.
Updating a document is atomic: if two updates happen at the same time, whichever one reaches the server first will be applied, and then the next one will be applied. Thus, conflicting updates can safely be sent in rapid-fire succession without any documents being corrupted: the last update will “win.” The Document Versioning pattern (see “Schema Design Patterns”) is worth considering if you don’t want the default behavior.
replaceOne fully replaces a matching document with a new one. This can be useful to do a dramatic schema migration (see Chapter 9 for schema migration strategies). For example, suppose we are making major changes to a user document, which looks like the following:
{
    "_id" : ObjectId("4b2b9f67a1f631733d917a7a"),
    "name" : "joe",
    "friends" : 32,
    "enemies" : 2
}
We want to move the "friends" and "enemies" fields to a "relationships" subdocument. We can change the structure of the document in the shell and then replace the database’s version with a replaceOne:
> var joe = db.users.findOne({"name" : "joe"});
> joe.relationships = {"friends" : joe.friends, "enemies" : joe.enemies};
{
    "friends" : 32,
    "enemies" : 2
}
> joe.username = joe.name;
"joe"
> delete joe.friends;
true
> delete joe.enemies;
true
> delete joe.name;
true
> db.users.replaceOne({"name" : "joe"}, joe);
Now, doing a findOne shows that the structure of the document has been updated:
{
    "_id" : ObjectId("4b2b9f67a1f631733d917a7a"),
    "username" : "joe",
    "relationships" : {
        "friends" : 32,
        "enemies" : 2
    }
}
A common mistake is matching more than one document with the criteria and then creating a duplicate "_id" value with the second parameter. The database will throw an error for this, and no documents will be updated.
For example, suppose we create several documents with the same value for "name", but we don’t realize it:
> db.people.find()
{ "_id" : ObjectId("4b2b9f67a1f631733d917a7b"), "name" : "joe", "age" : 65 }
{ "_id" : ObjectId("4b2b9f67a1f631733d917a7c"), "name" : "joe", "age" : 20 }
{ "_id" : ObjectId("4b2b9f67a1f631733d917a7d"), "name" : "joe", "age" : 49 }
Now, if it’s Joe #2’s birthday, we want to increment the value of his "age" key, so we might say this:
> joe = db.people.findOne({"name" : "joe", "age" : 20});
{
    "_id" : ObjectId("4b2b9f67a1f631733d917a7c"),
    "name" : "joe",
    "age" : 20
}
> joe.age++;
> db.people.replaceOne({"name" : "joe"}, joe);
E11001 duplicate key on update
What happened? When you do the update, the database will look for a document matching {"name" : "joe"}. The first one it finds will be the 65-year-old Joe. It will attempt to replace that document with the one in the joe variable, but there’s already a document in this collection with the same "_id". Thus, the update will fail, because "_id" values must be unique. The best way to avoid this situation is to make sure that your update always specifies a unique document, perhaps by matching on a key like "_id". For the preceding example, this would be the correct update to use:
> db.people.replaceOne({"_id" : ObjectId("4b2b9f67a1f631733d917a7c")}, joe)
Using "_id" for the filter will also be efficient since "_id" values form the basis for the primary index of a collection. We’ll cover primary and secondary indexes and how indexing affects updates and other operations more in Chapter 5.
Usually only certain portions of a document need to be updated. You can update specific fields in a document using atomic update operators. Update operators are special keys that can be used to specify complex update operations, such as altering, adding, or removing keys, and even manipulating arrays and embedded documents.
Suppose we’re keeping website analytics in a collection and want to increment a counter each time someone visits a page. We can use update operators to do this increment atomically. Each URL and its number of page views is stored in a document that looks like this:
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "url" : "www.example.com",
    "pageviews" : 52
}
Every time someone visits a page, we can find the page by its URL and use the "$inc" modifier to increment the value of the "pageviews" key:
> db.analytics.updateOne({"url" : "www.example.com"},
... {"$inc" : {"pageviews" : 1}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
Now, if we do a findOne, we see that "pageviews" has increased by one:
> db.analytics.findOne()
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "url" : "www.example.com",
    "pageviews" : 53
}
When using operators, the value of "_id" cannot be changed. (Note that "_id" can be changed by using whole-document replacement.) Values for any other key, including other uniquely indexed keys, can be modified.
"$set" sets the value of a field. If the field does not yet exist, it will be created. This can be handy for updating schemas or adding user-defined keys. For example, suppose you have a simple user profile stored as a document that looks something like the following:
> db.users.findOne()
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "name" : "joe",
    "age" : 30,
    "sex" : "male",
    "location" : "Wisconsin"
}
This is a pretty bare-bones user profile. If the user wanted to store his favorite book in his profile, he could add it using "$set":
> db.users.updateOne({"_id" : ObjectId("4b253b067525f35f94b60a31")},
... {"$set" : {"favorite book" : "War and Peace"}})
Now the document will have a "favorite book" key:
> db.users.findOne()
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "name" : "joe",
    "age" : 30,
    "sex" : "male",
    "location" : "Wisconsin",
    "favorite book" : "War and Peace"
}
If the user decides that he actually enjoys a different book, "$set" can be used again to change the value:
> db.users.updateOne({"name" : "joe"},
... {"$set" : {"favorite book" : "Green Eggs and Ham"}})
"$set" can even change the type of the key it modifies. For instance, if our fickle user decides that he actually likes quite a few books, he can change the value of the "favorite book" key into an array:
> db.users.updateOne({"name" : "joe"},
... {"$set" : {"favorite book" :
...     ["Cat's Cradle", "Foundation Trilogy", "Ender's Game"]}})
If the user realizes that he actually doesn’t like reading, he can remove the key altogether with "$unset":
> db.users.updateOne({"name" : "joe"},
... {"$unset" : {"favorite book" : 1}})
Now the document will be the same as it was at the beginning of this example.
You can also use "$set" to reach in and change embedded documents:
> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "title" : "A Blog Post",
    "content" : "...",
    "author" : {
        "name" : "joe",
        "email" : "[email protected]"
    }
}
> db.blog.posts.updateOne({"author.name" : "joe"},
... {"$set" : {"author.name" : "joe schmoe"}})
> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "title" : "A Blog Post",
    "content" : "...",
    "author" : {
        "name" : "joe schmoe",
        "email" : "[email protected]"
    }
}
You must always use a $-modifier for adding, changing, or removing keys. A common error people make when starting out is to try to set the value of a key to some other value by doing an update that resembles this:
> db.blog.posts.updateOne({"author.name" : "joe"},
... {"author.name" : "joe schmoe"})
This will result in an error. The update document must contain update operators. Previous versions of the CRUD API did not catch this type of error. Earlier update methods would simply complete a whole document replacement in such situations. It is this type of pitfall that led to the creation of a new CRUD API.
The "$inc" operator can be used to change the value for an existing key or to create a new key if it does not already exist. It’s useful for updating analytics, karma, votes, or anything else that has a changeable, numeric value.
Suppose we are creating a game collection where we want to save games and update scores as they change. When a user starts playing, say, a game of pinball, we can insert a document that identifies the game by name and the user playing it:
> db.games.insertOne({"game" : "pinball", "user" : "joe"})
When the ball hits a bumper, the game should increment the player’s score. Since points in pinball are given out pretty freely, let’s say that the base unit of points a player can earn is 50. We can use the "$inc" modifier to add 50 to the player’s score:
> db.games.updateOne({"game" : "pinball", "user" : "joe"},
... {"$inc" : {"score" : 50}})
If we look at the document after this update, we’ll see the following:
> db.games.findOne()
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "game" : "pinball",
    "user" : "joe",
    "score" : 50
}
The "score" key did not already exist, so it was created by "$inc" and set to the increment amount: 50.
If the ball lands in a “bonus” slot, we want to add 10,000 to the score. We can do this by passing a different value to "$inc":
> db.games.updateOne({"game" : "pinball", "user" : "joe"},
... {"$inc" : {"score" : 10000}})
Now if we look at the game, we’ll see the following:
> db.games.findOne()
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "game" : "pinball",
    "user" : "joe",
    "score" : 10050
}
The "score" key existed and had a numeric value, so the server added 10,000 to it.
"$inc" is similar to "$set", but it is designed for incrementing (and decrementing) numbers. "$inc" can be used only on values of type integer, long, double, or decimal. If it is used on any other type of value, it will fail. This includes types that many languages will automatically cast into numbers, like nulls, booleans, or strings of numeric characters:
> db.strcounts.insert({"count" : "1"})
WriteResult({ "nInserted" : 1 })
> db.strcounts.update({}, {"$inc" : {"count" : 1}})
WriteResult({
    "nMatched" : 0,
    "nUpserted" : 0,
    "nModified" : 0,
    "writeError" : {
        "code" : 16837,
        "errmsg" : "Cannot apply $inc to a value of non-numeric type. {_id: ObjectId('5726c0d36855a935cb57a659')} has the field 'count' of non-numeric type String"
    }
})
Also, the value of the "$inc" key must be a number. You cannot increment by a string, array, or other nonnumeric value. Doing so will give a “Modifier "$inc" allowed for numbers only” error message. To modify other types, use "$set" or one of the following array operators.
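The "$inc" rules described above (create the key if it is missing, add to it if it is numeric, fail otherwise) can be summarized with a small in-memory model. The function applyInc is hypothetical and works on plain JavaScript objects with top-level keys only; it treats any JavaScript number as numeric, which glosses over BSON's separate integer, long, double, and decimal types.

```javascript
// Illustrative model of "$inc" on a top-level field (not the server's code).
function applyInc(doc, field, amount) {
  if (typeof amount !== "number") {
    throw new TypeError("the increment amount must be a number");
  }
  if (!(field in doc)) {
    // Missing key: create it and set it to the increment amount.
    doc[field] = amount;
    return doc;
  }
  if (typeof doc[field] !== "number") {
    // Existing non-numeric value (string, null, boolean, ...): refuse.
    throw new TypeError("cannot increment non-numeric field '" + field + "'");
  }
  doc[field] += amount;
  return doc;
}

const game = { game: "pinball", user: "joe" };
applyInc(game, "score", 50); // creates "score"
applyInc(game, "score", 10000); // adds to the existing value

let stringCountRejected = false;
try {
  applyInc({ count: "1" }, "count", 1);
} catch (err) {
  stringCountRejected = true; // numeric-looking strings are not cast
}
console.log(game.score, stringCountRejected);
```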
An extensive class of update operators exists for manipulating arrays. Arrays are common and powerful data structures: not only are they lists that can be referenced by index, but they can also double as sets.
"$push" adds elements to the end of an array if the array exists and creates a new array if it does not. For example, suppose that we are storing blog posts and want to add a "comments" key containing an array. We can push a comment onto the nonexistent "comments" array, which will create the array and add the comment:
> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "title" : "A blog post",
    "content" : "..."
}
> db.blog.posts.updateOne({"title" : "A blog post"},
... {"$push" : {"comments" :
...     {"name" : "joe", "email" : "[email protected]",
...     "content" : "nice post."}}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "title" : "A blog post",
    "content" : "...",
    "comments" : [
        {
            "name" : "joe",
            "email" : "[email protected]",
            "content" : "nice post."
        }
    ]
}
Now, if we want to add another comment, we can simply use "$push" again:
> db.blog.posts.updateOne({"title" : "A blog post"},
... {"$push" : {"comments" :
...     {"name" : "bob", "email" : "[email protected]",
...     "content" : "good post."}}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "title" : "A blog post",
    "content" : "...",
    "comments" : [
        {
            "name" : "joe",
            "email" : "[email protected]",
            "content" : "nice post."
        },
        {
            "name" : "bob",
            "email" : "[email protected]",
            "content" : "good post."
        }
    ]
}
This is the “simple” form of "$push", but you can use it for more complex array operations as well. The MongoDB query language provides modifiers for some operators, including "$push". You can push multiple values in one operation using the "$each" modifier for "$push":
> db.stock.ticker.updateOne({"_id" : "GOOG"},
... {"$push" : {"hourly" : {"$each" : [562.776, 562.790, 559.123]}}})
This would push three new elements onto the array.
If you want to cap the length of an array, you can use the "$slice" modifier with "$push" to prevent it from growing beyond a certain size, effectively making a “top N” list of items:
> db.movies.updateOne({"genre" : "horror"},
... {"$push" : {"top10" : {"$each" : ["Nightmare on Elm Street", "Saw"],
...                        "$slice" : -10}}})
This example limits the array to the last 10 elements pushed. If the array is smaller than 10 elements (after the push), all elements will be kept. If the array is larger than 10 elements, only the last 10 elements will be kept. Thus, "$slice" can be used to create a queue in a document.
Finally, you can apply the "$sort" modifier to "$push" operations before trimming:
> db.movies.updateOne({"genre" : "horror"},
... {"$push" : {"top10" : {"$each" : [{"name" : "Nightmare on Elm Street",
...                                     "rating" : 6.6},
...                                    {"name" : "Saw", "rating" : 4.3}],
...                        "$slice" : -10,
...                        "$sort" : {"rating" : -1}}}})
This will sort all of the objects in the array by their "rating" field and then trim the result to 10 elements. Because "$slice" is -10 here, the last 10 elements of the sorted array are kept; with a descending sort, a positive "$slice" value such as 10 would keep the highest-rated entries instead. Note that you must include "$each"; you cannot just "$slice" or "$sort" an array with "$push".
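To make the interaction of these modifiers concrete, here is a small in-memory model of a "$push" with "$each", "$sort", and "$slice". It is a sketch under stated assumptions rather than the server's algorithm: pushWithModifiers is a hypothetical name, the sort handles a single numeric field, and the steps run in the order append, sort, trim regardless of how the modifiers are written in the update document.

```javascript
// Illustrative model of "$push" with "$each", "$sort", and "$slice".
function pushWithModifiers(array, { each, sort, slice }) {
  // 1. Append the new elements (concat returns a fresh array).
  let result = array.concat(each);
  // 2. Sort by one numeric field; direction is 1 (ascending) or -1 (descending).
  if (sort) {
    const [key, direction] = Object.entries(sort)[0];
    result.sort((a, b) => (a[key] - b[key]) * direction);
  }
  // 3. Trim: a negative slice keeps the last N, a non-negative one the first N.
  if (slice !== undefined) {
    result = slice < 0 ? result.slice(slice) : result.slice(0, slice);
  }
  return result;
}

const current = [
  { name: "A", rating: 5 },
  { name: "B", rating: 9 },
];
const added = [
  { name: "C", rating: 7 },
  { name: "D", rating: 3 },
];
const lastThree = pushWithModifiers(current, { each: added, sort: { rating: -1 }, slice: -3 });
const firstThree = pushWithModifiers(current, { each: added, sort: { rating: -1 }, slice: 3 });
console.log(lastThree.map((m) => m.name), firstThree.map((m) => m.name));
```

With a descending sort, the negative slice drops the highest-rated entry while the positive slice drops the lowest-rated one, which is why the sign of "$slice" matters when building a “top N” list.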
You might want to treat an array as a set, only adding values if they are not present. This can be done using "$ne" in the query document. For example, to push an author onto a list of citations, but only if they aren’t already there, use the following:
> db.papers.updateOne({"authors cited" : {"$ne" : "Richie"}},
... {$push : {"authors cited" : "Richie"}})
This can also be done with "$addToSet", which is useful for cases where "$ne" won’t work or where "$addToSet" describes what is happening better.
For example, suppose you have a document that represents a user. You might have a set of email addresses that they have added:
> db.users.findOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")})
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "username" : "joe",
    "emails" : [
        "[email protected]",
        "[email protected]",
        "[email protected]"
    ]
}
When adding another address, you can use "$addToSet" to prevent duplicates:
> db.users.updateOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")},
... {"$addToSet" : {"emails" : "[email protected]"}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 0 }
> db.users.findOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")})
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "username" : "joe",
    "emails" : [
        "[email protected]",
        "[email protected]",
        "[email protected]"
    ]
}
> db.users.updateOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")},
... {"$addToSet" : {"emails" : "[email protected]"}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
> db.users.findOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")})
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "username" : "joe",
    "emails" : [
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]"
    ]
}
You can also use "$addToSet" in conjunction with "$each" to add multiple unique values, which cannot be done with the "$ne"/"$push" combination. For instance, you could use these operators if the user wanted to add more than one email address:
> db.users.updateOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")},
... {"$addToSet" : {"emails" : {"$each" :
...     ["[email protected]", "[email protected]", "[email protected]"]}}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
> db.users.findOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")})
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "username" : "joe",
    "emails" : [
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]"
    ]
}
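The set-like behavior of "$addToSet" with "$each" can be modeled in a few lines. This sketch assumes scalar values compared with strict equality, whereas the server compares BSON values (including embedded documents); the function name addToSetEach and the sample tags are hypothetical.

```javascript
// Illustrative model of "$addToSet" combined with "$each" for scalar values.
function addToSetEach(array, values) {
  const result = array.slice(); // leave the input untouched
  for (const value of values) {
    // Append only values that are not already present.
    if (!result.includes(value)) result.push(value);
  }
  return result;
}

const tags = ["mongodb", "updates"];
const merged = addToSetEach(tags, ["updates", "arrays", "arrays"]);
console.log(merged);
```

Existing values and repeats within the new values are both skipped, and the original order of the array is preserved.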
There are a few ways to remove elements from an array. If you want to treat the array like a queue or a stack, you can use "$pop", which can remove elements from either end. {"$pop" : {"key" : 1}} removes an element from the end of the array. {"$pop" : {"key" : -1}} removes it from the beginning.
Sometimes an element should be removed based on specific criteria, rather than its position in the array. "$pull" is used to remove elements of an array that match the given criteria. For example, suppose we have a list of things that need to be done, but not in any specific order:
> db.lists.insertOne({"todo" : ["dishes", "laundry", "dry cleaning"]})
If we do the laundry first, we can remove it from the list with the following:
> db.lists.updateOne({}, {"$pull" : {"todo" : "laundry"}})
Now if we do a find, we’ll see that there are only two elements remaining in the array:
> db.lists.findOne()
{
    "_id" : ObjectId("4b2d75476cc613d5ee930164"),
    "todo" : [
        "dishes",
        "dry cleaning"
    ]
}
Pulling removes all matching elements, not just a single match. If you have an array that looks like [1, 1, 2, 1] and pull 1, you’ll end up with a single-element array, [2].
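The difference between removing by position ("$pop") and removing by value ("$pull") can be sketched on plain JavaScript arrays. The helper names pop and pull are hypothetical, and pull only handles exact scalar matches, whereas the real operator also accepts query conditions.

```javascript
// Illustrative models of "$pop" and "$pull" on plain arrays.
function pop(array, direction) {
  // direction 1 removes the last element, -1 removes the first.
  return direction === -1 ? array.slice(1) : array.slice(0, -1);
}

function pull(array, value) {
  // Every element equal to value is removed, not just the first one.
  return array.filter((element) => element !== value);
}

const afterPull = pull([1, 1, 2, 1], 1);
const afterPopEnd = pop(["dishes", "laundry", "dry cleaning"], 1);
const afterPopStart = pop(["dishes", "laundry", "dry cleaning"], -1);
console.log(afterPull, afterPopEnd, afterPopStart);
```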
Array operators can be used only on keys with array values. For example, you cannot push onto an integer or pop off of a string. Use "$set" or "$inc" to modify scalar values.
Array manipulation becomes a little trickier when you have multiple values in an array and want to modify some of them. There are two ways to manipulate values in arrays: by position or by using the positional operator (the $ character).
Arrays use 0-based indexing, and elements can be selected as though their index were a document key. For example, suppose we have a document containing an array with a few embedded documents, such as a blog post with comments:
> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b329a216cc613d5ee930192"),
    "content" : "...",
    "comments" : [
        {
            "comment" : "good post",
            "author" : "John",
            "votes" : 0
        },
        {
            "comment" : "i thought it was too short",
            "author" : "Claire",
            "votes" : 3
        },
        {
            "comment" : "free watches",
            "author" : "Alice",
            "votes" : -5
        },
        {
            "comment" : "vacation getaways",
            "author" : "Lynn",
            "votes" : -7
        }
    ]
}
If we want to increment the number of votes for the first comment, we can say the following:
> db.blog.updateOne({"post" : post_id},
... {"$inc" : {"comments.0.votes" : 1}})
In many cases, though, we don’t know what index of the array to modify without querying for the document first and examining it. To get around this, MongoDB has a positional operator, $, that figures out which element of the array the query document matched and updates that element. For example, if we have a user named John who updates his name to Jim, we can replace it in the comments by using the positional operator:
> db.blog.updateOne({"comments.author" : "John"},
... {"$set" : {"comments.$.author" : "Jim"}})
The positional operator updates only the first match. Thus, if John had left more than one comment, his name would be changed only for the first comment he left.
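The first-match-only behavior can be illustrated with a plain JavaScript model. This is an assumption-laden sketch, not how the server resolves the $ placeholder: renameFirstMatch is a hypothetical helper that finds the first array element whose field equals the old value and changes only that element.

```javascript
// Illustrative model of the positional operator: only the first match changes.
function renameFirstMatch(elements, field, oldValue, newValue) {
  const index = elements.findIndex((element) => element[field] === oldValue);
  if (index === -1) return elements; // nothing matched, nothing to update
  return elements.map((element, i) =>
    i === index ? { ...element, [field]: newValue } : element
  );
}

const sampleComments = [
  { comment: "good post", author: "John" },
  { comment: "i thought it was too short", author: "Claire" },
  { comment: "me again", author: "John" },
];
const renamed = renameFirstMatch(sampleComments, "author", "John", "Jim");
console.log(renamed.map((c) => c.author));
```

The second comment by John keeps its original author, mirroring the limitation described above.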
MongoDB 3.6 introduced another option for updating individual array elements: arrayFilters. This option enables us to modify array elements matching particular criteria. For example, if we want to hide all comments with five or more down votes, we can do something like the following:
db.blog.posts.updateOne(
    {"post" : post_id},
    { $set: { "comments.$[elem].hidden" : true } },
    {
        arrayFilters: [ { "elem.votes": { $lte: -5 } } ]
    }
)
This command defines elem as the identifier for each matching element in the "comments" array. If the votes value for the comment identified by elem is less than or equal to -5, we will add a field called "hidden" to that comment’s embedded document and set its value to true.
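The difference between the positional operator and arrayFilters can be sketched in plain JavaScript. The snippet below is an analogy that runs without a database, not MongoDB code: the first step changes only the first matching element, as "comments.$.author" would, and the second changes every element that passes the filter, as "comments.$[elem].hidden" would. The second comment by John is a hypothetical addition to the sample data.

```javascript
const comments = [
  { author: "John", votes: 0 },
  { author: "Claire", votes: 3 },
  { author: "Alice", votes: -5 },
  { author: "Lynn", votes: -7 },
  { author: "John", votes: 2 }, // hypothetical second comment by John
];

// Analogous to "comments.$.author": only the first match is updated.
const firstMatch = comments.findIndex((c) => c.author === "John");
if (firstMatch !== -1) {
  comments[firstMatch].author = "Jim";
}

// Analogous to "comments.$[elem].hidden" with the array filter
// {"elem.votes": {$lte: -5}}: every element passing the filter is updated.
for (const elem of comments) {
  if (elem.votes <= -5) {
    elem.hidden = true;
  }
}

console.log(comments.map((c) => c.author).join(", "));
console.log(comments.filter((c) => c.hidden).length);
```

After running this, the last comment still carries the author "John", while both heavily downvoted comments are marked hidden.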
An upsert is a special type of update. If no document is found that matches the filter, a new document will be created by combining the criteria and update documents. If a matching document is found, it will be updated normally. Upserts can be handy because they can eliminate the need to “seed” your collection: you can often have the same code create and update documents.
Let’s go back to our example that records the number of views for each page of a website. Without an upsert, we might try to find the URL and increment the number of views or create a new document if the URL doesn’t exist. If we were to write this out as a JavaScript program it might look something like the following:
// check if we have an entry for this page
blog = db.analytics.findOne({url : "/blog"})

// if we do, add one to the number of views and save
if (blog) {
    blog.pageviews++;
    db.analytics.save(blog);
}
// otherwise, create a new document for this page
else {
    db.analytics.insertOne({url : "/blog", pageviews : 1})
}
This means we are making a round trip to the database, plus sending an update or insert, every time someone visits a page. If we are running this code in multiple processes, we are also subject to a race condition where more than one document can be inserted for a given URL.
We can eliminate the race condition and cut down the amount of code by just sending an upsert to the database (the third parameter to updateOne and updateMany is an options document that enables us to specify this):
> db.analytics.updateOne({"url" : "/blog"}, {"$inc" : {"pageviews" : 1}},
... {"upsert" : true})
This line does exactly what the previous code block does, except it’s faster and atomic! The new document is created by using the criteria document as a base and applying any modifier documents to it.
For example, if you do an upsert that matches on a key and increments the value of that key, the increment is applied to the value taken from the match criteria:
> db.users.updateOne({"rep" : 25}, {"$inc" : {"rep" : 3}}, {"upsert" : true})
{
    "acknowledged" : true,
    "matchedCount" : 0,
    "modifiedCount" : 0,
    "upsertedId" : ObjectId("5a93b07aaea1cb8780a4cf72")
}
> db.users.findOne({"_id" : ObjectId("5a93b07aaea1cb8780a4cf72")})
{ "_id" : ObjectId("5a93b07aaea1cb8780a4cf72"), "rep" : 28 }
The upsert creates a new document with a "rep" of 25 and then increments that by 3, giving us a document where "rep" is 28. If the upsert option were not specified, {"rep" : 25} would not match any documents, so nothing would happen.
If we run the upsert again (with the criterion {"rep" : 25}), it will create another new document. This is because the criterion does not match the only document in the collection. (Its "rep" is 28.)
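The upsert behavior described here can be modeled with a small in-memory sketch. It is an illustration under simplifying assumptions, not MongoDB's implementation: the collection is a plain array, the hypothetical helper upsertInc supports only equality filters and an "$inc"-style increment, and no "_id" is generated.

```javascript
// Hypothetical in-memory model of an upsert that applies an increment.
function upsertInc(collection, filter, inc) {
  const matches = (doc) =>
    Object.entries(filter).every(([key, value]) => doc[key] === value);
  let doc = collection.find(matches); // Array.prototype.find, not a query
  const inserted = doc === undefined;
  if (inserted) {
    doc = { ...filter }; // seed the new document from the criteria
    collection.push(doc);
  }
  // Apply the increment to the matched or newly seeded document.
  for (const [key, amount] of Object.entries(inc)) {
    doc[key] = (doc[key] || 0) + amount;
  }
  return { inserted, doc };
}

// Page-view counter: the same call creates the document, then updates it.
const analytics = [];
upsertInc(analytics, { url: "/blog" }, { pageviews: 1 });
upsertInc(analytics, { url: "/blog" }, { pageviews: 1 });
console.log(analytics.length, analytics[0].pageviews);

// Filtering on the incremented key: the second call no longer matches,
// because the first document now has rep 28, so another one is created.
const users = [];
upsertInc(users, { rep: 25 }, { rep: 3 });
upsertInc(users, { rep: 25 }, { rep: 3 });
console.log(users.length, users[0].rep, users[1].rep);
```

The page-view case ends with one document, while the "rep" case ends with two, which mirrors the behavior described above.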
Sometimes a field needs to be set when a document is created, but not changed on subsequent updates. This is what "$setOnInsert" is for. "$setOnInsert" is an operator that only sets the value of a field when the document is being inserted. Thus, we could do something like this:
> db.users.updateOne({}, {"$setOnInsert" : {"createdAt" : new Date()}},
... {"upsert" : true})
{
    "acknowledged" : true,
    "matchedCount" : 0,
    "modifiedCount" : 0,
    "upsertedId" : ObjectId("5727b4ac223502483c7f3ace")
}
> db.users.findOne()
{
    "_id" : ObjectId("5727b4ac223502483c7f3ace"),
    "createdAt" : ISODate("2016-05-02T20:12:28.640Z")
}
If we run this update again, it will match the existing document, nothing will be inserted, and so the "createdAt" field will not be changed:
> db.users.updateOne({}, {"$setOnInsert" : {"createdAt" : new Date()}},
... {"upsert" : true})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 0 }
> db.users.findOne()
{
    "_id" : ObjectId("5727b4ac223502483c7f3ace"),
    "createdAt" : ISODate("2016-05-02T20:12:28.640Z")
}
Note that you generally do not need to keep a "createdAt" field, as ObjectIds contain a timestamp of when the document was created. However, "$setOnInsert" can be useful for creating padding, initializing counters, and for collections that do not use ObjectIds.
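To see why a separate "createdAt" field is often redundant, the sketch below decodes the timestamp embedded in an ObjectId. The first four bytes (eight hex characters) hold the creation time in seconds since the Unix epoch. The helper name objectIdTimestamp is hypothetical; in the shell, calling getTimestamp() on an ObjectId gives the same information.

```javascript
// Hypothetical helper: reads the creation time from an ObjectId hex string.
function objectIdTimestamp(hex) {
  // The leading 8 hex characters are seconds since the Unix epoch.
  const seconds = parseInt(hex.substring(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdTimestamp("5727b4ac223502483c7f3ace");
console.log(created.toISOString()); // 2016-05-02T20:12:28.000Z
```

The decoded time matches the "createdAt" value stored in the example document to the second; the ObjectId does not carry the millisecond part.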
save is a shell function that lets you insert a document if it doesn’t exist and update it if it does. It takes one argument: a document. If the document contains an "_id" key, save will do an upsert. Otherwise, it will do an insert. save is really just a convenience function so that programmers can quickly modify documents in the shell:
> var x = db.testcol.findOne()
> x.num = 42
42
> db.testcol.save(x)
Without save, the last line would have been more cumbersome:
db.testcol.replaceOne({"_id" : x._id}, x)
So far in this chapter we have used updateOne to illustrate update operations. updateOne updates only the first document found that matches the filter criteria. If there are more matching documents, they will remain unchanged. To modify all of the documents matching a filter, use updateMany. updateMany follows the same semantics as updateOne and takes the same parameters. The key difference is in the number of documents that might be changed.
updateMany provides a powerful tool for performing schema migrations or rolling out new features to certain users. Suppose, for example, we want to give a gift to every user who has a birthday on a certain day. We can use updateMany to add a "gift" to their accounts. For example:
> db.users.insertMany([
... {birthday : "10/13/1978"},
... {birthday : "10/13/1978"},
... {birthday : "10/13/1978"}])
{
    "acknowledged" : true,
    "insertedIds" : [
        ObjectId("5727d6fc6855a935cb57a65b"),
        ObjectId("5727d6fc6855a935cb57a65c"),
        ObjectId("5727d6fc6855a935cb57a65d")
    ]
}
> db.users.updateMany({"birthday" : "10/13/1978"},
... {"$set" : {"gift" : "Happy Birthday!"}})
{ "acknowledged" : true, "matchedCount" : 3, "modifiedCount" : 3 }
The call to updateMany adds a "gift" field to each of the three documents we inserted into the users collection immediately before.
For some use cases it is important to return the document modified. In earlier versions of MongoDB, findAndModify was the method of choice in such situations. It is handy for manipulating queues and performing other operations that need get-and-set-style atomicity. However, findAndModify is prone to user error because it’s a complex method combining the functionality of three different types of operations: delete, replace, and update (including upserts).
MongoDB 3.2 introduced three new collection methods to the shell to accommodate the functionality of findAndModify, but with semantics that are easier to learn and remember: findOneAndDelete, findOneAndReplace, and findOneAndUpdate. The primary difference between these methods and, for example, updateOne is that they enable you to atomically get the value of a modified document.
MongoDB 4.2 extended findOneAndUpdate to accept an aggregation pipeline for the update. The pipeline can consist of the following stages: $addFields and its alias $set, $project and its alias $unset, and $replaceRoot and its alias $replaceWith.
Suppose we have a collection of processes run in a certain order. Each is represented with a document that has the following form:
{ "_id" : ObjectId(), "status" : "state", "priority" : N }
"status" is a string that can be "READY", "RUNNING", or "DONE". We need to find the job with the highest priority in the "READY" state, run the process function, and then update the status to "DONE". We might try querying for the ready processes, sorting by priority, and updating the status of the highest-priority process to mark it as "RUNNING". Once we have processed it, we update the status to "DONE". This looks something like the following:
var cursor = db.processes.find({"status" : "READY"});
ps = cursor.sort({"priority" : -1}).limit(1).next();
db.processes.updateOne({"_id" : ps._id}, {"$set" : {"status" : "RUNNING"}});
do_something(ps);
db.processes.updateOne({"_id" : ps._id}, {"$set" : {"status" : "DONE"}});
This algorithm isn’t great because it is subject to a race condition. Suppose we have two threads running. If one thread (call it A) retrieved the document and another thread (call it B) retrieved the same document before A had updated its status to "RUNNING", then both threads would be running the same process. We can avoid this by checking the result as part of the update query, but this becomes complex:
var cursor = db.processes.find({"status" : "READY"});
cursor.sort({"priority" : -1}).limit(1);
while ((ps = cursor.next()) != null) {
    var result = db.processes.updateOne({"_id" : ps._id, "status" : "READY"},
                                        {"$set" : {"status" : "RUNNING"}});

    if (result.modifiedCount === 1) {
        do_something(ps);
        db.processes.updateOne({"_id" : ps._id}, {"$set" : {"status" : "DONE"}});
        break;
    }
    cursor = db.processes.find({"status" : "READY"});
    cursor.sort({"priority" : -1}).limit(1);
}
Also, depending on timing, one thread may end up doing all the work while another thread uselessly trails it. Thread A could always grab the process, and then B would try to get the same process, fail, and leave A to do all the work.
Situations like this are perfect for findOneAndUpdate. findOneAndUpdate can return the item and update it in a single operation. In this case, it looks like the following:
> db.processes.findOneAndUpdate({"status" : "READY"},
... {"$set" : {"status" : "RUNNING"}},
... {"sort" : {"priority" : -1}})
{
    "_id" : ObjectId("4b3e7a18005cab32be6291f7"),
    "priority" : 1,
    "status" : "READY"
}
Notice that the status is still "READY" in the returned document because the findOneAndUpdate method defaults to returning the state of the document before it was modified. It will return the updated document if we set the "returnNewDocument" field in the options document to true. An options document is passed as the third parameter to findOneAndUpdate:
> db.processes.findOneAndUpdate({"status" : "READY"},
... {"$set" : {"status" : "RUNNING"}},
... {"sort" : {"priority" : -1},
...  "returnNewDocument" : true})
{
    "_id" : ObjectId("4b3e7a18005cab32be6291f7"),
    "priority" : 1,
    "status" : "RUNNING"
}
Thus, the program becomes the following:
ps = db.processes.findOneAndUpdate({"status" : "READY"},
                                   {"$set" : {"status" : "RUNNING"}},
                                   {"sort" : {"priority" : -1},
                                    "returnNewDocument" : true})
do_something(ps)
db.processes.updateOne({"_id" : ps._id}, {"$set" : {"status" : "DONE"}})
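The claim-a-job pattern can also be modeled without a database. The sketch below is a plain JavaScript analogy, not MongoDB code: the array stands in for the processes collection, and the hypothetical function claimNext selects the highest-priority "READY" job and marks it "RUNNING" in one step, returning the document as it looked before or after the change. JavaScript runs this function without interruption, which loosely stands in for the atomicity that findOneAndUpdate provides on the server.

```javascript
// Hypothetical in-memory stand-in for the processes collection.
const processes = [
  { _id: 1, priority: 1, status: "READY" },
  { _id: 2, priority: 5, status: "READY" },
  { _id: 3, priority: 3, status: "DONE" },
];

// Select the highest-priority READY job and mark it RUNNING in one step.
// Returns a copy from before the change unless returnNewDocument is true,
// and null when no job is READY.
function claimNext(jobs, returnNewDocument = false) {
  const ready = jobs
    .filter((job) => job.status === "READY")
    .sort((a, b) => b.priority - a.priority);
  if (ready.length === 0) return null;
  const job = ready[0];
  const before = { ...job };
  job.status = "RUNNING";
  return returnNewDocument ? { ...job } : before;
}

const first = claimNext(processes);        // job 2, shown as it was: READY
const second = claimNext(processes, true); // job 1, shown after: RUNNING
const third = claimNext(processes);        // null, nothing left to claim

console.log(first, second, third);
```

Because selection and update happen together, two callers never receive the same job: the second call skips job 2, which is already "RUNNING".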
In addition to this one, there are two other methods you should be aware of. findOneAndReplace takes the same parameters and returns the document matching the filter either before or after the replacement, depending on the value of returnNewDocument. findOneAndDelete is similar except it does not take an update document as a parameter and has a subset of the options of the other two methods. findOneAndDelete returns the deleted document.