This chapter covers how to scale with MongoDB. We’ll look at:
What sharding is and the components of a cluster
How to configure sharding
The basics of how sharding interacts with your application
Sharding refers to the process of splitting data up across machines; the term partitioning is also sometimes used to describe this concept. By putting a subset of data on each machine, it becomes possible to store more data and handle more load without requiring larger or more powerful machines—just a larger quantity of less-powerful machines. Sharding may be used for other purposes as well, including placing more frequently accessed data on more performant hardware or splitting a dataset based on geography to locate a subset of documents in a collection (e.g., for users based in a particular locale) close to the application servers from which they are most commonly accessed.
Manual sharding can be done with almost any database software. With this approach, an application maintains connections to several different database servers, each of which is completely independent. The application manages storing different data on different servers and querying against the appropriate server to get data back. This setup can work well but becomes difficult to maintain when adding or removing nodes from the cluster or in the face of changing data distributions or load patterns.
MongoDB supports autosharding, which tries to both abstract the architecture away from the application and simplify the administration of such a system. MongoDB allows your application to ignore the fact that it isn’t talking to a standalone MongoDB server, to some extent. On the operations side, MongoDB automates balancing data across shards and makes it easier to add and remove capacity.
Sharding is the most complex way of configuring MongoDB, both from a development and an operational point of view. There are many components to configure and monitor, and data moves around the cluster automatically. You should be comfortable with standalone servers and replica sets before attempting to deploy or use a sharded cluster. Also, as with replica sets, the recommended means of configuring and deploying sharded clusters is through either MongoDB Ops Manager or MongoDB Atlas. Ops Manager is recommended if you need to maintain control of your computing infrastructure. MongoDB Atlas is recommended if you can leave the infrastructure management to MongoDB (you have the option of running in Amazon AWS, Microsoft Azure, or Google Cloud Platform).
MongoDB’s sharding allows you to create a cluster of many machines (shards) and break up a collection across them, putting a subset of data on each shard. This allows your application to grow beyond the resource limits of a standalone server or replica set.
Many people are confused about the difference between replication and sharding. Remember that replication creates an exact copy of your data on multiple servers, so every server is a mirror image of every other server. Conversely, every shard contains a different subset of data.
One of the goals of sharding is to make a cluster of 2, 3, 10, or even hundreds of shards look like a single machine to your application. To hide these details from the application, we run one or more routing processes called a mongos in front of the shards. A mongos keeps a “table of contents” that tells it which shard contains which data. Applications can connect to this router and issue requests normally, as shown in Figure 14-1. The router, knowing what data is on which shard, is able to forward the requests to the appropriate shard(s). If there are responses to a request the router collects them and, if necessary, merges them, and sends them back to the application. As far as the application knows, it’s connected to a standalone mongod, as illustrated in Figure 14-2.
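To make the router's "table of contents" concrete, here is a minimal, purely illustrative sketch (not MongoDB internals): a routing table maps shard key ranges to shards, and a lookup function forwards each key to the shard that owns it. The chunk bounds and shard names are invented for the example.

```javascript
// Illustrative sketch only: a router keeps a "table of contents" mapping
// shard key ranges (chunks) to shards and forwards each request to the
// shard that owns the key. Bounds and names here are made up.
const routingTable = [
  { min: "-inf", max: "defcon", shard: "shard0" },
  { min: "defcon", max: "n", shard: "shard1" },
  { min: "n", max: "+inf", shard: "shard2" },
];

function shardFor(key) {
  // A chunk covers [min, max); "-inf"/"+inf" stand in for MinKey/MaxKey.
  for (const chunk of routingTable) {
    const aboveMin = chunk.min === "-inf" || key >= chunk.min;
    const belowMax = chunk.max === "+inf" || key < chunk.max;
    if (aboveMin && belowMax) return chunk.shard;
  }
  throw new Error("no chunk owns key: " + key);
}

console.log(shardFor("alice"));   // → shard0
console.log(shardFor("gandalf")); // → shard1
```

The real routing table lives on the config servers and is cached by each mongos, but the lookup idea is the same: find the one chunk whose range contains the key.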
We’ll start by setting up a quick cluster on a single machine. First, start a mongo shell with the --nodb and --norc options:

$ mongo --nodb --norc

To create a cluster, use the ShardingTest class. Run the following in the mongo shell you just launched:
st = ShardingTest({
    name: "one-min-shards",
    chunkSize: 1,
    shards: 2,
    rs: {
        nodes: 3,
        oplogSize: 10
    },
    other: {
        enableBalancer: true
    }
});
The chunkSize option is covered in Chapter 17. For now, simply set it to 1. As for the other options passed to ShardingTest here, name simply provides a label for our sharded cluster, shards specifies that our cluster will be composed of two shards (we do this to keep the resource requirements low for this example), and rs defines each shard as a three-node replica set with an oplogSize of 10 MiB (again, to keep resource utilization low). Though it is possible to run one standalone mongod for each shard, it paints a clearer picture of the typical architecture of a sharded cluster if we create each shard as a replica set. In the last option specified, we are instructing ShardingTest to enable the balancer once the cluster is spun up. This will ensure that data is evenly distributed across both shards.
ShardingTest is a class designed for internal use by MongoDB Engineering and is therefore undocumented externally. However, because it ships with the MongoDB server, it provides the most straightforward means of experimenting with a sharded cluster. ShardingTest was originally designed to support server test suites and is still used for this purpose. By default it provides a number of conveniences that help keep resource utilization as small as possible and that set up the relatively complex architecture of a sharded cluster. It assumes that a /data/db directory exists on your machine; if ShardingTest fails to run, create this directory and rerun the command.
When you run this command, ShardingTest will do a lot for you automatically. It will create a new cluster with two shards, each of which is a replica set. It will configure the replica sets and launch each node with the necessary options to establish replication protocols. It will launch a mongos to manage requests across the shards so that clients can interact with the cluster as if communicating with a standalone mongod, to some extent. Finally, it will launch an additional replica set for the config servers that maintain the routing table information necessary to ensure queries are directed to the correct shard. Remember that the primary use cases for sharding are to split a dataset to address hardware and cost constraints or to provide better performance to applications (e.g., geographical partitioning). MongoDB sharding provides these capabilities in a way that is seamless to the application in many respects.
Once ShardingTest has finished setting up your cluster, you will have 10 processes up and running to which you can connect: two replica sets of three nodes each, one config server replica set of three nodes, and one mongos. By default, these processes should begin at port 20000. The mongos should be running at port 20009. Other processes you have running on your local machine and previous calls to ShardingTest can have an effect on which ports ShardingTest uses, but you should not have too much difficulty determining the ports on which your cluster processes are running.
Next, you’ll connect to the mongos to play around with the cluster. Your entire cluster will be dumping its logs to your current shell, so open up a second terminal window and launch another mongo shell:
$ mongo --nodb
Use this shell to connect to your cluster’s mongos. Again, your mongos should be running on port 20009:
> db = (new Mongo("localhost:20009")).getDB("accounts")
Note that the prompt in your mongo shell should change to reflect that you are connected to a mongos. Now you are in the situation shown earlier, in Figure 14-1: the shell is the client and is connected to a mongos. You can start passing requests to the mongos and it’ll route them to the shards. You don’t really have to know anything about the shards, like how many there are or what their addresses are. So long as there are some shards out there, you can pass the requests to the mongos and allow it to forward them appropriately.
Start by inserting some data:
> for (var i = 0; i < 100000; i++) {
...     db.users.insert({"username" : "user"+i,
...                      "created_at" : new Date()});
... }
> db.users.count()
100000
As you can see, interacting with mongos works the same way as interacting with a standalone server does.
You can get an overall view of your cluster by running sh.status(). It will give you a summary of your shards, databases, and collections:
> sh.status()
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "minCompatibleVersion" : 5,
      "currentVersion" : 6,
      "clusterId" : ObjectId("5a4f93d6bcde690005986071")
  }
  shards:
      {  "_id" : "one-min-shards-rs0",
         "host" : "one-min-shards-rs0/MBP:20000,MBP:20001,MBP:20002",
         "state" : 1 }
      {  "_id" : "one-min-shards-rs1",
         "host" : "one-min-shards-rs1/MBP:20003,MBP:20004,MBP:20005",
         "state" : 1 }
  active mongoses:
      "3.6.1" : 1
  autosplit:
      Currently enabled: no
  balancer:
      Currently enabled: no
      Currently running: no
      Failed balancer rounds in last 5 attempts: 0
      Migration Results for the last 24 hours:
          No recent migrations
  databases:
      {  "_id" : "accounts",  "primary" : "one-min-shards-rs1",
         "partitioned" : false }
      {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
          config.system.sessions
              shard key: { "_id" : 1 }
              unique: false
              balancing: true
              chunks:
                  one-min-shards-rs0  1
              { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } }
                  on : one-min-shards-rs0 Timestamp(1, 0)
sh is similar to rs, but for sharding: it is a global variable that defines a number of sharding helper functions, which you can see by running sh.help(). As you can see from the sh.status() output, you have two shards and two databases (config is created automatically).
Your accounts database may have a different primary shard than the one shown here. A primary shard is a “home base” shard that is randomly chosen for each database. All of your data will be on this primary shard. MongoDB cannot automatically distribute your data yet because it doesn’t know how (or if) you want it to be distributed. You have to tell it, per collection, how you want it to distribute data.
A primary shard is different from a replica set primary. A primary shard refers to the entire replica set composing a shard. A primary in a replica set is the single server in the set that can take writes.
To shard a particular collection, first enable sharding on the collection’s database. To do so, run the enableSharding command:
> sh.enableSharding("accounts")
Now sharding is enabled on the accounts database, which allows you to shard collections within the database.
When you shard a collection, you choose a shard key. This is a field or two that MongoDB uses to break up data. For example, if you chose to shard on "username", MongoDB would break up the data into ranges of usernames: "a1-steak-sauce" through "defcon", "defcon1" through "howie1998", and so on. Choosing a shard key can be thought of as choosing an ordering for the data in the collection. This is a similar concept to indexing, and for good reason: the shard key becomes the most important index on your collection as it gets bigger. To even create a shard key, the field(s) must be indexed.
So, before enabling sharding, you have to create an index on the key you want to shard by:
> db.users.createIndex({"username" : 1})
Now you can shard the collection by "username":
> sh.shardCollection("accounts.users", {"username" : 1})
Although we are choosing a shard key without much thought here, it is an important decision that should be carefully considered in a real system. See Chapter 16 for more advice on choosing a shard key.
If you wait a few minutes and run sh.status() again, you’ll see that there’s a lot more information displayed than there was before:
> sh.status()
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "minCompatibleVersion" : 5,
      "currentVersion" : 6,
      "clusterId" : ObjectId("5a4f93d6bcde690005986071")
  }
  shards:
      {  "_id" : "one-min-shards-rs0",
         "host" : "one-min-shards-rs0/MBP:20000,MBP:20001,MBP:20002",
         "state" : 1 }
      {  "_id" : "one-min-shards-rs1",
         "host" : "one-min-shards-rs1/MBP:20003,MBP:20004,MBP:20005",
         "state" : 1 }
  active mongoses:
      "3.6.1" : 1
  autosplit:
      Currently enabled: no
  balancer:
      Currently enabled: yes
      Currently running: no
      Failed balancer rounds in last 5 attempts: 0
      Migration Results for the last 24 hours:
          6 : Success
  databases:
      {  "_id" : "accounts",  "primary" : "one-min-shards-rs1",
         "partitioned" : true }
          accounts.users
              shard key: { "username" : 1 }
              unique: false
              balancing: true
              chunks:
                  one-min-shards-rs0  6
                  one-min-shards-rs1  7
              { "username" : { "$minKey" : 1 } } -->> { "username" : "user17256" }
                  on : one-min-shards-rs0 Timestamp(2, 0)
              { "username" : "user17256" } -->> { "username" : "user24515" }
                  on : one-min-shards-rs0 Timestamp(3, 0)
              { "username" : "user24515" } -->> { "username" : "user31775" }
                  on : one-min-shards-rs0 Timestamp(4, 0)
              { "username" : "user31775" } -->> { "username" : "user39034" }
                  on : one-min-shards-rs0 Timestamp(5, 0)
              { "username" : "user39034" } -->> { "username" : "user46294" }
                  on : one-min-shards-rs0 Timestamp(6, 0)
              { "username" : "user46294" } -->> { "username" : "user53553" }
                  on : one-min-shards-rs0 Timestamp(7, 0)
              { "username" : "user53553" } -->> { "username" : "user60812" }
                  on : one-min-shards-rs1 Timestamp(7, 1)
              { "username" : "user60812" } -->> { "username" : "user68072" }
                  on : one-min-shards-rs1 Timestamp(1, 7)
              { "username" : "user68072" } -->> { "username" : "user75331" }
                  on : one-min-shards-rs1 Timestamp(1, 8)
              { "username" : "user75331" } -->> { "username" : "user82591" }
                  on : one-min-shards-rs1 Timestamp(1, 9)
              { "username" : "user82591" } -->> { "username" : "user89851" }
                  on : one-min-shards-rs1 Timestamp(1, 10)
              { "username" : "user89851" } -->> { "username" : "user9711" }
                  on : one-min-shards-rs1 Timestamp(1, 11)
              { "username" : "user9711" } -->> { "username" : { "$maxKey" : 1 } }
                  on : one-min-shards-rs1 Timestamp(1, 12)
      {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
          config.system.sessions
              shard key: { "_id" : 1 }
              unique: false
              balancing: true
              chunks:
                  one-min-shards-rs0  1
              { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } }
                  on : one-min-shards-rs0 Timestamp(1, 0)
The collection has been split up into 13 chunks, where each chunk is a subset of your data. These are listed by shard key range ({"username" : minValue} -->> {"username" : maxValue} denotes the range of each chunk). Looking at the "on" : shard part of the output, you can see that these chunks have been evenly distributed between the shards.
This process of a collection being split into chunks is shown graphically in Figures 14-3 through 14-5. Before sharding, the collection is essentially a single chunk. Sharding splits it into smaller chunks based on the shard key, as shown in Figure 14-4. These chunks can then be distributed across the cluster, as Figure 14-5 shows.
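The split step can be sketched in a few lines. This is an illustrative model only, not MongoDB's splitting algorithm: given a list of split points on the shard key, the single original range becomes a list of adjacent chunks.

```javascript
// Illustrative sketch (not MongoDB's real splitting code): before sharding,
// a collection is effectively one chunk spanning $minKey..$maxKey; each
// split point on the shard key produces smaller adjacent ranges.
function splitChunks(splitPoints) {
  const bounds = ["$minKey", ...splitPoints, "$maxKey"];
  const chunks = [];
  for (let i = 0; i < bounds.length - 1; i++) {
    chunks.push({ min: bounds[i], max: bounds[i + 1] });
  }
  return chunks;
}

// One chunk before any splits:
console.log(splitChunks([]));
// → [ { min: '$minKey', max: '$maxKey' } ]

// Two split points produce three adjacent chunks:
console.log(splitChunks(["user17256", "user24515"]));
```

Note how each chunk's max is the next chunk's min, so the chunks tile the entire key space with no gaps; the balancer can then move whole chunks between shards.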
Notice the keys at the beginning and end of the chunk list: $minKey and $maxKey. $minKey can be thought of as “negative infinity.” It is smaller than any other value in MongoDB. Similarly, $maxKey is like “positive infinity.” It is greater than any other value. Thus, you’ll always see these as the “caps” on your chunk ranges. The values for your shard key will always be between $minKey and $maxKey. These values are actually BSON types and should not be used in your application; they are mainly for internal use. If you wish to refer to them in the shell, use the MinKey and MaxKey constants.
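To see why these two values can cap every chunk range, here is a small sketch of their ordering behavior. This is a toy comparator, not BSON's real comparison code: it models MinKey as sorting before, and MaxKey after, any other value.

```javascript
// Toy model (not BSON's actual comparison rules): MinKey compares less
// than, and MaxKey greater than, every other value.
const MinKey = Symbol("MinKey");
const MaxKey = Symbol("MaxKey");

function cmp(a, b) {
  if (a === b) return 0;
  if (a === MinKey || b === MaxKey) return -1; // a sorts first
  if (a === MaxKey || b === MinKey) return 1;  // a sorts last
  return a < b ? -1 : a > b ? 1 : 0;           // ordinary comparison
}

console.log(cmp(MinKey, "a1-steak-sauce")); // → -1 (smaller than anything)
console.log(cmp("zzz", MaxKey));            // → -1 (MaxKey is largest)
```

Because every possible shard key value falls strictly between these two sentinels, a chunk list capped by them is guaranteed to cover any document's key.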
Now that the data is distributed across multiple shards, let’s try doing some queries. First, try a query on a specific username:
> db.users.find({username : "user12345"})
{
    "_id" : ObjectId("5a4fb11dbb9ce6070f377880"),
    "username" : "user12345",
    "created_at" : ISODate("2018-01-05T17:08:45.657Z")
}
As you can see, querying works normally. However, let’s run an explain to see what MongoDB is doing under the covers:
> db.users.find({username : "user12345"}).explain()
{
    "queryPlanner" : {
        "mongosPlannerVersion" : 1,
        "winningPlan" : {
            "stage" : "SINGLE_SHARD",
            "shards" : [
                {
                    "shardName" : "one-min-shards-rs0",
                    "connectionString" :
                        "one-min-shards-rs0/MBP:20000,MBP:20001,MBP:20002",
                    "serverInfo" : {
                        "host" : "MBP",
                        "port" : 20000,
                        "version" : "3.6.1",
                        "gitVersion" : "025d4f4fe61efd1fb6f0005be20cb45a004093d1"
                    },
                    "plannerVersion" : 1,
                    "namespace" : "accounts.users",
                    "indexFilterSet" : false,
                    "parsedQuery" : {
                        "username" : {
                            "$eq" : "user12345"
                        }
                    },
                    "winningPlan" : {
                        "stage" : "FETCH",
                        "inputStage" : {
                            "stage" : "SHARDING_FILTER",
                            "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                    "username" : 1
                                },
                                "indexName" : "username_1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                    "username" : [ ]
                                },
                                "isUnique" : false,
                                "isSparse" : false,
                                "isPartial" : false,
                                "indexVersion" : 2,
                                "direction" : "forward",
                                "indexBounds" : {
                                    "username" : [
                                        "[\"user12345\", \"user12345\"]"
                                    ]
                                }
                            }
                        }
                    },
                    "rejectedPlans" : [ ]
                }
            ]
        }
    },
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1515174248, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1515173700, 201)
}
From the "winningPlan" field in the explain output, we can see that our cluster satisfied this query using a single shard, one-min-shards-rs0. Based on the output of sh.status() shown earlier, we can see that user12345 does fall within the key range for the first chunk listed for that shard in our cluster.

Because "username" is the shard key, mongos was able to route the query directly to the correct shard. Contrast that with the results for querying for all of the users:
> db.users.find().explain()
{
    "queryPlanner" : {
        "mongosPlannerVersion" : 1,
        "winningPlan" : {
            "stage" : "SHARD_MERGE",
            "shards" : [
                {
                    "shardName" : "one-min-shards-rs0",
                    "connectionString" :
                        "one-min-shards-rs0/MBP:20000,MBP:20001,MBP:20002",
                    "serverInfo" : {
                        "host" : "MBP.fios-router.home",
                        "port" : 20000,
                        "version" : "3.6.1",
                        "gitVersion" : "025d4f4fe61efd1fb6f0005be20cb45a004093d1"
                    },
                    "plannerVersion" : 1,
                    "namespace" : "accounts.users",
                    "indexFilterSet" : false,
                    "parsedQuery" : {

                    },
                    "winningPlan" : {
                        "stage" : "SHARDING_FILTER",
                        "inputStage" : {
                            "stage" : "COLLSCAN",
                            "direction" : "forward"
                        }
                    },
                    "rejectedPlans" : [ ]
                },
                {
                    "shardName" : "one-min-shards-rs1",
                    "connectionString" :
                        "one-min-shards-rs1/MBP:20003,MBP:20004,MBP:20005",
                    "serverInfo" : {
                        "host" : "MBP.fios-router.home",
                        "port" : 20003,
                        "version" : "3.6.1",
                        "gitVersion" : "025d4f4fe61efd1fb6f0005be20cb45a004093d1"
                    },
                    "plannerVersion" : 1,
                    "namespace" : "accounts.users",
                    "indexFilterSet" : false,
                    "parsedQuery" : {

                    },
                    "winningPlan" : {
                        "stage" : "SHARDING_FILTER",
                        "inputStage" : {
                            "stage" : "COLLSCAN",
                            "direction" : "forward"
                        }
                    },
                    "rejectedPlans" : [ ]
                }
            ]
        }
    },
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1515174893, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1515173709, 514)
}
As you can see from this explain, this query has to visit both shards to find all the data. In general, if we are not using the shard key in the query, mongos will have to send the query to every shard.
Queries that contain the shard key and can be sent to a single shard or a subset of shards are called targeted queries. Queries that must be sent to all shards are called scatter-gather (broadcast) queries: mongos scatters the query to all the shards and then gathers up the results.
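That routing decision can be sketched as follows. This is an illustrative model only (not mongos internals); the shard names match the example cluster, while the owner-lookup function is a toy stand-in for the config servers' routing table.

```javascript
// Illustrative sketch of the routing decision (not mongos internals):
// if the query filter includes the shard key, mongos can target the
// owning shard; otherwise it must scatter the query to every shard.
const shards = ["one-min-shards-rs0", "one-min-shards-rs1"];

function shardsForQuery(filter, shardKey, lookupOwner) {
  if (Object.prototype.hasOwnProperty.call(filter, shardKey)) {
    // Targeted query: only the shard owning this key value is consulted.
    return [lookupOwner(filter[shardKey])];
  }
  // Scatter-gather: every shard is asked; mongos merges the results.
  return shards.slice();
}

// Toy owner lookup standing in for the chunk routing table above:
const owner = (key) => (key < "user53553" ? shards[0] : shards[1]);

console.log(shardsForQuery({ username: "user12345" }, "username", owner));
// → [ 'one-min-shards-rs0' ]
console.log(shardsForQuery({}, "username", owner));
// → both shards
```

Real mongos also targets range predicates on the shard key to the subset of shards owning the matching chunks, but the equality case shown here is the simplest illustration of targeted versus broadcast routing.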
Once you are finished experimenting, shut down the set. Switch back to your original shell and hit Enter a few times to get back to the command line, then run st.stop() to cleanly shut down all of the servers:
> st.stop()
If you are ever unsure of what an operation will do, it can be
helpful to use ShardingTest
to spin
up a quick local cluster and try it out.