© Subhashini Chellappan and Dharanitharan Ganesan 2020
S. Chellappan, D. GanesanMongoDB Recipeshttps://doi.org/10.1007/978-1-4842-4891-1_5

5. Replication and Sharding

Subhashini Chellappan1  and Dharanitharan Ganesan2
(1)
Bangalore, India
(2)
Krishnagiri, Tamil Nadu, India
 
In Chapter 4, we discussed various indexes in MongoDB. In this chapter, we cover the following topics:
  • Replication.

  • Sharding.

Replication

Replication is the process of creating and managing a duplicate version of a database across servers to provide redundancy and increase availability of data.

In MongoDB, replication is achieved with the help of a replica set, a group of mongod instances that maintain the same data set. A replica set contains one primary node that is responsible for all write operations and one or more secondary nodes that replicate the primary’s oplog and apply the operations to their data sets to reflect the primary’s data set. Figure 5-1 is an illustration of a replica set.
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig1_HTML.jpg
Figure 5-1

A replica set

Recipe 5-1. Set Up a Replica Set

In this recipe, we are going to discuss how to set up a replica set (one primary and two secondaries) in Windows.

Problem

You want to create a replica set.

Solution

Use a group of mongod instances.

How It Works

Let’s follow the steps in this section to set up a three-member replica set.

Step 1: Three-Member Replica Set

First, create three data directories:
md c:mongodb epset s1
md c:mongodb epset s2
md c:mongodb epset s3
Here is the output,
c:>md c:mongodb epset s1
c:>md c:mongodb epset s2
c:>md c:mongodb epset s3
Next, start three mongod instances as shown here (see Figures 5-2, 5-4, and 5-6).
start mongod --bind_ip hostname --dbpath c:mongodb epset s1 --port 20001 --replSet myrs
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig2_HTML.jpg
Figure 5-2

Starting mongod with replica set 1

Note

hostname must be replaced as with ipaddress or localhost if it is the same local machine.

Refer to Figure 5-3 for a mongod instance that is waiting for connection on port 20001.
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig3_HTML.jpg
Figure 5-3

mongod instance waiting for connection on port 20001

start mongod --bind_ip hostname --dbpath c:mongodb epset s2 --port 20002 --replSet myrs
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig4_HTML.jpg
Figure 5-4

Starting mongod with replica set 2

Refer to Figure 5-5 for a mongod instance that is waiting for connection on port 20002.

../images/475461_1_En_5_Chapter/475461_1_En_5_Fig5_HTML.jpg
Figure 5-5

mongod instance waiting for connection on port 20002

start mongod --bind_ip hostname --dbpath c:mongodb epset s3 --port 20003 --replSet myrs
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig6_HTML.jpg
Figure 5-6

Starting mongod with replica set 3

Refer Figure 5-7 for a mongod instance that is waiting for connection on port 20002.
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig7_HTML.jpg
Figure 5-7

mongod instance waiting for connection on port 20003

Next, issue the following command to connect to a mongod instance running on port 20001.
mongo hostname:20001
Figure 5-8 shows the mongo shell that is running on port 20001.
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig8_HTML.jpg
Figure 5-8

Connect to mongo instance on port 20001

Issue the following command in the mongo shell to create a three-member replica set.
rs.initiate();   // to initiate replica set
Here is the output,
> rs.initiate();
{
        "info2" : "no configuration specified. Using a default configuration for the set",
        "me" : "hostname:20001",
        "ok" : 1,
        "operationTime" : Timestamp(1538362864, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1538362864, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
After initiating the replica set, we can add the secondary node by using this command.
rs.add("hostname:20002");  // to add secondary
Here is the output,
myrs:SECONDARY> rs.add("hostname:20002");
{
        "ok" : 1,
        "operationTime" : Timestamp(1538362927, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1538362927, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
myrs:PRIMARY>

Here, the mongod instance running on port 20001 becomes primary.

We can add another secondary node by using the following command.
rs.add("hostname:20003");    // to add secondary
Here is the output,
myrs:PRIMARY>  rs.add("hostname:20003");
{
        "ok" : 1,
        "operationTime" : Timestamp(1538362931, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1538362931, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
myrs:PRIMARY>
Now, you can check the status of the replica set by issuing the following command.
rs.status()
Next, create a collection named employee using the primary replica node.
 db.employee.insert({_id:10001,name:'Subhashini'});
 db.employee.insert({_id:10002,name:'Shobana'});
Here is the output,
myrs:PRIMARY> db.employee.insert({_id:10001,name:'Subhashini'});
WriteResult({ "nInserted" : 1 })
myrs:PRIMARY>  db.employee.insert({_id:10002,name:'Shobana'});
WriteResult({ "nInserted" : 1 })
myrs:PRIMARY>
Now, connect to the mongo shell running on port 20002 by issuing this command.
mongo hostname:20002
Here is the output,
myrs:SECONDARY>
Now, issue the following command to find all employees as shown here.
myrs:SECONDARY> db.employee.find()
Error: error: {
        "operationTime" : Timestamp(1538364766, 1),
        "ok" : 0,
        "errmsg" : "not master and slaveOk=false",
        "code" : 13435,
        "codeName" : "NotMasterNoSlaveOk",
        "$clusterTime" : {
                "clusterTime" : Timestamp(1538364766, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
myrs:SECONDARY>

The output Error: error: { ........... } shows that we are getting an error message because we are trying to read data from the secondary node.

Issue the following command to perform a read operation from a secondary node.
myrs:SECONDARY> rs.slaveOk()
myrs:SECONDARY> db.employee.find()
{ "_id" : 10001, "name" : "Subhashini" }
{ "_id" : 10002, "name" : "Shobana" }
Next, try to perform a write operation from a secondary node.
myrs:SECONDARY> db.employee.insert({_id:10003,name:"Arunaa MS"})
WriteCommandError({
        "operationTime" : Timestamp(1538364966, 1),
        "ok" : 0,
        "errmsg" : "not master",
        "code" : 10107,
        "codeName" : "NotMaster",
        "$clusterTime" : {
                "clusterTime" : Timestamp(1538364966, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
})
myrs:SECONDARY>

In the preceding output, WriteCommandError { ........... } shows that we are getting an error message because we can’t perform write operations in a secondary node. We can perform write operations only in the primary node.

All the secondary nodes replicate the primary’s log and apply their operations to ensure the secondary’s data set reflects the primary’s data set, as shown in Figure 5-9.
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig9_HTML.jpg
Figure 5-9

Replication strategy

Step 2: Auto Failover—High Availability

Kill the primary running on port 20001 and press Enter in the mongo shell running on ports 20002 and 20003 (Figure 5-10). Any one of the secondary nodes can become primary now.
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig10_HTML.jpg
Figure 5-10

Primary failover

When a primary does not communicate with other members in the replica set for a configured period (10 seconds by default), an eligible secondary node calls for an election to nominate itself as the new primary and resume its normal operation (Figure 5-11).
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig11_HTML.jpg
Figure 5-11

Primary failover and new primary election

To enable free monitoring, run the following command.
db.enableFreeMonitoring()
To permanently disable this reminder, run the following command.
db.disableFreeMonitoring()
To check the status of free monitoring, use this command.
db.getFreeMonitoringStatus()

Sharding

Sharding is a method for distributing data across multiple machines. There are two methods for addressing system growth: vertical and horizontal scaling.
  • Vertical scaling: We need to increase the capacity of a single server such as using a more powerful CPU, adding more RAM, or increasing the amount of storage space.

  • Horizontal scaling: We need to divide the data set and distribute the workload across the servers by adding additional servers to increase the capacity as required.

MongoDB supports horizontal scaling through sharding. A MongoDB sharded cluster consists of the following components:
  1. 1.

    Shard: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.

     
  2. 2.

    mongos: The mongos acts as a query router, providing an interface between client applications and the sharded cluster.

     
  3. 3.

    Config servers: Config servers store metadata and configuration settings for the cluster.

     

Recipe 5-2. Sharding

In this recipe, we are going to discuss how to create sharding to distribute data across servers.

Problem

You want to create sharding to distribute data across servers.

Solution

The solution is a group of mongod instances .

How It Works

Let’s follow the steps in this section to set up a sharding.

Step 1: Sharded Cluster

First, create data directories for three shards as shown here.

shard1
md c:shard_datashard1data1
md c:shard_datashard1data2
md c:shard_datashard1data3
shard2
md c:shard_datashard2data1
md c:shard_datashard2data2
md c:shard_datashard2data3
shard3
md c:shard_datashard3data1
md c:shard_datashard3data2
md c:shard_datashard3data3

Next, start the shards as shown here.

shard1
start mongod.exe --shardsvr --port 26017 --dbpath "c:shard_datashard1data1" --replSet shard1_replset
start mongod.exe --shardsvr --port 26117 --dbpath "c:shard_datashard1data2" --replSet shard1_replset
start mongod.exe --shardsvr --port 26217 --dbpath "c:shard_datashard1data3" --replSet shard1_replset
Figure 5-12 shows the execution of commands.
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig12_HTML.jpg
Figure 5-12

Starting the shard1 server

shard2
start mongod.exe --shardsvr --port 28017 --dbpath "c:shard_datashard2data1" --replSet shard2_replset
start mongod.exe --shardsvr --port 28117 --dbpath "c:shard_datashard2data2" --replSet shard2_replset
start mongod.exe --shardsvr --port 28217 --dbpath "c:shard_datashard2data3" --replSet shard2_replset
Figure 5-13 shows the execution of commands.
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig13_HTML.jpg
Figure 5-13

Starting the shard2 server

shard3
start mongod.exe --shardsvr --port 29017 --dbpath "c:shard_datashard3data1" --replSet shard3_replset
start mongod.exe --shardsvr --port 29117 --dbpath "c:shard_datashard3data2" --replSet shard3_replset
start mongod.exe --shardsvr --port 29217 --dbpath "c:shard_datashard3data3" --replSet shard3_replset
Figure 5-14 shows the execution of commands.
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig14_HTML.jpg
Figure 5-14

Starting the shard3 server

Now, connect to one of the shard servers to enable a replica set as shown here and in Figure 5-15.
mongo.exe hostname:26017
C:Program FilesMongoDBServer4.0in> mongo.exe localhost:26017
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig15_HTML.jpg
Figure 5-15

Connect the shard1 server

Next initiate the replica by using the following command in mongo shell as shown here.
MongoDB Enterprise > rs.initiate(
{
_id: "shard1_replset",
members: [
{ _id : 0, host:"hostname:26017" },
{ _id : 1, host:"hostname:26117" },
{ _id : 2, host:"hostname:26217" }]}
)
Figure 5-16 is the snapshot for reference.
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig16_HTML.jpg
Figure 5-16

Initiating the replica on the shard1 server

Now connect to another shard and initiate the replica.
C:Program FilesMongoDBServer4.0in> mongo.exe hostname:28017
MongoDB Enterprise >  rs.initiate(
{
_id: "shard2_replset",
members: [
{ _id : 0, host:"hostname:28017" },
{ _id : 1, host:"hostname:28117" },
{ _id : 2, host:"hostname:28217" }
]
}
)
Now connect to the third shard and initiate the replica,.
C:Program FilesMongoDBServer4.0in> mongo.exe hostname:29017
MongoDB Enterprise >  rs.initiate(
{
_id: "shard3_replset",
members: [
{ _id : 0, host:"hostname:29017" },
{ _id : 1, host:"hostname:29117" },
{ _id : 2, host:"hostname:29217" }
]
}
)
Now, start the config servers by using the commands that follow. Create data directories for the config servers.
md c:shard_dataconfig_server1data1
md c:shard_dataconfig_server1data2
md c:shard_dataconfig_server1data3
Start the config server with a replica set.
start mongod.exe --configsvr --port 47017 --dbpath "c:shard_dataconfig_server1data1" --replSet configserver1_replset
start mongod.exe --configsvr --port 47117 --dbpath "c:shard_dataconfig_server1data2" --replSet configserver1_replset
start mongod.exe --configsvr --port 47217 --dbpath "c:shard_dataconfig_server1data3" --replSet configserver1_replset
Figure 5-17 is a snapshot for reference to start config servers.
../images/475461_1_En_5_Chapter/475461_1_En_5_Fig17_HTML.jpg
Figure 5-17

Starting the config servers

Now we can connect to config servers to enable the replica set.
C:Program FilesMongoDBServer4.0in> mongo.exe hostname:47017
Then initiate the replica set.
MongoDB Enterprise >  rs.initiate(
{
_id: "configserver1_replset",
configsvr: true,
members: [
{ _id : 0, host : "hostname:47017" },
{ _id : 1, host : "hostname:47117" },
{ _id : 2, host : "hostname:47217" }
]
}
)
Start mongos as shown here.
start mongos.exe --configdb configserver1_replset/hostname:47017,hostname:47117,hostname:47217 --port 1000

Now it is time to perform sharding. Here, we are going to use one query router, one config server, and three shards.

Connect to the query router as shown here.
mongo.exe localhost:1000
Here is the output,
2018-10-01T12:47:08.929+0530 I CONTROL  [main]
mongos>
Next, add the three shard servers to the config server as shown here.
sh.addShard("shard1_replset/localhost:26017,localhost:26117,localhost:26217")
Here is the output for the first shard:
mongos> sh.addShard("shard1_replset/localhost:26017,localhost:26117,localhost:26217")
{
        "shardAdded" : "shard1_replset",
        "ok" : 1,
        "operationTime" : Timestamp(1538378773, 7),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1538378773, 7),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
mongos>
sh.addShard("shard2_replset/localhost:28017,localhost:28117,localhost:28217")
Here is the output for the second shard.
mongos> sh.addShard("shard2_replset/localhost:28017,localhost:28117,localhost:28217")
{
        "shardAdded" : "shard2_replset",
        "ok" : 1,
        "operationTime" : Timestamp(1538378886, 4),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1538378886, 6),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
mongos>
sh.addShard("shard3_replset/localhost:29017,localhost:29117,localhost:29217")
Here is the output for the third shard.
mongos> sh.addShard("shard3_replset/localhost:29017,localhost:29117,localhost:29217")
{
        "shardAdded" : "shard3_replset",
        "ok" : 1,
        "operationTime" : Timestamp(1538378938, 4),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1538378938, 4),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
Issue the following command to check the status of the sharding.
mongos> sh.status()
Now, to enable sharding for a database, use this command.
mongos> sh.enableSharding("demos")
To enable sharding for a collection, use this code.
 mongos> sh.shardCollection("demos.users",{"id":1}
Here is the output,
mongos> sh.shardCollection("demos.users",{"id":1})
{
        "collectionsharded" : "demos.users",
        "collectionUUID" : UUID("0122f602-212e-4c79-8be7-5c7a63676c8b"),
        "ok" : 1,
        "operationTime" : Timestamp(1538379345, 15),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1538379345, 15),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
Next we can create a huge collection using the following commands.
mongos> use demos;
for(var i=0;i<10000;i++){db.users.insert({id: Math.random(), count:i, date: new Date()})}
Issue the next command to count the number of users.
mongos> db.users.count()
10000
Issue the following command to see the distribution of the user collection .
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("5bb1c63e9bbb23ade7b2dd95")
  }
  shards:
        {  "_id" : "shard1_replset",  "host" : "shard1_replset/localhost:26017,localhost:26117,localhost:26217",  "state" : 1 }
        {  "_id" : "shard2_replset",  "host" : "shard2_replset/localhost:28017,localhost:28117,localhost:28217",  "state" : 1 }
        {  "_id" : "shard3_replset",  "host" : "shard3_replset/localhost:29017,localhost:29117,localhost:29217",  "state" : 1 }
  active mongoses:
        "4.0.2" : 1
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                No recent migrations
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                shard1_replset  1
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : shard1_replset Timestamp(1, 0)
        {  "_id" : "demos",  "primary" : "shard2_replset",  "partitioned" : true,  "version" : {  "uuid" : UUID("52ced1e7-6afd-4554-a310-609d62c11450"),  "lastMod" : 1 } }
                demos.users
                        shard key: { "id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                shard2_replset  1
                        { "id" : { "$minKey" : 1 } } -->> { "id" : { "$maxKey" : 1 } } on : shard2_replset Timestamp(1, 0)

This shows that user data is distributed to shard1 and shard2.

Note

All these demos are executed in a Windows environment.

Next, to shard a collection using hashed sharding, enable sharding for a database as shown here.
sh.enableSharding("sample")
Here is the output,
mongos> sh.enableSharding("sample")
{
        "ok" : 1,
        "operationTime" : Timestamp(1554705786, 5),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1554705786, 5),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
Finally, use the following command to shard a collection using hashed sharding.
sh.shardCollection("sample.users",{"id":"hashed"})
Here is the output,
mongos> sh.shardCollection("sample.users",{"id":"hashed"})
{
        "collectionsharded" : "sample.users",
        "collectionUUID" : UUID("4fb2750a-ec5d-4ae0-be8f-a8fbea8294ab"),
        "ok" : 1,
        "operationTime" : Timestamp(1554705891, 13),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1554705891, 25),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.12.163.180