Time for action — setting up master/slave replication

Let's set up basic master/slave replication. We will need two machines for this.

First, start the master:

server-1$ mongod --master

Now, we will start the slave server:

server-2$ mongod --slave --source server-1

That's it! Now server-2 is a slave of server-1, and all the databases on server-1 are seamlessly replicated to server-2.
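The same options can also live in a configuration file instead of on the command line. A minimal sketch, assuming hypothetical file paths (the ini-style keys mirror the flags used above):

```shell
# /etc/mongodb-master.conf (hypothetical path), equivalent to --master:
#   master = true
#   dbpath = /usr/local/var/mongodb
#
# /etc/mongodb-slave.conf (hypothetical path), equivalent to
# --slave --source server-1:
#   slave = true
#   source = server-1
#   dbpath = /usr/local/var/mongodb

server-1$ mongod --config /etc/mongodb-master.conf
server-2$ mongod --config /etc/mongodb-slave.conf
```

This keeps each server's replication role in one place instead of in your startup scripts.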

Note

If server-1 goes down, you need to change the configuration of the application to point to server-2; master/slave replication does not provide automatic failover.

What just happened?

We fired two simple commands and saw that everything started working. Let's understand them in detail:

$ sudo mongod --master -vvvv

This command will pick up the default mongod.conf file and start this server as the master!

Note

Remember that -vvvv means very verbose. The more v's you add, the more verbose the output on the console.

If all is well, you should see this on the console:

[initandlisten] MongoDB starting : pid=53165 port=27017 dbpath=/usr/local/var/mongodb master=1 64-bit host=server-1
[initandlisten] db version v2.0.2, pdfile version 4.5
...
[initandlisten] Accessing: local for the first time
[initandlisten] query local.system.namespaces reslen:20 0ms
...
[initandlisten] master=true
[initandlisten] ******
[initandlisten] creating replication oplog of size: 183MB...
[initandlisten] create collection local.oplog.$main { size: 192000000.0, capped: true, autoIndexId: false }
[initandlisten] New namespace: local.oplog.$main
[initandlisten] New namespace: local.system.namespaces
...
[FileAllocator] allocating new datafile /usr/local/var/mongodb/local.ns, filling with zeroes...
[FileAllocator] creating directory /usr/local/var/mongodb/_tmp
[FileAllocator] done allocating datafile /usr/local/var/mongodb/local.ns, size: 16MB, took 2.174 secs
[FileAllocator] allocating new datafile /usr/local/var/mongodb/local.0, filling with zeroes...
...
[initandlisten] runQuery called local.oplog.$main { query: {}, orderby: { $natural: -1 } }
[initandlisten] query local.oplog.$main ntoreturn:1 nscanned:1 nreturned:1 reslen:64 372ms
...
[initandlisten] waiting for connections on port 27017
[websvr] fd limit hard:9223372036854775807 soft:256 max conn: 204
[websvr] admin web console waiting for connections on port 28017

The console log is very detailed, and it helps us understand how MongoDB replication works. Let's look at it in smaller parts:

[initandlisten] master=true

[initandlisten] ******

[initandlisten] creating replication oplog of size: 183MB...
[initandlisten] create collection local.oplog.$main { size: 192000000.0, capped: true, autoIndexId: false }

[initandlisten] New namespace: local.oplog.$main
[initandlisten] New namespace: local.system.namespaces

We can see that the server has started as the master. local.oplog.$main is a capped collection that stores all the operation log entries that will be replicated over to the slaves.

[FileAllocator] allocating new datafile /usr/local/var/mongodb/local.ns, filling with zeroes...
[FileAllocator] creating directory /usr/local/var/mongodb/_tmp
[FileAllocator] done allocating datafile /usr/local/var/mongodb/local.ns, size: 16MB, took 2.174 secs

When we set up the master for the first time, the local.oplog.$main capped collection and the local namespace are created (and, depending on the machine, this can take a few minutes!).
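The oplog size can be set explicitly with the --oplogSize option (in megabytes) if the default allocation does not suit your write volume. For example:

```shell
# start the master with a 1 GB oplog instead of the default size;
# a larger oplog lets slaves fall further behind before needing a full resync
server-1$ mongod --master --oplogSize 1024
```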

...
[initandlisten] runQuery called local.oplog.$main { query: {}, orderby: { $natural: -1 } }

[initandlisten] query local.oplog.$main ntoreturn:1 nscanned:1 nreturned:1 reslen:64 372ms
...

This is where the oplog is read in reverse natural order to find the latest entry. After this, the master server waits for connections and serves requests normally.
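We can issue the same query ourselves to peek at the newest oplog entry. A sketch using the mongo shell against the local database on the master:

```shell
# print the most recent oplog entry; sorting by $natural: -1 reads the
# capped collection in reverse insertion order
server-1$ mongo local --eval 'printjson(db.getCollection("oplog.$main").find().sort({ $natural: -1 }).limit(1).next())'
```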

Now let's see what happens when a slave connects:

$ sudo mongod --slave --source 192.168.1.141
[initandlisten] MongoDB starting : pid=20653 port=27017 dbpath=/usr/local/var/mongodb slave=1 64-bit host=server-2
...
[replslave] repl: from host:192.168.1.141
[replslave] repl: applied 1 operations
[replslave] repl: end sync_pullOpLog syncedTo: Apr 5 15:33:41 4f7d6dfd:1
[replslave] repl: sleep 1 sec before next pass

At this point, the slave has sent a request to the master for syncing and received a reply. A lot of interesting things happen on the master:

[initandlisten] connection accepted from 192.168.1.153:63591 #1
[conn1] runQuery called admin.$cmd { handshake: ObjectId('4f7d6d3fb7d32a318178619f') }
[conn1] run command admin.$cmd { handshake: ObjectId('4f7d6d3fb7d32a318178619f') }

[conn1] command admin.$cmd command: { handshake: ObjectId('4f7d6d3fb7d32a318178619f') } ntoreturn:1 reslen:37 0ms
[conn1] runQuery called local.oplog.$main { query: {}, orderby: { $natural: -1 } }
[conn1] query local.oplog.$main ntoreturn:1 nreturned:1 reslen:64 0ms

This is the master/slave handshake, in which they exchange object IDs so that the master knows which slave has connected:

[conn1] runQuery called admin.$cmd { listDatabases: 1 }
[conn1] run command admin.$cmd { listDatabases: 1 }

[conn1] command: { listDatabases: 1 }

Next, the master checks which databases should be replicated:

[conn1] command admin.$cmd command: { listDatabases: 1 } ntoreturn:1 reslen:195 1143ms
[conn1] runQuery called local.oplog.$main { ts: { $gte: new Date(5727855097040338945) } }

[conn1] query local.oplog.$main nreturned:1 reslen:64 47ms
BackgroundJob starting: SlaveTracking

Now, it checks the transaction log (local.oplog.$main) to see where replication should start from, and then spawns a SlaveTracking background job. This happens as follows:

[slaveTracking] New namespace: local.slaves
[slaveTracking] adding _id index for collection local.slaves

[slaveTracking] New namespace: local.system.indexes
[slaveTracking] build index local.slaves { _id: 1 }

mem info: before index start vsize: 3509 resident: 41 mapped: 544
[slaveTracking] external sort root: /usr/local/var/mongodb/_tmp/esort.1333620219.2003184756/
mem info: before final sort vsize: 3509 resident: 41 mapped: 544
mem info: after final sort vsize: 3509 resident: 41 mapped: 544
[slaveTracking] external sort used : 0 files in 0 secs
[slaveTracking] New namespace: local.slaves.$_id_
[slaveTracking] done building bottom layer, going to commit
[slaveTracking] fastBuildIndex dupsToDrop:0
[slaveTracking] build index done 0 records 0.023 secs

If the local.slaves collection does not exist yet, the master creates and indexes it:

[slaveTracking] update local.slaves query: { _id: ObjectId('4f7d6d3fb7d32a318178619f'), host: "192.168.1.153", ns: "local.oplog.$main" } update: { $set: { syncedTo: Timestamp 1333620189000|1 } } fastmodinsert:1 134ms

Here, the slave is added with its host information and its replication timestamp. After this is done, continuous sync commands go back and forth between the master and the slave, like this:
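The master's view of its slaves can be inspected at any time by querying this collection directly:

```shell
# list the known slaves and the oplog timestamp each one has synced to
server-1$ mongo local --eval 'db.slaves.find().forEach(printjson)'
```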

[conn1] getmore local.oplog.$main query: { ts: { $gte: new Date(5727855097040338945) } } cursorid:1979419191886059940 reslen:20 2311ms
[conn1] running multiple plans
[conn1] getmore local.oplog.$main query: { ts: { $gte: new Date(5727855097040338945) } } cursorid:1979419191886059940 nreturned:1 reslen:64 886ms

Tip

The sync commands are continuous; they do not directly interfere with the master's routine database processing, but they can consume valuable CPU and network resources.

It is sometimes useful to keep the slave a fixed duration behind the master; what counts as an acceptable delay depends on the application. We use the --slavedelay option for this.
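For example, to keep the slave deliberately one hour behind the master (a common guard against an accidental delete being replicated immediately):

```shell
# --slavedelay takes seconds; --autoresync lets the slave resynchronize
# automatically if it falls too far behind the master's oplog
server-2$ mongod --slave --source server-1 --slavedelay 3600 --autoresync
```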

What happens if the master goes down? The slave shows log entries like this:

[replslave] repl: from host:192.168.1.141
[replslave] repl: AssertionException dbclient error communicating with server: 192.168.1.141
repl: sleep 2 sec before next pass
[replslave] repl: from host:192.168.1.141
[replslave] repl: couldn't connect to server 192.168.1.141

[replslave] repl: sleep 3 sec before next pass
[replslave] repl: from host:192.168.1.141

Once the master comes up again, syncing resumes.

Tip

It is possible to have a configuration in which writes always go to the master, while reads can be served by either the master or the slave.
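By default, the mongo shell refuses to read from a non-master node; the shell (or driver) has to opt in. A sketch of writing on the master and reading on the slave:

```shell
# write on the master
server-1$ mongo test --eval 'db.things.save({ check: "replication" })'

# read on the slave; setSlaveOk() tells the shell that reads from a
# non-master node are acceptable for this connection
server-2$ mongo test --eval 'db.getMongo().setSlaveOk(); printjson(db.things.findOne())'
```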

Note

If you want to simulate the master/slave configuration on a single machine, remember to run the slave on a different port:

$ sudo mongod --slave --source localhost --port 27123

Using replica sets

Using replica sets is the recommended approach for replication and failover.

Note

Replica sets are available in MongoDB v1.6 and later.

Replica sets, as the name suggests, are a group of MongoDB nodes that work together and keep replicas of the data. This is not a master/slave configuration: the nodes elect a leader, which then behaves as the master, while the other nodes become slaves and receive replicated data. In replica set terminology, these roles are called PRIMARY and SECONDARY, respectively.

To ensure write consistency, we write to the PRIMARY and, if required, read from a SECONDARY. The beauty of replica sets is the election process: nodes exchange handshakes, vote for or veto candidates, and finally elect a PRIMARY. We can also add arbiters to ensure there are enough members for the voting process.
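A minimal replica set can be sketched as follows (the set name myset, hostnames, and data paths are placeholders; in production each member runs on its own machine):

```shell
# start three members of a replica set named "myset"
server-1$ mongod --replSet myset --port 27017 --dbpath /data/rs1
server-2$ mongod --replSet myset --port 27017 --dbpath /data/rs2
server-3$ mongod --replSet myset --port 27017 --dbpath /data/rs3

# initiate the set from any one member; the members then hold an
# election and one of them becomes PRIMARY
server-1$ mongo --eval 'rs.initiate({ _id: "myset", members: [
    { _id: 0, host: "server-1:27017" },
    { _id: 1, host: "server-2:27017" },
    { _id: 2, host: "server-3:27017" } ] })'

# check the members and their roles at any time
server-1$ mongo --eval 'printjson(rs.status())'
```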

Note

Arbiters are very lightweight MongoDB instances that only vote! They are not replication nodes and are involved only in the voting process.
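An arbiter is started like any other member and then added to the set with rs.addArb(). A sketch, reusing the hypothetical myset configuration (host and port are placeholders):

```shell
# start the arbiter process; it stores no replicated data, it only votes
server-3$ mongod --replSet myset --port 30000 --dbpath /data/arb

# from the PRIMARY, register it as an arbiter member of the set
server-1$ mongo --eval 'rs.addArb("server-3:30000")'
```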
