Monitoring replication

Replica sets use the oplog (operations log) to keep their members in sync. Every operation is applied on the primary server and then written to the primary server's oplog, which is a capped collection. Secondaries read this oplog asynchronously and apply the operations one by one.
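The flow described above can be pictured with a toy model. This is only a sketch and not how MongoDB implements the oplog: the `ToyOplog` class, its integer timestamps, and the sample operations are all hypothetical, and a bounded `deque` stands in for the capped collection, which discards its oldest entries once it is full.

```python
from collections import deque


class ToyOplog:
    """Hypothetical stand-in for a capped oplog collection."""

    def __init__(self, max_entries):
        # A bounded deque drops its oldest entry when full, like a capped collection.
        self.entries = deque(maxlen=max_entries)
        self.next_ts = 1  # simplified integer timestamp (an assumption)

    def append(self, op):
        # The primary records each operation it has applied.
        self.entries.append((self.next_ts, op))
        self.next_ts += 1

    def since(self, last_ts):
        # A secondary asks for everything newer than what it last applied.
        return [(ts, op) for ts, op in self.entries if ts > last_ts]


oplog = ToyOplog(max_entries=3)
for op in ["insert a", "update a", "insert b", "delete a"]:
    oplog.append(op)

# The secondary has already applied timestamp 1 and now catches up in order.
last_applied_ts = 1
applied = []
for ts, op in oplog.since(last_applied_ts):
    applied.append(op)
    last_applied_ts = ts

print(applied)
print(last_applied_ts)
```

Because the toy log holds only three entries, the oldest operation has already been discarded by the time the fourth is written, and the secondary applies the remaining three in order.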

If the primary server gets overloaded, secondaries may not be able to read and apply operations fast enough, which produces replication lag. Replication lag is measured as the time difference between the last operation applied on the primary and the last operation applied on the secondary, as recorded in the oplog capped collection.

For example, if the time is 4:30:00 pm and the secondary just applied an operation that was applied on our primary server at 4:25:00 pm, this means that the secondary is lagging 5 minutes behind our primary server.
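The same arithmetic can be sketched in Python. This is a minimal illustration rather than a monitoring tool: the helper name `replication_lag_seconds`, the host names, and the dates are assumptions, and the sample dictionary only mimics the `members`, `name`, `stateStr`, and `optimeDate` fields that the `replSetGetStatus` command reports.

```python
from datetime import datetime


def replication_lag_seconds(status):
    """Return {member name: lag in seconds} for every secondary."""
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical status document matching the 4:30 pm / 4:25 pm example.
sample_status = {
    "members": [
        {
            "name": "db1.example.net:27017",
            "stateStr": "PRIMARY",
            "optimeDate": datetime(2024, 1, 1, 16, 30, 0),
        },
        {
            "name": "db2.example.net:27017",
            "stateStr": "SECONDARY",
            "optimeDate": datetime(2024, 1, 1, 16, 25, 0),
        },
    ]
}

lag = replication_lag_seconds(sample_status)
print(lag)  # {'db2.example.net:27017': 300.0}
```

With a live replica set, a status document of this shape could be obtained through a driver, for example with PyMongo's `client.admin.command("replSetGetStatus")`, and passed to the same function.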

In our production cluster, replication lag should be close to or equal to zero.
