Resyncing a member of a replica set

Secondaries sync up with the primary by replaying the contents of the oplog. If our oplog is not large enough, or if the secondary is cut off (network partitioning, an underperforming network, or just an outage of the secondary server) for longer than the period of time that the oplog covers, then MongoDB can no longer use the oplog to catch up to the primary.
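The condition above can be sketched as a simple timestamp comparison. This is only an illustration under stated assumptions: the function names and the sample timestamps are hypothetical, and in practice the figures would come from the replica set itself (for example, the shell helper rs.printReplicationInfo() reports the time range covered by the oplog).

```python
from datetime import datetime, timedelta


def oplog_window(oldest_entry: datetime, newest_entry: datetime) -> timedelta:
    """Span of time covered by the primary's oplog (hypothetical helper)."""
    return newest_entry - oldest_entry


def can_catch_up(oldest_entry: datetime, secondary_last_applied: datetime) -> bool:
    """Return True if the last operation applied by the secondary is still
    inside the primary's oplog, so that replay can resume from that point."""
    return secondary_last_applied >= oldest_entry


# Hypothetical figures: the primary's oplog covers the last 24 hours.
newest = datetime(2024, 1, 2, 12, 0)
oldest = newest - timedelta(hours=24)

print(oplog_window(oldest, newest))
print(can_catch_up(oldest, newest - timedelta(hours=6)))   # True: 6 hours behind
print(can_catch_up(oldest, newest - timedelta(hours=30)))  # False: a resync is needed
```

A secondary that is 6 hours behind can still replay the oplog, whereas one that is 30 hours behind has fallen off the end of a 24-hour oplog and has to be resynced.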

At this point we have two options:

  • The more straightforward option is to delete the contents of our dbpath directory and restart the mongod process. In this case, MongoDB will start an initial sync from scratch. This option has the downside of putting a strain on our replica set and our network, as the entire dataset has to be copied over from another member.
  • The more complicated (from an operational standpoint) option is to copy data files from another well-behaving member of the replica set. This goes back to the contents of Chapter 8, Monitoring, Backup, and Security. The important thing to keep in mind is that a simple file copy will most likely not suffice, as data files will have changed between the time that we start copying and the time that the copying ends.

Thus, we need to be able to take a snapshot copy of the filesystem under our data directory.

Another point of consideration is that once we start our secondary server with the newly copied files, it will, again, try to sync up to the primary using the oplog. So, if the copied data has again fallen so far behind the primary that the entry it needs can no longer be found in the primary's oplog, this method will fail too.

Keep a sufficiently sized oplog. Don't let data grow out of hand in any replica set member. Design, test, and deploy sharding early on.
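
As a rough illustration of what "sufficiently sized" could mean, the following sketch estimates an oplog size from a write rate and the outage we want to be able to survive. It assumes a steady oplog growth rate, which real workloads rarely have; the function name, the safety factor, and the sample figures are all hypothetical.

```python
import math


def required_oplog_bytes(oplog_bytes_per_hour: float,
                         window_hours: float,
                         safety_factor: float = 1.5) -> int:
    """Estimate the oplog size needed to cover `window_hours` of writes.

    Assumes a constant oplog growth rate; `safety_factor` (an assumption,
    not a MongoDB setting) pads the estimate for bursts of writes.
    """
    if oplog_bytes_per_hour < 0 or window_hours < 0 or safety_factor < 1:
        raise ValueError("rate and window must be non-negative, safety_factor >= 1")
    return math.ceil(oplog_bytes_per_hour * window_hours * safety_factor)


# Hypothetical workload: 200 MiB of oplog entries per hour, and we want
# a secondary to survive a 48-hour outage without needing a resync.
MIB = 1024 ** 2
GIB = 1024 ** 3
size = required_oplog_bytes(200 * MIB, window_hours=48)
print(f"{size / GIB:.2f} GiB")
```

The estimate is only as good as the measured write rate, so it is worth measuring during peak load rather than on a quiet day.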