ZooKeeper best practices

Some of the best practices for running and managing a ZooKeeper ensemble are shown here:

  • The ZooKeeper data directory contains the snapshot and transaction log files. It is a good practice to clean up this directory periodically if the autopurge option is not enabled. An administrator might also want to keep a backup of these files, depending on the application's needs. However, since ZooKeeper is a replicated service, we need to back up the data of only one of the servers in the ensemble.
  • ZooKeeper uses Apache log4j as its logging infrastructure. Because the logfiles grow over time, it is recommended that you set up automatic rollover of the ZooKeeper logfiles using the built-in log4j feature.
  • The list of ZooKeeper servers used by the clients in their connection strings must match the list of ZooKeeper servers configured on each ZooKeeper server. Strange behaviors might occur if the lists don't match.
  • The server list in each ZooKeeper server's configuration file should be consistent with those of the other members of the ensemble.
  • As already mentioned, the ZooKeeper transaction log must be placed on a dedicated device. This is very important for achieving the best performance from ZooKeeper.
  • The Java heap size should be chosen with care. Swapping should never be allowed to happen on a ZooKeeper server, so it is better if the ZooKeeper servers have a reasonably large amount of memory (RAM).
  • System monitoring tools such as vmstat can be used to monitor virtual memory statistics and decide on the optimal amount of memory, depending on the needs of the application. In any case, swapping should be avoided.
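
The server-list checks described above can be automated. The following is a minimal Python sketch, not an official ZooKeeper tool: it reads the server.N entries from a zoo.cfg-style text and compares their hostnames with those in a client connection string. The function names, hostnames, and paths are hypothetical and chosen only for illustration. The sample configuration also shows the autopurge settings and a separate dataLogDir for the transaction log, assuming /zk-txlog is mounted on a dedicated device.

```python
# Hypothetical sample configuration; hostnames and paths are illustrative only.
SAMPLE_CFG = """
dataDir=/var/lib/zookeeper
# Transaction log on its own device (assumed mount point)
dataLogDir=/zk-txlog
# Keep the 3 most recent snapshots and purge every 24 hours
autopurge.snapRetainCount=3
autopurge.purgeInterval=24
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
"""


def servers_in_config(cfg_text):
    """Return the set of hostnames found in server.N lines."""
    hosts = set()
    for line in cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip().startswith("server."):
            # The value looks like host:peerPort:electionPort; keep the host.
            hosts.add(value.strip().split(":")[0])
    return hosts


def hosts_in_connect_string(connect_string):
    """Return the set of hostnames in a client connection string."""
    # Drop an optional chroot suffix such as "/app".
    host_part = connect_string.split("/", 1)[0]
    return {
        item.strip().rsplit(":", 1)[0]  # strip the client port if present
        for item in host_part.split(",")
        if item.strip()
    }


def compare(cfg_text, connect_string):
    """Return (servers missing from the client, hosts unknown to the server)."""
    cfg_hosts = servers_in_config(cfg_text)
    client_hosts = hosts_in_connect_string(connect_string)
    return sorted(cfg_hosts - client_hosts), sorted(client_hosts - cfg_hosts)


if __name__ == "__main__":
    missing, unknown = compare(SAMPLE_CFG, "zoo1:2181,zoo2:2181,zoo3:2181")
    print("missing from client:", missing)
    print("unknown to server:", unknown)
```

Running the same comparison against the configuration file of every ensemble member would also catch servers whose lists disagree with one another. The sketch handles only plain hostnames and IPv4 addresses; IPv6 literals and other server-line variants would need extra parsing.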