Deploying a Sharded MongoDB Cluster

Deploying a sharded MongoDB cluster involves several steps to set up the different types of servers and then configure the databases and collections. To deploy a sharded MongoDB cluster, follow these basic steps:

1. Create config server database instances.

2. Start query router servers.

3. Add shards to the cluster.

4. Enable sharding on a database.

5. Enable sharding on a collection.

6. Set up shard tag ranges.

The following sections describe each of these steps in more detail.


Warning

All members of a sharded cluster must be able to connect to all other members of the cluster, including all shards and all config servers. Ensure that your network and security systems, including all interfaces and firewalls, allow these connections.


Creating Config Server Database Instances (mongod)

The config server processes are simply mongod instances that store a cluster’s metadata instead of collection data. Each config server stores a complete copy of the cluster’s metadata. In production deployments, you must deploy exactly three config server instances, each running on a different server, to ensure high availability and data integrity.

To implement the config servers, you need to perform the following steps on each one:

1. Create a data directory to store the config database.

2. Start the config server instance, passing the path to the data directory created in step 1 with the --dbpath option and including the --configsvr option to denote that this is a config server. For example:

mongod --configsvr --dbpath <path> --port <port>

3. Once the mongod instance starts up, the config server is ready.


Note

The default port for config servers is 27019.
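
As a minimal sketch of step 2, the following JavaScript builds the command line for one config server. The function name configServerCommand and the /data/configdb directory are hypothetical and chosen only for illustration. The port falls back to 27019 when none is given.

```javascript
// Build the command line that starts one config server.
// dbpath is the data directory created in step 1; port defaults to 27019,
// the default config server port.
function configServerCommand(dbpath, port = 27019) {
  if (!dbpath) {
    throw new Error("a data directory (--dbpath) is required");
  }
  return `mongod --configsvr --dbpath ${dbpath} --port ${port}`;
}

// Hypothetical data directory, used here only as an example.
console.log(configServerCommand("/data/configdb"));
```

Run the resulting command on each of the three config servers, substituting the data directory you created on that server.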


Starting Query Router Servers (mongos)

The query router (mongos) servers do not require database directories because the configuration is stored on the config servers and the data is stored on the shard servers. The mongos servers are very lightweight, so it is acceptable to run a mongos instance on the same system that runs your application server.

You can create multiple instances of the mongos servers to route requests to the sharded cluster. However, to ensure high availability, you don’t want these instances running on the same system.

To start an instance of the mongos server, you need to pass in the --configdb parameter with a list of the DNS names/hostnames of the config servers you want to use for the cluster. For example:

mongos --configdb c1.test.net:27019,c2.test.net:27019,c3.test.net:27019

By default, a mongos instance runs on port 27017. However, you can configure a different port by using the --port <port> command-line option.
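
The mongos command line can be assembled from a list of config servers, as in the rough sketch below. The helper name mongosCommand is hypothetical. The check for exactly three hosts reflects the production guidance given earlier in this chapter and is an assumption of this sketch, not something mongos requires you to script.

```javascript
// Build the mongos command line from a list of config server addresses.
function mongosCommand(configServers, port) {
  if (configServers.length !== 3) {
    throw new Error("production clusters use exactly three config servers");
  }
  // --configdb takes the addresses as one comma-separated value.
  const args = ["mongos", "--configdb", configServers.join(",")];
  if (port !== undefined) {
    // Leave out --port to keep the default port, 27017.
    args.push("--port", String(port));
  }
  return args.join(" ");
}

console.log(mongosCommand([
  "c1.test.net:27019",
  "c2.test.net:27019",
  "c3.test.net:27019",
]));
```

Passing a second argument, such as mongosCommand(hosts, 27020), appends the --port option to the generated command.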


Tip

To avoid downtime, give each config server a logical DNS name (unrelated to the server’s physical or virtual hostname). Without logical DNS names, moving or renaming a config server requires shutting down every mongod and mongos instance in the sharded cluster.


Adding Shards to the Cluster (mongod)

The shard servers in a cluster are just standard MongoDB servers started with the mongod command. Each shard can be a stand-alone server or a replica set. To add the MongoDB servers as shards in the cluster, all you need to do is access the mongos server from the MongoDB shell and use the sh.addShard() method.

The syntax for the sh.addShard() command is:

sh.addShard(<replica_set_or_server_address>)

For example, to add a replica set named rs1 on a server named mgo1.test.net as a shard in the cluster, execute the following command from the MongoDB shell on the mongos server:

sh.addShard( "rs1/mgo1.test.net:27017" )

Similarly, to add a stand-alone server named mgo1.test.net as a shard in the cluster, execute the following command from the MongoDB shell on the mongos server:

sh.addShard( "mgo1.test.net:27017" )

Once you have added all the shards to the cluster, the cluster members will be communicating and the cluster can begin sharding data. However, for preexisting data, it will take some time for the chunks to be fully distributed.
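
When you have several shards to add, a small helper can format the two address forms shown above and call sh.addShard() for each one. The following is only a sketch: the names shardAddress and addShards are hypothetical, and the sh helper is passed in as an argument so that the code does not depend on a live cluster.

```javascript
// Format the address string that sh.addShard() expects:
// "<replica_set>/<host>:<port>" for a replica set, "<host>:<port>" otherwise.
function shardAddress(host, port, replicaSet) {
  const address = `${host}:${port}`;
  return replicaSet ? `${replicaSet}/${address}` : address;
}

// Add each shard in the list through the supplied sh helper and
// return the individual results in the same order.
function addShards(sh, shards) {
  return shards.map((s) => sh.addShard(shardAddress(s.host, s.port, s.replicaSet)));
}
```

From a shell connected to a mongos instance, you might call addShards(sh, [{ host: "mgo1.test.net", port: 27017, replicaSet: "rs1" }]), assuming your shell accepts this JavaScript syntax.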

Enabling Sharding on a Database

Prior to sharding a collection, you need to enable sharding on the database it resides in. Enabling sharding doesn’t automatically redistribute the data but just assigns a primary shard for the database and makes other configuration adjustments that make it possible to enable the collections for sharding.

To enable the database for sharding, you need to connect to a mongos instance using the MongoDB shell and issue the sh.enableSharding(database) command. For example, to enable a database named bigWords, you would use:

sh.enableSharding("bigWords");

Enabling Sharding on a Collection

Once the database has been enabled for sharding, you are ready to enable sharding at the collection level. You do not need to enable sharding for all collections in the database, only for those where it makes sense.

Use the following steps to enable sharding on a collection:

1. Determine which field(s) will be used for the shard key, as described above.

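2. Create an index on the key field(s) by using ensureIndex(), as described earlier in this chapter. Switch to the database first so that the index is built on the intended collection. Note that a hashed index, as used here, cannot also be unique:

use myDB
db.myCollection.ensureIndex( { _id : "hashed" } )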

3. Enable sharding on the collection by using sh.shardCollection(<database>.<collection>, shard_key). shard_key is the pattern used to create the index. For example:

sh.shardCollection("myDB.myCollection", { "_id": "hashed" } )
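
The database and collection steps can be combined into one function, as in the sketch below. The function name enableAndShard is hypothetical, and the sh and db shell helpers are passed in as arguments so that the order of the calls can be checked without a running cluster. The sketch keeps ensureIndex() to match the text; later MongoDB releases name this method createIndex().

```javascript
// Run the sharding steps for one collection in the required order.
// sh and db are the shell helper objects, supplied by the caller.
function enableAndShard(sh, db, dbName, collName, shardKey) {
  // Sharding must be enabled on the database before any of its collections.
  sh.enableSharding(dbName);
  // Index the shard key field(s) on the target collection.
  db.getSiblingDB(dbName).getCollection(collName).ensureIndex(shardKey);
  // Shard the collection with the same key pattern used for the index.
  return sh.shardCollection(dbName + "." + collName, shardKey);
}
```

From a shell connected to a mongos instance, the call for the example above would be enableAndShard(sh, db, "myDB", "myCollection", { _id: "hashed" }).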

Setting Up Shard Tag Ranges

Once you have enabled sharding on a collection, you might want to add tags to target specific ranges of the shard key values. A good example is a collection that is sharded by zip code. You can add tags for specific city codes, such as NYC and SFO, and specify the zip code ranges for those cities. This ensures that documents for a specific city are stored on a single shard in the cluster, which can improve performance for queries that span multiple zip codes in the same city.

To set up shard tags, you simply need to add a tag to the shard by using the sh.addShardTag(shard_server, tag_name) command from a mongos instance. For example:

sh.addShardTag("shard0001", "NYC")
sh.addShardTag("shard0002", "SFO")

Then you specify a range for a specific tag. In this case, you would add the zip code ranges for each city tag by using the sh.addTagRange(collection_path, startValue, endValue, tag_name) command from the mongos instance. For example:

sh.addTagRange("records.users", { zipcode: "10001" }, { zipcode: "10281" }, "NYC")
sh.addTagRange("records.users", { zipcode: "11201" }, { zipcode: "11240" }, "NYC")
sh.addTagRange("records.users", { zipcode: "94102" }, { zipcode: "94135" }, "SFO")

Notice that multiple ranges are added for the NYC tag. This allows you to specify multiple ranges within the same tag that is assigned to a single shard.
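
To see how the ranges above map zip codes to tags, consider the following sketch. It only simulates the range matching and does not talk to a cluster. The names tagRanges and tagForZip are hypothetical. The sketch treats the start value of each range as inclusive and the end value as exclusive, which is how MongoDB interprets tag ranges.

```javascript
// The tag ranges from the example, keyed on the zipcode field.
const tagRanges = [
  { min: "10001", max: "10281", tag: "NYC" },
  { min: "11201", max: "11240", tag: "NYC" },
  { min: "94102", max: "94135", tag: "SFO" },
];

// Return the tag whose range covers the zip code, or null if none does.
// Each range includes its lower bound and excludes its upper bound.
function tagForZip(zipcode, ranges) {
  const match = ranges.find((r) => zipcode >= r.min && zipcode < r.max);
  return match ? match.tag : null;
}

console.log(tagForZip("10016", tagRanges)); // NYC
console.log(tagForZip("11215", tagRanges)); // NYC
console.log(tagForZip("94110", tagRanges)); // SFO
console.log(tagForZip("10281", tagRanges)); // null (upper bound is excluded)
console.log(tagForZip("60601", tagRanges)); // null (no tag covers this value)
```

Documents whose shard key falls outside every tag range, such as the last two values, are not tied to a tagged shard.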

If you need to remove a shard tag later, you can do so by using the sh.removeShardTag(shard_server, tag_name) method. For example:

sh.removeShardTag("shard0002", "SFO")
