Chapter 16. Choosing a Shard Key

The most important task when using sharding is choosing how your data will be distributed. To make intelligent choices about this, you have to understand how MongoDB distributes data. This chapter helps you make a good choice of shard key by covering:

  • How to decide among multiple possible shard keys

  • Shard keys for several use cases

  • What you can’t use as a shard key

  • Some alternative strategies if you want to customize how data is distributed

  • How to manually shard your data

It assumes that you understand the basic components of sharding as covered in the previous two chapters.

Taking Stock of Your Usage

When you shard a collection you choose a field or two to use to split up the data. This key (or keys) is called a shard key. Once you shard a collection you cannot change your shard key, so it is important to choose correctly.

To choose a good shard key, you need to understand your workload and how your shard key is going to distribute your application’s requests. This can be difficult to picture, so try to work out some examples—or, even better, try it out on a backup dataset with sample traffic. This section has lots of diagrams and explanations, but there is no substitute for trying it on your own data.

For each collection that you’re planning to shard, start by answering the following questions:

  • How many shards are you planning to grow to? A three-shard cluster has a great deal more flexibility than a thousand-shard cluster. As a cluster gets larger, you should not plan to fire off queries that can hit all shards, so almost all queries must include the shard key.

  • Are you sharding to decrease read or write latency? (Latency refers to how long something takes; e.g., a write takes 20 ms, but you need it to take 10 ms.) Decreasing write latency usually involves sending requests to geographically closer or more powerful machines.

  • Are you sharding to increase read or write throughput? (Throughput refers to how many requests the cluster can handle at the same time; e.g., the cluster can do 1,000 writes in 20 ms, but you need it to do 5,000 writes in 20 ms.) Increasing throughput usually involves adding more parallelization and making sure that requests are distributed evenly across the cluster.

  • Are you sharding to increase system resources (e.g., give MongoDB more RAM per GB of data)? If so, you want to keep the working set size as small as possible.

Use these answers to evaluate the following shard key descriptions and decide whether the shard key you’re considering would work well in your situation. Does it give you the targeted queries that you need? Does it change the throughput or latency of your system in the ways you need? If you need a compact working set, does it provide that?

Picturing Distributions

The most common ways people choose to split their data are via ascending, random, and location-based keys. There are other types of keys that could be used, but most use cases fall into one of these categories. The different types of distributions are discussed in the following sections.

Ascending Shard Keys

Ascending shard keys are generally something like a "date" field or ObjectId—anything that steadily increases over time. An autoincrementing primary key is another example of an ascending field, albeit one that doesn’t show up in MongoDB much (unless you’re importing from another database).

Suppose that we shard on an ascending field, like "_id" on a collection using ObjectIds. If we shard on "_id", then the data will be split into chunks of "_id" ranges, as in Figure 16-1. These chunks will be distributed across our sharded cluster of, let’s say, three shards, as shown in Figure 16-2.

Figure 16-1. The collection is split into ranges of ObjectIds; each range is a chunk
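
For instance, setting this up might look like the following (a minimal sketch; test.events is a hypothetical collection whose documents use auto-generated ObjectId "_id" values):

> sh.enableSharding("test")
> sh.shardCollection("test.events", {"_id" : 1})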

Suppose we create a new document. Which chunk will it be in? The answer is the chunk with the range ObjectId("5112fae0b4a4b396ff9d0ee5") through $maxKey. This is called the max chunk, as it is the chunk containing $maxKey.

If we insert another document, it will also be in the max chunk. In fact, every subsequent insert will be into the max chunk! Every insert’s "_id" field will be closer to infinity than the previous one (because ObjectIds are always ascending), so they will all go into the max chunk.

Figure 16-2. Chunks are distributed across shards in a random order

This has a couple of interesting (and often undesirable) properties. First, all of your writes will be routed to one shard (shard0002, in this case). This chunk will be the only one growing and splitting, as it is the only one that receives inserts. As you insert data, new chunks will “fall off” of this chunk, as shown in Figure 16-3.

Figure 16-3. The max chunk continues growing and being split into multiple chunks

This pattern often makes it more difficult for MongoDB to keep chunks evenly balanced because all the chunks are being created by one shard. Therefore, MongoDB must constantly move chunks to other shards instead of correcting the small imbalances that might occur in more evenly distributed systems.

Note

In MongoDB 4.2, moving the autosplit functionality to the shard primary mongod added a top-chunk optimization to address the ascending shard key pattern. The balancer decides on which other shard to place the top chunk, which helps avoid a situation in which all new chunks are created on just one shard.

Randomly Distributed Shard Keys

At the other end of the spectrum are randomly distributed shard keys. Randomly distributed keys could be usernames, email addresses, UUIDs, MD5 hashes, or any other key that has no identifiable pattern in your dataset.

Suppose the shard key is a random number between 0 and 1. We’ll end up with a random distribution of chunks on the various shards, as shown in Figure 16-4.

Figure 16-4. As in the previous section, chunks are distributed randomly around the cluster

As more data is inserted, the data’s random nature means that inserts should hit every chunk fairly evenly. You can prove this to yourself by inserting 10,000 documents and seeing where they end up:

> var servers = {}
> var findShard = function (id) {
...     // tally which shard served the document, using explain()'s per-shard results
...     var explain = db.random.find({_id:id}).explain();
...     for (var i in explain.shards) {
...         var server = explain.shards[i][0];
...         if (server.n == 1) {
...             if (server.server in servers) {
...                 servers[server.server]++;
...             } else {
...                 servers[server.server] = 1;
...             }
...         }
...     }
... }
> for (var i = 0; i < 10000; i++) {
...     var id = ObjectId();
...     db.random.insert({"_id" : id, "x" : Math.random()});
...     findShard(id);
... }
> servers
{
    "spock:30001" : 2942,
    "spock:30002" : 4332,
    "spock:30000" : 2726
}

As writes are randomly distributed, the shards should grow at roughly the same rate, limiting the number of migrations that need to occur.

The only downside to randomly distributed shard keys is that MongoDB isn’t efficient at randomly accessing data beyond the size of RAM. However, if you have the capacity or don’t mind the performance hit, random keys nicely distribute load across your cluster.

Location-Based Shard Keys

Location-based shard keys may be things like a user’s IP address, latitude and longitude, or address. They’re not necessarily related to a physical location field: the “location” might be a more abstract way that data should be grouped together. In any case, a location-based key is a key where documents with some similarity fall into a range based on this field. This can be handy both for putting data close to its users and for keeping related data together on disk. It may also be a legal requirement for compliance with the GDPR or similar data privacy legislation. MongoDB manages this with zoned sharding.

Note

In MongoDB 4.0.3+, you can define zones and zone ranges before sharding a collection. This creates chunks for the zone ranges as well as for the shard key values and performs an initial distribution of those chunks, which greatly reduces the complexity of zoned sharding setup.

For example, suppose we have a collection of documents that are sharded on IP address. Documents will be organized into chunks based on their IPs and randomly spread across the cluster, as shown in Figure 16-5.

Figure 16-5. A sample distribution of chunks in the IP address collection

If we wanted certain chunk ranges to be attached to certain shards, we could zone these shards and then assign chunk ranges to each zone. In this example, suppose that we wanted to keep certain IP blocks on certain shards: say, 56.*.*.* (the United States Postal Service’s IP block) on shard0000 and 17.*.*.* (Apple’s IP block) on either shard0000 or shard0002. We do not care where the other IPs live. We could request that the balancer do this by setting up zones:

> sh.addShardToZone("shard0000", "USPS")
> sh.addShardToZone("shard0000", "Apple")
> sh.addShardToZone("shard0002", "Apple")

Next, we create the rules:

> sh.updateZoneKeyRange("test.ips", {"ip" : "056.000.000.000"}, 
... {"ip" : "057.000.000.000"}, "USPS")

This attaches all IPs greater than or equal to 56.0.0.0 and less than 57.0.0.0 to the shard zoned as "USPS". Next, we add a rule for Apple:

> sh.updateZoneKeyRange("test.ips", {"ip" : "017.000.000.000"}, 
... {"ip" : "018.000.000.000"}, "Apple")

When the balancer moves chunks, it will attempt to move chunks with those ranges to those shards. Note that this process is not immediate. Chunks that were not covered by a zone key range will be moved around normally. The balancer will continue attempting to distribute chunks evenly among shards.
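
You can check how these zones and ranges were recorded by inspecting the config database (a quick sketch; sh.status() also displays the same information):

> use config
> db.shards.find({}, {"_id" : 1, "tags" : 1})
> db.tags.find({"ns" : "test.ips"})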

Shard Key Strategies

This section presents a number of shard key options for various types of applications.

Hashed Shard Key

For loading data as fast as possible, hashed shard keys are the best option. A hashed shard key can make any field randomly distributed, so it is a good choice if you’re going to be using an ascending key in a lot of queries but want writes to be randomly distributed.

The trade-off is that you can never do a targeted range query with a hashed shard key. If you will not be doing range queries, though, hashed shard keys are a good option.

To create a hashed shard key, first create a hashed index:

> db.users.createIndex({"username" : "hashed"})

Next, shard the collection with:

> sh.shardCollection("app.users", {"username" : "hashed"})
{ "collectionsharded" : "app.users", "ok" : 1 }

If you create a hashed shard key on a nonexistent collection, shardCollection behaves interestingly: it assumes that you want evenly distributed chunks, so it immediately creates a bunch of empty chunks and distributes them around your cluster. For example, suppose our cluster looked like this before creating the hashed shard key:

> sh.status()
--- Sharding Status --- 
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
        {  "_id" : "shard0000",  "host" : "localhost:30000" }
        {  "_id" : "shard0001",  "host" : "localhost:30001" }
        {  "_id" : "shard0002",  "host" : "localhost:30002" }
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
        {  "_id" : "test",  "partitioned" : true,  "primary" : "shard0001" }

Immediately after shardCollection returns there are two chunks on each shard, evenly distributing the key space across the cluster:

> sh.status()
--- Sharding Status --- 
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
    {  "_id" : "shard0000",  "host" : "localhost:30000" }
    {  "_id" : "shard0001",  "host" : "localhost:30001" }
    {  "_id" : "shard0002",  "host" : "localhost:30002" }
  databases:
    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
    {  "_id" : "test",  "partitioned" : true,  "primary" : "shard0001" }
        test.foo
            shard key: { "username" : "hashed" }
            chunks:
                shard0000       2
                shard0001       2
                shard0002       2
            { "username" : { "$MinKey" : true } } 
                -->> { "username" : NumberLong("-6148914691236517204") } 
                on : shard0000 { "t" : 3000, "i" : 2 } 
            { "username" : NumberLong("-6148914691236517204") } 
                -->> { "username" : NumberLong("-3074457345618258602") } 
                on : shard0000 { "t" : 3000, "i" : 3 } 
            { "username" : NumberLong("-3074457345618258602") } 
                -->> { "username" : NumberLong(0) } 
                on : shard0001 { "t" : 3000, "i" : 4 } 
            { "username" : NumberLong(0) } 
                -->> { "username" : NumberLong("3074457345618258602") } 
                on : shard0001 { "t" : 3000, "i" : 5 } 
            { "username" : NumberLong("3074457345618258602") } 
                -->> { "username" : NumberLong("6148914691236517204") } 
                on : shard0002 { "t" : 3000, "i" : 6 } 
            { "username" : NumberLong("6148914691236517204") } 
                -->> { "username" : { "$MaxKey" : true } } 
                on : shard0002 { "t" : 3000, "i" : 7 }

Note that there are no documents in the collection yet, but when you start inserting them, writes should be evenly distributed across the shards from the get-go. Ordinarily, you would have to wait for chunks to grow, split, and move to start writing to other shards. With this automatic priming, you’ll immediately have chunk ranges on all shards.
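
If you want to control how many initial chunks are created, you can pass the numInitialChunks option when sharding an empty collection with a hashed key (a sketch; the third argument is the unique flag, which must be false for hashed shard keys):

> sh.shardCollection("app.users", {"username" : "hashed"}, 
... false, {"numInitialChunks" : 12})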

Note

There are some limitations on what your shard key can be if you’re using a hashed shard key. First, you cannot use the unique option. As with other shard keys, you cannot use array fields. Finally, be aware that floating-point values will be rounded to whole numbers before hashing, so 1 and 1.999999 will both be hashed to the same value.

Hashed Shard Keys for GridFS

Before attempting to shard GridFS collections, make sure that you understand how GridFS stores data (see Chapter 6 for an explanation).

In the following explanation, the term “chunks” is overloaded since GridFS splits files into chunks and sharding splits collections into chunks. Thus, the two types of chunks are referred to as “GridFS chunks” and “sharding chunks.”

GridFS collections are generally excellent candidates for sharding, as they contain massive amounts of file data. However, neither of the indexes that are automatically created on fs.chunks are particularly good shard keys: {"_id" : 1} is an ascending key and {"files_id" : 1, "n" : 1} picks up fs.files’s "_id" field, so it is also an ascending key.

However, if you create a hashed index on the "files_id" field, each file will be randomly distributed across the cluster, and a file will always be contained in a single chunk. This is the best of both worlds: writes will go to all shards evenly and reading a file’s data will only ever have to hit a single shard.

To set this up, you must create a new index on {"files_id" : "hashed"} (as of this writing, mongos cannot use a subset of the compound index as a shard key). Then shard the collection on this field:

> db.fs.chunks.createIndex({"files_id" : "hashed"})
> sh.shardCollection("test.fs.chunks", {"files_id" : "hashed"})
{ "collectionsharded" : "test.fs.chunks", "ok" : 1 }

As a side note, the fs.files collection may or may not need to be sharded, as it will be much smaller than fs.chunks. You can shard it if you would like, but it is not likely to be necessary.
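
If you did decide to shard fs.files too, a hashed key on its "_id" field would work the same way (a sketch, not something most deployments need):

> db.fs.files.createIndex({"_id" : "hashed"})
> sh.shardCollection("test.fs.files", {"_id" : "hashed"})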

The Firehose Strategy

If you have some servers that are more powerful than others, you might want to let them handle proportionally more load than your less-powerful servers. For example, suppose you have one shard that can handle 10 times the load of your other machines. Luckily, you have 10 other shards. You could force all inserts to go to the more powerful shard, and then allow the balancer to move older chunks to the other shards. This would give lower-latency writes.

To use this strategy, we have to pin the highest chunk to the more powerful shard. First, we zone this shard:

> sh.addShardToZone("<shard-name>", "10x")

Then we pin the current value of the ascending key through infinity to that shard, so all new writes go to it:

> sh.updateZoneKeyRange("<dbName.collName>", {"_id" : ObjectId()}, 
... {"_id" : MaxKey}, "10x")

Now all inserts will be routed to this last chunk, which will always live on the shard zoned "10x".

However, ranges from now through infinity will be trapped on this shard unless we modify the zone key range. To get around this, we could set up a cron job to update the key range once a day, like this:

> use config
> var zone = db.tags.findOne({"ns" : "<dbName.collName>", 
... "max" : {"<shardKey>" : MaxKey}})
> zone.min.<shardKey> = ObjectId()
> db.tags.save(zone)

Then all of the previous day’s chunks would be able to move to other shards.

Another downside of this strategy is that it requires some changes to scale. If your most powerful server can no longer handle the number of writes coming in, there is no trivial way to split the load between this server and another.

If you do not have a high-performance server to firehose into or you are not using zone sharding, do not use an ascending key as the shard key. If you do, all writes will go to a single shard.

Multi-Hotspot

Standalone mongod servers are most efficient when doing ascending writes. This conflicts with sharding, in that sharding is most efficient when writes are spread over the cluster. The technique described here basically creates multiple hotspots—optimally several on each shard—so that writes are evenly balanced across the cluster but, within a shard, ascending.

To accomplish this, we use a compound shard key. The first value in the compound key is a rough, random value with low-ish cardinality. You can picture each value in the first part of the shard key as a chunk, as shown in Figure 16-6. This will eventually work itself out as you insert more data, although it will probably never be divided up this neatly (right on the $minKey lines). However, if you insert enough data, you should eventually have approximately one chunk per random value. As you continue to insert data, you’ll end up with multiple chunks with the same random value, which brings us to the second part of the shard key.

Figure 16-6. A subset of the chunks: each chunk contains a single state and a range of “_id” values

The second part of the shard key is an ascending key. This means that within a chunk, values are always increasing, as shown in the sample documents in Figure 16-7. Thus, if you had one chunk per shard, you’d have the perfect setup: ascending writes on every shard, as shown in Figure 16-8. Of course, having n chunks with n hotspots spread across n shards isn’t very extensible: add a new shard and it won’t get any writes because there’s no hotspot chunk to put on it. Thus, you want a few hotspot chunks per shard (to give you room to grow), but not too many. Having a few hotspot chunks will keep the effectiveness of ascending writes, but having, say, a thousand hotspots on a shard will end up being equivalent to random writes.

Figure 16-7. A sample list of inserted documents (note that all “_id” values are increasing)
Figure 16-8. The inserted documents, split into chunks (note that, within each chunk, the “_id” values are increasing)

You can picture this setup as each chunk being a stack of ascending documents. There are multiple stacks on each shard, each ascending until the chunk is split. Once a chunk is split, only one of the new chunks will be a hotspot chunk: the other chunk will essentially be “dead” and never grow again. If the stacks are evenly distributed across the shards, writes will be evenly distributed.
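
A minimal sketch of setting up such a compound key (hypothetical names: "bucket" is a small random integer providing the low-cardinality random first component, and the auto-generated ObjectId "_id" provides the ascending second component):

> sh.shardCollection("app.events", {"bucket" : 1, "_id" : 1})
> // each insert picks one of, say, 16 buckets at random
> db.events.insert({"bucket" : Math.floor(Math.random() * 16), 
... "msg" : "user signed in"})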

Shard Key Rules and Guidelines

There are several practical restrictions to be aware of before choosing a shard key.

Determining which key to shard on and creating shard keys should be reminiscent of indexing because the two concepts are similar. In fact, often your shard key may just be the index you use most often (or some variation on it).

Shard Key Limitations

Shard keys cannot be arrays. sh.shardCollection() will fail if any key has an array value, and inserting an array into that field is not allowed.

As of MongoDB 4.2, a document’s shard key value can be modified after insertion, unless the shard key field is the immutable _id field. In versions of MongoDB prior to 4.2, it was not possible to modify a document’s shard key value at all.

Most special types of indexes cannot be used for shard keys. In particular, you cannot shard on a geospatial index. Using a hashed index for a shard key is allowed, as covered previously.

Shard Key Cardinality

Whether your shard key jumps around or increases steadily, it is important to choose a key with values that will vary. As with indexes, sharding performs better on high-cardinality fields. If, for example, you had a "logLevel" key that had only values "DEBUG", "WARN", or "ERROR", MongoDB wouldn’t be able to break up your data into more than three chunks (because there would be only three different values for the shard key). If you have a key with little variation and want to use it as a shard key anyway, you can do so by creating a compound shard key on that key and a key that varies more, like "logLevel" and "timestamp". It is important that the combination of keys has high cardinality.
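
For example, a compound key like this keeps the low-cardinality field first while the timestamp supplies the variation (a sketch with a hypothetical namespace):

> sh.shardCollection("logs.app", {"logLevel" : 1, "timestamp" : 1})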

Controlling Data Distribution

Sometimes, automatic data distribution will not fit your requirements. This section gives you some options beyond choosing a shard key and allowing MongoDB to do everything automatically.

As your cluster gets larger or busier, these solutions become less practical. However, for small clusters, you may want more control.

Using a Cluster for Multiple Databases and Collections

MongoDB evenly distributes collections across every shard in your cluster, which works well if you’re storing homogeneous data. However, if you have a log collection that is “lower value” than your other data, you might not want it taking up space on your more expensive servers. Or, if you have one powerful shard, you might want to use it for only a real-time collection and not allow other collections to use it. You can create separate clusters, but you can also give MongoDB specific directions about where you want it to put certain data.

To set this up, use the sh.addShardToZone() helper in the shell:

> sh.addShardToZone("shard0000", "high")
> // shard0001 - no zone
> // shard0002 - no zone
> // shard0003 - no zone
> sh.addShardToZone("shard0004", "low")
> sh.addShardToZone("shard0005", "low")

Then you can assign different collections to different shards. For instance, for your super-important real-time collection:

> sh.updateZoneKeyRange("super.important", {"<shardKey>" : MinKey}, 
... {"<shardKey>" : MaxKey}, "high")

This says, “for negative infinity to infinity for this collection, store it on shards tagged "high".” This means that no data from the super.important collection will be stored on any other server. Note that this does not affect how other collections are distributed: they will still be evenly distributed between this shard and the others.

You can perform a similar operation to keep the log collection on a low-quality server:

> sh.updateZoneKeyRange("some.logs", {"<shardKey>" : MinKey}, 
... {"<shardKey>" : MaxKey}, "low")

The log collection will now be split evenly between shard0004 and shard0005.

Assigning a zone key range to a collection does not affect it instantly. It is an instruction to the balancer stating that, when it runs, these are the viable targets to move the collection to. Thus, if the entire log collection is on shard0002 or evenly distributed among the shards, it will take a little while for all of the chunks to be migrated to shard0004 and shard0005.

As another example, perhaps you have a collection that you don’t want on the shard zoned "high", but you don’t care which other shard it goes on. You can zone all of the non-high-performance shards to create a new grouping. Shards can have as many zones as you need:

> sh.addShardToZone("shard0001", "whatever")
> sh.addShardToZone("shard0002", "whatever")
> sh.addShardToZone("shard0003", "whatever")
> sh.addShardToZone("shard0004", "whatever")
> sh.addShardToZone("shard0005", "whatever")

Now you can specify that you want this collection (call it normal.coll) distributed across these five shards:

> sh.updateZoneKeyRange("normal.coll", {"<shardKey>" : MinKey}, 
... {"<shardKey>" : MaxKey}, "whatever")
Tip

You cannot assign collections dynamically—i.e., you can’t say, “when a collection is created, randomly home it to a shard.” However, you could have a cron job that went through and did this for you.
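
For example, such a job might look something like this (a rough sketch with hypothetical names, zoning every not-yet-zoned collection in the normal database to "whatever", using the same <shardKey> placeholder as above):

> var zoned = db.getSiblingDB("config").tags.distinct("ns")
> db.getSiblingDB("normal").getCollectionNames().forEach(function(name) {
...     var ns = "normal." + name;
...     if (zoned.indexOf(ns) === -1) {
...         sh.updateZoneKeyRange(ns, {"<shardKey>" : MinKey}, 
...             {"<shardKey>" : MaxKey}, "whatever");
...     }
... })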

If you make a mistake or change your mind, you can remove a shard from a zone with sh.removeShardFromZone():

> sh.removeShardFromZone("shard0005", "whatever")

If you remove all shards from zones described by a zone key range (e.g., if you remove shard0000 from the zone "high"), the balancer won’t distribute the data anywhere because there aren’t any valid locations listed. All the data will still be readable and writable; it just won’t be able to migrate until you modify your zones or zone ranges.

To remove a key range from a zone, use sh.removeRangeFromZone(). The following is an example. The range specified must be an exact match to a range previously defined for the namespace some.logs and a given zone:

> sh.removeRangeFromZone("some.logs", {"<shardKey>" : MinKey}, 
... {"<shardKey>" : MaxKey})

Manual Sharding

Sometimes, for complex requirements or special situations, you may prefer to have complete control over which data is distributed where. You can turn off the balancer if you don’t want data to be automatically distributed and use the moveChunk command to manually distribute data.

To turn off the balancer, connect to a mongos (any mongos is fine) using the mongo shell and disable the balancer using the shell helper sh.stopBalancer():

> sh.stopBalancer()

If there is currently a migration in progress, this setting will not take effect until that migration has completed. However, once any in-flight migrations have finished, the balancer will stop moving data around. To verify that no migrations are in progress after disabling the balancer, issue the following in the mongo shell:

> use config
> while(sh.isBalancerRunning()) {
...  print("waiting...");
...  sleep(1000);
... }
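
When you are ready to turn the balancer back on, use the corresponding helper:

> sh.startBalancer()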

Once the balancer is off, you can move data around manually (if necessary). First, find out which chunks are where by looking at config.chunks:

> db.chunks.find()

Now, use the moveChunk command to migrate chunks to other shards. Specify the lower bound of the chunk to be migrated and give the name of the shard that you want to move the chunk to:

> sh.moveChunk(
... "test.manual.stuff", 
... {user_id: NumberLong("-1844674407370955160")}, 
... "test-rs1")

However, unless you are in an exceptional situation, you should use MongoDB’s automatic sharding instead of doing it manually. If an unexpected hotspot develops on a shard, you might end up with most of your data on that shard.

In particular, do not combine setting up unusual distributions manually with running the balancer. If the balancer detects an uneven number of chunks it will simply reshuffle all of your work to get the collection evenly balanced again. If you want uneven distribution of chunks, use the zone sharding technique discussed in “Using a Cluster for Multiple Databases and Collections”.
