Indexes are the way to make queries go vroom. Database indexes are similar to a book’s index: instead of looking through the whole book, the database takes a shortcut and just looks in the index, allowing it to do queries orders of magnitude faster. Once it finds the entry in the index, it can jump right to the location of the desired document.
Extending this metaphor to the breaking point, creating database indexes is like deciding how a book’s index will be organized. You have the advantage of knowing what kinds of queries you’ll be doing and thus what types of information the database will need to find quickly. If all of your queries involve the "date" key, for example, you probably need an index on "date" (at least). If you will be querying for usernames, you don’t need to index the "user_num" key, because you aren’t querying on it.
It can be tricky to figure out what the optimal index for your queries is, but it is well worth the effort. Queries that otherwise take minutes can be instantaneous with the proper indexes.
MongoDB’s indexes work almost identically to typical relational database indexes, so if you are familiar with those, you can skim this section for syntax specifics. We’ll go over some indexing basics, but keep in mind that it’s an extensive topic and most of the material out there on index optimization for MySQL/Oracle/SQLite will apply equally well to MongoDB.
Suppose that you are querying for a single key, such as the following:
> db.people.find({"username" : "mark"})
When only a single key is used in the query, that key can be indexed to improve the query’s speed. In this case, you would create an index on "username". To create the index, use the ensureIndex method:
> db.people.ensureIndex({"username" : 1})
An index needs to be created only once for a collection. If you try to create the same index again, nothing will happen.
An index on a key will make queries on that key fast. However, it may not make other queries fast, even if they contain the indexed key. For example, this wouldn’t be very performant with the previous index:
> db.people.find({"date" : date1}).sort({"date" : 1, "username" : 1})
The server has to “look through the whole book” to find the desired dates. This process is called a table scan, which is basically what you’d do if you were looking for information in a book without an index: you start at page 1 and read through the whole thing. In general, you want to avoid making the server do table scans, because it is very slow for large collections.
As a rule of thumb, you should create an index that contains all of the keys in your query. For instance, to optimize the previous query, you should have an index on "date" and "username":
> db.people.ensureIndex({"date" : 1, "username" : 1})
The document you pass to ensureIndex is of the same form as the document passed to the sort function: a set of keys with value 1 or -1, depending on the direction you want the index to go. If you have only a single key in the index, direction is irrelevant. A single-key index is analogous to a book’s index that is organized in alphabetical order: whether it goes from A–Z or Z–A, it’s going to be fairly obvious where to find entries starting with M.
If you have more than one key, you need to start thinking about index direction. For example, suppose we have a collection of users:
{ "_id" : ..., "username" : "smith", "age" : 48, "user_id" : 0 }
{ "_id" : ..., "username" : "smith", "age" : 30, "user_id" : 1 }
{ "_id" : ..., "username" : "john", "age" : 36, "user_id" : 2 }
{ "_id" : ..., "username" : "john", "age" : 18, "user_id" : 3 }
{ "_id" : ..., "username" : "joe", "age" : 36, "user_id" : 4 }
{ "_id" : ..., "username" : "john", "age" : 7, "user_id" : 5 }
{ "_id" : ..., "username" : "simon", "age" : 3, "user_id" : 6 }
{ "_id" : ..., "username" : "joe", "age" : 27, "user_id" : 7 }
{ "_id" : ..., "username" : "jacob", "age" : 17, "user_id" : 8 }
{ "_id" : ..., "username" : "sally", "age" : 52, "user_id" : 9 }
{ "_id" : ..., "username" : "simon", "age" : 59, "user_id" : 10 }
Let’s say we index them with {"username" : 1, "age" : -1}. MongoDB will organize the users as follows:
{ "_id" : ..., "username" : "jacob", "age" : 17, "user_id" : 8 }
{ "_id" : ..., "username" : "joe", "age" : 36, "user_id" : 4 }
{ "_id" : ..., "username" : "joe", "age" : 27, "user_id" : 7 }
{ "_id" : ..., "username" : "john", "age" : 36, "user_id" : 2 }
{ "_id" : ..., "username" : "john", "age" : 18, "user_id" : 3 }
{ "_id" : ..., "username" : "john", "age" : 7, "user_id" : 5 }
{ "_id" : ..., "username" : "sally", "age" : 52, "user_id" : 9 }
{ "_id" : ..., "username" : "simon", "age" : 59, "user_id" : 10 }
{ "_id" : ..., "username" : "simon", "age" : 3, "user_id" : 6 }
{ "_id" : ..., "username" : "smith", "age" : 48, "user_id" : 0 }
{ "_id" : ..., "username" : "smith", "age" : 30, "user_id" : 1 }
The usernames are in strictly increasing alphabetical order, and within each name group the ages are in decreasing order. This optimizes sorting by {"username" : 1, "age" : -1} but is less efficient at sorting by {"username" : 1, "age" : 1}. If we wanted to optimize {"username" : 1, "age" : 1}, we would create an index on {"username" : 1, "age" : 1} to organize ages in ascending order.
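The ordering a {"username" : 1, "age" : -1} index imposes can be reproduced outside the database. This is a plain JavaScript sketch, not MongoDB code: it simply sorts the sample users the way that index would store them.

```javascript
// Sample users from the collection above (only the relevant keys).
const users = [
  { username: "smith", age: 48, user_id: 0 },
  { username: "smith", age: 30, user_id: 1 },
  { username: "john", age: 36, user_id: 2 },
  { username: "john", age: 18, user_id: 3 },
  { username: "joe", age: 36, user_id: 4 },
  { username: "john", age: 7, user_id: 5 },
  { username: "simon", age: 3, user_id: 6 },
  { username: "joe", age: 27, user_id: 7 },
  { username: "jacob", age: 17, user_id: 8 },
  { username: "sally", age: 52, user_id: 9 },
  { username: "simon", age: 59, user_id: 10 },
];

// Order as {"username" : 1, "age" : -1} would: usernames ascending,
// and within each username, ages descending.
const indexed = [...users].sort((a, b) => {
  if (a.username !== b.username) {
    return a.username < b.username ? -1 : 1;
  }
  return b.age - a.age; // descending age
});

console.log(indexed.map((u) => u.user_id).join(",")); // → 8,4,7,2,3,5,9,10,6,0,1
```

The printed user_ids match the ordering of the index listing above.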
The index on username and age also makes queries on username fast. In general, if an index has N keys, it will make queries on any prefix of those keys fast. For instance, if we have an index that looks like {"a" : 1, "b" : 1, "c" : 1, ..., "z" : 1}, we effectively have an index on {"a" : 1}, {"a" : 1, "b" : 1}, {"a" : 1, "b" : 1, "c" : 1}, and so on. Queries that would use the indexes {"b" : 1}, {"a" : 1, "c" : 1}, and so on will not be optimized: only queries that can use a prefix of the index can take advantage of it.
The MongoDB query optimizer will reorder query terms to take advantage of indexes: if you query for {"x" : "foo", "y" : "bar"} and you have an index on {"y" : 1, "x" : 1}, MongoDB will figure it out.
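These two rules (a query can use any prefix of an index, and the optimizer reorders query terms) can be sketched as a small helper. This is plain JavaScript for illustration, not a MongoDB API, and canUseIndex is an invented name:

```javascript
// Hypothetical helper: can a query on the given keys use this compound
// index? The query's keys (in any order, since the optimizer reorders
// terms) must cover a prefix of the index's keys.
function canUseIndex(indexKeys, queryKeys) {
  const querySet = new Set(queryKeys);
  if (querySet.size !== queryKeys.length) return false; // no duplicate keys
  // The first queryKeys.length index keys must all appear in the query.
  for (let i = 0; i < queryKeys.length; i++) {
    if (!querySet.has(indexKeys[i])) return false;
  }
  return true;
}

const index = ["a", "b", "c"];
console.log(canUseIndex(index, ["a"]));      // → true:  prefix {a}
console.log(canUseIndex(index, ["b", "a"])); // → true:  reordered prefix {a, b}
console.log(canUseIndex(index, ["b"]));      // → false: not a prefix
console.log(canUseIndex(index, ["a", "c"])); // → false: skips "b"
```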
The disadvantage to creating an index is that it puts a little bit of overhead on every insert, update, and remove. This is because the database not only needs to do the operation but also needs to make a note of it in any indexes on the collection. Thus, the absolute minimum number of indexes should be created. There is a built-in maximum of 64 indexes per collection, which is more than almost any application should need.
Do not index every key. This will make inserts slow, take up lots of space, and probably not speed up your queries very much. Figure out what queries you are running and what the best indexes are for these queries, and make sure that the server is using the indexes you’ve created by using the explain and hint tools described in the next section.
Sometimes the most efficient solution is actually not to use an index. In general, if a query is returning a half or more of the collection, it will be more efficient for the database to just do a table scan instead of having to look up the index and then the value for almost every single document. Thus, for queries such as checking whether a key exists or determining whether a boolean value is true or false, it may actually be better to not use an index at all.
Suppose we have a collection of status messages from users. We want to query by user and date to pull up all of a user’s recent statuses. Using what we’ve learned so far, we might create an index that looks like the following:
> db.status.ensureIndex({user : 1, date : -1})
This will make the query for user and date efficient, but it is not actually the best index choice.
Imagine this as a book index again. We would have a list of documents sorted by user and then subsorted by date, so it would look something like the following:
User 123 on March 13, 2010
User 123 on March 12, 2010
User 123 on March 11, 2010
User 123 on March 5, 2010
User 123 on March 4, 2010
User 124 on March 12, 2010
User 124 on March 11, 2010
...
This looks fine at this scale, but imagine if the application has millions of users who have dozens of status updates per day. If the index entries for each user’s status messages take up a page’s worth of space on disk, then for every “latest statuses” query, the database will have to load a different page into memory. This will be very slow if the site becomes popular enough that not all of the index fits into memory.
If we flip the index order to {date : -1, user : 1}, the database can keep the last couple of days of the index in memory, swap less, and thus query for the latest statuses for any user much more quickly.
Thus, there are several questions to keep in mind when deciding what indexes to create:
What are the queries you are doing? Some of these keys will need to be indexed.
What is the correct direction for each key?
How is this going to scale? Is there a different ordering of keys that would keep more of the frequently used portions of the index in memory?
If you can answer these questions, you are ready to index your data.
Indexes can be created on keys in embedded documents in the same way that they are created on normal keys. For example, if we want to be able to search blog post comments by date, we can create an index on the "date" key in the array of embedded "comments" documents:
> db.blog.ensureIndex({"comments.date" : 1})
Indexes on keys in embedded documents are identical to those on top-level keys, and the two can be combined in compound indexes.
As your collection grows, you’ll need to create indexes for any large sorts your queries are doing. If you call sort on a key that is not indexed, MongoDB needs to pull all of that data into memory to sort it. Thus, there’s a limit on the amount you can sort without an index: you can’t sort a terabyte of data in memory. Once your collection is too big to sort in memory, MongoDB will just return an error for queries that try.
Indexing the sort allows MongoDB to pull the sorted data in order, allowing you to sort any amount of data without running out of memory.
Each index in a collection has a string name that uniquely identifies the index and is used by the server to delete or manipulate it. Index names are, by default, keyname1_dir1_keyname2_dir2_..._keynameN_dirN, where keynameX is the index’s key and dirX is the index’s direction (1 or -1). This can get unwieldy if indexes contain more than a couple of keys, so you can specify your own name as one of the options to ensureIndex:
> db.foo.ensureIndex({"a" : 1, "b" : 1, "c" : 1, ..., "z" : 1}, {"name" : "alphabet"})
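The default naming scheme described above can be sketched in plain JavaScript. This mirrors the convention, not the server’s code, and defaultIndexName is an invented name:

```javascript
// Build a default index name from [keyname, direction] pairs,
// joining each key and its direction with underscores.
function defaultIndexName(keys) {
  return keys.map(([name, dir]) => name + "_" + dir).join("_");
}

console.log(defaultIndexName([["username", 1]]));               // → username_1
console.log(defaultIndexName([["date", 1], ["username", -1]])); // → date_1_username_-1
```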
There is a limit to the number of characters in an index name, so complex indexes may need custom names to be created. A call to getLastError will show whether the index creation succeeded or why it didn’t.
Unique indexes guarantee that, for a given key, every document in the collection will have a unique value. For example, if you want to make sure no two documents can have the same value in the "username" key, you can create a unique index:
> db.people.ensureIndex({"username" : 1}, {"unique" : true})
Keep in mind that insert, by default, does not check whether the document was actually inserted. Therefore, you may want to use safe inserts if you are inserting documents that might have a duplicate value for a unique key. This way, you will get a duplicate key error when you attempt to insert such a document.
A unique index that you are probably already familiar with is the index on "_id", which is created whenever you create a normal collection. This is a normal unique index, aside from the fact that it cannot be deleted.
If a key does not exist, the index stores its value as null. Therefore, if you create a unique index on a key and try to insert more than one document that is missing the indexed key, the inserts will fail because you already have a document with a value of null.
When building unique indexes for an existing collection, some values may be duplicates. If there are any duplicates, the index building will fail. In some cases, you may just want to drop all of the documents with duplicate values. The dropDups option will save the first document found and remove any subsequent documents with duplicate values:
> db.people.ensureIndex({"username" : 1}, {"unique" : true, "dropDups" : true})
This is a bit of a drastic option; it might be better to create a script to preprocess the data if it is important.
You can also create a compound unique index. If you do this, individual keys can have the same values, but the combination of values for all keys must be unique.
GridFS, the standard method for storing large files in MongoDB (see Chapter 7), uses a compound unique index. The collection that holds the file content has a unique index on {files_id : 1, n : 1}, which allows documents that look like (in part) the following:
{files_id : ObjectId("4b23c3ca7525f35f94b60a2d"), n : 1}
{files_id : ObjectId("4b23c3ca7525f35f94b60a2d"), n : 2}
{files_id : ObjectId("4b23c3ca7525f35f94b60a2d"), n : 3}
{files_id : ObjectId("4b23c3ca7525f35f94b60a2d"), n : 4}
Note that all of the values for "files_id" are the same, but "n" is different. If you attempted to insert {files_id : ObjectId("4b23c3ca7525f35f94b60a2d"), n : 1} again, the database would complain about the duplicate key.
explain is an incredibly handy tool that will give you lots of information about your queries. You can run it on any query by tacking it onto a cursor. Unlike most cursor methods, explain returns a document, not the cursor itself:
> db.foo.find().explain()
explain will return information about the indexes used for the query (if any) and stats about timing and the number of documents scanned.
Let’s take an example. The index {"username" : 1} works well to speed up a simple key/value pair lookup, but many queries are more complicated. For example, suppose you are querying and sorting the following:
> db.people.find({"age" : 18}).sort({"username" : 1})
You may be unsure if the database is using the index you created or how effective it is. If you run explain on the query, it will return the index currently being used for the query, how long the query took, and how many documents the database needed to scan to find the results.
For a very simple query ({}) on a database with no indexes (other than the index on "_id") and 64 documents, the output of explain looks like this:
> db.people.find().explain()
{
    "cursor" : "BasicCursor",
    "indexBounds" : [ ],
    "nscanned" : 64,
    "nscannedObjects" : 64,
    "n" : 64,
    "millis" : 0,
    "allPlans" : [
        {
            "cursor" : "BasicCursor",
            "indexBounds" : [ ]
        }
    ]
}
The important parts of this result are as follows:
"cursor" : "BasicCursor"
This means that the query did not use an index (unsurprisingly, because there was no query criteria). We’ll see what this value looks like for an indexed query in a moment.
"nscanned" : 64
This is the number of documents that the database looked through. You want to make sure this is as close to the number returned as possible.
"n" : 64
This is the number of documents returned. We’re doing pretty well here, because the number of documents scanned exactly matches the number returned. Of course, given that we’re returning the entire collection, it would be difficult to do otherwise.
"millis" : 0
The number of milliseconds it took the database to execute the query. 0 is a good time to shoot for.
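A quick way to read these numbers is to compare nscanned to n. The sketch below is a hypothetical helper, not part of MongoDB, that summarizes an explain() document along those lines:

```javascript
// Summarize an explain() result: how many documents were scanned per
// document returned, and whether an index was used at all.
function scanEfficiency(explainResult) {
  return {
    scannedPerReturned:
      explainResult.n === 0 ? Infinity : explainResult.nscanned / explainResult.n,
    usedIndex: explainResult.cursor !== "BasicCursor",
  };
}

// The table-scan output above: 64 scanned, 64 returned, no index.
const report = scanEfficiency({ cursor: "BasicCursor", nscanned: 64, n: 64 });
console.log(report.scannedPerReturned, report.usedIndex); // → 1 false
```

A ratio close to 1 means the query touched little more than what it returned; a large ratio suggests a better index is needed.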
Now, let’s suppose we have an index on the "age" key and we’re querying for users in their 20s. If we run explain on this query, we’ll see the following:
> db.c.find({age : {$gt : 20, $lt : 30}}).explain()
{
    "cursor" : "BtreeCursor age_1",
    "indexBounds" : [
        [
            { "age" : 20 },
            { "age" : 30 }
        ]
    ],
    "nscanned" : 14,
    "nscannedObjects" : 12,
    "n" : 12,
    "millis" : 1,
    "allPlans" : [
        {
            "cursor" : "BtreeCursor age_1",
            "indexBounds" : [
                [
                    { "age" : 20 },
                    { "age" : 30 }
                ]
            ]
        }
    ]
}
There are some differences here, because this query uses an index. Thus, some of explain’s outputted keys have changed:
"cursor" : "BtreeCursor age_1"
This query is not using a BasicCursor, like the first query was. Indexes are stored in data structures called B-trees, so, when a query uses an index, it uses a special type of cursor called a BtreeCursor. This value also identifies the name of the index being used: age_1. Using this name, we could query the system.indexes collection to find out more about this index (e.g., whether it is unique or what keys it includes):
> db.system.indexes.find({"ns" : "test.c", "name" : "age_1"})
{
    "_id" : ObjectId("4c0d211478b4eaaf7fb28565"),
    "ns" : "test.c",
    "key" : { "age" : 1 },
    "name" : "age_1"
}
"allPlans" : [ ... ]
This key lists all of the plans MongoDB considered for the query. The choice here was fairly obvious, because we had an index on "age" and we were querying for "age". If we had multiple overlapping indexes and a more complex query, "allPlans" would contain information about all of the possible plans that could have been used.
Let’s take a slightly more complicated example: suppose you have an index on {"username" : 1, "age" : 1} and an index on {"age" : 1, "username" : 1}. What happens if you query for username and age? Well, it depends on the query:
> db.c.find({age : {$gt : 10}, username : "sally"}).explain()
{
    "cursor" : "BtreeCursor username_1_age_1",
    "indexBounds" : [
        [
            { "username" : "sally", "age" : 10 },
            { "username" : "sally", "age" : 1.7976931348623157e+308 }
        ]
    ],
    "nscanned" : 13,
    "nscannedObjects" : 13,
    "n" : 13,
    "millis" : 5,
    "allPlans" : [
        {
            "cursor" : "BtreeCursor username_1_age_1",
            "indexBounds" : [
                [
                    { "username" : "sally", "age" : 10 },
                    { "username" : "sally", "age" : 1.7976931348623157e+308 }
                ]
            ]
        }
    ],
    "oldPlan" : {
        "cursor" : "BtreeCursor username_1_age_1",
        "indexBounds" : [
            [
                { "username" : "sally", "age" : 10 },
                { "username" : "sally", "age" : 1.7976931348623157e+308 }
            ]
        ]
    }
}
We are querying for an exact match on "username" and a range of values for "age", so the database chooses to use the {"username" : 1, "age" : 1} index, reversing the terms of the query. If, on the other hand, we query for an exact age and a range of names, MongoDB will use the other index:
> db.c.find({"age" : 14, "username" : /.*/}).explain()
{
    "cursor" : "BtreeCursor age_1_username_1 multi",
    "indexBounds" : [
        [
            { "age" : 14, "username" : "" },
            { "age" : 14, "username" : { } }
        ],
        [
            { "age" : 14, "username" : /.*/ },
            { "age" : 14, "username" : /.*/ }
        ]
    ],
    "nscanned" : 2,
    "nscannedObjects" : 2,
    "n" : 2,
    "millis" : 2,
    "allPlans" : [
        {
            "cursor" : "BtreeCursor age_1_username_1 multi",
            "indexBounds" : [
                [
                    { "age" : 14, "username" : "" },
                    { "age" : 14, "username" : { } }
                ],
                [
                    { "age" : 14, "username" : /.*/ },
                    { "age" : 14, "username" : /.*/ }
                ]
            ]
        }
    ]
}
If you find that MongoDB is using different indexes than you want it to for a query, you can force it to use a certain index by using hint. For instance, if you want to make sure MongoDB uses the {"username" : 1, "age" : 1} index on the previous query, you could say the following:
> db.c.find({"age" : 14, "username" : /.*/}).hint({"username" : 1, "age" : 1})
Hinting is usually unnecessary. MongoDB has a query optimizer and is very clever about choosing which index to use. When you first do a query, the query optimizer tries out a number of query plans concurrently. The first one to finish will be used, and the rest of the query executions are terminated. That query plan will be remembered for future queries on the same keys. The query optimizer periodically retries other plans, in case you’ve added new data and the previously chosen plan is no longer best. The only part you should need to worry about is giving the query optimizer useful indexes to choose from.
Metainformation about indexes is stored in the system.indexes collection of each database. This is a reserved collection, so you cannot insert documents into it or remove documents from it. You can manipulate its documents only through ensureIndex and the dropIndexes database command.
The system.indexes collection has detailed information about each index, but the system.namespaces collection also lists their names. If you look at this collection, you can see that there are at least two documents for each collection: one for the collection itself and one for each index it contains. For a collection with just the standard "_id" index, its system.namespaces entries would look like this:
{ "name" : "test.foo" }
{ "name" : "test.foo.$_id_" }
If we add a compound index on the name and age keys, a new document is added to system.namespaces with the following form:
{ "name" : "test.foo.$name_1_age_1" }
In Chapter 2, we mentioned that collection names are limited to 121 bytes in length. This somewhat strange limit is because the "_id" index’s namespace needs 6 extra bytes (".$_id_"), bringing the namespace length for that index up to a more sensical 127 bytes.
It is important to remember that the size of the collection name plus the index name cannot exceed 127 bytes. If you approach the maximum namespace size or have large index names, you may have to use custom index names to avoid creating namespaces that are too long. It’s generally easier just to keep database, collection, and key names to reasonable lengths.
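That check can be sketched in plain JavaScript; the 127-byte figure comes from the text, namespaceTooLong is an invented name, and length equals bytes here only for ASCII names:

```javascript
// An index's full namespace is "<db>.<collection>.$<indexname>";
// per the limit above, it must fit in 127 bytes.
function namespaceTooLong(collectionNamespace, indexName) {
  return (collectionNamespace + ".$" + indexName).length > 127;
}

console.log(namespaceTooLong("test.foo", "_id_")); // → false ("test.foo.$_id_" is 14 bytes)
console.log(namespaceTooLong("test." + "x".repeat(117), "_id_")); // → true (128 bytes)
```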
As you and your application grow old together, you may find that your data or queries have changed and that the indexes that used to work well no longer do. You can add new indexes to existing collections at any time with ensureIndex:
> db.people.ensureIndex({"username" : 1}, {"background" : true})
Building indexes is time-consuming and resource-intensive. Using the {"background" : true} option builds the index in the background, while handling incoming requests. If you do not include the background option, the database will block all other requests while the index is being built.
Blocking lets index creation go faster but means your application will be unresponsive during the build. Even building an index in the background can take a toll on normal operations, so it is best to do it during an “off” time; building indexes in the background will still add extra load to your database, but it will not bring it grinding to a halt.
Creating indexes on existing documents is slightly faster than creating the index first and then inserting all of the documents. Of course, creating an index beforehand is an option only if you’re populating a new collection from scratch.
If an index is no longer necessary, you can remove it with the dropIndexes command and the index name. Often, you may have to look at the system.indexes collection to figure out what the index name is, because even the autogenerated names vary somewhat from driver to driver:
> db.runCommand({"dropIndexes" : "foo", "index" : "alphabet"})
You can drop all indexes on a collection by passing in * as the value for the index key:
> db.runCommand({"dropIndexes" : "foo", "index" : "*"})
Another way to delete all indexes is to drop the collection. This will also delete the index on _id (and all of the documents in the collection). Removing all of the documents in a collection (with remove) does not affect the indexes; they will be repopulated when new documents are inserted.
There is another type of query that is becoming increasingly common (especially with the emergence of mobile devices): finding the nearest N things to a current location. MongoDB provides a special type of index for coordinate plane queries, called a geospatial index.
Suppose we wanted to find the nearest coffee shop to where we are now, given a latitude and longitude. We need to create a special index to do this type of query efficiently, because it needs to search in two dimensions. A geospatial index can be created using the ensureIndex function, but by passing "2d" as a value instead of 1 or -1:
> db.map.ensureIndex({"gps" : "2d"})
The "gps" key must be some type of pair value, that is, a two-element array or an embedded document with two keys. These would all be valid:
{ "gps" : [ 0, 100 ] }
{ "gps" : { "x" : -30, "y" : 30 } }
{ "gps" : { "latitude" : -180, "longitude" : 180 } }
The keys can be anything; for example, {"gps" : {"foo" : 0, "bar" : 1}} is valid.
By default, geospatial indexing assumes that your values are going to range from -180 to 180 (which is convenient for latitude and longitude). If you are using it for other values, you can specify what the minimum and maximum values will be as options to ensureIndex:
> db.star.trek.ensureIndex({"light-years" : "2d"}, {"min" : -1000, "max" : 1000})
This will create a spatial index calibrated for a 2,000 × 2,000 light-year square.
Geospatial queries can be performed in two ways: as a normal query (using find) or as a database command. The find approach is similar to performing a nongeospatial query, except the conditional "$near" is used. It takes an array of the two target values:
> db.map.find({"gps" : {"$near" : [40, -73]}})
This finds all of the documents in the map collection, in order by distance from the point (40, -73). A default limit of 100 documents is applied if no limit is specified. If you don’t need that many results, you should set a limit to conserve server resources. For example, the following code returns the 10 nearest documents to (40, -73):
> db.map.find({"gps" : {"$near" : [40, -73]}}).limit(10)
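On a flat plane, "$near" with a limit amounts to sorting by distance from the target and truncating. A plain JavaScript sketch, with sample points invented for illustration:

```javascript
// Order points by flat-plane distance from a target and keep the closest.
const target = [40, -73];
const shops = [
  { name: "a", gps: [41, -73] }, // distance 1
  { name: "b", gps: [45, -70] }, // distance ~5.8
  { name: "c", gps: [40, -73] }, // distance 0
];

const dist = (p) => Math.hypot(p.gps[0] - target[0], p.gps[1] - target[1]);
const nearest = [...shops].sort((x, y) => dist(x) - dist(y)).slice(0, 2);

console.log(nearest.map((p) => p.name).join(",")); // → c,a
```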
This same type of query can be performed using the geoNear command:
> db.runCommand({geoNear : "map", near : [40, -73], num : 10});
geoNear also returns the distance of each result document from the query point. The distance is in whatever units you inserted the data in; if you inserted degrees of longitude and latitude, the distance is in degrees. find with "$near" doesn’t give you the distance for each point but must be used if you have more than 4MB of results.
MongoDB also allows you to find all of the documents within a shape, as well as near a point. To find points in a shape, we use the "$within" conditional instead of the "$near" conditional. "$within" can take an increasing number of shapes; check the online documentation for geospatial indexing to see the most up-to-date list of what’s supported (http://www.mongodb.org/display/DOCS/Geospatial+Indexing). As of this writing, there are two options: you can query for all points within a rectangle or within a circle. To use a rectangle, use the "$box" option:
> db.map.find({"gps" : {"$within" : {"$box" : [[10, 20], [15, 30]]}}})
"$box" takes a two-element array: the first element specifies the coordinates of the lower-left corner, and the second specifies the coordinates of the upper-right corner.
You can also find all points within a circle with "$center", which takes an array with the center point and then a radius:
> db.map.find({"gps" : {"$within" : {"$center" : [[12, 25], 5]}}})
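The membership tests behind these two shapes are simple on a flat plane. These are illustrative JavaScript sketches, not server code:

```javascript
// Flat-plane membership tests for the two "$within" shapes:
// a box given as [lowerLeft, upperRight], and a circle given as
// a center point plus a radius.
function inBox([x, y], [lowerLeft, upperRight]) {
  return x >= lowerLeft[0] && x <= upperRight[0] &&
         y >= lowerLeft[1] && y <= upperRight[1];
}

function inCircle([x, y], [cx, cy], radius) {
  return Math.hypot(x - cx, y - cy) <= radius;
}

console.log(inBox([12, 25], [[10, 20], [15, 30]])); // → true
console.log(inCircle([16, 25], [12, 25], 5));       // → true (distance 4)
console.log(inCircle([18, 25], [12, 25], 5));       // → false (distance 6)
```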
Applications often need to search for more complex criteria than just a location. For example, a user might want to find all coffee shops or pizza parlors near where they are. To facilitate this type of query, you can combine geospatial indexes with normal indexes. In this situation, for instance, we might want to query on both the "location" key and a "desc" key, so we’d create a compound index:
> db.map.ensureIndex({"location" : "2d", "desc" : 1})
Then we can quickly find the nearest coffee shop:
> db.map.find({"location" : {"$near" : [-70, 30]}, "desc" : "coffeeshop"}).limit(1)
{ "_id" : ObjectId("4c0d1348928a815a720a0000"), "name" : "Mud", "location" : [x, y], "desc" : ["coffee", "coffeeshop", "muffins", "espresso"] }
Note that creating an array of keywords is a good way to perform user-defined searches.
MongoDB’s geospatial indexes assume that whatever you’re indexing is a flat plane. This means that results aren’t perfect for spherical surfaces, like the earth’s, especially near the poles. The problem is that the distance between lines of longitude in the Yukon is much shorter than the distance between them at the equator. There are various projections that can be used to convert the earth to a 2D plane, with varying levels of accuracy and corresponding levels of complexity.
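The distortion is easy to quantify: a degree of longitude shrinks roughly in proportion to the cosine of the latitude, which a flat-plane index ignores. A plain JavaScript sketch with approximate figures:

```javascript
// A degree of longitude spans roughly cos(latitude) times its length
// at the equator. The 111.32 km figure is approximate.
function longitudeDegreeKm(latitudeDegrees) {
  const kmPerDegreeAtEquator = 111.32;
  return kmPerDegreeAtEquator * Math.cos((latitudeDegrees * Math.PI) / 180);
}

console.log(longitudeDegreeKm(0).toFixed(1));  // → 111.3 (equator)
console.log(longitudeDegreeKm(60).toFixed(1)); // → 55.7 (Yukon-like latitude)
```

At 60° north, a degree of longitude covers only about half the distance it does at the equator, so treating degrees as uniform units overstates east–west distances there.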