Multikey indexes

The preceding sections explained how to index scalar (single) values. One of the advantages of MongoDB, however, is the ability to easily store vector values in the form of arrays.

In the relational world, storing arrays is generally frowned upon because it violates the normal forms. In a document-oriented database such as MongoDB, arrays are frequently part of our design, as we can easily store and query complex data structures.

Indexing arrays is achieved by using a multikey index. A multikey index can index arrays of scalar values as well as arrays of nested documents.

Creating a multikey index is the same as creating a regular index:

> db.books.createIndex({"tags":1})

Our new index will be a multikey index, allowing us to find documents by any of the tags stored in our array:

> db.books.find({tags:"new"})
{
"_id" : ObjectId("5969f4bc14ae9238fe76d7f2"),
"name" : "MongoDB Multikeys Cheatsheet",
"isbn" : "1002",
"available" : 1,
"meta_data" : {
"page_count" : 128,
"average_customer_review" : 3.9
},
"tags" : [
"mongodb",
"index",
"cheatsheet",
"new"
]
}
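To see why a lookup by any single tag can be served from the index, the following is a minimal conceptual sketch in plain JavaScript. It is not MongoDB's actual storage format; buildMultikeyIndex and lookup are hypothetical helper names, and documents that lack the field are simply skipped for brevity. The idea it illustrates is that an array field contributes one index entry per element:

```javascript
// Conceptual model only: not MongoDB's real index structure.
// buildMultikeyIndex and lookup are hypothetical helpers for illustration.
function buildMultikeyIndex(docs, field) {
  const index = new Map(); // key value -> array of document _ids
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined) continue; // simplification: skip docs without the field
    // An array contributes one entry per distinct element;
    // a scalar contributes exactly one entry.
    const keys = Array.isArray(value) ? [...new Set(value)] : [value];
    for (const key of keys) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id);
    }
  }
  return index;
}

function lookup(index, key) {
  return index.get(key) || [];
}

const books = [
  { _id: 1, name: "MongoDB Multikeys Cheatsheet", tags: ["mongodb", "index", "cheatsheet", "new"] },
  { _id: 2, name: "Mastering parallel arrays indexing", tags: ["A", "B"] },
];

const tagsIndex = buildMultikeyIndex(books, "tags");
console.log(lookup(tagsIndex, "new")); // [ 1 ]
console.log(lookup(tagsIndex, "B"));   // [ 2 ]
```

In this model, a single document appears under several keys, which is what distinguishes a multikey index from a regular one.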

We can also create compound multikey indexes, but at most one of the indexed fields in any given document can hold an array. Because MongoDB does not fix the type of each field, this restriction is enforced against the data: creating a compound index fails if an existing document has arrays in two or more of the indexed fields, and once the index exists, inserting such a document fails at insertion time.

For example, a compound index on tags and analytics_data will fail to be created if we have the following document in our database:

{
"_id" : ObjectId("5969f71314ae9238fe76d7f3"),
"name": "Mastering parallel arrays indexing",
"tags" : [
"A",
"B"
],
"analytics_data" : [
"1001",
"1002"
]
}

> db.books.createIndex({tags:1, analytics_data:1})
{
"ok" : 0,
"errmsg" : "cannot index parallel arrays [analytics_data] [tags]",
"code" : 171,
"codeName" : "CannotIndexParallelArrays"
}
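One way to picture this restriction is that a compound index over two array fields would need a key for every combination of elements. The sketch below is a conceptual model in plain JavaScript, not MongoDB's implementation; compoundKeys is a hypothetical helper that generates compound keys and refuses documents with more than one array among the indexed fields:

```javascript
// Conceptual model only; compoundKeys is a hypothetical helper, not a MongoDB API.
function compoundKeys(doc, fields) {
  const arrayFields = fields.filter((f) => Array.isArray(doc[f]));
  if (arrayFields.length > 1) {
    // Indexing two arrays would require one key per combination of
    // elements, so this model rejects the document outright.
    throw new Error("cannot index parallel arrays: " + arrayFields.join(", "));
  }
  // At most one field is an array here, so one key is emitted per element of it.
  let keys = [[]];
  for (const f of fields) {
    const values = Array.isArray(doc[f]) ? doc[f] : [doc[f]];
    keys = keys.flatMap((prefix) => values.map((v) => [...prefix, v]));
  }
  return keys;
}

const ok = { name: "MongoDB Multikeys Cheatsheet", tags: ["mongodb", "new"], isbn: "1002" };
console.log(compoundKeys(ok, ["tags", "isbn"]));
// [ [ 'mongodb', '1002' ], [ 'new', '1002' ] ]

const parallel = { tags: ["A", "B"], analytics_data: ["1001", "1002"] };
let rejected = false;
try {
  compoundKeys(parallel, ["tags", "analytics_data"]);
} catch (e) {
  rejected = true;
  console.log(e.message);
}
```

A compound index with one array field and one scalar field stays manageable, because the number of keys per document grows only with the length of the single array.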

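If we instead create the index first on an empty collection and then try to insert this document, the insert will fail with a similar parallel-arrays error.

For comparison, the explain() output of a query that uses an index on the scalar isbn field reports isMultiKey as false:

> db.books.find({isbn: "1001"}).hint("international_standard_book_number_index").explain()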
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "mongo_book.books",
"indexFilterSet" : false,
"parsedQuery" : {
"isbn" : {
"$eq" : "1001"
}
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"isbn" : 1
},
"indexName" : "international_standard_book_number_index",
"isMultiKey" : false,
"multiKeyPaths" : {
"isbn" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"isbn" : [
"[\"1001\", \"1001\"]"
]
}
}
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "PPMUMCPU0142",
"port" : 27017,
"version" : "3.4.7",
"gitVersion" : "cf38c1b8a0a8dca4a11737581beafef4fe120bcd"
},
"ok" : 1
}

Hashed indexes cannot be multikey indexes.

Another limitation we will likely run into when trying to fine-tune our database is that multikey indexes cannot entirely cover a query. Covering a query with an index means that we can get our result data entirely from the index, without accessing the documents themselves at all. This can dramatically improve performance, as indexes are most likely to be held in RAM.

Querying for an exact array match on a multikey index is a two-step process from the index's perspective.

In the first step, the index is used to look up the first element of the query array. In the second step, the matching documents are scanned sequentially and checked against the rest of the elements in the array. For example:

> db.books.find({tags: [ "mongodb", "index", "cheatsheet", "new" ] })

This will first search the multikey index on tags for all entries that have a mongodb value and then sequentially scan through the corresponding documents to find the ones whose tags array is exactly mongodb, index, cheatsheet, and new, in that order.
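The two steps of an exact array match can be sketched in plain JavaScript as follows. This is a conceptual model under simplifying assumptions, not the server's query executor; the sample documents, tagIndex, and exactArrayMatch are hypothetical names introduced for illustration:

```javascript
// Conceptual model only; names and sample data are hypothetical.
const docs = [
  { _id: 1, tags: ["mongodb", "index", "cheatsheet", "new"] },
  { _id: 2, tags: ["mongodb", "new"] },
  { _id: 3, tags: ["index", "cheatsheet"] },
];

// Tiny multikey-style index: tag -> _ids of documents containing it.
const tagIndex = new Map();
for (const doc of docs) {
  for (const tag of doc.tags) {
    if (!tagIndex.has(tag)) tagIndex.set(tag, []);
    tagIndex.get(tag).push(doc._id);
  }
}

function exactArrayMatch(wanted) {
  // Step 1: the index narrows the search to documents
  // containing the first element of the query array.
  const candidateIds = tagIndex.get(wanted[0]) || [];
  const candidates = docs.filter((d) => candidateIds.includes(d._id));
  // Step 2: each candidate is fetched and its whole array is
  // compared with the query array, element by element, in order.
  const matches = candidates.filter(
    (d) => d.tags.length === wanted.length && d.tags.every((t, i) => t === wanted[i])
  );
  return { examined: candidates.length, matchedIds: matches.map((d) => d._id) };
}

const result = exactArrayMatch(["mongodb", "index", "cheatsheet", "new"]);
console.log(result); // { examined: 2, matchedIds: [ 1 ] }
```

In this model, two documents are examined even though only one matches, which illustrates why the index alone cannot answer the query.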

A multikey index cannot be used as a shard key index. However, if the shard key is a prefix of a compound multikey index, that index can be used. More on this in Chapter 11, Sharding.