Creating and using an index

As you have learned, the previous query requires a table scan; that is, RethinkDB had to look through the name field in every document. Indexes are a great way to fix queries such as this because they organize data by a given field to let RethinkDB find it quickly.

Our test query uses the name field, so we will now create a simple index on that field.

The quickest way to create an index in RethinkDB is using the web interface.

First, select the people table from the Tables tab, and then click on the Create a new secondary index link. Enter the name of the field on which you wish to create the index. In our example, this is the name field:

Creating and using an index

When you click on Create, RethinkDB will begin creating the desired index. Depending on your system and how large your dataset is, creating an index may take a few seconds:

Creating and using an index

When the index is ready to use, you will see a screen similar to this one:

Creating and using an index

Congratulations! You've just created your very first index.

Now, it's time to use it in a query.

RethinkDB provides the getAll() command to query an index. This command accepts two parameters as input: the value to search and the name of the index to query.

Let's try repeating the original query using this index:

r.db('test').table('people').getAll("Alex", {index: "name"})

As you can see, we tell RethinkDB that we want to use an index by calling the getAll() command and passing the name we are looking for (Alex) along with the name of the index (name).

You may have noticed that the query result is exactly the same as before; however, what about the query time? If we run the query one more time using the query profiler, you will get an output similar to this:

Creating and using an index

As you can see, the query now took 1 millisecond. Previously, when we ran the query without using an index, the query time was 132 milliseconds. What this means is that we've just gained an increment of 130% in the performance by just creating and using an index!

Compound indexes

More often than not, queries are going to be much more complex than the one we wrote in the previous step.

Building on from the previous example, suppose we want to write a query that returns all the documents of people called Alex and are aged 22. As usual, we can use the filter command to write the query:

r.db('test').table('people').filter({name: "Alex", age: 22})

If you run this query from the Data Explorer, you will get an output similar to this:

Compound indexes

The output is correct as we got all people called Alex who are also 22 years old; however, how can we optimize this query? The name index that we created previously only indexes the name field, so it isn't going to be terribly helpful to us. Compound indexes to the rescue!

A compound index is an index on more than one field. Let's create a compound index that indexes both: the name and age fields. This time, we can create the index directly from the Data Explorer by running the following query:

r.db('test').table('people').indexCreate("name_and_age", [r.row("name"), r.row("age")])

This query uses the indexCreate command to create a compound index. This command accepts two parameters: a name for the index (name_and_age) and an array of fields to index (name and age).

Now that we've created our index, let's check if it is ready by running this query:

r.db("test").table("people").indexStatus("name_and_age")

This query calls the indexStatus command that returns the status of the requested index. The resulting JSON document will be similar to the following:

{
  "function": <binary, 125 bytes, "24 72 65 71 6c 5f..."> ,
  "geo": false ,
  "index": "name_and_age" ,
  "multi": false ,
  "outdated": false ,
  "ready": true
}

As you can see from the last field in the JSON document, our index is ready to use. Let's run the previous query once again, this time using the compound index that we just created.

The query will be as follows:

r.db("test").table("people").getAll(["Alex", 22], {index: "name_and_age"})

As you can see, we first pass the getAll command, the value for the name field, and then the value for the age field. Finally, we tell RethinkDB the name of the index that we want to query; in this case, it is name_and_age.

The result of the query will be exactly identical to the result of the previous query; however, if we profile this last query, we will notice a boost in the performance:

Compound indexes

As you can see from this image, the time it took the database to process the query is less than 1 millisecond. The exact same query, without the compound index, took 80 milliseconds on my system. This is a +80% improvement in the performance!

By now, you should have quite a good understanding of how indexing works, how to create an index, and how to use one. We will now focus on more advanced queries.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.116.51.134