Executing nested aggregations

The nested aggregation lets you run analytics on nested documents. Nested objects are very common when working with complex document structures.

Getting ready

You need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command line, you need to install curl for your operating system.

To correctly execute the following command, you need an index populated with the script (chapter_08/populate_aggregations.sh) available in the online code.

How to do it...

To execute nested aggregations, we will perform the following steps:

  1. We must index documents with a nested type, as discussed in the Managing nested objects recipe in Chapter 3, Managing Mappings:
            {
                "product" : {
                    "properties" : {
                        "resellers" : {
                            "type" : "nested",
                            "properties" : {
                                "username" : { "type" : "string", "index" : "not_analyzed" },
                                "price" : { "type" : "double" }
                            }
                        },
                        "tags" : { "type" : "string", "index" : "not_analyzed" }
                    }
                }
            }
    
  2. To return the minimum price at which the products can be purchased, we create a nested aggregation with code similar to the following:
            curl -XGET 'http://127.0.0.1:9200/test-index/product/_search?pretty&size=0' -d '{
                "query" : {
                    "match" : { "name" : "my product" }
                },
                "aggs" : {
                    "resellers" : {
                        "nested" : {
                            "path" : "resellers"
                        },
                        "aggs" : {
                            "min_price" : { "min" : { "field" : "resellers.price" } }
                        }
                    }
                }
            }'
    
  3. The result returned by Elasticsearch, if everything is okay, should be as follows:
            {
                "took" : 7, 
                "timed_out" : false, 
                "_shards" : {...truncated...}, 
                "hits" : {...truncated...}, 
                "aggregations": { 
                    "resellers": { 
                        "min_price": { 
                            "value" : 130 
                        } 
                    } 
                } 
            } 
    

In this case, the resulting aggregation is a simple min metric, which we have already seen in the second recipe of this chapter.
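As a rough illustration, the request body from step 2 can also be assembled programmatically and the metric read back from a response shaped like the one in step 3. This is only a sketch that uses the Python standard library: the helper names `build_min_price_request` and `extract_min_price` are hypothetical, and no HTTP call is made, so the response is a hand-copied fragment of the output shown above.

```python
import json


def build_min_price_request(path="resellers", price_field="resellers.price"):
    """Build the search body for a nested aggregation with a min sub-aggregation."""
    return {
        "query": {"match": {"name": "my product"}},
        "aggs": {
            # The aggregation is named after the nested path, as in the recipe.
            path: {
                "nested": {"path": path},
                "aggs": {"min_price": {"min": {"field": price_field}}},
            }
        },
    }


def extract_min_price(response, path="resellers"):
    """Read the min_price value from a search response."""
    return response["aggregations"][path]["min_price"]["value"]


# Serialized body, suitable for the -d argument of the curl command.
body = json.dumps(build_min_price_request())

# Fragment copied from the response in step 3 (other keys omitted).
sample_response = {"aggregations": {"resellers": {"min_price": {"value": 130}}}}
min_price = extract_min_price(sample_response)
print(min_price)
```

Keeping the path and the field as parameters makes it easy to reuse the same body for another nested field.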

How it works...

The nested aggregation requires only one parameter: the path, relative to the parent document, of the field that contains the nested documents.

Once the nested aggregation is defined, every other kind of aggregation can be used in its sub-aggregations.
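To make the change of context more concrete, the following sketch imitates in plain Python what the nested aggregation does conceptually: it moves from the matching parent documents to the nested documents stored under a path, and then runs a metric on them. It is an in-memory model, not Elasticsearch code. The sample products and the helper names `nested_docs` and `min_metric` are assumptions made up for the illustration.

```python
# Hypothetical products; every entry under "resellers" stands for one nested document.
products = [
    {"name": "my product", "tags": ["tag1", "tag2"],
     "resellers": [{"username": "username_1", "price": 150.0},
                   {"username": "username_2", "price": 130.0}]},
    {"name": "my product", "tags": ["tag1"],
     "resellers": [{"username": "username_1", "price": 145.0}]},
]


def nested_docs(parents, path):
    """Switch context from the parent documents to the nested documents under path."""
    return [child for parent in parents for child in parent.get(path, [])]


def min_metric(docs, field):
    """Return the smallest value of field, or None when no document has it."""
    values = [doc[field] for doc in docs if field in doc]
    return min(values) if values else None


resellers = nested_docs(products, "resellers")
nested_count = len(resellers)                     # nested documents visited
lowest_price = min_metric(resellers, "price")     # sub-aggregation on the nested documents
print(nested_count, lowest_price)
```

Any other function that accepts the list returned by `nested_docs` plays the role of a different sub-aggregation.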

There's more...

Elasticsearch also provides a way to step back from the nested documents and aggregate on fields of their parent document: this aggregation is called reverse_nested.

Continuing the preceding example, we can aggregate the top tags for each reseller with a query similar to the following:

curl -XGET 'http://127.0.0.1:9200/test-index/product/_search?pretty&size=0' -d '{
    "query" : {
        "match" : { "name" : "my product" }
    },
    "aggs" : {
        "resellers" : {
            "nested" : {
                "path" : "resellers"
            },
            "aggs" : {
                "top_resellers" : {
                    "terms" : {
                        "field" : "resellers.username"
                    },
                    "aggs" : {
                        "resellers_to_product" : {
                            "reverse_nested" : {},
                            "aggs" : {
                                "top_tags_per_reseller" : {
                                    "terms" : { "field" : "tags" }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}'

In this example, there are several steps:

  1. We first aggregate on the nested resellers.
  2. With the nested reseller documents in scope, we can run a terms aggregation on the username field (resellers.username).
  3. From each bucket of the top resellers aggregation, we go back to the parent documents via reverse_nested.
  4. Now we can aggregate on the tags of the parent documents.
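The four steps above can be imitated in memory to see which documents are counted at each level. This is a conceptual sketch only, not how Elasticsearch implements the aggregation: the sample products and the function name `top_tags_per_reseller` are hypothetical. One reseller appears twice on the same product to show that, after stepping back to the parent, that product is counted once.

```python
from collections import Counter, defaultdict

# Hypothetical products; every entry under "resellers" stands for one nested document.
products = [
    {"tags": ["tag1", "tag2"],
     "resellers": [{"username": "username_1"}, {"username": "username_2"}]},
    {"tags": ["tag1"],
     "resellers": [{"username": "username_1"}]},
    {"tags": ["tag2", "tag3"],
     "resellers": [{"username": "username_2"}, {"username": "username_2"}]},
]


def top_tags_per_reseller(parents):
    # Steps 1 and 2: enter the nested resellers and bucket them by username,
    # remembering which parent each nested document belongs to.
    buckets = defaultdict(list)
    for parent_id, parent in enumerate(parents):
        for reseller in parent["resellers"]:
            buckets[reseller["username"]].append(parent_id)

    result = {}
    for username, parent_ids in buckets.items():
        # Step 3: go back to the parent documents; each parent is counted once.
        unique_parents = sorted(set(parent_ids))
        # Step 4: aggregate on a field of the parent documents.
        tags = Counter(tag for pid in unique_parents for tag in parents[pid]["tags"])
        result[username] = {
            "nested_count": len(parent_ids),
            "parent_count": len(unique_parents),
            "tags": dict(tags),
        }
    return result


summary = top_tags_per_reseller(products)
print(summary)
```

In this model, `nested_count` corresponds to the documents seen by the terms aggregation on usernames, while `parent_count` and `tags` are computed after the step back to the parent documents.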

The response is similar to this one:

{ 
    "took" : 93, 
    "timed_out" : false, 
    "_shards" : {...truncated...}, 
    "hits" : {...truncated...}, 
    "aggregations": { 
        "resellers": { 
            "top_resellers": { 
                "buckets" : [ 
                    { 
                        "key" : "username_1", 
                        "doc_count" : 17, 
                        "resellers_to_product" : { 
                            "top_tags_per_reseller" : { 
                                "buckets" : [ 
                                    { 
                                        "key" : "tag1", 
                                        "doc_count" : 9 
                                    },... 
                                ] 
                            } 
                        } 
                    },... 
                ] 
            } 
        } 
    } 
} 
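Because the buckets are nested one inside the other, a small helper can flatten such a response into rows that are easier to print or export. The sketch below is one possible way to do it with plain Python dictionaries: the function name `tags_by_reseller` is hypothetical, the aggregation names are the ones defined in the request, and the sample holds only the single bucket visible in the truncated response.

```python
def tags_by_reseller(response, nested_name, terms_name, reverse_name, tags_name):
    """Flatten the aggregation tree into (reseller, tag, doc_count) rows."""
    rows = []
    reseller_buckets = response["aggregations"][nested_name][terms_name]["buckets"]
    for reseller in reseller_buckets:
        tag_buckets = reseller[reverse_name][tags_name]["buckets"]
        for tag in tag_buckets:
            rows.append((reseller["key"], tag["key"], tag["doc_count"]))
    return rows


# Only the bucket shown in the response; the elided buckets are left out.
sample_response = {
    "aggregations": {
        "resellers": {
            "top_resellers": {
                "buckets": [
                    {
                        "key": "username_1",
                        "doc_count": 17,
                        "resellers_to_product": {
                            "top_tags_per_reseller": {
                                "buckets": [{"key": "tag1", "doc_count": 9}]
                            }
                        },
                    }
                ]
            }
        }
    }
}

rows = tags_by_reseller(
    sample_response,
    "resellers",              # nested aggregation
    "top_resellers",          # terms aggregation on resellers.username
    "resellers_to_product",   # reverse_nested aggregation
    "top_tags_per_reseller",  # terms aggregation on tags
)
print(rows)
```

Passing the aggregation names as arguments keeps the helper usable when the request names its aggregations differently.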