Using the exists query

One of the main characteristics of Elasticsearch is its schema-less indexing capability. Because documents in Elasticsearch can therefore have missing values, two kinds of queries are required:

  • Exists field: This is used to check if a field exists in a document
  • Missing field: This is used to check if a field is missing

Getting ready

You need an up-and-running Elasticsearch installation as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command line, you need to install curl for your operating system.

To correctly execute the following commands, you need an index populated with the chapter_05/populate_query.sh script available in the online code.

How to do it...

To execute the exists and missing filters, we will perform the following steps:

  1. To search all the test-type documents that have a field called parsedtext, the query will be:
            curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/_search?pretty=true' -d '{
              "query": {
                "exists": {
                  "field": "parsedtext"
                }
              }
            }'
    
  2. To search all the test-type documents that do not have a field called parsedtext, the query will be as follows:
            curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/_search?pretty=true' -d '{
              "query": {
                "bool": {
                  "must_not": {
                    "exists": {
                      "field": "parsedtext"
                    }
                  }
                }
              }
            }'
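
The two request bodies above can also be built programmatically. The following is a minimal Python sketch, not part of the recipe's online code: the helper names exists_query and missing_query are hypothetical and do not belong to any Elasticsearch client library. The code only builds and prints the JSON bodies; it does not contact a server.

```python
import json


def exists_query(field):
    """Build a search body matching documents that have a value for `field`."""
    return {"query": {"exists": {"field": field}}}


def missing_query(field):
    """Build a search body matching documents that lack a value for `field`.

    The exists clause is wrapped in bool/must_not, as in step 2.
    """
    return {"query": {"bool": {"must_not": {"exists": {"field": field}}}}}


if __name__ == "__main__":
    # Print both bodies; each can be passed to curl with the -d option.
    print(json.dumps(exists_query("parsedtext"), indent=2))
    print(json.dumps(missing_query("parsedtext"), indent=2))
```

Keeping the field name as the only argument mirrors the fact that the exists clause itself takes a single field parameter.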
    

How it works...

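The exists filter takes only a field parameter, which contains the name of the field to be checked. The missing check uses the same parameter: it is the exists filter wrapped in a bool must_not clause, as shown in step 2.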

With simple fields, there are no pitfalls. If the field is a single embedded object or a list of them, however, you need to check a subobject field because of how Elasticsearch/Lucene indexes objects.

The following example shows how Elasticsearch maps JSON objects to Lucene documents internally. Suppose you index this JSON document:

{ 
    "name":"Paul", 
    "address":{ 
        "city":"Sydney", 
        "street":"Opera House Road", 
        "number":"44" 
    } 
} 

Elasticsearch will internally index it as follows:

name:Paul 
address.city:Sydney 
address.street:Opera House Road 
address.number:44 

As we can see, no field named address is indexed, so an exists filter on address fails. To match documents that have an address, you must search for a subfield (for example, address.city).
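
The flattening described above can be imitated in a few lines of Python. This is only a simplified model written for illustration, not Elasticsearch's or Lucene's actual implementation, and real behavior may differ between versions. The function names flatten and field_exists are hypothetical.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into dotted field names, such as address.city.

    Each field name maps to a list of values, so a list of embedded
    objects contributes all of its values under the same dotted name.
    """
    fields = {}
    for key, value in doc.items():
        name = prefix + key
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Recurse into the embedded object; only its leaves are kept.
                for sub_name, sub_values in flatten(item, name + ".").items():
                    fields.setdefault(sub_name, []).extend(sub_values)
            elif item is not None:
                # Null values are treated as missing in this model.
                fields.setdefault(name, []).append(item)
    return fields


def field_exists(doc, field):
    """Return True if the flattened document has a value for `field`."""
    return field in flatten(doc)


doc = {
    "name": "Paul",
    "address": {"city": "Sydney", "street": "Opera House Road", "number": "44"},
}

print(sorted(flatten(doc)))               # address.city, address.number, address.street, name
print(field_exists(doc, "address"))       # False: no field named address in this model
print(field_exists(doc, "address.city"))  # True: the subfield is present
```

In this model the object name address never appears as a field of its own, which is why the check has to target a subfield.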
