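One of the main characteristics of Elasticsearch is its schema-less indexing capability, which means that records in Elasticsearch can have missing values. Because of this schema-less nature, two kinds of queries are required: one that matches documents in which a field exists, and one that matches documents in which the field is missing.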
You need an up-and-running Elasticsearch installation as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
To execute curl via the command line, you need to install curl for your operating system.
To correctly execute the following commands, you need an index populated with the chapter_05/populate_query.sh script available in the online code.
To execute existing and missing filters, we will perform the following steps:
1. To search for all the test-index/test-type documents that have a field called parsedtext, the query will be:
curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/_search?pretty=true' -d '{ "query": { "exists": { "field": "parsedtext" } } }'
2. To search for all the test-index/test-type documents that do not have a field called parsedtext, the query will be as follows:
curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/_search?pretty=true' -d '{ "query": { "bool": { "must_not": { "exists": { "field": "parsedtext" } } } } }'
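The two request bodies used in the steps above can also be built programmatically, which helps avoid quoting mistakes on the command line. The following is a minimal Python sketch; the helper names exists_query and missing_query are hypothetical and are not part of any Elasticsearch client. It only builds and prints the JSON bodies and does not contact a server.

```python
import json


def exists_query(field):
    """Body matching documents in which `field` has an indexed value."""
    return {"query": {"exists": {"field": field}}}


def missing_query(field):
    """Body matching documents without `field`, by negating `exists`."""
    return {"query": {"bool": {"must_not": {"exists": {"field": field}}}}}


if __name__ == "__main__":
    # Print the bodies so that they can be passed to curl with -d.
    print(json.dumps(exists_query("parsedtext"), indent=2))
    print(json.dumps(missing_query("parsedtext"), indent=2))
```

The printed JSON is equivalent to the bodies passed to curl with the -d option in the two steps.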
The exists query takes only a field parameter, which contains the name of the field to be checked. The missing case is expressed by wrapping the same exists query in the must_not clause of a bool query.
With simple fields, there are no pitfalls. However, if the field is a single embedded object or a list of them, you need to check a subobject field because of how Elasticsearch/Lucene works.
The following example shows how Elasticsearch maps JSON objects to Lucene documents internally. Suppose you index this JSON document:
{ "name":"Paul", "address":{ "city":"Sydney", "street":"Opera House Road", "number":"44" } }
Elasticsearch will internally index it as follows:
name:paul
address.city:Sydney
address.street:Opera House Road
address.number:44
As we can see, there is no address field indexed, so an exists filter on address fails. To match documents with an address, you must search for a subfield (for example, address.city).
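To make the flattening concrete, here is a simplified Python model of it. It is only a sketch of the idea described above, not how Elasticsearch or Lucene is implemented: the flatten function is a hypothetical helper, and it ignores text analysis (such as lowercasing), null values, and mapping settings.

```python
def flatten(doc, prefix=""):
    """Return {dotted.path: [leaf values]} for a nested JSON-like document."""
    fields = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        # Treat a single value and a list of values the same way.
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Embedded objects contribute only their subfields.
                for sub_path, sub_values in flatten(item, path + ".").items():
                    fields.setdefault(sub_path, []).extend(sub_values)
            else:
                fields.setdefault(path, []).append(item)
    return fields


doc = {
    "name": "Paul",
    "address": {"city": "Sydney", "street": "Opera House Road", "number": "44"},
}
fields = flatten(doc)
print(sorted(fields))
# ['address.city', 'address.number', 'address.street', 'name']
print("address" in fields)       # False: the object itself is not a field
print("address.city" in fields)  # True: the subfield can be checked
```

In this model, a check for address finds nothing because only the leaf paths are stored, while a check for address.city succeeds. A list of embedded objects flattens to the same dotted paths, with the values of each object collected under them.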