An Elasticsearch document is expressed in JavaScript Object Notation (JSON) format. In JSON, the valid datatypes are string, number, JSON object, array, Boolean, and null. Together, the JSON data value and the mapping rules determine the final datatype of each document field. The following table describes the mapping rules:
| JSON datatype | Value | Setting in mapping | Mapped datatype |
|---|---|---|---|
| string | string | | text |
| string | date | date_detection=true | date |
| string | integer | numeric_detection=true | long |
| string | floating point | numeric_detection=true | float |
| number | integer | | long |
| number | floating point | | float |
| JSON object | | | object |
| array | supported data type | | Datatype of the first element |
| boolean | true/false | | Boolean |
| null | null | | Not mapped |
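The rules in the table can be sketched as a small Python function. This is only a simplified model for illustration, not Elasticsearch's implementation: the name infer_mapped_type is hypothetical, the only date format recognized is yyyy/MM/dd (the format used in this section), and numeric strings are matched with simple regular expressions.

```python
import re
from typing import Any, Optional

# Assumption: only the yyyy/MM/dd date format is recognized in this sketch.
DATE_PATTERN = re.compile(r"^\d{4}/\d{2}/\d{2}$")
INTEGER_PATTERN = re.compile(r"^[+-]?\d+$")
FLOAT_PATTERN = re.compile(r"^[+-]?\d+\.\d+$")


def infer_mapped_type(value: Any,
                      date_detection: bool = True,
                      numeric_detection: bool = False) -> Optional[str]:
    """Return the mapped datatype the table predicts, or None if not mapped."""
    if value is None:
        return None  # null values are not mapped
    # bool must be tested before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # The table maps an array to the datatype of its first element;
        # an empty array offers nothing to inspect, so it is not mapped.
        if not value:
            return None
        return infer_mapped_type(value[0], date_detection, numeric_detection)
    if isinstance(value, str):
        if date_detection and DATE_PATTERN.match(value):
            return "date"
        if numeric_detection and INTEGER_PATTERN.match(value):
            return "long"
        if numeric_detection and FLOAT_PATTERN.match(value):
            return "float"
        return "text"
    raise TypeError(f"not a JSON value: {value!r}")


if __name__ == "__main__":
    # Compare the default settings with date detection off, numeric detection on.
    for sample in ["2019/01/01", "42", "4.2", ["1", "2"], None]:
        default = infer_mapped_type(sample)
        custom = infer_mapped_type(sample, date_detection=False,
                                   numeric_detection=True)
        print(repr(sample), default, custom)
```

The two keyword arguments mirror the defaults described in this section: date detection on and numeric detection off.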
By default, the date_detection setting is enabled and the numeric_detection setting is disabled. Let's look at an example that uses two indices, default_mappings_index and custom_mappings_index. For convenience, we keep the default settings and use yyyy/MM/dd as the date format in the mapping.
The mappings are created in the following steps:
- Create a default_mappings_index index using the simple date format as shown in the following screenshot:
- The sample document contains 12 fields to examine the variations in different JSON values and different settings. Let's first index the sample document into default_mappings_index, as shown in the following screenshot:
- The other index is designed to turn off the detection of date string values and turn on the detection of numeric strings. custom_mappings_index is created with the date_detection setting off and the numeric_detection setting on, as shown in the following screenshot:
- Index the same sample document into custom_mappings_index:
- Now, get the mapping from the two indices and then compile them into a table for comparison. To get the mappings of default_mappings_index, use the following API, as shown in the following screenshot:
- To get the mappings of custom_mappings_index, use the following API, as shown in the following screenshot:
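The steps above can be sketched in Python as a list of REST requests for each index. This is a hedged outline rather than the exact requests used here: it assumes the typeless Elasticsearch 7.x endpoints (PUT /index, POST /index/_doc, GET /index/_mapping), assumes the yyyy/MM/dd format is registered through dynamic_date_formats, and uses placeholder values for the sample document, with field names borrowed from the comparison table in this section. The helper names build_requests and render_request are hypothetical.

```python
import json
from typing import Any, Dict, List, Optional, Tuple

Request = Tuple[str, str, Optional[Dict[str, Any]]]

# Assumption: default detection settings, with yyyy/MM/dd as the dynamic date format.
DEFAULT_INDEX_BODY = {"mappings": {"dynamic_date_formats": ["yyyy/MM/dd"]}}
# Date detection off, numeric detection on.
CUSTOM_INDEX_BODY = {"mappings": {"date_detection": False,
                                  "numeric_detection": True}}

# Placeholder values; only the field names come from the comparison table.
SAMPLE_DOCUMENT = {
    "simple_string_value": "a simple string",
    "string_w_date_value": "2019/01/01",
    "string_w_floating_point_value": "1.5",
    "string_w_integer_value": "10",
    "integer_value": 10,
    "floating_point_value": 1.5,
    "json_object": {"level_1": {"level2_1": "text",
                                "level_2_2": 1,
                                "level_2_3": 1.5}},
    "array_of_integer": [1, 2],
    "array_of_float": [1.5, 2.5],
    "array_of_simple_string": ["a", "b"],
    "array_of_integer_string": ["1", "2"],
    "array_of_float_string": ["1.5", "2.5"],
    "boolean_value": True,
    "null_value": None,
}


def build_requests(index_name: str, index_body: Dict[str, Any],
                   document: Dict[str, Any]) -> List[Request]:
    """Create the index, index one document, then read the mapping back."""
    return [
        ("PUT", f"/{index_name}", index_body),
        ("POST", f"/{index_name}/_doc", document),
        ("GET", f"/{index_name}/_mapping", None),
    ]


def render_request(method: str, path: str,
                   body: Optional[Dict[str, Any]]) -> str:
    """Format a request as a request line followed by an optional JSON body."""
    line = f"{method} {path}"
    if body is None:
        return line
    return line + "\n" + json.dumps(body, indent=2)


if __name__ == "__main__":
    for name, body in [("default_mappings_index", DEFAULT_INDEX_BODY),
                       ("custom_mappings_index", CUSTOM_INDEX_BODY)]:
        for request in build_requests(name, body, SAMPLE_DOCUMENT):
            print(render_request(*request))
            print()
```

The printed requests can be pasted into a console or sent with any HTTP client; the sketch deliberately stops short of contacting a cluster.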
The following table shows the results from comparing default_mappings_index and custom_mappings_index:
| Field | Field mapping in default_mappings_index | Field mapping in custom_mappings_index |
|---|---|---|
| simple_string_value | text_type_with_keyword | Same |
| string_w_date_value | {"type": "date","format": "yyyy/MM/dd"} | text_type_with_keyword |
| string_w_floating_point_value | text_type_with_keyword | {"type": "float"} |
| string_w_integer_value | text_type_with_keyword | {"type": "long"} |
| integer_value | {"type": "long"} | Same |
| floating_point_value | {"type": "float"} | Same |
| json_object | corresponding_object_type | Same |
| array_of_integer | {"type": "long"} | Same |
| array_of_float | {"type": "float"} | Same |
| array_of_simple_string | text_type_with_keyword | Same |
| array_of_integer_string | text_type_with_keyword | {"type": "long"} |
| array_of_float_string | text_type_with_keyword | {"type": "float"} |
| boolean_value | {"type": "boolean"} | Same |
| null_value | Not indexed | Not indexed |
In the preceding table, text_type_with_keyword is {"type": "text","fields": {"keyword":{"type":"keyword","ignore_above": 256}}} and corresponding_object_type is {"properties": {"level_1": {"properties": {"level2_1": {"type": "text","fields": {"keyword": {"type": "keyword","ignore_above": 256}}},"level_2_2": {"type": "long"},"level_2_3": {"type": "float"}}}}}, where each field of the object follows the same mapping rules.
We can also see that the numeric_detection setting affects arrays of numeric strings as well as single numeric strings. Finally, we can establish that any field with a null value will not be indexed, and the same is true for empty arrays.
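Compiling the two mappings into a comparison table can also be sketched in Python. The snippet below is an illustrative assumption rather than the procedure used here: compare_mappings is a hypothetical helper, and the two dictionaries stand in for excerpts of the "properties" section of each index's mapping, filled with a few rows from the table above.

```python
from typing import Any, Dict, List, Tuple

# The shorthand from the table, written out in full.
TEXT_TYPE_WITH_KEYWORD = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}

# Excerpts only; values are taken from the comparison table.
DEFAULT_PROPERTIES: Dict[str, Any] = {
    "simple_string_value": TEXT_TYPE_WITH_KEYWORD,
    "string_w_date_value": {"type": "date", "format": "yyyy/MM/dd"},
    "string_w_integer_value": TEXT_TYPE_WITH_KEYWORD,
    "integer_value": {"type": "long"},
}
CUSTOM_PROPERTIES: Dict[str, Any] = {
    "simple_string_value": TEXT_TYPE_WITH_KEYWORD,
    "string_w_date_value": TEXT_TYPE_WITH_KEYWORD,
    "string_w_integer_value": {"type": "long"},
    "integer_value": {"type": "long"},
}


def compare_mappings(default_props: Dict[str, Any],
                     custom_props: Dict[str, Any]) -> List[Tuple[str, Any, Any]]:
    """Return (field, default mapping, custom mapping or "Same") rows."""
    rows = []
    for field, default_mapping in default_props.items():
        custom_mapping = custom_props.get(field)
        # Report "Same" when both indices mapped the field identically.
        shown = "Same" if custom_mapping == default_mapping else custom_mapping
        rows.append((field, default_mapping, shown))
    return rows


if __name__ == "__main__":
    for field, default_mapping, custom_mapping in compare_mappings(
            DEFAULT_PROPERTIES, CUSTOM_PROPERTIES):
        print(field, "|", default_mapping, "|", custom_mapping)
```

A field that is absent from both mappings, such as null_value, never appears in the rows, which matches the "Not indexed" entries in the table.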