Normalizing the document

Although we should avoid putting different types of documents in the same index, if we do need to keep them in a single index, there are a few things to take into consideration, such as normalizing the fields so that we can easily perform operations on them. For example, suppose we have a date field on which we want to perform operations, and in one document this field is named timestamp, while in another document it is named create_date. Any operation on the date field then becomes a problem, because it covers only the documents that use the field name we specify.

Refer to the following example where we have two documents in the index; in the first document, the date field is create_date, whereas in the other, the date field is timestamp:

[
  {
    "_index": "bqstack",
    "_type": "blogs",
    "_id": "EQJnGWQBnhG38eKPq5Bo",
    "_score": 1,
    "_source": {
      "category_name": "Railways",
      "name": "Rocky Paul",
      "edit_approved": false,
      "email": "rocky.paul.9867@xyz.com",
      "edited_blog_content": null,
      "category_id": 24,
      "author_id": 75,
      "create_date": "rocky.paul.9867@abcd.com"
    }
  },
  {
    "_index": "bqstack",
    "_type": "blogs",
    "_id": "EwJnGWQBnhG38eKPq5Bo",
    "_score": 1,
    "_source": {
      "category_name": "Cars",
      "name": "Rocky Paul",
      "edit_approved": false,
      "email": "rocky.paul.9867@abcd.com",
      "edited_blog_content": null,
      "category_id": 35,
      "author_id": 75,
      "timestamp": "2018-05-09T13:28:20.917Z"
    }
  }
]

If we have not normalized these fields, a search on one of them will pick up only those documents where that field is defined and will leave out the documents that use a different field name.
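
This effect can be reproduced outside Elasticsearch with a few lines of Python. The sketch below is only a simulation over in-memory dictionaries, not an Elasticsearch query: the helper name ids_with_field is hypothetical, the documents are trimmed copies of the two samples, and the create_date value is an assumed placeholder used purely for illustration.

```python
from typing import Any, Dict, List

# Trimmed copies of the two sample documents; only the fields needed here.
# The create_date value is a hypothetical placeholder for illustration.
docs: List[Dict[str, Any]] = [
    {
        "_id": "EQJnGWQBnhG38eKPq5Bo",
        "_source": {
            "category_name": "Railways",
            "create_date": "2018-05-08T10:00:00.000Z",
        },
    },
    {
        "_id": "EwJnGWQBnhG38eKPq5Bo",
        "_source": {
            "category_name": "Cars",
            "timestamp": "2018-05-09T13:28:20.917Z",
        },
    },
]


def ids_with_field(documents: List[Dict[str, Any]], field: str) -> List[str]:
    """Return the IDs of the documents whose _source defines the given field."""
    return [d["_id"] for d in documents if d["_source"].get(field) is not None]


# Only the second document defines "timestamp", so the first one is left out.
matched = ids_with_field(docs, "timestamp")
print(matched)
```

Any filter, sort, or aggregation that names timestamp would see the same subset, which is why the first document silently drops out of date-based operations.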

So, to avoid these issues, we can rename the field in one document to match the field of the same type in the other document. This can improve Elasticsearch performance, and any operation we perform on such a field will then return results from all of the documents.
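
One way to apply such a rename is to normalize each document before it is indexed. The following is a minimal sketch under stated assumptions: it treats timestamp as the canonical name, the names FIELD_ALIASES and normalize_source are hypothetical, and the create_date value is an assumed placeholder. It works on plain dictionaries and does not call Elasticsearch.

```python
from typing import Any, Dict

# Hypothetical mapping from alternate field names to the canonical name.
FIELD_ALIASES: Dict[str, str] = {"create_date": "timestamp"}


def normalize_source(source: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of source with aliased field names renamed."""
    normalized: Dict[str, Any] = {}
    for key, value in source.items():
        canonical = FIELD_ALIASES.get(key, key)
        # If both names are present, the value stored under the
        # canonical name wins and the aliased value is dropped.
        if key != canonical and canonical in normalized:
            continue
        normalized[canonical] = value
    return normalized


# Trimmed sample documents; the create_date value is a placeholder.
first = {"category_name": "Railways", "create_date": "2018-05-08T10:00:00.000Z"}
second = {"category_name": "Cars", "timestamp": "2018-05-09T13:28:20.917Z"}

normalized_first = normalize_source(first)
normalized_second = normalize_source(second)
print(sorted(normalized_first))
print(sorted(normalized_second))
```

After this step both documents expose the date under timestamp, so a single query on that field covers them all. Documents that are already in the index would need a separate reindex or update step, which is not shown here.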