Elasticsearch has two ways to decide which documents a search returns, depending on the context in which a query clause is used. This tutorial gives you an overview of these two contexts: Query and Filter.
1. Query Context
In this context, the query clause answers the question:
“How well does this document match this query clause?”
>> The clause determines two things:
– whether or not the document matches
– how well the document matches relative to other documents, which is what _score represents
For example, this query:
GET javasampleapproach/tutorial/_search
{
  "query": {
    "match": { "title": "Angular 4" }
  }
}
returns a response like this:
{
  ...
  "hits": {
    "total": 4,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "1",
        "_score": 0.5753642,
        "_source": {
          "title": "Angular 4 Elasticsearch Introduction",
          "post_date": "2017-10-25",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "elasticsearch" ]
        }
      },
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "3",
        "_score": 0.5649868,
        "_source": {
          "title": "Angular 4 Firebase Quick Start",
          "post_date": "2017-10-25",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "firebase" ]
        }
      },
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "2",
        "_score": 0.37227193,
        "_source": {
          "title": "Angular 4 Elasticsearch Create Index",
          "post_date": "2017-10-25",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "elasticsearch" ]
        }
      },
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "4",
        "_score": 0.3256223,
        "_source": {
          "title": "Angular 4 Firebase - CRUD Operations example",
          "post_date": "2017-10-26",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "firebase" ]
        }
      }
    ]
  }
}
Look at the _score of each returned hit: the hits are sorted by _score in descending order.
Change the query to:
GET javasampleapproach/tutorial/_search
{
  "query": {
    "match": { "title": "angular firebase" }
  }
}
The response will be:
{
  ...
  "hits": {
    "total": 4,
    "max_score": 0.78178394,
    "hits": [
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "4",
        "_score": 0.78178394,
        "_source": {
          "title": "Angular 4 Firebase - CRUD Operations example",
          "post_date": "2017-10-26",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "firebase" ]
        }
      },
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "3",
        "_score": 0.5649868,
        "_source": {
          "title": "Angular 4 Firebase Quick Start",
          "post_date": "2017-10-25",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "firebase" ]
        }
      },
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "title": "Angular 4 Elasticsearch Introduction",
          "post_date": "2017-10-25",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "elasticsearch" ]
        }
      },
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "2",
        "_score": 0.18613596,
        "_source": {
          "title": "Angular 4 Elasticsearch Create Index",
          "post_date": "2017-10-25",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "elasticsearch" ]
        }
      }
    ]
  }
}
The order of the hits changes along with their _score values: the two documents whose titles contain both “angular” and “firebase” now rank first.
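To make the re-ranking concrete, here is a minimal Python sketch that sorts the documents by the _score values copied from the two responses above. The helper name rank_by_score is our own, chosen for illustration; it only mimics the default ordering of hits and does not talk to Elasticsearch.

```python
# _id -> _score, copied from the two responses in this section.
scores_angular_4 = {"1": 0.5753642, "3": 0.5649868, "2": 0.37227193, "4": 0.3256223}
scores_angular_firebase = {"4": 0.78178394, "3": 0.5649868, "1": 0.2876821, "2": 0.18613596}


def rank_by_score(scores):
    """Return document ids sorted by descending _score (hypothetical helper)."""
    return sorted(scores, key=scores.get, reverse=True)


print(rank_by_score(scores_angular_4))         # ['1', '3', '2', '4']
print(rank_by_score(scores_angular_firebase))  # ['4', '3', '1', '2']
```

The same four documents match both queries; only their relative scores, and therefore their order, differ.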
2. Filter Context
In Filter context, a query clause answers the question:
“Does this document match this query clause?”
>> The answer is a simple Yes or No, and no _score is calculated.
Frequently used filters will be cached automatically by Elasticsearch, to speed up performance. This context is mostly used for filtering structured data.
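For example, a filter can answer questions such as:
– Is post_date on or after “2017-10-25”?
– Does tags contain “firebase”?
The filtered query of older Elasticsearch versions has been replaced by the bool query with a filter clause.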
So with this query:
GET javasampleapproach/tutorial/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "tags": "firebase" } },
        { "range": { "post_date": { "gte": "2017-10-25" } } }
      ]
    }
  }
}
We have the response:
{
  ...
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": [
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "4",
        "_score": 0,
        "_source": {
          "title": "Angular 4 Firebase - CRUD Operations example",
          "post_date": "2017-10-26",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "firebase" ]
        }
      },
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "3",
        "_score": 0,
        "_source": {
          "title": "Angular 4 Firebase Quick Start",
          "post_date": "2017-10-25",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "firebase" ]
        }
      }
    ]
  }
}
Notice that _score is the same constant value (0) for every hit, because no scoring is done in Filter context.
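The Yes/No nature of Filter context can be pictured with a toy Python model. This is only a sketch: it treats the term and range filters as plain boolean predicates over the four sample documents of this tutorial, and the function names are hypothetical. It says nothing about how Elasticsearch evaluates or caches filters internally.

```python
from datetime import date

# The four sample documents, reduced to the fields the filters look at.
docs = {
    "1": {"tags": ["angular", "angular 4", "elasticsearch"], "post_date": date(2017, 10, 25)},
    "2": {"tags": ["angular", "angular 4", "elasticsearch"], "post_date": date(2017, 10, 25)},
    "3": {"tags": ["angular", "angular 4", "firebase"], "post_date": date(2017, 10, 25)},
    "4": {"tags": ["angular", "angular 4", "firebase"], "post_date": date(2017, 10, 26)},
}


def term_tags(doc, value):
    """Toy stand-in for the term filter: does tags contain the exact value?"""
    return value in doc["tags"]


def range_post_date_gte(doc, start):
    """Toy stand-in for the range filter: is post_date on or after start?"""
    return doc["post_date"] >= start


def passes_filters(doc):
    """A document is kept only if every filter answers Yes."""
    return term_tags(doc, "firebase") and range_post_date_gte(doc, date(2017, 10, 25))


matching_ids = sorted(doc_id for doc_id, doc in docs.items() if passes_filters(doc))
print(matching_ids)  # ['3', '4']
```

Each document is either in or out. There is no notion of one document matching better than another, which is why the real response carries the same _score for both hits.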
3. Query & Filter Context
Now we mix both contexts in a single search request:
GET javasampleapproach/tutorial/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "angular 4" } }
      ],
      "filter": [
        { "term": { "tags": "firebase" } },
        { "range": { "post_date": { "gte": "2017-10-25" } } }
      ]
    }
  }
}
– The must clause and the match clause inside it are used in Query context, which means that they are used to calculate the _score that tells how well each document matches.
– The filter clause indicates Filter context, in which the term and range clauses are used. They filter out documents which do not match, but do NOT affect the _score.
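If you assemble such requests from application code, it can help to keep the scoring clauses and the filtering clauses apart. The Python sketch below builds the same request body as above; build_bool_query is a hypothetical helper of ours, not part of any Elasticsearch client, and the sketch only prints the JSON instead of sending it.

```python
import json


def build_bool_query(must, filters):
    """Build a search body (hypothetical helper).

    Clauses in `must` run in Query context and contribute to _score;
    clauses in `filters` run in Filter context and only include or exclude documents.
    """
    return {"query": {"bool": {"must": list(must), "filter": list(filters)}}}


body = build_bool_query(
    must=[{"match": {"title": "angular 4"}}],
    filters=[
        {"term": {"tags": "firebase"}},
        {"range": {"post_date": {"gte": "2017-10-25"}}},
    ],
)

# The printed JSON can be pasted after the GET line of the request above.
print(json.dumps(body, indent=2))
```

Keeping the two lists separate makes it harder to put a clause in the wrong context by accident.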
So, we can look at the response:
{
  ...
  "hits": {
    "total": 2,
    "max_score": 0.5649868,
    "hits": [
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "3",
        "_score": 0.5649868,
        "_source": {
          "title": "Angular 4 Firebase Quick Start",
          "post_date": "2017-10-25",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "firebase" ]
        }
      },
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "4",
        "_score": 0.3256223,
        "_source": {
          "title": "Angular 4 Firebase - CRUD Operations example",
          "post_date": "2017-10-26",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "firebase" ]
        }
      }
    ]
  }
}
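We can see that _score is calculated from the match clause: both hits have the same _score as in the first example of the Query Context section.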
Now we change the range clause in the filter:
GET javasampleapproach/tutorial/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "angular 4" } }
      ],
      "filter": [
        { "term": { "tags": "firebase" } },
        { "range": { "post_date": { "gte": "2017-10-26" } } }
      ]
    }
  }
}
The response is:
{
  ...
  "hits": {
    "total": 1,
    "max_score": 0.3256223,
    "hits": [
      {
        "_index": "javasampleapproach",
        "_type": "tutorial",
        "_id": "4",
        "_score": 0.3256223,
        "_source": {
          "title": "Angular 4 Firebase - CRUD Operations example",
          "post_date": "2017-10-26",
          "author": { "name": "JavaSampleApproach", "role": "admin" },
          "tags": [ "angular", "angular 4", "firebase" ]
        }
      }
    ]
  }
}
The _score for “Angular 4 Firebase - CRUD Operations example” is still 0.3256223: changing the filter removes a document from the result but does not change the score of the remaining one.
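As a quick check of this observation, the Python sketch below compares the two responses of this section by _id. The scores are copied from the responses above; compare_hits is a hypothetical helper written for this illustration.

```python
# _id -> _score for the two mixed-context responses in this section.
hits_gte_2017_10_25 = {"3": 0.5649868, "4": 0.3256223}
hits_gte_2017_10_26 = {"4": 0.3256223}


def compare_hits(before, after):
    """Return (ids dropped by the new filter, ids whose _score changed)."""
    dropped = sorted(set(before) - set(after))
    changed = {
        doc_id: (before[doc_id], after[doc_id])
        for doc_id in before.keys() & after.keys()
        if before[doc_id] != after[doc_id]
    }
    return dropped, changed


dropped, changed = compare_hits(hits_gte_2017_10_25, hits_gte_2017_10_26)
print(dropped)  # ['3']
print(changed)  # {}
```

The stricter range filter drops document 3, and the empty second result confirms that no remaining document was re-scored.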
Last updated on April 25, 2019.