Elasticsearch Compound Queries

Compound queries wrap other compound or leaf queries to combine their results and scores, to change their behaviour, or to switch from query context to filter context. In this tutorial, we look at five types of compound query: Constant Score, Bool, Dis Max, Function Score and Boosting.

Related post: Elasticsearch Compound Queries – Function Score Query

I. Constant Score Query

The constant_score query wraps another query, and:
– executes it in filter context (more about query context & filter context: ElasticSearch Filter vs Query)
– gives the same _score to every matching document (a constant equal to the boost parameter, which defaults to 1.0).
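
As a minimal sketch, a constant_score request body could be built in Python like this. The tags field, the term value and the boost of 1.2 are assumptions chosen for illustration, not values taken from the original example.

```python
import json

# Hypothetical example: the field name "tags", the term and the boost value
# are assumptions made for illustration.
constant_score_body = {
    "query": {
        "constant_score": {
            # The wrapped query runs in filter context, so it is not scored.
            "filter": {"term": {"tags": "elasticsearch"}},
            # Every matching document receives this value as its _score.
            "boost": 1.2,
        }
    }
}

# Print the JSON that would be sent as the body of a _search request.
print(json.dumps(constant_score_body, indent=2))
```

Every document whose tags field contains the term would come back with the same _score, here 1.2.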


II. Bool Query

The bool query matches documents that satisfy a boolean combination of other queries. It is built from one or more typed clauses:

must: the clause must match, and it contributes to the score.

must_not: the clause must not match. It runs in filter context, so it does not affect the score.

filter: the clause must match, but it runs in filter context, so it does not affect the score.

should: the clause should match. What that means depends on the context:
+ in query context, when the bool query also has a must or filter clause: a document matches the bool query even if none of the should clauses match. Each should clause that does match raises the score.
For example:

This document will match (although its tags field doesn’t contain any of the terms in the should clauses):

+ in filter context, or when the bool query has no must or filter clause: at least one of the should clauses must match for a document to match the bool query. In filter context the score is ignored.
For example, the result of this query will not contain the document from the example above:
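
The two default rules for should can be sketched as a small Python model. This is a simplification for illustration only: each clause is reduced to a single term looked up in a set of document terms, and the function name and sample terms are hypothetical.

```python
def bool_matches(doc_terms, must_or_filter=(), should=(), filter_context=False):
    """Simplified model of the default bool matching rules described above.

    Each clause is modelled as one term that is looked up in doc_terms.
    """
    terms = set(doc_terms)

    # Every must/filter clause has to match.
    if not all(term in terms for term in must_or_filter):
        return False

    # should clauses become mandatory in filter context, or when there is
    # no must/filter clause at all.
    should_required = bool(should) and (filter_context or not must_or_filter)
    if should_required:
        return any(term in terms for term in should)

    # Otherwise should clauses are optional and only affect the score.
    return True


# Hypothetical document that contains none of the should terms.
doc = {"spring", "integration"}
should_terms = ["publisher", "subscriber"]

print(bool_matches(doc, must_or_filter=["spring"], should=should_terms))
print(bool_matches(doc, should=should_terms))
```

The first call models a query with a must clause, where the should terms are optional. The second call has only should clauses, so at least one of them has to match.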

We can control this behavior of should by setting the minimum_should_match parameter.

For example:

This query does not require “publisher” or “subscriber” to appear in the tags field of a document. The response may contain:

With minimum_should_match = 1, the response no longer includes documents whose tags field contains neither “publisher” nor “subscriber”:

With minimum_should_match = 2, only documents whose tags field contains both “publisher” and “subscriber” are returned.
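
The effect of minimum_should_match can be sketched as follows. The request body is a hypothetical reconstruction (the must clause on title is an assumption), and the helper only models integer values of the parameter, not the percentage forms that Elasticsearch also accepts.

```python
# Hypothetical request body: the must clause on "title" is an assumption,
# the should terms on "tags" follow the text above.
bool_body = {
    "query": {
        "bool": {
            "must": [{"match": {"title": "spring"}}],
            "should": [
                {"term": {"tags": "publisher"}},
                {"term": {"tags": "subscriber"}},
            ],
            "minimum_should_match": 1,
        }
    }
}


def passes_minimum_should_match(doc_tags, should_terms, minimum_should_match):
    """Count the matching should terms and compare with the threshold."""
    matched = sum(1 for term in should_terms if term in doc_tags)
    return matched >= minimum_should_match


should_terms = ["publisher", "subscriber"]
for tags in (["integration"], ["publisher"], ["publisher", "subscriber"]):
    for threshold in (0, 1, 2):
        accepted = passes_minimum_should_match(tags, should_terms, threshold)
        print(tags, threshold, accepted)
```

With a threshold of 0 every document passes, with 1 at least one of the two tags is needed, and with 2 both tags are needed.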

III. Dis Max Query

The dis_max query generates the union of the documents produced by its subqueries:
– it scores each document with the maximum score produced by any subquery
– it adds a tie-breaking increment for each additional matching subquery (the default tie_breaker is 0.0)

It is useful when searching for a word in multiple fields, when we want the primary score to be the one associated with the highest boost, not the sum of the field scores (as a bool query would give).

For example, we search for “spring” in the title field and “integration” in the tags field:

Response:

We can see that the document with only “integration” in the tags field scores higher than the documents with both “spring” in the title field and “integration” in the tags field.

Now we add tie_breaker to the query:

Response:

– the score of the document with only “integration” in the tags field does not change
– the scores of the documents with both “spring” in the title field and “integration” in the tags field increase by tie_breaker multiplied by the score of the additional matching subquery.
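
The scoring rule described above can be sketched in Python. The request body follows the fields named in the text, but the use of match queries, the tie_breaker value of 0.4 and the subquery scores are assumptions made for illustration.

```python
# Hypothetical request body: match queries and tie_breaker 0.4 are assumptions.
dis_max_body = {
    "query": {
        "dis_max": {
            "queries": [
                {"match": {"title": "spring"}},
                {"match": {"tags": "integration"}},
            ],
            "tie_breaker": 0.4,
        }
    }
}


def dis_max_score(subquery_scores, tie_breaker=0.0):
    """Best subquery score plus tie_breaker times the other matching scores."""
    if not subquery_scores:
        raise ValueError("at least one subquery must match")
    best = max(subquery_scores)
    others = sum(subquery_scores) - best
    return best + tie_breaker * others


# Made-up scores for a document that matches one or both subqueries.
only_tags = [1.0]
both_fields = [0.5, 1.0]

print(dis_max_score(only_tags, tie_breaker=0.4))
print(dis_max_score(both_fields, tie_breaker=0.0))
print(dis_max_score(both_fields, tie_breaker=0.4))
```

A document that matches a single subquery keeps its score whatever the tie_breaker is, while a document that matches both gains tie_breaker times the lower score.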

IV. Function Score Query

function_score lets us modify the score of the documents retrieved by a query.

To use function_score, we define a query, then add one or more functions:
– one function:

The example above uses just one function: random_score, with boost_mode set to sum so that the random value is added to the boosted query score. Each document in the result therefore has score = random_value[0..1] + 3.
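
A request body consistent with this description might look like the sketch below. It is a reconstruction under assumptions: the match_all inner query and the placement of boost: 3 are inferred from the formula in the text, and simulated_score is a hypothetical helper that only mimics the arithmetic.

```python
import random

# Reconstructed sketch: match_all and boost 3 are inferred from the text.
function_score_body = {
    "query": {
        "function_score": {
            "query": {"match_all": {}},
            "boost": 3,
            "random_score": {},
            # sum: the query score and the function score are added.
            "boost_mode": "sum",
        }
    }
}


def simulated_score(query_score=3.0):
    """Mimic boost_mode sum: query score plus a random value in [0, 1)."""
    return query_score + random.random()


scores = [simulated_score() for _ in range(5)]
print(scores)
```

Each simulated score falls between 3 and 4, which matches the range given in the text.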

– combine several functions:

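The example above:
– gets all documents with a match_all query (some of them are later excluded by the min_score parameter) and boosts them by 2.

– uses 2 functions:
+ random_score with a weight, applied to documents matching the first filter
+ a weight, applied to documents matching the second filter
(the other supported functions are script_score, field_value_factor and the decay functions gauss, linear and exp)

score_mode: sum means the new score is the sum of the scores of the given functions. The other score modes are multiply (the default), avg, first, max and min.

boost_mode: sum means the query score (match_all with boost) and the function score (after applying score_mode) are added. The other boost modes are multiply (the default), replace, avg, max and min.

max_boost restricts the new score so that it does not exceed the specified value. It defaults to FLT_MAX.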
In the example (score_mode is sum):

=> new_score can range from 0 to 5, so max_boost = 4 limits it to 4.

So the maximum score of any document cannot exceed 6, because boost_mode is sum:
max_score = boost + max_boost = 2 + 4 = 6
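
The arithmetic above can be checked with a short Python sketch. It is a simplified model rather than Elasticsearch's implementation, and the individual function scores (2.0 and 3.0) are assumptions chosen so that their sum reaches the upper bound of 5 mentioned in the text.

```python
def combined_score(query_score, function_scores, max_boost):
    """Model score_mode sum, then the max_boost cap, then boost_mode sum."""
    new_score = sum(function_scores)         # score_mode: sum
    new_score = min(new_score, max_boost)    # max_boost caps the function score
    return query_score + new_score           # boost_mode: sum


# Assumed values: query boost 2, two function scores that add up to 5.
print(combined_score(2.0, [2.0, 3.0], max_boost=4.0))
print(combined_score(2.0, [0.5, 3.0], max_boost=4.0))
```

In the first call the summed function score of 5 is capped at 4, giving the maximum of 6. In the second call the sum of 3.5 is below the cap, so the result is 5.5.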

min_score: excludes documents that do not reach a minimum score.
Modifying the score does not by itself change which documents match, so min_score = 3 is used to exclude every document whose final score (after all the calculations above) is lower than 3.

*Note: For min_score to work, all documents returned by the query need to be scored.
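
As a rough illustration of min_score, the filter below keeps only the documents whose final score reaches the threshold. The document ids and scores are made up for the example.

```python
def apply_min_score(scored_docs, min_score):
    """Keep the documents whose final score is at least min_score."""
    return {
        doc_id: score
        for doc_id, score in scored_docs.items()
        if score >= min_score
    }


# Hypothetical final scores, after score_mode, max_boost and boost_mode.
final_scores = {"doc1": 2.0, "doc2": 3.0, "doc3": 5.5}

print(apply_min_score(final_scores, min_score=3))
```

Here doc1 is dropped because its final score of 2.0 is lower than 3, while doc2 and doc3 are kept.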

V. Boosting Query

The boosting query helps us:
– return documents which match a positive query
– reduce the score of documents which also match a negative query

For example, we query documents whose title contains “firebase”, and halve their score (multiply it by 0.5) if the title also contains “example”:


Response:

If we change the "negative_boost" value to 1:
– the first document’s score does not change.
– the second document’s score doubles (from 0.33373752 to 0.66747504): without the reduction, it gets its real score.
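
The boosting query described in this section can be sketched as follows. The use of match queries is an assumption, and boosted_score is a hypothetical helper that only models the multiplication applied to documents matching the negative query.

```python
# Sketch of the request body: match queries are an assumption, the fields,
# terms and negative_boost follow the text above.
boosting_body = {
    "query": {
        "boosting": {
            "positive": {"match": {"title": "firebase"}},
            "negative": {"match": {"title": "example"}},
            "negative_boost": 0.5,
        }
    }
}


def boosted_score(positive_score, matches_negative, negative_boost):
    """Multiply the score by negative_boost only if the negative query matches."""
    if matches_negative:
        return positive_score * negative_boost
    return positive_score


# The second document from the text: its unreduced score is 0.66747504.
print(boosted_score(0.66747504, matches_negative=True, negative_boost=0.5))
print(boosted_score(0.66747504, matches_negative=True, negative_boost=1))
```

With negative_boost = 0.5 the second document's score is halved, and with negative_boost = 1 it keeps its unreduced score, as in the two responses discussed above.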

By grokonez | November 22, 2017.

