Custom Scoring & Relevance
Custom Scoring
You can use should and must clauses for boosting and ranking results. They can be used to implement a custom scoring function.
should clause
The should operator lets you specify optional clauses. Entities that match these clauses appear higher in the search results.
For example: to search for all employees currently employed at Google but boost the ones employed in leadership roles, you can use this query:
type:Person employments.{isCurrent:true employer.name:"Google" should:categories.name:"leadership"}
The should:categories.name:"leadership" clause does an optional match on employments.categories.name and boosts the results that match.
You can specify multiple should clauses. For example: to search for all employees currently employed at Google but boost the ones employed in leadership roles and located in the United States, you can use this query:
When using multiple should clauses, you can specify different weights for clauses using the should[<weight>] syntax, where weight is a number between 1 and 100.
For example, to give the categories field double the weight, you can use this query:
The clause should[100]:categories.name has a weight of 100 and the clause should[50]:location.country.name has a weight of 50.
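The clause syntax described above can be assembled programmatically. The following is a minimal Python sketch, not part of any official client: the helper names should_clause and nested_block are hypothetical, and the sketch assumes values contain no double quotes that would need escaping. It only builds query strings and does not send them to the search API.

```python
def should_clause(field, value, weight=None):
    """Format one optional (boosting) clause, e.g. should[100]:field:"value"."""
    if weight is None:
        prefix = "should"
    else:
        # The documentation gives the valid weight range as 1 to 100.
        if not 1 <= weight <= 100:
            raise ValueError("weight must be between 1 and 100")
        prefix = f"should[{weight}]"
    return f'{prefix}:{field}:"{value}"'


def nested_block(field, clauses):
    """Wrap space-separated clauses in the field.{...} form."""
    return field + ".{" + " ".join(clauses) + "}"


# Rebuild the unweighted leadership query from the first example.
query = "type:Person " + nested_block("employments", [
    "isCurrent:true",
    'employer.name:"Google"',
    should_clause("categories.name", "leadership"),
])
print(query)

# Weighted clauses: categories counts twice as much as country.
print(should_clause("categories.name", "leadership", weight=100))
print(should_clause("location.country.name", "United States of America", weight=50))
```

Validating the weight before building the string surfaces an out-of-range value locally instead of relying on the server to reject the query.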
When using a should clause, you can also name the clause to find out which clauses matched a result. The syntax for naming a should clause is should[weight,clause_name].
For example, the following query names the following clauses:
The should[100,"isLeader"]:categories.name:"leadership" clause is named isLeader.
The should[50,"isUS"]:location.country.name:"United States of America" clause is named isUS.
The JSON response lists the matched clauses in the data[].entity_ctx.matched_clauses element. In this response, both the isLeader and isUS named clauses matched:
"data": [
  {
    "score": 1,
    "entity": {...},
    "entity_ctx": {
      "matched_clauses": [
        "isLeader",
        "isUS"
      ]
    }
  }
]
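As a rough illustration of reading the matched clause names back out of a response, here is a small Python sketch. The sample payload mirrors the excerpt above, with an empty object standing in for the elided entity body, and the helper name matched_clauses is hypothetical rather than part of any client library.

```python
import json

# Sample response shaped like the documented excerpt. The "entity" body is
# elided in the documentation, so an empty object stands in for it here.
response_text = """
{
  "data": [
    {
      "score": 1,
      "entity": {},
      "entity_ctx": {"matched_clauses": ["isLeader", "isUS"]}
    }
  ]
}
"""


def matched_clauses(result):
    """Return the named clauses that matched one result (empty list if none)."""
    return result.get("entity_ctx", {}).get("matched_clauses", [])


response = json.loads(response_text)
for result in response["data"]:
    print(result["score"], matched_clauses(result))
# prints: 1 ['isLeader', 'isUS']
```

Using .get with defaults keeps the helper safe for results that carry no entity_ctx or no matched clauses.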
must clause
You can also specify a weight for non-optional clauses using the must keyword.
For example, to search for AI companies with optional Series A investment, but giving more weight to AI companies:
Relevance
inner_hits
When matching multi-valued fields (nested fields) such as employments, locations, etc., the response message contains the data[].entity_ctx.inner_hits element, which gives the zero-based indexes of the employments, locations, etc. that matched the query. If multiple records match the query, they are sorted in order of relevance (generally, more matching clauses indicates higher relevance). For example, this response indicates that the second (index=1) and third (index=2) employments match the query:
"data": [
  {
    "score": 1,
    "entity": {...},
    "entity_ctx": {
      "inner_hits": {
        "employments": [
          1,
          2
        ]
      }
    }
  }
]
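One way to use these indexes is to look up the matching nested records in the entity itself. The Python sketch below assumes that the entity stores the nested list under the same key that appears in inner_hits (here, employments). The helper name matched_records and the sample employment entries, including the title field, are made up for illustration.

```python
def matched_records(result, field):
    """Return the nested records of `field` that matched, in the order listed."""
    indices = result.get("entity_ctx", {}).get("inner_hits", {}).get(field, [])
    records = result.get("entity", {}).get(field, [])
    # Indexes are zero-based; skip any that fall outside the records list.
    return [records[i] for i in indices if 0 <= i < len(records)]


# Hypothetical result: the employment entries are invented sample data.
result = {
    "score": 1,
    "entity": {
        "employments": [
            {"title": "Intern"},
            {"title": "Engineering Manager"},
            {"title": "Director"},
        ]
    },
    "entity_ctx": {"inner_hits": {"employments": [1, 2]}},
}

print(matched_records(result, "employments"))
# prints: [{'title': 'Engineering Manager'}, {'title': 'Director'}]
```

Because the indexes arrive already sorted by relevance, the helper preserves their order instead of re-sorting them.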