Faceted queries let you gain insight into large data sets by summarizing the values of a field across all entities returned by the primary query. For instance, a faceted query could list the most widely held skills among a company's employees, along with the number of employees possessing each skill. A single facet query may return up to 1000 results.
To perform a faceted query, you first write your primary DQL query describing the entities you want to analyze, then add the facet keyword and specify the field you want summary data about.
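As a rough sketch of that composition step, the helper below appends a facet clause to a primary query string. The `facet:<field>` spelling and the sample primary query `type:Organization` are assumptions made for illustration and should be checked against the DQL reference; `with_facet` is a hypothetical helper, not part of any client library.

```python
def with_facet(primary_query: str, field: str) -> str:
    """Append a facet clause to a primary DQL query string.

    Assumption: the facet clause is written `facet:<field>` and is
    separated from the primary query by a single space.
    """
    primary = primary_query.strip()
    facet_field = field.strip()
    if not primary:
        raise ValueError("a primary query is required before faceting")
    if not facet_field:
        raise ValueError("a facet field is required")
    return f"{primary} facet:{facet_field}"


if __name__ == "__main__":
    # Hypothetical primary query; the field name is one mentioned on this page.
    print(with_facet("type:Organization", "sicClassification"))
```

Keeping the primary query and the facet field as separate arguments makes it easy to reuse one primary query with several different facets.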
We can use facet queries to view the entire list of values for some field, across an entire result set.
For example, to view a list of all sicClassification codes and their distribution across all companies in the KG, you can query:
Or, to view a list of all industry tags associated with any organization in the KG, you can query:
If you want to narrow your results to only show industry tags applied to organizations located in San Francisco that have 5000 or more employees and have a SIC classification, you can add those criteria to limit the primary result set:
To view a breakdown of how SIC classification codes are distributed among corporations that had over $100,000,000 in revenue for the 2019 fiscal year, we can query:
Facet queries return values in descending order of the number of matching entities, so the most frequent values for a field appear first and can be read at a glance.
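Conceptually, this ordering is a frequency count over the faceted field. The sketch below reproduces it locally on made-up records; the sample data, the `skills` key, and the `facet_counts` helper are illustrative assumptions and say nothing about how the service computes facets internally.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, Mapping, Sequence


def facet_counts(entities: Iterable[Mapping[str, Sequence[str]]],
                 field: str, limit: int = 1000) -> list[tuple[str, int]]:
    """Count entities per distinct value of `field`, most frequent first."""
    counts: Counter[str] = Counter()
    for entity in entities:
        # Use a set so an entity counts once per value even if it repeats.
        counts.update(set(entity.get(field, ())))
    # most_common sorts by descending count; cap the list at `limit` entries.
    return counts.most_common(limit)


if __name__ == "__main__":
    # Made-up records standing in for entities matched by a primary query.
    employees = [
        {"skills": ["Java", "Python"]},
        {"skills": ["Java"]},
        {"skills": ["Java", "Python", "Go"]},
    ]
    print(facet_counts(employees, "skills"))
```

The default `limit` of 1000 mirrors the result cap stated at the top of this page.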
For example, to find the most common universities attended by Facebook employees who also have skills in Java, you can query:
Or, to list the most common skills held by Facebook employees who studied at Carnegie Mellon:
When faceting on a numeric field, instead of displaying each possible value of that field, the facet will be performed against a series of smaller ranges that encompass the entire range of values for that field.
For example, to view how many companies in the Computer Hardware industry fall into each range of company size, measured by maximum number of employees, you can query:
The query then performs the facet against multiple ranges of numerical values, with those ranges encompassing the entire range of possible values for the field we are faceting against. In the results for the above query, you can see the number of companies having between 30 and 40 employees, or between 1,000 and 5,000 employees, for example.
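To make the range behaviour concrete, here is a local sketch of bucketing numeric values into consecutive ranges. The range edges, the half-open `[low, high)` convention, and the sample employee counts are assumptions chosen for illustration; the ranges actually returned are chosen by the service.

```python
from bisect import bisect_right


def range_counts(values, edges):
    """Count values per half-open range [edges[i], edges[i + 1])."""
    edges = list(edges)
    if len(edges) < 2 or sorted(edges) != edges:
        raise ValueError("edges must be ascending with at least two entries")
    counts = [0] * (len(edges) - 1)
    for value in values:
        # Values outside the overall span are ignored in this sketch.
        if edges[0] <= value < edges[-1]:
            counts[bisect_right(edges, value) - 1] += 1
    return [((edges[i], edges[i + 1]), counts[i]) for i in range(len(counts))]


if __name__ == "__main__":
    # Made-up employee counts for a handful of companies.
    sizes = [35, 38, 50, 1200, 4999, 5000]
    for (low, high), n in range_counts(sizes, [30, 40, 1000, 5000]):
        print(f"{low}-{high}: {n}")
```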
When faceting on a date field, we can define the size of the time intervals over which aggregated facet values are returned, by adding an interval to the facet.
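The effect of a weekly interval can be sketched locally as grouping dates by the week they fall in. The Monday week start and the sample dates below are assumptions for illustration; the service may align its intervals differently.

```python
from collections import Counter
from datetime import date, timedelta


def weekly_counts(dates):
    """Count dates per week, keyed by the Monday that starts each week."""
    # date.weekday() is 0 for Monday, so subtracting it snaps to week start.
    counts = Counter(d - timedelta(days=d.weekday()) for d in dates)
    return sorted(counts.items())


if __name__ == "__main__":
    # Made-up publication dates.
    published = [date(2019, 1, 1), date(2019, 1, 6), date(2019, 1, 7)]
    for week_start, n in weekly_counts(published):
        print(week_start.isoformat(), n)
```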
For example, this query would return a breakdown of the number of articles related to "hacking" published over any particular week:
When faceting on a numeric field, we can also enumerate specific intervals over which we would like to view aggregated values.
For example, to view the number of organizations in the Computer Hardware industry that have between 100 and 200 employees, or that have between 200 and 500 employees, you can query:
We can also use these enumerated numeric intervals to facet by arbitrary date ranges, by expressing the interval boundaries in epoch (Unix) time.
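Converting calendar boundaries to Unix time is a small computation. The sketch below assumes boundaries at midnight UTC and a value in whole seconds; whether DQL expects seconds or milliseconds is not stated here, so treat the unit as an assumption to verify (multiply by 1000 for milliseconds). `year_start_epoch` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def year_start_epoch(year: int) -> int:
    """Unix time in seconds for midnight UTC on 1 January of `year`."""
    return int(datetime(year, 1, 1, tzinfo=timezone.utc).timestamp())


if __name__ == "__main__":
    start, end = year_start_epoch(2018), year_start_epoch(2019)
    print(start, end)  # 1514764800 1546300800
```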
For example, to view the number of articles related to "hacking" that were published between 2018 and 2019, we could query:
We can also enumerate the specific values which we would like our facet results to be grouped into. The facet will be grouped by these values only.
For example, to view a breakdown of the number of organizations located in either San Francisco or Los Angeles, we could query:
Successful facet queries will return a data object at the top level of the JSON response. The resulting field values are returned in order of the number of matching entities.
Each facet will contain the following fields:
count (integer) - Number of documents in the overall search query matching this facet value or range.
value (string) - Value of the specific facet.
callbackQuery (string) - The query to return entities matching the value for this facet.
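A minimal sketch of reading these fields in Python follows. It takes the list of facet entries as input and makes no assumption about the key under which that list appears in the response; the `FacetEntry` class, the `parse_facet_entries` helper, and the sample entries (including the placeholder callback queries) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FacetEntry:
    count: int           # number of matching documents
    value: str           # facet value (or range) this entry describes
    callback_query: str  # query that returns the matching entities


def parse_facet_entries(raw_entries):
    """Convert raw facet dicts into FacetEntry objects, highest count first."""
    entries = [
        FacetEntry(int(item["count"]), str(item["value"]), str(item["callbackQuery"]))
        for item in raw_entries
    ]
    # The service already orders by count; sorting here keeps that guarantee
    # even if the input list was reordered along the way.
    return sorted(entries, key=lambda entry: entry.count, reverse=True)


if __name__ == "__main__":
    # Made-up entries; the callbackQuery strings are placeholders.
    raw = [
        {"count": 12, "value": "Los Angeles", "callbackQuery": "<query for Los Angeles>"},
        {"count": 30, "value": "San Francisco", "callbackQuery": "<query for San Francisco>"},
    ]
    for entry in parse_facet_entries(raw):
        print(entry.value, entry.count)
```

Each entry's `callback_query` can then be submitted as a new primary query to retrieve the entities behind that count.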