Entity ID and diffbotUri
A unique identifier for an entity in the Knowledge Graph
Each entity in the Diffbot Knowledge Graph is identified by an Entity ID, a unique alphanumeric string like EYX1i02YVPsuT7fPLUYgRhQ.
To access an entity directly in the Diffbot Knowledge Graph, you need the entity's diffbotUri.
The diffbotUri is the URL prefix https://www.diffbot.com/entity/ followed by the entity ID.
For example, Diffbot's entity ID is EYX1i02YVPsuT7fPLUYgRhQ and its diffbotUri is https://www.diffbot.com/entity/EYX1i02YVPsuT7fPLUYgRhQ.
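As a minimal sketch of the relationship described above, the following Python helpers build a diffbotUri from an entity ID and recover the ID from a URI. The function names and the constant are hypothetical and are not part of any Diffbot client library.

```python
# URL prefix taken from the text above; the constant name is our own.
DIFFBOT_ENTITY_PREFIX = "https://www.diffbot.com/entity/"


def diffbot_uri(entity_id: str) -> str:
    """Join the URL prefix and an entity ID into a diffbotUri."""
    return DIFFBOT_ENTITY_PREFIX + entity_id


def entity_id_from_uri(uri: str) -> str:
    """Recover the entity ID from a diffbotUri (hypothetical helper)."""
    if not uri.startswith(DIFFBOT_ENTITY_PREFIX):
        raise ValueError(f"not a diffbotUri: {uri!r}")
    return uri[len(DIFFBOT_ENTITY_PREFIX):]


print(diffbot_uri("EYX1i02YVPsuT7fPLUYgRhQ"))
```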
allOriginHashes
While the Entity ID is the primary ID of an entity, an entity can have multiple secondary IDs. These are captured in the allOriginHashes field. Unlike the entity ID, the IDs in allOriginHashes are not prefixed with E.
Example
"allOriginHashes": [
  "oGpT1BoMP76FMP3Rs42qDw",
  "-EwZZqt5PiiIuqGTfnijTw",
  "g_oCv3ilO0mQwxf7_XUjLQ",
  "vaU1IZacPS-dfuVgLo8Qrg",
  "cULALJTPNruUuXptqBRlyw",
  "zhucRbSUNzKSg4SFOPulJw",
  "XKUPRt-sNAOvPi2b9wj4kg",
  "Q4QZ7bTCNCyDVMP-h867Zw",
  "MTgMVRcgPuOc6LMFP6f3QQ",
  "y4k9dud9NXCabK9rjyEAiA",
  "ptr_SmZDMIqTFWd6qja_YA",
  "DTvi0UEgPS2aX_mIoJavRg",
  "tR45XKHsN5ampBO9Gkqq3A",
  "Bc2nFg0qMvmvQQ4rao2j5A",
  "oaeeIFWAPMecjkTqdsOvyA",
  "pqtp1KNtO6ySQppbhyliNA",
  "og2Glq0tOGCpnrOCvfLG5Q"
]
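Because the values in allOriginHashes omit the E prefix, comparing them with entity IDs needs a small conversion step. The sketch below shows one way to do that in Python. The helper names are hypothetical, and it assumes that prepending E to an origin hash yields the corresponding entity ID, which is what the examples on this page suggest.

```python
def to_origin_hash(entity_id: str) -> str:
    """Strip the leading 'E' from an entity ID to get its origin-hash form."""
    if not entity_id.startswith("E"):
        raise ValueError(f"entity IDs are expected to start with 'E': {entity_id!r}")
    return entity_id[1:]


def to_entity_id(origin_hash: str) -> str:
    """Prefix an origin hash with 'E' (assumed to give the entity-ID form)."""
    return "E" + origin_hash


def known_entity_ids(entity: dict) -> set:
    """Return every ID in allOriginHashes, converted to entity-ID form."""
    return {to_entity_id(h) for h in entity.get("allOriginHashes", [])}


# Trimmed sample entity; only two of the hashes listed above are kept.
entity = {
    "id": "EBc2nFg0qMvmvQQ4rao2j5A",
    "allOriginHashes": ["vaU1IZacPS-dfuVgLo8Qrg", "Bc2nFg0qMvmvQQ4rao2j5A"],
}
print(sorted(known_entity_ids(entity)))
```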
Stability of an Entity ID
Organization entity IDs rarely change, but when they do, it is most often due to one of the following factors:
- An undermerged entity has been merged into an existing entity, whose ID becomes the entity's new primary identifier
- An overmerged entity is split apart and the new entity created is assigned a new primary identifier
- The company has been acquired, and merged or dissolved as a standalone entity
- The company went bankrupt and eventually was dissolved
When you search the graph using an entity ID but get a different ID back in the JSON response, you have been redirected from a prior entity ID to the current one.
When the ID of an entity changes in the graph, we internally redirect the old ID to the current ID. This redirection is transparent to KG users during the match process, i.e. both the old ID and the current ID will work seamlessly if accessed via DQL or Enhance. However, requests using an old ID will return a different entity ID than the one requested, since the KG endpoints and JSON outputs always return the current ID in the id field, along with all IDs ever affiliated with the entity under allOriginHashes.
We have worked hard to prevent unnecessary changes to the main ID, but it is impossible to avoid them altogether. Unfortunately, this behavior will never go away. Every dataset of significant size has changes that create these redirects.
If you are doing any id equality operations in your systems, we advise watching for these redirects and keeping the most recent id in your database. Each time you see a mismatch between the ID in the query and the ID in the JSON, update your system to use the ID in the JSON.
However, if you are doing a one-time data pull from Diffbot using IDs, there's no need to do anything. Diffbot resolves the redirects automatically.
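The advice above about keeping the most recent id can be sketched as a small update routine. This is an illustration only: the store is a plain Python dict keyed by entity ID, and the function name and the merge policy (keep the record already stored under the current ID) are assumptions, not Diffbot recommendations.

```python
def refresh_stored_id(store: dict, requested_id: str, returned_id: str) -> bool:
    """Re-key a stored record when the returned ID differs from the requested one.

    Returns True if a redirect was detected, False otherwise.
    """
    if requested_id == returned_id:
        return False  # IDs match, nothing to update
    record = store.pop(requested_id, None)
    if record is not None:
        # If a record already exists under the current ID, keep it as-is.
        store.setdefault(returned_id, record)
    return True


# Hypothetical local store holding one record under an old ID.
store = {"EvaU1IZacPS-dfuVgLo8Qrg": {"name": "7-Eleven"}}
redirected = refresh_stored_id(
    store, "EvaU1IZacPS-dfuVgLo8Qrg", "EBc2nFg0qMvmvQQ4rao2j5A"
)
print(redirected, list(store))
```

In a real system the same check would run wherever a stored ID is compared with the id field of a response.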
Example of Entity ID migration for 7-Eleven
If you look up 7-Eleven using the identifier EvaU1IZacPS-dfuVgLo8Qrg, the JSON response returns a different ID: EBc2nFg0qMvmvQQ4rao2j5A. However, vaU1IZacPS-dfuVgLo8Qrg (the original ID EvaU1IZacPS-dfuVgLo8Qrg without the E prefix) is still listed in allOriginHashes.
API Call
https://kg.diffbot.com/kg/v3/dql?token=<DIFFBOT-TOKEN>&query=id:EvaU1IZacPS-dfuVgLo8Qrg&filter=$.name;$.id;$.allOriginHashes
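The same request URL can be assembled programmatically. The sketch below uses only Python's standard library and reproduces the query and filter parameters from the call above. The function name is hypothetical and the token is a placeholder you must supply.

```python
from urllib.parse import urlencode

# Endpoint copied from the API call above.
DQL_ENDPOINT = "https://kg.diffbot.com/kg/v3/dql"


def build_dql_lookup_url(token: str, entity_id: str) -> str:
    """Build a DQL URL that looks up one entity by ID and returns three fields."""
    params = {
        "token": token,
        "query": f"id:{entity_id}",
        "filter": "$.name;$.id;$.allOriginHashes",
    }
    # Leave ':', '$' and ';' unescaped so the URL matches the form shown above.
    return f"{DQL_ENDPOINT}?{urlencode(params, safe=':$;')}"


print(build_dql_lookup_url("<DIFFBOT-TOKEN>", "EvaU1IZacPS-dfuVgLo8Qrg"))
```

The resulting URL can then be fetched with any HTTP client, for example urllib.request.urlopen.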
Output
{
  "version": 3,
  "hits": 1,
  "results": 1,
  "kgversion": "281",
  "diffbot_type": "entity",
  "facet": false,
  "textFallback": false,
  "data": [
    {
      "entity": {
        "id": "EBc2nFg0qMvmvQQ4rao2j5A",
        "name": "7-Eleven",
        "allOriginHashes": [
          "oGpT1BoMP76FMP3Rs42qDw",
          "-EwZZqt5PiiIuqGTfnijTw",
          "g_oCv3ilO0mQwxf7_XUjLQ",
          "vaU1IZacPS-dfuVgLo8Qrg",
          "cULALJTPNruUuXptqBRlyw",
          "zhucRbSUNzKSg4SFOPulJw",
          "XKUPRt-sNAOvPi2b9wj4kg",
          "Q4QZ7bTCNCyDVMP-h867Zw",
          "MTgMVRcgPuOc6LMFP6f3QQ",
          "y4k9dud9NXCabK9rjyEAiA",
          "ptr_SmZDMIqTFWd6qja_YA",
          "DTvi0UEgPS2aX_mIoJavRg",
          "tR45XKHsN5ampBO9Gkqq3A",
          "Bc2nFg0qMvmvQQ4rao2j5A",
          "oaeeIFWAPMecjkTqdsOvyA",
          "pqtp1KNtO6ySQppbhyliNA",
          "og2Glq0tOGCpnrOCvfLG5Q"
        ]
      }
    }
  ]
}
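To illustrate how a client might read this output, the sketch below extracts the current ID from a response of the shape shown above and reports whether the request was redirected. The function name is hypothetical, the sample response is trimmed to the fields it uses, and it assumes the first item in data is the entity of interest.

```python
def detect_redirect(requested_id: str, response: dict):
    """Return (current_id, was_redirected) for the first entity in a DQL response."""
    data = response.get("data", [])
    if not data:
        return None, False  # no match, so nothing to compare
    current_id = data[0]["entity"]["id"]
    return current_id, current_id != requested_id


# Trimmed copy of the output above.
sample_response = {
    "data": [
        {
            "entity": {
                "id": "EBc2nFg0qMvmvQQ4rao2j5A",
                "name": "7-Eleven",
                "allOriginHashes": ["vaU1IZacPS-dfuVgLo8Qrg", "Bc2nFg0qMvmvQQ4rao2j5A"],
            }
        }
    ]
}

print(detect_redirect("EvaU1IZacPS-dfuVgLo8Qrg", sample_response))
```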