Recent Updates to the KnowledgeGraph and DQL
by Kris Negulescu
Added Organization.suppliers to the graph, e.g. type:Organization suppliers.name:"Diffbot"
Added support for non-software technologies such as:
- type:Organization technographics.technology.name:"3D food printing" and
- type:Technology technologyCategories:"Manufacturing"
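The DQL queries above can be sent programmatically. The following is a minimal sketch of building a request URL for such a query; the endpoint path and the parameter names (type, token, query, size) are assumptions not stated in this changelog and should be checked against docs.diffbot.com, and YOUR_TOKEN is a placeholder.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed DQL search endpoint; confirm against docs.diffbot.com before use.
DQL_ENDPOINT = "https://kg.diffbot.com/kg/v3/dql"


def build_dql_url(query: str, token: str, size: int = 10) -> str:
    """Return a GET URL for a DQL query (parameter names are assumptions)."""
    params = {"type": "query", "token": token, "query": query, "size": size}
    # urlencode percent-escapes the colons, quotes, and spaces in the query.
    return f"{DQL_ENDPOINT}?{urlencode(params)}"


url = build_dql_url('type:Organization suppliers.name:"Diffbot"', token="YOUR_TOKEN")

# Round-trip check: decoding the URL recovers the original DQL string.
print(parse_qs(urlsplit(url).query)["query"][0])
```

Keeping the DQL string separate from the URL encoding step means the same helper works for the other queries in this post, such as type:Technology technologyCategories:"Manufacturing".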
Added new organization industries to the graph including:
- Real Estate > Real Estate Investment Trusts (naics: 525930)
- Financial Services > Banks > Central Banks (naics: 521110)
- Financial Services > Credit Unions (naics: 522130)
- Financial Services > Money Exchange Providers (includes foreign exchange companies and online remittance providers)
- Services > Laundry Companies (naics: 812320)
- Food > Dairy Companies (naics: 112120)
- Food > Cocoa Companies (naics: 311351)
- Medical Organizations > Cannabis Companies
- Construction Companies > Landscaping Services (naics: 561730)
- Environmental Organizations > Recyclable Material Companies (naics: 423930) (we separated Waste Organizations And Recycling Facilities into "Waste Organizations" and "Recyclable Material Companies")
- Consumer Service Companies > Parking Companies (naics: 812930)
- Hospitality Companies > Adult Entertainment Clubs
- Retailers > Vending Machine Operators (naics: 454210)
- Retailers > Used Merchandise Retailers (naics: 459510)
- Educational Organizations > Educational Institutions > K12 Schools (to check the differences with Schools and clean in case)
Organization Data Additions
by Kris Negulescu
Added Place.headOfPlace and corresponding employment
Heads of state and heads of government
Added Organization.employeeCategories
Organizations with a particular number of people in a department (e.g., sales)
Coverage Reports Now Available for Bulk Enhance
by Kris Negulescu
Coverage Reports for Bulk Enhance Jobs (CSV)
The Coverage Report is a detailed per-entity summary of attribute coverage, and it also includes the percentage coverage per field across the dataset as a whole. The report is downloadable as a CSV from the Report API: https://kg.diffbot.com/kg/v3/enhance/bulk//. The report information is also included in the Enhance Bulk job Status API response: https://kg.diffbot.com/kg/v3/enhance/bulk//status, and can be specified for inclusion with the parameters filter or exportspec.
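To make "% coverage per field" concrete, here is a small sketch of how such a figure could be computed over a set of enhanced records. It is an illustration of the metric only, not the Coverage Report's actual implementation or schema; the sample records and the field_coverage helper are hypothetical.

```python
def field_coverage(records, fields):
    """Return the percentage of records with a non-empty value per field."""
    total = len(records)
    coverage = {}
    for field in fields:
        # Missing keys, None, and empty strings/lists/dicts count as "not covered".
        filled = sum(1 for r in records if r.get(field) not in (None, "", [], {}))
        coverage[field] = 100.0 * filled / total if total else 0.0
    return coverage


# Hypothetical records; real Enhance output has many more attributes.
sample = [
    {"name": "A", "revenue": 10, "nbEmployees": 5},
    {"name": "B", "revenue": None, "nbEmployees": 7},
    {"name": "C", "nbEmployees": None},
    {"name": "D", "revenue": 3},
]

for field, pct in field_coverage(sample, ["name", "revenue", "nbEmployees"]).items():
    print(f"{field}: {pct:.0f}%")
```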
Organization.revenue model - improved quality & coverage
by Kris Negulescu
We improved the Organization.revenue estimation model and extended coverage to 95%+ of all organization profiles in the KG. Search type:Organization has:revenue to explore all of them, or narrow to corporations with type:Corporation has:revenue.
3x More Non-Profit Organizations Identified in the KG
by Kris Negulescu
We added a new ML model for Organization.isNonProfit, increasing the accuracy and coverage for non-profit organizations in the KG.
Test Our APIs Without Leaving Diffbot Docs
by Kris Negulescu
docs.diffbot.com got a HUGE makeover recently. We've migrated over 250 pages from 3 separate documentation sites and organized multiple API specifications into the same view. Our favorite new feature is the ability to test APIs directly in the docs. Enter some input parameters and submit to get a response immediately. Then copy the request code into your application. Try it out with the DQL Search API.
We've also added some brand new pages on frequently asked topics:
• A complete Diffbot product overview.
• An explainer on how Credits work at Diffbot.
• An authentication section with its own endpoint.
• A subscribable changelog integrated into the docs.
Optimized and expanded industry classifications
by Kris Negulescu
We've added a new Organization attribute "diffbotClassification" that expands industry classification to three or more levels, e.g.
"diffbotClassification": [
  {
    "level": 3,
    "isPrimary": true,
    "name": "Display Technology Companies",
    "diffbotUri": "https://diffbot.com/entity/IC_qvY0Oloiyj"
  },
  {
    "level": 2,
    "isPrimary": true,
    "name": "Computer Hardware Companies",
    "diffbotUri": "https://diffbot.com/entity/IC_D6llNR8xOo"
  },
  {
    "level": 1,
    "isPrimary": true,
    "name": "Software Companies",
    "diffbotUri": "https://diffbot.com/entity/IC_H04NbzO6L8"
  },
  ...
]
For NAICS and SIC classifications, we nominate a "primary" classification tier in the array of matching classification codes/labels. Documentation of these fields can be found in the Ontology section of docs.diffbot.com.
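As a rough illustration of consuming the diffbotClassification array, the sketch below keeps the entries flagged isPrimary and orders them from level 1 upward. The data is copied from the excerpt above (with the elided entries left out); the primary_path helper is hypothetical and not part of any Diffbot client.

```python
# Entries copied from the excerpt above; elided entries are omitted.
classification = [
    {"level": 3, "isPrimary": True, "name": "Display Technology Companies",
     "diffbotUri": "https://diffbot.com/entity/IC_qvY0Oloiyj"},
    {"level": 2, "isPrimary": True, "name": "Computer Hardware Companies",
     "diffbotUri": "https://diffbot.com/entity/IC_D6llNR8xOo"},
    {"level": 1, "isPrimary": True, "name": "Software Companies",
     "diffbotUri": "https://diffbot.com/entity/IC_H04NbzO6L8"},
]


def primary_path(entries):
    """Return names of primary classifications ordered by ascending level."""
    primary = [e for e in entries if e.get("isPrimary")]
    return [e["name"] for e in sorted(primary, key=lambda e: e["level"])]


print(" > ".join(primary_path(classification)))
```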
Added 'technographics' as an attribute of Organizations (Alpha)
by Kris Negulescu
We are slowly rolling out support for 'technographics' as an attribute of Organizations. Ultimately, there will be many sources of this data and coverage across all Organizations in the graph. To start, more than 8.5 million companies include some level of technographic data. Below is an example of how the technology is represented in the default JSON output (excerpted from the IBM entity).
"technographics": [
  {
    "technology": {
      "recordId": "EPdsrDmLiMQCskvBLp_dloQ@2275",
      "name": "React",
      "websiteUris": [
        "reactjs.org"
      ],
      "surfaceForm": "React",
      "position": "companyTechnographicsTechnology",
      "type": "DiffbotEntity"
    },
    "categories": [
      "JavaScript frameworks"
    ]
  },
  {
    "technology": {
      "recordId": "EPdsrDmLiMQCskvBLp_dloQ@2276",
      "name": "TrustArc",
      "websiteUris": [
        "trustarc.com"
      ],
      "surfaceForm": "TrustArc",
      "position": "companyTechnographicsTechnology",
      "type": "DiffbotEntity"
    },
    "categories": [
      "Cookie compliance"
    ]
  },
  ...
]
To browse organizations with technographic data, try this query:
type:Organization has:technographics.technology.name
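Once an entity with technographics has been retrieved, the array can be summarized in a few lines. The sketch below groups technology names by category using a trimmed copy of the IBM excerpt above; the technologies_by_category helper is hypothetical, and only the fields it reads are kept in the sample.

```python
# Trimmed copy of the excerpt above; only the fields used here are kept.
technographics = [
    {"technology": {"name": "React", "websiteUris": ["reactjs.org"]},
     "categories": ["JavaScript frameworks"]},
    {"technology": {"name": "TrustArc", "websiteUris": ["trustarc.com"]},
     "categories": ["Cookie compliance"]},
]


def technologies_by_category(entries):
    """Group technology names under each category they are listed in."""
    grouped = {}
    for entry in entries:
        name = entry.get("technology", {}).get("name")
        if not name:
            continue  # skip entries without a resolvable technology name
        for category in entry.get("categories", []):
            grouped.setdefault(category, []).append(name)
    return grouped


for category, names in technologies_by_category(technographics).items():
    print(f"{category}: {', '.join(names)}")
```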
We recently improved the accuracy of the estimate populating the 'nbEmployees' attribute for publicly traded US organizations in the Knowledge Graph by expanding analysis of data from sec.gov reports. 70%+ of the SEC 10-K documents now include nbEmployees for each reporting period.
We also added 'secForms' as an attribute of Organizations. Below is a JSON excerpt from IBM featuring a filing from 2019:
"secForms": [
  {
    "formType": "8-K/A",
    "periodOfReport": {
      "str": "d2019-06-30",
      "precision": 3,
      "timestamp": 1561852800000
    },
    "filingDate": {
      "str": "d2019-09-20",
      "precision": 3,
      "timestamp": 1568937600000
    },
    "documentUrl": "https://www.sec.gov/ix?doc=/Archives/edgar/data/51143/000155837019008675/ibm-20190709x8ka.htm",
    "filingUrl": "https://www.sec.gov/Archives/edgar/data/51143/000155837019008675/0001558370-19-008675-index.htm"
  },
  {
    "formType": "10-Q",
    "periodOfReport": {
      "str": "d2019-06-30",
      "precision": 3,
      "timestamp": 1561852800000
    },
    "filingDate": {
      "str": "d2019-07-30",
      "precision": 3,
      "timestamp": 1564444800000
    },
    "documentUrl": "https://www.sec.gov/ix?doc=/Archives/edgar/data/51143/000155837019006560/ibm-20190630x10q.htm",
    "filingUrl": "https://www.sec.gov/Archives/edgar/data/51143/000155837019006560/0001558370-19-006560-index.htm"
  },
  ...
]
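The date objects in secForms carry both a string and a numeric timestamp. A minimal sketch of working with them follows, assuming the timestamp is milliseconds since the Unix epoch in UTC (this matches the values in the excerpt above, but is an inference rather than something stated here). The to_date and filings_of_type helpers are hypothetical, and the URLs are omitted from the sample for brevity.

```python
from datetime import date, datetime, timezone

# Values copied from the excerpt above; URL fields omitted for brevity.
sec_forms = [
    {"formType": "8-K/A",
     "periodOfReport": {"str": "d2019-06-30", "precision": 3, "timestamp": 1561852800000},
     "filingDate": {"str": "d2019-09-20", "precision": 3, "timestamp": 1568937600000}},
    {"formType": "10-Q",
     "periodOfReport": {"str": "d2019-06-30", "precision": 3, "timestamp": 1561852800000},
     "filingDate": {"str": "d2019-07-30", "precision": 3, "timestamp": 1564444800000}},
]


def to_date(value):
    """Convert a date object to datetime.date (assumes UTC milliseconds)."""
    return datetime.fromtimestamp(value["timestamp"] / 1000, tz=timezone.utc).date()


def filings_of_type(forms, form_type):
    """Return only the filings whose formType matches exactly."""
    return [f for f in forms if f["formType"] == form_type]


for form in filings_of_type(sec_forms, "10-Q"):
    print(form["formType"], "filed", to_date(form["filingDate"]).isoformat())
```

For these two filings the converted date agrees with the "str" field once its leading "d" is dropped, which is a quick sanity check on the UTC assumption.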