NACE Code Rev 2.1 Updates

by Kris Negulescu

KG DATA CHANGE NOTIFICATION - Organization.naceClassification

We will be updating Organization.naceClassification to NACE Rev. 2.1 in build v437 of the Diffbot Knowledge Graph, targeted to go live in about two weeks. Please read on for more details.

Ordinarily, we take extraordinary measures to avoid breaking changes in the Diffbot Knowledge Graph ontologies. However, in some cases, there is no benefit in retaining a prior version of the data, so we replace an existing attribute with a new data format. The Organization.naceClassification field is one such case. The current version of the NACE codes in the KG lacks level information, meaningful isPrimary values, and ancestor codes, and some of the codes are no longer valid in NACE Rev. 2.1.

In Rev 2.1 of the NACE codes:

  • NACE codes are no longer strictly 4-digit numbers.
  • NACE codes are structured into:
    Sections (letters A–V, level 1) →
    Divisions (2 digits, level 2) →
    Groups (3 digits with dot, level 3) →
    Classes (4 digits with dot, level 4).
  • Codes are unique within the list. For example, 28 and 29 share the same parent section C, but C is not repeated after 28 because it already appears earlier in the primary chain.
  • There is at most one primary code per level.
  • Primary codes are listed before non-primaries.
  • Specific codes (e.g., 29.10) are listed before broader ones (e.g., 29.1).
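As a rough illustration of the four levels listed above, the sketch below infers a level from the shape of a code string. The regular expressions and the helper names (nace_level, numeric_parent) are assumptions made for this example, not part of the Diffbot API or an official NACE validator.

```python
import re
from typing import Optional

# Patterns inferred from the level descriptions in this post (assumptions,
# not an official NACE Rev. 2.1 validator).
_LEVEL_PATTERNS = [
    (1, re.compile(r"[A-V]")),         # Section: a single letter A-V
    (2, re.compile(r"\d{2}")),         # Division: 2 digits
    (3, re.compile(r"\d{2}\.\d")),     # Group: 3 digits with a dot
    (4, re.compile(r"\d{2}\.\d{2}")),  # Class: 4 digits with a dot
]


def nace_level(code: str) -> int:
    """Return the level (1-4) implied by the format of a code string."""
    for level, pattern in _LEVEL_PATTERNS:
        if pattern.fullmatch(code):
            return level
    raise ValueError(f"unrecognized NACE code format: {code!r}")


def numeric_parent(code: str) -> Optional[str]:
    """Return the parent of a class or group by trimming the code.

    Returns None for sections (no parent) and for divisions, whose parent
    section letter cannot be derived from the digits alone.
    """
    level = nace_level(code)
    if level == 4:
        return code[:-1]  # "29.10" -> "29.1"
    if level == 3:
        return code[:2]   # "29.1" -> "29"
    return None
```

Mapping a division such as 29 to its section letter needs a lookup table, which is why the KG returns the ancestor codes explicitly.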

For a comparison of the existing code format versus the new Rev 2.1 format, see below.

CURRENT DATA FORMAT: NACE codes - Organization.naceClassification

Volkswagen's current NACE classification in the KG appears as follows:

[  
  {  
    "code": "2910",  
    "isPrimary": false,  
    "name": "Manufacture of motor vehicles"  
  },  
  {  
    "code": "7022",  
    "isPrimary": false,  
    "name": "Business and other management consultancy activities"  
  },  
  {  
    "code": "7021",  
    "isPrimary": false,  
    "name": "Public relations and communication activities"  
  }  
]

Issues with this data:

  • Missing level information
  • All codes are marked as non-primary
  • Parent codes are missing
  • Codes 7022 and 7021 are no longer valid in the Rev. 2.1 version of the codes
  • Volkswagen should not be classified under the industries that 7022 and 7021 describe.

NEW DATA FORMAT: NACE Rev 2.1 Codes

When the updates deploy, Volkswagen's Organization.naceClassification NACE codes will look like this:

[  
  {  
    "code": "29.10",  
    "level": 4,  
    "isPrimary": true,  
    "name": "Manufacture of motor vehicles",  
    "version": "Rev 2.1"  
  },  
  {  
    "code": "29.1",  
    "level": 3,  
    "isPrimary": true,  
    "name": "Manufacture of motor vehicles",  
    "version": "Rev 2.1"  
  },  
  {  
    "code": "29",  
    "level": 2,  
    "isPrimary": true,  
    "name": "Manufacture of motor vehicles, trailers and semi-trailers",  
    "version": "Rev 2.1"  
  },  
  {  
    "code": "C",  
    "level": 1,  
    "isPrimary": true,  
    "name": "MANUFACTURING",  
    "version": "Rev 2.1"  
  },  
  {  
    "code": "28.11",  
    "level": 4,  
    "isPrimary": false,  
    "name": "Manufacture of engines and turbines, except aircraft, vehicle and cycle engines",  
    "version": "Rev 2.1"  
  },  
  {  
    "code": "28.1",  
    "level": 3,  
    "isPrimary": false,  
    "name": "Manufacture of general-purpose machinery",  
    "version": "Rev 2.1"  
  },  
  {  
    "code": "28",  
    "level": 2,  
    "isPrimary": false,  
    "name": "Manufacture of machinery and equipment n.e.c.",  
    "version": "Rev 2.1"  
  }  
]
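A minimal sketch of how a consumer might read the new format, assuming the field names shown in the sample above (code, level, isPrimary). The helper names primary_chain and most_specific_primary are hypothetical, and the sample list is an abbreviated copy of the Volkswagen entries with name and version omitted.

```python
def primary_chain(classifications):
    """Map level -> code for every entry flagged as primary."""
    return {
        entry["level"]: entry["code"]
        for entry in classifications
        if entry.get("isPrimary")
    }


def most_specific_primary(classifications):
    """Return the primary entry with the highest level, or None if absent."""
    primaries = [e for e in classifications if e.get("isPrimary")]
    return max(primaries, key=lambda e: e["level"], default=None)


# Abbreviated copy of the Volkswagen sample (name and version omitted).
sample = [
    {"code": "29.10", "level": 4, "isPrimary": True},
    {"code": "29.1", "level": 3, "isPrimary": True},
    {"code": "29", "level": 2, "isPrimary": True},
    {"code": "C", "level": 1, "isPrimary": True},
    {"code": "28.11", "level": 4, "isPrimary": False},
    {"code": "28.1", "level": 3, "isPrimary": False},
    {"code": "28", "level": 2, "isPrimary": False},
]

chain = primary_chain(sample)
top = most_specific_primary(sample)
```

Because there is at most one primary code per level, the dictionary built by primary_chain never has to resolve a collision between two primaries at the same level.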

Diffbot on Postman

by Jerome Choo

You can now find us on Postman! We're starting with Extract API, and moving quickly to get the rest of our APIs on Postman as well.

Postman is an API testing platform that eliminates the need to manually write cURL. The API testing UI is quite similar to what we have in the docs, with even more features to set up your environment, testing scripts, and more.

Note that our primary documentation platform will continue to live on docs.diffbot.com. Postman is an extension of our docs presence to make it easier for Postman users to test Diffbot APIs on their preferred platform.

Fork and watch our Diffbot API collection on Postman!

Investment Transactions are now searchable on LeadGraph! This makes it possible to:
⁃ Stay on top of recent funding rounds
⁃ Find investors that have invested in companies with particular industries, keywords, company size, etc.
⁃ See funding insights for investors, industries, funding rounds, and more

Company Reports in the KG

by Kris Negulescu

You can now use the DQL search API to get company reports for Organizations in your database, at scale, including 10-Ks, 10-Qs, 8-Ks, etc. To date, we have exported ~3M SEC EDGAR reports. We have also started to download reports from Forbes Global 2000 company websites, with ~400K reports downloaded so far. This data is still a work in progress, so review outputs carefully; for example, we are working to improve report titles extracted from PDFs.

The report types we support include Current Reports, Quarterly Reports, Annual Reports, and more. Please let us know if there are reports you'd like us to add to the graph.

When exporting data from collections via DQL, you have always had the option of specifying ONLY the fields you want returned in the JSON output by using the '&filter=' param (e.g. &filter=id%20name%20homepageUri), as in this query:

https://kg.diffbot.com/kg/v3/dql?type=query&token=TOKEN&query=type%3AOrganization+types%3A%22Company%22&size=25&filter=id%20name%20homepageUri

But this approach can be unwieldy if you have a long list of attributes to include or if you only want to exclude a few attributes per entity in the output.

Now, instead of specifying the fields you want to include, you can exclude fields you do not want returned when exporting data by using the &filterExclude= param (e.g. &filterExclude=subsidiaries%20technologies%20customers):

https://kg.diffbot.com/kg/v3/dql?type=query&token=TOKEN&query=type%3AOrganization+types%3A%22Company%22&size=25&filterExclude=subsidiaries%20technologies%20customers
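For anyone assembling these URLs in code, here is a minimal sketch using Python's standard library. The endpoint and the type, token, query, size, filter, and filterExclude parameters come from the examples above; the function name build_dql_url and its include/exclude arguments are hypothetical. Whether the API accepts filter and filterExclude together is not covered here, so the sketch simply passes along whichever one is given.

```python
from urllib.parse import quote, urlencode

DQL_ENDPOINT = "https://kg.diffbot.com/kg/v3/dql"


def build_dql_url(token, query, size=25, include=None, exclude=None):
    """Build a DQL query URL with an optional field include or exclude list.

    include -> space-separated field names sent as the 'filter' param
    exclude -> space-separated field names sent as the 'filterExclude' param
    """
    params = {"type": "query", "token": token, "query": query, "size": size}
    if include:
        params["filter"] = " ".join(include)
    if exclude:
        params["filterExclude"] = " ".join(exclude)
    # quote_via=quote encodes spaces as %20 instead of '+'. Both forms decode
    # to a space in a query string.
    return f"{DQL_ENDPOINT}?{urlencode(params, quote_via=quote)}"


url = build_dql_url(
    "TOKEN",
    'type:Organization types:"Company"',
    exclude=["subsidiaries", "technologies", "customers"],
)
```

The resulting URL encodes the space inside the query as %20 where the example above uses +, which is equivalent once decoded.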