What NAICS Classifications are supported in the Graph?

What are NAICS classifications?

The North American Industry Classification System (NAICS) is the standard used by United States Federal statistical agencies in classifying business establishments for the purpose of collecting, analyzing, and publishing statistical data related to the U.S. business economy.

NAICS was developed under the auspices of the United States Office of Management and Budget (OMB), and adopted in 1997 to replace the Standard Industrial Classification (SIC) system. It was developed jointly by the U.S. Economic Classification Policy Committee (ECPC), Statistics Canada, and Mexico's Instituto Nacional de Estadistica y Geografia, to allow for a high level of comparability in business statistics among the North American countries.

The four principles of NAICS are:

  1. NAICS is erected on a production-oriented conceptual framework. This means that producing units that use the same or similar production processes are grouped together in NAICS.
  2. NAICS gives special attention to developing production-oriented classifications for (a) new and emerging industries, (b) service industries in general, and (c) industries engaged in the production of advanced technologies.
  3. Time series continuity is maintained to the extent possible.
  4. The system strives for compatibility with the two-digit level of the International Standard Industrial Classification of All Economic Activities (ISIC Rev. 4) of the United Nations.

How Does the Diffbot KG Support NAICS?

The Diffbot Knowledge Graph supports both the 2017 and 2022 versions of the NAICS classification codes and labels. We extract NAICS codes from various web sources, ranging from official sources such as sec.gov to less official ones such as business directory websites, and integrate them with diffbotClassification. These codes are often published without reference to a specific NAICS version, so we canonicalize them against both the 2017 and 2022 versions and, when a code is valid in only one of the two, translate it to the other.

It can be difficult to distinguish primary codes from non-primary codes within the extracted data. To address this, we use the diffbotClassification values inferred during the KG build, which assigns a score to each code; these scores can be used to derive primary and non-primary classifications. Crosswalks from diffbotClassification to NAICS are then used to fill in missing codes and to rank NAICS codes for both the 2017 and 2022 versions.
In most cases (that is, whenever the extracted codes do not come from official government records), the NAICS codes are likely to be those inferred from diffbotClassification. We are working on adding more specific diffbotClassification codes to improve the mapping to NAICS.
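
The canonicalization step described above can be pictured roughly as follows. This is a minimal sketch, not Diffbot's actual pipeline: the function name `canonicalize`, the lookup tables, and the rule for handling ambiguous reverse mappings are all assumptions made for illustration, and the tables hold only a handful of entries rather than the full Census concordance.

```python
from typing import Dict, List, Optional, Set

# Tiny illustrative excerpts, NOT the complete NAICS tables.
# Verify any entry against the official Census concordance before relying on it.
NAICS_2017: Set[str] = {"334419", "443141", "443142"}
NAICS_2022: Set[str] = {"334419", "449210"}

# Hypothetical excerpt of a 2017 -> 2022 crosswalk.
CROSSWALK_2017_TO_2022: Dict[str, str] = {
    "334419": "334419",
    "443141": "449210",
    "443142": "449210",
}

# Reverse view (2022 -> list of 2017 codes); several 2017 codes can merge into one.
CROSSWALK_2022_TO_2017: Dict[str, List[str]] = {}
for old_code, new_code in CROSSWALK_2017_TO_2022.items():
    CROSSWALK_2022_TO_2017.setdefault(new_code, []).append(old_code)


def canonicalize(code: str) -> Dict[str, Optional[str]]:
    """Resolve an unversioned NAICS code to its 2017 and 2022 forms where possible."""
    code = code.strip()
    in_2017 = code in NAICS_2017
    in_2022 = code in NAICS_2022
    result: Dict[str, Optional[str]] = {
        "2017": code if in_2017 else None,
        "2022": code if in_2022 else None,
    }
    if in_2017 and not in_2022:
        # Valid only in 2017: translate forward through the crosswalk.
        result["2022"] = CROSSWALK_2017_TO_2022.get(code)
    elif in_2022 and not in_2017:
        # Valid only in 2022: translate back only when the mapping is unambiguous.
        candidates = CROSSWALK_2022_TO_2017.get(code, [])
        if len(candidates) == 1:
            result["2017"] = candidates[0]
    return result


print(canonicalize("443142"))
```

Leaving the 2017 side empty when a 2022 code maps back to several 2017 codes is one possible design choice; a real system might instead keep all candidates and rank them.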

The field Organization.naicsClassification2017 contains NAICS codes and labels for the 2017 version.

The field Organization.naicsClassification contains NAICS codes and labels for the 2022 version.

Type: ClassificationCode


    "naicsClassification": [
        {
            "code": "334419",
            "level": 5,
            "isPrimary": true,
            "name": "Other Electronic Component Manufacturing"
        },
        {
            "code": "334410",
            "level": 4,
            "isPrimary": true,
            "name": "Semiconductor and Other Electronic Component Manufacturing"
        },
        {
            "code": "334400",
            "level": 3,
            "isPrimary": true,
            "name": "Semiconductor and Other Electronic Component Manufacturing"
        }
    ]

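Once an Organization entity has been retrieved, the field can be read like any other list of objects. The sketch below is illustrative only: the helper `primary_naics` is a hypothetical name, the entity is a hand-built dictionary that mirrors the sample values above rather than a live API response, and it assumes that a higher `level` value means a more specific code, as the sample suggests.

```python
from typing import Any, Dict, List

# Hand-built stand-in for an Organization entity, mirroring the sample above.
organization: Dict[str, Any] = {
    "naicsClassification": [
        {"code": "334419", "level": 5, "isPrimary": True,
         "name": "Other Electronic Component Manufacturing"},
        {"code": "334410", "level": 4, "isPrimary": True,
         "name": "Semiconductor and Other Electronic Component Manufacturing"},
        {"code": "334400", "level": 3, "isPrimary": True,
         "name": "Semiconductor and Other Electronic Component Manufacturing"},
    ]
}

# Field names for each NAICS version, as described in this article.
FIELD_BY_VERSION = {
    "2017": "naicsClassification2017",
    "2022": "naicsClassification",
}


def primary_naics(entity: Dict[str, Any], version: str = "2022") -> List[Dict[str, Any]]:
    """Return the primary NAICS entries for a version, most specific first."""
    entries = entity.get(FIELD_BY_VERSION[version]) or []
    primary = [entry for entry in entries if entry.get("isPrimary")]
    # Assumption: a higher "level" value denotes a more specific classification.
    return sorted(primary, key=lambda entry: entry.get("level", 0), reverse=True)


most_specific = primary_naics(organization)[0]
print(most_specific["code"], most_specific["name"])
# prints: 334419 Other Electronic Component Manufacturing
```

Passing `version="2017"` reads Organization.naicsClassification2017 instead; for this hand-built entity that field is absent, so the helper returns an empty list.
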
Examples on Census.gov:

Additional References:
Diffbot Documentation: Organization.industries, KG Ontology Organization Entity