What is confidence score?
See Confidence Score.
Updated about 2 years ago