Docs Suite

Can multiple Diffbot extraction APIs be used in a single crawl?

Each Crawlbot crawl is designed to work with a single Diffbot extraction API. If you want to process different types of pages through separate APIs, you have two options:

Use the Analyze API

The Analyze API automatically determines the page type of each crawled page and structures the data from supported page types. You can then filter the results with the Search API, or download the full JSON and filter it on the type field.
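As a rough illustration of the second approach, the sketch below groups downloaded results by their type field. It assumes the downloaded JSON is a list of objects that each carry a `type` value; the sample records, the `pageUrl` values, and the helper name `group_by_type` are hypothetical and only stand in for real crawl output.

```python
import json
from collections import defaultdict


def group_by_type(records):
    """Group extracted page objects by their `type` field."""
    groups = defaultdict(list)
    for record in records:
        # Records without a `type` field are collected under "unknown".
        groups[record.get("type", "unknown")].append(record)
    return dict(groups)


# Hypothetical sample standing in for a downloaded crawl result.
sample = json.loads("""
[
  {"type": "article", "pageUrl": "https://example.com/blog/1"},
  {"type": "product", "pageUrl": "https://example.com/shop/1"},
  {"type": "article", "pageUrl": "https://example.com/blog/2"}
]
""")

grouped = group_by_type(sample)
articles = grouped.get("article", [])
products = grouped.get("product", [])
```

Grouping once and then reading each list keeps the filtering logic in one place, which may be convenient when a crawl returns several page types.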

Set up Multiple Crawls

To explicitly use multiple APIs when crawling a single site, you'll need to set up multiple crawls, each using its own API. Use Crawlbot's mechanisms for controlling and narrowing crawls to ensure that each crawl job processes only the appropriate type of page.
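One way to picture this setup is a small script that prepares one set of crawl parameters per API, each narrowed to its own part of the site. This is a sketch only: the parameter names (`token`, `name`, `seeds`, `apiUrl`, `urlProcessPattern`) are assumptions that should be checked against the Crawlbot API reference, and the job names, seed URL, and patterns are hypothetical. No request is sent; the code only builds the parameters.

```python
from urllib.parse import urlencode


def build_crawl_jobs(token, seed, jobs):
    """Return one parameter dict per crawl job.

    `jobs` maps a job name to (extraction API URL, processing pattern).
    Parameter names are assumed, not confirmed by this page.
    """
    params = []
    for name, (api_url, process_pattern) in jobs.items():
        params.append({
            "token": token,
            "name": name,
            "seeds": seed,
            "apiUrl": api_url,
            # Only pages whose URL contains this string are processed.
            "urlProcessPattern": process_pattern,
        })
    return params


# Hypothetical example: articles under /blog/, products under /shop/.
crawl_jobs = build_crawl_jobs(
    token="YOUR_TOKEN",
    seed="https://example.com",
    jobs={
        "example-articles": ("https://api.diffbot.com/v3/article", "/blog/"),
        "example-products": ("https://api.diffbot.com/v3/product", "/shop/"),
    },
)

# Each dict can be form-encoded and submitted as a separate crawl job.
encoded = [urlencode(job) for job in crawl_jobs]
```

Keeping the two jobs in a single mapping makes it easier to see that the patterns do not overlap, so no page is processed by both crawls.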

Last updated by Bruno Skvorc
Copyright © 2021 Diffbot.com