- Determine the type of any URL submitted.
- Return the full Diffbot extraction for any supported types: articles, products, images, discussion threads, videos — and more coming soon. See all Automatic APIs.
- Or, just return the type of page you’re looking for by using the
Why would you use the Analyze API over calling the Product API, Article API, or another API directly?
- If you are handling web pages of unknown origin (e.g., end-user submitted/shared links), the Analyze API will prevent spurious extractions from unsupported pages.
- When spidering a site using Crawlbot, the Analyze API saves you from running every page on the site through a single extraction API. For instance, using Analyze makes it easy to “retrieve all the product data from ECommerceStore.com” without additional configuration.
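The "product data only" case above can be sketched as a small filter over parsed Analyze responses. This is a minimal illustration, not Diffbot's client code: it assumes each response is a JSON object with a top-level `type` string and an `objects` list, and the helper name `objects_of_type` and the sample responses are hypothetical.

```python
from typing import Any


def objects_of_type(response: dict[str, Any], wanted: str) -> list[dict[str, Any]]:
    """Return the extracted objects only if the page was classified as `wanted`.

    Assumes (not confirmed by this page) that the response carries a
    top-level "type" string and an "objects" list.
    """
    if response.get("type") != wanted:
        return []
    return list(response.get("objects", []))


# Hypothetical, already-parsed responses for three pages of one site.
responses = [
    {"type": "product", "objects": [{"title": "Blue Widget"}]},
    {"type": "article", "objects": [{"title": "About our widgets"}]},
    {"type": "other", "objects": []},
]

# Keep product extractions and ignore every other page type.
products = [obj for r in responses for obj in objects_of_type(r, "product")]
```

Filtering on the reported type keeps the crawl configuration simple: every page goes through one endpoint, and the caller decides afterwards which types to keep.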
And why would you opt for a specific API over the general Analyze endpoint?
- If you are certain of your web-page type (e.g., all articles), sending calls directly to the specific API endpoint ensures that every page is extracted by that API. There is always a small chance that the Analyze API will misclassify confusing pages.
Note that you can also use the fallback argument if you’d like to ensure that unsupported pages are processed by a specific API of your choosing.
Custom rules applied to a specific API, and API-specific parameters (e.g., fields=meta,videos,html for the Article API), are handled the same way whether you call the Analyze API or the specific extraction API directly.
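As a rough sketch of how the fallback and fields arguments might be combined in one request, the snippet below only builds the query string and makes no network call. The endpoint URL, the `token` and `url` parameter names, and the fallback value `article` are assumptions to check against the current Diffbot reference; `build_analyze_url` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed endpoint; confirm against the current API reference.
ANALYZE_ENDPOINT = "https://api.diffbot.com/v3/analyze"


def build_analyze_url(token, page_url, fallback=None, fields=None):
    """Build an Analyze request URL (hypothetical helper, no request is sent)."""
    params = {"token": token, "url": page_url}
    if fallback:
        # Assumed value format: the name of the API that should handle
        # pages Analyze does not support, e.g. "article".
        params["fallback"] = fallback
    if fields:
        # API-specific fields are passed as one comma-separated value.
        params["fields"] = ",".join(fields)
    return f"{ANALYZE_ENDPOINT}?{urlencode(params)}"


request_url = build_analyze_url(
    token="YOUR_TOKEN",
    page_url="https://example.com/some-page",
    fallback="article",
    fields=["meta", "videos", "html"],
)

# Round-trip the query string to confirm the parameters were encoded intact.
query = parse_qs(urlparse(request_url).query)
```

Using `urlencode` rather than string concatenation ensures the submitted page URL and the comma-separated field list are percent-encoded correctly.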