Analyze

Automatically classify a page and extract data according to its type.

Query Params
string
required
Defaults to https://www.technologyreview.com/2020/09/04/1008156/knowledge-graph-ai-reads-web-machine-learning-natural-language-processing/

URL of the web page to analyze (URL-encoded)
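As a minimal sketch of what URL encoding means here, the target address can be percent-encoded with Python's standard library before it is placed in a query string. The variable names are illustrative only and are not part of the API.

```python
from urllib.parse import quote, unquote

# Default target page taken from the parameter description above.
page_url = (
    "https://www.technologyreview.com/2020/09/04/1008156/"
    "knowledge-graph-ai-reads-web-machine-learning-natural-language-processing/"
)

# safe="" percent-encodes every reserved character, including ":" and "/".
encoded_page_url = quote(page_url, safe="")

# Decoding the result should give back the original address unchanged.
round_trip_ok = unquote(encoded_page_url) == page_url

print(encoded_page_url)
print(round_trip_ok)
```

Most HTTP client libraries perform this encoding automatically when parameters are passed as a mapping, so manual encoding is mainly needed when the request URL is assembled by hand.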

string
enum

By default the Analyze API will fully extract all pages that match an existing Extract API. Set mode to a specific Extract API (e.g., mode="article") to extract content only from that specific page-type. All other pages will simply return the default Analyze fields.

Allowed:
string
enum

If an appropriate API cannot be determined (pages classified with type "other"), fall back to this API.

Allowed:
string
enum

Specify optional fields to be returned from any fully-extracted pages (e.g. fields=querystring,links)

Allowed:
boolean

Pass discussion=false to disable automatic extraction of comments or reviews from pages identified as articles or products. This will not affect pages identified as discussions.
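The mode, fields, and discussion options described above could be combined into one request URL roughly as follows. This is a sketch under stated assumptions: the endpoint address and the `token` and `url` parameter names do not appear in this section and are assumed here, and `build_analyze_url` is a hypothetical helper, not part of any client library.

```python
from urllib.parse import urlencode

# Assumption: endpoint address is not given in this section.
ANALYZE_ENDPOINT = "https://api.diffbot.com/v3/analyze"


def build_analyze_url(token, page_url, mode=None, fields=None, discussion=None):
    """Assemble an Analyze request URL from the optional parameters."""
    # Assumption: "token" and "url" are the names of the required parameters.
    params = {"token": token, "url": page_url}
    if mode is not None:
        params["mode"] = mode  # e.g. "article"
    if fields:
        params["fields"] = ",".join(fields)  # e.g. querystring,links
    if discussion is not None:
        # The description above uses lowercase "false", not Python's "False".
        params["discussion"] = "true" if discussion else "false"
    # safe="," keeps the comma-separated field list readable in the URL.
    return ANALYZE_ENDPOINT + "?" + urlencode(params, safe=",")


request_url = build_analyze_url(
    "YOUR_TOKEN",
    "https://example.com/post",
    mode="article",
    fields=["querystring", "links"],
    discussion=False,
)
print(request_url)
```

Parameters left as None are omitted from the query string, so the service applies its own defaults for them.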

int32

Sets the time, in milliseconds, to wait for content to be retrieved from the requested URL. The default timeout for the third-party response is 30 seconds (30000).

string

Use for JSONP requests. Needed for cross-domain Ajax.

string

Specify an IP address of a custom proxy that will be used to fetch the target page. (Ex: &proxy or &proxy=0.0.0.0)

string

Used to specify the authentication parameters that will be used with a custom proxy specified in the &proxy parameter. (Ex: proxyAuth=username:password)
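The two proxy parameters above might be assembled like this. It is only a sketch: `proxy_params` is a hypothetical helper, and the IP address and credentials are the placeholder values from the examples in the descriptions.

```python
from urllib.parse import urlencode


def proxy_params(proxy_ip, username=None, password=None):
    """Return the proxy-related query parameters as a dict."""
    params = {"proxy": proxy_ip}
    # proxyAuth is only meaningful together with a custom proxy.
    if username is not None and password is not None:
        params["proxyAuth"] = f"{username}:{password}"
    return params


# Placeholder values taken from the parameter descriptions above.
query_fragment = urlencode(proxy_params("0.0.0.0", "username", "password"))
print(query_fragment)
```

Note that `urlencode` percent-encodes the colon in the credentials as `%3A`, which is the URL-encoded form of the same value.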

string

Set to "default" to use Diffbot's datacenter proxy for this request. Set to "none" to instruct Extract not to use proxies, even if proxies have been enabled globally for this particular URL.

integer
≤ 180000

Add extra time for rendering before the page is closed and the DOM is extracted. The extra time can cause page timeouts, so a timeout parameter may be needed to extend the timeout. Note that the renderer closes automatically at 180 seconds.
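One way to reason about the interaction between extra rendering time and the timeout is sketched below. The rule of adding the extra rendering time to the default timeout is an assumption made for illustration, not documented behavior, and `suggested_timeout_ms` is a hypothetical helper. The sketch also assumes the extra rendering time is given in milliseconds, which the 180000 limit and the 180-second renderer cutoff suggest.

```python
DEFAULT_TIMEOUT_MS = 30_000  # default timeout stated in the timeout description
MAX_RENDER_MS = 180_000      # renderer closes automatically at 180 seconds


def suggested_timeout_ms(extra_render_ms):
    """Suggest a timeout value that leaves room for extra rendering time."""
    if not 0 <= extra_render_ms <= MAX_RENDER_MS:
        raise ValueError(
            f"extra rendering time must be between 0 and {MAX_RENDER_MS} ms"
        )
    # Heuristic (assumption): default timeout plus the extra rendering time.
    return DEFAULT_TIMEOUT_MS + extra_render_ms


print(suggested_timeout_ms(5_000))
```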

string
enum

Direct the browser to scroll down the page, to trigger lazy-loaded content.

Allowed:
Responses

application/json