Optional Fields

Available with Extract APIs using the &fields= parameter

These fields are not included in the output of any Extract API by default. To request them, pass a comma-delimited list of field names in the &fields= parameter of any Extract API call.

Example Request

curl --request GET \
     --url 'https://api.diffbot.com/v3/analyze?url=https%3A%2F%2Fwww.diffbot.com&fields=links&token=<YOURTOKEN>' \
     --header 'Accept: application/json'
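
As a rough sketch of how the comma-delimited &fields= value could be assembled in code, the Python snippet below builds the same kind of request URL as the curl example. The function name build_extract_url and its arguments are illustrative assumptions, not part of any Diffbot client library; only the endpoint pattern and the url, fields, and token parameters come from the example above.

```python
from urllib.parse import urlencode


def build_extract_url(api, page_url, token, fields):
    """Build an Extract API request URL (hypothetical helper, not an official client).

    api      -- path segment after /v3/, e.g. "analyze" as in the curl example
    page_url -- the page to extract; it is percent-encoded automatically
    token    -- your API token
    fields   -- list of optional field names, joined with commas
    """
    params = {"url": page_url}
    if fields:
        # The fields parameter takes a comma-delimited list of field names.
        params["fields"] = ",".join(fields)
    params["token"] = token
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return f"https://api.diffbot.com/v3/{api}?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    print(build_extract_url("analyze", "https://www.diffbot.com", "<YOURTOKEN>", ["links", "meta"]))
```

The resulting URL can be passed to any HTTP client together with the Accept: application/json header used in the curl example.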

Available Optional Fields

These optional fields are available for all Extract APIs.

Field        Description
links        Returns a top-level object (links) containing all visible hyperlinks found on the page.
extlinks     Returns a top-level object (links) containing every hyperlink found in header and footer nodes (visible or not). Links in the body node are not retrieved.
meta         Returns a top-level object (meta) containing the full contents of page meta tags, including sub-arrays for OpenGraph tags, Twitter Card metadata, schema.org microdata, and oEmbed metadata (if available).
querystring  Returns any key/value pairs present in the URL querystring. Items without a discrete value will be returned as true.
breadcrumb   Returns a top-level array (breadcrumb) of URLs and link text from page breadcrumbs.
content      Returns a top-level field (content) of the text on the page extracted by Article Extract API.
allContent   Returns a top-level field (allContent) of all the visible rendered text on the page. Similar to running document.querySelector('body').innerText.
dom          Returns a top-level field (dom) of the rendered page source.
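
To illustrate the querystring behavior described above (keys without a discrete value come back as true), here is a minimal local approximation in Python. It is not Diffbot's implementation, and the function name querystring_pairs is hypothetical. Treating a key with an empty value (for example b=) as an empty string rather than true is an assumption made here, since the description does not cover that case.

```python
from urllib.parse import unquote_plus, urlsplit


def querystring_pairs(url):
    """Approximate the described querystring output for a URL (illustrative only)."""
    pairs = {}
    for part in urlsplit(url).query.split("&"):
        if not part:
            continue  # skip empty segments such as a trailing "&"
        key, sep, value = part.partition("=")
        # A bare key with no "=" has no discrete value, so it maps to True.
        # Assumption: "key=" (empty value) stays an empty string.
        pairs[unquote_plus(key)] = unquote_plus(value) if sep else True
    return pairs


if __name__ == "__main__":
    print(querystring_pairs("https://example.com/page?id=42&debug&q=a+b"))
```

Under these assumptions, the example URL yields id mapped to "42", debug mapped to True, and q mapped to "a b".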

Where an API has its own optional fields, they are listed on that API's Reference page.