Available with Extract APIs using the &fields= parameter
These fields are not included in the output of any Extract API by default. To request them, pass a comma-delimited list of field names in the &fields= parameter of any Extract API call.
Example Request
curl --request GET \
--url 'https://api.diffbot.com/v3/analyze?url=https%3A%2F%2Fwww.diffbot.com&fields=links&token=<YOURTOKEN>' \
--header 'Accept: application/json'
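Because &fields= takes a comma-delimited list, several optional fields can be requested in one call. The following is a minimal Python sketch of how such a request URL could be assembled, assuming the same Analyze endpoint as the curl example. The helper name build_extract_url is hypothetical and not part of any Diffbot client library; the sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Endpoint copied from the curl example above.
ANALYZE_ENDPOINT = "https://api.diffbot.com/v3/analyze"


def build_extract_url(page_url, fields, token, endpoint=ANALYZE_ENDPOINT):
    """Build an Extract API URL with a comma-delimited &fields= list.

    Hypothetical helper for illustration only.
    """
    params = {
        "url": page_url,             # percent-encoded by urlencode
        "fields": ",".join(fields),  # e.g. "links,meta"
        "token": token,
    }
    # safe="," leaves the commas in the fields list unescaped.
    return f"{endpoint}?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    # Replace YOUR_TOKEN with a real API token before sending the request.
    print(build_extract_url("https://www.diffbot.com", ["links", "meta"], "YOUR_TOKEN"))
```

The resulting URL can be passed to curl or any HTTP client in place of the hand-written one in the example request.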
Available Optional Fields
These optional fields are available for all Extract APIs.
Field | Description |
---|---|
links | Returns a top-level object (links) containing all visible hyperlinks found on the page. |
extlinks | Returns a top-level object (links) containing every hyperlink found in header and footer nodes (visible or not). Links in the body node are not retrieved. |
meta | Returns a top-level object (meta) containing the full contents of page meta tags, including sub-arrays for OpenGraph tags, Twitter Card metadata, schema.org microdata, and oEmbed metadata (if available). |
querystring | Returns any key/value pairs present in the URL querystring. Items without a discrete value will be returned as true. |
breadcrumb | Returns a top-level array (breadcrumb) of URLs and link text from page breadcrumbs. |
content | Returns a top-level field (content) containing the page text as extracted by the Article Extract API. |
allContent | Returns a top-level field (allContent) containing all the visible rendered text on the page. Similar to running document.querySelector('body').innerText. |
dom | Returns a top-level field (dom) containing the rendered page source. |
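The querystring row above says that items without a discrete value are returned as true. The Python sketch below is a rough local illustration of that rule, not Diffbot's implementation. The function name querystring_pairs is hypothetical. The treatment of a key with an empty value (such as "a=") is an assumption, since the table does not cover that case.

```python
from urllib.parse import urlsplit, unquote_plus


def querystring_pairs(url):
    """Map querystring keys to values; keys with no "=" map to True.

    Hypothetical illustration of the rule described for the querystring field.
    """
    pairs = {}
    for item in urlsplit(url).query.split("&"):
        if not item:
            continue  # skip empty segments, e.g. when there is no querystring
        key, sep, value = item.partition("=")
        # Assumption: "key=" keeps an empty string; only a bare "key" becomes True.
        pairs[unquote_plus(key)] = unquote_plus(value) if sep else True
    return pairs


if __name__ == "__main__":
    print(querystring_pairs("https://example.com/page?id=42&debug&q=a+b"))
```

Here the bare debug item has no value of its own, so it maps to True, while id and q keep their string values.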
Some Extract APIs also offer API-specific optional fields. Where these exist, they are listed on that API's Reference page.