Extract with Custom API

Extracts a page using a modified Extract API or a custom ruleset.

If you need just one more field from an Extract API, or if one or more field values are incorrect, you may use a Custom API to override or add those fields using rules.

Correcting a field’s output takes immediate effect for your account, and also serves to train our system, improving Diffbot extraction over the long run.

Extracting a page with a Custom API works just like all other Extract APIs. Simply pass a URL to your Custom API's unique endpoint.
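As a rough illustration of passing a URL to a Custom API endpoint, the sketch below only builds the request URL and sends nothing. The base URL, the path layout, and the `token` and `url` parameter names are assumptions made for the example, as is the helper name `build_custom_api_url`; confirm the exact endpoint for your own Custom API before relying on it.

```python
from urllib.parse import urlencode, quote

# Assumed base URL and path layout; verify against your Custom API's endpoint.
BASE_URL = "https://api.diffbot.com/v3"


def build_custom_api_url(api_name, token, page_url, base_url=BASE_URL):
    """Return a GET URL for a Custom API call (hypothetical helper)."""
    # urlencode percent-encodes the page URL so it survives as a query value.
    query = urlencode({"token": token, "url": page_url})
    return f"{base_url}/{quote(api_name, safe='')}?{query}"


request_url = build_custom_api_url("myapi", "TOKEN", "https://example.com/page")
print(request_url)
```

The resulting string can be handed to any HTTP client as a plain GET request.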

If you have not yet set up a Custom API, start by creating one.

Response

The Custom API returns data in JSON format.

Each response includes a request object, which holds request-specific metadata, and an objects array, which holds the extracted information for all objects on the submitted page.

For Custom APIs the objects array will always contain a single object, and all custom fields and collections will be returned therein.
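The shape described above can be handled with a few lines of JSON parsing. This is a minimal sketch: the sample body is made up for illustration, and the keys `pageUrl`, `title`, and `myCustomField` are placeholders rather than documented output. Only the top-level `request` and `objects` names come from the description above.

```python
import json

# Hypothetical response body; keys inside "request" and inside the object
# are placeholders chosen for this example.
SAMPLE_RESPONSE = """
{
  "request": {"pageUrl": "https://example.com/page"},
  "objects": [
    {"title": "Example page", "myCustomField": "custom value"}
  ]
}
"""


def extract_single_object(response_text):
    """Return (request metadata, the single extracted object)."""
    payload = json.loads(response_text)
    objects = payload.get("objects", [])
    # Custom API responses are described as carrying exactly one object.
    if len(objects) != 1:
        raise ValueError(f"expected exactly one object, got {len(objects)}")
    return payload.get("request", {}), objects[0]


request_meta, page = extract_single_object(SAMPLE_RESPONSE)
print(page["myCustomField"])
```

Raising on an unexpected object count is a design choice for the sketch; a caller that prefers leniency could return `None` instead.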

Optional Fields

The Custom API may also return optional fields if you specify them (comma-delimited) in the &fields= argument.
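One way to attach the `fields` argument to an existing request URL is sketched below. The helper name `add_fields_argument`, the example endpoint, and the field names `links` and `meta` are hypothetical; which optional fields are actually available depends on your Custom API.

```python
from urllib.parse import urlencode


def add_fields_argument(endpoint_url, fields):
    """Append a comma-delimited fields argument to a request URL."""
    if not fields:
        return endpoint_url
    # Use "&" if the URL already has a query string, "?" otherwise.
    separator = "&" if "?" in endpoint_url else "?"
    # safe="," keeps the commas literal instead of encoding them as %2C.
    return endpoint_url + separator + urlencode({"fields": ",".join(fields)}, safe=",")


# Placeholder endpoint and field names, for illustration only.
url_with_fields = add_fields_argument(
    "https://example.invalid/v3/myapi?token=TOKEN", ["links", "meta"]
)
print(url_with_fields)
```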

Already have the source HTML? POST it to Custom API.

Custom API supports a POST option that allows you to upload HTML or plain text for extraction. See Extract Content Not Available Online.
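A POST request carrying your own markup might be prepared as follows. This sketch builds the request object without sending it, and it assumes that the body is the raw HTML and that the `Content-Type` header is `text/html`. The helper name `build_html_post` and the endpoint URL are placeholders; see Extract Content Not Available Online for the authoritative details.

```python
import urllib.request


def build_html_post(endpoint_url, html):
    """Prepare (but do not send) a POST request whose body is raw HTML."""
    return urllib.request.Request(
        endpoint_url,
        data=html.encode("utf-8"),
        # Assumed content type for HTML uploads; plain text would differ.
        headers={"Content-Type": "text/html; charset=utf-8"},
        method="POST",
    )


# Placeholder endpoint URL, for illustration only.
post_request = build_html_post(
    "https://example.invalid/v3/myapi?token=TOKEN&url=https%3A%2F%2Fexample.com%2Fpage",
    "<html><body><h1>Hello</h1></body></html>",
)
print(post_request.get_method(), len(post_request.data))
```

Passing the prepared object to `urllib.request.urlopen` would send it.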
