Extracts a page using a modified Extract API or a custom ruleset.
If you need just one more field from an Extract API, or if one or more field values are incorrect, you may use a Custom API to override or add those fields using rules.
Correcting a field’s output takes immediate effect for your account, and also serves to train our system, improving Diffbot extraction over the long run.
Extracting a page with a Custom API works just like all other Extract APIs. Simply pass a URL to your Custom API's unique endpoint.
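As a minimal sketch of what "pass a URL to your endpoint" can look like, the snippet below builds a GET request URL. The endpoint address and the `token` and `url` query parameter names are assumptions made for illustration; substitute the endpoint and parameters shown for your own Custom API.

```python
from urllib.parse import urlencode

# Hypothetical endpoint for illustration only; use your Custom API's real endpoint.
CUSTOM_API_ENDPOINT = "https://api.example.com/v3/my-custom-api"


def build_custom_api_url(endpoint, token, page_url):
    """Return a request URL that passes page_url to the given endpoint.

    The parameter names "token" and "url" are assumptions.
    """
    # urlencode percent-encodes the page URL so it survives as a query value.
    query = urlencode({"token": token, "url": page_url})
    return f"{endpoint}?{query}"


request_url = build_custom_api_url(
    CUSTOM_API_ENDPOINT, "YOUR_TOKEN", "https://example.com/article?id=1"
)
print(request_url)
```

Encoding the page URL matters because it often contains its own query string, which would otherwise be read as parameters of the API request.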
If you have not set one up yet, you may wish to start by creating a Custom API.
The Custom API returns data in JSON format.
Each response includes a request object (which returns request-specific metadata) and an objects array, which includes the extracted information for all objects on a submitted page.
For Custom APIs, the objects array will always contain a single object, and all custom fields and collections will be returned therein.
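The response shape described above can be read along these lines. This is a sketch only: the sample body is made up for illustration, and the keys inside the request object and the extracted object (`pageUrl`, `title`, `customField`) are hypothetical, not a listing of actual fields.

```python
import json

# Illustrative response body; every key inside "request" and the object is hypothetical.
sample_body = """
{
  "request": {"pageUrl": "https://example.com/article"},
  "objects": [{"title": "Example", "customField": "value"}]
}
"""


def split_response(body):
    """Return (request_metadata, extracted_object) from a Custom API JSON body."""
    data = json.loads(body)
    objects = data.get("objects", [])
    # For Custom APIs the objects array is expected to hold exactly one object.
    if len(objects) != 1:
        raise ValueError(f"expected exactly one object, got {len(objects)}")
    return data.get("request", {}), objects[0]


request_meta, extracted = split_response(sample_body)
print(extracted["customField"])
```

Checking the array length up front keeps a malformed or error response from surfacing later as a confusing missing-key failure.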
A Custom API may also return some optional fields if they are specified (comma-delimited) in the
Already have the source HTML? POST it to Custom API.
Custom API supports a POST option that allows you to upload HTML or plain text for extraction. See Extract Content Not Available Online.
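A POST upload might be prepared as in the sketch below. The endpoint, the `token` and `url` parameter names, and the `Content-Type: text/html` header are assumptions for illustration; check the linked page for the exact requirements. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint for illustration only; use your Custom API's real endpoint.
CUSTOM_API_ENDPOINT = "https://api.example.com/v3/my-custom-api"


def build_html_post(endpoint, token, page_url, html):
    """Build (but do not send) a POST request carrying HTML in its body.

    The parameter names and the Content-Type value are assumptions.
    """
    query = urlencode({"token": token, "url": page_url})
    return Request(
        f"{endpoint}?{query}",
        data=html.encode("utf-8"),  # request body: the markup to extract from
        headers={"Content-Type": "text/html"},
        method="POST",
    )


req = build_html_post(
    CUSTOM_API_ENDPOINT,
    "YOUR_TOKEN",
    "https://example.com/article",
    "<html><body><h1>Example</h1></body></html>",
)
print(req.get_method(), req.full_url)
# To send it, pass req to urllib.request.urlopen(req).
```

For plain text rather than HTML, the same sketch would presumably apply with a `text/plain` content type.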