POST markup or plain text directly to any Extract API endpoint
Note that analysis quality depends on many factors, including whether page assets (images, CSS) are accessible and how heavily the page layout relies on any that are not.
https://api.diffbot.com/v3/analyze?token=...&url=...
Please note that the url argument is still required, and will be used to resolve any relative links contained in the markup.
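As an illustration of what resolving relative links against the url argument means, the sketch below applies standard URL resolution with Python's urllib.parse.urljoin. The base URL path and the sample links are hypothetical, and this shows generic resolution rules rather than Diffbot's internal behavior.

```python
from urllib.parse import urljoin

# Hypothetical value passed as the `url` argument of the request.
base_url = "http://store.diffbot.com/products/index.html"

# Hypothetical relative links that might appear in the POSTed markup.
relative_links = ["/about", "images/logo.png", "../cart"]

# Standard resolution: each relative link is interpreted against the base URL.
resolved = [urljoin(base_url, link) for link in relative_links]

for link, absolute in zip(relative_links, resolved):
    print(f"{link} -> {absolute}")
```

If the url argument pointed at a different host or path, the same markup would yield different absolute links, so it is worth passing the address the markup actually came from.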
Provide the content to analyze as your POST body, and specify the Content-Type header as text/html (for full markup) or text/plain (for text-only).
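The pieces described above (token and url in the query string, content in the POST body, a matching Content-Type header) can be assembled as in the following minimal sketch, which uses only the Python standard library. The helper name build_extract_request and its parameters are assumptions made for illustration, not part of any Diffbot client library, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_extract_request(endpoint, token, page_url, content, is_html=True):
    """Build (but do not send) a POST request carrying content to analyze.

    Hypothetical helper: `endpoint` is the API name in the path (e.g. "article").
    """
    # urlencode percent-encodes the page URL so it is safe in the query string.
    query = urlencode({"token": token, "url": page_url})
    # text/html for full markup, text/plain for text-only content.
    content_type = "text/html" if is_html else "text/plain"
    return Request(
        f"https://api.diffbot.com/v3/{endpoint}?{query}",
        data=content.encode("utf-8"),
        headers={"Content-Type": content_type},
        method="POST",
    )


markup = "<html><body><h1>Make web data fun</h1></body></html>"
req = build_extract_request("article", "YOURTOKEN", "http://store.diffbot.com", markup)
print(req.get_method(), req.full_url)
```

To actually send it, the request object could be passed to urllib.request.urlopen with a real token in place of the YOURTOKEN placeholder.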
Example Request for Extracting HTML Markup
curl --request POST \
--header 'Content-Type: text/html' \
--url 'https://api.diffbot.com/v3/article?token=<YOURTOKEN>&url=http%3A%2F%2Fstore.diffbot.com' \
-d '<html><head><title>Diffbot Extract makes web data fun</title></head><body><h1>Make web data fun</h1><p>Do you know what makes working with web data fun? Diffbot! </p></body></html>'
Example Request for Extracting Plain Text
Plain-text extraction is only available with the Article API.
curl --request POST \
--header 'Content-Type: text/plain' \
--url 'https://api.diffbot.com/v3/article?token=<YOURTOKEN>&url=http%3A%2F%2Fstore.diffbot.com' \
-d 'Now is the time for all good robots to come to the aid of their-- oh never mind, run!'