List API - NEW!

by Kris Negulescu

Today we added the List API to Diffbot's suite of extraction APIs. The List API automatically extracts data from any single web page that contains a primary list of items, such as news index pages, product listing pages, and search engine results pages. It is also built into the Diffbot Analyze API, which will now attempt to identify pages that match the criteria for a List page and extract them. Please refer to the List API documentation for more information.
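As a rough illustration of how a List API request could be assembled, here is a minimal sketch. The endpoint path and the `token` and `url` parameter names are assumptions based on Diffbot's usual v3 conventions, not details from this announcement. The helper `build_list_api_url` is hypothetical. Check the List API documentation before relying on any of it.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm against the List API documentation.
LIST_API_ENDPOINT = "https://api.diffbot.com/v3/list"


def build_list_api_url(token: str, page_url: str) -> str:
    """Return a request URL for a page that contains a primary list of items.

    `token` and `url` are assumed parameter names, not confirmed by this post.
    """
    params = {"token": token, "url": page_url}
    # urlencode percent-encodes the target page URL so it survives as a single value.
    return f"{LIST_API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_list_api_url("YOUR_TOKEN", "https://example.com/news"))
```

The sketch only builds the URL. Sending the request and parsing the response are left out because the response format is not described here.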

similarTo 'id1 and id2'

by Kris Negulescu

We've added the ability to query on look-alikes using two or more Diffbot Org IDs as inputs.

similarTo(id:or(id1, id2, ...)): e.g. 'Organizations similar to Target and Walmart'

type:Organization similarTo(id:or("ExADb18D6MAmunRrlVELe8A", "EOU1WEvHYN6K83Etm91H9fQ"))
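If the IDs come from code rather than being typed by hand, the query string above could be generated with a small helper. This is only a sketch: `similar_to_ids` is a hypothetical function name, and it does nothing more than reproduce the `similarTo(id:or(...))` pattern shown above.

```python
from typing import Sequence


def similar_to_ids(entity_type: str, ids: Sequence[str]) -> str:
    """Build a look-alike query from two or more Diffbot entity IDs."""
    if len(ids) < 2:
        # The feature described here takes two or more IDs as inputs.
        raise ValueError("provide at least two IDs")
    quoted = ", ".join(f'"{entity_id}"' for entity_id in ids)
    return f"type:{entity_type} similarTo(id:or({quoted}))"


if __name__ == "__main__":
    print(similar_to_ids(
        "Organization",
        ["ExADb18D6MAmunRrlVELe8A", "EOU1WEvHYN6K83Etm91H9fQ"],
    ))
```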

We have introduced a new model to predict revenue for private companies. Almost every company in the KG (over 243M) now has a revenue figure, either extracted from the web or estimated with this model.

Try 'similarTo' Searches

by Kris Negulescu

Use the Knowledge Graph to find look-alikes:

type:Organization similarTo(type:Organization name:"Walmart")
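A query like this contains spaces, colons, quotes, and parentheses, so it needs percent-encoding before it can travel as a URL parameter. The sketch below builds the name-based query and encodes it. `similar_to_name` is a hypothetical helper, and it rejects names containing double quotes because quote-escaping rules are not covered in this post.

```python
from urllib.parse import quote, unquote


def similar_to_name(entity_type: str, name: str) -> str:
    """Build a look-alike query that identifies the seed entity by name."""
    if '"' in name:
        # Escaping rules are not described here, so refuse rather than guess.
        raise ValueError("names containing double quotes are not supported")
    return f'type:{entity_type} similarTo(type:{entity_type} name:"{name}")'


query = similar_to_name("Organization", "Walmart")
# safe="" encodes every reserved character, including ':' and '/'.
encoded = quote(query, safe="")

if __name__ == "__main__":
    print(query)
    print(encoded)
```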

You can now filter your Custom API rules with search.

We added a 'Graph view' to the Diffbot Knowledge Graph for Person and Organization profile data.

Word Count Parameter

by Jerome Choo

We've added a new Article API parameter that returns the word count for article text extracted as part of a Crawlbot or BulkAPI job: &wordcount
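One way to attach the parameter to an existing API URL is sketched below. The post writes it as a bare `&wordcount`, so this sketch appends it as a valueless flag. Whether it also accepts a value is not stated here. `add_flag` is a hypothetical helper and the example URL is an assumption.

```python
def add_flag(api_url: str, flag: str) -> str:
    """Append a valueless query flag such as `wordcount` to an API URL."""
    # Use '?' for the first parameter and '&' for every later one.
    separator = "&" if "?" in api_url else "?"
    return f"{api_url}{separator}{flag}"


if __name__ == "__main__":
    # Assumed Article API URL, for illustration only.
    print(add_flag("https://api.diffbot.com/v3/article?token=YOUR_TOKEN", "wordcount"))
```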