December 1st, 2018

December 2018

by Jerome Choo

October 1st, 2018

October 2018

by Jerome Choo

Added DQL support for type:Product has:breadcrumb.name.
Added support for computing total investment when individual investments are in different currencies (Organization Profile).
Added support for the SVG image file type for Entity images.
Added indexing of Entity description fields.
Improved tokenization for Chinese/Japanese tagging.
Added hit count for facets.
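
To illustrate the multi-currency total investment entry above, here is a minimal Python sketch of summing investments after converting them to one currency. It is not Diffbot's implementation: the RATES_TO_USD table, its values, and the total_investment_usd helper are all hypothetical.

```python
from decimal import Decimal

# Hypothetical exchange rates to USD, for illustration only.
RATES_TO_USD = {
    "USD": Decimal("1"),
    "EUR": Decimal("1.15"),
    "JPY": Decimal("0.009"),
}

def total_investment_usd(investments, rates=RATES_TO_USD):
    """Sum (amount, currency) pairs after converting each amount to USD."""
    total = Decimal("0")
    for amount, currency in investments:
        if currency not in rates:
            # Refuse to guess a rate rather than silently skip an investment.
            raise ValueError(f"no exchange rate for {currency}")
        total += Decimal(str(amount)) * rates[currency]
    return total

# Made-up investments in three currencies.
investments = [(1_000_000, "USD"), (2_000_000, "EUR"), (100_000_000, "JPY")]
total = total_investment_usd(investments)
```

Decimal is used instead of float so that sums of converted amounts are not affected by binary rounding.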

August 1st, 2018

by Jerome Choo

Launched the Diffbot Knowledge Graph including a new developer Dashboard, embedded ontology documentation, and an OpenAPI spec.

February 27th, 2018

by Jerome Choo

URL Report downloads are now sorted in newest-first order.
Crawlbot now indexes the seed URL of each extracted object in the fromSeedUrl field.
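
One way the fromSeedUrl field might be used is to group extracted objects by the seed they came from. The sketch below assumes a list of already-downloaded objects; the sample data, the pageUrl key, and the group_by_seed helper are illustrative assumptions, and only fromSeedUrl is taken from the entry above.

```python
from collections import defaultdict

# Hypothetical sample of extracted objects from a multi-seed crawl.
objects = [
    {"pageUrl": "https://example.com/a", "fromSeedUrl": "https://example.com/"},
    {"pageUrl": "https://example.org/x", "fromSeedUrl": "https://example.org/"},
    {"pageUrl": "https://example.com/b", "fromSeedUrl": "https://example.com/"},
]

def group_by_seed(items):
    """Map each seed URL to the page URLs extracted from it."""
    grouped = defaultdict(list)
    for item in items:
        # Objects without the field are grouped under None.
        grouped[item.get("fromSeedUrl")].append(item["pageUrl"])
    return dict(grouped)

by_seed = group_by_seed(objects)
```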

January 5th, 2018

by Jerome Choo

Crawlbot API: Added the useCanonical argument to allow disabling of canonical URL deduplication on specific crawls.
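
As a rough sketch, a crawl request that disables canonical URL deduplication might be assembled as follows. Only the useCanonical argument comes from the entry above; the endpoint URL, the other parameter names, the value "0" for disabling, and the build_crawl_params helper are assumptions to be checked against the Crawlbot API reference. The code only builds the request body and does not send it.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the Crawlbot API reference.
CRAWL_ENDPOINT = "https://api.diffbot.com/v3/crawl"

def build_crawl_params(token, name, seeds, use_canonical=True):
    """Return a URL-encoded parameter string for creating a crawl."""
    params = {
        "token": token,            # assumed parameter name
        "name": name,              # assumed parameter name
        "seeds": " ".join(seeds),  # assumed: space-separated seed URLs
    }
    if not use_canonical:
        # Assumed: "0" switches canonical URL deduplication off.
        params["useCanonical"] = "0"
    return urlencode(params)

body = build_crawl_params(
    "YOUR_TOKEN", "my-crawl", ["https://example.com/"], use_canonical=False
)
```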

November 10th, 2017

by Jerome Choo

October 30th, 2017

by Jerome Choo

Custom API fields using the attribute filter will now return all matching selector values, not just the first attribute match.
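
The difference between "first attribute match" and "all matching values" can be illustrated with a small standalone Python sketch. This is not the Custom API itself: the AttributeCollector class and the sample markup are hypothetical, and a plain tag name stands in for a selector.

```python
from html.parser import HTMLParser

class AttributeCollector(HTMLParser):
    """Collect one attribute's value from every tag with a given name."""

    def __init__(self, target_tag, target_attr):
        super().__init__()
        self.target_tag = target_tag
        self.target_attr = target_attr
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag != self.target_tag:
            return
        for name, value in attrs:
            if name == self.target_attr and value is not None:
                self.values.append(value)

html_source = '<a href="/one">1</a><p>text</p><a href="/two">2</a>'
collector = AttributeCollector("a", "href")
collector.feed(html_source)

all_matches = collector.values  # every matching value, as in the new behavior
first_only = all_matches[:1]    # only the first match, as in the old behavior
```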

October 25th, 2017

by Jerome Choo

Crawlbot and Bulk Service data retrieval no longer requires access to port 18100. Data downloads are also now HTTPS-only.

October 16th, 2017

by Jerome Choo

Fixed a rare issue where custom rules could be accidentally deleted.
Significant performance improvements in the Search API.
Improved crawling performance and site coverage in the Global Index.
Improved ability to identify, analyze and return background images in all extraction APIs.

August 31st, 2017

by Jerome Choo

Fixed an issue in the Video API where the url value would retain HTML escaping if present within the original page source.
Fixed a rare crawling issue that occasionally resulted in "Bad IP" status messages for individual pages.
Fixed an issue where empty <video> elements could be returned in the Article API.
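
For clients that stored url values before the Video API fix above, leftover HTML escaping can be removed with Python's standard library. The sample URL is made up, and this shows a client-side cleanup rather than how the fix was implemented.

```python
from html import unescape

# Hypothetical url value that kept HTML escaping from the page source.
escaped_url = "https://example.com/video?id=42&amp;format=mp4"

# unescape() converts character references such as &amp; back to plain characters.
clean_url = unescape(escaped_url)
```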