February 2019

by Jerome Choo
  • Extended Diffbot Knowledge Graph coverage of Entities located or residing in Asia.
  • Added support for the strict operator to DQL.

December 2018

by Jerome Choo
  • Improved date/time extraction and timezone support in Diffbot extraction APIs.
  • Added support for the 'has:' operator to DQL for Articles and Products.
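
As a rough illustration of the 'has:' operator, the sketch below builds a request URL for a DQL query that keeps only Products with a populated breadcrumb.name field. The endpoint URL, the helper name build_dql_url, and the placeholder token are assumptions for illustration and are not taken from this changelog; check the current Diffbot documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint; not stated in this changelog.
DQL_ENDPOINT = "https://kg.diffbot.com/kg/v3/dql"


def build_dql_url(query: str, token: str, size: int = 10) -> str:
    """Return a GET URL for a DQL query (hypothetical helper)."""
    params = {"type": "query", "token": token, "query": query, "size": size}
    # urlencode escapes ':' and encodes spaces so the query is URL-safe.
    return f"{DQL_ENDPOINT}?{urlencode(params)}"


# 'has:' restricts results to entities where the named field is present.
url = build_dql_url("type:Product has:breadcrumb.name", token="YOUR_TOKEN")
print(url)
```

The script only constructs the URL; sending it with an HTTP client and a valid token is left to the caller.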

October 2018

by Jerome Choo
  • Added DQL support for type:Product has:breadcrumb.name.
  • Added support for computation of total investment when individual investments have different currencies (Organization Profile).
  • Added support for svg image file type for Entity images.
  • Added indexing of Entity description fields.
  • Improved tokenization for Chinese/Japanese tagging.
  • Added hit count for facets.
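
The mixed-currency total mentioned above can be pictured as converting each investment to one base currency before summing. The sketch below is one plausible approach, not Diffbot's actual implementation; the function name, the data layout, and the exchange rates are all made up for illustration.

```python
from decimal import Decimal

# Illustrative, made-up rates to USD; real rates would come from a rate source.
RATES_TO_USD = {"USD": Decimal("1"), "EUR": Decimal("1.25")}


def total_investment_usd(investments):
    """Sum (amount, currency) pairs after converting each amount to USD."""
    total = Decimal("0")
    for amount, currency in investments:
        # A KeyError here means no rate is known for that currency.
        total += Decimal(str(amount)) * RATES_TO_USD[currency]
    return total


total = total_investment_usd([(100, "USD"), (100, "EUR")])
print(total)
```

Decimal is used rather than float so that currency arithmetic does not pick up binary rounding error.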

August 2018

by Jerome Choo
  • Launched the Diffbot Knowledge Graph including a new developer Dashboard, embedded ontology documentation, and an OpenAPI spec.

2018-02-27

by Jerome Choo
  • URL Report downloads are now sorted in newest-first order.
  • Crawlbot now indexes the seed URL of each extracted object in the fromSeedUrl field.

2018-01-05

by Jerome Choo
  • Crawlbot API: Added the useCanonical argument to allow disabling of canonical URL deduplication on specific crawls.
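
A minimal sketch of how the useCanonical argument might be passed when creating a crawl. The helper build_crawl_params, the crawl name, the seed URL, and the assumption that a value of 0 disables canonical deduplication are illustrative and not confirmed by this changelog; the other parameter names follow the Crawlbot API as the editor understands it and should be verified against the current documentation.

```python
from urllib.parse import urlencode

# Assumed Crawlbot endpoint; not stated in this changelog.
CRAWL_ENDPOINT = "https://api.diffbot.com/v3/crawl"


def build_crawl_params(token, name, seeds, api_url, use_canonical=True):
    """Return form parameters for creating a crawl (hypothetical helper)."""
    params = {
        "token": token,
        "name": name,
        "seeds": " ".join(seeds),  # assumed: seeds are space-separated
        "apiUrl": api_url,
    }
    if not use_canonical:
        # Assumed: 0 turns off canonical URL deduplication for this crawl.
        params["useCanonical"] = 0
    return params


params = build_crawl_params(
    token="YOUR_TOKEN",
    name="exampleCrawl",
    seeds=["https://example.com"],
    api_url="https://api.diffbot.com/v3/analyze",
    use_canonical=False,
)
print(urlencode(params))
```

Leaving use_canonical at its default omits the argument entirely, so the crawl keeps whatever deduplication behavior the service applies by default.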

2017-11-10

by Jerome Choo
  • Significant improvements to Video API site support.

2017-10-30

by Jerome Choo
  • Custom API fields using the attribute filter will now return all matching selector values, not just the first attribute match.

2017-10-25

by Jerome Choo

  • Crawlbot and Bulk Service data retrieval no longer requires access to port 18100. Data downloads are now HTTPS-only.

2017-10-16

by Jerome Choo
  • Fixed a rare issue where custom rules could be accidentally deleted.
  • Significant performance improvements in the Search API.
  • Improved crawling performance and site coverage in the Global Index.
  • Improved the ability to identify, analyze, and return background images in all extraction APIs.