2015-12-18

by Jerome Choo
  • Fixed an issue where text analysis (tags, sentiment, language detection) was not performed on plain text POSTed to the Article API.
  • Improved Crawlbot behavior on Ajax-heavy sites so that pages with the exact same HTML source are no longer deduplicated.
  • Fixed an issue within the Crawlbot and Bulk interfaces where the "Last 500" URL Report was incorrectly returning the first 500.
  • Improved author detection within the Article API.

2015-12-07

by Jerome Choo

The Analyze API now supports POSTed content.
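
As a rough sketch of what POSTing content to the Analyze API might look like from Python. The endpoint URL, the `token` and `url` query parameters, and the `text/html` content type are assumptions for illustration and are not stated in this changelog. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint; confirm against the API documentation before use.
ANALYZE_ENDPOINT = "https://api.diffbot.com/v3/analyze"


def build_analyze_post(token: str, page_url: str, html: str) -> Request:
    """Build (but do not send) a POST request carrying page markup in the body.

    The helper name and parameter names are hypothetical.
    """
    query = urlencode({"token": token, "url": page_url})
    return Request(
        f"{ANALYZE_ENDPOINT}?{query}",
        data=html.encode("utf-8"),              # markup travels in the request body
        headers={"Content-Type": "text/html"},  # assumed content type for POSTed markup
        method="POST",
    )


req = build_analyze_post("YOUR_TOKEN", "http://example.com/item", "<html>...</html>")
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would actually submit it; that step is left out so the sketch runs without a token.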

2015-11-27

by Jerome Choo

The Account API now returns a list of child or sub-tokens.

2015-11-19

by Jerome Choo
  • Fixed an issue in the Analyze API where products with an API-Toolkit-overridden price field would not reflect changes in the "details" field (offerPriceDetails, regularPriceDetails, etc.).
  • Fixed an Article API issue for certain top-level domains where articles dated in the near future (e.g., tomorrow) would incorrectly be returned with a date from the prior year.

2015-11-11

by Jerome Choo
  • Crawlbot will now successfully spider URLs that contain (invalid) UTF-8 characters.
  • Global Index API: search-by-tag can optionally be performed using a tag-match shorthand.

2015-10-16

by Jerome Choo
  • Fixed an issue where Crawlbot and Bulk API data downloads did not include a filename.
  • The breadcrumb element is now a default field in the Article API.

2015-10-22

by Jerome Choo
  • We now offer an Account API for tracking token API usage and billing history.
  • Global Index API: negative search queries (diffbot AND -"machine learning") now function as documented.
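
A negative query like the one above contains spaces and quotes, so it needs URL-encoding before it goes into a request. A minimal sketch, assuming a search endpoint at `https://api.diffbot.com/v3/search` with `token`, `col`, and `query` parameters (these names are assumptions, not taken from this changelog):

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint and parameter names; check the Global Index documentation.
SEARCH_ENDPOINT = "https://api.diffbot.com/v3/search"


def build_search_url(token: str, query: str, collection: str = "GLOBAL-INDEX") -> str:
    """Return a search URL with the query string safely URL-encoded."""
    params = {"token": token, "col": collection, "query": query}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# The leading "-" excludes results matching the quoted phrase.
url = build_search_url("YOUR_TOKEN", 'diffbot AND -"machine learning"')

# Decoding the URL recovers the original query unchanged.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded)
```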

2015-10-08

by Jerome Choo
  • APIs no longer ignore "format characters": invisible characters that may affect neighboring characters, for example the zero-width non-joiner (U+200C).
  • Crawlbot and Bulk Service URL Reports now offer an option to download the last 500 URLs crawled.
  • Global Index API: Faceted date queries will no longer return a min value of 0.
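
To see where such invisible characters occur in a string, one option is to scan for the Unicode general category `Cf` (format). This is a sketch under the assumption that "format characters" here corresponds to that category; the helper name is hypothetical and this is not a description of how the APIs handle them internally.

```python
import unicodedata


def format_characters(text: str) -> list[tuple[int, str]]:
    """Return (index, character name) for each Unicode format character (category Cf)."""
    return [
        (i, unicodedata.name(ch, f"U+{ord(ch):04X}"))  # fall back to the code point if unnamed
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


# U+200C (zero-width non-joiner) is invisible when printed but present in the string.
sample = "auf\u200clage"
found = format_characters(sample)
print(found)
```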

2015-10-01

by Jerome Choo

Hey, as of today we're publishing a changelog. It's visible... here.

2015-09-23

by Jerome Choo
  • Additional token support within a single account has been added. Additional tokens are available on a case-by-case basis to paying customers. Please contact [email protected] if you would like to discuss additional tokens.
  • API Toolkit now allows direct update of URL pattern / regular expression without having to create a new ruleset.
  • API Toolkit rule output automatically trims fields to remove leading or trailing blank spaces.
  • The diffbotUri field is now computed based on rule-based output, if a custom rule is used to override default output.
  • The resolvedPageUrl is correctly returned in Custom APIs (if a submitted page is redirected).