October 1st, 2019
Added Longitude and/or Latitude data to 53M Organizations
Added sicClassification attribute to Organizations
Created a more robust employment category taxonomy and ML model in support of employment data
Improved coverage of parentCompany attribute for subsidiary organization entities
Normalized stock exchange labels to improve filtering and discoverability.
Deployed bug fixes to developer Dashboards
September 1st, 2019
Added support for the inclusion of RocketReach email contact data (in addition to LeadIQ).
Added support for extraction of the headquarters address from the headquarters building entity.
Began improvements to Record Linking for Organizations with emphasis on improving subsidiary data accuracy.
Improved coverage of Org and Person data records with a focus on: 'educated at', 'member of', 'owner of', and 'position held' data fields.
Improved Role Classifications: separating CEO and Director.
Enhanced and extended the Visual Query Builder Tool in the Developer Dashboard.
August 1st, 2019
Size is now supported in facet queries for articles
Enabled access to crawls and bulk jobs created on child tokens from the app.diffbot.com UI when logged in under the parent token.
Enabled the cloning of a crawl from a crawl job page from the app.diffbot.com UI.
Made significant improvements to the performance of the app.diffbot.com UI.
Added location inference to the Natural Language API.
Improved how importance score is generated for spam profiles.
Improved deduplication on Organization Founders.
Now avoid linking to the same DiffbotURI in certain fields. For example, parent and subsidiary entities cannot link to the same unique identifier: Google and Alphabet must have unique IDs.
Removed bad descriptions from the allDescriptions field.
Improved age calculation/inference logic.
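The parent/subsidiary rule in this release can be pictured as a simple consistency check. The sketch below is only an illustration: the function name `invalid_links`, the pair-based link layout, and the identifiers are assumptions, not the DKG's actual schema or implementation.

```python
def invalid_links(links):
    """Return the (parent_id, subsidiary_id) pairs whose two IDs are identical.

    `links` is a hypothetical list of (parent_id, subsidiary_id) tuples.
    """
    return [(parent, child) for parent, child in links if parent == child]


# Made-up identifiers, for illustration only.
links = [
    ("org-alphabet", "org-google"),  # distinct IDs: allowed
    ("org-google", "org-google"),    # same ID on both sides: rejected
]
print(invalid_links(links))
```

Only the second pair is reported, since both sides of that link resolve to one identifier.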
July 1st, 2019
Added support for multiple Headquarters locations for Organizations.
Added support for multiple stock exchange/symbol pairs.
Improved extraction of city from neighborhoods.
Added support for display of English tags for Non-English taggers.
Trained a Dutch entity linker.
Improved RawDataSentinels supporting Organization data ingest including subsidiary data
Improved sub-record linking between Organizations and Founders.
Now force extraction of Headquarter address from HQ building entity.
Now ensure countries are always classified as administrative areas.
Populated missing address in location for 81Mil organizations.
Improved the error message returned for mismatched quotes in DQL queries.
Ensured users have the ability to stop or pause a crawl between crawl rounds from the Dashboard.
Forced the persistence of the assignment of a customAPI to a crawl job.
Set the article title in the field.
Now rank person images for Person profiles.
In the DKG, faceting on the parent key of an enum now expands to <enum>.normalizedValue.
Now cache Person and Organization images, including logos.
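For readers who build DQL strings programmatically, a client-side check can catch the mismatched quotes mentioned in this release before a query is sent. This is a minimal sketch, not Diffbot's parser: the function name `find_unbalanced_quote` is hypothetical, and it assumes that single and double quotes delimit strings and that a backslash escapes the next character, which may not match DQL's real rules. The query strings are illustrative.

```python
def find_unbalanced_quote(query):
    """Return the index of an unclosed quote in `query`, or None if balanced.

    Assumption: ' and " delimit strings and a backslash escapes the next
    character. Real DQL quoting rules may differ.
    """
    open_char = None   # quote character that is currently open, if any
    open_index = None  # position where that quote was opened
    i = 0
    while i < len(query):
        ch = query[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if open_char is None and ch in "\"'":
            open_char, open_index = ch, i
        elif ch == open_char:
            open_char, open_index = None, None
        i += 1
    return open_index


print(find_unbalanced_quote('type:Organization name:"Diffbot"'))
print(find_unbalanced_quote('type:Organization name:"Diffbot'))
```

The first call finds nothing to report; the second returns the position of the opening quote that was never closed.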
June 1st, 2019
Committed to delivering 100% accuracy of 'Fortune 1000' Company entity profile core facts (name, headquarters location, website, CEO, founders, logo, isPublic, parent organization, year founded, stock ticker symbol and exchanges, twitter handle, size attributes - employee count & annual top-line revenues) in the Diffbot KnowledgeGraph (DKG).
Enhanced isPublic field population in the DKG.
Enhanced stock ticker symbol extraction in the DKG.
Fixed rules for assigning min and max employees to an Organization in the DKG.
Enriched 3Mil organizations with no revenue data in the DKG.
Improved selection of location for Organization.location in the DKG.
Improved evaluation of postal codes when an address has no street address in the DKG.
Enhanced age calculation/inference in the DKG.
Improved Candidate selection for email address and phone number in the DKG.
Added support for > and < for date/time fields in DQL.
Querying on a DiffbotURI is now strict by default in DQL.
Added support for type:Post (discussions) to DQL.
Added contextually embedded links to docs from the Crawlbot UI.
May 1st, 2019
We addressed missing revenues for over 80Mil company entities in the Diffbot KnowledgeGraph (DKG).
Improved DKG entity postal code assignments.
Improved DKG entity Stock Exchange assignments
We removed cookie disclaimer text from DKG entity descriptions.
We improved Organization entity classification in the DKG.
We added the ability to facet on Organization name tokens in DQL.
We expanded currency support in the Diffbot extraction APIs to include all European currencies, in addition to the euro.
We improved DQL error messages.
We lifted the limit on facet pagination.
Organization size attributes are now supported in facets.
We normalized Organization entity importance in the DKG to score between 1 and 100.
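The changelog does not say how importance is normalized. One common way to map raw scores onto a 1 to 100 scale is linear (min-max) rescaling, sketched below. The function name `normalize_importance` and the sample scores are assumptions for illustration, not the DKG's method.

```python
def normalize_importance(raw_scores, low=1.0, high=100.0):
    """Linearly rescale raw scores into the range [low, high].

    Min-max scaling is an assumed technique; the actual DKG scoring
    method is not described in the release notes.
    """
    if not raw_scores:
        return []
    smallest, largest = min(raw_scores), max(raw_scores)
    if smallest == largest:
        # All scores are equal; mapping them to `low` is an arbitrary choice.
        return [low for _ in raw_scores]
    span = largest - smallest
    return [low + (s - smallest) * (high - low) / span for s in raw_scores]


print(normalize_importance([0.0, 5.0, 10.0]))
```

The smallest raw score maps to 1, the largest to 100, and everything else falls proportionally in between.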
April 1st, 2019
Improved Organization Data Quality (i.e. sub-record linking of CEOs and Founders) in the Diffbot KnowledgeGraph (DKG).
Added dedicated process to parse subsidiary entities in the DKG.
Added support for multiple Person/Organization descriptions in the DKG.
Fixed date/timestamp conversion bugs in DQL.
Optimized revenue.value and revenue.currency extractions for Organization profile data in the DKG.
Added support for pagination of facets in DQL.
Added support for querying by tags for type:Image in DQL.
Added facet count to the Diffbot KnowledgeGraph Search API response.
February 1st, 2019
Extended coverage of Entities located or residing in Asia to the Diffbot KnowledgeGraph.
Added support for the strict operator to DQL.
December 1st, 2018
Improved date/time extraction and timezone support in the Diffbot extraction APIs.
Added support for the 'has:' operator to DQL for Articles and Products.
October 1st, 2018
Added DQL support for type:Product has:breadcrumb.name
Added support for computation of total investment when individual investments have different currencies (Organization Profile).
Added support for svg image file type for Entity images.
Added indexing of Entity description fields.
Improved tokenization for Chinese/Japanese tagging.
Added hit count for facets.
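Summing investments held in different currencies, as in the total investment entry above, requires converting each amount to a common currency first. The sketch below shows that idea under stated assumptions: the record layout, the function name `total_investment`, and the exchange rates are all hypothetical, and the rates are placeholder values rather than market data.

```python
from decimal import Decimal

# Hypothetical conversion rates to USD, for illustration only.
RATES_TO_USD = {
    "USD": Decimal("1"),
    "EUR": Decimal("1.10"),
    "GBP": Decimal("1.25"),
}


def total_investment(investments, rates=RATES_TO_USD):
    """Sum investments in mixed currencies, expressed in USD.

    Each investment is a hypothetical dict: {"value": number, "currency": str}.
    """
    total = Decimal("0")
    for inv in investments:
        currency = inv["currency"]
        if currency not in rates:
            raise ValueError(f"no exchange rate for {currency}")
        # Convert through str() so float inputs do not carry binary noise.
        total += Decimal(str(inv["value"])) * rates[currency]
    return total


investments = [
    {"value": 1000000, "currency": "USD"},
    {"value": 2000000, "currency": "EUR"},
]
print(total_investment(investments))
```

Using Decimal rather than float keeps the arithmetic exact for monetary values, and raising on an unknown currency avoids silently dropping an investment from the total.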