Can Diffbot crawl sites that use “infinite” or “endless” scrolling?

No. Crawl will only pursue links that are available upon an initial page load.

Crawl does not currently interact with pages, so it cannot retrieve or follow links that appear only as a page is scrolled (so-called “infinite” or “endless” scrolling).

In most cases, sites offer other ways to reach the same links:

  • related links (to other posts or products) on individual post or product pages
  • search filters or category links that narrow the number of results
  • a sitemap file (e.g. sitemap.xml) or similar map to individual item pages
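As an illustration of the sitemap option, the sketch below pulls item URLs out of a sitemap document so they can be reviewed or supplied as crawl seeds. It assumes the site publishes a standard sitemap in the sitemaps.org 0.9 format; the function name and the sample URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (sitemaps.org protocol 0.9).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> URL found in a sitemap document, in file order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/products/1</loc></url>
  <url><loc>https://example.com/products/2</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE):
        print(url)
```

The same function also returns the child sitemap locations when given a sitemap index file, since those are listed in <loc> elements as well.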

If you find a site that cannot be crawled without page scrolling, you may be able to improve results with the following approach:

  • Write custom JavaScript, supplied via Diffbot’s X-Evaluate custom header, that triggers a click or scroll event (or several such events).
  • Store your X-Evaluate header as a custom rule against the Analyze API for the site in question.
    This method lets you execute Ajax/JavaScript while crawling.
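The approach above might be tried out on a single page before the script is saved as a custom rule. The following is a minimal sketch, not a definitive implementation. It assumes that the X-Evaluate function signals its beginning and end with start() and end(), that the header is sent on direct API calls as X-Forward-X-Evaluate, and that three scrolls one second apart are enough for the page in question. The function name, token, and page URL are placeholders.

```python
from urllib.parse import urlencode

# JavaScript to run in the rendered page. Assumption: Diffbot's renderer
# provides start() and end() to mark when the script begins and finishes.
# The script scrolls to the bottom three times, one second apart, so that
# lazily loaded links have a chance to appear before extraction.
# No "//" comments inside the script: it is collapsed onto one line below.
SCROLL_JS = """
function() {
  start();
  var remaining = 3;
  var timer = setInterval(function() {
    window.scrollTo(0, document.body.scrollHeight);
    remaining -= 1;
    if (remaining <= 0) {
      clearInterval(timer);
      end();
    }
  }, 1000);
}
"""


def build_analyze_request(token, page_url, js=SCROLL_JS):
    """Return (url, headers) for an Analyze API call carrying the script."""
    # HTTP header values cannot contain newlines, so collapse all whitespace.
    header_value = " ".join(js.split())
    query = urlencode({"token": token, "url": page_url})
    url = "https://api.diffbot.com/v3/analyze?" + query
    # Assumed header name for passing X-Evaluate on a direct API call.
    headers = {"X-Forward-X-Evaluate": header_value}
    return url, headers


if __name__ == "__main__":
    url, headers = build_analyze_request("YOUR_TOKEN", "https://example.com/products")
    print(url)
    print(headers["X-Forward-X-Evaluate"])
```

The returned URL and headers can be passed to any HTTP client. Once the response includes the links that previously required scrolling, the same single-line script can be stored in the custom rule for the site.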

For assistance with the above, feel free to contact us at [email protected].

