Discussion

Automatically structure and extract entire threads of reviews/comments from articles, product pages, and forum threads.

The Discussion API automatically structures and extracts entire threads or lists of reviews/comments from most discussion pages, forums, and similarly structured web pages.

Test drive Discussion API without a trial token at diffbot.com/testdrive.

Response

The Discussion API returns data in JSON format.

Each response includes a request object, which contains request-specific metadata, and an objects array, which contains the extracted information for all objects on the submitted page.

When comments or review data are extracted through the Article or Product APIs, the discussion data is returned within a nested discussion object.
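As a rough illustration of the two response shapes described above, the following Python sketch collects discussion data whether it arrives as a top-level object or nested inside an Article or Product object. The helper name iter_discussions is hypothetical, and the sketch assumes the nested key is named discussion and that top-level discussion objects carry "type": "discussion".

```python
def iter_discussions(response):
    """Yield every discussion found in a parsed (dict) API response.

    Assumption: a Discussion API response holds discussion objects directly
    in "objects", while Article/Product responses nest them under a
    "discussion" key on each object.
    """
    for obj in response.get("objects", []):
        if obj.get("type") == "discussion":
            yield obj
        elif isinstance(obj.get("discussion"), dict):
            yield obj["discussion"]


# Hand-written, trimmed stand-ins for real responses.
discussion_response = {
    "objects": [{"type": "discussion", "posts": [{"id": 0}]}]
}
article_response = {
    "objects": [
        {"type": "article", "discussion": {"posts": [{"id": 0}, {"id": 1}]}}
    ]
}

for response in (discussion_response, article_response):
    for discussion in iter_discussions(response):
        print(len(discussion.get("posts", [])), "post(s)")
```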

The following is an example response from a successful extraction of comments on a Reddit post.

{
  "request": {
    "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
    "api": "discussion",
    "version": 3
  },
  "objects": [
    {
      "numPages": 1,
      "humanLanguage": "en",
      "confidence": 0.05500000089407453,
      "diffbotUri": "discussion|3|-870809033",
      "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
      "numPosts": 13,
      "type": "discussion",
      "title": "[OC] 66% of Top 50 Russian Exposed Companies Have Announced Sanctions",
      "posts": [
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "images": [
            {
              "naturalHeight": 767,
              "width": 457,
              "diffbotUri": "image|3|-804821395",
              "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
              "url": "https://preview.redd.it/l76k59t8jsm81.png?width=457&auto=webp&s=632efa1f24e607358bbec99c161a6aa579aebfe1",
              "naturalWidth": 457,
              "height": 767
            }
          ],
          "humanLanguage": "en",
          "author": "hicheoo",
          "authorUrl": "https://old.reddit.com/user/hicheoo",
          "diffbotUri": "post|3|29462830",
          "html": "<figure><a href=\"https://i.redd.it/l76k59t8jsm81.png\"><img src=\"https://preview.redd.it/l76k59t8jsm81.png?width=457&auto=webp&s=632efa1f24e607358bbec99c161a6aa579aebfe1\"></img></a></figure>\n<h2>Want to add to the discussion?</h2>\n<p>Post a comment!</p>\n<p>Create an account</p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 0,
          "text": "Want to add to the discussion?\nPost a comment!\n\n \nCreate an account",
          "type": "post",
          "title": "[OC] 66% of Top 50 Russian Exposed Companies Have Announced Sanctions"
        },
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "humanLanguage": "en",
          "author": "not_mig",
          "authorUrl": "https://old.reddit.com/user/not_mig",
          "diffbotUri": "post|3|-720375378",
          "html": "<p>What's the difference between blue, yellow, and green?</p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 1,
          "text": "What's the difference between blue, yellow, and green?",
          "type": "post",
          "parentId": 0
        },
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "humanLanguage": "en",
          "author": "hicheoo",
          "authorUrl": "https://old.reddit.com/user/hicheoo",
          "diffbotUri": "post|3|-148816221",
          "html": "<p>They're exemptions. I should've clarified up top, but they're basically all in the description.</p>\n<p>Green: Typical Sanctions<br>\n Yellow: Sanctions, but might be a PR move.<br>\n Blue: Healthcare</p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 2,
          "text": "They're exemptions. I should've clarified up top, but they're basically all in the description.\nGreen: Typical Sanctions\nYellow: Sanctions, but might be a PR move.\nBlue: Healthcare",
          "type": "post",
          "parentId": 1
        },
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "humanLanguage": "en",
          "author": "Zealousideal-Lie7255",
          "authorUrl": "https://old.reddit.com/user/Zealousideal-Lie7255",
          "diffbotUri": "post|3|-683402068",
          "html": "<p>A lot of oil service companies have no reported sanctions. Like Schlumberger, Baker Hughes. Some Chinese companies too.</p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 3,
          "text": "A lot of oil service companies have no reported sanctions. Like Schlumberger, Baker Hughes. Some Chinese companies too.",
          "type": "post",
          "parentId": 0
        },
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "humanLanguage": "en",
          "author": "varnima",
          "authorUrl": "https://old.reddit.com/user/varnima",
          "diffbotUri": "post|3|-603833918",
          "html": "<p>JetBrains changed and imposed sanctions <a href=\"https://blog.jetbrains.com/blog/2022/03/11/jetbrains-statement-on-ukraine/\">https://blog.jetbrains.com/blog/2022/03/11/jetbrains-statement-on-ukraine/</a></p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 4,
          "text": "JetBrains changed and imposed sanctions https://blog.jetbrains.com/blog/2022/03/11/jetbrains-statement-on-ukraine/",
          "type": "post",
          "parentId": 0
        },
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "humanLanguage": "en",
          "author": "hicheoo",
          "authorUrl": "https://old.reddit.com/user/hicheoo",
          "diffbotUri": "post|3|-296888207",
          "html": "<p>Yeah, they're green in the chart.</p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 5,
          "text": "Yeah, they're green in the chart.",
          "type": "post",
          "parentId": 4
        },
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "humanLanguage": "en",
          "author": "hicheoo",
          "authorUrl": "https://old.reddit.com/user/hicheoo",
          "diffbotUri": "post|3|624793084",
          "html": "<p><strong>Sources:</strong> - Diffbot Sanctions Tracker (<a href=\"https://www.diffbot.com/insights/every-company-affected-by-sanctions/\">https://www.diffbot.com/insights/every-company-affected-by-sanctions/</a>) - Diffbot Knowledge Graph (more detail on query below)</p>\n<p><strong>Data Viz Tool:</strong> Infogram</p>\n<p><strong>Disclaimer:</strong> I work for Diffbot</p>\n<p>I started by querying the Knowledge Graph for people who live in Russia but work for a non-Russian company. Faceting this query by their employer provides me with a list of non-Russian companies ranked by # of Russian employees.</p>\n<p><code>\ntype:Person location.country.name:&quot;Russia&quot; employments.{employer.{location.country.name!=&quot;Russia&quot; nbLocations&gt;0} isCurrent:true} facet:employments.{employer.name isCurrent:true}\n</code></p>\n<p>This data underrepresents actual employment figures, as there are many employees who do not maintain an internet presence linking them to their employer. Underrepresentation should be fairly equal across all companies, and relative position in the rankings should be accurate.</p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 6,
          "text": "Sources: - Diffbot Sanctions Tracker (https://www.diffbot.com/insights/every-company-affected-by-sanctions/) - Diffbot Knowledge Graph (more detail on query below)\nData Viz Tool: Infogram\nDisclaimer: I work for Diffbot\nI started by querying the Knowledge Graph for people who live in Russia but work for a non-Russian company. Faceting this query by their employer provides me with a list of non-Russian companies ranked by # of Russian employees.\ntype:Person location.country.name:\"Russia\" employments.{employer.{location.country.name!=\"Russia\" nbLocations>0} isCurrent:true} facet:employments.{employer.name isCurrent:true}\nThis data underrepresents actual employment figures, as there are many employees who do not maintain an internet presence linking them to their employer. Underrepresentation should be fairly equal across all companies, and relative position in the rankings should be accurate.",
          "type": "post",
          "parentId": 0
        },
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "humanLanguage": "en",
          "author": "zzzmick",
          "authorUrl": "https://old.reddit.com/user/zzzmick",
          "diffbotUri": "post|3|-130810969",
          "html": "<p>epam had over 10k employees in Russia</p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 7,
          "text": "epam had over 10k employees in Russia",
          "type": "post",
          "parentId": 0
        },
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "humanLanguage": "en",
          "author": "hicheoo",
          "authorUrl": "https://old.reddit.com/user/hicheoo",
          "diffbotUri": "post|3|-1458692070",
          "html": "<p>Yup. The data underrepresents actual employment figures, as there are many employees who do not maintain an internet presence linking them to their employer. Underrepresentation should be fairly equal across all companies, and relative position in the rankings should be accurate.</p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 8,
          "text": "Yup. The data underrepresents actual employment figures, as there are many employees who do not maintain an internet presence linking them to their employer. Underrepresentation should be fairly equal across all companies, and relative position in the rankings should be accurate.",
          "type": "post",
          "parentId": 7
        },
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "humanLanguage": "en",
          "author": "JanitorKarl",
          "authorUrl": "https://old.reddit.com/user/JanitorKarl",
          "diffbotUri": "post|3|-149138223",
          "html": "<p>Schlumberger and Baker Hughes are both in the oilfield services industry.</p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 9,
          "text": "Schlumberger and Baker Hughes are both in the oilfield services industry.",
          "type": "post",
          "parentId": 0
        },
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "humanLanguage": "en",
          "author": "flumenia",
          "authorUrl": "https://old.reddit.com/user/flumenia",
          "diffbotUri": "post|3|889762151",
          "html": "<p>What if Microsoft stops to extend licenses of Microsoft Office to Russia? That would make the biggest impact, I guess</p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 10,
          "text": "What if Microsoft stops to extend licenses of Microsoft Office to Russia? That would make the biggest impact, I guess",
          "type": "post",
          "parentId": 0
        },
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "humanLanguage": "en",
          "author": "Imperial_Empirical",
          "authorUrl": "https://old.reddit.com/user/Imperial_Empirical",
          "diffbotUri": "post|3|-179317804",
          "html": "<p>Putin ordered the development of Russian alternatives after the Crimean annexation due to dependancy/spying fears. I believe from 2016 onwards Microsoft was largely fased out internally.</p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 11,
          "text": "Putin ordered the development of Russian alternatives after the Crimean annexation due to dependancy/spying fears. I believe from 2016 onwards Microsoft was largely fased out internally.",
          "type": "post",
          "parentId": 10
        },
        {
          "date": "Fri, 11 Mar 2022 00:00:00 GMT",
          "humanLanguage": "en",
          "author": "Nightblood83",
          "authorUrl": "https://old.reddit.com/user/Nightblood83",
          "diffbotUri": "post|3|-901046006",
          "html": "<p>A lot of accountants for commies...</p>",
          "pageUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/",
          "id": 12,
          "text": "A lot of accountants for commies...",
          "type": "post",
          "parentId": 0
        }
      ],
      "tags": [
        {
          "score": 0.8428076505661011,
          "count": 5,
          "label": "economic sanctions",
          "uri": "https://diffbot.com/entity/EWnXSPtH6Osi0pmx8-WPKAg",
          "rdfTypes": [
            "http://dbpedia.org/ontology/Miscellaneous"
          ]
        }
      ],
      "participants": 9,
      "rssUrl": "https://old.reddit.com/r/dataisbeautiful/comments/tbvdhu/oc_66_of_top_50_russian_exposed_companies_have/.rss"
    }
  ]
}
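
In the example above, each reply carries a parentId that points at the id of the post it answers, while the original post has no parentId. One way to rebuild the reply hierarchy from that flat posts array is sketched below in Python. The function name thread_outline is hypothetical, and the sample data is a trimmed copy of the response above.

```python
from collections import defaultdict


def thread_outline(posts):
    """Return (depth, post) pairs in reading order, parents before replies.

    Posts whose parentId does not match any id in the list are skipped.
    """
    children = defaultdict(list)  # parentId -> replies, in page order
    roots = []                    # posts without a parentId
    for post in posts:
        if "parentId" in post:
            children[post["parentId"]].append(post)
        else:
            roots.append(post)

    outline = []
    # Iterative depth-first walk; reversed() keeps the original page order.
    stack = [(0, root) for root in reversed(roots)]
    while stack:
        depth, post = stack.pop()
        outline.append((depth, post))
        for reply in reversed(children.get(post["id"], [])):
            stack.append((depth + 1, reply))
    return outline


# Trimmed copy of the first four posts from the example response.
sample_posts = [
    {"id": 0, "author": "hicheoo"},
    {"id": 1, "author": "not_mig", "parentId": 0},
    {"id": 2, "author": "hicheoo", "parentId": 1},
    {"id": 3, "author": "Zealousideal-Lie7255", "parentId": 0},
]

for depth, post in thread_outline(sample_posts):
    print("  " * depth + post["author"])
```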

Optional Fields

Specify each desired field, comma-delimited, in the &fields= argument. In addition to the fields listed below, further optional fields are available with all Extract APIs.

| Field | Description |
| --- | --- |
| sentiment | Returns a sentiment score of each individual post, a value ranging from -1.0 (very negative) to 1.0 (very positive). |
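
A request that asks for the sentiment field might be assembled as in the Python sketch below. Only the &fields= argument comes from this page. The endpoint URL and the token and url parameter names are assumptions, as are the helper names and the idea that each post then carries a numeric sentiment key. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm against your account's API reference.
API_ENDPOINT = "https://api.diffbot.com/v3/discussion"


def build_discussion_url(token, page_url, fields=()):
    """Build a GET URL, joining optional fields with commas for &fields=."""
    params = {"token": token, "url": page_url}  # assumed parameter names
    if fields:
        params["fields"] = ",".join(fields)
    return f"{API_ENDPOINT}?{urlencode(params)}"


def mean_sentiment(posts):
    """Average the per-post sentiment scores, ignoring posts without one."""
    scores = [post["sentiment"] for post in posts if "sentiment" in post]
    return sum(scores) / len(scores) if scores else None


print(build_discussion_url("TOKEN", "https://example.com/thread", ["sentiment"]))
```

Because each score lies between -1.0 and 1.0, the average computed by mean_sentiment stays in the same range.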

Already have the source HTML? POST it to Discussion API.

Discussion API supports a POST option that allows you to upload HTML or plain text for extraction. See Extract Content Not Available Online.
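
The POST option could be exercised along the lines of the Python sketch below, which prepares a request object without sending it. The endpoint URL, the token and url query parameters, and the text/html Content-Type header are assumptions not stated on this page, and build_post_request is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint; confirm against your account's API reference.
API_ENDPOINT = "https://api.diffbot.com/v3/discussion"


def build_post_request(token, page_url, html):
    """Prepare a POST request whose body is the HTML to extract from.

    Assumption: the page URL is still passed as a query parameter so that
    relative links in the uploaded markup can be resolved.
    """
    query = urlencode({"token": token, "url": page_url})
    return Request(
        f"{API_ENDPOINT}?{query}",
        data=html.encode("utf-8"),
        headers={"Content-Type": "text/html"},  # assumed; use text/plain for plain text
        method="POST",
    )


request = build_post_request("TOKEN", "https://example.com/thread", "<p>hi</p>")
print(request.get_method(), request.full_url)
```

To actually send the request, pass it to urllib.request.urlopen and parse the response body with json.load.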
