Google News Scraper API

Q: How does this differ from scraping news.google.com myself?

Google News uses internal RPCs and aggressive anti-bot detection. Scraping it directly means a headless browser, login token management, and stripping AMP wrappers from result links. This API consumes the structured feed directly and returns parsed JSON: direct publisher URLs (no redirect chain), ISO dates, bylines, thumbnails, all in a single call.

Q: What's the difference between a flat article and a cluster?

A flat article is a single piece: one publisher, one URL, one byline. A cluster is a story covered by multiple outlets. The outer entry holds the cluster headline and a stories[] array of individual articles, each the same schema as a flat article. Clusters carry a story_token you can pass to get the full multi-outlet coverage page.
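As a rough sketch of how a client might handle both shapes, the helper below flattens `news_results` into one list of articles. It relies only on fields shown in the sample response (`stories`, `title`, `link`); the function name `flatten_results` and the added `cluster_title` key are illustrative and not part of the API.

```python
def flatten_results(news_results):
    """Return one flat list of articles from mixed flat/cluster entries."""
    articles = []
    for entry in news_results:
        stories = entry.get("stories")
        if stories:
            # Cluster: each story shares the flat-article schema.
            for story in stories:
                articles.append({**story, "cluster_title": entry.get("title")})
        else:
            # Flat article: there is no enclosing cluster.
            articles.append({**entry, "cluster_title": None})
    return articles


sample = [
    {"title": "Cluster headline", "stories": [
        {"title": "A", "link": "https://example.com/a"},
        {"title": "B", "link": "https://example.com/b"},
    ]},
    {"title": "C", "link": "https://example.com/c"},
]
print([article["link"] for article in flatten_results(sample)])
```

Keeping the cluster headline on each flattened story lets you regroup later without a second request.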

Q: Why are there six driver parameters?

Each one is a different navigation primitive. q for keyword search, kgmid for entity-scoped feeds, topic_token for editorial topics, section_token for sections within a topic, story_token for full-coverage cluster pages, publication_token for a publisher's feed. You navigate Google's news graph by chaining tokens returned in any response.
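One way to keep the drivers straight in client code is a small URL builder. This is a sketch only: the endpoint is the one from the sample request, `build_url` is a hypothetical helper, and the rule that a request carries exactly one driver is an assumption rather than something the page states.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.scrape.do/plugin/google/news"  # endpoint from the sample request
DRIVERS = ("q", "kgmid", "topic_token", "section_token",
           "story_token", "publication_token")


def build_url(token, **params):
    """Build a request URL, assuming exactly one driver parameter per call."""
    used = [name for name in DRIVERS if params.get(name)]
    if len(used) != 1:
        raise ValueError(f"expected exactly one driver, got {used}")
    # urlencode percent-encodes values, so raw tokens and queries are safe to pass.
    return f"{BASE_URL}?{urlencode({'token': token, **params})}"


print(build_url("<SDO-token>", q="openai"))
print(build_url("<SDO-token>", topic_token="CAAq..."))
```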

Q: How do I paginate?

You don't. Google News doesn't paginate. A single response returns the first page (~100 articles for a keyword search). To go deeper, chain tokens: pivot from a result into its topic_token, then into a section_token, then into a story_token, and so on. Each hop gives you a different angle, not a deeper page.
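To make the chaining idea concrete, the sketch below gathers every navigation token found in a parsed response so a client can choose the next hop. It assumes the top-level keys shown in the sample response (`news_results`, `menu_links`, `related_topics`) and that stories inside a cluster may carry tokens as well; `collect_tokens` is an illustrative name.

```python
TOKEN_KEYS = ("topic_token", "section_token", "story_token", "publication_token")


def collect_tokens(response):
    """Return (token_type, token, title) tuples found in a parsed response."""
    found = []
    for section in ("news_results", "menu_links", "related_topics"):
        for entry in response.get(section, []):
            # Cluster entries nest articles under stories[]; check those too.
            for item in [entry, *entry.get("stories", [])]:
                for key in TOKEN_KEYS:
                    if key in item:
                        found.append((key, item[key], item.get("title")))
    return found


response = {
    "news_results": [
        {"title": "Article", "topic_token": "t1", "publication_token": "p1"},
    ],
    "menu_links": [{"title": "Business", "topic_token": "t2"}],
    "related_topics": [{"title": "OpenAI", "topic_token": "t3"}],
}
for token_type, token, title in collect_tokens(response):
    print(token_type, token, title)
```

Each tuple can be sent back as the driver of the next request, which is the "different angle" described above.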

Q: Are the article links direct, or do they go through Google's redirector?

Direct. news_results[].link is the publisher URL. No Google redirect, no AMP wrapper. You can fetch the article HTML straight from the link without an extra hop.

Q: How do I sort results by date?

Pass so=1 (default 0 = relevance). Only valid with q (keyword search) or kgmid (entity). Other drivers use Google's editorial ordering, recency-weighted but not strictly chronological.
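For drivers where so does not apply, one option is to sort on the client using iso_date. A minimal sketch, assuming every article carries an `iso_date` string in the format shown in the sample response and that clusters have already been flattened into individual articles; `sort_newest_first` is an illustrative name.

```python
from datetime import datetime


def sort_newest_first(articles):
    """Sort articles by their iso_date field, most recent first."""
    def parsed(article):
        # fromisoformat() accepts a trailing "Z" only on Python 3.11+, so normalize it.
        return datetime.fromisoformat(article["iso_date"].replace("Z", "+00:00"))
    return sorted(articles, key=parsed, reverse=True)


articles = [
    {"title": "old", "iso_date": "2026-04-20T14:00:00Z"},
    {"title": "new", "iso_date": "2026-04-21T12:10:50Z"},
    {"title": "mid", "iso_date": "2026-04-21T11:00:00Z"},
]
print([a["title"] for a in sort_newest_first(articles)])
```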

Q: What's `kgmid`?

A Knowledge Graph entity ID. Each well-known person, place, or organization has a stable kgmid (e.g., /m/02_286 = New York City, /m/0k8z = Apple Inc.). Unlike topic_token, kgmid doesn't rotate. You can store it across days. Use kgmid + so=1 to get a date-sorted feed for a specific entity.
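Because a kgmid contains slashes, it should be percent-encoded when placed in a query string. The sketch below builds the query for a date-sorted entity feed with the standard library; the token value is the placeholder from the sample request.

```python
from urllib.parse import urlencode

params = {
    "token": "<SDO-token>",  # placeholder, as in the sample request
    "kgmid": "/m/0k8z",      # Apple Inc.
    "so": 1,                 # sort by date instead of relevance
}
# urlencode escapes "/" as %2F, so the ID survives as a single parameter value.
query = urlencode(params)
print(query)
```

If you pass the same dictionary to `requests.get(..., params=params)`, requests performs this encoding for you.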

Q: What if a token I've stored stops working?

Tokens (topic_token, section_token, story_token, publication_token) rotate occasionally. When one stops working, the API returns 502 unexpected response. Recovery is straightforward: fetch a fresh token from a recent search response and retry. Don't store tokens long-term; treat them as ephemeral cursors. The two stable identifiers are q (strings) and kgmid (entity IDs).
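The recovery flow can be sketched as a retry wrapper. Everything here is hypothetical scaffolding: `fetch` is a caller-supplied function that takes request parameters and returns `(status_code, parsed_json)`, and `fake_fetch` stands in for a real HTTP call. Taking the first token of the same type from a fresh search is a simplification; a real client would likely match on title or publisher to stay on the same topic.

```python
def fetch_with_refresh(fetch, driver, token, refresh_query):
    """Try a stored token; on 502, pull a fresh one from a search and retry once."""
    status, body = fetch({driver: token})
    if status != 502:
        return status, body
    # Stale token: run a keyword search and look for a replacement of the same type.
    status, search = fetch({"q": refresh_query})
    if status != 200:
        return status, search
    for entry in search.get("news_results", []):
        fresh = entry.get(driver)
        if fresh:
            return fetch({driver: fresh})
    return 502, None  # no replacement token found in the search response


# Stand-in for a real HTTP call, used only to exercise the logic.
def fake_fetch(params):
    if params.get("topic_token") == "stale":
        return 502, None
    if "q" in params:
        return 200, {"news_results": [{"title": "OpenAI", "topic_token": "fresh"}]}
    return 200, {"requested": params}


print(fetch_with_refresh(fake_fetch, "topic_token", "stale", "openai"))
```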

Q: Can I get news in a specific language?

Yes. hl (language) and gl (country) work independently. Use hl=de&gl=de for German-language news from Germany, or hl=ja&gl=jp for Japanese-language news from Japan. You get localized publishers, localized menu_links, and localized related_topics: the full Google News experience for that locale.

Q: Where can I find the full API reference?

The Google News API documentation covers the full driver list, response schema (flat article + cluster shapes), the navigation playbook with chaining patterns, and all error responses including stale-token recovery.

Google News Articles with Direct Publisher URLs

Scrape Google News by keyword, topic, story, publication, or Knowledge Graph entity. Each result returns the publisher URL directly. No redirect chains, no AMP wrappers. Bylines, ISO dates, thumbnails, and chained navigation tokens included.

START SCRAPING FOR FREE

Start scraping today with 1,000 free credits. No credit card required.

Six Drivers, One Endpoint

Keyword search (`q`), topic stream (`topic_token`), section (`section_token`), story cluster (`story_token`), publication (`publication_token`), or Knowledge Graph entity (`kgmid`), all through one endpoint. Tokens chain across responses for graph-style navigation.

No pagination. A single response returns ~100 articles for keyword searches. Deeper browsing happens by chaining tokens through the news graph.

Direct Publisher URLs and Story Clusters

`news_results[].link` is the publisher URL directly. No Google redirect to follow, no AMP wrapper to strip. Results also include ISO 8601 dates, bylines (`source.authors`), publisher icons, and both full-size and small thumbnails.

Multi-outlet story clusters come back with a `stories[]` array, each entry the same schema as a flat article. Pass the cluster's `story_token` to fetch full coverage across all outlets.

How It Works

Select a Target

Send API Request

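import json

import requests

# Replace with your Scrape.do API token.
token = "<SDO-token>"

url = "https://api.scrape.do/plugin/google/news"
# Passing params lets requests URL-encode the token and the query.
params = {"token": token, "q": "openai"}

response = requests.get(url, params=params, timeout=30)
response.raise_for_status()  # fail loudly on 4xx/5xx, e.g. the 502 for a stale token

print(json.dumps(response.json(), indent=2))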

Get Structured JSON

{
  "search_parameters": {
    "engine": "google_news",
    "q": "openai",
    "google_domain": "google.com",
    "hl": "en",
    "gl": "us"
  },
  "news_results": [
    {
      "position": 1,
      "title": "Federal Reserve signals rate cut ahead",
      "stories": [
        {
          "position": 1,
          "title": "Federal Reserve signals rate cut ahead",
          "link": "https://www.reuters.com/markets/us/federal-reserve-signals-rate-cut-2026-04-21/",
          "source": { "name": "Reuters", "icon": "..." },
          "date": "2 hours ago",
          "iso_date": "2026-04-21T12:10:50Z",
          "thumbnail": "..."
        },
        {
          "position": 2,
          "title": "Wall Street rallies on Fed signal",
          "link": "https://www.wsj.com/...",
          "source": { "name": "The Wall Street Journal" },
          "date": "3 hours ago",
          "iso_date": "2026-04-21T11:00:00Z"
        }
      ]
    },
    {
      "position": 2,
      "title": "OpenAI Takes Aim at Google with New Image Model",
      "link": "https://www.theinformation.com/newsletters/ai-agenda/openai-takes-aim-google-new-image-model",
      "source": {
        "name": "The Information",
        "title": "The Information",
        "icon": "https://encrypted-tbn3.gstatic.com/faviconV2?...",
        "authors": ["Stephanie Palazzolo"]
      },
      "date": "20 hours ago",
      "iso_date": "2026-04-20T14:00:00Z",
      "thumbnail": "https://tii.imgix.net/production/articles/16959/46bed976.png",
      "topic_token": "CAAqKAgKIiJDQkFTRXdvTkwyY3ZNVEZuWW1oeGNqaHhlaElDWlc0b0FBUAE",
      "publication_token": "CAAqLggKIihDQklTR0FnTWFoUUtFblJvWldsdVptOXliV0YwYVc5dUxtTnZiU2dBUAE"
    }
  ],
  "menu_links": [
    { "position": 1, "title": "U.S.",       "topic_token": "CAAq..." },
    { "position": 2, "title": "World",      "topic_token": "CAAq..." },
    { "position": 3, "title": "Business",   "topic_token": "CAAq..." },
    { "position": 4, "title": "Technology", "topic_token": "CAAq..." }
  ],
  "related_topics": [
    { "title": "OpenAI", "topic_token": "CAAq..." },
    { "title": "Sam Altman", "topic_token": "CAAq..." }
  ]
}

Plans & pricing

How many Google News Scraper API calls per plan?

Every Google News Scraper API call costs 10 credits. Here's what each plan buys you.

Free: 1,000 credits = 100 Google News Scraper API calls / month

Hobby ($29/mo): 250,000 credits = 25,000 Google News Scraper API calls / month
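The per-plan figures follow directly from the 10-credit cost per call. As a quick check, using only the credit amounts listed above (`calls_per_month` is an illustrative helper):

```python
CREDITS_PER_CALL = 10  # cost of one Google News Scraper API call, as stated above


def calls_per_month(credits):
    """Number of whole API calls a monthly credit allowance covers."""
    return credits // CREDITS_PER_CALL


for plan, credits in {"Free": 1_000, "Hobby": 250_000}.items():
    print(f"{plan}: {calls_per_month(credits):,} calls / month")
```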

Reliable, Scalable, Unstoppable Web Scraping
