Google News API

Scrape Google News articles, topic streams, story clusters, and publisher pages as structured JSON

The Google News API is a specialized plugin that returns Google News results (articles, topic clusters, menu navigation, related topics) as clean JSON. Each request is a single HTTP call. To browse deeper, chain the tokens (topic_token, section_token, story_token, publication_token) returned in any response.

Credit Usage: Each successful request costs 10 credits. For bulk processing, use the Async API with plugins.

Key Features

Six Driver Parameters: Search by keyword (q), browse a topic (topic_token), drill into a section (section_token), expand a story cluster (story_token), view a publication's page (publication_token), or pivot through a Knowledge Graph entity (kgmid).
Direct Publisher URLs: news_results[].link is the publisher URL directly. No Google redirect to follow, no AMP wrapper to unwrap.
Per-Article Metadata: Bylines (source.authors), publisher icons, relative date ("3 hours ago"), ISO 8601 timestamps, and thumbnails (full + small).
Story Clusters: Multi-outlet coverage of the same news story comes back as a cluster object with a stories[] array, same schema as flat articles.
Sort by Relevance or Date: so=0 / so=1 controls keyword and entity searches. Other drivers use Google's editorial ordering.
Localized Menus & Topics: menu_links[] (top nav: U.S., World, Business, Technology, …) and related_topics[] are returned localized for hl / gl.
No Pagination Quirks: A single response returns the first page (~100 articles). Deeper browsing is done by chaining tokens, with no manual offset arithmetic.
No Blocks or CAPTCHAs: All anti-bot measures are handled automatically by Scrape.do.

Endpoint

GET https://api.scrape.do/plugin/google/news

Request Parameters

Required

Parameter	Type	Description
`token`	string	Your Scrape.do API authentication token

Plus exactly one driver:

Driver	Type	Fetches
`q`	string	Keyword search (e.g., `q=openai`)
`topic_token`	string	A topic stream (U.S., World, Business, Technology, …)
`section_token`	string	A section within a topic (Latest, For You, Opinion, …)
`story_token`	string	Full-coverage page for a single news story
`publication_token`	string	A publisher's page (CNN, BBC, Reuters, …)
`kgmid`	string	A Knowledge Graph entity ID (e.g., `/m/02_286` for New York City)

Sending no driver returns `400 one of q, topic_token, section_token, story_token, publication_token, kgmid is required`. Sending more than one returns `400 exactly one of ... may be set`.
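The exactly-one-driver rule can also be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `pick_driver` is a hypothetical helper name, and the error strings simply mirror the messages documented above.

```python
# Client-side mirror of the "exactly one driver" rule (hypothetical helper).
DRIVERS = (
    "q",
    "topic_token",
    "section_token",
    "story_token",
    "publication_token",
    "kgmid",
)


def pick_driver(params: dict) -> str:
    """Return the name of the single driver set in params, or raise ValueError."""
    present = [name for name in DRIVERS if params.get(name)]
    if not present:
        raise ValueError("one of " + ", ".join(DRIVERS) + " is required")
    if len(present) > 1:
        raise ValueError("exactly one of " + ", ".join(DRIVERS) + " may be set")
    return present[0]


print(pick_driver({"token": "<SDO-token>", "q": "openai"}))
```

Failing early like this avoids spending a round trip on a request the API would reject with a 400.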

Tokens are returned inside news_results[], menu_links[], sub_menu_links[], related_topics[], and related_publications[] on every response. Chain them to navigate.

Tokens rotate occasionally. When a token stops working (502 unexpected response), fetch a fresh one from a recent response and retry.

Localization

Parameter	Type	Default	Description
`hl`	string	`en`	Language code (e.g., `en`, `tr`, `de`, `fr`, `ja`, `pt-br`)
`gl`	string	`us`	Country code (e.g., `us`, `gb`, `de`, `tr`, `jp`, `br`)
`google_domain`	string	`google.com`	Echoed back; Google News uses one global origin and the locale comes from `hl` / `gl`

Sort (search mode only)

Parameter	Values	Description
`so`	`0` or `1`	`0` = by relevance (default), `1` = by date. Only valid with `q` or `kgmid`
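The parameters above can be assembled into a request URL with the standard library. This is a sketch under assumptions: `build_news_url` is a hypothetical helper, the `so` checks are a client-side mirror of the documented rules, and it assumes the endpoint accepts standard percent-encoded query values (for example `/` in a `kgmid` becoming `%2F`).

```python
from urllib.parse import urlencode

BASE_URL = "https://api.scrape.do/plugin/google/news"
SEARCH_DRIVERS = {"q", "kgmid"}  # the only drivers that accept `so`


def build_news_url(token, driver, value, hl="en", gl="us", so=None):
    """Build a request URL for one driver plus localization and optional sort."""
    if so is not None:
        if driver not in SEARCH_DRIVERS:
            raise ValueError("so is only valid with q or kgmid (search mode)")
        if so not in (0, 1):
            raise ValueError("so must be 0 (relevance) or 1 (date)")
    params = {"token": token, driver: value, "hl": hl, "gl": gl}
    if so is not None:
        params["so"] = so
    # urlencode percent-encodes reserved characters in every value
    return f"{BASE_URL}?{urlencode(params)}"


print(build_news_url("<SDO-token>", "q", "bundesliga", hl="de", gl="de", so=1))
```

Passing `hl` and `gl` explicitly keeps localized requests reproducible even though both have defaults.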

Example Usage

Keyword Search

curl --location --request GET 'https://api.scrape.do/plugin/google/news?token=<SDO-token>&q=openai'

import requests
import json

token = "<SDO-token>"

url = f"https://api.scrape.do/plugin/google/news?token={token}&q=openai"

response = requests.request("GET", url)

print(json.dumps(response.json(), indent=2))

const axios = require('axios');

const token = "<SDO-token>";

const url = `https://api.scrape.do/plugin/google/news?token=${token}&q=openai`;

axios.get(url)
  .then(response => {
    console.log(JSON.stringify(response.data, null, 2));
  })
  .catch(error => {
    console.error(error);
  });

package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	token := "<SDO-token>"

	url := fmt.Sprintf(
		"https://api.scrape.do/plugin/google/news?token=%s&q=openai",
		token,
	)

	resp, err := http.Get(url)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+)
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}

require 'net/http'
require 'json'

token = "<SDO-token>"

url = URI("https://api.scrape.do/plugin/google/news?token=#{token}&q=openai")

response = Net::HTTP.get(url)

puts JSON.pretty_generate(JSON.parse(response))

import java.net.HttpURLConnection;
import java.net.URL;
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class GoogleNews {
    public static void main(String[] args) throws Exception {
        String token = "<SDO-token>";

        String url = String.format(
            "https://api.scrape.do/plugin/google/news?token=%s&q=openai",
            token
        );

        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");

        BufferedReader reader = new BufferedReader(
            new InputStreamReader(conn.getInputStream())
        );
        String line;
        StringBuilder response = new StringBuilder();
        while ((line = reader.readLine()) != null) {
            response.append(line);
        }
        reader.close();

        System.out.println(response.toString());
    }
}

using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        string token = "<SDO-token>";

        string url = $"https://api.scrape.do/plugin/google/news?token={token}&q=openai";

        using HttpClient client = new HttpClient();
        string response = await client.GetStringAsync(url);

        Console.WriteLine(response);
    }
}

<?php
$token = "<SDO-token>";

$url = "https://api.scrape.do/plugin/google/news?token={$token}&q=openai";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$response = curl_exec($ch);
curl_close($ch);

echo json_encode(json_decode($response), JSON_PRETTY_PRINT);
?>

curl "https://api.scrape.do/plugin/google/news?q=openai&hl=en&gl=us&token=$TOKEN"

Localized Keyword Search

curl "https://api.scrape.do/plugin/google/news?q=bundesliga&hl=de&gl=de&token=$TOKEN"

Sort by Date

curl "https://api.scrape.do/plugin/google/news?q=openai&so=1&token=$TOKEN"

Topic Stream

curl "https://api.scrape.do/plugin/google/news?topic_token=CAAqJggKIiBDQkFTRWdvSUwyMHZNRGRqTVhZU0FtVnVHZ0pWVXlnQVAB&token=$TOKEN"

Story Cluster

curl "https://api.scrape.do/plugin/google/news?story_token=CAAqNggKIjBDQklTSGpvSmMzUnZjbmt0TXpZd1NoRUtEd2lLOWRfNUVCRllrZlFnbTlaN3F5Z0FQAQ&token=$TOKEN"

Knowledge Graph Entity

# /m/02_286 = New York City
curl "https://api.scrape.do/plugin/google/news?kgmid=/m/02_286&so=1&token=$TOKEN"

Response

Top-Level Shape

{
  "search_parameters": { ... },
  "title": "U.S.",
  "news_results": [ ... ],
  "menu_links": [ ... ],
  "sub_menu_links": [ ... ],
  "related_topics": [ ... ],
  "related_publications": [ ... ]
}

news_results is always an array (empty array when no results, never null). Other fields are present contextually:

title: populated on topic and publication responses (e.g. "Technology", "CNN").
menu_links: top navigation strip; same on every page (localized by hl / gl).
sub_menu_links: sections within the current topic / publication.
related_topics: populated on keyword searches when the query resolves to a known entity.
related_publications: populated on publication pages.

`search_parameters`

{
  "engine": "google_news",
  "q": "openai",
  "google_domain": "google.com",
  "hl": "en",
  "gl": "us"
}

Optional fields (topic_token, section_token, story_token, publication_token, kgmid, so) appear when set.

`news_results[]`

Each entry is either a flat article or a cluster (a single story covered by multiple outlets).

Flat article

{
  "position": 2,
  "title": "OpenAI Takes Aim at Google with New Image Model",
  "link": "https://www.theinformation.com/newsletters/ai-agenda/openai-takes-aim-google-new-image-model",
  "source": {
    "name": "The Information",
    "title": "The Information",
    "icon": "https://encrypted-tbn3.gstatic.com/faviconV2?...",
    "authors": ["Stephanie Palazzolo"]
  },
  "date": "20 hours ago",
  "iso_date": "2026-04-20T14:00:00Z",
  "thumbnail": "https://tii.imgix.net/production/articles/16959/46bed976.png",
  "thumbnail_small": "https://tii.imgix.net/production/articles/16959/46bed976.png",
  "topic_token": "CAAqKAgKIiJDQkFTRXdvTkwyY3ZNVEZuWW1oeGNqaHhlaElDWlc0b0FBUAE",
  "publication_token": "CAAqLggKIihDQklTR0FnTWFoUUtFblJvWldsdVptOXliV0YwYVc5dUxtTnZiU2dBUAE"
}

Cluster

The outer entry holds only the cluster headline; individual articles live in stories[].

{
  "position": 1,
  "title": "Federal Reserve signals rate cut ahead",
  "stories": [
    {
      "position": 1,
      "title": "Federal Reserve signals rate cut ahead",
      "link": "https://www.reuters.com/markets/us/federal-reserve-signals-rate-cut-2026-04-21/",
      "source": { "name": "Reuters", "icon": "..." },
      "date": "2 hours ago",
      "iso_date": "2026-04-21T12:10:50Z",
      "thumbnail": "..."
    },
    {
      "position": 2,
      "title": "Wall Street rallies on Fed signal",
      "link": "https://www.wsj.com/...",
      "source": { "name": "The Wall Street Journal" },
      "date": "3 hours ago",
      "iso_date": "2026-04-21T11:00:00Z"
    }
  ]
}
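Because an entry can be either a flat article or a cluster, consumers usually need to normalize the list before processing it. A minimal sketch, assuming only the shapes shown above; `flatten_results` is a hypothetical helper and the sample links are made up for illustration.

```python
def flatten_results(news_results):
    """Yield every article, expanding cluster entries into their stories[]."""
    for entry in news_results:
        stories = entry.get("stories")
        if stories:
            # Cluster: the outer entry only carries the headline
            for story in stories:
                yield story
        else:
            # Flat article: use it as-is
            yield entry


# Illustrative data in the documented shape (links are placeholders)
sample = [
    {
        "position": 1,
        "title": "Federal Reserve signals rate cut ahead",
        "stories": [
            {"position": 1, "title": "Story A", "link": "https://example.com/a"},
            {"position": 2, "title": "Story B", "link": "https://example.com/b"},
        ],
    },
    {"position": 2, "title": "Flat article", "link": "https://example.com/c"},
]

for article in flatten_results(sample):
    print(article["title"], article["link"])
```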

Field reference

Field	Type	Description
`position`	int	1-based position
`title`	string	Article (or cluster) headline
`link`	string	Direct publisher URL with no redirect to follow
`source.name`	string	Publisher name
`source.title`	string	Publisher display title
`source.icon`	string	Publisher favicon URL
`source.authors`	string[]	Bylines when Google exposes them
`date`	string	Relative time (`"3 hours ago"`, `"2 days ago"`)
`iso_date`	string	RFC 3339 publication timestamp
`thumbnail`	string	Article image URL when available
`thumbnail_small`	string	Small-variant image URL
`topic_token`	string	Token for the parent topic. Pass back as `topic_token`
`story_token`	string	Token for the story cluster. Pass back as `story_token`
`publication_token`	string	Token for the publisher. Pass back as `publication_token`
`section_token`	string	Token for the section. Pass back as `section_token`
`stories`	array	Cluster entries only: related articles covering the same story
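The `date` field is relative and hard to compare, so `iso_date` is the safer field for sorting or filtering. The sketch below assumes timestamps in the `...Z` form shown in the examples; `parse_iso_date` and `newest_first` are hypothetical helper names.

```python
from datetime import datetime


def parse_iso_date(value):
    """Parse an iso_date string such as "2026-04-20T14:00:00Z" into an aware datetime."""
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def newest_first(articles):
    """Return the articles that carry an iso_date, most recent first."""
    dated = [a for a in articles if a.get("iso_date")]
    return sorted(dated, key=lambda a: parse_iso_date(a["iso_date"]), reverse=True)


articles = [
    {"title": "older", "iso_date": "2026-04-20T14:00:00Z"},
    {"title": "newer", "iso_date": "2026-04-21T12:10:50Z"},
    {"title": "undated"},
]
print([a["title"] for a in newest_first(articles)])
```

Entries without an `iso_date` are dropped here rather than guessed at; whether to keep them is a design choice for the caller.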

`menu_links[]`

The top navigation strip; same on every page (localized by hl / gl).

[
  { "position": 1, "title": "U.S.",      "topic_token": "CAAq..." },
  { "position": 2, "title": "World",     "topic_token": "CAAq..." },
  { "position": 3, "title": "Business",  "topic_token": "CAAq..." },
  { "position": 4, "title": "Technology","topic_token": "CAAq..." }
]

`sub_menu_links[]`

Present on topic and publication pages: sections within the current scope (Latest, For You, Opinion, …), each with a section_token.

`related_topics[]`

Populated on keyword searches when the query resolves to a known entity (person, place, organization). Each item carries a topic_token you can use to pivot to that entity's topic stream.

`related_publications[]`

Populated on publication pages: adjacent publishers Google suggests.

Notes

No redirects. link is the publisher URL directly.
Tokens are opaque. Pass them back exactly as returned. Don't parse, construct, or cache them across days.
Tokens rotate. When a token stops working (502 unexpected response), fetch a fresh one from a recent search response and retry.
No pagination. A single response is the first page (~100 articles for keyword searches). Deeper browsing happens by chaining tokens.

Navigating with Tokens

Google News doesn't paginate. Instead, it exposes a graph of topics, sections, stories, and publications. You navigate by chaining the opaque *_token values returned in any response.

Token Types

Token	Where you find it	What it fetches
`topic_token`	`news_results[].topic_token`, `menu_links[].topic_token`, `related_topics[].topic_token`	A topic stream (U.S., Business, Technology, Sports, …)
`section_token`	`sub_menu_links[].section_token`	A section within a topic (Latest, For You, Opinion, …)
`story_token`	`news_results[].story_token` (on cluster entries)	Full coverage for a single news story across multiple outlets
`publication_token`	`news_results[].publication_token`, `related_publications[].publication_token`	A publisher's page (CNN, BBC, Reuters, …)

You also have:

q: keyword search (not a token; a string).
kgmid: a Knowledge Graph entity ID (e.g., /m/02_286 for New York City). Stable across responses.

Tokens are opaque. Pass them back exactly as returned. Don't parse, construct, or cache them across days. Tokens rotate occasionally; when one stops working (502 unexpected response), fetch a fresh one from a recent response.
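To see which pivots a response offers, the tokens listed in the table above can be gathered in one pass. A minimal sketch: `collect_tokens` is a hypothetical helper that only looks at the top-level arrays named in this document and treats every token as an opaque string.

```python
TOKEN_FIELDS = ("topic_token", "section_token", "story_token", "publication_token")
TOKEN_LISTS = (
    "news_results",
    "menu_links",
    "sub_menu_links",
    "related_topics",
    "related_publications",
)


def collect_tokens(response):
    """Group every token found in a response by token type, without duplicates."""
    found = {field: [] for field in TOKEN_FIELDS}
    for list_name in TOKEN_LISTS:
        # Contextual arrays may be missing or null, so fall back to an empty list
        for item in response.get(list_name) or []:
            for field in TOKEN_FIELDS:
                token = item.get(field)
                if token and token not in found[field]:
                    found[field].append(token)
    return found


# Illustrative response with placeholder token values
response = {
    "news_results": [{"title": "a", "topic_token": "T1", "publication_token": "P1"}],
    "menu_links": [{"title": "U.S.", "topic_token": "T2"}],
    "sub_menu_links": [{"title": "Latest", "section_token": "S1"}],
    "related_topics": [{"title": "Apple Inc.", "topic_token": "T1"}],
}
print(collect_tokens(response))
```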

Workflow Patterns

Search → Topic Pivot

A keyword search response includes related_topics[] when the query resolves to an entity. Pivot into the topic to get the editorial feed for that entity instead of search relevance.

# Step 1: search
curl -s "https://api.scrape.do/plugin/google/news?q=apple&token=$TOKEN" \
  | jq '.related_topics'

# Output:
# [ { "title": "Apple Inc.", "topic_token": "CAAq..." } ]

# Step 2: pivot into the topic
TOPIC=$(curl -s "https://api.scrape.do/plugin/google/news?q=apple&token=$TOKEN" \
  | jq -r '.related_topics[0].topic_token')

curl "https://api.scrape.do/plugin/google/news?topic_token=$TOPIC&token=$TOKEN"

Topic → Section

A topic response includes sub_menu_links[] with sections like Latest, For You, Opinion, etc. Drill into one for that section's feed.

SECTION=$(curl -s "https://api.scrape.do/plugin/google/news?topic_token=$TOPIC&token=$TOKEN" \
  | jq -r '.sub_menu_links[] | select(.title=="Latest") | .section_token')

curl "https://api.scrape.do/plugin/google/news?section_token=$SECTION&token=$TOKEN"

Article → Story Cluster

Search and topic results sometimes return cluster entries: entries with a stories[] array and a story_token. Pass story_token as the driver to get the full cluster page (often more outlets than the inline stories[] snapshot).

STORY=$(curl -s "https://api.scrape.do/plugin/google/news?q=fed+rate+cut&token=$TOKEN" \
  | jq -r '.news_results[] | select(.story_token) | .story_token' \
  | head -n 1)

curl "https://api.scrape.do/plugin/google/news?story_token=$STORY&token=$TOKEN"

Article → Publication

Each article carries a publication_token. Pivot to the publisher's page to see their recent coverage.

PUB=$(curl -s "https://api.scrape.do/plugin/google/news?q=openai&token=$TOKEN" \
  | jq -r '.news_results[0].publication_token')

curl "https://api.scrape.do/plugin/google/news?publication_token=$PUB&token=$TOKEN"

Menu → Topic

Every response includes menu_links[], Google News's top navigation strip (U.S., World, Business, Technology, Sports, Entertainment, Science, Health). Each entry exposes a topic_token. Use it to navigate without first searching.

TECH=$(curl -s "https://api.scrape.do/plugin/google/news?q=anything&token=$TOKEN" \
  | jq -r '.menu_links[] | select(.title=="Technology") | .topic_token')

curl "https://api.scrape.do/plugin/google/news?topic_token=$TECH&token=$TOKEN"
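The jq `select(.title==...)` lookups used in these patterns translate to a small helper when the response is already parsed in application code. This is a sketch only: `token_for_title` is a hypothetical name and the token values are placeholders. Titles are localized by hl / gl, so the title to match depends on the locale requested.

```python
def token_for_title(items, title, token_field):
    """Return the token of the first item whose title matches, or None."""
    for item in items or []:
        if item.get("title") == title:
            return item.get(token_field)
    return None


# Placeholder data in the documented menu_links[] / sub_menu_links[] shapes
menu_links = [
    {"position": 1, "title": "U.S.", "topic_token": "TOKEN_US"},
    {"position": 4, "title": "Technology", "topic_token": "TOKEN_TECH"},
]
sub_menu_links = [{"title": "Latest", "section_token": "TOKEN_LATEST"}]

print(token_for_title(menu_links, "Technology", "topic_token"))
print(token_for_title(sub_menu_links, "Latest", "section_token"))
```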

Knowledge Graph Pivot

kgmid is a stable identifier for a Knowledge Graph entity (a person, place, organization). Unlike the *_token values, it doesn't rotate, so you can store it.

# /m/02_286 = New York City, stable across responses
curl "https://api.scrape.do/plugin/google/news?kgmid=/m/02_286&so=1&token=$TOKEN"

# /m/0k8z = Apple Inc.
curl "https://api.scrape.do/plugin/google/news?kgmid=/m/0k8z&token=$TOKEN"

kgmid accepts the so (sort) parameter; use so=1 for date-sorted entity feeds.

Sort Order Across Drivers

Driver	`so` accepted	Default ordering
`q`	✅	Relevance (`so=0`); switch with `so=1`
`kgmid`	✅	Same as `q`
`topic_token`	❌	Google's editorial ordering (recency-weighted)
`section_token`	❌	Same
`story_token`	❌	Same
`publication_token`	❌	Same

Sending so with a non-search driver returns `400 so is only valid with q or kgmid (search mode)`.

Handling Stale Tokens

Tokens rotate occasionally. Symptoms:

502 unexpected response from the API.
A previously working token suddenly returns no results.

Recovery: fetch a fresh token from a recent response and retry. Don't store tokens long-term; treat them as ephemeral cursors. The two stable identifiers are q (keyword strings) and kgmid (entity IDs).

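# Sketch of a resilient navigator: always re-derive the token from a fresh search
fresh_response=$(curl -s "https://api.scrape.do/plugin/google/news?q=$KEYWORD&token=$TOKEN")
fresh_topic=$(echo "$fresh_response" | jq -r '.related_topics[0].topic_token // empty')

# related_topics[] is only populated when the query resolves to a known entity
if [ -z "$fresh_topic" ]; then
  echo "no topic_token found for '$KEYWORD'" >&2
  exit 1
fi

# Use the fresh token even if you had one cached
curl "https://api.scrape.do/plugin/google/news?topic_token=$fresh_topic&token=$TOKEN"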
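The same recovery can be expressed as a try-cached-then-refresh function. The sketch below is one possible shape, not a prescribed client: `fetch` is a hypothetical callable that takes query parameters and returns `(status, body)`, and `fake_fetch` stands in for real HTTP so the example runs offline.

```python
def topic_feed(fetch, keyword, cached_token=None):
    """Return a topic feed, refreshing the topic_token if the cached one is stale."""
    if cached_token:
        status, body = fetch({"topic_token": cached_token})
        # Both documented symptoms count as stale: a non-200 status or no results
        if status == 200 and body.get("news_results"):
            return body
    # Derive a fresh token from a keyword search (q is a stable identifier)
    status, search = fetch({"q": keyword})
    if status != 200:
        raise RuntimeError(f"search failed with status {status}")
    topics = search.get("related_topics") or []
    if not topics:
        raise LookupError(f"no related topic found for {keyword!r}")
    status, body = fetch({"topic_token": topics[0]["topic_token"]})
    if status != 200:
        raise RuntimeError(f"topic fetch failed with status {status}")
    return body


def fake_fetch(params):
    """Offline stand-in for the API: "stale" fails, "fresh" works."""
    if params.get("topic_token") == "stale":
        return 502, {"error": "unexpected response"}
    if params.get("topic_token") == "fresh":
        return 200, {"title": "Apple Inc.", "news_results": [{"position": 1}]}
    if "q" in params:
        return 200, {
            "news_results": [],
            "related_topics": [{"title": "Apple Inc.", "topic_token": "fresh"}],
        }
    return 502, {}


print(topic_feed(fake_fetch, "apple", cached_token="stale")["title"])
```

Injecting `fetch` keeps the retry logic independent of the HTTP library and makes it straightforward to test.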

Error Handling

{ "error": "error_code", "message": "Human readable error message" }

Common Error Codes

Status	Error	Description
`400`	`token is required`	Missing API token
`400`	`one of q, topic_token, section_token, story_token, publication_token, kgmid is required`	No driver parameter set
`400`	`exactly one of q, topic_token, section_token, story_token, publication_token, kgmid may be set`	More than one driver parameter set
`400`	`invalid google_domain`	Unrecognized Google domain
`400`	`so must be 0 (relevance) or 1 (date)`	Invalid `so` value
`400`	`so is only valid with q or kgmid (search mode)`	`so` passed with a non-search driver
`502`	`request failed`	Transient. Retry
`502`	`unexpected response`	Upstream returned an unexpected page (often a stale `*_token`). Fetch a fresh token from a recent response and retry
`500`	`decompression failed` / `failed to parse news results`	Transient. Retry
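The table above maps to three client reactions: fix the request, refresh a token, or retry. A minimal sketch of that mapping follows. `next_action` is a hypothetical helper, and because this page does not specify whether the error text arrives in the `error` or the `message` field, the sketch checks both.

```python
def next_action(status, body=None):
    """Map a response status and error body to a suggested client action."""
    body = body or {}
    # Assumption: the documented error text may appear in either field
    text = f"{body.get('error', '')} {body.get('message', '')}"
    if status == 200:
        return "ok"
    if status == 400:
        return "fix_request"  # invalid parameters; retrying will not help
    if status == 502 and "unexpected response" in text:
        return "refresh_token"  # often a stale *_token
    if status in (500, 502):
        return "retry"  # documented as transient
    return "unknown"


print(next_action(502, {"error": "unexpected response"}))
print(next_action(500, {"error": "decompression failed"}))
```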
