Google News API

Scrape Google News articles, topic streams, story clusters, and publisher pages as structured JSON

The Google News API is a specialized plugin that returns Google News results (articles, topic clusters, menu navigation, related topics) as clean JSON. One HTTP call per request. Deeper browsing happens by chaining tokens (topic_token, section_token, story_token, publication_token) returned inside any response.

Credit Usage: Each successful request costs 10 credits. For bulk processing, use the Async API with plugins.

Key Features

  • Six Driver Parameters: Search by keyword (q), browse a topic (topic_token), drill into a section (section_token), expand a story cluster (story_token), view a publication's page (publication_token), or pivot through a Knowledge Graph entity (kgmid).
  • Direct Publisher URLs: news_results[].link is the publisher URL directly. No Google redirect to follow, no AMP wrapper to unwrap.
  • Per-Article Metadata: Bylines (source.authors), publisher icons, relative date ("3 hours ago"), ISO 8601 timestamps, and thumbnails (full + small).
  • Story Clusters: Multi-outlet coverage of the same news story comes back as a cluster object with a stories[] array, same schema as flat articles.
  • Sort by Relevance or Date: so=0 / so=1 controls keyword and entity searches. Other drivers use Google's editorial ordering.
  • Localized Menus & Topics: menu_links[] (top nav: U.S., World, Business, Technology, …) and related_topics[] are returned localized for hl / gl.
  • No Pagination Quirks: A single response returns the first page (~100 articles). Deeper browsing is done by chaining tokens, with no manual offset arithmetic.
  • No Blocks or CAPTCHAs: All anti-bot measures are handled automatically by Scrape.do.

Endpoint

GET https://api.scrape.do/plugin/google/news

Request Parameters

Required

| Parameter | Type | Description |
| --- | --- | --- |
| token | string | Your Scrape.do API authentication token |

Plus exactly one driver:

| Driver | Type | Fetches |
| --- | --- | --- |
| q | string | Keyword search (e.g., q=openai) |
| topic_token | string | A topic stream (U.S., World, Business, Technology, …) |
| section_token | string | A section within a topic (Latest, For You, Opinion, …) |
| story_token | string | Full-coverage page for a single news story |
| publication_token | string | A publisher's page (CNN, BBC, Reuters, …) |
| kgmid | string | A Knowledge Graph entity ID (e.g., /m/02_286 for New York City) |

Sending no driver returns 400 with the message "one of q, topic_token, section_token, story_token, publication_token, kgmid is required". Sending more than one returns 400 with the message "exactly one of ... may be set".

Tokens are returned inside news_results[], menu_links[], sub_menu_links[], related_topics[], and related_publications[] on every response. Chain them to navigate.

Tokens rotate occasionally. When a token stops working (502 unexpected response), fetch a fresh one from a recent response and retry.
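The exactly-one-driver rule can also be checked client-side before a request is sent. The following is a minimal sketch, not part of any Scrape.do SDK: `build_news_url`, `ENDPOINT`, and `DRIVERS` are hypothetical names chosen for illustration, and only the endpoint URL and parameter names come from this page.

```python
from urllib.parse import urlencode

ENDPOINT = "https://api.scrape.do/plugin/google/news"
# The six driver parameters listed above; exactly one must be set.
DRIVERS = ("q", "topic_token", "section_token",
           "story_token", "publication_token", "kgmid")


def build_news_url(token: str, **params: str) -> str:
    """Return a request URL, mirroring the API's one-driver rule locally."""
    drivers = [name for name in DRIVERS if params.get(name)]
    if len(drivers) != 1:
        raise ValueError(
            f"exactly one of {', '.join(DRIVERS)} must be set, got {drivers}"
        )
    # urlencode percent-escapes values such as "/m/02_286".
    return f"{ENDPOINT}?{urlencode({'token': token, **params})}"


print(build_news_url("<SDO-token>", q="openai", hl="en", gl="us"))
```

Failing locally with a `ValueError` avoids a round trip that would only return a 400.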

Localization

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| hl | string | en | Language code (e.g., en, tr, de, fr, ja, pt-br) |
| gl | string | us | Country code (e.g., us, gb, de, tr, jp, br) |
| google_domain | string | google.com | Echoed back; Google News uses one global origin and the locale comes from hl / gl |

Sort (search mode only)

| Parameter | Values | Description |
| --- | --- | --- |
| so | 0 or 1 | 0 = by relevance (default), 1 = by date. Only valid with q or kgmid |

Example Usage

cURL

curl --location --request GET 'https://api.scrape.do/plugin/google/news?token=<SDO-token>&q=openai'

Python

import requests
import json

token = "<SDO-token>"

url = f"https://api.scrape.do/plugin/google/news?token={token}&q=openai"

response = requests.get(url, timeout=30)

print(json.dumps(response.json(), indent=2))

Node.js

const axios = require('axios');

const token = "<SDO-token>";

const url = `https://api.scrape.do/plugin/google/news?token=${token}&q=openai`;

axios.get(url)
  .then(response => {
    console.log(JSON.stringify(response.data, null, 2));
  })
  .catch(error => {
    console.error(error);
  });

Go

package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	token := "<SDO-token>"

	url := fmt.Sprintf(
		"https://api.scrape.do/plugin/google/news?token=%s&q=openai",
		token,
	)

	resp, err := http.Get(url)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+)
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}

Ruby

require 'net/http'
require 'json'

token = "<SDO-token>"

url = URI("https://api.scrape.do/plugin/google/news?token=#{token}&q=openai")

response = Net::HTTP.get(url)

puts JSON.pretty_generate(JSON.parse(response))

Java

import java.net.HttpURLConnection;
import java.net.URL;
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class GoogleNews {
    public static void main(String[] args) throws Exception {
        String token = "<SDO-token>";

        String url = String.format(
            "https://api.scrape.do/plugin/google/news?token=%s&q=openai",
            token
        );

        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");

        BufferedReader reader = new BufferedReader(
            new InputStreamReader(conn.getInputStream())
        );
        String line;
        StringBuilder response = new StringBuilder();
        while ((line = reader.readLine()) != null) {
            response.append(line);
        }
        reader.close();

        System.out.println(response.toString());
    }
}

C#

using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        string token = "<SDO-token>";

        string url = $"https://api.scrape.do/plugin/google/news?token={token}&q=openai";

        using HttpClient client = new HttpClient();
        string response = await client.GetStringAsync(url);

        Console.WriteLine(response);
    }
}

PHP

<?php
$token = "<SDO-token>";

$url = "https://api.scrape.do/plugin/google/news?token={$token}&q=openai";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$response = curl_exec($ch);
curl_close($ch);

echo json_encode(json_decode($response), JSON_PRETTY_PRINT);
?>

Localized Search

curl "https://api.scrape.do/plugin/google/news?q=openai&hl=en&gl=us&token=$TOKEN"
curl "https://api.scrape.do/plugin/google/news?q=bundesliga&hl=de&gl=de&token=$TOKEN"

Sort by Date

curl "https://api.scrape.do/plugin/google/news?q=openai&so=1&token=$TOKEN"

Topic Stream

curl "https://api.scrape.do/plugin/google/news?topic_token=CAAqJggKIiBDQkFTRWdvSUwyMHZNRGRqTVhZU0FtVnVHZ0pWVXlnQVAB&token=$TOKEN"

Story Cluster

curl "https://api.scrape.do/plugin/google/news?story_token=CAAqNggKIjBDQklTSGpvSmMzUnZjbmt0TXpZd1NoRUtEd2lLOWRfNUVCRllrZlFnbTlaN3F5Z0FQAQ&token=$TOKEN"

Knowledge Graph Entity

# /m/02_286 = New York City
curl "https://api.scrape.do/plugin/google/news?kgmid=/m/02_286&so=1&token=$TOKEN"

Response

Top-Level Shape

{
  "search_parameters": { ... },
  "title": "U.S.",
  "news_results": [ ... ],
  "menu_links": [ ... ],
  "sub_menu_links": [ ... ],
  "related_topics": [ ... ],
  "related_publications": [ ... ]
}

news_results is always an array (empty array when no results, never null). Other fields are present contextually:

  • title: populated on topic and publication responses (e.g. "Technology", "CNN").
  • menu_links: top navigation strip; same on every page (localized by hl / gl).
  • sub_menu_links: sections within the current topic / publication.
  • related_topics: populated on keyword searches when the query resolves to a known entity.
  • related_publications: populated on publication pages.

search_parameters

{
  "engine": "google_news",
  "q": "openai",
  "google_domain": "google.com",
  "hl": "en",
  "gl": "us"
}

Optional fields (topic_token, section_token, story_token, publication_token, kgmid, so) appear when set.

news_results[]

Each entry is either a flat article or a cluster (a single story covered by multiple outlets).

Flat article

{
  "position": 2,
  "title": "OpenAI Takes Aim at Google with New Image Model",
  "link": "https://www.theinformation.com/newsletters/ai-agenda/openai-takes-aim-google-new-image-model",
  "source": {
    "name": "The Information",
    "title": "The Information",
    "icon": "https://encrypted-tbn3.gstatic.com/faviconV2?...",
    "authors": ["Stephanie Palazzolo"]
  },
  "date": "20 hours ago",
  "iso_date": "2026-04-20T14:00:00Z",
  "thumbnail": "https://tii.imgix.net/production/articles/16959/46bed976.png",
  "thumbnail_small": "https://tii.imgix.net/production/articles/16959/46bed976.png",
  "topic_token": "CAAqKAgKIiJDQkFTRXdvTkwyY3ZNVEZuWW1oeGNqaHhlaElDWlc0b0FBUAE",
  "publication_token": "CAAqLggKIihDQklTR0FnTWFoUUtFblJvWldsdVptOXliV0YwYVc5dUxtTnZiU2dBUAE"
}

Cluster

The outer entry holds only the cluster headline; individual articles live in stories[].

{
  "position": 1,
  "title": "Federal Reserve signals rate cut ahead",
  "stories": [
    {
      "position": 1,
      "title": "Federal Reserve signals rate cut ahead",
      "link": "https://www.reuters.com/markets/us/federal-reserve-signals-rate-cut-2026-04-21/",
      "source": { "name": "Reuters", "icon": "..." },
      "date": "2 hours ago",
      "iso_date": "2026-04-21T12:10:50Z",
      "thumbnail": "..."
    },
    {
      "position": 2,
      "title": "Wall Street rallies on Fed signal",
      "link": "https://www.wsj.com/...",
      "source": { "name": "The Wall Street Journal" },
      "date": "3 hours ago",
      "iso_date": "2026-04-21T11:00:00Z"
    }
  ]
}

Field reference

| Field | Type | Description |
| --- | --- | --- |
| position | int | 1-based position |
| title | string | Article (or cluster) headline |
| link | string | Direct publisher URL with no redirect to follow |
| source.name | string | Publisher name |
| source.title | string | Publisher display title |
| source.icon | string | Publisher favicon URL |
| source.authors | string[] | Bylines when Google exposes them |
| date | string | Relative time ("3 hours ago", "2 days ago") |
| iso_date | string | RFC 3339 publication timestamp |
| thumbnail | string | Article image URL when available |
| thumbnail_small | string | Small-variant image URL |
| topic_token | string | Token for the parent topic. Pass back as topic_token |
| story_token | string | Token for the story cluster. Pass back as story_token |
| publication_token | string | Token for the publisher. Pass back as publication_token |
| section_token | string | Token for the section. Pass back as section_token |
| stories | array | Cluster entries only: related articles covering the same story |
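Because news_results[] mixes flat articles and clusters, consumers usually want a single stream of articles. The sketch below shows one way to expand clusters into their stories; `iter_articles` is a hypothetical helper name and the sample data is made up for illustration, following the field names documented above.

```python
from typing import Any, Dict, Iterator, List


def iter_articles(news_results: List[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every article, expanding cluster entries into their stories."""
    for entry in news_results:
        if "stories" in entry:
            # Cluster: the outer entry carries only the headline.
            for story in entry["stories"]:
                yield story
        else:
            yield entry


# Illustrative data shaped like the examples above (URLs are placeholders).
sample = [
    {"position": 1, "title": "Fed signals rate cut", "stories": [
        {"position": 1, "title": "Fed signals rate cut", "link": "https://example.com/a"},
        {"position": 2, "title": "Wall Street rallies", "link": "https://example.com/b"},
    ]},
    {"position": 2, "title": "OpenAI news", "link": "https://example.com/c"},
]
links = [article["link"] for article in iter_articles(sample)]
print(links)
```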

menu_links[]

The top navigation strip; same on every page (localized by hl / gl).

[
  { "position": 1, "title": "U.S.",      "topic_token": "CAAq..." },
  { "position": 2, "title": "World",     "topic_token": "CAAq..." },
  { "position": 3, "title": "Business",  "topic_token": "CAAq..." },
  { "position": 4, "title": "Technology","topic_token": "CAAq..." }
]

sub_menu_links[]

Present on topic and publication pages: sections within the current scope (Latest, For You, Opinion, …), each with a section_token.

related_topics[]

Populated on keyword searches when the query resolves to a known entity (person, place, organization). Each item carries a topic_token you can use to pivot to that entity's topic stream.

related_publications[]

Populated on publication pages: adjacent publishers Google suggests.

Notes

  • No redirects. link is the publisher URL directly.
  • Tokens are opaque. Pass them back exactly as returned. Don't parse, construct, or cache them across days.
  • Tokens rotate. When a token stops working (502 unexpected response), fetch a fresh one from a recent search response and retry.
  • No pagination. A single response is the first page (~100 articles for keyword searches). Deeper browsing happens by chaining tokens.

Google News doesn't paginate. Instead, it exposes a graph of topics, sections, stories, and publications. You navigate by chaining the opaque *_token values returned in any response.

Token Types

| Token | Where you find it | What it fetches |
| --- | --- | --- |
| topic_token | news_results[].topic_token, menu_links[].topic_token, related_topics[].topic_token | A topic stream (U.S., Business, Technology, Sports, …) |
| section_token | sub_menu_links[].section_token | A section within a topic (Latest, For You, Opinion, …) |
| story_token | news_results[].story_token (on cluster entries) | Full coverage for a single news story across multiple outlets |
| publication_token | news_results[].publication_token, related_publications[].publication_token | A publisher's page (CNN, BBC, Reuters, …) |

You also have:

  • q: keyword search (not a token; a string).
  • kgmid: a Knowledge Graph entity ID (e.g., /m/02_286 for New York City). Stable across responses.

Tokens are opaque. Pass them back exactly as returned. Don't parse, construct, or cache them across days. Tokens rotate occasionally; when one stops working (502 unexpected response), fetch a fresh one from a recent response.
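To see which pivots a given response offers, the tokens can be gathered by type. This is a rough sketch assuming the response shape documented on this page; `collect_tokens` is a hypothetical helper, and it also looks inside stories[] on the assumption that nested stories may carry tokens like flat articles do.

```python
from typing import Any, Dict, Set

TOKEN_KEYS = ("topic_token", "section_token", "story_token", "publication_token")
# Response arrays that this page documents as carrying tokens.
TOKEN_ARRAYS = ("news_results", "menu_links", "sub_menu_links",
                "related_topics", "related_publications")


def collect_tokens(response: Dict[str, Any]) -> Dict[str, Set[str]]:
    """Group every token found in a response by its type."""
    found: Dict[str, Set[str]] = {key: set() for key in TOKEN_KEYS}
    for array_name in TOKEN_ARRAYS:
        for item in response.get(array_name) or []:
            # Check the entry itself plus any stories nested in a cluster.
            for entry in [item, *(item.get("stories") or [])]:
                for key in TOKEN_KEYS:
                    if entry.get(key):
                        found[key].add(entry[key])
    return found
```

The tokens are stored as opaque strings and never inspected, in line with the guidance above.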

Workflow Patterns

Search → Topic Pivot

A keyword search response includes related_topics[] when the query resolves to an entity. Pivot into the topic to get Google's editorial feed for that entity rather than relevance-ranked search results.

# Step 1: search
curl -s "https://api.scrape.do/plugin/google/news?q=apple&token=$TOKEN" \
  | jq '.related_topics'

# Output:
# [ { "title": "Apple Inc.", "topic_token": "CAAq..." } ]

# Step 2: pivot into the topic
TOPIC=$(curl -s "https://api.scrape.do/plugin/google/news?q=apple&token=$TOKEN" \
  | jq -r '.related_topics[0].topic_token')

curl "https://api.scrape.do/plugin/google/news?topic_token=$TOPIC&token=$TOKEN"
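Since related_topics[] is only populated when the query resolves to an entity, a script should handle its absence (with jq -r, a missing first element prints the string "null"). A small Python sketch of the same pivot, with `topic_pivot_params` as a hypothetical helper name:

```python
from typing import Any, Dict, Optional


def topic_pivot_params(search_response: Dict[str, Any]) -> Optional[Dict[str, str]]:
    """Return driver params for the first related topic, or None if absent."""
    for topic in search_response.get("related_topics") or []:
        token = topic.get("topic_token")
        if token:
            return {"topic_token": token}
    # The query did not resolve to a known entity: no pivot is available.
    return None


# Illustrative response fragment mirroring the jq output above.
response = {"related_topics": [{"title": "Apple Inc.", "topic_token": "CAAq..."}]}
print(topic_pivot_params(response))
```

The returned dictionary can be merged with token, hl, and gl to form the follow-up request.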

Topic → Section

A topic response includes sub_menu_links[] with sections like Latest, For You, Opinion, etc. Drill into one for that section's feed.

SECTION=$(curl -s "https://api.scrape.do/plugin/google/news?topic_token=$TOPIC&token=$TOKEN" \
  | jq -r '.sub_menu_links[] | select(.title=="Latest") | .section_token')

curl "https://api.scrape.do/plugin/google/news?section_token=$SECTION&token=$TOKEN"

Article → Story Cluster

Search and topic results sometimes return cluster entries: entries with a stories[] array and a story_token. Pass story_token as the driver to get the full cluster page (often more outlets than the inline stories[] snapshot).

STORY=$(curl -s "https://api.scrape.do/plugin/google/news?q=fed+rate+cut&token=$TOKEN" \
  | jq -r '.news_results[] | select(.story_token) | .story_token' \
  | head -n 1)

curl "https://api.scrape.do/plugin/google/news?story_token=$STORY&token=$TOKEN"
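Before spending a request on a full-coverage page, it can help to see which clusters a response contains and which outlets the inline snapshot already covers. The following sketch assumes the cluster shape shown in the Response section; `summarize_clusters` is a hypothetical name.

```python
from typing import Any, Dict, List


def summarize_clusters(news_results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """List cluster entries with their story_token and inline outlet names."""
    clusters = []
    for entry in news_results:
        if "stories" not in entry:
            continue  # flat article, nothing to expand
        names = [story.get("source", {}).get("name") for story in entry["stories"]]
        clusters.append({
            "title": entry.get("title"),
            "story_token": entry.get("story_token"),  # None if not provided
            "outlets": [name for name in names if name],
        })
    return clusters
```

Each summary's story_token, when present, is the value to pass as the story_token driver.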

Article → Publication

Each article carries a publication_token. Pivot to the publisher's page to see their recent coverage.

PUB=$(curl -s "https://api.scrape.do/plugin/google/news?q=openai&token=$TOKEN" \
  | jq -r '.news_results[0].publication_token')

curl "https://api.scrape.do/plugin/google/news?publication_token=$PUB&token=$TOKEN"

Top-Level Menu

Every response includes menu_links[], Google News's top navigation strip (U.S., World, Business, Technology, Sports, Entertainment, Science, Health). Each entry exposes a topic_token. Use it to navigate without first searching.

TECH=$(curl -s "https://api.scrape.do/plugin/google/news?q=anything&token=$TOKEN" \
  | jq -r '.menu_links[] | select(.title=="Technology") | .topic_token')

curl "https://api.scrape.do/plugin/google/news?topic_token=$TECH&token=$TOKEN"
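The same menu lookup can be done once and reused for several topics within a session. A minimal sketch, with `menu_topic_tokens` as a hypothetical helper; note that the titles are localized by hl / gl, so a lookup by "Technology" only works for English responses.

```python
from typing import Any, Dict


def menu_topic_tokens(response: Dict[str, Any]) -> Dict[str, str]:
    """Map each menu title (e.g. "Technology") to its topic_token."""
    return {
        link["title"]: link["topic_token"]
        for link in response.get("menu_links") or []
        if link.get("title") and link.get("topic_token")
    }


# Illustrative fragment; real tokens are longer opaque strings.
response = {"menu_links": [
    {"position": 1, "title": "U.S.", "topic_token": "CAAq-us"},
    {"position": 4, "title": "Technology", "topic_token": "CAAq-tech"},
]}
menu = menu_topic_tokens(response)
print(menu.get("Technology"))
```

Because tokens rotate, a map like this is best rebuilt from a fresh response rather than persisted.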

Knowledge Graph Pivot

kgmid is a stable identifier for a Knowledge Graph entity (a person, place, organization). Unlike the *_token values, it doesn't rotate, so you can store it.

# /m/02_286 = New York City, stable across responses
curl "https://api.scrape.do/plugin/google/news?kgmid=/m/02_286&so=1&token=$TOKEN"

# /m/0k8z = Apple Inc.
curl "https://api.scrape.do/plugin/google/news?kgmid=/m/0k8z&token=$TOKEN"

kgmid accepts the so (sort) parameter; use so=1 for date-sorted entity feeds.

Sort Order Across Drivers

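| Driver | so accepted | Default ordering |
| --- | --- | --- |
| q | Yes | Relevance (so=0); switch with so=1 |
| kgmid | Yes | Same as q |
| topic_token | No | Google's editorial ordering (recency-weighted) |
| section_token | No | Same as topic_token |
| story_token | No | Same as topic_token |
| publication_token | No | Same as topic_token |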

Sending so with a non-search driver returns 400 so is only valid with q or kgmid (search mode).
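These two so rules are easy to mirror in client code so that an invalid combination never reaches the API. The sketch below is illustrative only: `check_sort` is a hypothetical function, and the order in which the server itself applies the two checks is not documented here.

```python
from typing import Optional, Union

SEARCH_DRIVERS = {"q", "kgmid"}  # the only drivers that accept so


def check_sort(driver: str, so: Optional[Union[int, str]]) -> None:
    """Raise ValueError for so values the API would reject with a 400."""
    if so is None:
        return  # so omitted: always fine
    if str(so) not in ("0", "1"):
        raise ValueError("so must be 0 (relevance) or 1 (date)")
    if driver not in SEARCH_DRIVERS:
        raise ValueError("so is only valid with q or kgmid (search mode)")


check_sort("kgmid", 1)            # accepted: entity search sorted by date
check_sort("topic_token", None)   # accepted: no so on a non-search driver
```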

Handling Stale Tokens

Tokens rotate occasionally. Symptoms:

  • 502 unexpected response from the API.
  • A previously-working token suddenly returns no results.

Recovery: fetch a fresh token from a recent response and retry. Don't store tokens long-term; treat them as ephemeral cursors. The two stable identifiers are q (keyword strings) and kgmid (entity IDs).

# Re-derive the topic token from a fresh search instead of trusting a cached one
fresh_response=$(curl -s "https://api.scrape.do/plugin/google/news?q=$KEYWORD&token=$TOKEN")
fresh_topic=$(echo "$fresh_response" | jq -r '.related_topics[0].topic_token // empty')

# Stop if the search returned no related topic; otherwise use the fresh token
[ -n "$fresh_topic" ] || { echo "no related topic for $KEYWORD" >&2; exit 1; }
curl "https://api.scrape.do/plugin/google/news?topic_token=$fresh_topic&token=$TOKEN"
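A variant of this recovery that tries a cached token first and only falls back to a fresh search on a 502 might look like the sketch below. It is an assumption-laden illustration: `fetch_topic_with_refresh` is a hypothetical name, and the `fetch` callable (params in, status and parsed JSON out) must be supplied by the caller, for example as a thin wrapper around an HTTP client.

```python
from typing import Any, Callable, Dict, Tuple

# fetch(params) -> (HTTP status, parsed JSON body); supplied by the caller.
Fetch = Callable[[Dict[str, str]], Tuple[int, Dict[str, Any]]]


def fetch_topic_with_refresh(fetch: Fetch, keyword: str,
                             cached_token: str) -> Dict[str, Any]:
    """Try a cached topic_token; on 502, re-derive it from a fresh search."""
    status, body = fetch({"topic_token": cached_token})
    if status != 502:
        return body  # success, or an error the caller should inspect
    # Stale token suspected: q is stable, so search again for a fresh token.
    _, search = fetch({"q": keyword})
    topics = search.get("related_topics") or []
    if not topics:
        raise LookupError(f"no related topic returned for {keyword!r}")
    _, body = fetch({"topic_token": topics[0]["topic_token"]})
    return body
```

Injecting `fetch` keeps the retry logic independent of any particular HTTP library and makes it easy to exercise with canned responses.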

Error Handling

{ "error": "error_code", "message": "Human readable error message" }

Common Error Codes

| Status | Error | Description |
| --- | --- | --- |
| 400 | token is required | Missing API token |
| 400 | one of q, topic_token, section_token, story_token, publication_token, kgmid is required | No driver parameter set |
| 400 | exactly one of q, topic_token, section_token, story_token, publication_token, kgmid may be set | More than one driver parameter set |
| 400 | invalid google_domain | Unrecognized Google domain |
| 400 | so must be 0 (relevance) or 1 (date) | Invalid so value |
| 400 | so is only valid with q or kgmid (search mode) | so passed with a non-search driver |
| 502 | request failed | Transient. Retry |
| 502 | unexpected response | Upstream returned an unexpected page (often a stale *_token). Fetch a fresh token from a recent response and retry |
| 500 | decompression failed / failed to parse news results | Transient. Retry |
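The table suggests three distinct reactions: fix the request, refresh a token, or simply retry. The sketch below encodes that mapping. `classify_error` is a hypothetical helper, and because this page does not say whether the text in the Error column arrives in the error field or the message field, the code checks both as an assumption.

```python
from typing import Any, Dict


def classify_error(status: int, body: Dict[str, Any]) -> str:
    """Map a failed response to 'fix_request', 'refresh_token', 'retry' or 'unknown'."""
    # Assumption: the table's error text may appear in either field.
    text = f"{body.get('error', '')} {body.get('message', '')}"
    if status == 400:
        return "fix_request"      # bad parameters: retrying will not help
    if status == 502 and "unexpected response" in text:
        return "refresh_token"    # often a stale *_token
    if status in (500, 502):
        return "retry"            # documented as transient
    return "unknown"


print(classify_error(502, {"error": "unexpected response"}))
```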
