
28 May 2026 · 3 min read

Google News blocks Cloudflare Workers. Bing News does not.

An hour-long debugging session that ended at a `503` and three half-truths about who actually serves the web's news index for bots.

The setup

We wanted news mentions of each tracked competitor in the weekly brief. Layoffs, funding rounds, partnerships, exec moves: the things that never show up in a site diff because companies don't publish bad news on their own homepage.

The natural answer is to query a news aggregator. Google News has an RSS endpoint that accepts any search query and returns a fresh list of articles with publisher, headline, URL, and date. No API key. Free forever, theoretically.
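For illustration, here is a minimal sketch of building such a query URL. The endpoint path and the `hl`, `gl`, and `ceid` locale parameters are assumptions based on the commonly seen public feed URL, not something taken from our codebase, and `googleNewsRssUrl` is a hypothetical helper name.

```typescript
// Assumed endpoint; Google does not document this as a stable API.
const GOOGLE_NEWS_RSS = "https://news.google.com/rss/search";

// Hypothetical locale defaults for a US English feed.
interface FeedLocale {
  hl: string;   // interface language
  gl: string;   // country
  ceid: string; // country:language edition id
}

const DEFAULT_LOCALE: FeedLocale = { hl: "en-US", gl: "US", ceid: "US:en" };

// Build the RSS search URL for a free-text query.
// URLSearchParams handles the percent-encoding of quotes, spaces, and colons.
function googleNewsRssUrl(query: string, locale: FeedLocale = DEFAULT_LOCALE): string {
  const url = new URL(GOOGLE_NEWS_RSS);
  url.searchParams.set("q", query);
  url.searchParams.set("hl", locale.hl);
  url.searchParams.set("gl", locale.gl);
  url.searchParams.set("ceid", locale.ceid);
  return url.toString();
}
```

Calling `googleNewsRssUrl('"Webflow" layoffs')` yields a URL that can be passed straight to `fetch`; quoting the company name inside the query is one way to narrow results, though how the service interprets quotes is not something we verified.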

The crash

In production, the news source returned zero signals. Every time. Logs were silent — the function returned 0 cleanly. We assumed the parser was broken.

Turns out the parser is fine. Google News returns `200 OK` with `content-length: 0` to user agents it doesn't like. The response is technically successful. The body is empty.

We tried browser-like user agents. We tried omitting the user agent. We tried with and without Accept headers. Same result: 200 OK, zero bytes.
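A check along these lines would have surfaced the problem in the logs instead of letting the function return 0 cleanly. This is a sketch, not our production code: `checkFeedResponse` and the reason strings are hypothetical names, and the `<rss` / `<feed` test is a deliberately crude assumption about what a feed body looks like.

```typescript
// Result of inspecting a feed response: either a usable body or a named failure.
type FeedCheck =
  | { ok: true; body: string }
  | { ok: false; reason: string };

// Treat anything that is not a 2xx response with a feed-looking body as a failure.
function checkFeedResponse(status: number, body: string): FeedCheck {
  if (status < 200 || status >= 300) {
    return { ok: false, reason: `http_${status}` };
  }
  // The case from this post: a "successful" response with nothing in it.
  if (body.trim().length === 0) {
    return { ok: false, reason: "empty_body" };
  }
  // Crude heuristic: an RSS or Atom document should contain one of these tags.
  if (!body.includes("<rss") && !body.includes("<feed")) {
    return { ok: false, reason: "not_a_feed" };
  }
  return { ok: true, body };
}
```

Logging the `reason` on every failure distinguishes "no news this week" from "the source returned nothing at all".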

Eventually we deployed to production and tailed the worker logs. There the failure mode was different: HTTP `503`. Google News explicitly blocks requests originating from Cloudflare Workers' egress IPs, which is why it works from a developer's laptop and fails from the production environment.

Bing News works

Same query, same browser user agent, same headers — Bing News RSS returns clean XML with 12 items and 9.7KB of body. From Cloudflare Workers. Every time.
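Numbers like these are cheap to log on every fetch. The sketch below counts `<item>` elements with a regular expression and measures the UTF-8 size of the body; `summarizeFeed` is a hypothetical helper, and the regex approach is an assumption chosen for brevity, not a real XML parser.

```typescript
// Count <item> elements and measure the body size in bytes.
// The regex requires whitespace or ">" after "<item" so that a tag
// such as <items> is not counted by mistake.
function summarizeFeed(xml: string): { items: number; bytes: number } {
  const items = (xml.match(/<item[\s>]/g) ?? []).length;
  const bytes = new TextEncoder().encode(xml).length;
  return { items, bytes };
}
```

A log line built from this summary makes a sudden drop to zero items or zero bytes visible without reading the response itself.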

We don't have a great theory for why Microsoft doesn't have the same datacentre-IP block list, but for this use case, we don't need one. We switched to Bing News as the primary source and kept Google News as a fallback for non-Workers infrastructure where it still works.
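The primary-plus-fallback arrangement can be sketched roughly as follows. The two feed URLs are assumptions based on the publicly visible RSS endpoints, and `NewsSource`, `Fetcher`, and `fetchNews` are hypothetical names; the fetcher is injected so the logic can be exercised without network access.

```typescript
// Minimal shape of what we need from a fetch-like function.
type Fetcher = (url: string) => Promise<{ status: number; text(): Promise<string> }>;

interface NewsSource {
  name: string;
  buildUrl(query: string): string;
}

// Ordered by preference: the first source that returns a usable body wins.
// Both URL shapes are assumptions, not documented APIs.
const SOURCES: NewsSource[] = [
  {
    name: "bing",
    buildUrl: (q) => `https://www.bing.com/news/search?q=${encodeURIComponent(q)}&format=rss`,
  },
  {
    name: "google",
    buildUrl: (q) => `https://news.google.com/rss/search?q=${encodeURIComponent(q)}`,
  },
];

async function fetchNews(
  query: string,
  fetcher: Fetcher,
): Promise<{ source: string; xml: string } | null> {
  for (const source of SOURCES) {
    try {
      const res = await fetcher(source.buildUrl(query));
      const xml = await res.text();
      // A 2xx status with an empty body counts as a failure, not a success.
      if (res.status >= 200 && res.status < 300 && xml.trim().length > 0) {
        return { source: source.name, xml };
      }
    } catch {
      // Network-level error: fall through to the next source.
    }
  }
  return null; // Every source failed; the caller should log this loudly.
}
```

In a Worker this could be called as `fetchNews("Webflow", (url) => fetch(url))`, since the standard `Response` already has `status` and `text()`.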

The general lesson

When you're building agent-composed content from public-web sources, "free RSS" really is free, until it isn't. The cost is finding out which providers are still tolerant of datacentre traffic and which have quietly moved bot-blocking into their CDN layer.

Three takeaways for anyone building similar pipelines:

  1. Tail the production worker, not the local one. The egress IP is the thing that matters. Local laptops are useless as a test fixture for any service that takes IP reputation into account.
  2. Treat "200 OK + empty body" as a failure signal. A surprising number of API gateways and CDN-layer bot filters return shaped success responses rather than honest 4xx/5xx errors. The shape is meant to make you stop debugging.
  3. Have a fallback for every external source. Anything you depend on for free will eventually start blocking you. The cost of plumbing in a second source is much lower than the cost of explaining to your users why a feature went away.

The Webflow layoffs were not on Webflow's website. They were on Bing.