Cloudflare Browser Run: The Browser Your AI Agent Actually Needs

I was reading through Cloudflare's latest announcements when I stumbled on something that made me stop and re-read the whole post twice. They renamed Browser Rendering to Browser Run - and tucked inside that rename is a collection of features that genuinely changes what's possible when building AI agents that need to browse the web.

I'm relatively new to this space, so I came in with fresh eyes. And honestly? That made the announcement hit harder. A lot of the problems Browser Run solves are things I'd been bumping into and just assumed were... the cost of doing business. Turns out, they don't have to be.

Here's what Browser Run actually gives you, and why I think it's a big deal.


First, Why Is Browser Automation So Hard for Agents?

When an AI agent navigates a website, the loop looks deceptively simple: take a screenshot, send it to a model, get back an instruction (click here, type this), execute it, repeat. Easy enough on paper.
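To make that loop concrete, here is a minimal sketch of it. The `browser` and `model` objects are hypothetical interfaces I made up for illustration (they are not Puppeteer or any Browser Run API); the point is only the shape of the loop and the step cap that stops it from running forever.

```javascript
// Minimal sketch of the screenshot -> model -> action loop.
// `browser` and `model` are hypothetical interfaces, not a real library API.
async function runAgentLoop(browser, model, goal, maxSteps = 20) {
  const history = [];
  for (let step = 0; step < maxSteps; step++) {
    // 1. Capture what the page looks like right now.
    const screenshot = await browser.screenshot();
    // 2. Ask the model what to do next, given the goal and what was done so far.
    const action = await model.nextAction({ goal, screenshot, history });
    if (action.type === "done") {
      return { status: "done", steps: history.length, history };
    }
    // 3. Execute the instruction (click, type, ...) and remember it.
    await browser.execute(action);
    history.push(action);
  }
  // The step cap was reached without the model declaring the goal done.
  return { status: "gave_up", steps: history.length, history };
}
```

Without the `maxSteps` cap, an agent that is confused by a modal or CAPTCHA would simply keep looping, which is exactly the "find it stuck much later" failure mode.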

But the web is chaotic. A login modal pops up unexpectedly. A CAPTCHA appears. The page layout silently changed overnight. The agent doesn't know how to handle any of that - it either fails quietly or spirals in confusion. And you won't know it happened until much later, when you come back and find it stuck.

Then there's the infrastructure side. Want to run 30 browser sessions in parallel? You need 30 browsers. Want 60? Now you're managing infrastructure instead of building things. That's not where anyone wants to be spending their energy.

Browser Run on Cloudflare's global network takes direct aim at both problems.


Live View: Actually Seeing What Your Agent Sees

This one immediately made me think: wait, that's it? That's all I needed?

Live View lets you watch your agent's browser session in real time - the actual rendered page, the DOM, console output, network requests. All of it, live. You just open a URL and you're looking through your agent's eyes.

Without this, the debugging workflow is painful: kick off the agent, wait for it to finish or fail, dig through logs, try to piece together what went wrong from incomplete breadcrumbs. With Live View, you open the session and watch. If the agent is stuck on a login screen, you see it immediately. If a button isn't where the agent expects it, you know in seconds.

Two ways to get to it: grab the session_id from your code and open the devtoolsFrontendURL in Chrome, or head to the new Live Sessions tab in the Cloudflare dashboard and click in.

Pro-tip: Live View isn't just for debugging failures - it's great for understanding how your agent is behaving on pages you haven't fully tested yet. Watch a few sessions manually before you let it run unsupervised.

It's a small quality-of-life feature on paper. In practice, it fundamentally changes how you build and iterate.


Human in the Loop: The Escape Hatch Agents Needed

Here's a situation that's basically a rite of passage when building any web automation: the agent is doing great, sailing through a multi-step flow, and then - sign-in required. Or a CAPTCHA. The agent can't handle it. The whole session dies. You start over.

Human in the Loop is the escape hatch. When the agent hits a wall, a human can step in via Live View, handle the blocker (log in, solve the CAPTCHA, enter the verification code), and hand control back to the agent. The session picks up exactly where it left off. Nothing is thrown away.

What's coming soon makes this even more interesting: the agent will be able to actively signal that it needs help, triggering a notification to a human rather than requiring someone to be watching. That turns Human in the Loop from a debugging convenience into a real production pattern - one where your automation runs unattended but gracefully escalates the bits it can't handle.

Pro-tip: Think of Human in the Loop less as a fallback and more as a design primitive. You can intentionally build flows that expect human checkpoints - like reviewing a form before the agent submits it.
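Here is a rough sketch of what "human checkpoint as a design primitive" could look like in code. Everything in it is hypothetical: `NeedsHumanError`, `runWithHandoff`, and the `requestHuman` callback are names I invented, and in a real setup `requestHuman` would be whatever notifies a person and waits for them to finish in Live View.

```javascript
// Hypothetical error a step throws when it hits a blocker it cannot handle.
class NeedsHumanError extends Error {
  constructor(reason) {
    super(`human needed: ${reason}`);
    this.name = "NeedsHumanError";
    this.reason = reason;
  }
}

// Runs steps in order. When a step throws NeedsHumanError, wait for
// `requestHuman` to resolve, then retry that same step rather than restarting.
async function runWithHandoff(steps, requestHuman, maxHandoffsPerStep = 1) {
  const log = [];
  for (const step of steps) {
    let handoffs = 0;
    for (;;) {
      try {
        await step.run();
        log.push(`${step.name}: ok`);
        break;
      } catch (err) {
        // Anything other than a handoff request, or too many handoffs, is fatal.
        if (!(err instanceof NeedsHumanError) || handoffs >= maxHandoffsPerStep) {
          throw err;
        }
        handoffs++;
        log.push(`${step.name}: handed off (${err.reason})`);
        await requestHuman(step.name, err.reason);
      }
    }
  }
  return log;
}
```

The design choice that matters is retrying the blocked step instead of the whole flow, which mirrors the "session picks up exactly where it left off" behaviour described above.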


CDP Endpoint: Direct Protocol Access

This one is a bit more technical, but it matters. The browser automation libraries you're likely to reach for - Puppeteer, Playwright, Stagehand - all drive Chrome through the Chrome DevTools Protocol (CDP). CDP is the actual wire-level protocol that controls the browser. Previously, Browser Run only exposed CDP through those higher-level abstractions.

Now they expose it directly.

That means any tool that speaks CDP natively can connect to Browser Run without any changes. Agent frameworks, custom tooling, anything. It also unlocks lower-level capabilities that libraries like Puppeteer don't surface, like JavaScript debugging. And for passing browser state to a model, raw CDP is more token-efficient than going through a library's abstraction layer.
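If you haven't looked at CDP on the wire before: commands are JSON messages with an `id`, a `method` such as `Runtime.evaluate`, and `params`, and the reply carries the same `id`. The sketch below is a tiny message layer built on that shape. The `createCdpClient` name and the injected `send` function are my own illustration (so it runs without a browser), and it leaves out events and session targeting.

```javascript
// Tiny CDP message layer. `send` is whatever writes a string to the socket;
// it is injected so this sketch runs without a real browser connection.
function createCdpClient(send) {
  let nextId = 1;
  const pending = new Map(); // command id -> { resolve, reject }

  return {
    // Send a command and resolve with the `result` of the matching reply.
    command(method, params = {}) {
      const id = nextId++;
      return new Promise((resolve, reject) => {
        pending.set(id, { resolve, reject });
        send(JSON.stringify({ id, method, params }));
      });
    },
    // Feed every incoming socket message through this function.
    handleMessage(raw) {
      const msg = JSON.parse(raw);
      if (msg.id === undefined) return; // an event, not a reply; ignored here
      const entry = pending.get(msg.id);
      if (!entry) return;
      pending.delete(msg.id);
      if (msg.error) entry.reject(new Error(msg.error.message));
      else entry.resolve(msg.result);
    },
  };
}
```

With a real connection you would pass the WebSocket's send method as `send` and call `handleMessage` from its message handler.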

Migrating from self-hosted Chrome is close to a one-liner:

// Before: self-hosted Chrome
const browser = await puppeteer.connect({
  browserWSEndpoint: "ws://localhost:9222/devtools/browser",
});

// After: Browser Run
const browser = await puppeteer.connect({
  browserWSEndpoint:
    "wss://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/browser-rendering/devtools/browser",
  headers: { Authorization: "Bearer <API_TOKEN>" },
});

Change the WebSocket URL and add an Authorization header. That's it. You stop managing a browser process and get Cloudflare's global infrastructure instead.

Pro-tip: If you're evaluating agent frameworks and trying to decide which ones can plug into Browser Run - if it speaks CDP, it can connect. That's a broad compatibility guarantee.


MCP Client Support: Your Coding Agent Gets a Browser

Because Browser Run now exposes CDP directly, MCP clients can use it as their remote browser. Claude Desktop, Cursor, Codex, and OpenCode can all connect through the chrome-devtools-mcp package from the Chrome DevTools team.

The config for Claude Desktop:

{
  "mcpServers": {
    "browser-rendering": {
      "command": "npx",
      "args": [
        "-y",
        "chrome-devtools-mcp@latest",
        "--wsEndpoint=wss://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/browser-rendering/devtools/browser?keep_alive=600000",
        "--wsHeaders={\"Authorization\":\"Bearer <API_TOKEN>\"}"
      ]
    }
  }
}

What this means: you can ask your coding agent to browse a page, inspect its structure, take a screenshot, fill a form - all through a managed browser running on Cloudflare's network. Not a local Chrome instance that crashes when your laptop sleeps. A real, persistent, globally-distributed browser.

Pro-tip: The keep_alive parameter in the WebSocket URL keeps the session warm between requests. Useful when you're using your coding agent interactively and don't want a cold start between prompts.
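Since the same endpoint shows up in both the Puppeteer and MCP examples, I'd probably wrap it in a small helper. This is a sketch, not part of any SDK: `buildBrowserRunEndpoint` and its options object are names I made up, and the URL shape is copied from the examples above. I'm also assuming the `keep_alive` value is in milliseconds (600000 would then be ten minutes); check the docs before relying on that.

```javascript
// Builds the Browser Run CDP endpoint using the URL shape from the examples above.
// The helper name and options object are illustrative, not part of any SDK.
function buildBrowserRunEndpoint(accountId, { keepAlive } = {}) {
  if (!accountId) throw new Error("accountId is required");
  const url = new URL(
    `wss://api.cloudflare.com/client/v4/accounts/${encodeURIComponent(accountId)}/browser-rendering/devtools/browser`
  );
  // keep_alive is optional; the unit is assumed to be milliseconds.
  if (keepAlive !== undefined) {
    url.searchParams.set("keep_alive", String(keepAlive));
  }
  return url.toString();
}
```

You could then pass the result as `browserWSEndpoint` in Puppeteer or as the `--wsEndpoint` value in the MCP config, keeping the account ID and token in environment variables rather than in the file.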


WebMCP: The Web Starting to Speak Agent

This one is more speculative, but it genuinely excited me. WebMCP is a new browser API from Google Chrome (available in Chromium 146+) that lets websites declare tools that AI agents can discover and call directly.

Today, an agent visiting a travel booking site has to navigate the UI like a human would - find the search box, click it, type the destination. With WebMCP, the site can declare: "here's a search_flights tool that takes an origin, destination, and date." The agent calls it directly. No screenshot loop. No fragile UI navigation.

This changes the reliability equation completely. If the agent isn't navigating by UI, UI changes can't break it. And tools are discovered per-page rather than pre-registered, which means it works across the whole web - not just sites you've specifically built integrations for.
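To get a feel for the "declare, discover, call" idea, here is a toy model of it. To be clear, this is not the WebMCP browser API: `createToolRegistry`, the `required` field, and the descriptor shape are all my own stand-ins, and the real API surface may look quite different. It only illustrates why calling a declared tool is sturdier than clicking through a UI.

```javascript
// Illustrative only: a stand-in registry that models "declare, discover, call".
// This is NOT the WebMCP browser API.
function createToolRegistry() {
  const tools = new Map();
  return {
    // The page declares a tool.
    declare(tool) {
      tools.set(tool.name, tool);
    },
    // The agent lists what the current page offers.
    discover() {
      return [...tools.values()].map((tool) => ({
        name: tool.name,
        description: tool.description,
        required: tool.required,
      }));
    },
    // The agent calls a tool by name instead of driving the UI.
    async call(name, args) {
      const tool = tools.get(name);
      if (!tool) throw new Error(`unknown tool: ${name}`);
      const missing = tool.required.filter((key) => !(key in args));
      if (missing.length > 0) {
        throw new Error(`missing arguments: ${missing.join(", ")}`);
      }
      return tool.execute(args);
    },
  };
}

// What a travel site might declare on its search page (hypothetical).
const searchFlights = {
  name: "search_flights",
  description: "Search flights by origin, destination, and date",
  required: ["origin", "destination", "date"],
  execute: async ({ origin, destination, date }) => ({
    route: `${origin}-${destination}`,
    date,
    results: [], // a real site would return matching flights here
  }),
};
```

Notice that nothing in the call path depends on where the search box sits on the page, which is the whole reliability argument.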

Browser Run already has an experimental pool running Chrome beta instances, so you can start testing WebMCP workflows today.


Session Recordings: Debugging What You Weren't Watching

Live View covers you when you're present. Session Recordings covers everything else.

Enable it with recording: true when you launch a session. After it closes, the full recording - DOM changes, mouse and keyboard events, page navigation - is available in the dashboard or via API. You can replay any session with rrweb-player and scrub through the timeline.
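Before opening a replay, it can help to get a quick summary of what a recording contains. The sketch below assumes the recording comes back as an array of rrweb events (each with a numeric `type` and a `timestamp`), which seems likely given that it replays in rrweb-player, but I haven't confirmed the exact response format. `summarizeRecording` is my own helper name, and the numeric type mapping reflects rrweb's event type enum as I understand it.

```javascript
// rrweb event type numbers mapped to names (as I understand rrweb's enum).
const RRWEB_EVENT_NAMES = {
  0: "DomContentLoaded",
  1: "Load",
  2: "FullSnapshot",
  3: "IncrementalSnapshot",
  4: "Meta",
  5: "Custom",
  6: "Plugin",
};

// Returns the recording's duration and how many events of each type it holds.
function summarizeRecording(events) {
  const counts = {};
  let first = Infinity;
  let last = -Infinity;
  for (const event of events) {
    const name = RRWEB_EVENT_NAMES[event.type] ?? `Unknown(${event.type})`;
    counts[name] = (counts[name] ?? 0) + 1;
    first = Math.min(first, event.timestamp);
    last = Math.max(last, event.timestamp);
  }
  return { durationMs: events.length === 0 ? 0 : last - first, counts };
}
```

A session with one full snapshot and almost no incremental snapshots is a decent hint that the agent sat on a single page doing nothing, which tells you where to start scrubbing.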

Coming on the roadmap: the ability to inspect DOM state and console output at any specific point during the replay. Not just watching what happened, but being able to interrogate the page state at the exact moment things went wrong.

Pro-tip: Turn recordings on during initial development as a default, not just for debugging. You'll catch unexpected behaviour you wouldn't have thought to look for.


120 Concurrent Browsers, No Cold Starts

The concurrent session limit jumped from 30 to 120. For most use cases, that's enough to stop thinking about the limit entirely. Sessions are pulled from a global pool of warm instances, so there's no cold start penalty waiting for Chrome to initialise. Need more than 120? Available by request.
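Even with a limit of 120, you still want your own code to stay under it rather than discover it through errors. Here is a small sketch of a concurrency-capped map: `mapWithLimit` is a generic helper of my own, not a Browser Run API, and the `worker` you pass in would be whatever opens a session and does the work for one item.

```javascript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results come back in the same order as the input items.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until none are left.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

For example, `mapWithLimit(urls, 120, checkPrice)` would keep at most 120 sessions open at a time, where `checkPrice` is a hypothetical function that handles one URL. If any worker throws, the returned promise rejects, so wrap the worker in your own try/catch if you want partial results.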

The Quick Actions endpoints - screenshot, PDF, markdown extraction, crawl - also got a bump to 10 requests per second. The /crawl endpoint is worth a special mention: give it a URL, get back the full site as HTML, Markdown, or structured JSON, with configurable depth and scope. It respects robots.txt and AI Crawl Control and signs requests with a distinct bot ID.


What I'd Build With This

Reading through all of this, my mind kept jumping to ideas. Here are the ones I can't stop thinking about:

A research agent that actually browses. You give it a question. It searches, follows links, reads pages, and comes back with a synthesised answer - not just a list of URLs. The problem today is that most "research" agents cheat and only use search APIs. A proper browsing agent would be genuinely more useful, and Browser Run makes the infrastructure side tractable.

An automated job application assistant. It fills in applications across multiple job boards - but the moment it hits a CAPTCHA, a login wall, or a custom form that needs a human judgment call, it hands off. Human in the Loop makes this pattern viable. You review the tricky bits; it handles the repetitive ones.

A parallel price monitor. You want to track a product across 50 websites. Previously that meant 50 browser instances and a lot of infrastructure to manage. At 120 concurrent sessions from a warm pool, this becomes a background job you run on a schedule and forget about. The /crawl endpoint could handle the structured extraction.

A personal form-filling assistant. Connected to your coding agent via MCP, it fills in forms on your behalf - flight bookings, government portals, registration pages. Not a password manager, but a task-completing agent that can navigate arbitrary web UIs and check in with you when it gets stuck.

What ties all of these together is the combination of scale, observability, and human handoff. Each piece on its own is useful. Together they make a category of applications feel genuinely buildable in a way they didn't before.


Why This Feels Like a Shift

I came into this announcement without years of accumulated frustration about browser automation infrastructure. But even from the outside, you can see what's been missing: you could build agents that browse, but you couldn't trust them, watch them, or hand off to them reliably. The infrastructure was always the limiting factor.

Browser Run feels like the first time the infrastructure is actually designed for agents rather than adapted for them. Live View and Session Recordings solve the trust and observability problem. Human in the Loop solves the reliability problem. CDP and MCP support solve the integration problem. 120 warm concurrent sessions solve the scale problem.

The rename from Browser Rendering to Browser Run is a small thing. But it signals something real: this is infrastructure for running agents, not just for rendering pages. And now the feature set backs that up.

If you're exploring what AI agents can do on the web, this is worth digging into.