What is Browser Harness?

Browser Harness is a self-healing browser automation system that connects directly to Chrome via the Chrome DevTools Protocol (CDP) over a single WebSocket. Unlike Playwright or Selenium, there's no framework layer between the agent and the browser. When the agent needs a capability that doesn't exist in helpers.py, it writes the function itself mid-task and continues.

How does Browser Harness self-heal?

When the agent encounters a task it can't complete with existing helper functions, it edits helpers.py directly to add the missing function, then uses that function to complete the task. For example: if upload_file() doesn't exist and the agent needs to upload a file, it writes upload_file(), adds it to helpers.py (extending it from 192 to 199 lines), then calls it. The fix persists for the rest of the session.

What is the difference between Browser Harness and Playwright or browser-use?

Playwright and Selenium are framework-based — they provide a fixed set of APIs that constrain what agents can do. Browser-use adds Python and AI on top of Playwright, still with framework overhead. Browser Harness removes all abstraction: one WebSocket to Chrome, a ~195-line helpers.py as the starting toolset, and the agent can extend helpers.py freely. No framework limits what the agent can do.

What are domain skills in Browser Harness?

Domain skills are agent-generated files that capture the selectors, flows, and edge cases for specific sites or tasks — GitHub, LinkedIn, Amazon, etc. The agent writes these itself when it figures out something non-obvious. They're contributed back to the repo so other users' agents don't have to rediscover the same patterns. You don't hand-author them; the agent does.

Does Browser Harness work with a cloud browser?

Yes. Browser-use offers a cloud tier (cloud.browser-use.com) with a free tier of 3 concurrent browsers, no card required. Get an API key at cloud.browser-use.com/new-api-key. The agent can also sign itself up via docs.browser-use.com/llms.txt.

What is the CDP (Chrome DevTools Protocol)?

CDP is Chrome's native debugging protocol — a WebSocket-based interface that gives direct programmatic control over all browser functions: navigation, DOM inspection, JavaScript execution, network monitoring, screenshot capture, input simulation. Playwright and Selenium both use CDP internally but wrap it in an abstraction layer. Browser Harness exposes CDP directly.

Why remove the framework layer for browser agents?

Framework abstractions are designed for human programmers who need guardrails and consistent APIs. LLMs don't need guardrails — they need freedom. Every framework layer adds constraints on what the agent can express, introduces failure modes at the abstraction boundary, and adds tokens to context. Direct CDP gives the agent complete control and eliminates the 'this isn't supported by the framework' failure class.

Browser Harness: The Self-Healing Browser Agent That Writes Its Own Tools Mid-Task

Q: How do I install Browser Harness?

Paste this into Claude Code or Codex: 'Set up https://github.com/browser-use/browser-harness for me. Read install.md first to install and connect this repo to my real browser. Then read SKILL.md for normal usage. Always read helpers.py because that is where the functions are.' The agent handles the rest.

By Prahlad Menon Published 2026-04-20 4 min read

Every browser automation framework makes a tradeoff: they give you a clean API in exchange for limiting what you can do. Playwright handles 95% of browser tasks well. The other 5% — uploading files in unusual ways, handling edge-case OAuth flows, interacting with iframes that break standard selectors — hit framework limits and fail.

For human programmers, this is acceptable. You hit a wall, you find a workaround.

For AI agents running autonomously, this is a fundamental problem. The agent has no way to work around framework limits. It just fails.

Browser Harness takes a different approach: remove the framework entirely, and let the agent write what’s missing.

The Architecture

The entire system is ~600 lines:

run.py        (~36 lines)  — runs plain Python with helpers preloaded
helpers.py    (~195 lines) — the starting toolset; the agent edits these
admin.py      (~361 lines) — daemon bootstrap + CDP WebSocket + socket bridge
SKILL.md                   — day-to-day usage instructions
install.md                 — first-time setup
domain-skills/             — agent-generated site-specific knowledge

One WebSocket to Chrome via CDP. No Playwright, no Selenium, no intermediate abstraction. The agent sees the browser state directly and acts on it directly.

Self-Healing in Practice

Here’s what the self-healing looks like in the repo’s own example:

● agent: wants to upload a file
│
● helpers.py → upload_file() missing
│
● agent edits the harness and writes it
  helpers.py 192 → 199 lines
  + upload_file()
│
✓ file uploaded

The agent needs upload_file(). It doesn’t exist. Instead of failing, the agent:

Opens helpers.py
Writes the upload_file() function
Adds it to the file (192 → 199 lines)
Calls it
Continues

The function persists. If the agent needs to upload another file later in the same session, upload_file() is there. If someone contributes the session’s helpers.py additions back to the repo, the fix is available to everyone.

This is the “bitter lesson” applied to browser automation: instead of trying to anticipate every possible task with a comprehensive framework, give the agent the primitives and let it build what it needs.

Setting It Up (Claude Code or Codex)

The install prompt from the README is worth quoting directly:

Set up https://github.com/browser-use/browser-harness for me.

Read `install.md` first to install and connect this repo to my real browser.
Then read `SKILL.md` for normal usage.
Always read `helpers.py` because that is where the functions are.
When you open a setup or verification tab, activate it so I can see the active browser tab.
After it is installed, open this repository in my browser and, if I am logged in
to GitHub, ask me whether you should star it for me as a quick demo that the
interaction works — only click the star if I say yes. If I am not logged in,
just go to browser-use.com.

You paste this into Claude Code or Codex and the agent handles the rest — reading the install docs, enabling Chrome remote debugging, connecting via CDP, and verifying the setup with a live browser interaction.

Domain Skills: Agent-Generated Knowledge

Browser Harness includes a domain-skills/ directory with agent-generated knowledge for specific sites: GitHub, LinkedIn, Amazon, and others. These capture the non-obvious selectors, flows, and edge cases that take trial and error to discover.

The design principle is important: skills are written by the agent, not by you. When your agent figures out how to handle an unusual flow on a site, it files the skill itself. The repo asks contributors to PR these agent-generated files — not hand-authored ones, because hand-authored ones describe what developers think the flows look like, not what actually works.

The community effect compounds: every agent that encounters an unusual LinkedIn login flow and figures it out adds to the domain skill. The next agent starts with that knowledge already baked in.

Why Framework-Free Matters

The browser-use team has a post called “The Bitter Lesson” that explains the design philosophy. The short version: every framework layer encodes assumptions about what browser tasks look like. Those assumptions are wrong for the long tail of tasks. The more powerful the agent, the more the framework gets in the way.

Direct CDP gives the agent:

Complete DOM access — not filtered through framework abstractions
Arbitrary JavaScript execution — in any frame, on any element
Network interception — modify requests and responses in flight
Full input simulation — mouse, keyboard, touch, exactly as Chrome sees them
No version mismatches — Chrome’s CDP API is stable; framework APIs drift

The failure class “this isn’t supported by the framework” simply doesn’t exist.

Cloud Option

For deployments that need browser instances without managing Chrome yourself, browser-use cloud offers 3 concurrent browsers free, no card required. The agent can sign itself up by reading docs.browser-use.com/llms.txt — a plain-text file designed for LLM consumption that walks through the signup flow.

Resources

GitHub: github.com/browser-use/browser-harness
Cloud: cloud.browser-use.com
The Bitter Lesson post: browser-use.com/posts/bitter-lesson-agent-frameworks
Skills post: browser-use.com/posts/web-agents-that-actually-learn