servo-agent

// Agent-controllable browser · built on Servo

A browser our agents can read.

servo-agent turns a build of the Servo engine into a browser an LLM can drive and read. It wraps Servo's built-in W3C WebDriver, and read_page distills the post-render DOM into clean markdown — so an agent reads content, not tag soup.

read_page · live DOM → clean markdown · ~100–200× smaller than raw HTML

01

Why Servo

Most "agent browsing" puppeteers Chromium over the Chrome DevTools Protocol (CDP): heavyweight, hard to instrument, easily flagged by bot detection. Servo is a memory-safe, embeddable engine you can own end to end. Because it already speaks WebDriver, an agent is on familiar ground, and because you control the engine, you can extend it at the source. The payoff is read_page: an agent reads distilled content instead of 200 KB of tag soup.

Wraps W3C WebDriver

The standard automation surface LLMs already understand — no bespoke protocol.

open_url · find · click
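
For a sense of what that surface looks like on the wire, here is a minimal sketch of the three W3C WebDriver requests behind an open-and-read flow. The endpoint paths and payload shapes follow the WebDriver specification; the base URL, the port, and the helper names are assumptions for illustration, not servo-agent's actual code.

```python
import json
from urllib.request import Request

# Assumption: Servo's WebDriver server is reachable here; use whatever
# host and port your build was actually started with.
BASE = "http://127.0.0.1:7000"
JSON_HEADERS = {"Content-Type": "application/json"}


def new_session() -> Request:
    """POST /session: ask for a session with no special capabilities."""
    body = json.dumps({"capabilities": {"alwaysMatch": {}}}).encode("utf-8")
    return Request(f"{BASE}/session", data=body, headers=JSON_HEADERS, method="POST")


def navigate(session_id: str, url: str) -> Request:
    """POST /session/{id}/url: load a page in the session."""
    body = json.dumps({"url": url}).encode("utf-8")
    return Request(f"{BASE}/session/{session_id}/url", data=body,
                   headers=JSON_HEADERS, method="POST")


def page_source(session_id: str) -> Request:
    """GET /session/{id}/source: fetch the serialized DOM."""
    return Request(f"{BASE}/session/{session_id}/source", method="GET")
```

Passing each request to `urllib.request.urlopen` sends it; the new-session response carries the session id under `value.sessionId`, which the later calls reuse.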

read_page = the value-add

Post-render DOM → clean markdown, ~100–200× smaller than raw HTML.

trafilatura · noise-strip · absolute links
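
The "absolute links" step can be sketched with the standard library alone. This is not servo-agent's implementation: the function name `absolutize_links` is hypothetical, and the regular expression only covers simple inline markdown links.

```python
import re
from urllib.parse import urljoin

# Matches simple inline links such as [text](href); titles, nested
# brackets, and reference-style links are deliberately out of scope.
_MD_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def absolutize_links(markdown: str, base_url: str) -> str:
    """Rewrite relative link targets against the URL of the page they came from."""
    def fix(match: re.Match) -> str:
        text, href = match.group(1), match.group(2)
        # urljoin leaves already-absolute URLs untouched.
        return f"[{text}]({urljoin(base_url, href)})"

    return _MD_LINK.sub(fix, markdown)
```

Resolving links at extraction time means an agent can follow any link in the markdown without also having to remember which page it was read from.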
02

Use cases

1. Research source-reader

Read many sources as clean markdown for multi-source research — more sources per context budget, fewer "page didn't render" misses.

deep-research · read_page

2. Site QA / pre-deploy

Render a page, assert on its content, check that links resolve, and take a screenshot, all from a self-owned browser your CI can call, with no SaaS dependency.

verify · screenshot

3. Page watcher

Cron a page, extract a value, alert on change — real rendered pages beat fragile HTTP polling.

schedule · eval_js
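
The "alert on change" half of a page watcher can be sketched as a small piece of persisted state. The value itself would come from a tool such as eval_js or read_page; the function name `has_changed` and the JSON state file are assumptions for illustration, not part of servo-agent.

```python
import hashlib
import json
from pathlib import Path


def has_changed(key: str, value: str, state_file: Path) -> bool:
    """Return True when `value` differs from the one last recorded for `key`.

    The first observation of a key also counts as a change, because there
    is nothing recorded to compare it with.
    """
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    changed = state.get(key) != digest
    if changed:
        # Persist the new digest so the next run compares against it.
        state[key] = digest
        state_file.write_text(json.dumps(state))
    return changed
```

A cron job would call this once per run and send an alert only when it returns True. Storing a digest rather than the raw value keeps the state file small even when the watched value is a long block of text.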

4. Bot-walled fallback

When a cheap fetch is blocked or returns an empty JS shell, render it in an engine you own and instrument.

scrape · headless

5. Structured extraction

Pull an HTML table straight into JSON/CSV for a data pipeline.

extract_table · extract_links
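
As a rough illustration of table-to-records extraction, the sketch below parses a table with Python's standard `html.parser` and emits JSON-ready records or CSV. It is not the extract_table implementation: the class and function names are hypothetical, it assumes the first row is the header, and it does not handle nested tables or cells that span rows or columns.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr>, row by row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: list[list[str]] = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def _close_cell(self) -> None:
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr" and self._row is not None:
            self._close_cell()
            self.rows.append(self._row)
            self._row = None


def table_to_records(html: str) -> list[dict]:
    """Treat the first row as the header and zip each later row against it."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def records_to_csv(records: list[dict]) -> str:
    """Serialize records to CSV text, using the first record's keys as columns."""
    if not records:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

The records can go straight to `json.dumps` for a JSON pipeline, or through `records_to_csv` for a spreadsheet-style one.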

6. Universal "read the web"

A primitive for any agent task — distilled markdown instead of dumping raw HTML into context.

mcp · read_page
03

The tools

Exposed as the servo-agent MCP server — agent-shaped actions over WebDriver:

open_url · read_page · find · wait_for_selector · click · type_text · fill_form · scroll · extract_links · extract_table · eval_js · screenshot · status
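
Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message; the envelope follows the MCP specification, while the `url` argument name for open_url is an assumption, since the tools' parameter schemas are not listed here.

```python
import json
from itertools import count

# Monotonic request ids, so each response can be matched to its request.
_ids = count(1)


def tool_call(name: str, arguments: dict) -> str:
    """Build the JSON-RPC 2.0 message an MCP client sends to invoke a tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Hypothetical argument name: check the server's tool listing for the real schema.
message = tool_call("open_url", {"url": "https://example.com"})
```

In practice an MCP-aware agent such as Claude Code or Codex builds these messages itself; the sketch only shows what crosses the wire.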
04

Quick start

// install + drive

# install the harness
uv tool install servo-agent

# build the engine (sibling Servo fork)
./mach build -d --media-stack dummy

# prove it end-to-end
servo-agent selftest https://news.ycombinator.com

// wire into an agent (MCP)

# Claude Code
claude mcp add servo-agent -s user -- \
  uv run --project /path/to/servo-agent \
  servo-agent serve

# Codex — ~/.codex/config.toml
[mcp_servers.servo-agent]
command = "uv"
args = ["run", "--project", "…", "servo-agent", "serve"]