# How Accessibility Tree Formatting Affects Token Cost in Browser MCPs
    kuroko February 26, 2026

Token cost in browser automation MCPs has become a real topic: articles like ["Playwright MCP Burns 114K Tokens Per Test"](https://scrolltest.medium.com/playwright-mcp-burns-114k-tokens-per-test-the-new-cli-uses-27k-heres-when-to-use-each-65dabeaac7a0) have been making the rounds. Tools are approaching this from different angles: Playwright MCP's `--output-mode file` option saves snapshots to disk instead of returning them in LLM context, Vercel's [agent-browser](https://github.com/vercel-labs/agent-browser) compresses DOM state to a fraction of the original, and some tools add vision-based fallbacks for layout understanding.

I've been working on [WebClaw](https://github.com/kuroko1t/webclaw), an open-source Chrome extension-based browser MCP. It takes the accessibility tree approach like Playwright MCP, but with a more compact format. I wanted to measure the actual difference rather than guess, so I set up a side-by-side test.

## How I Measured

**Versions tested:**

- Playwright MCP: `@playwright/mcp` v0.0.68 (`npx @playwright/mcp@0.0.68 --headless`)
- WebClaw: `webclaw-mcp` v0.9.0 + Chrome extension v0.9.0
- Measured: February 26, 2026

I registered both [Playwright MCP](https://github.com/microsoft/playwright-mcp) and WebClaw as MCP servers in the **same Claude Code session**, then ran the same steps on each:

1. Navigate to the target URL
2. Call the snapshot tool (`browser_snapshot` / `page_snapshot`)
3. Measure the full response text length in characters
4. Estimate tokens as `characters / 4` (an approximation; actual tokenization varies by model)

**Both tools return the complete accessibility tree with no truncation.** WebClaw's default is unlimited output (no token budget), so this is a pure format efficiency comparison.
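The estimation rule in step 4 can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the measurements: the function name `estimate_tokens` and the `chars_per_token` parameter are hypothetical, and the divisor of 4 is the same rough approximation described above.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate the token count of a snapshot from its character length.

    Assumption: roughly 4 characters per token. Real tokenizers differ by
    model, so treat the result as a ballpark figure only.
    """
    return round(len(text) / chars_per_token)


# Stand-in for a snapshot response; a real one would be the full text
# returned by `browser_snapshot` or `page_snapshot`.
snapshot_text = "x" * 64_176
print(estimate_tokens(snapshot_text))
```

Because both tools are measured with the same divisor, the approximation error largely cancels out when comparing the two formats against each other.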
I picked three pages with different content patterns:

- **Wikipedia**: long article with many reference links and navigation templates
- **GitHub**: repository page with file listing, README, and sidebar
- **Hacker News**: list-style page with 30 items

**Important caveat on fairness:** Playwright MCP runs a headless Chromium (not logged in). WebClaw runs in the user's Chrome (logged in to GitHub in my case). This means WebClaw sees *more* UI on GitHub (authenticated menus, notifications, repo actions), which actually increases its output. The comparison is biased against WebClaw on that page.

## Results: Format Efficiency

Both tools returning full, untruncated accessibility trees:

| Site | Playwright MCP | WebClaw | Difference |
|------|---------------|---------|------------|
| [Wikipedia (MCP article)](https://en.wikipedia.org/wiki/Model_Context_Protocol) | 16,044 tokens (64,176 chars) | 7,860 tokens (31,439 chars) | **51% smaller** |
| [GitHub (anthropics/claude-cookbooks)](https://github.com/anthropics/claude-cookbooks) | 19,409 tokens (77,637 chars) | 4,304 tokens (17,215 chars) | **78% smaller** |
| [Hacker News (front page)](https://news.ycombinator.com/) | 14,547 tokens (58,189 chars) | 3,052 tokens (12,207 chars) | **79% smaller** |

The range is **51% to 79%** depending on the page. Let me dig into why.
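The percentages in the table can be re-derived from the character counts. The sketch below does that arithmetic in Python; the names `measurements` and `percent_smaller` are illustrative only, and the figures are copied from the table above.

```python
# (Playwright MCP chars, WebClaw chars) per site, taken from the results table.
measurements = {
    "Wikipedia": (64_176, 31_439),
    "GitHub": (77_637, 17_215),
    "Hacker News": (58_189, 12_207),
}


def percent_smaller(baseline_chars: int, compact_chars: int) -> int:
    """Return how much smaller the compact output is, as a whole percent."""
    return round(100 * (1 - compact_chars / baseline_chars))


reductions = {
    site: percent_smaller(baseline, compact)
    for site, (baseline, compact) in measurements.items()
}

for site, pct in reductions.items():
    print(f"{site}: {pct}% smaller")
```

Working from characters rather than estimated tokens avoids stacking a second rounding step on top of the `characters / 4` approximation.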
## What Creates the Difference

Comparing the actual output for the same Wikipedia page:

**Playwright MCP** (`browser_snapshot`):

```yaml
- generic [active] [ref=e1]:
  - link "Jump to content" [ref=e2] [cursor=pointer]:
    - /url: "#bodyContent"
  - banner [ref=e4]:
    - navigation "Site" [ref=e6]:
      - generic "Main menu" [ref=e7]:
        - button "Main menu" [ref=e8] [cursor=pointer]
```

**WebClaw** (`page_snapshot`):

```
[page "Model Context Protocol - Wikipedia"]
  [banner]
    [nav "Site"]
      [@e2 link]
    [search]
      [@e3 searchbox "Search Wikipedia"]
      [@e4 button "Search"]
```

The difference comes down to design choices. Each is reasonable on its own, but they compound:

| Design choice | Playwright MCP | WebClaw |
|---------------|---------------|---------|
| **Which elements get refs** | All elements (`generic`, `rowgroup`, `cell`...) | Only interactive elements (buttons, links, inputs) |
| **Attribute output** | `[active]`, `[cursor=pointer]`, `/url:` on all applicable | Minimal: only what's needed for action |
| **Table representation** | Full nested structure per cell | Compressed single-line rows |
| **Ref count (GitHub)** | 789 refs | 245 refs |

Playwright MCP's approach of labeling every element with a ref gives maximum flexibility for targeting any element. WebClaw trades that completeness for compactness by only labeling things the AI can actually interact with.

### Why the range is so wide (51% to 79%)

The format savings vary by page structure:

- **GitHub (78%)**: The file listing table is where the biggest difference shows. Playwright MCP assigns refs to every `row`, `cell`, and `generic` wrapper (789 total). WebClaw only labels links and buttons (245 total). Additionally, WebClaw follows the W3C Accessible Name specification, using `textContent` before the `title` attribute for buttons and links. On GitHub, many buttons have short display text ("X") but verbose title attributes ("Close dialog"), so using the spec-compliant order avoids the bloat.
- **Hacker News (79%)**: Simple, repetitive table structure. WebClaw's table compression (`[row] 1. | link | link`) eliminates most of the verbosity. Playwright MCP outputs nested `rowgroup > row > cell > generic > link` for each of the 30 items.
- **Wikipedia (51%)**: The article body has many inline links that both tools represent similarly. The savings come primarily from the navigation templates (Generative AI, Artificial Intelligence navboxes) where structural compression helps, but the text content itself is irreducible.

## Controlling Output Size

WebClaw defaults to unlimited output with no truncation. But when you need to manage token costs, two options are available:

**Interactive elements only**: `interactiveOnly`

```json
{ "interactiveOnly": true }
```

Strips all text content. A 2,000-line page becomes ~200 lines of buttons, links, and inputs.

**Landmark region focus**: `focusRegion`

```json
{ "focusRegion": "main" }
```

Only returns the `main`, `nav`, `header`, or `footer` section. Useful when you know where the content you need is.

Playwright MCP doesn't have equivalents; it always returns the full tree.

## The Broader Landscape

This comparison only covers in-context accessibility trees. The ecosystem is moving fast, and there are other approaches worth knowing about:

- **Playwright MCP file output** (`--output-mode file`): Saves snapshots to disk files instead of returning them in LLM context. Clients that support file references can read these without consuming context tokens. A fundamentally different approach to the same problem.
- **DOM compression tools** (Vercel's [agent-browser](https://github.com/vercel-labs/agent-browser), [browser-use](https://github.com/browser-use/browser-use), etc.): These extract and compress DOM/accessibility tree state, filtering down thousands of nodes to the most relevant elements. Some also support optional vision models for layout understanding as a secondary input.
WebClaw's approach is narrower: same accessibility tree method as Playwright MCP's `browser_snapshot`, but with a more compact format. The numbers above show what format choices alone can do, but they don't capture the full picture of what's possible with file-based or DOM compression approaches.

## Why Format Efficiency Still Matters

Even with file-based alternatives emerging, in-context snapshots remain the default for most MCP setups. A browser automation task rarely reads a page just once. Navigate, read, click, read again, fill a form, check the result: that's easily 5-10 snapshot calls. A 51-79% format reduction compounds across those calls.

## Tradeoffs

I'm biased (I built WebClaw), so let me be upfront about the tradeoffs.

**Where Playwright MCP is the better choice:**

- CI/headless environments (WebClaw needs a visible Chrome window)
- Cross-browser testing (Chromium, Firefox, WebKit)
- Zero-install setup (`npx` one-liner vs. Chrome extension)
- Complete output: every element gets a ref, nothing is omitted
- `--output-mode file` for file-based snapshots

**Where WebClaw fits better:**

- Token-sensitive workflows where format compactness matters
- Logged-in sessions (runs in your existing Chrome, no re-authentication)
- Bot-resistant sites (Chrome extension, no WebDriver flags)
- When you need output size controls (`interactiveOnly`, `focusRegion`)

**WebClaw limitations:**

- Requires Chrome + extension install
- No headless mode
- No test code generation
- Uses your real session (the AI operates with your credentials)

## Setup

**Claude Code:**

```bash
claude mcp add webclaw -- npx -y webclaw-mcp
```

**Claude Desktop**: add to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "webclaw": {
      "command": "npx",
      "args": ["-y", "webclaw-mcp"]
    }
  }
}
```

Then install the [Chrome extension](https://github.com/kuroko1t/webclaw/releases/latest): extract the zip, go to `chrome://extensions/`, enable Developer mode, and load the `dist/` folder.
## Wrapping Up

The takeaway isn't "use WebClaw instead of Playwright MCP"; it's that **accessibility tree format choices matter more than you'd expect**. Assigning refs to every element vs. only interactive ones, including `[cursor=pointer]` hints vs. omitting them, following the W3C accessible name spec vs. using title attributes: these small decisions compound into a 51-79% difference on real pages.

The browser MCP space is evolving quickly. File-based snapshots, DOM compression tools, and hybrid approaches are all worth watching. If you're hitting token limits with your current setup, the data here might help you understand why, and what to try next.

If you want to reproduce these measurements or try WebClaw, the [repo is open](https://github.com/kuroko1t/webclaw). Issues and feedback welcome; this is a solo project and I'm still figuring out the right tradeoffs.

**GitHub**: [github.com/kuroko1t/webclaw](https://github.com/kuroko1t/webclaw)

**npm**: `npx -y webclaw-mcp`

---

*WebClaw is MIT-licensed open source.*
