Agents & tools

Browser Agent

An AI agent that controls a real web browser — clicking, typing, and reading pages — to complete tasks on websites that lack APIs.

01 ——

In plain English

A browser agent is an AI agent whose primary tool is a web browser. It opens pages, fills forms, clicks buttons, scrolls, and reads the rendered page (the DOM, screenshots, or both), much as a human user would. That lets it work with any website, even one without an API.
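The loop behind that description can be sketched in a few lines: observe the page, pick one action, apply it, repeat. The sketch below is a toy under stated assumptions, not how any listed product is implemented. `FakePage`, `choose_action`, and `run_agent` are hypothetical names; the page is an in-memory stand-in rather than a real browser, and a hand-written rule replaces the model call that would normally choose the next action.

```python
from dataclasses import dataclass, field


@dataclass
class FakePage:
    """In-memory stand-in for a rendered page (hypothetical, not a real browser)."""
    url: str = "about:blank"
    fields: dict = field(default_factory=dict)
    submitted: bool = False

    def observe(self) -> dict:
        # What the agent "sees": a snapshot of the current page state.
        return {"url": self.url, "fields": dict(self.fields), "submitted": self.submitted}

    def act(self, action: dict) -> None:
        # Apply one browser-style action to the page.
        kind = action["type"]
        if kind == "goto":
            self.url = action["url"]
        elif kind == "type":
            self.fields[action["field"]] = action["text"]
        elif kind == "click" and action["target"] == "submit":
            self.submitted = True
        else:
            raise ValueError(f"unsupported action: {action}")


def choose_action(observation: dict, task: dict):
    """Rule-based placeholder for the model call that picks the next action."""
    if observation["url"] != task["url"]:
        return {"type": "goto", "url": task["url"]}
    for name, value in task["form"].items():
        if observation["fields"].get(name) != value:
            return {"type": "type", "field": name, "text": value}
    if not observation["submitted"]:
        return {"type": "click", "target": "submit"}
    return None  # nothing left to do: the task is complete


def run_agent(page: FakePage, task: dict, max_steps: int = 20) -> list:
    """Observe, decide, act; repeat until done or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        action = choose_action(page.observe(), task)
        if action is None:
            break
        page.act(action)
        history.append(action)
    return history


# Hypothetical task: open a sign-up page, fill two fields, submit.
task = {"url": "https://example.com/signup", "form": {"name": "Ada", "email": "ada@example.com"}}
page = FakePage()
history = run_agent(page, task)
```

In a real agent, `observe` would return a DOM snapshot or screenshot from an actual browser and `choose_action` would be a model call. The `max_steps` budget is there so the loop cannot run forever if the agent gets stuck.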

Why it's a 2025–26 breakout category: Most real-world software still doesn't expose an API. A browser agent can book a flight on a small airline's site, fill out a government form, scrape a competitor's pricing page, or run any web-only workflow. It's the universal adapter for the agent era.

Major products:

  • OpenAI Operator — ChatGPT's browser agent
  • Anthropic Computer Use — Claude controlling a virtual desktop
  • Perplexity Comet — agentic browser
  • Browser Use — open-source library powering many agents
  • Browserbase — managed browser infrastructure for agents

Limits: Browser agents are slow (each action typically waits on a model call plus a page load), brittle (page changes break flows), and visible to the sites they use (some detect and block automated traffic). Best for repetitive web tasks where speed isn't critical.
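To make the brittleness point concrete, here is a small sketch using only Python's standard-library `html.parser`. The two page snippets and the helper names (`ButtonFinder`, `find_by_id`, `find_by_text`) are made up for illustration. It shows a lookup by element id failing after a hypothetical redesign renames the id, while a lookup by visible label still succeeds. Matching on visible text is only one possible mitigation, and it breaks in turn if the label changes.

```python
from html.parser import HTMLParser


class ButtonFinder(HTMLParser):
    """Collects (id, visible text) for every <button> in an HTML string."""

    def __init__(self):
        super().__init__()
        self.buttons = []
        self._current_id = None
        self._in_button = False
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._in_button = True
            self._current_id = dict(attrs).get("id")
            self._text = []

    def handle_data(self, data):
        if self._in_button:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "button" and self._in_button:
            self.buttons.append((self._current_id, "".join(self._text).strip()))
            self._in_button = False


def find_buttons(html):
    finder = ButtonFinder()
    finder.feed(html)
    finder.close()
    return finder.buttons


def find_by_id(html, button_id):
    # Brittle: depends on an attribute the site can rename at any time.
    return next((b for b in find_buttons(html) if b[0] == button_id), None)


def find_by_text(html, label):
    # Somewhat sturdier: depends on the label a human user would read.
    return next((b for b in find_buttons(html) if b[1].lower() == label.lower()), None)


# Hypothetical pages: same button before and after a redesign renames its id.
OLD_PAGE = '<form><button id="checkout-btn">Check out</button></form>'
NEW_PAGE = '<form><button id="btn-7f3a">Check out</button></form>'

old_by_id = find_by_id(OLD_PAGE, "checkout-btn")    # found
new_by_id = find_by_id(NEW_PAGE, "checkout-btn")    # None: the flow breaks
new_by_text = find_by_text(NEW_PAGE, "check out")   # still found
```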


Vol. 4 · Issue 19 · Last reviewed 2026-05-30
