Browser Agent
An AI agent that controls a real web browser — clicking, typing, and reading pages — to complete tasks on websites that lack APIs.
In plain English
A browser agent is an AI agent whose primary tool is a web browser. It opens pages, fills forms, clicks buttons, scrolls, and reads the rendered page (through the DOM, screenshots, or both), much as a human user would. That lets it work with any website, including ones without an API.
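The open, fill, click, and read cycle described above can be sketched as an observe-decide-act loop. This is a minimal illustration, not how any particular product works: `FakePage`, `decide`, and `run_agent` are hypothetical names, the page is an in-memory stand-in rather than a real browser, and `decide` is a hard-coded rule standing in for the model call a real agent would make.

```python
from dataclasses import dataclass, field


@dataclass
class FakePage:
    """Hypothetical stand-in for a rendered page: a form with fields and a submit button."""
    fields: dict = field(default_factory=lambda: {"name": "", "email": ""})
    submitted: bool = False

    def observe(self) -> dict:
        # What the agent "sees": current field values and whether the form was sent.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def type_text(self, name: str, text: str) -> None:
        self.fields[name] = text

    def click_submit(self) -> None:
        self.submitted = True


def decide(observation: dict, goal: dict) -> tuple:
    """Stand-in for the model call: choose the next action from the observation."""
    for name, wanted in goal.items():
        if observation["fields"].get(name) != wanted:
            return ("type", name, wanted)
    if not observation["submitted"]:
        return ("click_submit",)
    return ("done",)


def run_agent(page: FakePage, goal: dict, max_steps: int = 10) -> list:
    """Observe, decide, act; repeat until done or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        action = decide(page.observe(), goal)
        history.append(action)
        if action[0] == "done":
            break
        if action[0] == "type":
            page.type_text(action[1], action[2])
        elif action[0] == "click_submit":
            page.click_submit()
    return history


if __name__ == "__main__":
    page = FakePage()
    for step in run_agent(page, {"name": "Ada", "email": "ada@example.com"}):
        print(step)
```

In a real agent, `observe` would return a DOM snapshot or screenshot from an actual browser and `decide` would be a model call. The `max_steps` budget is there so a confused agent cannot loop forever.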
Why it's a 2025–26 breakout category: Much real-world software still doesn't expose an API. A browser agent can book a flight on a small airline's site, fill out a government form, scrape a competitor's pricing page, or run any web-only workflow. It's the universal adapter for the agent era.
Major products:
- OpenAI Operator — ChatGPT's browser agent
- Anthropic Computer Use — Claude controlling a virtual desktop
- Perplexity Comet — agentic browser
- Browser Use — open-source library powering many agents
- Browserbase — managed browser infrastructure for agents
Limits: Browser agents are slow (each step typically involves a model call plus a page load or render), brittle (page changes break flows), and visible to the sites they use (some detect and block automated traffic). Best for repetitive web tasks where speed isn't critical.
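To make the brittleness point concrete, here is a small sketch of a flow that locates a button by its `id` and breaks when a redesign renames that `id`, even though the visible label is unchanged. The HTML snippets, the ids, and the helper names (`find_buttons`, `locate_by_id`, `locate_by_text`) are all made up for illustration. Matching on visible text is shown only as one possible fallback, not as a general fix.

```python
from html.parser import HTMLParser


class ButtonFinder(HTMLParser):
    """Collects (id, visible text) for every <button> in an HTML snippet."""

    def __init__(self):
        super().__init__()
        self.buttons = []
        self._current_id = None
        self._in_button = False
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._in_button = True
            self._current_id = dict(attrs).get("id")
            self._text = []

    def handle_data(self, data):
        if self._in_button:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "button" and self._in_button:
            self.buttons.append((self._current_id, "".join(self._text).strip()))
            self._in_button = False


def find_buttons(html):
    parser = ButtonFinder()
    parser.feed(html)
    parser.close()
    return parser.buttons


def locate_by_id(html, button_id):
    # Hard-coded selector: depends on markup the site owner can change at any time.
    return next((b for b in find_buttons(html) if b[0] == button_id), None)


def locate_by_text(html, text):
    # Fallback: match the label a human would read (case-insensitive).
    return next((b for b in find_buttons(html) if b[1].lower() == text.lower()), None)


# Hypothetical page before and after a redesign that renames the button id.
OLD_PAGE = '<form><button id="submit-btn">Book flight</button></form>'
NEW_PAGE = '<form><button id="cta-primary">Book flight</button></form>'

if __name__ == "__main__":
    print(locate_by_id(OLD_PAGE, "submit-btn"))     # found on the old page
    print(locate_by_id(NEW_PAGE, "submit-btn"))     # not found after the redesign
    print(locate_by_text(NEW_PAGE, "Book flight"))  # label lookup still matches
```

The same kind of failure applies to any locator tied to markup details, which is why flows that worked yesterday can stop working after a site update.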