AgentScreenshots does not currently ship an MCP server.
That is deliberate for the MVP. AgentScreenshots starts as a local-first CLI because the first workflow is simple: let an AI coding agent capture and inspect the rendered UI it just changed.

    agentshot "http://localhost:5173" ".agents/screenshots/home-mobile.png" \
      --viewport 390x844 \
      --scroll \
      --wait 1000

The command captures the requested view and saves a PNG or JPEG locally. There is no hosted rendering layer and no MCP server in the current product.
Why people ask about MCP
MCP is a natural question because many coding agents use MCP servers for external tools. Browser MCPs can expose browser control to the agent: navigate, click, inspect, capture, and sometimes read console or network state.
That can be powerful. It can also be more surface area than the agent needs for a routine visual check.
AgentScreenshots focuses on the smaller loop:
- edit frontend code
- run a screenshot command
- inspect the saved image
- patch the code
- recapture if needed
No server registration is required for that loop. The shell is already available to most coding agents.
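As a rough sketch of the "run a screenshot command" step, an agent could wrap the capture in a small shell function that also confirms an image was written before anyone inspects it. The function name `check_ui` and the empty-file check are assumptions for illustration, not part of AgentScreenshots; the URL, output path, and flags are the ones from the example earlier in this article.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical wrapper (not shipped with AgentScreenshots).
# Defaults reuse the URL and output path from the earlier example.
check_ui() {
  local url="${1:-http://localhost:5173}"
  local out="${2:-.agents/screenshots/home-mobile.png}"

  # Make sure the output directory exists before capturing.
  mkdir -p "$(dirname "$out")"
  agentshot "$url" "$out" --viewport 390x844 --scroll --wait 1000

  # Assumption: a missing or empty file means the capture did not succeed.
  if [ ! -s "$out" ]; then
    echo "capture failed: $out is missing or empty" >&2
    return 1
  fi

  # Print the path so the agent knows which image to inspect.
  echo "$out"
}
```

Calling `check_ui` with no arguments captures the default page; passing a URL and an output path overrides both. Printing the path keeps the result easy for an agent to pick up in the next step.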
CLI vs MCP
| Question | AgentScreenshots CLI | Browser MCP |
|---|---|---|
| Current AgentScreenshots support | Yes | No |
| Setup shape | Install npm package, run agentshot | Configure MCP server in the agent environment |
| Primary interface | Shell command | Tool calls through MCP |
| Best for | Repeatable visual checkpoints | Browser operation and interactive exploration |
| Output | Local PNG/JPEG path | Observations, browser state, screenshots, logs depending on server |
| Localhost support | Direct in the local command workflow | Depends on MCP server and where it runs |
| Agent prompt size | Small command guidance | More tool semantics to teach |
| Good default after UI edits | Often yes | Sometimes, especially if already configured |
The tradeoff is not “CLI good, MCP bad.” The tradeoff is workflow shape.
Where the CLI is stronger
The CLI is strong when the agent needs a precise visual artifact.
Examples:

    agentshot "http://127.0.0.1:5173" ".agents/screenshots/hero-after.png" \
      --selector "section:has-text('Visual checks')" \
      --padding 24 \
      --viewport 1440x1000

    agentshot "http://127.0.0.1:5173" ".agents/screenshots/pricing-mobile.png" \
      --viewport 390x844 \
      --selector "section:has-text('Pricing')" \
      --padding 16

    agentshot "http://127.0.0.1:5173" ".agents/screenshots/dropdown.png" \
      --hover ".nav-item" \
      --selector "header" \
      --padding 20

The command is explicit. The output path is stable. A future agent can find the artifact. The user can open the file. The capture can be repeated without reconstructing a browser session.
That makes the CLI a good fit for:
- before/after screenshots
- mobile layout checks
- section-specific review
- lazy-loaded pages
- hover and clicked states
- local dev servers
- CI jobs that can run the CLI and save artifacts
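For before/after screenshots, one possible pattern is to run the same capture twice under different labels, so the two files sit side by side. This is a minimal sketch: `capture_labeled` is a hypothetical helper, and the URL, selector, and flags are copied from the hero example above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of AgentScreenshots): capture the same
# section under a "before" or "after" label.
capture_labeled() {
  local label="${1:-}"

  # Accept only the two labels so output paths stay predictable.
  case "$label" in
    before|after) ;;
    *) echo "usage: capture_labeled before|after" >&2; return 2 ;;
  esac

  mkdir -p .agents/screenshots
  agentshot "http://127.0.0.1:5173" ".agents/screenshots/hero-${label}.png" \
    --selector "section:has-text('Visual checks')" \
    --padding 24 \
    --viewport 1440x1000
}
```

Run `capture_labeled before`, make the edit, then run `capture_labeled after`. Because nothing changes between the two runs except the label, any difference between the images comes from the code change.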
Where Browser MCPs are better
Use a Browser MCP when the agent needs to operate the browser, not just capture the result.
Browser MCPs are often better for:
- logging into an app
- clicking through multi-step flows
- filling forms
- debugging console errors
- inspecting network requests
- exploring an unfamiliar site
- reproducing user journeys
- reading live browser state before deciding what to do next
If the task is “act like a user for ten steps and tell me what breaks,” a Browser MCP is usually the better starting point.
If the task is “you changed the pricing section; prove it renders correctly on mobile,” a CLI screenshot is often enough.
Why not MCP first?
MCP can be added later if the workflow demand justifies it. Starting there would change the product shape.
An MCP server needs configuration, tool schemas, lifecycle decisions, documentation for each agent client, and support for environments where the server runs in a different place than the code. That may be worth it for broad browser control.
AgentScreenshots is narrower. It wants agents to learn one habit:
After meaningful UI changes, run agentshot, save the image in the repo, inspect it, and fix what the screenshot reveals.
The CLI keeps that habit portable across Claude Code, Codex, Cursor, Windsurf, OpenCode, and any other agent that can run shell commands.
How they can work together
A realistic setup may use both.
| Step | Better tool |
|---|---|
| Reproduce a logged-in dashboard bug | Browser MCP |
| Find which route and state matter | Browser MCP |
| Edit the component | Coding agent |
| Capture the fixed component at desktop and mobile | AgentScreenshots |
| Inspect the PNG artifacts | Coding agent |
| Click through the full flow again | Browser MCP |
AgentScreenshots does not need to replace browser tools. It can sit beside them as the cheap, repeatable visual-check layer.
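The "capture the fixed component at desktop and mobile" step could be sketched as a short loop over viewports. The helper name `capture_component` and the file naming scheme are assumptions; the two viewport sizes and the flags come from the examples earlier in this article.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of AgentScreenshots): capture one
# component at a desktop and a mobile viewport.
capture_component() {
  local url="$1" selector="$2" name="$3"
  local out_dir=".agents/screenshots"
  local viewport

  mkdir -p "$out_dir"
  for viewport in 1440x1000 390x844; do
    agentshot "$url" "$out_dir/${name}-${viewport}.png" \
      --viewport "$viewport" \
      --selector "$selector" \
      --padding 16
    # Print each path so the agent can inspect the artifacts next.
    echo "$out_dir/${name}-${viewport}.png"
  done
}
```

For example, `capture_component "http://127.0.0.1:5173" "section:has-text('Pricing')" pricing` would request two PNGs, one per viewport, at paths that include the viewport size.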
Current boundary
The current product boundary is clear:
- AgentScreenshots is not an MCP server today.
- It is not a browser cloud.
- It is not a hosted screenshot API.
- It does not upload screenshots in the MVP.
- It does not try to be a general browser-control product.
It is a local CLI and agent instruction package for capturing rendered UI that the machine can reach.
That narrowness is useful. Most frontend agent work does not need a new protocol. It needs a visual artifact before the agent says it is finished.