
AgentScreenshots MCP: Does AgentScreenshots Ship an MCP Server?

AgentScreenshots does not currently ship an MCP server. Here is how the CLI compares with MCP browser tools and when each makes sense.

positioning · Miha Cacic · May 21, 2026 · 5 min read

AgentScreenshots does not currently ship an MCP server.

That is deliberate for the MVP. AgentScreenshots starts as a local-first CLI because the first workflow is simple: let an AI coding agent capture and inspect the rendered UI it just changed.

agentshot "http://localhost:5173" ".agents/screenshots/home-mobile.png" \
  --viewport 390x844 \
  --scroll \
  --wait 1000

The command captures the requested view and saves a PNG or JPEG locally. There is no hosted rendering layer and no MCP server in the current product.

Why people ask about MCP

MCP (Model Context Protocol) is a natural question because many coding agents use MCP servers to reach external tools. Browser MCPs expose browser control to the agent: navigate, click, inspect, capture, and sometimes read console or network state.

That can be powerful. It can also be more surface area than the agent needs for a routine visual check.

AgentScreenshots focuses on the smaller loop:

  1. edit frontend code
  2. run a screenshot command
  3. inspect the saved image
  4. patch the code
  5. recapture if needed

No server registration is required for that loop. The shell is already available to most coding agents.
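As a minimal sketch, steps 2 and 3 of that loop could be wrapped in a small shell function. The name `capture_checkpoint` is hypothetical and not part of AgentScreenshots, and the sketch assumes `agentshot` exits non-zero when a capture fails. It only uses the positional URL and output path shown above, and passes any further flags through unchanged.

```shell
# capture_checkpoint URL OUT [extra agentshot flags...]
# Hypothetical helper: runs one capture and fails loudly if no non-empty
# image was written, so the agent never inspects a stale or missing file.
# Assumption: agentshot returns a non-zero exit status on failure.
capture_checkpoint() (
  url=$1
  out=$2
  shift 2
  mkdir -p "$(dirname "$out")"   # make sure the artifact directory exists
  rm -f "$out"                   # drop any previous capture at this path
  agentshot "$url" "$out" "$@" || { echo "capture failed: $url" >&2; exit 1; }
  [ -s "$out" ] || { echo "no image written: $out" >&2; exit 1; }
  echo "$out"                    # print the path the agent should open next
)
```

An agent could then run `capture_checkpoint "http://localhost:5173" ".agents/screenshots/home-mobile.png" --viewport 390x844 --scroll --wait 1000` and inspect the printed path. The function body runs in a subshell, so its variables and `exit` calls do not leak into the calling shell.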

CLI vs MCP

| Question | AgentScreenshots CLI | Browser MCP |
| --- | --- | --- |
| Current AgentScreenshots support | Yes | No |
| Setup shape | Install npm package, run agentshot | Configure MCP server in the agent environment |
| Primary interface | Shell command | Tool calls through MCP |
| Best for | Repeatable visual checkpoints | Browser operation and interactive exploration |
| Output | Local PNG/JPEG path | Observations, browser state, screenshots, logs depending on server |
| Localhost support | Direct in the local command workflow | Depends on MCP server and where it runs |
| Agent prompt size | Small command guidance | More tool semantics to teach |
| Good default after UI edits | Often yes | Sometimes, especially if already configured |

The tradeoff is not “CLI good, MCP bad.” The tradeoff is workflow shape.

Where the CLI is stronger

The CLI is strong when the agent needs a precise visual artifact.

Examples:

# Hero section at desktop width
agentshot "http://127.0.0.1:5173" ".agents/screenshots/hero-after.png" \
  --selector "section:has-text('Visual checks')" \
  --padding 24 \
  --viewport 1440x1000

# Pricing section at a mobile viewport
agentshot "http://127.0.0.1:5173" ".agents/screenshots/pricing-mobile.png" \
  --viewport 390x844 \
  --selector "section:has-text('Pricing')" \
  --padding 16

# Header with a nav item in its hover state
agentshot "http://127.0.0.1:5173" ".agents/screenshots/dropdown.png" \
  --hover ".nav-item" \
  --selector "header" \
  --padding 20

The command is explicit. The output path is stable. A future agent can find the artifact. The user can open the file. The capture can be repeated without reconstructing a browser session.

That makes the CLI a good fit for:

  • before/after screenshots
  • mobile layout checks
  • section-specific review
  • lazy-loaded pages
  • hover and clicked states
  • local dev servers
  • CI jobs that can run the CLI and save artifacts
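The before/after case in the list above can be sketched as a small rotation helper. The function name `recapture` and the `-before.png` / `-after.png` naming are assumptions for illustration, not AgentScreenshots features, and the byte comparison with `cmp` is only a crude change signal: two captures can differ in bytes while looking the same, so the agent should still open the images.

```shell
# recapture URL BASE [extra agentshot flags...]
# Hypothetical helper: keeps the previous capture as BASE-before.png,
# writes the new one to BASE-after.png, and reports whether the bytes differ.
recapture() (
  url=$1
  base=$2                        # path without suffix, e.g. .agents/screenshots/hero
  shift 2
  mkdir -p "$(dirname "$base")"
  # Rotate the last "after" image into the "before" slot, if there is one.
  if [ -e "$base-after.png" ]; then
    mv -f "$base-after.png" "$base-before.png"
  fi
  agentshot "$url" "$base-after.png" "$@" || exit 1
  if [ ! -e "$base-before.png" ]; then
    echo "no-baseline"           # first capture, nothing to compare against
  elif cmp -s "$base-before.png" "$base-after.png"; then
    echo "unchanged"             # byte-identical to the previous capture
  else
    echo "changed"
  fi
)
```

For example, `recapture "http://127.0.0.1:5173" ".agents/screenshots/hero" --viewport 1440x1000` leaves both files at stable paths, so a later agent or the user can open the pair side by side.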

Where Browser MCPs are better

Use a Browser MCP when the agent needs to operate the browser, not just capture the result.

Browser MCPs are often better for:

  • logging into an app
  • clicking through multi-step flows
  • filling forms
  • debugging console errors
  • inspecting network requests
  • exploring an unfamiliar site
  • reproducing user journeys
  • reading live browser state before deciding what to do next

If the task is “act like a user for ten steps and tell me what breaks,” a Browser MCP is usually the better starting point.

If the task is “you changed the pricing section; prove it renders correctly on mobile,” a CLI screenshot is often enough.

Why not MCP first?

MCP can be added later if the workflow demand justifies it. Starting there would change the product shape.

An MCP server needs configuration, tool schemas, lifecycle decisions, documentation for each agent client, and support for environments where the server runs in a different place than the code. That may be worth it for broad browser control.

AgentScreenshots is narrower. It wants agents to learn one habit:

After meaningful UI changes, run agentshot, save the image in the repo, inspect it, and fix what the screenshot reveals.

The CLI keeps that habit portable across Claude Code, Codex, Cursor, Windsurf, OpenCode, and any other agent that can run shell commands.

How they can work together

A realistic setup may use both.

| Step | Better tool |
| --- | --- |
| Reproduce a logged-in dashboard bug | Browser MCP |
| Find which route and state matter | Browser MCP |
| Edit the component | Coding agent |
| Capture the fixed component at desktop and mobile | AgentScreenshots |
| Inspect the PNG artifacts | Coding agent |
| Click through the full flow again | Browser MCP |

AgentScreenshots does not need to replace browser tools. It can sit beside them as the cheap, repeatable visual-check layer.
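The "capture at desktop and mobile" step could look something like the sketch below. The helper name `capture_viewports` and the file naming scheme are assumptions for illustration; the two viewport sizes are the ones used in the examples earlier in this post, and only the `--viewport` flag shown there is relied on.

```shell
# capture_viewports URL DIR NAME [extra agentshot flags...]
# Hypothetical helper: captures the same view once per viewport and writes
# DIR/NAME-<viewport>.png, stopping at the first failed capture.
capture_viewports() (
  url=$1
  dir=$2
  name=$3
  shift 3
  mkdir -p "$dir"
  for vp in 1440x1000 390x844; do   # desktop, then mobile
    agentshot "$url" "$dir/$name-$vp.png" --viewport "$vp" "$@" || exit 1
  done
)
```

Calling `capture_viewports "http://127.0.0.1:5173" ".agents/screenshots" pricing --selector "section:has-text('Pricing')" --padding 16` would produce `pricing-1440x1000.png` and `pricing-390x844.png`, which the coding agent can inspect before handing the flow back to a Browser MCP.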

Current boundary

The current product boundary is clear:

  • AgentScreenshots is not an MCP server today.
  • It is not a browser cloud.
  • It is not a hosted screenshot API.
  • It does not upload screenshots in the MVP.
  • It does not try to be a general browser-control product.

It is a local CLI and agent instruction package for capturing rendered UI that the machine can reach.

That narrowness is useful. Most frontend agent work does not need a new protocol. It needs a visual artifact before the agent says it is finished.
