
AgentScreenshots for Codex

A factual guide to using AgentScreenshots with Codex and Codex-like coding agents through shell commands, local artifacts, and screenshot inspection.

Positioning · Miha Cacic · May 21, 2026 · 5 min read

Codex and Codex-like agents work best when a task can be expressed through the repository, the shell, and inspectable local artifacts.

AgentScreenshots fits that model. It ships as an npm package named agentscreenshots and exposes a shell command named agentshot. The command captures rendered pages from the local development environment and saves each screenshot as a PNG or JPEG file at the output path the agent provides.

There is no special built-in Codex integration to claim here. The integration surface is the CLI plus whatever repo instructions or skills tell the agent when to use it.

That is enough for a useful frontend workflow.

The problem for coding agents

Frontend work is not complete just because the code compiles.

An agent can change a component, run tests, and still miss obvious rendered problems:

  • text wrapping badly on mobile
  • a sticky header covering content
  • a card grid overflowing at tablet width
  • a lazy image missing from the captured state
  • a hover menu positioned under another layer
  • a button label that no longer fits after copy changes

Humans catch these by looking. Codex needs an explicit visual checkpoint.

AgentScreenshots gives that checkpoint as a command.

agentshot "http://127.0.0.1:5173" \
  ".agents/screenshots/home-mobile.png" \
  --viewport 390x844 \
  --scroll \
  --wait 1000 \
  --wait-until load

The result is a local image file that the agent can inspect, compare, and mention in its final report.
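Because the artifact is an ordinary file, the agent can confirm that a capture actually produced one before relying on it. The following is a minimal sketch, not part of the CLI: `capture_checked` is a hypothetical helper name, and it assumes only that agentshot takes the URL and output path as its first two arguments, as in the command above.

```shell
# Sketch: run one capture and fail loudly if no usable image was written.
# capture_checked is a hypothetical helper, not part of the agentshot CLI.
capture_checked() (
  url=$1
  out=$2
  shift 2
  mkdir -p "$(dirname "$out")"
  # Remaining arguments are passed through to agentshot unchanged.
  agentshot "$url" "$out" "$@" || { echo "capture failed: $url" >&2; exit 1; }
  # -s is true only when the file exists and is non-empty.
  [ -s "$out" ] || { echo "no image written: $out" >&2; exit 1; }
  echo "$out"
)
```

It would be called with the same flags as a plain capture, for example `capture_checked "http://127.0.0.1:5173" ".agents/screenshots/home-mobile.png" --viewport 390x844 --scroll`. The check only proves a file exists; the agent still has to open and inspect the image.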

How Codex should use it

The practical loop is:

  1. Read the repo and understand the requested UI change.
  2. Start the dev server if needed.
  3. Make the smallest relevant code change.
  4. Run normal checks.
  5. Capture the changed UI with agentshot.
  6. Inspect the saved screenshot.
  7. Patch visible issues and capture again.

The key is step 6. Running a screenshot command is not enough. The agent should inspect the image before it treats the visual check as evidence.
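Step 2 of the loop is easy to get wrong: a capture taken before the dev server answers records an error page or nothing at all. One way to guard against that, sketched here with curl, is to poll the URL first. `wait_for_server` is a hypothetical helper and the retry count is an arbitrary default.

```shell
# wait_for_server is a hypothetical helper; it polls a URL with curl until
# the dev server answers or the retry budget runs out.
wait_for_server() (
  url=$1
  tries=${2:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl return non-zero on HTTP errors; -s and -o keep it quiet.
    if curl -fs -o /dev/null "$url"; then
      exit 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "server not reachable: $url" >&2
  exit 1
)
```

An agent could run `wait_for_server "http://127.0.0.1:5173"` after starting the dev server in the background and capture only once it returns successfully.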

mkdir -p .agents/screenshots/nav-fix

agentshot "http://127.0.0.1:5173/dashboard" \
  ".agents/screenshots/nav-fix/desktop-after.png" \
  --selector "nav" \
  --padding 20 \
  --viewport 1440x1000 \
  --wait 500

agentshot "http://127.0.0.1:5173/dashboard" \
  ".agents/screenshots/nav-fix/mobile-after.png" \
  --selector "nav" \
  --padding 16 \
  --viewport 390x844 \
  --wait 500

That gives Codex a concrete artifact for both desktop and mobile review.
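When the same surface is checked at several breakpoints, the two commands above can be folded into a loop. This is a sketch under assumptions: `capture_breakpoints` is a hypothetical wrapper, the viewport list is just the two sizes used in this post, and it applies one padding value to both captures for brevity.

```shell
# capture_breakpoints is a hypothetical wrapper; the flags are the ones
# already used in the commands above.
capture_breakpoints() (
  url=$1
  dir=$2
  selector=$3
  mkdir -p "$dir"
  # name:viewport pairs; extend the list if the project has more breakpoints.
  for pair in desktop:1440x1000 mobile:390x844; do
    name=${pair%%:*}
    viewport=${pair#*:}
    agentshot "$url" "$dir/$name-after.png" \
      --selector "$selector" \
      --padding 16 \
      --viewport "$viewport" \
      --wait 500 || exit 1
    # Print each artifact path so the caller can inspect and report it.
    echo "$dir/$name-after.png"
  done
)
```

For the nav fix this would be `capture_breakpoints "http://127.0.0.1:5173/dashboard" ".agents/screenshots/nav-fix" "nav"`, which prints the two paths to inspect.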

Repo instructions should stay small

For Codex, the best project-level instruction is usually a short "ignition" entry that names the tool and says when to use it, not a pasted manual.

## Tool Ignition

- `agentshot`: rendered webpage screenshots and visual UI checks.
  Use after meaningful frontend edits. Run `agentshot instructions`
  before first meaningful use in a session.

The CLI’s embedded instructions can carry the details: naming conventions, when to capture before/after, which flags map to common UI states, and how to report the artifact paths.

That matters because long static instructions drift. A command like agentshot instructions gives the agent a simple way to reload the current operating guide.
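Adding the entry can itself be scripted so that repeated setup runs do not duplicate it. The sketch below assumes the instructions live in a file named AGENTS.md, which is an assumption about the project rather than something AgentScreenshots requires; `add_ignition_entry` is a hypothetical helper.

```shell
# add_ignition_entry is a hypothetical helper. AGENTS.md is an assumed
# file name; substitute whatever instruction file your agent reads.
add_ignition_entry() (
  file=${1:-AGENTS.md}
  # Skip if an agentshot entry is already present, so reruns do not duplicate it.
  if [ -f "$file" ] && grep -q '`agentshot`' "$file"; then
    exit 0
  fi
  # Note: this appends its own heading; merge by hand if one already exists.
  cat >> "$file" <<'EOF'

## Tool Ignition

- `agentshot`: rendered webpage screenshots and visual UI checks.
  Use after meaningful frontend edits. Run `agentshot instructions`
  before first meaningful use in a session.
EOF
)
```

Running `add_ignition_entry` twice leaves a single entry in the file.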

What it can capture

AgentScreenshots can capture pages reachable from the current development environment:

  • localhost dev servers
  • preview deployments
  • staging environments
  • production URLs

It supports the capture shapes agents commonly need during frontend work:

| Need | Command shape |
| --- | --- |
| Full rendered page | Default capture, often with --scroll |
| One component or section | --selector with --padding |
| Mobile or desktop viewport | --viewport 390x844 or --viewport 1440x1000 |
| A vertical band of a long page | --from and --to |
| Lazy-loaded content | --scroll, --wait, or --wait-for |
| Dismissible overlay | --click-if-present |
| Required click state | --click |
| Hover state | --hover-if-present or --hover |
| Hydrated app state | --wait-for and --wait-until |
| Sharper inspection image | --device-scale-factor |

The output is a normal local file. That is the point: Codex can keep visual proof near the code it changed.

What it is not

AgentScreenshots is not a screenshot API, scraping platform, browser cloud, visual regression suite, MCP server, or general browser-control product.

It does not upload screenshots in the MVP.

It does not run hosted rendering for URLs that are not reachable from the environment where the command is executed.

It does not replace Playwright tests, browser automation, or manual design review.

It is narrower than those tools: local visual checks for coding agents.

When Codex should use another tool

AgentScreenshots is the wrong primary tool when the agent needs to operate an application over time.

| Task | Better primary tool |
| --- | --- |
| Log in, navigate, and reproduce a multi-step bug | Browser MCP or browser automation |
| Inspect console logs and network requests | Browser MCP, devtools, or Playwright |
| Run deterministic end-to-end assertions | Playwright test suite |
| Compare many pages over time in CI | Visual regression suite |
| Capture a changed UI surface for inspection | AgentScreenshots |

The overlap is real. Codex can use a Browser MCP to reach a state, then use AgentScreenshots to capture the final visual checkpoint if the state is reachable by URL and command flags. But AgentScreenshots should not be described as a browser-control layer.

Example final report

A useful Codex final report should connect the code change to the visual artifact:

Changed the pricing card grid and verified it with local captures:

- .agents/screenshots/pricing/desktop-after.png at 1440x1000
- .agents/screenshots/pricing/mobile-after.png at 390x844

The mobile capture showed the annual-price badge wrapping, so I tightened the
badge layout and recaptured before finishing.

That report is stronger than “I updated the responsive CSS” because it names what was visually checked.
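The artifact list in such a report can be generated rather than typed from memory, which avoids citing a file that was never written. A minimal sketch, assuming the captures for one task sit in a single directory; `list_artifacts` is a hypothetical helper, and the viewport annotations in the report above would still be added by the agent.

```shell
# list_artifacts is a hypothetical helper: it prints one report bullet per
# PNG or JPEG file under a directory, in sorted order.
list_artifacts() (
  dir=$1
  find "$dir" -type f \( -name '*.png' -o -name '*.jpg' -o -name '*.jpeg' \) |
    sort |
    while IFS= read -r path; do
      printf '%s\n' "- $path"
    done
)
```

For the pricing example, `list_artifacts .agents/screenshots/pricing` would print one bullet per saved capture, ready to paste into the report.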

Usage and pricing shape

AgentScreenshots counts one successful CLI screenshot as one visual check. Failed captures do not count.

The anonymous local trial includes 50 one-time screenshots. The plans are metered monthly:

  • Free: 100 screenshots per month
  • Solo: EUR 5/month or EUR 50/year for 2,000 screenshots per month
  • Pro: EUR 20/month or EUR 200/year for 10,000 screenshots per month
  • Studio: EUR 100/month or EUR 1,000/year for 100,000 screenshots per month

For Codex workflows, the budget question is simple: capture the pages, sections, breakpoints, and states that materially reduce frontend risk. Do not turn every code edit into a screenshot. Do capture before the agent claims a visual UI change is done.
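A rough estimate makes the budget concrete. The sketch below multiplies UI changes per month by breakpoints and by captures per breakpoint (recaptures included), then picks the smallest plan whose monthly quota covers the total. The quotas are the figures quoted above; the helper names and the example workload are illustrative assumptions, and the one-time trial is left out.

```shell
# Hypothetical helpers for a back-of-the-envelope capture budget.
captures_per_month() {
  # UI changes per month x breakpoints x captures per breakpoint
  echo $(( $1 * $2 * $3 ))
}

smallest_plan() {
  n=$1
  # Monthly quotas as listed in this post.
  if   [ "$n" -le 100 ];    then echo "Free"
  elif [ "$n" -le 2000 ];   then echo "Solo"
  elif [ "$n" -le 10000 ];  then echo "Pro"
  elif [ "$n" -le 100000 ]; then echo "Studio"
  else echo "over Studio quota"
  fi
}
```

For example, 40 UI changes a month at 2 breakpoints with 3 captures each is `captures_per_month 40 2 3`, or 240 captures, and `smallest_plan 240` reports Solo.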

Bottom line

AgentScreenshots gives Codex a local screenshot habit.

It is just a CLI, repo instructions, local capture output, and saved image artifacts. That is the useful part. Codex can run the command, inspect the PNG or JPEG, fix what it sees, and report the exact files it used as evidence.

For frontend coding agents, that is often the difference between “the code looks right” and “the rendered UI was checked.”
