
Use Claude Computer Use in 2026: API Route, Cowork, and the Tool Loop Most Tutorials Skip

What you'll learn
  • Distinguish the API tool loop from Cowork/Claude Code and pick the right route for your use case
  • Implement a working Claude computer use API call with proper sandbox isolation
  • Identify the three concrete prompt injection vectors that make unsandboxed computer use dangerous

Claude Computer Use lets Claude control a desktop by taking screenshots, moving the mouse, and typing — but there are two separate routes with completely different contracts: the API beta for builders shipping automation products, and Cowork/Claude Code for delegating tasks on your own machine. Pick the wrong one and you'll either over-engineer a delegation task or ship a security liability into production.

Most tutorials treat computer use as a monolithic "Claude controls your computer" feature. That framing is wrong — and it's why most implementations end up either brittle or dangerously insecure. The architectural truth is that Claude never executes actions itself. Claude outputs instructions. Your code executes them. That distinction changes everything about how you should build, sandbox, and audit a computer use system.

Two Execution Contracts, Not One

Anthropic launched computer use as an API public beta in October 2024 with Claude 3.5 Sonnet. On March 23, 2026, the capability landed in a separate product surface — Claude Cowork and Claude Code — as a research preview for Pro and Max subscribers.

These two surfaces have fundamentally different ownership models:

| | API Route | Cowork / Claude Code |
|---|---|---|
| Who owns the loop | You (your application code) | Anthropic's managed product |
| Who owns the sandbox | You | Anthropic |
| Permission model | Programmatic, per-request | Session-based, user-approved |
| Best for | Automation products, internal tooling | Personal delegation on your own machine |
| Audit trail | Whatever your code logs | Cowork sessions (not covered by Compliance API as of mid-2026) |

If you're shipping an automation feature into a product, use the API route. If you want Claude to do repetitive work on your own desktop, use Cowork. Don't mix them.

How the API Tool Loop Actually Works

The computer use API is a standard tool-use loop, not an autonomous agent. Here's the cycle:

  1. Your code captures a screenshot of the desktop state.
  2. You send the screenshot plus a task to Claude with the computer tool enabled.
  3. Claude responds with a tool_use block describing the next action: mouse_move, left_click, type, screenshot, etc.
  4. Your code executes that action inside a VM or container you control.
  5. Your code sends back a tool_result with the new screenshot.
  6. Claude determines the next action. Repeat until the task is complete.

Claude never touches your machine directly. It produces structured instructions; your application decides whether and how to execute them. This is the architectural fact that most tutorials elide — and it's where your safety controls actually live.
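Because your code sits between Claude's instruction and its execution, that is the natural place for a policy check. The sketch below is one hedged way to gate a `tool_use` input before it reaches the sandbox. The `ALLOWED_ACTIONS` set, the display constants, and the `validate_action` helper are illustrative names, not part of the Anthropic SDK, and the `action` and `coordinate` field names are assumed to match the computer tool's input shape.

```python
from typing import Any

# Hypothetical policy: the only actions this sandbox executor will accept.
ALLOWED_ACTIONS = {"screenshot", "mouse_move", "left_click", "type", "key"}

# Assumed to match the display size declared in the tool definition.
DISPLAY_WIDTH_PX = 1280
DISPLAY_HEIGHT_PX = 800


def validate_action(tool_input: dict[str, Any]) -> tuple[bool, str]:
    """Return (ok, reason) for a tool_use input before anything is executed."""
    action = tool_input.get("action")
    if action not in ALLOWED_ACTIONS:
        return False, f"action {action!r} is not on the allowlist"

    coordinate = tool_input.get("coordinate")
    if coordinate is not None:
        # Reject anything that is not an [x, y] pair inside the display.
        if not isinstance(coordinate, (list, tuple)) or len(coordinate) != 2:
            return False, "coordinate must be an [x, y] pair"
        x, y = coordinate
        if not (0 <= x < DISPLAY_WIDTH_PX and 0 <= y < DISPLAY_HEIGHT_PX):
            return False, f"coordinate {coordinate} is outside the display"

    return True, "ok"
```

A rejected action does not have to end the session. One option is to return the reason to Claude as the `tool_result` content so it can choose a different step.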

As of mid-2026, the `computer-use-2025-11-24` beta header supports Claude Opus 4.8, Opus 4.7, Opus 4.6, Sonnet 4.6, and Opus 4.5; a second header, `computer-use-2025-01-24`, covers Sonnet 4.5, Haiku 4.5, and deprecated models.
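If your application supports more than one model, it may help to keep that pairing in one place. The lookup below is a small sketch that only transcribes the model names and headers listed above. The keys are the display names used in this post rather than API model IDs, `beta_header_for` is a hypothetical helper, and the unnamed deprecated models are deliberately left out.

```python
# Model-to-header pairs transcribed from the paragraph above (mid-2026 state).
# Keys are display names from this post, not API model identifiers.
BETA_HEADER_BY_MODEL = {
    "Opus 4.8": "computer-use-2025-11-24",
    "Opus 4.7": "computer-use-2025-11-24",
    "Opus 4.6": "computer-use-2025-11-24",
    "Sonnet 4.6": "computer-use-2025-11-24",
    "Opus 4.5": "computer-use-2025-11-24",
    "Sonnet 4.5": "computer-use-2025-01-24",
    "Haiku 4.5": "computer-use-2025-01-24",
}


def beta_header_for(model_name: str) -> str:
    """Return the beta header for a model name, or fail loudly if unknown."""
    try:
        return BETA_HEADER_BY_MODEL[model_name]
    except KeyError:
        raise ValueError(f"no computer use header recorded for {model_name!r}") from None
```

Failing on an unknown name is a deliberate choice here: silently falling back to a default header would hide a mismatch until the API rejects the request.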

Minimal Working Implementation

```python
import anthropic
import base64

from screenshot_utils import capture_desktop_screenshot  # your sandbox screenshot tool
# Hypothetical module, not defined in this post: execute_action(tool_input) runs
# one action inside your sandbox and returns the new screenshot as PNG bytes.
from sandbox_executor import execute_action

client = anthropic.Anthropic()


def run_computer_use_task(task: str) -> None:
    # Seed the conversation with the current desktop state plus the task.
    screenshot_b64 = base64.b64encode(capture_desktop_screenshot()).decode()
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {"type": "base64", "media_type": "image/png", "data": screenshot_b64},
                },
                {"type": "text", "text": task},
            ],
        }
    ]

    while True:
        response = client.beta.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=[{"type": "computer_20251124", "name": "computer",
                    "display_width_px": 1280, "display_height_px": 800}],
            messages=messages,
            betas=["computer-use-2025-11-24"],
        )

        # Claude signals completion by ending its turn without a tool call.
        if response.stop_reason == "end_turn":
            break

        tool_calls = [b for b in response.content if b.type == "tool_use"]
        if not tool_calls:
            break

        tool_results = []
        for tool_call in tool_calls:
            result_screenshot = execute_action(tool_call.input)  # YOUR sandbox executor
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tool_call.id,
                "content": [{"type": "image", "source": {
                    "type": "base64", "media_type": "image/png",
                    "data": base64.b64encode(result_screenshot).decode()}}],
            })

        # Record Claude's turn, then answer it with the results of each action.
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})


run_computer_use_task("Open the terminal and list files in the home directory.")
```

Expected behavior: Claude issues a sequence of tool_use blocks (for example key: super to open a launcher, type: terminal, key: Return, type: ls, key: Return). Your executor fires each action inside the sandbox VM, and Claude reads back the resulting screenshot to confirm completion. The exact sequence varies with the desktop environment inside your sandbox.

Note the token overhead: the computer-use-2025-11-24 beta adds 466–499 system-prompt tokens plus 735 tool-definition tokens before counting screenshot images. A session with 10 screenshot exchanges at 1280×800 pixels can easily hit 50K tokens. Budget accordingly.
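To make that budget concrete, here is a rough estimator. It uses the overhead figures quoted above and assumes the approximation of about (width × height) / 750 tokens per image from Anthropic's vision documentation. Treat that ratio, and the function names, as assumptions to check against the usage numbers your own API responses report. Text tokens and Claude's output are ignored.

```python
# Fixed per-request overhead, taken from the figures quoted above.
SYSTEM_PROMPT_OVERHEAD = 499    # upper end of the 466-499 range
TOOL_DEFINITION_OVERHEAD = 735


def estimate_image_tokens(width_px: int, height_px: int) -> int:
    """Approximate tokens for one screenshot (assumed ~pixels / 750 heuristic)."""
    return round(width_px * height_px / 750)


def estimate_session_input_tokens(turns: int, width_px: int = 1280, height_px: int = 800) -> int:
    """Estimate cumulative input tokens across all requests in a session.

    Each request resends the whole conversation, so request number k carries
    k screenshots plus the fixed overhead.
    """
    per_image = estimate_image_tokens(width_px, height_px)
    per_request_overhead = SYSTEM_PROMPT_OVERHEAD + TOOL_DEFINITION_OVERHEAD
    total = 0
    for turn in range(1, turns + 1):
        total += per_request_overhead + turn * per_image
    return total
```

Because every request replays earlier screenshots, the cumulative cost grows roughly with the square of the number of turns. Capping the turn count or dropping old screenshots from `messages` are two ways to keep it bounded.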

The Prompt Injection Risk You Can't Ignore

Prompt injection is OWASP's #1 LLM vulnerability in 2026, and computer use makes it catastrophically more dangerous than in chatbot contexts.

The attack is straightforward: a malicious website or document embeds hidden text — invisible to humans, visible to Claude's vision model — that contains instructions. When Claude reads the page as part of a task, it may execute the injected instructions. HiddenLayer demonstrated that this can trigger rm -rf / on an unsandboxed machine. In June 2026, PromptArmor showed a complete enterprise attack chain where a Word document with invisible injection text caused Cowork to locate and exfiltrate financial documents to an attacker-controlled account.

Anthropic's own guidance for production use:

  • Run inside a container or VM. Never point computer use at your dev machine or production host.
  • Restrict network access. Whitelist only the domains the task requires. An agent that can't reach arbitrary URLs can't exfiltrate data.
  • No credentials in the sandbox. No browser password managers, no SSH keys, no API tokens.
  • Log every action. Screenshot every state transition. A complete audit trail is the only way to answer compliance questions after the fact.
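The network restriction in the list above is best enforced at the network layer (firewall or egress proxy), but an application-level check can serve as a second line of defense. The sketch below shows one possible host allowlist check. The `ALLOWED_HOSTS` values and the `is_url_allowed` helper are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: only the hosts this particular task needs.
ALLOWED_HOSTS = frozenset({"example.com", "docs.example.com"})


def is_url_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose exact host is on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme.lower() not in {"http", "https"}:
        return False
    # .hostname strips userinfo and port and lowercases the host, so
    # "https://example.com@evil.net/" resolves to "evil.net" here.
    host = (parts.hostname or "").lower()
    return host in allowed_hosts
```

Exact host matching is intentional: suffix or substring matching would let a host such as `example.com.evil.net` slip through.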

The audit trail requirement is particularly sharp for enterprise teams: as of mid-2026, Cowork sessions are explicitly excluded from Anthropic's Compliance API and Audit Logs. If your security or compliance team asks "what did Claude do on that machine at 14:32?", the API route with your own logging is the only path to an answer.
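A question like that is only answerable if each action was recorded with a timestamp when it ran. The following is a minimal sketch of an append-only JSON Lines audit log with a time-window query. `log_action`, `actions_between`, and the entry fields are illustrative choices, not an Anthropic API, and a production system would likely store screenshots alongside each entry.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def log_action(log_path: Path, tool_use_id: str, tool_input: dict,
               when: Optional[datetime] = None) -> dict:
    """Append one executed action to a JSON Lines audit log and return the entry."""
    entry = {
        "timestamp": (when or datetime.now(timezone.utc)).isoformat(),
        "tool_use_id": tool_use_id,
        "input": tool_input,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def actions_between(log_path: Path, start: datetime, end: datetime) -> list:
    """Return logged entries with start <= timestamp < end (timezone-aware datetimes)."""
    results = []
    with log_path.open(encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            if start <= datetime.fromisoformat(entry["timestamp"]) < end:
                results.append(entry)
    return results
```

Calling `log_action` right before the executor runs each `tool_use` block, and querying `actions_between` for the minute in question, gives a direct answer to the 14:32 question.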

When Cowork Is the Right Call

If you're an individual contributor delegating repetitive research, data entry, or file organization on your own machine, Cowork is the faster answer. Open Claude Desktop, switch to Cowork, describe the task, review the plan, and leave the desktop app running. Claude always asks permission before accessing a new application and can be interrupted at any point.

The tradeoff is control: you get speed and simplicity, but you're working within Anthropic's managed session model rather than a sandbox you own. For personal productivity tasks where you're present to supervise, that's a reasonable deal. For unattended automation, it's not.

Knowledge Check

Which of the following is true about the Claude Computer Use API?

  A) Claude directly executes mouse and keyboard actions on the host machine
  B) Your application code executes actions; Claude only outputs instructions in tool_use blocks
  C) The computer use beta requires a separate Anthropic subscription beyond API access
  D) Prompt injection is only a risk when browsing untrusted websites, not when processing local documents

Correct answer: B. Claude outputs tool_use blocks; your code decides whether and how to execute them inside your sandbox. This is the load-bearing architectural fact for building safe computer use systems. Prompt injection (ruling out D) is equally dangerous from local documents — the PromptArmor June 2026 demo used a Word file, not a website.


If you want to go deeper on building production-grade agentic systems with Claude — including tool loops, sandboxing patterns, and multi-agent orchestration — the Claude Tool Use from Zero: From Basics to Production Connectors course covers the full stack from first API call to deployed agent pipeline.

References

  1. www.anthropic.com
  2. platform.claude.com
  3. www.digitalapplied.com
  4. blog.laozhang.ai
  5. siliconangle.com
  6. www.hiddenlayer.com
  7. www.kunalganglani.com
  8. www.promptarmor.com
  9. www.truefoundry.com