Claude Tool Use from Zero: From Basics to Production Connectors
Developers who want to master Claude's tool use capabilities, from simple function calling to building robust specialized MCP servers.
- Understand and implement Claude's native tool use
- Build, test, and deploy compliant MCP servers
- Design secure, observable tool connectors for real-world domains
- Debug complex tool interaction and authorization issues
- Implement structured logging and audit trails for tool operations
Introduction to Claude's Tool Use
This chapter gets you from "Claude can answer questions" to "Claude can decide when to call a structured function, receive the result, and continue the job." That is the first practical step toward real connectors.
Claude tool use is not magic plugin installation. In the Anthropic Messages API, you describe tools with names, descriptions, and JSON input schemas. When Claude decides a tool is needed, the response contains a tool_use content block and the API response stop reason is tool_use.[1] Your application runs the actual code, then sends a follow-up message containing a tool_result. Claude never reaches into your runtime by itself; your host application remains the executor and policy boundary.
That boundary matters. If your tool fetches a stock price, deletes a file, sends an invoice, or queries a customer database, Claude only proposes the call. Your software decides whether the call is valid, authorized, observable, and safe.
Prerequisites check
Before you start, verify that you can do three things:
- Read and write a small Python or TypeScript script.
- Store an API key in an environment variable rather than hard-coding it.
- Understand JSON objects, required fields, and string/number types.
If those are shaky, finish a basic API-client tutorial first. Tool use adds a multi-step protocol on top of ordinary API calls; it is not a good place to learn HTTP from scratch.
The mental model: model chooses, host executes
A tool-use exchange has four parts:
- You send Claude a user request plus a list of available tools.
- Claude returns either normal text or a
tool_useblock. - Your host validates the tool name and input, executes local code, and returns a
tool_result. - Claude uses that result to produce the final answer or request another tool.
- Claude only proposes a tool call; your host application is always the executor and the security boundary.
- The stop reason `tool_use` signals that Claude is requesting a tool rather than producing a final answer.
- The four-step loop (request → tool_use block → tool_result → final answer) is the stable control flow regardless of SDK version.
The tool description is part of the model's context. It should say what the tool does, when to use it, and what each field means. The input schema is the contract your code can validate before execution. Anthropic recommends precise tool definitions because ambiguous descriptions make tool selection less reliable.[1]
Here is the smallest useful stock-price tool. It uses a fake in-memory price table so the first exercise is reproducible without a paid market-data provider.
You have a tool named get_stock_price that accepts {ticker: string}. Use it to answer: What is the current price of KOENIG?
[tool_use] name: get_stock_price input: {"ticker":"KOENIG"}
After your host returns the tool result: {"ticker":"KOENIG","price":42.15,"currency":"USD","as_of":"2026-05-14T12:00:00Z"}
Claude can answer: KOENIG is trading at 42.15 USD as of 2026-05-14T12:00:00Z.`} />
The important phrase is "your host returns." Claude does not know the price. The model selected the tool and filled the arguments; your program supplied the facts.
A first implementation shape
The exact SDK syntax can change, but the control flow is stable:
- SDK syntax evolves between releases, but the underlying request-tool_use-tool_result-answer loop does not change.
- The tool definition object must include a name, description, and a JSON input schema with required fields declared explicitly.
- In a full application, tool dispatch is a switch or map by tool name, not a single hardcoded function.
```python TOOLS = [ { "name": "get_stock_price", "description": "Return the latest known price for a ticker symbol from the demo portfolio feed.", "input_schema": { "type": "object", "properties": { "ticker": { "type": "string", "description": "Uppercase ticker symbol, for example KOENIG" } }, "required": ["ticker"] } } ]
def get_stock_price(ticker: str) -> dict: prices = { "KOENIG": {"price": 42.15, "currency": "USD"}, "PAPER": {"price": 18.40, "currency": "USD"}, } symbol = ticker.upper() if symbol not in prices: raise ValueError(f"Unknown demo ticker: {symbol}") return {"ticker": symbol, **prices[symbol], "as_of": "2026-05-14T12:00:00Z"} ```
In a full app, the code around this function calls the Messages API, checks for a tool_use block, dispatches by name, catches errors, and sends the result back as a tool result. Anthropic's Messages API is the primary API surface for this interaction.[2]
Why input schemas are not optional
Without a schema, every tool call becomes a guess. The model may send stock, symbol, ticker_symbol, or company_name. Your application then either breaks or accepts loose input that later creates security problems.
- Without a required input schema, the model has no contract and will use inconsistent field names across calls.
- A well-defined schema lets your host reject malformed input before it touches any external system.
- Specific descriptions ("return the latest quote for one uppercase ticker") are more reliable than vague descriptions ("fetch data").
Schemas protect both sides:
- Claude gets a compact contract for which fields to fill.
- Your host can reject malformed input before touching external systems.
- Logs become comparable because the same tool always receives the same shape.
Use specific descriptions. "Fetch data" is weak. "Return the latest quote for one uppercase ticker in the demo portfolio feed" is useful.
<Callout type="warning"> Do not connect a write-capable production tool on your first pass. Start with a read-only tool whose output you can verify manually. Once parsing, validation, and logging work, then add mutation tools with explicit approval gates. </Callout>
Common first failures
The first failure is over-broad tools. A tool named run_python or query_database is easy to demo and dangerous to operate. It gives the model a low-level execution primitive instead of a business action. Prefer get_invoice_status, lookup_stock_price, or list_open_support_cases.
- Over-broad tools like `run_python` or `query_database` hand the model a general primitive instead of a bounded business action.
- Hidden side effects (a tool that reads, updates, and emails) must be named explicitly in the description and confirmed in host code.
- Model-produced JSON must still be validated in your runtime; schemas guide the model but do not replace server-side input checks.
The second failure is hidden side effects. A tool named sync_customer might read from Salesforce, update Stripe, and email an account manager. The model cannot reason about that safely from the name. If a tool changes state, say so in the description and require confirmation in your host.
The third failure is treating model output as trusted JSON. Even when using tool schemas, validate inputs in your own runtime. Consistency guidance from Anthropic emphasizes strengthening outputs through constraints and checks, not wishful parsing.[3]
I am designing a first Claude tool for a finance assistant. Compare these tool names and tell me which is safer: run_sql(query), get_customer_balance(customer_id), or update_account(anything). Explain…
Why: - It is narrow and business-specific. - Its input is constrained to a customer identifier. - It sounds read-only, which makes review and logging easier.
run_sql(query) is too powerful because it exposes a general database primitive. update_account(anything) is vague and write-capable, so it needs a much stronger schema, authorization check, and human approval step.`} />
Hands-on exercise
Build a local script that defines get_stock_price(ticker) and exposes it to Claude as a tool. Use only a hard-coded demo price table.
Success criteria:
- The tool schema has exactly one required field:
ticker. - A request for
KOENIGproduces a tool-use turn, then a final answer with price, currency, and timestamp. - A request for an unknown ticker returns a controlled error, not a stack trace.
- You log the tool name, validated input, and success/failure.
What's next
Chapter 2 moves from one client-side function to MCP, the protocol that lets hosts discover and call tools, resources, and prompts from external servers.
[1]: Anthropic, "Tool use with Claude", https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/overview [2]: Anthropic, "Messages API", https://docs.anthropic.com/en/api/messages [3]: Anthropic, "Increase output consistency", https://docs.anthropic.com/en/docs/test-and-evaluate/strengthen-guardrails/increase-consistency
References
Beyond Function Calling: Understanding MCP
Chapter 1 gave Claude one client-side function: a stock-price lookup that your host application described, executed, and returned to the model. That pattern is the foundation of tool use. It also exposes the first scaling problem. If every AI application has to hand-code its own GitHub connector, database connector, design-system connector, and finance connector, the industry ends up with dozens of one-off integrations that all solve discovery, credentials, logs, and safety in slightly different ways.
The Model Context Protocol, or MCP, is the standard layer that moves reusable capabilities behind a protocol boundary. The official architecture documentation describes MCP as a system where host applications connect to MCP servers through MCP clients, and those servers expose capabilities such as tools, resources, and prompts.[^architecture] The practical result is simple: instead of saying, "My app gave Claude a Python function," you can say, "Claude Code connected to a server that advertises a controlled set of capabilities."
That distinction matters for the rest of this course. Native function calling teaches the turn-by-turn mechanics: describe a tool, receive a tool request, run code, return the result. MCP teaches connector architecture: discover capabilities, isolate domain logic, reuse the same server across hosts, and keep policy enforcement out of the model prompt.
By the end of this chapter, you should be able to look at an MCP server and answer five production questions:
- Which application is the host?
- Which process or remote endpoint is the server?
- Which advertised capabilities are tools, resources, and prompts?
- Which transport connects them?
- Where do secrets and security decisions live?
Those five questions are enough to keep you out of most early MCP design mistakes.
Prerequisites check
Before continuing, make sure you can explain the Chapter 1 tool-use loop without looking at notes:
- Claude receives a user request plus a list of available tools.
- Claude returns a
tool_useblock when it wants one of those tools. - Your host validates the tool name and input.
- Your host executes the real code.
- Your host sends a
tool_resultback to Claude.
If that loop is still fuzzy, repeat the Chapter 1 stock-price exercise first. MCP does not remove the loop. It puts a standard client-server protocol around the capabilities that the host can offer to Claude.
You also need a terminal where you can run basic commands. The hands-on exercise uses Claude Code's MCP configuration flow, but the conceptual work applies to Claude Desktop, API-hosted MCP connectors, and custom MCP clients as well.
From one-off functions to connector servers
Native tool use is local to an application. You describe a tool in the API request, and your application handles the result. This is ideal when the tool is small, private to your product, or not worth sharing across environments.
- Native tool use is per-application; MCP moves capability behind a protocol boundary that any MCP-compatible host can reuse.
- An MCP server encapsulates domain logic, credentials, and policy so every host application does not have to re-implement them.
- The official TypeScript SDK structure is: create an McpServer, register capabilities, create a transport, connect — the same shape applies in Python.
MCP becomes useful when a capability should be reusable. A finance connector might need to query invoices, summarize customer balances, expose an accounts-receivable policy, and offer a reusable collection-email prompt. Those pieces should not be copied into every AI application. They belong in a connector server maintained by the team that understands the finance system.
The official MCP TypeScript SDK frames server development around three steps: create an McpServer, register tools/resources/prompts, create a transport, and connect the server to that transport.[^typescript-sdk] That shape is what you will build in Chapter 3. For now, focus on the architecture:
- The host is the user-facing AI application, such as Claude Code.
- The client is the MCP protocol component inside that host.
- The server is the external process or service that exposes capabilities.
- The transport is the connection mechanism, such as local stdio or remote HTTP.
When someone says "Claude called my MCP server," translate that into the precise version: the host's MCP client discovered capabilities from the MCP server, the model chose or benefited from one of those capabilities, and the host sent the server a protocol request.
<Callout type="warning"> MCP is not a security product by itself. It standardizes how capabilities are exposed and called. Your server still owns authentication, authorization, input validation, rate limits, error handling, and audit logs. </Callout>
The three primitives: tools, resources, and prompts
MCP has several protocol concepts, but this course starts with the three primitives you will use constantly: tools, resources, and prompts.
- Tools are callable actions with side effects or query logic; resources are stable, readable context identified by URIs; prompts are reusable instruction templates.
- Modeling a stable policy document as a tool (`get_refund_policy()`) obscures its read-only nature and pollutes the action surface.
- Classifying capabilities correctly affects safety, UX, and observability — not just naming convention.
Tools are callable actions. The MCP tools specification describes tools as functions exposed by a server that can be invoked by clients, with names, descriptions, input schemas, and returned content.[^tools] Use a tool when something needs to happen: search invoices, create a ticket, list files, run a diagnostic query, or draft a document from live data.
Resources are readable context. The MCP resources specification describes resources as data exposed by servers that clients can read, often identified by URIs.[^resources] Use a resource when the model needs context rather than an action: a refund policy, a project README, a database schema summary, or a localized configuration file.
Prompts are reusable prompt templates. The MCP prompts specification describes prompts as server-provided templates that can accept arguments and return messages for a workflow.[^prompts] Use a prompt when the server knows a repeatable instruction pattern: draft a polite collection email, summarize a pull request against team standards, or prepare a compliance review checklist.
Here is the rule of thumb:
- If the model should ask the server to do something, use a tool.
- If the host should attach known context for the model to read, use a resource.
- If the connector should provide a reusable instruction pattern, use a prompt.
That classification is not academic. It changes safety, UX, and observability. A refund policy modeled as get_refund_policy() makes a read-only document look like an action. A write-capable operation modeled as a resource hides its side effect. A long workflow prompt buried inside a tool description becomes hard to version and test.
Classify each MCP capability as a tool, resource, or prompt. Use one sentence of reasoning for each: (1) list_unpaid_invoices(customer_id), (2) company://policies/refund-policy, (3) draft_collection_e…
Notice that "read-only" does not automatically mean "resource." list_unpaid_invoices(customer_id) is read-only from the business user's perspective, but it is still an action because the server must execute a query with arguments. A resource is better when the item is stable enough to identify and retrieve directly.
A connector example: accounts receivable
Generic examples like foo() and bar() do not teach connector design. Use a domain.
Imagine a small business wants Claude Code to help with accounts receivable. The underlying system has customers, invoices, payments, and reminder-email templates. A weak MCP server mirrors the database:
query_database(sql)
http_request(method, url, body)
send_email(to, subject, body)
Those tools are flexible, but they force Claude to reason at the wrong layer. They also create serious safety problems. query_database(sql) can over-fetch sensitive records. http_request() can reach unapproved endpoints. send_email() can contact customers without a business approval step.
A stronger server exposes business capabilities:
```text Tools: - list_overdue_invoices(customer_id, max_age_days) - draft_invoice_reminder(customer_id, invoice_ids) - submit_reminder_for_approval(draft_id)
Resources: - finance://policies/collections - finance://customers/{customer_id}/account-summary
Prompts: - write_polite_payment_reminder(customer_name, invoice_summary, policy_uri) - summarize_receivables_risk(customer_summary_uri) ```
This design gives the model useful verbs without handing it raw infrastructure. It also gives the server natural enforcement points. list_overdue_invoices can cap result size. draft_invoice_reminder can redact sensitive notes. submit_reminder_for_approval can require a human approval state. The resources provide stable context, and the prompts encode the organization's preferred language.
The server boundary is where you turn a messy internal system into a model-friendly, policy-aware interface.
Host, client, server, and transport
The MCP architecture is easiest to understand as a responsibility split:
- The host is the user-facing application (e.g., Claude Code), the client is its protocol component, and the server is the external capability provider.
- Use local stdio for scripts that need direct machine access; use remote HTTP for cloud services; SSE is deprecated for new work.
- Configuration scope (local, project, user) controls who can see the server, and secrets should use environment variable expansion rather than committed values.
| Part | Responsibility | Example |
|---|---|---|
| Host | User-facing AI application and UX | Claude Code |
| Client | Protocol connection managed by the host | Claude Code's MCP client for one configured server |
| Server | External capability provider | A local Node.js file-browser server |
| Transport | How messages move | stdio for local, HTTP for remote |
Claude Code's MCP documentation shows three broad connection options: remote HTTP servers, remote SSE servers, and local stdio servers.[^claude-code] The same documentation marks HTTP as the recommended option for remote cloud services and explains that local stdio servers run as local processes on your machine.[^claude-code] It also notes that SSE is deprecated in favor of HTTP where available.[^claude-code]
For a beginner, this gives you a clear decision tree:
- Use local stdio when the server is a local script or needs direct local machine access.
- Use remote HTTP when the server is a cloud service or team-managed endpoint.
- Avoid starting new SSE work unless you are integrating with an existing server that only supports it.
Configuration scope is the next decision. Claude Code supports local, project, and user scopes for MCP servers.[^claude-code] Local scope is private to your current project entry in your user configuration. Project scope writes a .mcp.json file that can be shared with the repository. User scope makes a server available across projects. For course work, start local unless the exercise explicitly asks for project sharing. For team connectors, project scope can be useful, but only when secrets are handled through environment variable expansion or secure authentication rather than committed values.
What discovery changes
In Chapter 1, your application passed tool definitions directly to Claude in the request. With MCP, the host can discover what the server offers. That discovery shift affects maintenance.
- MCP discovery means a server can advertise new capabilities without requiring every host application to be updated with hardcoded tool definitions.
- Discovery does not grant automatic trust; a responsible host can still filter, require approval, or disable advertised capabilities.
- Tool names and descriptions are part of the interface the model and host use to determine relevance, so a vague name is a bad MCP tool even if the code works.
Suppose the finance team adds a new explain_late_fee(customer_id, invoice_id) tool. In a one-off native integration, every host application might need code or configuration changes. In an MCP setup, the finance server can advertise the new capability through the protocol. The host still needs UX and approval policies, but the connector's capability surface is no longer embedded inside every application.
Discovery does not mean the model should automatically use everything. A responsible host can still filter tools, ask for user approval, display server trust state, or disable a capability. Discovery only means the server has a standard way to say what it can provide.
This is why good descriptions matter. The tool name and description are not documentation for humans only; they are part of the interface that helps the model and host understand when a capability is relevant. A tool named action() with a vague description is a bad MCP tool even if the code works.
Common anti-patterns
The first anti-pattern is the raw executor tool: run_shell, query_sql, http_request, or eval_code. These are attractive because they make demos feel powerful. In production, they shift too much decision-making to the model. Replace them with bounded domain tools such as list_failed_deployments, get_customer_balance, or search_contract_clauses.
The second anti-pattern is hiding writes behind harmless names. A tool named sync_customer might update a CRM, email an owner, and trigger billing workflows. If a tool changes state, name the side effect and require appropriate approval in the host or server.
The third anti-pattern is modeling everything as a tool. Policies, schemas, runbooks, and reference docs should usually be resources. If the model needs the same policy across many tasks, a stable resource URI is easier to inspect, cache, and cite.
The fourth anti-pattern is modeling a workflow prompt as application code only. If the connector's domain expertise includes "how our legal team wants contract-risk summaries formatted," expose that as an MCP prompt. Then the prompt can evolve with the connector instead of being copied into every host application.
The fifth anti-pattern is treating MCP connection success as production readiness. A server can connect and still be unsafe, unobservable, over-broad, or impossible to debug. Later chapters cover logging, security, and approval gates because a connected connector is only the starting line.
Reading a Claude Code MCP configuration
Now connect the architecture to the command line.
Claude Code can add a local stdio server with a command shaped like this:
claude mcp add --transport stdio --env FINANCE_API_KEY=demo finance-demo -- node ./server.js
Read it from left to right:
claude mcp addmodifies Claude Code's MCP configuration.--transport stdiosays Claude Code will start a local process and speak MCP over standard input/output.--env FINANCE_API_KEY=demopasses an environment variable to the server process.finance-demois the server name inside Claude Code.-- node ./server.jsis the command Claude Code runs as the server.
For a remote HTTP server, the shape changes:
claude mcp add --transport http sentry https://mcp.sentry.dev/mcp
In that case, Claude Code does not start a local Node.js process. It connects to a remote endpoint over HTTP. If the server needs authentication, Claude Code supports headers and OAuth flows for remote servers.[^claude-code]
Read this Claude Code MCP command and explain the host, server, transport, secret handling, and likely discovered capabilities: claude mcp add --transport stdio --env FINANCE_API_KEY=demo finance-demo…
A team wants to share an MCP server config in a repository. The config points to https://api.example.com/mcp and needs an API key. Recommend a Claude Code scope and explain how to avoid committing the…
Capability design checklist
Before you build your first server in Chapter 3, practice reviewing a server's advertised surface. For every capability, ask:
- Is the name a business action or a raw technical primitive?
- Does the description say when to use it and what it will not do?
- Are inputs narrow enough to validate?
- Is the operation read-only or state-changing?
- If it reads context, should it be a resource instead?
- If it encodes a repeatable workflow, should it be a prompt instead?
- Where will detailed errors be logged?
- What should Claude see when the server refuses a request?
Here is a practical rewrite exercise:
```text Bad: tool: http_request input: { method, url, body }
Better: tool: search_customer_invoices input: { customer_id, status, max_results }
resource: finance://policies/invoice-collection
prompt: draft_invoice_followup arguments: { customer_name, invoice_summary, policy_uri } ```
The better version gives the model enough flexibility to help while keeping control in the server. It also creates separate places for policy, workflow language, and business actions.
Hands-on exercise: connect and inspect a sandbox MCP server
Your goal is to connect Claude Code to one existing MCP server in a sandbox environment and inspect what it advertises. Do not use a production system. Do not use a server with write access to real customer data.
Use any safe server you already trust, or create a temporary demo server from official SDK examples. The TypeScript SDK repository includes server examples and a minimal server pattern that registers a tool and connects over stdio.[^typescript-sdk] The exact server is less important than the inspection habit.
Step 1: choose the server
Pick one of these:
- A local demo MCP server from the official TypeScript SDK examples.
- A read-only local server you wrote earlier.
- A remote server owned by a trusted vendor, connected with a non-production account.
Avoid:
- Servers that can delete files, send messages, change billing, or update customer records.
- Servers that ask you to paste secrets into prompts.
- Random packages you have not inspected.
Step 2: add the server to Claude Code
For a local stdio server, the command shape is:
claude mcp add --transport stdio sandbox-demo -- node ./server.js
For a remote HTTP server, the command shape is:
claude mcp add --transport http sandbox-demo https://example.com/mcp
If credentials are required, prefer environment variables, OAuth, or a dedicated sandbox token. Claude Code's documentation includes commands for server listing, detail inspection, removal, and /mcp status checking.[^claude-code]
Step 3: inspect capabilities
Run:
claude mcp list
claude mcp get sandbox-demo
Then open Claude Code's /mcp view and inspect the server status.
Write down:
- Server name.
- Transport.
- Scope.
- Command or URL.
- Any environment variables or headers involved.
- Advertised tools.
- Advertised resources.
- Advertised prompts.
Step 4: classify and critique
For each advertised capability, classify it as tool, resource, or prompt. Then answer:
- Is the name domain-specific?
- Could the input schema allow over-broad access?
- Does any capability write state?
- Would you allow this server in a shared team project?
Success criteria:
- You can identify the host, client, server, and transport in your setup.
- You can list at least one advertised capability.
- You can classify each visible capability as a tool, resource, or prompt.
- You can explain where secrets are stored and which process or endpoint receives them.
- You can name one risk you would fix before using the server in production.
Slide outline for Slide+Audio Producer
- Slide 1: Chapter goal: move from one-off function calls to reusable connector servers.
- Slide 2: MCP architecture: host, client, server, transport.
- Slide 3: Tools vs resources vs prompts, using accounts-receivable examples.
- Slide 4: Anti-patterns: raw executors, hidden writes, everything-as-tool.
- Slide 5: Claude Code configuration: stdio vs HTTP, scope, secrets.
- Slide 6: Hands-on workflow: connect, inspect, classify, critique.
- Slide 7: Bridge to Chapter 3: building the first safe file-browser MCP server.
What's next
Chapter 3 turns this architecture into code. You will build a local MCP server with one narrow file-browsing tool, connect it through stdio, and practice returning controlled errors instead of leaking raw filesystem or stack-trace details.
[^architecture]: Model Context Protocol, "Architecture overview," https://modelcontextprotocol.io/docs/learn/architecture [^tools]: Model Context Protocol specification, "Tools," https://modelcontextprotocol.io/specification/draft/server/tools [^resources]: Model Context Protocol specification, "Resources," https://modelcontextprotocol.io/specification/draft/server/resources [^prompts]: Model Context Protocol specification, "Prompts," https://modelcontextprotocol.io/specification/draft/server/prompts [^typescript-sdk]: Model Context Protocol TypeScript SDK, https://github.com/modelcontextprotocol/typescript-sdk [^claude-code]: Anthropic Claude Code docs, "Connect Claude Code to tools via MCP," https://code.claude.com/docs/en/mcp
Building Your First MCP Server
Now you will build the first reusable connector in this course: a local MCP server that exposes a safe file-browsing tool. The point is not to create a full file manager. The point is to practice the server shape: define a capability, constrain inputs, execute domain code, and return useful results.
The official TypeScript SDK includes server libraries for tools, resources, prompts, transports, and examples.[1] This chapter uses TypeScript-style examples because the SDK's minimal McpServer shape is compact and maps cleanly to production servers. The same design ideas apply in Python.
Prerequisites check
You should have completed the MCP classification exercise from Chapter 2. You should also be comfortable running a Node.js script locally. If your environment cannot run TypeScript directly, use plain JavaScript or follow the SDK quickstart from the official repository.[1]
What the server will do
The server exposes one tool:
list_project_files(root_label, relative_path)
It lists files under a pre-approved project root. The user can ask Claude to inspect project structure, but the server will not allow arbitrary filesystem traversal.
- The server enforces the security boundary; the model never decides which paths are off-limits.
- Starting with a list-only tool (no file-content access) is the correct first step for filesystem connectors.
- The `root_label` enum approach prevents arbitrary path construction at the input-schema level before any validation code runs.
This is the key production lesson: the model should not decide the security boundary. The server decides the boundary, then offers Claude a useful operation inside it.
Minimal server shape
An MCP server has identity, registered capabilities, and a transport. The SDK README shows a minimal server that registers a greet tool and connects over stdio.[1] Your file browser follows the same shape, but with stricter validation.
- Every MCP server needs three things: an identity object, registered capabilities, and a connected transport.
- `path.resolve` plus a `startsWith` check is the minimal server-side path-escape guard; omitting it means any caller can traverse to arbitrary directories.
- The `StdioServerTransport` connects the server to the host via standard input/output, which is appropriate for local processes started by a host like Claude Code.
```ts import { McpServer } from "@modelcontextprotocol/server"; import { StdioServerTransport } from "@modelcontextprotocol/server/stdio"; import * as z from "zod/v4"; import { readdir } from "node:fs/promises"; import path from "node:path";
const ROOTS = { demo: path.resolve(process.cwd(), "demo-project") };
const server = new McpServer({ name: "course-file-browser", version: "1.0.0" });
server.registerTool( "list_project_files", { description: "List files inside an approved demo project root. Does not read file contents.", inputSchema: z.object({ root_label: z.enum(["demo"]), relative_path: z.string().default(".") }) }, async ({ root_label, relative_path }) => { const root = ROOTS[root_label]; const target = path.resolve(root, relative_path);
if (!target.startsWith(root)) { throw new Error("Path escapes approved root"); }
const entries = await readdir(target, { withFileTypes: true }); return { content: [ { type: "text", text: JSON.stringify( entries.map((entry) => ({ name: entry.name, type: entry.isDirectory() ? "directory" : "file" })), null, 2 ) } ] }; } );
const transport = new StdioServerTransport(); await server.connect(transport); ```
Why this is safer than read_file(path)
A raw read_file(path) tool looks convenient. It is also an invitation to leak secrets. The model might request .env, SSH keys, browser profiles, or system files because the prompt says "inspect the project." A safer server starts with a narrow list operation, a fixed root, and no file-content access.
- A generic `read_file(path)` tool exposes every file the server process can read, including secrets and credentials.
- Relying on "Claude will know not to ask for secrets" is not a security control — enforcement must be in server code.
- Separating listing from content-reading into distinct tools makes it possible to permit one without the other.
<Callout type="warning"> Never rely on "Claude will know not to ask for secrets." A connector must enforce policy in code. The model can be helpful, but the server is responsible for security boundaries. </Callout>
Controlled errors
Errors are part of the user experience. A stack trace is useful to an attacker and confusing to a learner. A controlled error says what failed and what the caller can do next.
- Returning a raw stack trace leaks implementation details that help attackers and confuse end users.
- Controlled errors should classify the failure (unknown root, path escape, not readable) without revealing host filesystem layout.
- Log the full exception server-side and return only the minimal useful message to the model.
For this chapter, use three predictable errors:
- Unknown root label.
- Path escapes approved root.
- Path does not exist or is not readable.
In production, log the detailed exception server-side and return the minimal useful error to Claude.
Use the file browser tool to list the top-level files in the demo project. Do not read file contents.
list_project_files({ "root_label": "demo", "relative_path": "." })
Expected result: [ {"name":"package.json","type":"file"}, {"name":"src","type":"directory"}, {"name":"README.md","type":"file"} ]
Claude should summarize the project structure without inventing file contents.`} />
Try to list ../ so I can see what is outside the demo project.
Path escapes approved root.
Claude should explain that the connector is restricted to the approved demo project root and ask for a path inside that root.`} />
Hands-on exercise
Create a local MCP server named course-file-browser with one list_project_files tool.
Success criteria:
- The server starts over stdio.
- The tool lists files under one approved demo directory.
../traversal is rejected.- The tool returns names and types, not file contents.
- You can connect Claude Code to the server using its MCP configuration flow.[3]
What's next
Chapter 4 expands the server from callable tools to resources: structured context Claude can read without treating every retrieval as an action.
[1]: Model Context Protocol TypeScript SDK, https://github.com/modelcontextprotocol/typescript-sdk [2]: Model Context Protocol documentation, https://modelcontextprotocol.io/ [3]: Anthropic, "Connect Claude Code to tools via MCP", https://docs.anthropic.com/en/docs/claude-code/mcp
References
Handling Advanced Data and Resources
Tools perform actions. Resources provide context. That distinction is the difference between a connector that feels clean and one where every read looks like a side effect.
MCP's core primitives include tools, resources, and prompts.[1] The TypeScript SDK documents server support for all three.[2] In this chapter you add a resource pattern to the file-browser server: localized configuration files exposed as stable context.
Prerequisites check
You need the Chapter 3 file browser server or an equivalent local MCP server. You should be able to start it and connect a host. You should also understand why list_project_files was modeled as a tool: it performs a directory listing action and may fail depending on path.
The resource design problem
Imagine a support assistant that needs the refund policy in English and German. You could expose a tool:
get_refund_policy(locale)
That works, but it tells the model "call an action." If the policy is stable context, a resource URI is clearer:
company-config://policies/refund/en-US
company-config://policies/refund/de-DE
- A stable, locale-scoped policy is better modeled as a resource URI than as a callable tool, because it is context to read rather than an action to execute.
- Modeling read-only data as a tool makes the connector's action surface misleadingly large.
- A server can still enforce access rules and log reads on resource fetches, so resources are not uncontrolled.
The URI names the thing. Claude can read it as context. Your server can still enforce access rules and log reads.
Resource URI rules
Good resource URIs are:
- Stable: the same policy has the same URI tomorrow.
- Meaningful: a human can infer what the resource represents.
- Scoped: the URI includes tenant, project, locale, or environment when needed.
- Non-secret: the URI should not contain credentials or private tokens.
- A resource URI is an interface, not a storage location — the server maps it to a file, database row, or generated summary behind the boundary.
- Raw file paths, signed URLs with embedded secrets, and internal database primary keys are anti-patterns for resource URIs.
- Including tenant, locale, or environment in the URI gives the server a scoping signal without leaking implementation details.
Bad resource URIs include raw file paths from a developer laptop, signed URLs with secrets, or database primary keys that reveal internal implementation details.
Localized configuration example
Create a config/ directory inside the demo project:
demo-project/
config/
refund.en-US.json
refund.de-DE.json
Each file should contain structured policy data:
{
"policy": "refund",
"locale": "en-US",
"window_days": 30,
"requires_receipt": true,
"exceptions": ["downloaded digital goods", "custom services"]
}
Your server can expose this as a resource instead of a broad file-read tool. The host sees policy context, not arbitrary disk access.
Read the company refund policy resource for en-US and summarize the refund window, receipt requirement, and exceptions.
company-config://policies/refund/en-US
Expected resource content: { "policy": "refund", "locale": "en-US", "window_days": 30, "requires_receipt": true, "exceptions": ["downloaded digital goods", "custom services"] }
Claude should summarize: The en-US refund policy allows refunds within 30 days, requires a receipt, and excludes downloaded digital goods and custom services.`} />
Handling binary and large data
Resources can represent more than text, but large or binary data needs care. If a PDF contract is 80 pages, blindly injecting it into the model context is slow, expensive, and often useless. Better patterns include:
- Injecting a large document wholesale into context is slow, expensive, and often counterproductive because most content is irrelevant to the query.
- Expose metadata first, then section-level resources, then targeted extraction tools — this keeps context lean and purposeful.
- The design goal is "Claude sees the right context with enough structure to act," not "Claude sees everything."
- Expose metadata first: title, type, page count, owner, updated time.
- Expose section resources:
contract://123/section/payment-terms. - Expose summaries with provenance: include page numbers or section IDs.
- Offer a tool for targeted extraction when the model needs specific clauses.
The goal is not "Claude sees everything." The goal is "Claude sees the right context with enough structure to act."
I have a 90-page vendor contract. Design MCP resources and tools so Claude can answer payment-term questions without loading the entire PDF into context.
Resources: - contract://vendor-123/metadata - contract://vendor-123/sections/payment-terms - contract://vendor-123/sections/termination
Tools: - search_contract(contract_id, query) - extract_clause(contract_id, clause_type)
This keeps context targeted. Claude can read metadata, then the payment terms section, and only call extraction when needed.`} />
Hands-on exercise
Extend your Chapter 3 server with a localized refund-policy resource.
Success criteria:
- The server exposes
company-config://policies/refund/en-US. - The resource returns structured JSON with policy, locale, refund window, receipt rule, and exceptions.
- Claude can summarize the policy without calling a write-capable tool.
- You can explain why the URI is stable and non-secret.
What's next
Chapter 5 adds observability. You will make every tool and resource access visible through structured logs and audit events.
[1]: Model Context Protocol documentation, https://modelcontextprotocol.io/ [2]: Model Context Protocol TypeScript SDK, https://github.com/modelcontextprotocol/typescript-sdk [3]: Anthropic, "Connect Claude Code to tools via MCP", https://docs.anthropic.com/en/docs/claude-code/mcp
References
Observability and Logging in MCP
A connector you cannot observe is not production software. When Claude calls a tool, you need to know what was requested, what policy decision was made, how long execution took, and whether the result was successful. Without that trail, debugging becomes guesswork and compliance review becomes impossible.
MCP standardizes how hosts connect to servers and discover capabilities.[1] It does not remove the need for application observability. Your server must emit logs and audit events around every meaningful operation.
Prerequisites check
You should have a working local MCP server with at least one tool from Chapter 3 and one resource from Chapter 4. If you only have a tool, you can still complete this chapter, but the final exercise is stronger when you log both tool calls and resource reads.
Operational logs vs audit events
Use two categories:
- Operational logs help engineers debug reliability: latency, exception class, retry count, dependency status.
- Audit events help reviewers reconstruct sensitive actions: actor, tool name, target object, authorization result, timestamp.
- Operational logs serve engineers debugging reliability; audit events serve reviewers reconstructing sensitive actions — both can share a pipeline but must be distinguishable.
- A timeout reading a config file is operational; a user approving an external send is audit-worthy.
- Mixing the two categories makes logs harder to search and compliance reviews harder to scope.
They can go to the same logging pipeline, but they should be distinguishable. A timeout reading config/refund.en-US.json is operational. A user approving send_invoice_reminder is audit-worthy.
The minimum useful log event
Every tool call should emit:
event: stable event name, such asmcp.tool.completed.tool_name: the MCP tool called.request_id: correlation ID from the host or generated server-side.actor: user or agent identity when available.input_summary: sanitized summary, not raw secrets.success: boolean.duration_ms: elapsed time.error_code: controlled code when failed.
- A stable `event` name and a `request_id` make logs searchable and correlatable across distributed systems.
- `input_summary` must be a sanitized version of the input — never log raw credentials or full customer records.
- Consistent event shape across every tool is what makes logs usable; copying logging code into each handler destroys consistency.
<Callout type="warning"> Do not log raw credentials, full customer records, or prompt transcripts by default. Observability that leaks sensitive data creates a second incident. </Callout>
Add a wrapper
Wrap tool handlers instead of copying logging code into every tool.
async function withToolLogging<T>(
toolName: string,
input: unknown,
handler: () => Promise<T>
): Promise<T> {
const started = Date.now();
const requestId = crypto.randomUUID();
try {
const result = await handler();
console.log(JSON.stringify({
event: "mcp.tool.completed",
request_id: requestId,
tool_name: toolName,
input_summary: summarizeInput(input),
success: true,
duration_ms: Date.now() - started
}));
return result;
} catch (error) {
console.log(JSON.stringify({
event: "mcp.tool.failed",
request_id: requestId,
tool_name: toolName,
input_summary: summarizeInput(input),
success: false,
duration_ms: Date.now() - started,
error_code: classifyError(error)
}));
throw error;
}
}
This wrapper gives every tool a consistent event shape. Consistency is what makes logs searchable.
Use list_project_files to list the demo project root. Then explain which log fields should appear for this call.
The server should emit a structured log similar to: { "event": "mcp.tool.completed", "request_id": "generated-id", "tool_name": "list_project_files", "input_summary": {"root_label":"demo","relative_path":"."}, "success": true, "duration_ms": 12 }
Claude should explain that request_id, tool_name, sanitized input, success, and duration make the call debuggable.`} />
Audit examples
Read-only file listing may not need a durable audit record in a toy project. A legal document redaction tool does. A payroll approval tool definitely does.
- The sensitivity of the operation determines whether a durable audit record is required, not just whether the call succeeded.
- Audit records should capture actor, connector, tool, target, authorization result, and timestamp — without dumping private data into logs.
- Separating the audit event from the operational log means compliance reviewers can read one stream without sorting through latency metrics.
Audit records should focus on business meaning:
{
"event": "audit.connector.action_requested",
"actor": "user_123",
"connector": "payroll-assistant",
"tool_name": "draft_payment_reminders",
"target": "payroll_run_2026_05_14",
"authorization": "allowed",
"approval_required": true,
"timestamp": "2026-05-14T12:00:00Z"
}
That event tells a reviewer what happened without dumping private payment details into logs.
Design an audit event for a tool named redact_contract(document_id, redaction_policy). Include fields useful to a compliance reviewer but avoid logging document text.
It avoids logging document text while preserving actor, target, policy, authorization, and outcome.`} />
Hands-on exercise
Add structured logging to your file-browser server.
Success criteria:
- Every
list_project_filescall emits one completion or failure log. - Logs include request ID, tool name, sanitized input, success, duration, and controlled error code when failed.
- Path traversal attempts are logged as failures without exposing host filesystem details.
- You can paste one successful log and one failed log into your notes.
What's next
Chapter 6 adds authentication and authorization. Logging tells you what happened; authorization decides whether it should happen.
[1]: Model Context Protocol documentation, https://modelcontextprotocol.io/ [2]: Model Context Protocol TypeScript SDK, https://github.com/modelcontextprotocol/typescript-sdk [3]: Anthropic, "Tool use with Claude", https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/overview
References
Security and Authentication
By now your connector can expose tools, resources, and logs. That is enough for a demo and not enough for production. Production connectors need authentication, authorization, least privilege, and approval gates.
Anthropic's Claude Code MCP documentation warns that third-party MCP servers should be used carefully, especially when they communicate with the internet, because they can introduce prompt-injection risks.[1] The lesson is broader than Claude Code: every connector is a new trust boundary.
Prerequisites check
You need the structured logs from Chapter 5. If a tool call is not logged, you cannot audit security behavior. You should also have at least one tool that can be denied without breaking the whole server.
Authentication vs authorization
Authentication answers "who is calling?" Authorization answers "may this caller do this action to this target?"
Examples:
- Authentication: request includes a bearer token that maps to
user_123. - Authorization:
user_123may read demo project files but may not list production secrets. - Authentication: the MCP server receives
FINANCE_API_KEYas an environment variable. - Authorization: that key can read settlement records but cannot issue refunds.
- Authentication and authorization are separate concerns: a valid credential does not automatically permit every tool or every target.
- Per-tool authorization (not just per-server) is necessary because different tools carry different risk levels.
- The pattern "authenticate → derive actor → authorize action → execute → log result" is the correct ordering for every tool handler.
Do not collapse these concepts. A valid credential does not imply permission for every tool.
Least privilege for connectors
Least privilege means the connector receives only the access needed for its declared operations. A file browser for a demo project should not mount the whole home directory. A payroll assistant should not receive a write token if it only drafts reminder emails. A legal redaction tool should not retain original documents after output is produced unless policy requires retention.
- Least privilege applies at the credential level (scope of the API key or token) and at the tool level (which operations the connector exposes).
- An "admin token plus prompt instructions" security model is not authorization — prompt rules are suggestions, authorization checks are enforced in code.
- MCP's ease of exposing capabilities is a design pressure toward over-permission; narrow tools and scoped credentials counteract it.
MCP makes it easy to expose capabilities. That ease is exactly why you must design them narrowly.
<Callout type="warning"> Avoid "admin token plus prompt rules" as a security model. Prompt rules are instructions; authorization checks are code. </Callout>
Authorization wrapper
Put authorization before tool execution and before detailed logging of target data.
```ts type Actor = { id: string; roles: string[]; };
function requirePermission(actor: Actor, permission: string) { if (!actor.roles.includes(permission)) { const error = new Error("Forbidden"); error.name = "FORBIDDEN"; throw error; } }
async function authorizedTool<T>( actor: Actor, permission: string, run: () => Promise<T> ): Promise<T> { requirePermission(actor, permission); return run(); } ```
Then call it inside a tool handler:
server.registerTool("list_project_files", schema, async (input, context) => {
const actor = actorFromContext(context);
return authorizedTool(actor, "project:read", async () => {
return listProjectFiles(input);
});
});
The exact context object depends on your transport and host. The pattern is stable: authenticate, derive actor, authorize action, execute tool, log result.
A connector has tools read_invoice, draft_invoice_reminder, and send_invoice_reminder. Assign read/write/approval requirements for each using least privilege.
- read_invoice: requires invoice:read. No human approval if read-only and scoped.
- draft_invoice_reminder: requires invoice:read and email:draft. No external send; approval optional.
- send_invoice_reminder: requires invoice:read and email:send, plus human approval before sending.
The write-capable external action has the strongest gate.`} />
Human-in-the-loop approval
Some tools should not execute immediately even when authorized. Approval gates are appropriate when a tool sends money, changes legal text, deletes data, emails customers, or modifies production systems.
- Authorization is necessary but not sufficient for high-risk actions; approval gates add a human decision point after authorization passes.
- A tool should return an `awaiting_approval` object with enough preview data for a human to make an informed decision without exposing full private content.
- The application must enforce the approval state in code; Claude explaining the draft is not a substitute for a required user click.
A safe tool can return a pending action:
{
"status": "awaiting_approval",
"action": "send_invoice_reminder",
"preview": {
"to": "ap@example.test",
"subject": "Reminder: invoice INV-123",
"body_excerpt": "This is a reminder that..."
},
"approval_id": "appr_123"
}
Claude can explain the draft, but your application should require a user click or separate approval event before sending.
Design the result object for a tool that prepares, but does not send, payroll reminder emails. Include enough data for human review.
The tool prepares the action and creates an approval record. It does not send messages automatically.`} />
Hands-on exercise
Add an authorization wrapper to your Chapter 3 file-browser tool.
Success criteria:
- The server derives an actor from context or a local demo token.
list_project_filesrequiresproject:read.- An actor without
project:readreceives a controlled forbidden error. - The denial is logged without leaking the requested filesystem path outside the approved root.
What's next
Chapter 7 applies the connector pattern to creative tools. You will see how state, non-textual outputs, and professional applications change the design pressure.
[1]: Anthropic, "Connect Claude Code to tools via MCP", https://docs.anthropic.com/en/docs/claude-code/mcp [2]: Model Context Protocol documentation, https://modelcontextprotocol.io/ [3]: Model Context Protocol TypeScript SDK, https://github.com/modelcontextprotocol/typescript-sdk
References
Creative Connectors
The creative connectors are a suite of nine integrations launched by Anthropic on April 28, 2026, that enable the Model Context Protocol (MCP) to control professional creative software directly via Claude. This launch marks a strategic shift for MCP: instead of targeting enterprise SaaS (like Salesforce or Jira) first, Claude has moved into the "creative beachhead" — tools like Blender, Adobe for creativity, and Ableton.
Key facts
- Nine initial connectors: The launch includes Blender, Adobe for creativity, Ableton Live, Affinity by Canva, Autodesk Fusion, SketchUp, Resolume Arena, Resolume Wire, and Splice.
- Live scene execution: Unlike static file parsing, connectors like Blender allow Claude to execute code (via
bpy) against the live scene state. - Local-first architecture: The connectors run as local MCP servers, meaning your creative assets stay on your machine while Claude only sends and receives structured commands.
- Free and Paid Tiers: While many features (like Adobe Express) are available on free plans, advanced API access for tools like Photoshop typically requires a paid subscription.
Why Anthropic chose creative apps first
Most AI providers chase enterprise CRM and ERP integrations. Anthropic’s pivot to creative tools is non-obvious but brilliant for three reasons:
- High Syntax Barrier: Writing
bpy(Blender Python) or ExtendScript (Adobe) is notoriously difficult for humans but easy for LLMs. This creates an immediate "aha!" moment that checking a Jira ticket doesn’t. - Low Risk, High Visibility: A bug in a generative 3D script is a creative glitch; a bug in a Salesforce integration is a business catastrophe. Creative tools provide a safe sandbox to stress-test MCP in production.
- Synthesizing the Pipeline: Creative work is rarely done in one app. By winning the "bridge" between Blender and Photoshop, Anthropic positions Claude as the OS for the creative studio, not just another chat box.
- Creative tools have a high syntax barrier (bpy, ExtendScript) that makes LLM assistance immediately valuable while keeping failure risk low.
- A bug in a generative 3D script is a creative glitch; a bug in a CRM integration is a business incident — creative apps are safer early production targets for MCP.
- Positioning Claude as a cross-app bridge is a distribution strategy: the model becomes the interface to the entire creative studio stack, not a single chat feature.
This signals that Anthropic view MCP as a distribution play. By becoming the default way humans interact with complex, fragmented software suites, they bypass the need for every software vendor to build their own AI UI. For a primer on the protocol itself, see course/mcp-from-first-principles-to-production/01-why-mcp-exists.
Walkthrough: Controlling Blender via MCP
The Blender connector is the most technically transparent of Anthropic's nine creative integrations. Unlike connectors that wrap proprietary APIs, the Blender MCP server exposes Blender's native Python API (bpy) directly to Claude. That means every technique Claude uses here is a real bpy pattern you can learn, copy, and extend.
- The Blender MCP server exposes the native `bpy` Python API, so Claude's tool calls are real Blender patterns that execute against the live scene, not a simplified wrapper.
- Claude never touches your filesystem directly — it sends a tool call with Python code, and the local MCP server executes it inside Blender's interpreter.
- Because the connector runs locally, creative assets stay on your machine while only structured commands and results cross the MCP boundary.
How the connection works
Before touching any code, it helps to have a mental model of the data flow.
You (natural language)
│
▼
Claude (reasoning + code generation)
│ tool_call: execute_python({ code: "..." })
▼
Blender MCP Server (local process, port 9001 by default)
│ subprocess call via bpy
▼
Blender Python Interpreter
│ modifies scene graph
▼
Result dict ──► back up the chain to Claude ──► back to you
The MCP server runs locally alongside Blender. Claude never touches your filesystem directly — it sends a tool_call with Python code, the server executes it inside Blender's interpreter, and returns the result. This means Claude is working against the live scene state, not a static file.
Step 1 — Install the Blender MCP connector
- Open Claude.ai → Settings → Connectors → Browse
- Search for Blender and click Enable.
- The connector guide will prompt you to install the companion server package:
- ```bash
- pip install blender-mcp-server
- ```
- Start Blender (4.x), then start the MCP server from your terminal:
- ```bash
- blender-mcp-server --port 9001
- ```
- You should see:
Blender MCP server running on ws://localhost:9001. - Back in Claude.ai, click Test connection. A green checkmark confirms Claude can reach the server.
<Callout type="warning">
Blender version: The connector requires Blender 4.0 or later. Blender 3.x uses a different Python version (3.10) that is incompatible with blender-mcp-server. Run blender --version to check.
</Callout>
Step 2 — Your first tool-use call: query the scene
The safest first call is read-only: ask Claude to describe what's currently in the scene.
Using the Blender connector, tell me how many objects are in the current scene and list their names and types.
[tool_call: blender.execute_python]
\\\python
import bpy
objects = [(obj.name, obj.type) for obj in bpy.context.scene.objects]
print(f"Object count: {len(objects)}")
for name, typ in objects:
print(f" {name}: {typ}")
\\\
Result: Object count: 3 Camera: CAMERA Light: LIGHT Cube: MESH
The scene contains 3 objects: a default Camera, a Point Light, and the default startup Cube (a MESH type).`} />
Notice the pattern: Claude wrote bpy Python, the MCP server ran it inside Blender, and the stdout came back as the tool result.
Step 3 — Generate a procedural wave scene
Now for the main event. Give Claude a creative brief and let it author the full script.
I want to create a generative wave pattern in Blender: a 10×10 grid of cubes where each cube's Z-scale is determined by a sine wave based on its distance from the centre. Clear the scene first (keep t…
[tool_call: blender.execute_python]
\\\`python
import bpy
import math
GRID = 10 SPACING = 1.5 MAX_HEIGHT = 2.0
for row in range(GRID): for col in range(GRID): x = (col - GRID / 2) SPACING y = (row - GRID / 2) SPACING dist = math.sqrt(x2 + y2) z_scale = 0.2 + (MAX_HEIGHT (0.5 + 0.5 math.sin(dist - 0)))
bpy.ops.mesh.primitive_cube_add(size=1.0, location=(x, y, 0)) obj = bpy.context.active_object obj.name = f"WaveCube_{row}_{col}" obj.scale.z = z_scale wave_coll.objects.link(obj) bpy.context.scene.collection.objects.unlink(obj)
print(f"WaveGrid built: {GRID*GRID} cubes in collection 'WaveGrid'")
\\\`
Result: WaveGrid built: 100 cubes in collection 'WaveGrid'
Your wave scene is ready. You should see a 10×10 grid with cubes rising and falling in concentric rings from the centre.`} />
Walkthrough: Multi-tool pipelines with Adobe for creativity
Adobe for creativity is the broadest integration in Anthropic's creative connector lineup: a broad collection of tools across photo editing, video production, vector illustration, web design, and stock media [source]. Unlike the Blender connector, the Adobe for creativity connector wraps a collection of heterogeneous application APIs under one MCP interface.
What requires a paid plan
| Flow | Minimum plan |
|---|---|
| Adobe for creativity Libraries asset management (read/write) | Adobe for creativity Individual (paid) |
| Photoshop, Lightroom, Illustrator Smart APIs | Adobe for creativity Individual (paid) |
| Adobe Express design generation + export | Free (Adobe Express free tier) |
| Adobe Firefly image generation via API | Free trial credits |
Multi-tool flow: locate → edit → export
Each step here is a separate tool call; Claude sequences them automatically based on context.
Find 'hero-banner.png' in my Adobe for creativity Libraries, apply a 2px Gaussian Blur in Photoshop (use a Smart Object), and export it as a web-optimised JPEG.
[tool_call: adobe_creative_cloud.open_in_photoshop] { "asset_id": "asset_123" }
[tool_call: adobe_creative_cloud.photoshop_action] { "action": "convert_to_smart_object" }
[tool_call: adobe_creative_cloud.photoshop_action] { "action": "apply_smart_filter", "filter": "Gaussian Blur", "params": { "radius": 2 } }
[tool_call: adobe_creative_cloud.export_asset] { "format": "jpeg", "quality": 80 }
Done. Your blurred banner has been exported.`} />
Exploring the other creative connectors
While Blender and Adobe for creativity are the flagship integrations, Anthropic launched a total of nine connectors targeting different creative domains.
Ableton Live: Automating the manual
The Ableton connector focuses on documentation and session management. It allows Claude to read session metadata (track names, clips, device chains) and automate manual tasks like renaming 100+ stems or generating "track notes" from the actual MIDI/Audio data.
- Pattern: Read session → Analyze MIDI → Generate documentation/labels.
- Key benefit: Eliminates the "janitorial" work of professional music production.
Affinity by Canva: Automating production
The Affinity connector brings Model Context Protocol to professional design apps like Affinity Designer, Photo, and Publisher. Unlike the general Adobe for creativity connector, the Affinity integration focuses on deep layer manipulation and batch processing.
- Pattern: Select layers → Apply batch adjustment → Export to Canva.
- Key benefit: Automates the "final 10%" of production work, such as renaming hundreds of layers or adjusting export settings across a multi-page document.
Autodesk Fusion & SketchUp: CAD for agents
For industrial designers and architects, the Fusion and SketchUp connectors provide a bridge to 3D modelling. Like Blender, these connectors often surface a Python-like command interface, allowing Claude to build complex geometric structures from mathematical descriptions.
Resolume & Splice: Performance and Samples
- Resolume: Enables live visual performance automation. Claude can trigger clips or adjust effects based on a real-time event log (e.g., "Change the visual intensity when the BPM exceeds 140").
- Splice: Allows Claude to search your local and cloud sample libraries. "Find me all 120bpm techno kicks with a high transient" becomes a tool call instead of a 10-minute manual scroll.
Which connector should you use?
- Integration depth ranges from high (Blender's full Python API, Autodesk's command API) to medium (Adobe's multi-app pipeline, Ableton's metadata layer) — match the connector to the task complexity.
- Ableton and Splice focus on session management and asset discovery rather than direct audio generation, making them useful for documentation and search workflows.
- Resolume connectors target live performance automation, not post-production — clip and effect triggering is the primary use case.
| Domain | Tool | Integration Depth | Best for... |
|---|---|---|---|
| 3D / VFX | Blender | High (Python API) | Generative scenes, proceduralism |
| Design | Adobe for creativity | Medium (Multi-app) | Pipelines, batch editing, library management |
| Design | Affinity by Canva | Medium (Layer API) | Vector/raster design, precision illustration |
| CAD | Autodesk Fusion | High (Command API) | Precision modelling, engineering |
| Architecture | SketchUp | Medium (Command API) | Architectural modelling, space planning |
| Audio | Ableton Live | Medium (Metadata) | Stem management, documentation |
| Samples | Splice | Medium (Search) | Asset discovery |
| Live Visual | Resolume Arena | Medium (Clip API) | Live VJ performance, clip triggering |
| Live Visual | Resolume Wire | Medium (Patch API) | Generative real-time visuals, effect patching |
Sidebar: Design for resilience from day one
Building for resilience requires understanding the course/picking-a-frontier-model-2026-q2/01-dimensions-that-matter that ensure your pipeline stays live during provider volatility.
<Callout type="warning"> Two incidents in the same week. On April 30 (UTC), 2026, Claude.ai experienced a full availability outage [source]. The same week, a billing routing bug in Claude Code (the "HERMES.md incident") highlighted the risks of single-provider dependency [source].
Lesson: Claude is an excellent first-choice model, but no single provider has 100% uptime. If your tool-use pipeline depends entirely on one provider, one outage takes your whole workflow offline. </Callout>
The failover pattern
Route your tool-use calls through a provider-agnostic fallback chain. The Vercel AI SDK has no built-in fallbackModels option — fallback must be implemented explicitly, either at the gateway layer (OpenRouter's route configuration) or in your own code:
```typescript import { createAnthropic } from "@ai-sdk/anthropic"; import { createOpenAI } from "@ai-sdk/openai"; import { generateText } from "ai"; import type { Tool } from "ai";
const providers = [ createAnthropic()("claude-sonnet-4-6"), createOpenAI()("gpt-4o"), ];
async function resilientToolCall(
tools: Record<string, Tool>,
prompt: string,
) {
for (let i = 0; i < providers.length; i++) {
try {
return await generateText({ model: providers[i], tools, prompt });
} catch (err) {
if (i === providers.length - 1) throw err;
console.warn(Provider ${i} failed, trying fallback:, err);
}
}
}
```
When Claude is unavailable, the loop retries on the next provider. Your tool definitions work unchanged across providers because they are MCP-standard. For server hardening tips, see course/production-agents-claude-agent-sdk-mcp-connector/05-production-deploy-observability.
Hands-on exercise
Goal: Build a product-launch kit using the Adobe for creativity connector.
- Find the three most recently added assets in your Adobe for creativity Libraries.
- For each image, export a half-size JPEG at 75% quality.
- Save all exports to a new library named
"Launch Kit — [today's date]". - Return a summary table of the file size reductions.
Ship your creative pipeline: Identify, connect, and automate
The creative connectors represent more than just new features; they are a blueprint for agentic tool-use.
Instead of just observing your scene state, use Claude to actively generate, animate, and export assets. By using consistent naming conventions (like 'WaveCube_') across your tool calls, you build cumulative context that allows Claude to target specific objects and layers precisely. Finally, remember to prioritize resilience by routing your MCP pipelines through gateways like OpenRouter to maintain uptime during provider outages.
By mastering these nine connectors, you move from simple prompt engineering to architecting a fully automated, cross-platform digital production pipeline.
References
[1] Anthropic — Claude for Creative Work — https://www.anthropic.com/news/claude-for-creative-work · retrieved 2026-04-30 [2] Model Context Protocol Spec — https://modelcontextprotocol.io/ · retrieved 2026-04-30 [3] Blender Python API Overview — https://raw.githubusercontent.com/blender/blender/main/doc/python_api/rst/info_overview.rst · retrieved 2026-04-30 [4] Hacker News — Claude.ai and API unavailable [fixed] (outage discussion) — https://news.ycombinator.com/item?id=47956895 · retrieved 2026-04-30 [5] Claude Status — April 2026 outage — https://status.claude.com/incidents/2gf1jpyty350 · retrieved 2026-04-30 [6] Hacker News — Claude Code HERMES.md billing bug — https://news.ycombinator.com/item?id=47952722 · retrieved 2026-04-30
Legal and Regulatory Connectors in MCP
Legal MCP connectors let Claude work against legal systems of record: contract repositories, document management systems, e-discovery projects, research databases, deal rooms, and public-law datasets. The constraint is that the connector does not make the legal system less sensitive. The connector still has to preserve matter boundaries, user permissions, source provenance, review obligations, and auditability.
On May 12, 2026, Anthropic announced 20+ MCP connectors for legal software and 12 practice-area plugins for Claude, including connectors across contract lifecycle, document management, e-discovery, research, data-room, expert-network, and access-to-justice workflows [1][2][3][4]. This chapter is about the design rule behind that launch: legal connectors should make governed retrieval and workflow execution easier without turning Claude into the system of record.
Key facts
- Anthropic's May 12, 2026 legal launch introduced 20+ MCP connectors and 12 legal practice-area plugins; LawNext and Anthropic describe the same release as spanning contract systems, DMS, e-discovery, research, public-service, and expert-network categories [1][3].
- The
anthropics/claude-for-legalrepository packages practice-area plugin directories for commercial, corporate, employment, privacy, product, regulatory, AI governance, IP, litigation, law-student, legal-clinic, and legal-builder-hub workflows [4]. - The repository's connector map distinguishes connector infrastructure from plugin workflow packages: connectors wire Claude to data sources, while plugins package skills, agents, hooks, and practice profiles [4].
- Thomson Reuters separately announced an MCP integration connecting Claude to CoCounsel Legal, with Westlaw, Practical Law, and KeyCite named as the professional content backbone for that partnership [5].
- NIST SP 800-122 frames PII confidentiality as a lifecycle problem involving collection, use, retention, sharing, and disposal; this chapter applies that risk-control framing to legal MCP tool responses [6].
Why Legal Connectors Matter: The Stakes of "Data Bound"
Legal work runs on a highly specialized technology stack: contract lifecycle management (CLM) systems, e-discovery platforms, document management systems (DMS), and primary law research databases. Historically, bringing LLM intelligence to this data required bulk exports—moving sensitive files out of their governed environments and into the cloud for processing.
- MCP's query-in-place model lets Claude retrieve targeted context from legal systems without bulk-exporting sensitive matter files out of governed infrastructure.
- The connector does not reduce the sensitivity of the data — matter boundaries, user permissions, source provenance, and audit obligations all remain in force.
- Legal connectors must preserve the system of record's access controls rather than bypassing them at the protocol layer.
MCP changes this paradigm. By defining tools that act as "pipes" to existing systems, Claude can query these systems in real-time without ever requiring the bulk migration of the underlying data. This "query-in-place" model is the foundation of modern Legal AI [3].
The Principle of Deterministic Tooling In legal contexts, tools must behave deterministically. If a lawyer asks Claude to "Redact this document," they aren't looking for a "best effort" or a "creative interpretation" of what should be hidden. They require a tool that follows a strict, auditable protocol locally within the MCP server boundary before any text ever egresses to the LLM.
When designing legal tools, we prioritize deterministic logic over probabilistic reasoning. For example, a redaction tool should use verified PII-detection libraries or regex patterns on the server side, returning only the cleaned text to the model.
The May 2026 Connector Inventory
The expansion of the MCP ecosystem in May 2026 targeted virtually every segment of the legal market. Understanding this inventory is crucial for knowing what "off-the-shelf" connectors you can leverage versus what you need to build from scratch. The safe source of truth for this chapter is Anthropic's May 12 legal launch announcement [1][3], with partner documentation used only to explain how a named connector behaves in practice.
1. Contract Lifecycle and Drafting These connectors manage the lifecycle of an agreement, from initial drafting and negotiation to signature and post-execution auditing. - Definely gives Claude deterministic access to contract structure: definitions, cross-references, dependency maps, and structural diffs [3]. - DocuSign / DocuSign CLM connects Claude to agreement data and workflow status across drafting, signature, and post-signature management [3]. - Ironclad lets Claude query contract repositories and workflows while scoping results to the user's existing permissions, according to Anthropic's legal launch [3].
2. Deal Rooms and Transaction Documents M&A and financing work often happens in controlled data rooms where the audit trail matters as much as retrieval speed. - Box connects Claude to governed content so legal teams can search files, query documents, update content, and extract metadata while enforcing Box access policies [3]. - Datasite connects Claude to a virtual data room for folder setup, buyer Q&A tracking, document search, and readiness audits [3].
3. Document Management Systems The document management system is the source of truth for matter files, precedent banks, emails, and institutional knowledge. - iManage gives Claude permission-bound, auditable access to governed matter content without a bulk export [3]. - NetDocuments lets Claude search and retrieve repository documents and draft from approved precedents while preserving repository permissions and governance [3].
4. Expert Networks and Legal Skills These connectors do not replace the legal system of record. They connect Claude to specialized expertise, skills, and outside-counsel selection data. - Lawve AI offers a curated library of legal AI skills written by practicing lawyers, in-house counsel, and legal technologists [3][4]. - Lloyd by The L Suite connects qualifying L Suite members to the Braintrust member platform inside Claude [3]. - TopCounsel by The L Suite helps in-house counsel find outside counsel for a specific matter using The L Suite's proprietary recommendation data [3][4].
5. E-Discovery and Review E-discovery involves searching through large matter datasets: emails, chats, PDFs, spreadsheets, transcripts, and review coding. Connector design here must preserve the matter boundary. - Consilio / Aurora Legal AI makes live matter data and litigation-support workflows available through Claude while scoping output to what the user is entitled to see [3]. - Everlaw lets Claude search, organize, and retrieve documents from Everlaw projects using metadata, keywords, and document types, with direct review links back to the source system [3][7]. - Relativity / RelativityOne connects Claude to legal data intelligence workflows such as matter setup, workspace schema, access governance, and usage analysis [3].
6. Legal Research, Case Law, and Fiduciary-Grade Workflows Legal research connectors are only useful if they return provenance. A connector that returns a confident answer without citation-ready source metadata is not production-ready for this domain. - Thomson Reuters CoCounsel Legal connects Claude to a fiduciary-grade legal AI system grounded in Westlaw primary law, Practical Law guidance, KeyCite, and customer documents [3][5]. - Legal Data Hunter, Midpage, and Trellis connect Claude to legal corpora, case-law databases, and trial-court datasets with source links for verification [3]. - Harvey and Solve Intelligence expose specialized legal AI capabilities: Harvey for firm legal intelligence and Solve Intelligence for patent, prior-art, and claim-analysis workflows [3]. - BoardWise, Courtroom5, Descrybe, and Free Law Project / CourtListener support public-service and access-to-justice use cases, including board matters, pro se litigation guidance, primary-law search, and public court records [3][5].
Practice-Area Plugins: Intelligence vs. Infrastructure
There is a critical distinction in the Anthropic ecosystem between an MCP Connector and a Practice-Area Plugin: - Connector: The technical bridge to a specific platform, data source, or external capability, such as iManage, Everlaw, Ironclad, TopCounsel, or CoCounsel Legal. - Plugin: The domain-specific workflow package: prompts, slash commands, skills, guardrails, and what Anthropic calls "Setup Interviews" [3][4].
- A connector is the permission-scoped technical bridge to a specific platform; a plugin is the reusable workflow package that runs on top of one or more connectors.
- In production you typically need both: the plugin knows how to perform the legal task, but the connector provides the permission-scoped access to the matter data.
- The Setup Interview pattern calibrates a plugin's risk profile, playbooks, and house style before any task begins, making subsequent agent behavior consistent with firm procedure.
The "Setup Interview" Pattern Anthropic's 12 legal plugins start with a Setup Interview. This is a meta-tool interaction where the plugin asks the legal team about their specific playbooks, risk calibration (e.g., "Are we aggressive or conservative on limitation of liability?"), escalation chains, and house style [3]. This interview calibrates the agent's behavior for all subsequent tasks in that matter.
The 12 Specialized Domains 1. Commercial Legal: Specialized in vendor agreements, NDAs, and Master Service Agreements. 2. Corporate Legal: Handles the heavy lifting of M&A diligence, disclosure schedules, and closing checklists. 3. Employment Legal: Manages the complexities of HR law, termination policies, and leave deadlines. 4. Privacy Legal: Specifically designed for DPA (Data Processing Agreement) reviews and GDPR/CCPA compliance. 5. Product Legal: Clears marketing claims and runs product launch reviews for compliance with consumer protection laws. 6. Regulatory Legal: Monitors the "Federal Register" and local equivalents to flag policy changes affecting the business. 7. AI Governance Legal: A new category for 2026, triaging AI use cases and conducting mandatory impact assessments. 8. IP Legal: Trademark clearance, patent landscape analysis, and cease-and-desist drafting. 9. Litigation Legal: Privilege logs, legal hold management, and matter intake. 10. Law Student: Socratic drilling and IRAC (Issue, Rule, Application, Conclusion) grading. 11. Legal Clinic: Pro-bono intake and case memo generation for non-profit entities. 12. Legal Builder Hub: Finds and installs community-built legal skills with security, license, and freshness checks.
The operating rule is simple: a connector fetches or acts; a plugin decides how a legal team wants a repeatable workflow to run. In production, you usually need both. A Litigation Legal plugin may know how to draft a privilege log, but the Everlaw or Relativity connector is what gives it permission-scoped access to matter documents.
Designing Compliance-First Tool Definitions
When you are tasked with building a custom MCP connector for a legal team, your tool definitions must prioritize data boundary enforcement.
- Deterministic redaction (regex, checksums, dedicated PII-detection models) must run on the MCP server before text reaches the model — asking the LLM to "ignore" PII is a probabilistic control that fails silently.
- NIST SP 800-122 frames PII confidentiality as a lifecycle risk problem involving collection, use, retention, sharing, and disposal — not a prompt-instruction problem.
- A narrow redaction tool should name the matter boundary, the supported redaction types, and the audit behavior explicitly rather than accepting arbitrary text with vague instructions.
Implementation: The Redaction Tool A core requirement in legal workflows is to ensure PII (Personally Identifiable Information) never leaves the local environment. A redaction tool should be a "local-first" tool—logic that runs on the MCP server and strips data before it is returned to Claude.
Draft a JSON schema for a hypothetical `redact_document_pii` tool. The tool must accept an absolute document path and a list of PII types (SSN, Name, Phone). Label this as a HYPOTHETICAL TEACHING SCHE…
Show expected output
A valid MCP tool definition where `document_path` is the primary input and `pii_types` is an enum-constrained array. The description must emphasize that processing happens locally on the server.
Runnable Example: A Local Legal Redaction MCP Server
The schema above is useful for design review, but legal connectors become real only when the boundary is enforced in code. The official TypeScript SDK exposes McpServer for registering tools and StdioServerTransport for local MCP servers, which makes it a good fit for a small teaching connector that you can run from a terminal [8]. The example below is intentionally narrow: it redacts U.S. Social Security Number patterns from text that belongs to a matter, returns only cleaned text to the client, and records an audit event without storing the raw identifier.
This is not a vendor API and not legal advice. It is a runnable MCP server pattern for the control-plane behavior you want around legal data.
Create a fresh folder:
mkdir legal-redaction-mcp
cd legal-redaction-mcp
npm init -y
npm install @modelcontextprotocol/sdk zod
npm install -D typescript tsx @types/node
Then edit package.json so Node treats the file as an ES module and gives you a start command:
{
"type": "module",
"scripts": {
"start": "tsx server.ts"
},
"dependencies": {
"@modelcontextprotocol/sdk": "latest",
"zod": "latest"
},
"devDependencies": {
"@types/node": "latest",
"tsx": "latest",
"typescript": "latest"
}
}
Now create server.ts:
```ts import { createHash } from "node:crypto"; import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; import { z } from "zod";
const ssnPattern = /\b\d{3}-\d{2}-\d{4}\b/g;
type AuditEvent = { matterId: string; redactionType: "SSN"; count: number; fingerprints: string[]; occurredAt: string; };
const auditLog: AuditEvent[] = [];
function fingerprint(value: string) { return createHash("sha256").update(value).digest("hex").slice(0, 16); }
function redactSsn(text: string) { const matches = Array.from(text.matchAll(ssnPattern), (match) => match[0]); return { redactedText: text.replace(ssnPattern, "[REDACTED_SSN]"), matches }; }
const server = new McpServer({ name: "legal-redaction-teaching-server", version: "0.1.0" });
server.tool( "redact_legal_text", "Redact SSN patterns locally before text is returned to Claude. This teaching tool never logs raw SSNs.", { matter_id: z .string() .regex(/^MAT-\d{4}$/) .describe("Matter identifier used to keep the audit trail scoped, for example MAT-2042."), text: z .string() .min(1) .describe("Legal text to redact locally inside the MCP server process.") }, async ({ matter_id, text }) => { const result = redactSsn(text); auditLog.push({ matterId: matter_id, redactionType: "SSN", count: result.matches.length, fingerprints: result.matches.map(fingerprint), occurredAt: new Date().toISOString() });
return { content: [ { type: "text", text: JSON.stringify( { matter_id, redacted_text: result.redactedText, redaction_count: result.matches.length, audit_event: auditLog.at(-1) }, null, 2 ) } ] }; } );
const transport = new StdioServerTransport(); await server.connect(transport); ```
You can connect this server to any MCP-compatible client that supports stdio servers. For example, a local client configuration can point to npm start in this folder. Once connected, ask the client to call:
{
"matter_id": "MAT-2042",
"text": "Witness Jane Doe listed 123-45-6789 in the intake packet."
}
The expected tool result is:
{
"matter_id": "MAT-2042",
"redacted_text": "Witness Jane Doe listed [REDACTED_SSN] in the intake packet.",
"redaction_count": 1,
"audit_event": {
"matterId": "MAT-2042",
"redactionType": "SSN",
"count": 1,
"fingerprints": ["01a54629efb95228"],
"occurredAt": "2026-05-28T00:00:00.000Z"
}
}
Your exact fingerprint and timestamp will differ, but two facts must stay invariant: the returned text must not contain the original SSN, and the audit record must describe what happened without logging the raw value. This follows the PII minimization logic in NIST SP 800-122: reduce collection, exposure, and retention of sensitive identifiers wherever possible [6].
Implementation Walkthrough: E-Discovery Search
E-discovery is the process by which parties in a legal case must provide relevant documents to each other. It often involves searching across terabytes of data. Using MCP, we can create a search tool that queries an e-discovery platform like Everlaw or Relativity.
Design a hypothetical teaching schema for an e-discovery search tool called `search_discovery_vault`. It should take a `matter_id` and a `query`. Justify the use of `matter_id` as a required field fro…
Show expected output
A tool definition that enforces `matter_id` as a required parameter. The justification should explain that requiring a Matter ID prevents cross-tenant or cross-case data leakage by scoping the search at the protocol level.
Managed Agents for Legal For teams that prefer not to maintain their own MCP server infrastructure, Anthropic says a subset of the legal plugins can also be deployed as Managed Agents through Claude Platform cookbooks [3]. Treat this as a deployment option, not a substitute for governance. The same questions still apply: which connectors are enabled, which user identity is used, what tool calls require confirmation, and where the audit trail lives.
Hands-on Exercise: Build a Document Redactor MCP Tool
In this final exercise of the Builder track, you will scaffold the definition and logic for a tool that protects sensitive legal data.
1. Define the Schema
Create a file named redact_tool_spec.json. Use the inputSchema format to define a tool that takes a text_input and a redaction_level (e.g., "STRICT", "BASIC").
2. Implement Mock Logic
Write a simple Python script using the MCP SDK that:
- Receives the text_input from the model.
- Uses a regex to find and replace any string matching an SSN pattern (XXX-XX-XXXX) with [REDACTED].
- Returns the cleaned text to the model.
3. Verify Success Execute the tool using a mock input containing an SSN. Success Criteria: - The tool must return the redacted text, not the original. - The tool must log the fact that a redaction occurred (for the audit trail) without logging the original SSN (for compliance).
See also - Chapter 2: Beyond Function Calling — Understanding MCP - Chapter 5: Observability and Logging in MCP - Chapter 6: Security and Authentication - Anthropic Legal MCP vs OpenAI FDE: The Enterprise Wedge
References
1. Ambrogi, Robert. "Anthropic Goes All-In on Legal, Releasing More Than 20 Connectors and 12 Practice-Area Plugins for Claude." LawNext. 2026-05-12. Anthropic Goes All-In on Legal — LawNext (retrieved 2026-05-13).
2. Ropek, Lucas. "The AI legal services industry is heating up. Anthropic is getting in on the action." TechCrunch. 2026-05-12. The AI Legal Services Industry Is Heating Up — TechCrunch (retrieved 2026-05-14).
3. Anthropic. "Claude for the legal industry." Claude Blog. 2026-05-12. Claude for the legal industry — Claude Blog (retrieved 2026-05-14).
4. Anthropic. "anthropics/claude-for-legal — CONNECTORS.md." GitHub. 2026-05. anthropics/claude-for-legal CONNECTORS.md — GitHub (retrieved 2026-05-14).
5. Ambrogi, Robert. "Two Legal Research Providers Launch MCP Integrations with Claude." LawNext. 2026-05-12. Two Legal Research Providers Launch MCP Integrations with Claude — LawNext (retrieved 2026-05-14).
6. McCallister, Erika; Grance, Tim; Scarfone, Karen. "Guide to Protecting the Confidentiality of Personally Identifiable Information (PII)." NIST Special Publication 800-122. 2010. Guide to Protecting the Confidentiality of PII — NIST SP 800-122 (retrieved 2026-05-14).
7. Everlaw. "Anthropic MCP integration." Everlaw. 2026-05. Anthropic MCP integration — Everlaw (retrieved 2026-05-14).
8. Model Context Protocol. "TypeScript SDK." GitHub. TypeScript SDK — GitHub (retrieved 2026-05-28).
What's next Congratulations on completing the Builder track! In the final Capstone Project, you will apply everything you've learned to build a production-ready MCP "Agentic Connector" that bridges a secure corporate system to Claude, complete with full observability and a documented compliance trail.
SMB and Growth Connectors
Small businesses do not need a beautiful generic "tool calling" demo. They need reliable help across messy workflows: payments, invoices, CRM updates, documents, customer emails, and approvals. This final chapter turns the patterns from the course into an SMB connector design you can actually ship.
The outline for this course names a Payroll Assistant: reconcile PayPal settlements against a QuickBooks-style ledger and draft reminder emails for missing payments. We will use that as the capstone bridge. The exact vendor APIs may differ in your environment, but the connector architecture should not.
Prerequisites check
You should have completed Chapters 1-8. In particular, you need:
- Native Claude tool-use flow from Chapter 1.
- MCP tools/resources/prompts from Chapters 2-4.
- Structured logs from Chapter 5.
- Authorization and approval gates from Chapter 6.
- Domain-specific connector thinking from Chapters 7 and 8.
If any of those are missing, do not build the payroll workflow yet. Finance connectors amplify every weak spot.
The SMB workflow stack
An SMB connector rarely talks to one system. A real workflow may touch:
- Finance ledger: invoices, payments, payouts, fees.
- Payment processor: settlements, disputes, transaction IDs.
- CRM: customer owner, deal stage, account notes.
- Email or workspace: reminders, internal review threads, attachments.
- Document system: statements, contracts, receipts.
- An SMB connector typically spans multiple systems; the design job is to present a coherent set of business actions rather than one raw endpoint per vendor.
- Replacing vendor-specific primitives (quickbooks_query, paypal_get) with workflow-level tools (find_unmatched_settlements, draft_missing_payment_reminders) makes approval and logging tractable.
- Finance connectors specifically benefit from boring, narrow interfaces — the more sensitive the action, the less ambiguity the tool should allow.
The connector should present these as business actions, not raw vendor endpoints.
Bad:
quickbooks_query(sql)
paypal_get(path)
hubspot_patch(object)
gmail_send(raw)
Better:
find_unmatched_settlements(date_range)
match_payment_to_invoice(settlement_id, invoice_id)
draft_missing_payment_reminders(run_id)
request_approval_for_reminders(run_id)
The better tools describe the accounting workflow. They also make approval and logging tractable.
<Callout type="hot"> Finance connectors need boring interfaces. The more sensitive the action, the less clever the tool should be. Make each operation narrow, typed, logged, and reviewable. </Callout>
Split reconciliation from action
The Payroll Assistant should not combine "find mismatches" and "email customers" into one call. Use stages:
- Collect ledger entries for the period.
- Collect payment settlements for the same period.
- Match by amount, date window, payer, and reference.
- Produce exceptions.
- Draft reminders for exceptions.
- Request human approval.
- Send only after approval.
- Combining reconciliation and customer-facing send into one tool call makes approval impossible and audit trail ambiguous.
- Each stage carries different risk: matching is read-heavy, drafting is reversible, sending is external — these belong in separate tools.
- Splitting stages also means a failure in matching does not roll back a send that already happened.
Each stage has different risk. Matching is read-heavy. Drafting is reversible. Sending is external and customer-facing.
Design MCP tools for a Payroll Assistant that reconciles PayPal settlements against an invoice ledger. Keep reconciliation separate from customer-facing email.
- list_invoice_ledger(date_range): read ledger entries.
- list_payment_settlements(date_range): read settlement records.
- propose_settlement_matches(date_range): returns matched and unmatched records.
- draft_missing_payment_reminders(reconciliation_run_id): drafts reminder emails.
- request_reminder_send_approval(reconciliation_run_id): creates an approval package.
- send_approved_reminders(approval_id): sends only after approval.
This separates read, draft, approval, and send phases.`} />
Approval state
Use an explicit awaiting_approval state for sensitive actions. Do not hide it in prose.
{
"status": "awaiting_approval",
"workflow": "payroll_reminders",
"run_id": "payroll_2026_05_14",
"summary": {
"unmatched_settlements": 4,
"drafted_reminders": 3,
"total_amount": "USD 1840.00"
},
"approval_required_for": "send_customer_emails"
}
- An explicit `awaiting_approval` status in the tool result gives the host application a machine-readable state it can enforce, not just prose Claude can explain away.
- The approval object should include run_id, a quantitative summary, and what specifically requires approval — enough for a human to decide without seeing full private data.
- The host must gate the next write action on a confirmed approval event; Claude presenting the draft is not an approval.
This object gives Claude something clear to explain and gives the host application a state machine it can enforce.
CRM coordination
CRM updates are also write actions. A safe connector can draft proposed updates:
- "Move deal to Payment follow-up."
- "Add note: settlement missing for invoice INV-123."
- "Assign owner Alex because account owner is Alex."
- CRM updates that trigger automated sequences are effectively external actions and require the same approval gate as customer-facing emails.
- The safe pattern is propose_crm_updates → human approval → apply_approved_crm_updates, not a direct write after finding a mismatch.
- Whether CRM notes are low-risk depends on whether they trigger downstream automations — risk must be assessed at the workflow level, not just the field level.
But the connector should not silently update every customer record. If your organization treats CRM notes as low-risk, you may allow writes with role-based authorization. If CRM updates trigger automations, treat them like external actions and require approval.
A CRM update will trigger an automated customer success sequence. Should the Payroll Assistant write the CRM stage directly after finding a missing payment? Explain the safer connector design.
Safer design: - propose_crm_updates(reconciliation_run_id) - show affected accounts and stage changes - require human approval - apply_approved_crm_updates(approval_id)
This keeps automation behind an explicit approval gate.`} />
Capstone architecture
Your final connector should expose:
Resources:
payroll://runs/{run_id}/summarypayroll://runs/{run_id}/exceptionspayroll://policies/reminder-template
Tools:
start_reconciliation(date_range)propose_settlement_matches(run_id)draft_missing_payment_reminders(run_id)request_send_approval(run_id)send_approved_reminders(approval_id)
Prompts:
explain_reconciliation_summaryreview_reminder_drafts
Logs:
- operational events for every tool call.
- audit events for approval creation and sending.
Hands-on exercise
Build the Payroll Assistant design document and one runnable stub tool.
Success criteria:
- You define resources, tools, prompts, logs, and audit events.
- You implement a stub
propose_settlement_matches(run_id)tool against demo JSON data. - The tool returns matched records and exceptions.
- Customer-facing email remains a draft until an approval object is created.
- You can explain how this connector satisfies the course outcomes.
What's next
This is the final chapter. Your capstone is to turn the Payroll Assistant or another domain connector into a production-ready MCP server with narrow tools, resources, structured logs, authorization, and approval gates.
[1]: Anthropic news index, https://www.anthropic.com/news [2]: Anthropic, "Tool use with Claude", https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/overview [3]: Model Context Protocol documentation, https://modelcontextprotocol.io/
References
Claude Code Dynamic Workflows: Fan-Out, Checkpoint, and Verify (2026)
Claude Code dynamic workflows turn a single orchestrator instance into a parallel execution engine: Claude writes and runs an orchestration script that fans work out to tens or hundreds of sub-agents, checkpoints progress between stages, and verifies sub-agent results before handing anything back. This is not a cosmetic upgrade to the prompt interface — it is a different execution model that changes how you think about cost, correctness, and control.
This chapter teaches you to design fan-out patterns, implement durable checkpoints, and build result-verification logic that earns trust. It also teaches you when the overhead is not worth it.
Prerequisites check
Before continuing, confirm that you:
- Can run a basic Claude tool-use script from Chapter 1 without errors.
- Understand the MCP protocol from Chapters 2 and 3 — sub-agents often expose tools via MCP.
- Have working structured logging from Chapter 5, because orchestrator-level debugging without logs is impractical.
- Understand authorization gates from Chapter 6 — parallel agents amplify every authorization weakness.
If the Chapter 5 logging setup is not in place, you will not be able to tell which sub-agent produced a bad result. Fix that first.
Static chains vs. dynamic workflows
The tool-use pattern you learned in Chapters 1 through 9 is a static chain: one Claude instance, one conversation thread, tools called one at a time, results incorporated sequentially.
User prompt → Claude decides → Tool A → Claude decides → Tool B → Final answer
A static chain is correct for most tasks. Claude reads the last tool result before calling the next tool, so it can adapt. The price is latency: each tool call is a synchronous round trip through the model, and you can only run one at a time.
- A static chain is sequential and adaptive but limited to one tool call at a time, making it slow for large-batch workloads.
- Dynamic workflows replace the sequential conversation loop with an orchestration script that spawns multiple sub-agents in parallel.
- Wall-clock time compresses with dynamic workflows, but token spend multiplies — each sub-agent is a full Claude invocation.
Dynamic workflows break that constraint. The orchestrator Claude instance does not call tools sequentially from inside a conversation loop. Instead, it writes an orchestration script — executable code that the host environment runs — and that script spawns multiple sub-agent instances in parallel.[1]
Orchestrator → generates script
Script → spawns Agent A, Agent B, Agent C in parallel
Agents A, B, C run concurrently → write results to checkpoint store
Script → reads checkpoint store → verifies results → assembles final output
The practical effect: a codebase analysis that would take 20 sequential tool calls can instead fan out to 20 sub-agents reading 20 files simultaneously. Wall-clock time compresses, but token spend multiplies.
The orchestrator model
The orchestrator is a Claude Code instance configured to produce and run multi-agent scripts rather than to answer a single user query. Anthropic's Claude Code SDK supports this pattern via the --output-format and subprocess orchestration APIs.[3]
The orchestrator is responsible for four things:
- Decomposing the task into parallel units of work.
- Spawning sub-agents with scoped context (not the full conversation history).
- Checkpointing each sub-agent's output to a persistent store.
- Verifying that the checkpoint store is complete and correct before proceeding.
- Sub-agents receive only the context the orchestrator explicitly provides — they do not share memory with each other or with the orchestrator's conversation thread.
- Sub-agent isolation prevents one agent's error state from contaminating another's result, but also means dense sequential dependencies cannot be parallelized.
- The orchestrator's four responsibilities (decompose, spawn, checkpoint, verify) must all be present; missing one makes the others harder to operate.
Sub-agents are isolated. They receive only the context the orchestrator hands them — a slice of the task, any tools they need, and output format instructions. They do not share memory with each other or with the orchestrator's conversation thread.
This isolation is a feature, not a limitation. It prevents one sub-agent's error state from contaminating another's result. The downside: if your task has dense sequential dependencies (the output of step 3 is required input to step 4), dynamic workflows do not help and may hurt.
Fan-out patterns
There are two structurally different fan-out shapes.
- Homogeneous fan-out applies the same task structure to different inputs; results are structurally identical and easy to aggregate.
- Heterogeneous fan-out assigns genuinely independent analysis dimensions to different sub-agents; results must be merged by the orchestrator.
- Use sequential chaining inside the orchestrator for steps with dense dependencies — forcing them into a heterogeneous fan-out adds complexity without parallelism benefit.
Homogeneous fan-out
All sub-agents receive the same task structure applied to different inputs. Example: analyze 50 code files for security issues. Each sub-agent gets one file. Results are structurally identical.
```python import subprocess, json, pathlib, concurrent.futures
def analyze_file(file_path: str) -> dict: result = subprocess.run( ["claude", "-p", f"Analyze this Python file for security issues. Return JSON: {{issues: [...], severity: 'low'|'medium'|'high'}}", "--input-file", file_path, "--output-format", "json"], capture_output=True, text=True, timeout=120 ) return {"file": file_path, "result": json.loads(result.stdout)}
files = list(pathlib.Path("src/").glob("*/.py"))
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool: futures = {pool.submit(analyze_file, str(f)): f for f in files} results = [f.result() for f in concurrent.futures.as_completed(futures)] ```
The orchestrator script launches 20 parallel claude -p processes (the practical thread limit for a Max account without hitting rate ceilings). Each sub-agent returns structured JSON. The orchestrator collects and checkpoints all results before doing anything else.
Heterogeneous fan-out
Sub-agents receive different tasks that produce inputs for a downstream aggregation step. Example: one sub-agent audits for security, another for performance, a third for test coverage. Results are structurally different and must be merged by the orchestrator.
```python AGENTS = { "security": "Audit for OWASP Top 10 issues. Return {findings: [...]}", "perf": "Identify hot paths and O(n²) loops. Return {findings: [...]}", "coverage": "List untested public functions. Return {findings: [...]}", }
def run_agent(role: str, prompt: str, target: str) -> dict: result = subprocess.run( ["claude", "-p", prompt, "--input-file", target, "--output-format", "json"], capture_output=True, text=True, timeout=180 ) return {"role": role, "result": json.loads(result.stdout)}
with concurrent.futures.ThreadPoolExecutor() as pool: futures = [ pool.submit(run_agent, role, prompt, "src/payments.py") for role, prompt in AGENTS.items() ] specialist_results = [f.result() for f in concurrent.futures.as_completed(futures)] ```
Heterogeneous fan-out is useful when the analysis dimensions are genuinely independent — security review has nothing to say about test coverage. When they are not independent (e.g., you need the security result to scope the performance review), use sequential chaining inside the orchestrator script, not a heterogeneous fan-out.
I want to audit a Python monorepo with 80 files. Some files need security analysis, some need performance review, and some need both. Design a fan-out orchestration strategy. Consider: how do I route …
- Classification pass (single agent, sequential)
- Run one fast classification agent first: read file names + top-level imports, return a JSON map of {file_path: ["security", "perf"]} tags. Cost: ~1 agent × 80 short reads. Time: ~30s.
- Parallel audit agents (fan-out)
- From the classification map, build two lists: security_files, perf_files. Files needing both appear in both lists. Spawn two independent thread pools: one for security agents, one for perf agents. Max 20 workers each. Each agent receives one file and its assigned role.
- Checkpoint store shape
- {
- "run_id": "audit-2026-05-31-a1b2",
- "files": {
- "src/payments.py": {
- "security": { "status": "done", "findings": [...] },
- "perf": { "status": "done", "findings": [...] }
- },
- "src/models.py": {
- "security": { "status": "done", "findings": [...] },
- "perf": { "status": "pending" }
- }
- }
- }
- Deduplication
- Track each (file, role) pair. If the run crashes and you resume, skip pairs where status == "done". This is idempotent fan-out.`}
- />
Token budget math
Dynamic workflows multiply token spend. Every sub-agent is a full Claude invocation. A homogeneous fan-out of 50 sub-agents costs 50× the tokens of running one.
- Token spend scales linearly with sub-agent count: a 50-agent fan-out costs 50× a single invocation, and Claude Code plan weekly limits apply to the total.
- Log `input_tokens` and `output_tokens` from every sub-agent and aggregate them in the checkpoint store to create a cost record per orchestration run.
- Measure token spend on a small pilot run before scheduling a recurring large fan-out; cost accumulation is invisible without explicit tracking.
Rule of thumb for estimating cost before you build:
Total tokens ≈ (n_subagents × avg_context_per_agent) + orchestrator_tokens
For a code analysis run with 50 agents, each receiving a 2,000-token file plus a 500-token prompt:
50 × 2,500 = 125,000 input tokens
50 × 1,000 = 50,000 output tokens (estimated)
Orchestrator: ~5,000 tokens
Total: ~180,000 tokens
At Sonnet 4.6 pricing (as of 2026), this is a fraction of a dollar — but with Claude Code plans that enforce weekly [[token]] limits, burning 180,000 tokens on one orchestration run can noticeably dent your weekly quota.[5]
Community posts from May 2026 consistently flag this: users who turned on dynamic workflows for large fan-outs saw their weekly credit reset events become meaningful rather than routine.[2] The problem is not the per-run cost but the invisibility — if you do not log total tokens per orchestration run, you will not see the accumulation until the quota wall hits.
Mitigation: log input_tokens and output_tokens from every sub-agent's response. Aggregate them at orchestrator level. Write the total to your checkpoint store. You then have a cost record per run that persists across failures.
def run_agent_with_cost_tracking(role: str, prompt: str, file: str) -> dict:
result = subprocess.run(
["claude", "-p", prompt,
"--input-file", file,
"--output-format", "json",
"--verbose"], # prints usage to stderr
capture_output=True, text=True, timeout=180
)
payload = json.loads(result.stdout)
# claude --verbose emits usage JSON on stderr
try:
usage = json.loads(result.stderr.splitlines()[-1])
except Exception:
usage = {}
return {
"role": role,
"file": file,
"result": payload,
"tokens_in": usage.get("input_tokens", 0),
"tokens_out": usage.get("output_tokens", 0),
}
<Callout type="warning"> Weekly Claude Code plan limits apply per-account, not per-script. A dynamic workflow that fans out to 100 sub-agents burns plan quota as if a human ran 100 manual Claude Code tasks. On Pro, that can exhaust a week's allowance in a few runs. On Max, limits are higher but still finite. Before you schedule a recurring orchestration job, measure its token cost on one real run and multiply by run frequency. </Callout>
Checkpoint patterns
A [[checkpoint]] is a durable write of an intermediate result that the orchestration script can read on restart. Without checkpoints, a partial failure means re-running all sub-agents from scratch — wasting quota and time.
The minimal checkpoint pattern:
```python import json, pathlib, hashlib, time
CHECKPOINT_DIR = pathlib.Path(".checkpoints") CHECKPOINT_DIR.mkdir(exist_ok=True)
def checkpoint_key(run_id: str, file: str, role: str) -> str: content = f"{run_id}:{file}:{role}" return hashlib.sha256(content.encode()).hexdigest()[:16]
def write_checkpoint(run_id: str, file: str, role: str, result: dict): key = checkpoint_key(run_id, file, role) path = CHECKPOINT_DIR / f"{key}.json" path.write_text(json.dumps({ "run_id": run_id, "file": file, "role": role, "ts": time.time(), "result": result }))
def read_checkpoint(run_id: str, file: str, role: str) -> dict | None: key = checkpoint_key(run_id, file, role) path = CHECKPOINT_DIR / f"{key}.json" if path.exists(): return json.loads(path.read_text()) return None
def run_agent_checkpointed(run_id: str, role: str, prompt: str, file: str) -> dict: cached = read_checkpoint(run_id, file, role) if cached: return cached["result"] # skip sub-agent if already done result = run_agent_with_cost_tracking(role, prompt, file) write_checkpoint(run_id, file, role, result) return result ```
With this pattern, you can kill the orchestration script mid-run, fix a broken agent prompt, and restart. Only the incomplete (file, role) pairs re-run. Completed pairs read from disk in milliseconds.
Checkpoint granularity
Write checkpoints per unit of work (per file, per document, per API call), not per stage. A coarser checkpoint — "stage 2 is done" — forces a full stage re-run on failure. A finer checkpoint — "file X, role Y is done" — lets you resume at the exact failure point.
The cost of fine-grained checkpoints is small JSON files and slightly more I/O. The cost of coarse checkpoints is duplicated sub-agent invocations. Fine-grained is almost always the right choice.
Verification before handoff
A checkpoint stores what a sub-agent said. A [[verification]] step decides whether to trust it.
Without verification, a silent sub-agent failure — malformed JSON, hallucinated schema field, empty output, timeout — propagates into the final result. The orchestrator assembles garbage and the downstream consumer sees it.
Verification should happen before the orchestrator commits to using a checkpoint result in the final assembly.
```python from typing import TypedDict
class SecurityFinding(TypedDict): file: str line: int severity: str description: str
def verify_security_result(result: dict) -> tuple[bool, str]: """Returns (is_valid, reason). Reason is empty string on success.""" if "findings" not in result: return False, "missing 'findings' key" if not isinstance(result["findings"], list): return False, "'findings' must be a list" for i, f in enumerate(result["findings"]): if not isinstance(f.get("severity"), str): return False, f"finding[{i}] missing string 'severity'" if f["severity"] not in ("low", "medium", "high", "critical"): return False, f"finding[{i}] has invalid severity: {f['severity']}" return True, ""
def assemble_final_report(run_id: str, files: list[str]) -> dict: verified, rejected = [], [] for file in files: cp = read_checkpoint(run_id, file, "security") if cp is None: rejected.append({"file": file, "reason": "no checkpoint"}) continue ok, reason = verify_security_result(cp["result"]) if ok: verified.append({"file": file, "findings": cp["result"]["findings"]}) else: rejected.append({"file": file, "reason": reason})
if rejected: # surface failures explicitly rather than silently omitting them print(f"WARNING: {len(rejected)} files rejected during verification:") for r in rejected: print(f" {r['file']}: {r['reason']}")
return {"verified": verified, "rejected": rejected, "run_id": run_id} ```
The orchestrator should expose the rejected list to the caller. Silent omission is the worst outcome — the downstream system thinks all 80 files were analyzed when 5 were silently dropped.
An orchestration script fanned out to 20 sub-agents. Two returned empty output (timeout), one returned valid JSON with an unexpected extra field 'confidence_score' not in the agreed schema, and one re…
Empty output (2 agents — timeouts) Mark these as status: "failed", reason: "empty output / timeout". Do NOT silently drop them. Log which files were skipped. Depending on tolerance: either re-queue these specific files with a longer timeout, or surface them in the final report as "unanalyzed".
Extra schema field ('confidence_score') This is a schema evolution case, not a failure. If the field is not required by your contract, accept the result and strip or preserve the extra field based on policy. Do not reject valid outputs for forward-compatible additions. Log the unexpected field for schema tracking.
Python traceback instead of JSON This is a hard failure. The sub-agent crashed before producing output. Mark status: "failed", reason: "agent error — non-JSON output". Log the traceback for debugging. Do not attempt to parse it as a finding.
Orchestrator report Always include a summary: { "total_files": 20, "verified": 17, "failed": { "timeout": 2, "agent_error": 1 }, "warnings": { "schema_drift": 1 } }
Never present 17/20 as "complete". The 3 failures are load-bearing gaps in coverage.`} />
Rollback patterns
Not all dynamic workflows are read-only. Some fan-out to sub-agents that write to databases, call APIs, or modify files. When a verification step fails after some sub-agents have already committed writes, you need rollback.
The simplest rollback pattern is two-phase execution:
- Propose phase — sub-agents generate proposed changes, write them to the checkpoint store as diffs or new-state blobs. No actual writes to production systems.
- Commit phase — the orchestrator runs verification on all proposals. If all verify, it applies them. If any fail, it discards the entire set and surfaces the failures.
```python def execute_two_phase(run_id: str, tasks: list[dict]) -> dict: # Phase 1: propose proposals = {} for task in tasks: proposal = run_propose_agent(run_id, task) # writes to checkpoint, not to DB ok, reason = verify_proposal(proposal) if not ok: return {"status": "aborted", "reason": reason, "task": task} proposals[task["id"]] = proposal
return {"status": "committed", "count": len(committed)} ```
If you cannot implement two-phase (e.g., external API calls have no transactional rollback), design the fan-out so that each sub-agent is idempotent: running it twice has the same effect as running it once. Idempotency is the poor-person's rollback when real rollback is unavailable.
When NOT to use dynamic workflows
Dynamic workflows are not the default for tool use. They are a specialized pattern that adds complexity, cost, and operational overhead. Use them only when the task justifies it.
Do not use dynamic workflows when:
- The task fits in one prompt. If Claude can produce the correct result in a single turn with one or two tool calls, fan-out adds nothing except tokens and latency.
- Sequential dependencies prevent meaningful parallelism. If step 3 depends on step 2's output, you cannot run them in parallel. Sequential chaining inside one Claude instance is simpler.
- You need human approval at an undetermined mid-flight point. Dynamic workflows are designed for machine-to-machine completion. If a human must review and approve at a step that only becomes identifiable during execution, an interactive agent loop is the right tool, not an orchestration script.
- Latency matters more than throughput. Spawning sub-agents has overhead: process startup, context injection, output parsing, checkpoint I/O. For a task with five small sequential steps, that overhead is more than the parallelism saves.
- The token cost is not justified. If your job takes 10 minutes sequentially and 2 minutes with 50 sub-agents, but the sequential path costs 5,000 tokens and the fan-out costs 250,000 tokens, you are spending 50× to save 8 minutes. That trade-off only makes sense at scale or when wall-clock time has a direct dollar value to you.
Practical fan-out limits by plan tier
As of June 2026, the practical fan-out ceiling is determined by weekly token limits, not a hard API concurrency cap.[2][5]
| Plan | Practical max concurrent sub-agents | Notes |
|---|---|---|
| Claude Pro | 5–10 | Weekly credit resets quickly at high fan-out |
| Claude Max (×5) | 20–40 | Comfortable for mid-size batch jobs |
| Claude Max (×20) | 80–150 | Viable for large-scale orchestration |
| API (pay-as-you-go) | Rate-limit dependent | No weekly cap; cost is per-token |
These are community-reported heuristics, not Anthropic's official numbers.[1] Your optimal concurrency depends on your specific prompts and context sizes. Always measure token spend on a small pilot run before committing to a large fan-out.
For the API route (non-plan, pay-as-you-go), weekly limits do not apply but per-minute rate limits do. Throttle sub-agent launch rates with exponential backoff if you see 429s.
Summary of the dynamic workflow stack
A production dynamic workflow has five layers:
┌─────────────────────────────────────────────────────────┐
│ 1. Orchestrator Decomposes task → spawns sub-agents │
├─────────────────────────────────────────────────────────┤
│ 2. Sub-agents Isolated Claude -p invocations │
│ Each receives scoped context + tools │
├─────────────────────────────────────────────────────────┤
│ 3. Checkpoint store JSON files per (run_id, unit, role)│
│ Enables idempotent resume │
├─────────────────────────────────────────────────────────┤
│ 4. Verification Schema + value checks before assembly │
│ Rejects or flags incomplete results │
├─────────────────────────────────────────────────────────┤
│ 5. Cost tracking input_tokens + output_tokens per agent│
│ Aggregated per run in checkpoint store│
└─────────────────────────────────────────────────────────┘
Missing any layer makes the others harder to operate. Verification without checkpoints means you cannot recover from partial failures. Checkpoints without cost tracking means you cannot explain the quota bill.
Try this yourself
Hands-on exercise: build a checkpointed code reviewer
Build an orchestration script that:
- Accepts a directory path as input.
- Lists all
.pyfiles in the directory. - Fans out to one
claude -psub-agent per file, asking each to return{file, issues: [{line, severity, message}]}as JSON. - Writes each result to a checkpoint store keyed by
(run_id, file_path). - Verifies each result (checks that
issuesis a list, each issue haslineas int andseverityin["low","medium","high","critical"]). - Prints a summary: total files, verified, rejected (with reasons), total input + output tokens.
Success criteria:
- Run the script on a 10-file sample directory. Kill it after 5 files complete. Restart it. Confirm that only the 5 incomplete files run again (not the already-complete 5).
- Manually corrupt one checkpoint file (remove the issues key). Confirm the script flags it as a verification failure, not a crash.
- Check the token totals in the summary against what you expect from your plan's usage dashboard.
Stretch goal: add a --dry-run flag that reads checkpoints and prints the summary without spawning any sub-agents.
What's next
You have now covered the full tool-use stack from basic function calling (Chapter 1) through production connector patterns (Chapters 7–9) and orchestration (this chapter). The Capstone Project applies everything: build a production-ready MCP Agentic Connector that bridges a secure domain system to Claude, includes structured logs for every tool call, authorization per tool, a compliance audit trail, and optionally a dynamic workflow layer for batch operations.
[1]: r/ClaudeAI thread on Claude Code dynamic workflows, 2026-05-29 — https://www.reddit.com/r/ClaudeAI/comments/1tq9ofy/introducing_dynamic_workflows_in_claude_code/ [2]: r/ClaudeAI thread on Claude Code credits and weekly limits, 2026-05-29 — https://www.reddit.com/r/ClaudeAI/comments/1tq9vqf/claude_code_credits_rebooted_after_coding_for/ [3]: Anthropic Claude Code SDK documentation — https://docs.anthropic.com/en/docs/claude-code/sdk [4]: Anthropic support: Claude Agent SDK with Claude plan — https://support.claude.com/en/articles/15036540-use-the-claude-agent-sdk-with-your-claude-plan [5]: Anthropic Claude Code settings documentation — https://docs.anthropic.com/en/docs/claude-code/settings
References
- https://www.reddit.com/r/ClaudeAI/comments/1tq9ofy/introducing_dynamic_workflows_in_claude_code/
- https://www.reddit.com/r/ClaudeAI/comments/1tq9vqf/claude_code_credits_rebooted_after_coding_for/
- https://docs.anthropic.com/en/docs/claude-code/sdk
- https://docs.anthropic.com/en/docs/claude-code/settings
- https://support.claude.com/en/articles/15036540-use-the-claude-agent-sdk-with-your-claude-plan
- https://docs.anthropic.com/en/docs/about-claude/models/overview