
mcp-builder: build production MCP servers without reading the spec
Anthropic's mcp-builder skill (60,300 installs) enforces a structured 4-phase workflow (Research → Implement → Review → Evaluate). One engineer used it across 15 enterprise MCP servers and reported a 98.7% token reduction, with the cost of a pipeline health check falling from $98.39 to $0.13 per 1,000 checks. This note covers the install steps, the Phase 4 eval framework, 11 known bugs with a practical mitigation, and when not to use the skill.

Research note
Most MCP server tutorials end at "paste your OpenAPI spec into a generator and call it done." MCP (Model Context Protocol, the open standard that lets LLMs call external tools through a structured interface) has a growing library of ready-made servers — but building a custom one for your own API is still a friction point. One founder tried exactly that in a B2B SaaS context and found it "completely breaks in production" — enterprise customers bring custom field formats and LLM-specific instruction requirements that no static generator handles.1 mcp-builder (Anthropic, Apache 2.0) is the answer to that failure mode: a structured 4-phase agent skill that enforces MCP design principles rather than just scaffolding code.
The skill has 60,300 installs on skills.sh across Claude Code, Cursor, Windsurf, Codex, GitHub Copilot, Cline, and 14 other agent tools.2 It sits at #8 in the anthropics/skills repo (140k stars, Apache 2.0).3
What it actually does
Firecrawl's Hiba Fathima put it clearly: "This is not a one-shot generator. It is a structured guide that enforces the design principles that make MCP servers genuinely useful."4 The skill itself states: "The quality of an MCP server is measured by how well it enables LLMs to accomplish real-world tasks."5
The workflow has four phases that run sequentially:
| Phase | What happens |
|---|---|
| 1. Research | Study MCP design principles (API coverage vs. workflow tools, naming conventions, context management); read SDK docs; map out authentication, endpoints, data models |
| 2. Implement | Scaffold project (TypeScript recommended); build API client, error handling, pagination; add per-tool behavior hints (readOnlyHint, destructiveHint, idempotentHint — MCP SDK annotations that tell the client whether a tool is safe to call without confirmation) |
| 3. Review | Code quality pass (DRY, type coverage, description clarity); test with MCP Inspector (Anthropic's official local GUI for testing MCP servers interactively) |
| 4. Evaluate | Generate 10 independent, read-only, complex, verifiable XML eval questions — confirm the LLM can actually use the server before deploying |
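The Phase 2 behavior hints in the table can be pictured with a small, SDK-free sketch. The three hint names come from the table; the tool names, the `ToolHints` class, and the `needs_confirmation` policy are hypothetical illustrations, not the MCP SDK's API or any particular client's behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolHints:
    """Per-tool behavior hints. Defaults are this sketch's own conservative choice."""
    read_only: bool = False
    destructive: bool = True
    idempotent: bool = False

    def as_annotations(self) -> dict:
        # Map to the camelCase hint names used in the table above.
        return {
            "readOnlyHint": self.read_only,
            "destructiveHint": self.destructive,
            "idempotentHint": self.idempotent,
        }


# Hypothetical tools for an imaginary ticketing API.
TOOLS = {
    "search_tickets": ToolHints(read_only=True, destructive=False, idempotent=True),
    "update_ticket_status": ToolHints(destructive=False, idempotent=True),
    "delete_ticket": ToolHints(destructive=True),
}


def needs_confirmation(hints: ToolHints) -> bool:
    """Hypothetical client policy: skip the prompt only for safe, repeatable calls."""
    if hints.read_only:
        return False
    return hints.destructive or not hints.idempotent


if __name__ == "__main__":
    for name, hints in TOOLS.items():
        print(name, hints.as_annotations(), "confirm:", needs_confirmation(hints))
```

The point of the sketch is that hints are declared once per tool and the client, not the server, decides what to do with them.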
The evaluation phase is the step most generators skip. Firecrawl notes it "verifies the model can actually use your server before you deploy it," which is exactly what you want to know and almost never check manually.4
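A structural check on a Phase 4 eval file might look like the sketch below. The element names (`evaluation`, `qa_pair`, `question`, `answer`) are assumptions about the XML layout, so adjust them to whatever the skill actually generates. The check covers only the count, non-empty answers (verifiable), and duplicate questions (independent); whether a question is read-only or complex still needs a human or model review.

```python
import xml.etree.ElementTree as ET

REQUIRED_QUESTIONS = 10  # Phase 4 asks for 10 eval questions


def check_eval_file(xml_text: str, required: int = REQUIRED_QUESTIONS) -> list[str]:
    """Return a list of problems; an empty list means the structural checks pass."""
    problems = []
    root = ET.fromstring(xml_text)
    pairs = root.findall("qa_pair")  # assumed element name
    if len(pairs) != required:
        problems.append(f"expected {required} qa_pair elements, found {len(pairs)}")
    seen = set()
    for index, pair in enumerate(pairs, start=1):
        question = (pair.findtext("question") or "").strip()
        answer = (pair.findtext("answer") or "").strip()
        if not question:
            problems.append(f"qa_pair {index}: empty question")
        if not answer:
            problems.append(f"qa_pair {index}: empty answer (not verifiable)")
        if question and question in seen:
            problems.append(f"qa_pair {index}: duplicate question (not independent)")
        seen.add(question)
    return problems


def build_sample(count: int) -> str:
    """Build a placeholder eval file with `count` distinct question/answer pairs."""
    pairs = "".join(
        f"<qa_pair><question>Question {i}?</question><answer>answer-{i}</answer></qa_pair>"
        for i in range(count)
    )
    return f"<evaluation>{pairs}</evaluation>"


if __name__ == "__main__":
    print(check_eval_file(build_sample(10)))
    print(check_eval_file(build_sample(9)))
```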
On language choice: mcp-builder recommends TypeScript as default — "AI models are good at generating TypeScript code, benefiting from its broad usage, static typing and good linting tools."5 Python with FastMCP (a high-level Python library for building MCP servers) is supported for local/stdio use cases.
Install
One command works across all supported agents:
`npx skills add https://github.com/anthropics/skills --skill mcp-builder`
No prerequisites, no runtime dependencies. The skill is plain Markdown context loaded by your agent. For Claude Code specifically, you can also use `/plugin marketplace add anthropics/skills` followed by selecting mcp-builder.2 The workflow assumes Node.js (for TypeScript) or Python 3.10+ is already in your environment.
Time to first working server: roughly 1 hour for a simple unauthenticated public API, 2 hours for a complex authenticated REST API.6
What the results look like
Kelly Kohlleffel used mcp-builder to build 15 MCP servers connecting an entire enterprise data stack: Fivetran, Snowflake, Databricks, dbt, Census, Google Cloud, Microsoft Fabric OneLake, AWS S3, plus public APIs for Oura, SpaceX, OpenLibrary, National Parks, and FDA.6

The initial naive build loaded all 188 tools at once. A single pipeline health check cost 56,912 tokens and $98.39 per 1,000 checks. After applying mcp-builder's workflow design principles — collapsing 188 tools into 5 workflow-oriented tools and optimizing response payloads — the same check dropped to 421 tokens and $0.13 per 1,000 checks: a 98.7% token reduction.6
The per-stage breakdowns: Source pipeline 99.5% reduction, Ingestion 99.2%, Transformation 98.8%, Warehouse 98.6%, Activation 99.1%.6 The difference isn't the skill itself — it's that the 4-phase workflow forced the design question "what does the LLM actually need to accomplish this task?" instead of mapping APIs one-to-one.
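The difference between mapping an API one-to-one and exposing a workflow tool can be sketched as follows. Everything here is hypothetical: the connector data, `list_connectors_raw`, and `check_pipeline_health` are invented for illustration and are not taken from Kohlleffel's servers. The character count is only a rough stand-in for tokens.

```python
import json


def _connector(name: str, status: str) -> dict:
    """Fake upstream record with the bulky fields a raw API typically returns."""
    return {
        "id": f"conn_{name}",
        "name": name,
        "status": status,
        "schema": {"tables": [f"{name}_table_{i}" for i in range(20)]},
        "sync_history": [{"run": i, "rows": 1000 * i} for i in range(10)],
    }


# Hypothetical data; a real server would fetch this over HTTP.
FAKE_CONNECTORS = [
    _connector("salesforce", "healthy"),
    _connector("stripe", "failed"),
    _connector("zendesk", "healthy"),
]


def list_connectors_raw() -> list[dict]:
    """One-to-one style: hand the upstream payload to the model untouched."""
    return FAKE_CONNECTORS


def check_pipeline_health() -> dict:
    """Workflow style: answer the question the model actually has, nothing more."""
    failing = [c["name"] for c in FAKE_CONNECTORS if c["status"] != "healthy"]
    return {
        "total": len(FAKE_CONNECTORS),
        "healthy": len(FAKE_CONNECTORS) - len(failing),
        "failing": failing,
    }


def payload_chars(payload) -> int:
    """Rough size proxy: characters of compact JSON, not a real token count."""
    return len(json.dumps(payload, separators=(",", ":")))


if __name__ == "__main__":
    print("raw payload chars:     ", payload_chars(list_connectors_raw()))
    print("workflow payload chars:", payload_chars(check_pipeline_health()))
```

The workflow tool does the filtering and aggregation on the server, so the model never sees schemas or sync history it does not need for a health check.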
Community signal
Reddit user u/geekeek123 (r/ClaudeAI, 904 upvotes): "MCP Builder — Generates MCP server boilerplate. If you're building custom integrations, this cuts setup time by like 80%."7
A recent r/thingsapp thread shows a builder using Claude Code + mcp-builder to create a Things 3 desktop MCP connector via the x-callback URL scheme. Commenter jsrqs1981: "The Claude MCP builder skill is pretty great."8 Another user in the thread: "life honestly hasn't been the same."8

Known bugs — read before copying reference code
GitHub issue #1013 (filed 2026-04-23) documents 11 specific defects in mcp-builder's reference files.9 The three most likely to bite you:
- `ctx.elicit()` signature is wrong: the reference uses invented `prompt=` and `input_type=` parameters that don't exist in the actual SDK
- Node.js complete example is missing the `express` and `StreamableHTTPServerTransport` imports, so paste-and-compile fails immediately
- Pydantic v2: `max_items` is deprecated, should be `max_length`
Fix PR #1041 was submitted 2026-04-26 and covers all 11 issues. As of 2026-05-23 it remains unmerged (27 days open).10 Filer DamienOR-dot's read on this: "Most of the above are drift from SDK changes over time rather than bad design."9
Practical mitigation: treat the reference files as design patterns, not copy-paste templates. Run `tsc --noEmit` after Phase 2 and let the type checker catch import gaps before Phase 3.
When not to use it
- Quick internal prototypes — the 4-phase workflow, MCP Inspector test pass, and 10-question eval suite are production tooling. For a throwaway script you won't maintain, they add more overhead than value. Firecrawl explicitly: "Worth it for production MCP servers; overkill for a quick internal experiment."4
- B2B SaaS with heavy custom fields — static OpenAPI→MCP conversion breaks when customers have per-tenant parameter formats. mcp-builder improves on naive generation, but the environment-level override layer (custom tool descriptions per client) still needs to be hand-built.1
- GraphQL / gRPC / SOAP targets — the reference code and design guidance are REST/HTTP-centric. No confirmed reports of mcp-builder handling non-REST API targets cleanly.
- Context budget is already tight — installing more than 8–12 skills simultaneously creates a meaningful context tax: each skill's description loads on every invocation whether or not it's triggered. mcp-builder itself is lean (no
`model` or `allowed-tools` fields), but if you're already close to the context ceiling, audit your installed skill count first.11
Skill metadata
- Repo: github.com/anthropics/skills → skills/mcp-builder/
- Installs: 60,300 (skills.sh, as of 2026-05-23)2
- License: Apache 2.03
- Author: Anthropic (official)
- Last SKILL.md update: 2026-05-193
- Agents: Claude Code, Cursor, Windsurf, GitHub Copilot, Codex, Cline, VS Code, OpenCode, Gemini CLI, 10+ more2
- Security audits: Gen Agent Trust Hub, Socket (Pass), Snyk (Pass)2
- Open PRs: #1041 (bug fixes, 27 days unmerged as of 2026-05-23)10
Acquisition context: Anthropic acquired Stainless (the company that auto-generates SDKs and MCP servers from API specs) on 2026-05-18 for $300M+.12 Stainless had been generating all official Anthropic SDKs (TypeScript, Python, and others) since the API launched. The Stainless team joins to work on Claude's agent and connector capabilities. External Stainless customers lose automatic maintenance pipelines; Speakeasy, FERN, and liblab are the main alternatives.12
Sources
- 1. We tried generating MCP servers from static OpenAPI specs
- 2. mcp-builder on skills.sh
- 3. anthropics/skills on GitHub
- 4. Best Codex Skills to Try in 2026 (Firecrawl)
- 5. MCP Builder SKILL.md (Anthropic/GitHub)
- 6. Kelly Kohlleffel / Medium: What I Learned Building MCP Servers
- 7. r/ClaudeAI: 10 Claude Skills that actually changed how I work
- 8. r/thingsapp: Things 3 MCP connector with Claude
- 9. GitHub Issue #1013: Bugs and stale API references in mcp-builder
- 10. GitHub PR #1041: fix mcp-builder stale API references
- 11. The 8 Skills Every Claude Code Setup Needs in 2026 (Towards AI)
- 12. Anthropic Bought Stainless: What To Do About Your AI Stack (FindSkill.ai)