
mcp-builder

Guides construction of MCP servers using FastMCP 2.x with emphasis on token budget management, tool consolidation, output truncation, and the execute pattern for Opus compatibility. Covers async subprocess patterns, safe output limits, and action-based tool design. Use when creating MCP servers, debugging tool overload or context overflow, optimizing token usage, fixing Opus silent failures, or consolidating tools. Triggers: MCP, FastMCP, tool schema, token limit, context overflow, MCP server, execute pattern, tool consolidation.

SKILL.md contents

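Building MCP Servers

Expert guidance for building MCP servers with FastMCP 2.x that work reliably with Claude and other LLMs.

Capabilities

FastMCP 2.x server setup -- modern async MCP server scaffolding with stdio and SSE transports
Token budget management -- prevent context overflow and model crashes with safe limits
Tool consolidation -- reduce the tool count with action-based and provider-based patterns
Output truncation -- apply a safe character limit to every tool output
Execute pattern -- a code-execution tool that works around Opus schema incompatibilities
Async subprocess -- non-blocking command execution with timeouts and stdin isolation
Troubleshooting -- diagnose connection failures, context overflow, and platform-specific issues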

Routing Logic

Request type	Reference to load
FastMCP setup, transports, resources, testing	references/fastmcp-patterns.md
Token budgets, description optimization, consolidation strategies	references/token-budget.md
Debugging, Windows issues, CLI handoff, silent failures	references/troubleshooting.md

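Core Principles

1. Keep the tool count at 10 or fewer

LLMs perform poorly when presented with more than 10-20 tools. Use an action parameter to consolidate related operations behind a single tool.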

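# Good: one consolidated tool
@mcp.tool(name="dev", description="Dev tools. Actions: lint, test, build.")
async def dev(action: str, params: dict | None = None) -> str:
    if action == "lint": ...
    elif action == "test": ...
    elif action == "build": ...
    else:
        # Report unknown actions instead of silently returning None
        return f"FAILED\nError: unknown action '{action}'"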
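
As the number of actions grows, an if/elif chain can be swapped for a dispatch table. The following is a minimal sketch of that idea, not the skill's own implementation: the handlers `run_lint`, `run_test`, and `run_build` are hypothetical placeholders, and the `@mcp.tool` decorator is omitted so the snippet runs without FastMCP installed.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

# Hypothetical handlers; real ones would invoke the linter, test runner, or build.
async def run_lint(params: dict[str, Any]) -> str:
    return f"lint ok: {params.get('package', 'all')}"

async def run_test(params: dict[str, Any]) -> str:
    return f"test ok: {params.get('package', 'all')}"

async def run_build(params: dict[str, Any]) -> str:
    return f"build ok: {params.get('package', 'all')}"

# One entry per action keeps routing and the list of valid actions in one place.
ACTIONS: dict[str, Callable[[dict[str, Any]], Awaitable[str]]] = {
    "lint": run_lint,
    "test": run_test,
    "build": run_build,
}

async def dev(action: str, params: Optional[dict[str, Any]] = None) -> str:
    """Route one action value to its handler and reject unknown actions."""
    handler = ACTIONS.get(action)
    if handler is None:
        valid = ", ".join(sorted(ACTIONS))
        return f"FAILED\nError: unknown action '{action}'. Valid actions: {valid}"
    return "SUCCESS\n\n" + await handler(params or {})
```

Listing the valid actions in the error message gives the model something to recover with when it guesses an action name that does not exist.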

2. Keep each description under 50 tokens

Tool descriptions are loaded into context on every request. Aim for under 500 tokens across all tools and under 50 tokens per tool.

# Bad (about 100 tokens)
description="""Consolidated development tools for the monorepo.
Actions: lint, test, build. Params: {"package": "...", "fix": bool}
Example: {"action": "lint", "params": {"package": "core"}}"""

# Good (about 30 tokens)
description="""Dev tools. Actions: lint, test, build.
Params: {"package": "...", "fix": bool}"""
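
One way to keep descriptions within these limits is a quick check before release. The sketch below assumes the rough heuristic of about 4 characters per token that the checklist mentions; it is not a real tokenizer, and the names `estimate_tokens` and `check_budget` are made up for illustration.

```python
import math

CHARS_PER_TOKEN = 4        # rough heuristic only; real tokenizers will differ
MAX_TOKENS_PER_TOOL = 50   # per-tool target from the guideline above
MAX_TOKENS_TOTAL = 500     # target across all tool descriptions

def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def check_budget(descriptions: dict[str, str]) -> list[str]:
    """Return one message per budget violation; an empty list means all clear."""
    problems = []
    total = 0
    for name, desc in descriptions.items():
        tokens = estimate_tokens(desc)
        total += tokens
        if tokens >= MAX_TOKENS_PER_TOOL:
            problems.append(f"{name}: ~{tokens} tokens (target < {MAX_TOKENS_PER_TOOL})")
    if total >= MAX_TOKENS_TOTAL:
        problems.append(f"total: ~{total} tokens (target < {MAX_TOKENS_TOTAL})")
    return problems
```

Calling `check_budget` with a mapping of tool names to description strings returns the descriptions that need trimming. Because the estimate is character-based, treat borderline results as a prompt to measure with the actual tokenizer.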

3. Truncate all output

Large outputs crash Opus 4.5 and consume excessive context. Enforce a hard limit.

MAX_OUTPUT_LENGTH = 8000  # ~2,000 tokens

def truncate_output(output: str, max_length: int = MAX_OUTPUT_LENGTH) -> str:
    if len(output) <= max_length:
        return output
    truncated = output[:max_length]
    last_newline = truncated.rfind('\n')
    if last_newline > max_length * 0.8:
        truncated = truncated[:last_newline]
    return truncated + "\n\n... (truncated for token safety)"
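
To make sure no tool forgets the limit, truncation can be applied through a decorator. This is a sketch under assumptions: `truncated` and `noisy_tool` are hypothetical names, and `truncate_output` is repeated from above only so the snippet runs on its own.

```python
import asyncio
import functools

MAX_OUTPUT_LENGTH = 8000  # ~2,000 tokens

def truncate_output(output: str, max_length: int = MAX_OUTPUT_LENGTH) -> str:
    # Same logic as the function above, repeated so this sketch is self-contained.
    if len(output) <= max_length:
        return output
    truncated = output[:max_length]
    last_newline = truncated.rfind("\n")
    if last_newline > max_length * 0.8:
        truncated = truncated[:last_newline]  # prefer cutting at a line boundary
    return truncated + "\n\n... (truncated for token safety)"

def truncated(func):
    """Wrap an async tool function so its string result is always truncated."""
    @functools.wraps(func)
    async def wrapper(*args, **kwargs) -> str:
        return truncate_output(await func(*args, **kwargs))
    return wrapper

@truncated
async def noisy_tool() -> str:
    # Hypothetical tool that produces far more than MAX_OUTPUT_LENGTH characters.
    return "\n".join(f"line {i}" for i in range(5000))
```

In a real server the decorator would presumably sit between `@mcp.tool` and the function definition; how FastMCP derives the schema from a wrapped function is worth verifying before relying on this.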

4. Use the execute pattern for Opus compatibility

Claude Opus can fail silently on tools with complex Pydantic/JSON schema parameters, even while simple tools on the same server work fine. The fix: expose an execute tool with a single code: str parameter.

@mcp.tool()
async def my_execute(code: str) -> dict:
    """[Execute] Run Python with server functions available.
    Functions: search(query, top=5), graph_stats(). All async.
    Example: `results = await search("button"); print(results)`
    """
    import io
    from contextlib import redirect_stdout, redirect_stderr

    stdout_buf, stderr_buf = io.StringIO(), io.StringIO()
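    # _NAMESPACE is a module-level dict (defined elsewhere) of the server functions exposed to the code
    namespace = _NAMESPACE.copy()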

    with redirect_stdout(stdout_buf), redirect_stderr(stderr_buf):
        if "await " in code:
            wrapped = "async def __main__():\n"
            wrapped += "\n".join(f"    {line}" for line in code.split("\n"))
            wrapped += "\n    return locals()"
            exec(compile(wrapped, "<execute>", "exec"), namespace)
            result_locals = await namespace["__main__"]()
            namespace.update(result_locals)
        else:
            exec(compile(code, "<execute>", "exec"), namespace)

    return {"stdout": stdout_buf.getvalue(), "stderr": stderr_buf.getvalue()}

Why this works: the schema is trivial (code: str). Opus is very good at writing Python code; what fails is the serialization of structured parameters. Keep the original tools for Gemini/GPT compatibility.

5. Always use async subprocesses with stdin=DEVNULL

Synchronous subprocesses block the MCP heartbeat. On Windows, inheriting stdin causes deadlocks with the MCP stdio transport.

process = await asyncio.create_subprocess_exec(
    *cmd,
    stdin=asyncio.subprocess.DEVNULL,   # Critical: prevents Windows deadlock
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE,
)
stdout, stderr = await asyncio.wait_for(process.communicate(), timeout=300)
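
The snippet above leaves open what happens when the timeout fires. The sketch below wraps the same call in a helper that kills the child on timeout and reports the result in the SUCCESS/FAILED format used elsewhere in this guide. The helper name `run_command` and its error wording are assumptions, not part of the original skill.

```python
import asyncio
import sys

async def run_command(cmd: list[str], timeout: float = 300) -> str:
    """Run cmd without blocking the event loop and return a SUCCESS/FAILED string."""
    process = await asyncio.create_subprocess_exec(
        *cmd,
        stdin=asyncio.subprocess.DEVNULL,   # never inherit the MCP server's stdin
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, stderr = await asyncio.wait_for(process.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for only cancels communicate(); the child itself is still running.
        process.kill()
        await process.wait()  # reap the child so it does not linger
        return f"FAILED\nError: timed out after {timeout}s"
    out = stdout.decode(errors="replace")
    err = stderr.decode(errors="replace")
    if process.returncode != 0:
        return f"FAILED\nError: exit code {process.returncode}\n\n{err}{out}"
    return f"SUCCESS\n\n{out}"

if __name__ == "__main__":
    # Example: run a trivial Python one-liner with the current interpreter.
    print(asyncio.run(run_command([sys.executable, "-c", "print('hi')"])))
```

In a server, the returned string would still go through output truncation before being handed back to the model.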

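Quick Reference

Safe token budgets

Component	Limit	Reason
Tool descriptions (total)	<500 tokens	Loaded on every request
Single tool description	<50 tokens	Keep it concise
Command output	8,000 characters	About 2,000 tokens
Research output	12,000 characters	About 3,000 tokens, still safe
Tool count	<10 tools	LLM performance degrades beyond 10-20

Minimal FastMCP 2.x server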

from fastmcp import FastMCP

mcp = FastMCP("my-server", instructions="Brief description.")

@mcp.tool(name="my-tool", description="Concise description under 50 tokens.")
async def my_tool(action: str) -> str:
    return "SUCCESS\n\nResult here"

def run_server():
    mcp.run()  # stdio transport (default)

Consistent return format

# Success
return "SUCCESS\n\nOutput here..."

# Failure
return "FAILED\nError: reason\n\nPartial output..."

VS Code MCP configuration

Place this in .vscode/mcp.json (workspace level):

{
  "servers": {
    "my-server": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "python", "-m", "mypackage.mcp_server"],
      "cwd": "${workspaceFolder}"
    }
  }
}

Key fields:

"type": "stdio": required by VS Code. Without it, the server silently fails to start.
"cwd": must point to the project that owns the venv where botcore is installed. For consumer projects in a different repository, use an absolute path:
```
"cwd": "D:\\Github\\my-org\\my-bot"
```
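Use uv run (rather than plain python) so that the project's venv is reliably activated.

For cross-workspace MCP servers (for example, a bot defined in repo A and used from repo B), add the server configuration to repo B's .vscode/mcp.json and point cwd at repo A.

Claude Code configuration

Place this in ~/.claude/mcp.json: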

{
  "mcpServers": {
    "my-server": {
      "command": "python",
      "args": ["-m", "mypackage.mcp_server"]
    }
  }
}

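Checklist

Before releasing an MCP server, verify the following:

Tool count is 10 or fewer (count the @mcp.tool decorators)
Total description tokens are 500 or fewer (estimate about 4 characters per token)
All output is truncated to MAX_OUTPUT_LENGTH
All subprocess calls use asyncio.create_subprocess_exec
All subprocess calls set stdin=asyncio.subprocess.DEVNULL
If tools fail silently on Opus, an execute tool has been added
Return values consistently use the SUCCESS/FAILED prefix
FastMCP is pinned to >=2.13.0,<3

When to escalate

Context overflow persists even after following all of the guidelines above
You need to expose more than 20 distinct operations that resist consolidation
Integration with non-Python MCP clients or SDKs
Real-time streaming requirements beyond stdio/SSE
Opus failures that persist after adopting the execute pattern

License: MIT (quoted in full because the license is permissive)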

Details

Author: lushly-dev
Repository: lushly-dev/botcore
License: MIT
Last updated: 2026/4/3


Source: https://github.com/lushly-dev/botcore / License: MIT