Contents

  1. Why schema design governs MCP server quality
  2. Principle 1: name with verb + object, unify the convention per server
  3. Principle 2: description = "what" + "when to use", 200-400 chars
  4. Principle 3: keep inputSchema as flat as possible
  5. Principle 4: declare behavior with tool annotations
  6. Principle 5: return errors inside result, not as protocol errors
  7. Principle 6: large payloads go through URIs, not inlined
  8. Principle 7: always implement input validation at boundaries
  9. Pre-production checklist
  10. FAQ

Why schema design governs MCP server quality

By 2026, MCP server quality is no longer judged by "how many features it has" but by "how well the tool schema is designed." The reason is simple — from an agent's perspective, the tool definitions themselves live in the context window as part of the prompt.

Anthropic's engineering blog disclosed that "tool definitions consumed 134K tokens before optimization." After introducing MCP Tool Search (a lazy-loading mechanism), the cited reports put the remaining overhead at roughly 5K tokens and the reduction at about 85%. The two figures do not derive from each other (134K to 5K alone would be about a 96% cut), so they most likely come from different baselines. Industry surveys from MindStudio and others put a single tool definition at 100-500 tokens, a five-server, 58-tool setup at ~55K, and Jira alone at ~17K. Get schema design wrong and your server burns 100K+ tokens before the conversation even starts.
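The per-tool figures above can be turned into a quick budget check. The sketch below is a rough estimator, not a tokenizer: it assumes about 4 characters per token (a rule of thumb, not a figure from the sources cited in this article), and the function names and the sample tool are hypothetical.

```python
import json

# Assumption: ~4 characters per token. This is a rule of thumb only;
# use the host model's real tokenizer for actual budgeting.
CHARS_PER_TOKEN = 4


def estimate_tool_tokens(tool: dict) -> int:
    """Rough token estimate for one tool definition as serialized JSON."""
    serialized = json.dumps(tool, separators=(",", ":"))
    return max(1, len(serialized) // CHARS_PER_TOKEN)


def estimate_server_tokens(tools: list) -> int:
    """Total estimated context cost of a server's tool list."""
    return sum(estimate_tool_tokens(tool) for tool in tools)


# Hypothetical tool definition, used only to exercise the estimator.
sample_tool = {
    "name": "search_customers",
    "description": "Search customers by name or email.",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
    },
}

print(estimate_server_tokens([sample_tool] * 58))
```

Running this against a real tool list before and after trimming descriptions gives a relative measure of the savings, even though the absolute numbers are approximate.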

Three effects of good schema quality

(1) Token reduction — verbose descriptions are dramatically more expensive.
(2) Better tool selection accuracy — Anthropic engineering reports degradations like "tool selection 43% → 14%" when schema quality drops.
(3) Better host-app auto/manual approval handling — missing annotations cause 30% of Claude Connector Directory rejections.

Principle 1: name with verb + object, unify the convention per server

The official MCP spec doesn't force a naming convention, but the industry has converged on snake_case or camelCase, verb + object. From the LLM's view, verb-first names accelerate "what does this tool do?" classification.

The non-negotiable rule is consistency within a server. Mixing database_query and createInvoice in the same server invites the LLM to treat the two styles as separate categories of tool, and tool selection accuracy drops.

// ✅ Good (snake_case unified, verb + object)
[
  { "name": "search_customers" },
  { "name": "create_invoice" },
  { "name": "send_email" }
]

// ❌ Bad (mixed conventions, missing verbs)
[
  { "name": "customers" },        // no verb
  { "name": "InvoiceCreate" },    // PascalCase + reversed order
  { "name": "send-email-v2" }     // kebab-case + version suffix
]
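The naming rule is easy to enforce mechanically. The following is a minimal sketch of a name linter, assuming a small allow-list of leading verbs (KNOWN_VERBS is hypothetical and should be extended per domain). A regex cannot tell whether a word is really a verb, so the allow-list stands in for that judgment.

```python
import re

# verb_object / verbObject with at least two words.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)+$")
CAMEL_CASE = re.compile(r"^[a-z][a-z0-9]*([A-Z][a-z0-9]*)+$")

# Assumption: a small allow-list of leading verbs. Extend it for your domain.
KNOWN_VERBS = {"search", "get", "list", "fetch", "create", "update", "delete", "send"}


def convention_of(name: str) -> str:
    """Classify a tool name as snake_case, camelCase, or other."""
    if SNAKE_CASE.match(name):
        return "snake_case"
    if CAMEL_CASE.match(name):
        return "camelCase"
    return "other"


def lint_tool_names(names: list) -> list:
    """Return human-readable problems; an empty list means the names pass."""
    problems = []
    seen = set()
    for name in names:
        convention = convention_of(name)
        if convention == "other":
            problems.append(f"{name}: not a snake_case or camelCase verb + object name")
            continue
        seen.add(convention)
        # The first word is the lowercase run before the first "_" or capital.
        first_word = re.match(r"[a-z][a-z0-9]*", name).group(0)
        if first_word not in KNOWN_VERBS:
            problems.append(f"{name}: first word '{first_word}' is not a known verb")
    if len(seen) > 1:
        problems.append("mixed conventions: " + ", ".join(sorted(seen)))
    return problems
```

Running lint_tool_names over a server's tool list in CI catches both a stray camelCase name and a noun-first name such as database_query.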

Principle 2: description = "what" + "when to use", 200-400 chars

description must include both what the tool does AND when to use it. The LLM does tool selection from descriptions, so a feature-only "returns a list of customers" gets confused with similar tools.

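Target length is 1-3 sentences, 200-400 characters. Verbose multi-paragraph descriptions are recognized as the top token-waste driver. Condense these four elements: the action, input prerequisites, the expected output, and when to pick this tool over similar ones.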

// ✅ Good description (four elements condensed)
"description": "Search customers. Supports query/limit/offset. Returns max 100 results; pagination required. For a single customer by ID, use get_customer_by_id instead. Cannot create or update (read-only)."

// ❌ Bad description (feature-only, verbose)
"description": "This tool returns a list of customers from our comprehensive customer database management system, which has been built with industry-leading technology... (continues for 200+ more characters of generalities)"
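A length target like this is simple to check automatically. The sketch below is one possible lint, assuming the 200-400 character range from this section. The "when to use" check is a crude keyword heuristic of my own (the cue list is hypothetical) and will produce both false positives and false negatives.

```python
def lint_description(description: str, min_chars: int = 200, max_chars: int = 400) -> list:
    """Return problems found in a tool description; empty list means it passes."""
    # Collapse runs of whitespace so indentation does not inflate the count.
    text = " ".join(description.split())
    problems = []
    if len(text) < min_chars:
        problems.append(f"too short: {len(text)} chars (target {min_chars}-{max_chars})")
    elif len(text) > max_chars:
        problems.append(f"too long: {len(text)} chars (target {min_chars}-{max_chars})")
    # Heuristic only: look for wording that hints at selection guidance.
    cues = ("use ", "instead", "when ")
    if not any(cue in text.lower() for cue in cues):
        problems.append("no 'when to use' cue found")
    return problems
```

Treat the output as a prompt for human review, not as a gate: a description slightly outside the range can still be good if it covers all four elements.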

Principle 3: keep inputSchema as flat as possible

JSON Schema supports deep nesting and complex validation, but for MCP tool inputSchema the rule is "as flat as possible". The deeper the nesting: (1) more tokens, (2) more LLM cognitive load, (3) higher parse-failure rate.

When a complex object hierarchy looks necessary, two options:

// ❌ Bad (deep nesting, complex object at top level)
{
  "type": "object",
  "properties": {
    "filters": {
      "type": "object",
      "properties": {
        "date_range": {
          "type": "object",
          "properties": {
            "start": { "type": "string" },
            "end": { "type": "string" }
          }
        },
        "tags": { "type": "array", "items": {...} }
      }
    }
  }
}

// ✅ Good (flat, required vs. optional explicit)
{
  "type": "object",
  "properties": {
    "start_date": { "type": "string", "format": "date" },
    "end_date":   { "type": "string", "format": "date" },
    "tags":       { "type": "array", "items": { "type": "string" } }
  },
  "required": ["start_date", "end_date"]
}
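Flatness can be measured before a schema ships. The sketch below computes how many levels of object properties a schema nests, so that a flat schema like the good example scores 1. It handles only the "properties" and "items" keywords; the function name is hypothetical and composition keywords such as oneOf or $ref are deliberately out of scope.

```python
def schema_depth(schema) -> int:
    """Count nested levels of object properties in a JSON Schema fragment.

    A schema whose properties are all scalars (or arrays of scalars)
    has depth 1; each additional nested object adds 1.
    """
    if not isinstance(schema, dict):
        return 0
    if schema.get("type") == "array":
        # Arrays do not add a level themselves; look at the item schema.
        return schema_depth(schema.get("items", {}))
    properties = schema.get("properties")
    if not isinstance(properties, dict) or not properties:
        return 0
    return 1 + max(schema_depth(child) for child in properties.values())


flat = {
    "type": "object",
    "properties": {
        "start_date": {"type": "string", "format": "date"},
        "end_date": {"type": "string", "format": "date"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["start_date", "end_date"],
}
print(schema_depth(flat))
```

A simple policy is to fail review for any inputSchema deeper than a chosen limit and require a written justification for exceptions.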

Principle 4: declare behavior with tool annotations

Tool annotations, added in 2025 and standardized through 2026, are metadata that declare how a tool behaves. The four sub-fields are readOnlyHint / destructiveHint / idempotentHint / openWorldHint — and missing them accounts for 30% of Claude Connector Directory rejections according to public data.

Annotation      | Default | Tools that should set true                | Purpose
readOnlyHint    | false   | search / get / list / fetch               | Declares no state change
destructiveHint | true    | create / update / delete / send           | Declares hard-to-undo operations
idempotentHint  | false   | Same input → same result, every time      | Declares retry safety
openWorldHint   | true    | Anything reaching external API/DB/network | Declares external scope

// Search tool (read-only)
{
  "name": "search_customers",
  "description": "...",
  "inputSchema": {...},
  "annotations": {
    "readOnlyHint": true,
    "destructiveHint": false,
    "idempotentHint": true,
    "openWorldHint": true
  }
}

// Delete tool (destructive)
{
  "name": "delete_invoice",
  "annotations": {
    "readOnlyHint": false,
    "destructiveHint": true,
    "idempotentHint": false,
    "openWorldHint": true
  }
}

⚠️ Annotations are hints, not contracts

Annotations are informational signals, not enforceable guarantees. A hint says "the tool claims to behave this way" — the system does not guarantee it. MCP chose hints over contracts because enforcing contracts across untrusted servers is impractical. The client should still run its own safety checks.

A subtle pitfall: a tool named "search" that also writes the query to an analytics DB must not set readOnlyHint:true. Even if the primary purpose is reading, a write-side effect disqualifies the read-only label. Similarly, "destructive" isn't only "delete data" — overwriting a file, revoking a token, closing an issue all count as destructive if they're hard to undo.
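On the client side, the hints and their defaults might feed an approval decision along these lines. This is a sketch under stated assumptions: the policy names ("auto_approve", "confirm", "confirm_with_warning") and the server_trusted flag are hypothetical, and the defaults are the ones listed in the table above. Because annotations are only hints, an untrusted server's claims are ignored and the worst case is assumed.

```python
# Defaults applied when a tool omits an annotation.
ANNOTATION_DEFAULTS = {
    "readOnlyHint": False,
    "destructiveHint": True,
    "idempotentHint": False,
    "openWorldHint": True,
}


def effective_annotations(tool: dict) -> dict:
    """Merge a tool's declared annotations over the defaults."""
    merged = dict(ANNOTATION_DEFAULTS)
    merged.update(tool.get("annotations") or {})
    return merged


def approval_policy(tool: dict, server_trusted: bool = False) -> str:
    """Pick a (hypothetical) approval mode from a tool's hints."""
    if not server_trusted:
        # Hints are not contracts: assume the worst for untrusted servers.
        return "confirm_with_warning"
    hints = effective_annotations(tool)
    if hints["readOnlyHint"]:
        return "auto_approve"
    if hints["destructiveHint"]:
        return "confirm_with_warning"
    return "confirm"
```

Note that a tool with no annotations at all falls back to destructiveHint true, which is why missing annotations lead to warning dialogs even for harmless tools.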

Principle 5: return errors inside result, not as protocol errors

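MCP error design has a big trap. Tool execution errors (a failed API call, a business-logic failure) should be returned inside the result object as isError: true plus a message, NOT as MCP protocol-level errors (the JSON-RPC error field). Protocol-level errors remain appropriate for problems with the request itself, such as an unknown tool name or a malformed request.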

The reason is clear: if errors come back as protocol errors, the LLM can't see the error content — it only knows "something failed". If errors come back inside result, the LLM can read "what went wrong" and attempt recovery.

// ❌ Bad (protocol-level error)
// The LLM can't see this — it just knows "it failed"
{
  "jsonrpc": "2.0",
  "id": 1,
  "error": { "code": -32603, "message": "Database connection failed" }
}

// ✅ Good (isError inside result + structured message)
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "isError": true,
    "content": [{
      "type": "text",
      "text": "Database connection failed. Retry-After: 5 seconds. This is a transient error - the agent should retry with exponential backoff (recommend 3 attempts)."
    }]
  }
}
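In server code, this pattern usually becomes a thin wrapper around each tool handler. The sketch below is SDK-agnostic and makes several assumptions: call_tool_safely and error_result are hypothetical names, handlers are plain functions that return text, and the mapping from exception type to recovery advice is illustrative only.

```python
from typing import Any, Callable, Dict


def error_result(message: str) -> Dict[str, Any]:
    """Build a tool result that the LLM can read and act on."""
    return {"isError": True, "content": [{"type": "text", "text": message}]}


def call_tool_safely(handler: Callable[..., str], arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Run a handler and convert failures into isError results."""
    try:
        text = handler(**arguments)
    except (ConnectionError, TimeoutError) as exc:
        # Transient: tell the agent that retrying is reasonable.
        return error_result(f"Transient failure: {exc}. Retry after a short delay.")
    except ValueError as exc:
        # Caller error: tell the agent to change its arguments.
        return error_result(f"Invalid input: {exc}. Fix the arguments before retrying.")
    except Exception:
        # Unknown failure: avoid leaking internals such as stack traces.
        return error_result("Unexpected server error. Retrying is unlikely to help.")
    return {"isError": False, "content": [{"type": "text", "text": text}]}
```

The returned dictionary is what goes in the "result" field of the JSON-RPC response, so the transport layer still reports success while the LLM sees the failure details.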

Principle 6: large payloads go through URIs, not inlined

A heavily emphasized 2026 best practice: return handles/URIs for large payloads, and do not inline megabytes of data into the result body. Inlining dumps that data straight into the LLM's context window and burns tokens for no reason.

Typical patterns to watch: file retrieval, log retrieval, dataset queries. These balloon with record counts. The MCP spec includes the Resources concept — return a URI reference and fetch only when needed.

// ❌ Bad (100K log lines inlined)
{
  "result": {
    "content": [{
      "type": "text",
      "text": "[2026-05-19 00:00] log line 1... (100,000 more lines, multiple MB) ..."
    }]
  }
}

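// ✅ Good (URI returned as a resource link, fetched on demand)
// "name" is an illustrative value; check the content types supported by your spec revision
{
  "result": {
    "content": [{
      "type": "resource_link",
      "uri": "logs://2026-05-19/full",
      "name": "full-log-2026-05-19",
      "mimeType": "text/plain",
      "description": "100K log lines (5.2MB). Fetch with resources/read only when needed."
    }]
  }
}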
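The choice between inlining and linking can be made by size at the point where the result is built. A minimal sketch follows, assuming the resource_link content type from the 2025-06-18 revision of the MCP spec. The 16 KB threshold, the function name, and its parameters are hypothetical, and the server is assumed to expose the same URI through its resources interface.

```python
# Hypothetical threshold: payloads above this are returned as a link.
INLINE_LIMIT_BYTES = 16_384


def build_text_result(text: str, uri: str, name: str) -> dict:
    """Inline small text payloads; return a resource link for large ones."""
    size = len(text.encode("utf-8"))
    if size <= INLINE_LIMIT_BYTES:
        return {"content": [{"type": "text", "text": text}]}
    return {
        "content": [{
            "type": "resource_link",
            "uri": uri,
            "name": name,
            "mimeType": "text/plain",
            "description": f"{size} bytes of text. Fetch with resources/read only when needed.",
        }]
    }
```

Putting the size in the description lets the agent decide whether fetching the full payload is worth the context cost.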

Principle 7: always implement input validation at boundaries

Declaring validation in JSON Schema is not enough. The server must also implement validation at the code boundary. Agents will sometimes call tools ignoring the schema, and malicious users can send unsafe input through an agent.
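As an illustration of validation at the code boundary, the sketch below re-checks the flat date-range schema from Principle 3 using only the standard library. The limits (MAX_TAGS, MAX_TAG_LENGTH) and the function names are hypothetical; a production server might delegate the structural part to a JSON Schema validator and keep only the cross-field checks by hand.

```python
from datetime import date

ALLOWED_KEYS = {"start_date", "end_date", "tags"}
MAX_TAGS = 20        # hypothetical limit
MAX_TAG_LENGTH = 64  # hypothetical limit


def _parse_date(args: dict, key: str) -> date:
    """Require a string field and parse it as an ISO 8601 date."""
    value = args.get(key)
    if not isinstance(value, str):
        raise ValueError(f"{key} is required and must be a string")
    try:
        return date.fromisoformat(value)
    except ValueError:
        raise ValueError(f"{key} must be a valid ISO 8601 date") from None


def validate_search_args(args) -> dict:
    """Validate raw tool arguments and return a normalized copy."""
    if not isinstance(args, dict):
        raise ValueError("arguments must be an object")
    unknown = set(args) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    start = _parse_date(args, "start_date")
    end = _parse_date(args, "end_date")
    if start > end:
        # A cross-field rule that the schema alone does not express.
        raise ValueError("start_date must not be after end_date")
    tags = args.get("tags", [])
    if not isinstance(tags, list) or len(tags) > MAX_TAGS:
        raise ValueError(f"tags must be a list of at most {MAX_TAGS} items")
    for tag in tags:
        if not isinstance(tag, str) or not 0 < len(tag) <= MAX_TAG_LENGTH:
            raise ValueError(f"each tag must be a string of 1-{MAX_TAG_LENGTH} characters")
    return {"start_date": start, "end_date": end, "tags": tags}
```

Raising ValueError with a specific message pairs well with returning the failure as an isError result, since the agent can read the message and correct its call.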

Mandatory validation checklist:

Pre-production checklist

Run through this before any new MCP server or major tool update lands in production:

FAQ

How many tokens does an MCP tool definition typically consume?

100-500 tokens per tool, depending on description verbosity. A 10-tool mid-size server typically runs 1,500-3,000 tokens before the conversation starts (1,000-5,000 at the extremes of that per-tool range); a five-server, 58-tool setup hits ~55K, with Jira alone around 17K. Anthropic's engineering blog disclosed pre-optimization figures of 134K tokens.

Should names be snake_case or camelCase?

The spec permits both. Industry standard is "verb + object" and "consistent across the server". Mixing conventions in one server degrades tool selection accuracy, so consistency matters more than which convention you pick.

How long should descriptions be?

Both "what" and "when to use" are required. Target 1-3 sentences, 200-400 characters. Condense action + input prerequisite + output expectation + when-to-pick into that space. Verbose multi-paragraph descriptions are recognized as the top token waste driver in 2026.

What happens if I skip tool annotations?

30% of Claude Connector Directory submission rejections cite missing annotations. Without them, the client can't risk-classify tools: read-only tools get unnecessary confirmation dialogs, destructive tools may run with no warning, and auto-approve optimization breaks.

Should I return errors as protocol errors or in result.isError?

Use result.isError = true + structured message. Protocol-level errors hide content from the LLM and prevent recovery. Errors inside result let the LLM read what failed and decide whether to retry.

Data disclosure & disclaimer

Token figures in this article come from Anthropic engineering blog posts (anthropic.com/engineering/code-execution-with-mcp, anthropic.com/engineering/advanced-tool-use), the MindStudio article "Claude Code MCP Servers and Token Overhead", Webfuse's "MCP Cheat Sheet (2026)", and techbuddies.io's "How Claude Code's New MCP Tool Search Slashes Context Bloat". The tool annotations spec and the "30% rejection rate from missing annotations" claim are sourced from the MCP official blog (blog.modelcontextprotocol.io/posts/2026-03-16-tool-annotations/) and sunpeak.ai's "Claude Connector Directory Submission". Schema design principles are aggregated from modelcontextprotocol.info docs, thenewstack.io's "15 Best Practices for Building MCP Servers in Production", and apxml.com's "Tool Definition Schema". The code samples are minimal illustrations — always verify against the official SDK spec for your target language during implementation.