Are "MCP Tool Definitions Eat 40-50% of Context" and "MCP Is Dead" Really True? — Fact-Checking the Perplexity Defection 2026

Q: Is 'MCP tool definitions consume 40-50% of context' really true?

⚠️ Conditionally true. Merge CTO Gil Feig estimates tool metadata at 40-50% of available context for typical multi-server deployments, and multiple independent measurements support the estimate. GitHub's official MCP server exposes 94 tools at roughly 17,600 tokens, and Atlassian's official MCP for Jira plus Confluence runs about 10,000 tokens. Combining several large MCP servers easily reaches 30,000+ tokens of metadata before any work begins. But the figure is configuration-dependent. Curate down to one or two servers and you stay in the low thousands of tokens, well below 40-50%. The number should be read as a range for multi-server setups, not as an absolute property of MCP.

Q: Is 'MCP is dead' or 'the standard has died' really true?

❌ Inaccurate. Perplexity CTO Denis Yarats announced at Ask 2026 on March 11, 2026 that the company is moving away from MCP internally toward APIs and CLIs, but this is one company's decision, not an industry retreat. Since Anthropic open-sourced MCP in November 2024, it has been adopted by OpenAI, Google DeepMind, Microsoft, Cursor, Atlassian, ServiceNow, Salesforce, Linear, GitHub, and many other enterprise platforms. The official MCP repository is actively working on SEP-1576 to reduce schema redundancy and add dynamic tool selection; Atlassian ships mcp-compressor, and Anthropic provides code mode — both attack the bloat problem directly. The protocol is not dying. It is being fixed.

Q: Does tool-definition bloat actually hurt agent performance?

✅ Yes, and it is a different problem from token consumption. Multiple reports observe tool selection accuracy collapsing from about 43% to under 14% when agents face bloated tool sets: roughly a 3x degradation, with the agent picking the wrong tool about seven times out of eight. This is a cognitive-load problem, independent of context-window size: even with plenty of context left, the LLM's selection heuristics break down when the menu is too long. Token compression alone does not solve it; dynamic tool loading does.

Q: How do you reduce MCP tool-definition token bloat?

Four levers with measured results: (1) Anthropic code mode — calling tools via generated code instead of direct invocation, reported up to 98.7% context-overhead reduction. (2) Atlassian mcp-compressor — reduces tool description overhead by 70-97% without changing how the agent invokes tools. (3) Server curation — keep only the one or two MCP servers you actually need; everything else is dead weight. (4) Compact modes — KanseiLink and a growing set of MCP servers ship a compact:true mode that returns minimal fields. The problem is not the MCP protocol itself but the 'connect everything just in case' default.

Q: So how should companies actually use MCP?

The right question is not 'MCP or not MCP' but 'how do we use MCP economically.' Practical guidance: (1) Only connect servers your workload actually needs; remove the rest. (2) For servers with large tool catalogs, evaluate mcp-compressor-style compression or Cloudflare/Anthropic code mode. (3) On the vendor side, reduce tool count, simplify descriptions, and support dynamic tool loading to raise AEO (Agent Engine Optimization) quality. Perplexity's choice to use APIs and CLIs for a specific use case is reasonable, but it is not evidence that 'MCP is dead' — it is evidence that for one workload, an alternative path was more economical.

The claims that lit the fire
Claim ①: "Tool definitions eat 40-50%"
Claim ②: "MCP is dead"
Claim ③: "Too many tools breaks accuracy"
We know how to fix it — four levers that work
What this means for Japanese SaaS
FAQ

The claims that lit the fire

On March 11, 2026, at the Ask 2026 conference, Perplexity CTO Denis Yarats announced that the company was internally moving away from Anthropic's MCP toward APIs and CLIs. He gave two reasons: (1) MCP tool definitions consume 40-50% of available context before the agent does any work, and (2) authentication flows create friction when connecting to multiple services. The clip spread on Threads and X within hours, and "MCP is dead" / "the standard has died" reactions piled up.

Around the same time, Merge CTO Gil Feig independently estimated that "tool metadata accounts for 40-50% of available context in typical deployments." Developer reports surfaced too: one engineer measured roughly 7,000 tokens already consumed before the first prompt with just four MCP servers connected; a colleague on a heavier configuration was burning ~66,000 tokens (about a third of Claude Sonnet's 200k context window) before saying anything.

The viral discourse compresses to three claims. We check each against measured data.

Claim ①: "Tool definitions eat 40-50%"

Verdict: ⚠️ Conditionally true

True for typical multi-server setups. Not true for single-server, curated configurations.

What is actually consuming tokens? Several independent measurements converge on similar numbers.

MCP server / configuration	Tool count	Definition tokens	% of 200k window
GitHub official MCP	94	~17,600	~8.8%
Atlassian official MCP (Jira + Confluence)	—	~10,000	~5.0%
Multiple large servers connected	—	30,000+	~15%+
4-server lightweight config (typical)	—	~7,000	~3.5%
Heavy config (reported extreme)	—	~66,000	~33%

"40-50%" should be read as a range for a class of configurations, not an absolute property of MCP. Stack several large servers and you arrive there easily — GitHub alone is 94 tools and ~17.6k tokens, so combining it with Atlassian and Slack will cross 30k. Curate down to one or two servers — or use a server with a compact mode like KanseiLink's compact: true — and you sit in the low thousands.
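To make the percentages concrete, the short Python sketch below recomputes each row's share of a 200k window from the token counts quoted above, then works out how many definition tokens it would take to reach 40-50% of that window. The only assumption is the 200k reference window, which is the one the table already uses.

```python
# Token counts are the figures quoted in the table above.
# The 200k window is the same reference window the table uses.
WINDOW_TOKENS = 200_000

configs = {
    "GitHub official MCP": 17_600,
    "Atlassian official MCP (Jira + Confluence)": 10_000,
    "Multiple large servers connected": 30_000,
    "4-server lightweight config": 7_000,
    "Heavy config (reported extreme)": 66_000,
}


def share_of_window(tokens: int, window: int = WINDOW_TOKENS) -> float:
    """Return definition tokens as a percentage of the context window."""
    return 100 * tokens / window


for name, tokens in configs.items():
    print(f"{name}: {share_of_window(tokens):.1f}% of window")

# Definition tokens needed to fill 40% and 50% of the window (integer math).
needed_low = WINDOW_TOKENS * 40 // 100
needed_high = WINDOW_TOKENS * 50 // 100
print(f"40-50% of the window = {needed_low:,} to {needed_high:,} tokens")
```

On a 200k window, 40-50% corresponds to 80,000-100,000 tokens of definitions. That is above even the reported 66,000-token extreme, which suggests the estimate refers to heavier stacks or to a smaller effective context budget; that reading is ours, not something the sources state.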

How to read this

"MCP eats 40-50%" is true in the "connect everything just in case" sense. It is not a fundamental property of the protocol; it is a property of the operating choice. The same server can consume an order of magnitude fewer tokens depending on what you select and how.

Claim ②: "MCP is dead"

Verdict: ❌ Inaccurate

One company's choice is not an industry retreat. The standard is being patched, not buried.

The "MCP is dead / standard has died" reaction extrapolates a single company's internal call into an industry trend. The actual situation:

Adoption keeps spreading: Since Anthropic open-sourced MCP in November 2024, it has been adopted by OpenAI, Google DeepMind, Microsoft, Cursor, Atlassian, ServiceNow, Salesforce, Linear, GitHub, and a long tail of enterprise platforms. In December 2025 it was donated to the Linux Foundation's Agentic AI Foundation (AAIF), moving toward vendor-neutral standardization.
The fix is being designed in the open: The official MCP repo is working on SEP-1576 (Mitigating Token Bloat in MCP: Reducing Schema Redundancy and Optimizing Tool Selection), which proposes standardizing both schema-redundancy reduction and dynamic tool selection.
Compression tools already exist: Atlassian ships mcp-compressor with reported 70-97% tool-description compression without changing how the agent calls tools. Anthropic's MCP guidance includes code mode, which has been measured to cut context overhead by up to 98.7%.
The defection is one company, one use case: Perplexity's choice reflects a specific product context (search) where API + CLI was a more economical path. That is a workload-level optimization, not a verdict on the protocol.

The accurate framing is not "the protocol is dying" but "the 'connect everything' default has shown its costs, and the ecosystem is responding".

Claim ③: "Too many tools breaks accuracy"

Verdict: ✅ True — and it is a separate problem

This is a cognitive-load issue independent of context-window size. Compression alone does not solve it.

Beneath the token-cost discussion, there is a deeper problem. Multiple measurements observe that tool selection accuracy collapses when the menu of tools gets long:

The cognitive cost of menu bloat

Measure	Value
Tool selection accuracy on small-to-mid sets	~43%
Accuracy on bloated sets (same conditions)	under 14%
Degradation	~3x (wrong tool 7 of 8 times)

That is, you can compress tool definitions to save tokens and still pick the wrong tool. Even if 90% of your 200k window is empty, a long menu breaks the LLM's selection heuristics. This is not a context-window problem.
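The reported figures can be cross-checked with simple arithmetic. This sketch takes the 43% and 14% numbers at face value (14% is an upper bound, since the reports say "under 14%") and derives the degradation factor and the error rate.

```python
from fractions import Fraction

baseline_accuracy = Fraction(43, 100)  # small-to-mid tool sets, as reported
bloated_accuracy = Fraction(14, 100)   # upper bound on bloated sets ("under 14%")

# How many times worse selection gets when the menu is bloated.
degradation = baseline_accuracy / bloated_accuracy

# Share of calls that pick the wrong tool at 14% accuracy.
error_rate = 1 - bloated_accuracy

# Accuracy implied by "wrong tool 7 of 8 times".
implied_accuracy = 1 - Fraction(7, 8)

print(f"degradation factor: {float(degradation):.2f}x")
print(f"error rate at 14% accuracy: {float(error_rate):.0%}")
print(f"accuracy implied by 7-of-8 wrong: {float(implied_accuracy):.1%}")
```

The ratio 43/14 is about 3.07, which matches the "roughly 3x" description. A 14% accuracy means an 86% error rate, and "7 of 8 wrong" corresponds to 12.5% accuracy, which is consistent with "under 14%".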

The direction is clear: load only the tools you need, dynamically. That is exactly what SEP-1576 addresses.
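As an illustration of the idea only, here is a minimal Python sketch of dynamic tool loading: the agent is shown a small subset of the catalog chosen per task instead of every definition. The catalog, the `Tool` type, and the keyword-overlap scoring are all hypothetical; this is not the mechanism proposed in SEP-1576, and a real system would more likely use embeddings or server-side filtering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


# Hypothetical catalog standing in for several connected MCP servers.
CATALOG = [
    Tool("create_issue", "Create a new issue in a repository"),
    Tool("list_pull_requests", "List open pull requests in a repository"),
    Tool("search_pages", "Search wiki pages by keyword"),
    Tool("send_message", "Send a chat message to a channel"),
]


def tokenize(text: str) -> set[str]:
    """Lowercase words, treating underscores as spaces and dropping punctuation."""
    return {w.strip(".,?!").lower() for w in text.replace("_", " ").split()}


def select_tools(task: str, catalog: list[Tool], limit: int = 2) -> list[Tool]:
    """Return up to `limit` tools whose name or description shares words with the task."""
    task_words = tokenize(task)
    scored = []
    for tool in catalog:
        overlap = len(task_words & tokenize(f"{tool.name} {tool.description}"))
        if overlap:
            scored.append((overlap, tool))
    # Highest overlap first; ties keep catalog order because the sort is stable.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]


selected = select_tools("Create an issue for the login bug", CATALOG)
print([tool.name for tool in selected])
```

Only the selected definitions would be placed in the model's context, so both the token cost and the length of the menu shrink together.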

We know how to fix it — four levers that work

The right question is not "use MCP or drop MCP" but "how do we use MCP economically." Four levers have measured effects:

① Anthropic code mode — Call tools via generated code rather than direct invocation. Reported up to 98.7% context-overhead reduction. See KanseiLink's prior verification of Cloudflare's code-mode numbers for more.
② mcp-compressor family — Atlassian's mcp-compressor reduces tool description overhead by 70-97% without changing call semantics. Retrofits to existing servers.
③ Server curation — Connect only the one or two servers you actually need. The single biggest gain is usually just removing servers that are along for the ride.
④ Compact modes — Use minimal-field response modes such as KanseiLink's search_services({compact: true}). A growing number of MCP servers ship a similar option.
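To show what a compact mode buys, the sketch below implements a stand-in `search_services` function with a `compact` flag and compares the size of the two responses. The records, field names, and filtering logic are hypothetical and are not KanseiLink's implementation; only the idea of returning minimal fields comes from the text above.

```python
import json

# Hypothetical service records; field names are illustrative only.
SERVICES = [
    {
        "id": "svc-001",
        "name": "Example Accounting",
        "category": "accounting",
        "description": "Cloud accounting with invoicing, expenses and reports.",
        "auth": {"type": "oauth2", "scopes": ["read", "write"]},
        "docs_url": "https://example.com/docs/accounting",
    },
    {
        "id": "svc-002",
        "name": "Example Chat",
        "category": "messaging",
        "description": "Team chat with channels, threads and file sharing.",
        "auth": {"type": "api_key", "scopes": []},
        "docs_url": "https://example.com/docs/chat",
    },
]

# Fields kept in compact mode (an assumption for this sketch).
COMPACT_FIELDS = ("id", "name", "category")


def search_services(query: str, compact: bool = False) -> list[dict]:
    """Return matching services, with only minimal fields when compact is True."""
    needle = query.lower()
    matches = [
        s for s in SERVICES
        if needle in s["name"].lower() or needle in s["category"].lower()
    ]
    if compact:
        return [{key: s[key] for key in COMPACT_FIELDS} for s in matches]
    return matches


full = json.dumps(search_services("accounting"))
compact = json.dumps(search_services("accounting", compact=True))
saving = 100 * (1 - len(compact) / len(full))
print(f"full: {len(full)} chars, compact: {len(compact)} chars, saving: {saving:.0f}%")
```

Character counts are only a proxy for tokens, but the direction is the same: the agent can fetch the full record later for the one service it actually picks.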

Editor's view, May 2026

The debate is shifting from "MCP vs API" to "MCP used wisely vs MCP used naively." Three layers each have work to do: the protocol (SEP-1576), the servers (compact modes, dynamic loading), and the operators (curation). Together they can shrink the 40-50% down to single-digit percentages.

What this means for Japanese SaaS

For Japanese SaaS vendors designing or maintaining MCP servers, the takeaways from the Perplexity moment are concrete:

Cut tool count — Do not expose "one tool per API endpoint." Aggregate to the granularity an agent actually needs. GitHub's 94-tool surface is the cautionary example.
Tighten tool descriptions — Strip markdown noise, repetition, and decorative formatting. Every word lives in the agent's context.
Offer compact modes — Provide full/compact options on list-type tools. KanseiLink itself does this — compact: true drops fields and we have measured 91-96% token savings versus web-fetching docs.
Prepare for dynamic loading — Track SEP-1576 and design your server so subsets of tools can be loaded by use-case context, not the entire catalog at boot.

Before claiming "MCP-ready," measure how your server looks economically from the agent's side. In the AEO (Agent Engine Optimization) era, that measurement is the new vendor responsibility.
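One way to start measuring is sketched below in Python: serialize the tool definitions your server advertises, estimate their token cost, and compare before and after tightening descriptions. The four-characters-per-token ratio is a rough heuristic rather than a real tokenizer, and the `create_invoice` tool and `tighten` helper are hypothetical; the `name`, `description`, and `inputSchema` fields follow the MCP tool definition shape.

```python
import json

CHARS_PER_TOKEN = 4  # rough heuristic; real tokenizers vary by model


def estimate_tokens(tool_definitions: list[dict]) -> int:
    """Estimate the token cost of tool definitions from their serialized length."""
    serialized = json.dumps(tool_definitions, ensure_ascii=False)
    return len(serialized) // CHARS_PER_TOKEN


def tighten(description: str) -> str:
    """Strip bold markers, collapse whitespace, and keep only the first sentence."""
    plain = " ".join(description.replace("**", "").split())
    first_sentence = plain.split(". ")[0]
    return first_sentence if first_sentence.endswith(".") else first_sentence + "."


# Hypothetical tool definition with a noisy, repetitive description.
tools_before = [
    {
        "name": "create_invoice",
        "description": (
            "**Creates** a new invoice for a customer.   Use this tool whenever "
            "the user wants to create an invoice. This tool creates invoices."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
        },
    }
]

tools_after = [
    {**tool, "description": tighten(tool["description"])} for tool in tools_before
]

print("before:", estimate_tokens(tools_before), "estimated tokens")
print("after: ", estimate_tokens(tools_after), "estimated tokens")
```

For published numbers, swap the heuristic for the tokenizer of the model you target, and run the estimate against the full list your server actually returns.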

FAQ

Is "MCP tool definitions consume 40-50% of context" really true?

⚠️ True for typical multi-server setups, not for curated single-server use. GitHub official MCP is ~17,600 tokens for 94 tools; Atlassian (Jira + Confluence) is ~10,000; combining large servers easily exceeds 30,000+. Heavy configurations reach ~66,000 tokens (~33% of a 200k window) before any work. Drop to one or two servers and you sit in the low thousands.

Is "MCP is dead" / "the standard has died" really true?

❌ Inaccurate. Perplexity's defection is one company's decision. Since November 2024, MCP has been adopted by OpenAI, Google DeepMind, Microsoft, Cursor, Atlassian, ServiceNow, Salesforce, Linear, GitHub, and many others; in December 2025 it was donated to the Linux Foundation's Agentic AI Foundation (AAIF). The MCP repo itself is addressing the bloat problem via SEP-1576.

Does tool-definition bloat actually hurt agent performance?

✅ Yes — independently of the token-cost issue. Reports observe tool-selection accuracy collapsing from ~43% to under 14% on bloated sets (3x degradation, wrong tool 7 of 8 times). Compression alone does not solve this; dynamic tool loading does.

How do you reduce MCP tool-definition token bloat?

Four levers: (1) Anthropic code mode — up to 98.7% reduction. (2) Atlassian mcp-compressor — 70-97% tool-description compression. (3) Server curation — keep only the one or two servers you actually need. (4) Compact modes — KanseiLink and other servers offer compact: true to return minimal fields.

So how should companies actually use MCP?

The right question is "how to use MCP economically," not "use or drop." (1) Connect only needed servers; remove the rest. (2) For large catalogs, evaluate mcp-compressor or code mode. (3) Vendors should reduce tool count and support dynamic loading. Perplexity's choice is reasonable for its workload — but it is not evidence that "MCP is dead."

Sources, Data Disclosure & Disclaimer

Claims and figures in this article are based on primary and secondary sources available as of 2026-05-18. Principal sources: Perplexity CTO Denis Yarats's remarks at Ask 2026 (2026-03-11, reported via Threads and multiple outlets); Merge CTO Gil Feig's "40-50%" tool-metadata estimate; the GitHub official MCP server's 94 tools / ~17,600-token measurement; the Atlassian official MCP (Jira + Confluence) ~10,000-token measurement; the ~66,000-token heavy-config developer report; the 43%→14% tool-selection accuracy degradation observation; Anthropic's code-mode "up to 98.7% reduction" report; Atlassian mcp-compressor's 70-97% reduction claim; and the official MCP repository's SEP-1576 issue. All measurements are snapshots and shift with server versions and connection configurations. "40-50%" is an estimated range for typical multi-server setups, not an absolute property of MCP. Our verdict on the "MCP is dead" claim is an analytical judgment based on publicly observable signals (adoption, SEP progress, compressor availability) as of May 2026 and is not a forward-looking guarantee about market dynamics.

Related Reads

Cloudflare Code Mode's "99.9% Token Reduction" — Is It Real? Untangling the 81% and 99.9% Numbers

96% Agent Token Cost Reduction, Proven — KanseiLink Optimization Data 2026

"$150K Integration Savings", "92% Token Cut" — Are MCP's Big Claims Real? KanseiLink Verifies

What is MCP? — How Model Context Protocol Transforms SaaS Integration in the AI Era
