Contents
- Why compare MCP hosting costs now
- Two scenarios — 1M and 10M calls per month
- Cloudflare Workers — $5 flat plus free egress, the cost king
- AWS Lambda — permanent free tier and a 20% ARM discount
- Vercel Functions — great DX, but three-axis billing
- Railway — Docker containers with usage-based pricing
- Fly.io — edge containers after the free tier removal
- Selection flow by use case
- FAQ
Why compare MCP hosting costs now
2026 is the year remote MCP servers (no client-side runtime, hit via HTTPS) went mainstream. Dropbox, Box, Microsoft 365, Cloudflare, AWS, Pipedream — every major vendor has now shipped MCP servers hosted on their own platform. At the same time, more SaaS companies are building and offering their own MCP servers, which makes "where do we deploy it" a real decision.
This article looks at MCP-specific workload characteristics — short, high-frequency, I/O-bound, and cold-start sensitive — and lays out five platforms' pricing as of May 2026 side by side.
"Just pick the cheapest" is risky. MCP servers have an unusual load profile: zero cost when idle, and a tight latency budget when called — anything slower than a few dozen milliseconds shows up in the agent's UX. Serverless (Workers / Lambda / Vercel) is the default winner, but persistent connections or long-running tasks push you to containers (Railway / Fly.io).
Two scenarios — 1M and 10M calls per month
To make the comparison concrete, we assume two workloads with a typical MCP-server profile: 50ms CPU and 128MB memory per call.
- Scenario A: 1M calls/month (avg 0.4 req/s, agents for ~30–100 users)
- Scenario B: 10M calls/month (avg 4 req/s, agents for ~500–1,000 users)
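The average request rates quoted for the two scenarios can be checked with a short sketch. It assumes a 30-day month (an assumption, not stated in the scenarios) and reports only the mean; real agent traffic is likely to be burstier than this average suggests.

```python
# Average request rate implied by a monthly call volume.
# Assumption: a 30-day month. Peaks will sit above this mean.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds


def average_rps(calls_per_month: int) -> float:
    """Return the mean requests per second over a 30-day month."""
    return calls_per_month / SECONDS_PER_MONTH


for name, calls in (("A", 1_000_000), ("B", 10_000_000)):
    print(f"Scenario {name}: {average_rps(calls):.2f} req/s")
```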
| Platform | Free tier | Scenario A / mo | Scenario B / mo | Cold start |
|---|---|---|---|---|
| Cloudflare Workers Paid | None ($5/month minimum) | $5 | $5 | <5ms |
| AWS Lambda (ARM Graviton2) | Always-free: 1M req + 400k GB-sec/mo | $0 (in free tier) | $1.80–$2.50 | 280ms avg |
| Vercel Functions (Pro) | 1M invocations (Hobby) | $20 (Pro base) | $20 + ~$5.40 overage | 100–200ms |
| Railway (Hobby/Pro) | $5 / $20 usage credit included | $5 (small always-on) | $20–$50 | Always-on (seconds on scale-to-zero) |
| Fly.io | $5 trial credit only | $2–$5 (shared-cpu-1x) | $10–$30 (multi-Machine) | Always-on |
*AWS Lambda Scenario B math: 10M − 1M free tier = 9M reqs × $0.20/1M = $1.80. GB-seconds: 10M × 50ms × 0.125GB = 62,500 GB-seconds, which fits inside the permanent 400,000 GB-second free allowance, so compute is free. Numbers use the ARM rate.
*Cloudflare Workers Paid is $5/month and includes 10M requests — Scenario B fits exactly.
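The Lambda footnote can be reproduced with a small calculator. This is a sketch using only the rates quoted in this article (ARM GB-second rate, $0.20 per 1M requests, the always-free allowances); the function name and the default 50ms / 128MB profile are illustrative, and billed duration is assumed equal to the 50ms figure.

```python
# Rough AWS Lambda (ARM) monthly cost, using the rates quoted in this article.
# Assumption: billed duration per call equals the 50ms profile figure.
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000
PRICE_PER_MILLION_REQUESTS = 0.20
ARM_PRICE_PER_GB_SECOND = 0.0000133334


def lambda_monthly_cost(calls: int, duration_s: float = 0.050,
                        memory_gb: float = 0.125) -> dict:
    """Return request cost, GB-seconds used, compute cost, and total in USD."""
    billable_requests = max(0, calls - FREE_REQUESTS)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS

    gb_seconds = calls * duration_s * memory_gb
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    compute_cost = billable_gb_seconds * ARM_PRICE_PER_GB_SECOND

    return {
        "request_cost": request_cost,
        "gb_seconds": gb_seconds,
        "compute_cost": compute_cost,
        "total": request_cost + compute_cost,
    }


for name, calls in (("A", 1_000_000), ("B", 10_000_000)):
    result = lambda_monthly_cost(calls)
    print(f"Scenario {name}: {result['gb_seconds']:,.0f} GB-s, "
          f"${result['total']:.2f}/month")
```

The sketch covers the Lambda line item only; API Gateway and other AWS resources are billed separately and are not modeled here.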
Cloudflare Workers — $5 flat plus free egress, the cost king
As of May 2026, the platform that dominates remote MCP server economics is Cloudflare Workers. The pricing is simple:
- Free: 100,000 requests/day, 10ms CPU per invocation
- Workers Paid: $5/month minimum, 10M requests included, $0.50 per additional million, egress (bandwidth) is free
Both Scenario A and Scenario B fit inside $5/month flat. The V8 Isolate cold start of under 5ms (sometimes sub-millisecond in reports) translates directly to better UX when agents hit MCPs in tight loops. Compared with AWS Lambda's 280ms average (and occasional 500ms spikes), that is at least ~50× faster, and more when the Isolate starts in under a millisecond.
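The Paid-plan request pricing above reduces to a one-line formula. The sketch below models only the figures quoted in this article ($5 minimum, 10M requests included, $0.50 per additional million); the function name is illustrative, and any CPU-time metering on the plan is outside this sketch and should be checked against the vendor pricing page.

```python
# Cloudflare Workers Paid: request-based cost only, per the figures in this article.
# Assumption: CPU-time allowances or overages are NOT modeled here.
BASE_FEE = 5.00
INCLUDED_REQUESTS = 10_000_000
PRICE_PER_EXTRA_MILLION = 0.50


def workers_paid_monthly_cost(calls: int) -> float:
    """Return the monthly request-based cost in USD."""
    extra_requests = max(0, calls - INCLUDED_REQUESTS)
    return BASE_FEE + extra_requests / 1_000_000 * PRICE_PER_EXTRA_MILLION


for calls in (1_000_000, 10_000_000, 25_000_000):
    print(f"{calls:>11,} calls: ${workers_paid_monthly_cost(calls):.2f}")
```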
Workers fits best when:
- Short, high-frequency MCP tool calls (the "call → return" loop dominates)
- Large egress in responses (file fetch, vector search results)
- You want global distribution and low latency by default
- You want a predictable $5 flat budget
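There's a CPU-time cap per invocation (10ms on Free, up to 30s on Paid). Long-running LLM calls or heavy synchronous compute inside a Worker are not a good fit. Use waitUntil() for work that can finish after the response, Durable Objects for state that must outlive a single request, and keep external API calls as awaited fetch I/O, which counts against wall-clock time rather than the CPU cap.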
AWS Lambda — permanent free tier and a 20% ARM discount
AWS Lambda's headline feature is its permanent free tier: 1M requests and 400,000 GB-seconds per month, available to every account forever (not just new ones). Scenario A fits entirely inside the free tier, so the only AWS billing is for other resources like API Gateway.
The 2026 power move is ARM Graviton2: a GB-second rate of $0.0000133334, 20% cheaper than the x86 rate of $0.0000166667. Reports cite up to 34% price-performance improvement on suitable workloads. The request rate stays at $0.20 per 1M regardless of architecture. Most MCP servers run on Node.js or Python, and a pure TypeScript MCP with no native dependencies runs unchanged on ARM64, so picking it is essentially free money.
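To see where the ARM discount actually shows up on a bill, here is a minimal comparison of the two GB-second rates. The 100M-call volume is a hypothetical input chosen to exceed the free compute allowance; the 50ms / 128MB profile is the one used throughout this article.

```python
# Compare Lambda compute cost on x86 vs ARM at the rates quoted in this article.
# Assumption: the 100M-call volume is hypothetical, picked to exceed the free tier.
X86_RATE = 0.0000166667  # USD per GB-second
ARM_RATE = 0.0000133334  # USD per GB-second
FREE_GB_SECONDS = 400_000


def compute_cost(calls: int, rate: float, duration_s: float = 0.050,
                 memory_gb: float = 0.125) -> float:
    """Return the compute charge in USD after the free GB-second allowance."""
    gb_seconds = calls * duration_s * memory_gb
    return max(0.0, gb_seconds - FREE_GB_SECONDS) * rate


discount = 1 - ARM_RATE / X86_RATE
print(f"ARM discount on the GB-second rate: {discount:.1%}")

for calls in (10_000_000, 100_000_000):
    x86 = compute_cost(calls, X86_RATE)
    arm = compute_cost(calls, ARM_RATE)
    print(f"{calls:>11,} calls: x86 ${x86:.2f}, ARM ${arm:.2f}")
```

Under this profile, 10M calls stay inside the free compute allowance on either architecture, so the dollar saving from ARM only appears once volume or memory pushes compute past 400,000 GB-seconds.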
Scenario B math: (10M − 1M) × $0.20/1M = $1.80 in request charges, and compute stays inside the free tier, so the Lambda line item lands at roughly $1.80–$2.50/month, in line with the table above. That's cheaper than Workers Paid in many cases. The catch: a cold start averaging 280ms is something MCP clients feel, and whether that's tolerable is the deciding factor.
# AWS SAM template excerpt (ARM64 + maximizing free tier)
Resources:
  McpServer:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: nodejs20.x
      Architectures: [arm64]  # 20% cheaper GB-sec rate
      MemorySize: 256
      Timeout: 30
      Handler: dist/handler.mcp
Vercel Functions — great DX, but three-axis billing
Vercel Functions has a unique three-axis billing model: $0.60 per 1M invocations + $0.128 per CPU-hour + $0.0106 per GB-hour of memory. The Hobby free tier covers 100,000 invocations (some plan tiers include up to 1M), and Pro at $20/month includes 1M invocations.
Scenario A fits inside Pro at $20/month. Scenario B adds an invocation overage of (10M − 1M) × $0.60/1M = $5.40, plus CPU and memory billing. The pattern: roughly 4–5× the cost of Workers Paid at the same request volume ($20 to $25.40 or more, versus $5). The question is whether Next.js integration, preview deployments, the Edge Network, and team features earn that premium.
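The three billing axes can be laid out as a rough estimator. This sketch uses the rates quoted above and makes two loud assumptions that the article does not confirm: the Pro plan is treated as including 1M invocations but no CPU or memory allowance, and memory is billed for a wall-clock time equal to the 50ms CPU time. Treat the result as an illustration of how the axes stack, not as a quote.

```python
# Vercel Functions (Pro) three-axis estimate at the rates quoted in this article.
# Assumptions (not from vendor docs): no included CPU/memory allowance on Pro,
# and memory is held for a wall-clock time equal to the CPU time per call.
PRO_BASE = 20.00
INCLUDED_INVOCATIONS = 1_000_000
PRICE_PER_MILLION_INVOCATIONS = 0.60
PRICE_PER_CPU_HOUR = 0.128
PRICE_PER_GB_HOUR = 0.0106


def vercel_pro_estimate(calls: int, cpu_s: float = 0.050,
                        wall_s: float = 0.050, memory_gb: float = 0.125) -> dict:
    """Return each billing axis and the total in USD."""
    extra = max(0, calls - INCLUDED_INVOCATIONS)
    invocation_overage = extra / 1_000_000 * PRICE_PER_MILLION_INVOCATIONS
    cpu_cost = calls * cpu_s / 3600 * PRICE_PER_CPU_HOUR
    memory_cost = calls * wall_s * memory_gb / 3600 * PRICE_PER_GB_HOUR
    total = PRO_BASE + invocation_overage + cpu_cost + memory_cost
    return {
        "invocation_overage": invocation_overage,
        "cpu_cost": cpu_cost,
        "memory_cost": memory_cost,
        "total": total,
    }


estimate = vercel_pro_estimate(10_000_000)
for axis, usd in estimate.items():
    print(f"{axis:>20}: ${usd:.2f}")
```

Under these assumptions the CPU axis comes out larger than the invocation overage for Scenario B, which is the kind of stacking that makes a usage budget worth setting.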
The "$286 actually billed" anecdote from a Deploywise 2026 report comes from three-axis overage stacking on top of the Pro $20 base — possible when MCP traffic is larger than estimated. Set an on-demand usage budget (default $200) and watch the bill.
Three things commonly inflate the bill:
- Functions running on oversized memory by default, so GB-hours pile up.
- Streaming responses, which count toward CPU time.
- Bandwidth overage if your MCP returns files.
Set a usage budget and check billing alerts weekly.
Railway — Docker containers with usage-based pricing
Railway uses a subscription plus included usage credit model. Hobby is $5/month with $5 of usage credit; Pro is $20/month with $20 of usage credit. Usage is metered at $20/vCPU/month and $10/GB-RAM/month, and anything beyond the included credit is billed as overage. CPU and memory are billed by the second, so a service that is scaled to zero costs effectively nothing while idle.
If you want to run an MCP server as Docker, take advantage of existing WebSocket/SSE persistent connections, or run an OSS MCP server (MongoDB MCP and friends) without rewriting it, Railway is the realistic pick over serverless. Managed Postgres / MySQL / Redis, persistent volumes, and object storage are all included.
Scenario A (1M calls/month) typically fits a shared-CPU 0.5 vCPU + 512MB RAM service inside the Hobby tier ($5/month). Scenario B usually needs Pro with 1–2 vCPUs + 1–2GB RAM and lands at $20–$50/month: at the rates above, a fully utilized 1 vCPU + 1GB service already meters $30, more than the $20 included credit. Watch out: without scale-to-zero, the service runs (and bills) overnight even with no traffic.
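Railway's plan fee, included credit, and metered rates combine as sketched below. The inputs are average vCPU and RAM actually consumed over the month, which is an assumption about how the meter is driven; the function name and sample utilization figures are illustrative only.

```python
# Railway monthly bill from plan fee, included credit, and metered usage,
# using the rates quoted in this article.
# Assumption: inputs are month-long averages of resources actually consumed.
PRICE_PER_VCPU_MONTH = 20.00
PRICE_PER_GB_RAM_MONTH = 10.00


def railway_monthly_cost(plan_fee: float, included_credit: float,
                         avg_vcpu: float, avg_ram_gb: float) -> float:
    """Return the monthly bill in USD: plan fee plus usage beyond the credit."""
    usage = avg_vcpu * PRICE_PER_VCPU_MONTH + avg_ram_gb * PRICE_PER_GB_RAM_MONTH
    return plan_fee + max(0.0, usage - included_credit)


# Hobby, lightly loaded small service (hypothetical utilization figures).
print(railway_monthly_cost(5, 5, avg_vcpu=0.02, avg_ram_gb=0.25))
# Pro, fully utilized 1 vCPU + 1GB, then 2 vCPU + 1GB.
print(railway_monthly_cost(20, 20, avg_vcpu=1, avg_ram_gb=1))
print(railway_monthly_cost(20, 20, avg_vcpu=2, avg_ram_gb=1))
```

The two Pro examples come out at $30 and $50, which is consistent with the $20–$50 range shown for Scenario B in the comparison table.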
Fly.io — edge containers after the free tier removal
Fly.io removed its free tier in 2024. New accounts get $5 in trial credit, then a credit card is required and all Machines and storage are billed.
Pricing is fully usage-based — no fixed plans. A small always-on shared-cpu-1x with 256MB RAM is roughly $1.94/month; storage is $0.15/GB/month; outbound bandwidth starts at $0.02/GB. The practical floor for a real small app is around $5/month.
Fly.io's edge is the ability to spread the same single binary across regions (mirror an MCP server across Tokyo, Osaka, Singapore, and Frankfurt). It's not the global mesh that Cloudflare Workers gives you, but it dramatically outperforms a Lambda function pinned to us-east-1 from anywhere else in the world. Long-running tasks, persistent WebSocket connections, and custom runtimes (Go, Rust, Bun, etc.) are where Fly.io shines for MCP work.
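Because Fly.io pricing is purely usage-based, a multi-region layout is mostly multiplication. The sketch below uses the figures quoted in this section; the $0.02/GB bandwidth rate is the stated starting price and may differ by region, and the Machine counts, volume sizes, and egress volumes are hypothetical inputs.

```python
# Fly.io monthly estimate from the usage-based figures quoted in this article.
# Assumptions: every Machine is an always-on shared-cpu-1x with 256MB RAM,
# and all egress is billed at the $0.02/GB starting rate.
MACHINE_PRICE = 1.94        # USD per Machine per month
VOLUME_PRICE_PER_GB = 0.15  # USD per GB per month
EGRESS_PRICE_PER_GB = 0.02  # USD per GB, starting rate


def fly_monthly_cost(machines: int, volume_gb: float, egress_gb: float) -> float:
    """Return the estimated monthly bill in USD."""
    return (machines * MACHINE_PRICE
            + volume_gb * VOLUME_PRICE_PER_GB
            + egress_gb * EGRESS_PRICE_PER_GB)


# One Machine with a 1GB volume and 50GB of egress (hypothetical inputs).
print(f"single region: ${fly_monthly_cost(1, 1, 50):.2f}")
# Four Machines, one per region, each with a 1GB volume, same total egress.
print(f"four regions:  ${fly_monthly_cost(4, 4, 50):.2f}")
```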
Selection flow by use case
There's no single "always pick this" answer, but the flow below covers the majority of cases.
- Short, high-frequency, cold-start sensitive → Cloudflare Workers Paid ($5 flat)
- Inside the AWS estate, reusing IAM → AWS Lambda (ARM64) + API Gateway
- Heavy Next.js integration, team / preview workflows → Vercel Functions (set the usage budget)
- Docker as-is, WebSocket persistent connections → Railway ($5–$20)
- Multi-region, custom runtimes, long-running tasks → Fly.io (multi-Machine layout)
- Personal projects, PoC with zero budget → Cloudflare Workers Free (100k req/day) or AWS Lambda (always-free)
- Large response sizes (files / vector hits) → Cloudflare Workers + R2 (free egress)
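The flow above can be written down as a small rule table. The list itself does not rank its entries, so the priority order in this sketch (hard container requirements first, then ecosystem ties, then budget and payload size) is an assumption, and the `Needs` fields are hypothetical names for the criteria in the list.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    """Hypothetical flags mirroring the selection criteria in this article."""
    long_running_or_custom_runtime: bool = False
    multi_region_containers: bool = False
    docker_or_websockets: bool = False
    inside_aws_estate: bool = False
    nextjs_team_workflow: bool = False
    zero_budget: bool = False
    large_responses: bool = False


def pick_platform(needs: Needs) -> str:
    """Return a suggested platform; the rule order is an assumed priority."""
    if needs.long_running_or_custom_runtime or needs.multi_region_containers:
        return "Fly.io"
    if needs.docker_or_websockets:
        return "Railway"
    if needs.inside_aws_estate:
        return "AWS Lambda (ARM64) + API Gateway"
    if needs.nextjs_team_workflow:
        return "Vercel Functions"
    if needs.zero_budget:
        return "Cloudflare Workers Free or AWS Lambda (always-free)"
    if needs.large_responses:
        return "Cloudflare Workers + R2"
    # Default: short, high-frequency, cold-start sensitive tool calls.
    return "Cloudflare Workers Paid"


print(pick_platform(Needs()))
print(pick_platform(Needs(docker_or_websockets=True)))
print(pick_platform(Needs(inside_aws_estate=True, large_responses=True)))
```

When several flags are set, the first matching rule wins, so reorder the checks if your own priorities differ.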
FAQ
What's the cheapest place to run an MCP server?
It depends on volume. Up to 1M calls/month → AWS Lambda is effectively $0 inside the always-free tier. Up to 10M → Cloudflare Workers Paid ($5 flat, free egress); Lambda's own line item is lower at $1.80–$2.50, but that excludes other AWS resources such as API Gateway and comes with a 280ms average cold start. Beyond 10M → Workers Paid scales at $0.50 per additional million. If you need persistent connections, Railway ($5–$20) is the choice.
Which platform has the fastest cold start?
Cloudflare Workers (V8 Isolates, <5ms — sometimes sub-millisecond in reports) wins by a wide margin. AWS Lambda averages 280ms in us-east-1. Vercel Functions sits at 100–200ms. Railway and Fly.io run persistent containers, so there's no cold start in the usual sense (a few seconds for scale-to-zero restarts).
Does Fly.io still have a free tier?
It was removed in 2024. New accounts get $5 in trial credit. A minimal shared-cpu-1x with 256MB RAM is about $1.94/month, and the practical floor is around $5/month.
Is AWS Lambda ARM really 20% cheaper?
Yes — GB-second rate of $0.0000133334 vs $0.0000166667 on x86. Request charges are uniform at $0.20 per 1M. For Node.js + pure TypeScript MCP servers, ARM64 is essentially free money.
Can I run a production MCP server on Cloudflare Workers Free?
For personal projects or small PoCs, yes. Free is capped at 100k req/day and 10ms CPU per invocation, which is tight for enterprise MCPs. Production should move to the $5/month Paid plan: 10M requests included, $0.50 per additional million, and free egress.
What about huge workloads (100M+ calls/month)?
At that volume you're in committed-use and enterprise-contract territory: Cloudflare Workers for Platforms, AWS Compute Savings Plans (Lambda eligible), Vercel Enterprise. Past 100M req, Vercel's $200 on-demand budget cap becomes risky, and Workers Paid or Workers Enterprise are typically the default.
Pricing and performance numbers reflect each vendor's public documentation and third-party reports as of May 2026: Cloudflare Workers Paid ($5/month, 10M requests included, free egress) from developers.cloudflare.com/workers/platform/pricing/; AWS Lambda (always-free 1M req + 400k GB-sec, ARM Graviton2 −20%) from aws.amazon.com/lambda/pricing/; Vercel Functions (three-axis billing, Hobby/Pro $20) from vercel.com/docs/functions/usage-and-pricing; Railway (Hobby $5 / Pro $20, $20/vCPU-month, $10/GB-RAM-month) from railway.com/pricing; Fly.io (free tier removed in 2024, $5 trial credit, shared-cpu-1x ≈ $1.94/month) from fly.io/docs/about/pricing/. Cold-start figures come from Cloudflare's own blog (V8 Isolates <5ms) and Rebal AI's March 2026 production report (Lambda 280ms avg). The Vercel "$286 bill" example is from Deploywise's 2026 report. Sample math assumes 50ms CPU + 128MB memory per call as a typical MCP server profile; real numbers vary with the workload. Always confirm against vendor pricing before going to production.