What's the cheapest platform to run an MCP server on?

It depends on the workload. Up to 1M calls/month, AWS Lambda's always-free tier (1M requests + 400,000 GB-seconds, permanent) makes it effectively $0. Up to 10M calls/month, Cloudflare Workers Paid ($5 base, 10M requests included, free egress) is the simplest flat price. Beyond 10M, Workers Paid still wins at $0.50 per additional million requests with no egress charges. Vercel Functions offers great DX but bills across three dimensions at once (invocations, CPU time, and memory). Workloads that need persistent connections are better suited to containers (Railway / Fly.io).

Which platform has the fastest cold start for MCP servers?

Cloudflare Workers wins by a wide margin — V8 Isolates start in under 5ms (some reports show sub-millisecond). AWS Lambda averages 280ms in us-east-1, occasionally spiking to 500ms after long idle periods (production data from early 2026). Vercel Functions hits roughly 100-200ms. Railway and Fly.io run persistent containers, so the cold-start concept barely applies — though scale-to-zero configurations may restart in seconds. For agents hammering MCPs in rapid succession, Workers' cold-start advantage shows up directly in user experience.

Is AWS Lambda ARM Graviton2 really cheaper?

Yes. The GB-second rate is 20% lower on ARM ($0.0000133334 vs $0.0000166667 on x86), and reports cite up to 34% price-performance improvement on some workloads. The request charge is $0.20 per 1M requests regardless of architecture. Most MCP servers run on Node.js or Python, so a pure TypeScript MCP server with no native dependencies runs on ARM64 unchanged, and picking ARM64 is essentially free money.

Can I run a production MCP server on the free Cloudflare Workers plan?

Maybe for personal projects and small PoCs. The free plan caps you at 100,000 requests per day and 10ms CPU per invocation. Typical MCP tool calls fit within 10ms of CPU time, but fan-outs to external APIs stretch wall time, so designs need to use waitUntil() and durable patterns correctly. For real production, the $5/month Paid plan is the standard — 10M requests included, $0.50 per additional million, and the killer feature is no egress charges.

MCP Server Hosting Cost Comparison 2026 — Cloudflare Workers vs AWS Lambda vs Vercel vs Railway vs Fly.io

Does Fly.io still have a free tier?

Fly.io removed its free tier in 2024. New accounts get $5 in trial credit, after which a credit card is required and every Machine and GB of storage is billed. A minimal always-on shared-cpu-1x with 256MB RAM is roughly $1.94/month. The practical floor for a real app is around $5/month. There are no fixed plans — pricing is purely usage-based: per-second Machine time, storage at $0.15/GB/month, and outbound bandwidth from $0.02/GB.

Why compare MCP hosting costs now
Two scenarios — 1M and 10M calls per month
Cloudflare Workers — $5 flat plus free egress, the cost king
AWS Lambda — permanent free tier and a 20% ARM discount
Vercel Functions — great DX, but three-axis billing
Railway — Docker containers with usage-based pricing
Fly.io — edge containers after the free tier removal
Selection flow by use case
FAQ

Why compare MCP hosting costs now

2026 is the year remote MCP servers (no client-side runtime, hit via HTTPS) went mainstream. Dropbox, Box, Microsoft 365, Cloudflare, AWS, Pipedream — every major vendor has now shipped MCP servers hosted on their own platform. At the same time, more SaaS companies are building and offering their own MCP servers, which makes "where do we deploy it" a real decision.

This article looks at MCP-specific workload characteristics — short, high-frequency, I/O-bound, and cold-start sensitive — and lays out five platforms' pricing as of May 2026 side by side.

Editorial frame, May 2026

"Just pick the cheapest" is risky. MCP servers have an unusual load profile: zero cost when idle, and a tight latency budget when called — anything slower than a few dozen milliseconds shows up in the agent's UX. Serverless (Workers / Lambda / Vercel) is the default winner, but persistent connections or long-running tasks push you to containers (Railway / Fly.io).

Two scenarios — 1M and 10M calls per month

To make the comparison concrete, we assume two workloads with a typical MCP-server profile: 50ms CPU and 128MB memory per call.

Scenario A: 1M calls/month (avg 0.4 req/s, agents for ~30–100 users)
Scenario B: 10M calls/month (avg 4 req/s, agents for ~500–1,000 users)

Platform	Free tier	Scenario A / mo	Scenario B / mo	Cold start
Cloudflare Workers Paid	None ($5/month minimum)	$5	$5	<5ms
AWS Lambda (ARM Graviton2)	Always-free: 1M req + 400k GB-sec/mo	$0 (in free tier)	$1.80–$2.50	280ms avg
Vercel Functions (Pro)	1M invocations (Hobby)	$20 (Pro base)	$20 + ~$5.40 overage	100–200ms
Railway (Hobby/Pro)	$5 / $20 usage credit included	$5 (small always-on)	$20–$50	Always-on (seconds on scale-to-zero)
Fly.io	$5 trial credit only	$2–$5 (shared-cpu-1x)	$10–$30 (multi-Machine)	Always-on

*AWS Lambda Scenario B math: 10M − 1M free tier = 9M reqs × $0.20/1M = $1.80. GB-seconds: 10M × 50ms × 0.125GB = 62,500 GB-seconds, which fits inside the permanent 400,000 GB-second free allowance, so compute is free. Numbers use the ARM rate.

*Cloudflare Workers Paid is $5/month and includes 10M requests — Scenario B fits exactly.
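The Lambda footnote above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: it uses the ARM rates and free-tier allowances quoted in this article, assumes billed duration equals the 50ms profile, and ignores adjacent services such as API Gateway. The function and field names are made up for the example.

```typescript
// Lambda pricing figures as quoted in this article (ARM Graviton2 rates).
const REQUEST_PRICE_PER_MILLION = 0.20;
const ARM_PRICE_PER_GB_SECOND = 0.0000133334;
const FREE_REQUESTS = 1_000_000;
const FREE_GB_SECONDS = 400_000;

interface LambdaWorkload {
  callsPerMonth: number;
  secondsPerCall: number; // assumed billed duration per call, in seconds
  memoryGb: number;       // configured memory, in GB
}

function lambdaMonthlyCost(w: LambdaWorkload) {
  const gbSeconds = w.callsPerMonth * w.secondsPerCall * w.memoryGb;
  // Only usage above the permanent free tier is billed.
  const billableRequests = Math.max(0, w.callsPerMonth - FREE_REQUESTS);
  const billableGbSeconds = Math.max(0, gbSeconds - FREE_GB_SECONDS);
  const requestCost = (billableRequests / 1_000_000) * REQUEST_PRICE_PER_MILLION;
  const computeCost = billableGbSeconds * ARM_PRICE_PER_GB_SECOND;
  return { gbSeconds, requestCost, computeCost, total: requestCost + computeCost };
}

// The article's profile: 50ms and 128MB (0.125GB) per call.
const scenarioA = lambdaMonthlyCost({ callsPerMonth: 1_000_000, secondsPerCall: 0.05, memoryGb: 0.125 });
const scenarioB = lambdaMonthlyCost({ callsPerMonth: 10_000_000, secondsPerCall: 0.05, memoryGb: 0.125 });

console.log(scenarioA.total.toFixed(2), scenarioB.total.toFixed(2)); // prints "0.00 1.80"
```

Raising `secondsPerCall` or `memoryGb` is the quickest way to see when compute stops fitting inside the 400,000 GB-second allowance.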

Cloudflare Workers — $5 flat plus free egress, the cost king

As of May 2026, the platform that dominates remote MCP server economics is Cloudflare Workers. The pricing is simple:

Free: 100,000 requests/day, 10ms CPU per invocation
Workers Paid: $5/month minimum, 10M requests included, $0.50 per additional million, egress (bandwidth) is free

Both Scenario A and Scenario B fit inside $5/month flat. The V8 Isolate cold start under 5ms (sometimes sub-millisecond in reports) translates directly to better UX when agents hit MCPs in tight loops. Compared with AWS Lambda's 280ms average (and occasional 500ms spikes), Workers cold starts are more than 50× faster.
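As a rough illustration of the request pricing listed above, the following sketch computes a Workers Paid bill from request volume alone. It models only the figures quoted in this article ($5 base, 10M included, $0.50 per additional million); any other billing dimension, such as CPU time, is outside the sketch, and the function name is hypothetical.

```typescript
// Workers Paid request pricing as quoted in this article.
const WORKERS_PAID_BASE = 5;
const WORKERS_INCLUDED_REQUESTS = 10_000_000;
const WORKERS_PRICE_PER_EXTRA_MILLION = 0.50;

// Returns the monthly bill in USD, counting requests only.
function workersPaidMonthlyCost(requestsPerMonth: number): number {
  const extraRequests = Math.max(0, requestsPerMonth - WORKERS_INCLUDED_REQUESTS);
  return WORKERS_PAID_BASE + (extraRequests / 1_000_000) * WORKERS_PRICE_PER_EXTRA_MILLION;
}

console.log(workersPaidMonthlyCost(1_000_000));  // Scenario A
console.log(workersPaidMonthlyCost(10_000_000)); // Scenario B
console.log(workersPaidMonthlyCost(50_000_000)); // 40M requests over the allowance
```

Under these assumptions the bill stays at $5 until the 10M allowance is used up, then grows by $0.50 per extra million.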

✅ When Cloudflare Workers fits

・Short, high-frequency MCP tool calls (the "call → return" loop dominates)
・Large egress in responses (file fetch, vector search results)
・You want global distribution and low latency by default
・You want a predictable $5 flat budget

⚠️ Workers constraints

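There's a CPU-time cap per invocation (10ms on Free, up to 30s on Paid). Long-running LLM calls or heavy compute done synchronously inside a Worker isn't a good fit. Time spent waiting on fetch() calls to external APIs counts as wall-clock time, not CPU time, so keep the handler I/O-bound, defer non-critical work with waitUntil(), and use Durable Objects for state that must outlive a single request.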
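A minimal sketch of the waitUntil() pattern, under stated assumptions: the request body shape (`tool`, `query`), the injected `callUpstream` and `recordUsage` helpers, and the `makeMcpWorker` factory are all hypothetical, and this is not a full MCP protocol implementation. It only shows the handler awaiting external I/O and deferring bookkeeping until after the response.

```typescript
// Minimal structural type covering the one ExecutionContext method used here.
interface WaitUntilContext {
  waitUntil(promise: Promise<unknown>): void;
}

// Hypothetical dependencies, injected so the sketch runs without a network.
type CallUpstream = (query: string) => Promise<string>;
type RecordUsage = (tool: string) => Promise<void>;

function makeMcpWorker(callUpstream: CallUpstream, recordUsage: RecordUsage) {
  return {
    async fetch(request: Request, _env: unknown, ctx: WaitUntilContext): Promise<Response> {
      const { tool, query } = (await request.json()) as { tool: string; query: string };

      // Waiting on the upstream API spends wall-clock time, not CPU time.
      const result = await callUpstream(query);

      // Non-critical bookkeeping keeps running after the response is sent.
      ctx.waitUntil(recordUsage(tool));

      return new Response(JSON.stringify({ tool, result }), {
        headers: { "content-type": "application/json" },
      });
    },
  };
}

// In a real Worker module, the returned object would be the default export.
const worker = makeMcpWorker(
  async (query) => `results for ${query}`, // stand-in for a fetch() to an external API
  async (_tool) => {},                      // stand-in for an analytics write
);
```

Injecting the two helpers keeps the handler testable outside the Workers runtime; in a deployed Worker they would typically wrap fetch() calls or bindings.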

AWS Lambda — permanent free tier and a 20% ARM discount

AWS Lambda's headline feature is its permanent free tier: 1M requests and 400,000 GB-seconds per month, available to every account forever (not just new ones). Scenario A fits entirely inside the free tier, so the only AWS billing is for other resources like API Gateway.

The 2026 power move is ARM Graviton2: GB-second rate of $0.0000133334 (20% cheaper than the x86 rate of $0.0000166667). Reports cite up to 34% price-performance improvement on suitable workloads. The request rate stays at $0.20 per 1M regardless of architecture. Most MCP servers run on Node.js or Python, so a pure TypeScript MCP with no native dependencies runs on ARM64 unchanged, and picking ARM64 is essentially free money.
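The 20% gap only shows up on a bill once compute exceeds the free GB-second allowance. As a sketch, the snippet below compares the two quoted rates at a hypothetical 100M calls/month using the article's 50ms / 128MB profile; the volume and the function name are assumptions chosen for illustration.

```typescript
// GB-second rates and free allowance as quoted in this article.
const X86_PER_GB_SECOND = 0.0000166667;
const ARM_PER_GB_SECOND = 0.0000133334;
const FREE_GB_SECONDS_PER_MONTH = 400_000;

// Compute-only cost in USD; request charges are identical on both architectures.
function lambdaComputeCost(
  calls: number,
  secondsPerCall: number,
  memoryGb: number,
  ratePerGbSecond: number,
): number {
  const gbSeconds = calls * secondsPerCall * memoryGb;
  return Math.max(0, gbSeconds - FREE_GB_SECONDS_PER_MONTH) * ratePerGbSecond;
}

// Hypothetical volume: 100M calls/month at 50ms and 0.125GB per call.
const x86Cost = lambdaComputeCost(100_000_000, 0.05, 0.125, X86_PER_GB_SECOND);
const armCost = lambdaComputeCost(100_000_000, 0.05, 0.125, ARM_PER_GB_SECOND);
const armSavings = 1 - armCost / x86Cost;

console.log(x86Cost.toFixed(2), armCost.toFixed(2), (armSavings * 100).toFixed(1) + "%");
// prints "3.75 3.00 20.0%"
```

At Scenario A and B volumes both results are zero, since 62,500 GB-seconds or less never leaves the free tier.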

Scenario B math: (10M − 1M) × $0.20/1M = $1.80 in request charges, and compute stays inside the free tier, so the Lambda line item is about $1.80/month; budget up to roughly $2.50/month all-in. That's cheaper than Workers Paid in many cases. The catch: cold start averaging 280ms is something MCP clients feel, and whether that's tolerable is the deciding axis.

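# AWS SAM template excerpt (ARM64 + maximizing free tier)
Transform: AWS::Serverless-2016-10-31   # required for AWS::Serverless::* resources
Resources:
  McpServer:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: nodejs20.x
      Architectures: [arm64]      # 20% cheaper GB-second rate than x86_64
      MemorySize: 256             # MB; the cost math above assumes 128, so this doubles GB-seconds per call
      Timeout: 30                 # seconds
      Handler: dist/handler.mcp   # file dist/handler.js, exported function "mcp"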

Vercel Functions — great DX, but three-axis billing

Vercel Functions has a unique three-axis billing model: $0.60 per 1M invocations + $0.128 per CPU-hour + $0.0106 per GB-hour of memory. The Hobby free tier covers 100,000 invocations (some plan tiers include up to 1M), and Pro at $20/month includes 1M invocations.

Scenario A fits inside Pro at $20/month. Scenario B is 10M − 1M = 9M × $0.60/1M = $5.40 overage, plus CPU and memory billing. Counting only the base fee and invocation overage, that is roughly 4–5× the cost of Workers Paid at the same request volume ($20 to $25.40 vs $5). The question is whether Next.js integration, preview deployments, the Edge Network, and team features earn that premium.
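To see how the three axes stack, here is a rough sketch using the rates quoted above. It assumes the Pro plan includes 1M invocations and, as a simplification, no included CPU or memory allowance; the usage figures in the example call (50 CPU-hours, 100 GB-hours) are hypothetical, not measurements.

```typescript
// Vercel Functions rates as quoted in this article.
const VERCEL_PRO_BASE = 20;
const VERCEL_INCLUDED_INVOCATIONS = 1_000_000;
const VERCEL_PER_MILLION_INVOCATIONS = 0.60;
const VERCEL_PER_CPU_HOUR = 0.128;
const VERCEL_PER_GB_HOUR = 0.0106;

interface VercelUsage {
  invocations: number;
  cpuHours: number;      // total active CPU time for the month
  memoryGbHours: number; // total memory usage for the month
}

// Assumption: no CPU or memory allowance is included in the base fee.
function vercelProMonthlyCost(u: VercelUsage) {
  const billableInvocations = Math.max(0, u.invocations - VERCEL_INCLUDED_INVOCATIONS);
  const invocationCost = (billableInvocations / 1_000_000) * VERCEL_PER_MILLION_INVOCATIONS;
  const cpuCost = u.cpuHours * VERCEL_PER_CPU_HOUR;
  const memoryCost = u.memoryGbHours * VERCEL_PER_GB_HOUR;
  const total = VERCEL_PRO_BASE + invocationCost + cpuCost + memoryCost;
  return { invocationCost, cpuCost, memoryCost, total };
}

// Hypothetical month: 10M invocations, 50 CPU-hours, 100 GB-hours of memory.
const vercelBill = vercelProMonthlyCost({ invocations: 10_000_000, cpuHours: 50, memoryGbHours: 100 });
console.log(vercelBill.total.toFixed(2)); // prints "32.86"
```

The point of the breakdown is that each axis grows independently, so a traffic estimate that only counts invocations can miss most of the overage.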

The "$286 actually billed" anecdote from a Deploywise 2026 report comes from three-axis overage stacking on top of the Pro $20 base — possible when MCP traffic is larger than estimated. Set an on-demand usage budget (default $200) and watch the bill.

⚠️ Typical patterns that blow up the Vercel bill

(1) Functions running on oversized memory by default — GB-hour piles up. (2) Streaming responses count toward CPU time. (3) Bandwidth overage if your MCP returns files. Set a usage budget and check billing alerts weekly.

Railway — Docker containers with usage-based pricing

Railway uses a subscription plus included usage credit model. Hobby is $5/month with $5 of usage credit; Pro is $20/month with $20 of usage credit. Overage is metered at $20/vCPU/month and $10/GB-RAM/month. CPU and memory are billed by the second, so a service that is stopped or asleep costs effectively nothing.

If you want to run an MCP server as a Docker container, keep existing WebSocket/SSE persistent connections, or run an OSS MCP server (MongoDB MCP and friends) without rewriting it, Railway is the realistic pick over serverless. Managed Postgres / MySQL / Redis, persistent volumes, and object storage are all included.

Scenario A (1M calls/month) typically fits a shared-CPU 0.5 vCPU + 512MB RAM service inside the Hobby tier ($5/month). Scenario B usually lands inside the Pro $20 usage credit with 1–2 vCPUs + 1–2GB RAM. Watch out: without scale-to-zero, the service runs (and bills) overnight even with no traffic.
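The subscription-plus-credit model can be sketched as follows. This assumes metered usage is proportional to the average vCPU and RAM consumed over the month, which is a simplification; the plan objects, function name, and sample usage levels are hypothetical.

```typescript
// Railway rates as quoted in this article.
const RAILWAY_PER_VCPU_MONTH = 20;
const RAILWAY_PER_GB_RAM_MONTH = 10;

interface RailwayPlan {
  name: string;
  fee: number;            // monthly subscription in USD
  includedCredit: number; // usage credit bundled with the subscription
}

const HOBBY: RailwayPlan = { name: "Hobby", fee: 5, includedCredit: 5 };
const PRO: RailwayPlan = { name: "Pro", fee: 20, includedCredit: 20 };

// avgVcpu and avgRamGb are month-long averages of metered usage (an assumption).
function railwayMonthlyCost(plan: RailwayPlan, avgVcpu: number, avgRamGb: number) {
  const usage = avgVcpu * RAILWAY_PER_VCPU_MONTH + avgRamGb * RAILWAY_PER_GB_RAM_MONTH;
  const overage = Math.max(0, usage - plan.includedCredit);
  return { usage, overage, total: plan.fee + overage };
}

// Hypothetical usage levels, chosen only to show both sides of the credit.
const smallService = railwayMonthlyCost(HOBBY, 0.1, 0.25); // usage below the $5 credit
const busyService = railwayMonthlyCost(PRO, 1, 1);         // usage above the $20 credit
console.log(smallService.total, busyService.total);
```

Usage below the credit leaves the bill at the plan fee; anything above it is added on top.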

Fly.io — edge containers after the free tier removal

Fly.io removed its free tier in 2024. New accounts get $5 in trial credit, then a credit card is required and all Machines and storage are billed.

Pricing is fully usage-based — no fixed plans. A small always-on shared-cpu-1x with 256MB RAM is roughly $1.94/month; storage is $0.15/GB/month; outbound bandwidth starts at $0.02/GB. The practical floor for a real small app is around $5/month.
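Because there are no fixed plans, a Fly.io estimate is just a sum of line items. The sketch below uses the three figures quoted above and treats the $0.02/GB bandwidth rate as flat, although the article describes it as a starting rate; the function name and the sample deployments are hypothetical.

```typescript
// Fly.io figures as quoted in this article.
const FLY_SHARED_CPU_1X_256MB_PER_MONTH = 1.94;
const FLY_STORAGE_PER_GB_MONTH = 0.15;
const FLY_EGRESS_PER_GB = 0.02; // lowest quoted rate; assumed flat here

interface FlyDeployment {
  machines: number; // always-on shared-cpu-1x Machines with 256MB RAM
  volumeGb: number; // total persistent storage across Machines
  egressGb: number; // outbound bandwidth for the month
}

function flyMonthlyCost(d: FlyDeployment): number {
  return (
    d.machines * FLY_SHARED_CPU_1X_256MB_PER_MONTH +
    d.volumeGb * FLY_STORAGE_PER_GB_MONTH +
    d.egressGb * FLY_EGRESS_PER_GB
  );
}

// Hypothetical deployments: one small Machine, and one Machine in each of four regions.
const singleRegion = flyMonthlyCost({ machines: 1, volumeGb: 1, egressGb: 50 });
const fourRegions = flyMonthlyCost({ machines: 4, volumeGb: 0, egressGb: 0 });
console.log(singleRegion.toFixed(2), fourRegions.toFixed(2)); // prints "3.09 7.76"
```

Larger Machine sizes would need their own per-month figure, which this article does not quote.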

Fly.io's edge is the ability to spread the same single binary across regions (mirror an MCP server across Tokyo, Osaka, Singapore, and Frankfurt). It's not the global mesh that Cloudflare Workers gives you, but it dramatically outperforms a Lambda function pinned to us-east-1 from anywhere else in the world. Long-running tasks, persistent WebSocket connections, and custom runtimes (Go, Rust, Bun, etc.) are where Fly.io shines for MCP work.

Selection flow by use case

There's no single "always pick this" answer, but the flow below covers the majority of cases.

Short, high-frequency, cold-start sensitive → Cloudflare Workers Paid ($5 flat)
Inside the AWS estate, reusing IAM → AWS Lambda (ARM64) + API Gateway
Heavy Next.js integration, team / preview workflows → Vercel Functions (set the usage budget)
Docker as-is, WebSocket persistent connections → Railway ($5–$20)
Multi-region, custom runtimes, long-running tasks → Fly.io (multi-Machine layout)
Personal projects, PoC with zero budget → Cloudflare Workers Free (100k req/day) or AWS Lambda (always-free)
Large response sizes (files / vector hits) → Cloudflare Workers + R2 (free egress)
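The flow above can also be written as a small decision function. This is a sketch, not a rule: the field names are invented, and the order in which the checks run (hard runtime constraints first, budget last) is an assumption, since the list itself does not rank the cases.

```typescript
// Hypothetical requirement flags mirroring the selection list above.
interface McpHostingNeeds {
  multiRegionOrLongRunning?: boolean;
  dockerOrWebSocket?: boolean;
  insideAwsEstate?: boolean;
  heavyNextJs?: boolean;
  largeResponses?: boolean;
  zeroBudget?: boolean;
}

// Checks run from hardest constraint to softest; this ordering is an assumption.
function pickPlatform(needs: McpHostingNeeds): string {
  if (needs.multiRegionOrLongRunning) return "Fly.io (multi-Machine layout)";
  if (needs.dockerOrWebSocket) return "Railway ($5–$20)";
  if (needs.insideAwsEstate) return "AWS Lambda (ARM64) + API Gateway";
  if (needs.heavyNextJs) return "Vercel Functions (set the usage budget)";
  if (needs.largeResponses) return "Cloudflare Workers + R2 (free egress)";
  if (needs.zeroBudget) return "Cloudflare Workers Free (100k req/day) or AWS Lambda (always-free)";
  // Default: short, high-frequency, cold-start sensitive tool calls.
  return "Cloudflare Workers Paid ($5 flat)";
}

console.log(pickPlatform({}));
console.log(pickPlatform({ dockerOrWebSocket: true }));
console.log(pickPlatform({ insideAwsEstate: true, zeroBudget: true }));
```

When several flags apply, the first match wins, so reordering the checks is the way to encode a different priority.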

FAQ

What's the cheapest place to run an MCP server?

It depends on volume. Up to 1M calls/month → AWS Lambda is effectively $0 inside the always-free tier. Up to 10M → Cloudflare Workers Paid ($5 flat, free egress). Beyond 10M → Workers Paid stays ahead at $0.50 per additional million. If you need persistent connections, Railway ($5–$20) is the choice.

Which platform has the fastest cold start?

Cloudflare Workers (V8 Isolates, <5ms — sometimes sub-millisecond in reports) wins by a wide margin. AWS Lambda averages 280ms in us-east-1. Vercel Functions sits at 100–200ms. Railway and Fly.io run persistent containers, so there's no cold start in the usual sense (a few seconds for scale-to-zero restarts).

Does Fly.io still have a free tier?

It was removed in 2024. New accounts get $5 in trial credit. A minimal shared-cpu-1x with 256MB RAM is about $1.94/month, and the practical floor is around $5/month.

Is AWS Lambda ARM really 20% cheaper?

Yes. The GB-second rate is $0.0000133334 on ARM vs $0.0000166667 on x86. Request charges are uniform at $0.20 per 1M. A Node.js + pure TypeScript MCP server with no native dependencies runs on ARM64 unchanged, so ARM64 is essentially free money.

Can I run a production MCP server on Cloudflare Workers Free?

For personal projects or small PoCs, yes. Free is capped at 100k req/day and 10ms CPU per invocation, which is tight for enterprise MCPs. Production should move to the $5/month Paid plan: 10M requests included, $0.50 per additional million, and free egress.

What about huge workloads (100M+ calls/month)?

At that volume you're in committed-use and enterprise-contract territory: Cloudflare Workers for Platforms, AWS Compute Savings Plans (Lambda eligible), Vercel Enterprise. Past 100M req, Vercel's $200 on-demand budget cap becomes risky, and Workers Paid or Workers Enterprise are typically the default.

Data disclosure and disclaimers

Pricing and performance numbers reflect each vendor's public documentation and third-party reports as of May 2026: Cloudflare Workers Paid ($5/month, 10M requests included, free egress) from developers.cloudflare.com/workers/platform/pricing/; AWS Lambda (always-free 1M req + 400k GB-sec, ARM Graviton2 −20%) from aws.amazon.com/lambda/pricing/; Vercel Functions (three-axis billing, Hobby/Pro $20) from vercel.com/docs/functions/usage-and-pricing; Railway (Hobby $5 / Pro $20, $20/vCPU-month, $10/GB-RAM-month) from railway.com/pricing; Fly.io (free tier removed in 2024, $5 trial credit, shared-cpu-1x ≈ $1.94/month) from fly.io/docs/about/pricing/. Cold-start figures come from Cloudflare's own blog (V8 Isolates <5ms) and Rebal AI's March 2026 production report (Lambda 280ms avg). The Vercel "$286 bill" example is from Deploywise's 2026 report. Sample math assumes 50ms CPU + 128MB memory per call as a typical MCP server profile; real numbers vary with the workload. Always confirm against vendor pricing before going to production.
