MCP (Model Context Protocol) crossed 110 million monthly downloads in early 2026. What started as Anthropic's open standard for connecting AI agents to external tools has become the connective tissue of the agentic automation ecosystem, with contributions from OpenAI, Google, Microsoft, and hundreds of enterprise vendors. If your SaaS product doesn't have an MCP server, your customers' AI agents have no standard way to interact with it.

This guide covers how to build a production MCP server for your SaaS product with OAuth 2.1 authentication, multi-tenant isolation, billing integration, and deployment on AWS. We use the official TypeScript SDK with Streamable HTTP transport (the current recommended protocol) and cover the patterns that separate a demo from a production system.

For a broader introduction to MCP concepts, see our MCP Developer Guide and MCP Server Development Guide.

What This Guide Covers

Why Your SaaS Needs an MCP Server in 2026
Architecture: Remote MCP with Streamable HTTP
OAuth 2.1 Authentication for Multi-Tenant SaaS
Designing Your Tool Schema
Implementation: TypeScript SDK + Express
Multi-Tenancy and Data Isolation
Billing and Usage Metering
Deployment on AWS (ECS Fargate)
Testing and Debugging MCP Servers
Why Lushbinary for Your MCP Server Build

1. Why Your SaaS Needs an MCP Server in 2026

The shift is simple: your customers are increasingly interacting with software through AI agents rather than directly through your UI. When a customer asks Claude "create a new project in [YourApp] and invite the team," that request needs to reach your API through a standardized protocol. Without MCP, every AI platform requires a custom integration.

The Business Case

Distribution: MCP servers are discoverable by any MCP-compatible client (Claude, Cursor, Kiro, VS Code, custom agents). One server, many clients
Retention: Customers whose AI workflows depend on your MCP server have higher switching costs
Upsell: AI-driven usage often exceeds manual usage. Customers who connect via MCP tend to make more API calls, which drives usage-based revenue
Competitive moat: Early MCP adopters (Sentry, Linear, Notion, Cal.com) are capturing the AI-native workflow market before competitors

Market Signal

Microsoft Azure AI Foundry, Google Cloud, and AWS all added MCP support in 2026. The protocol was donated to the Linux Foundation's Agentic AI Foundation in December 2025. This is not a single-vendor bet anymore. It is the emerging standard.

2. Architecture: Remote MCP with Streamable HTTP

The MCP architecture has three layers: Host (the AI application), Client (the protocol handler inside the host), and Server (your service that exposes tools). For SaaS, you always build a remote server.

Why Streamable HTTP (Not SSE)

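MCP spec version 2025-03-26 replaced the earlier HTTP+SSE transport with Streamable HTTP. Server-Sent Events did not disappear: a Streamable HTTP server can still stream a response over SSE, but the separate long-lived SSE endpoint is gone. The key differences: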

Feature	HTTP+SSE (Deprecated)	Streamable HTTP
Endpoints	Separate /sse and /messages	Single /mcp endpoint
Session Management	Connection-based	Mcp-Session-Id header (stateful or stateless)
Batching	Not supported	JSON-RPC batching in spec 2025-03-26 (removed again in 2025-06-18)
Load Balancing	Difficult (sticky sessions)	Standard HTTP load balancing
Tool Annotations	Not part of the 2024-11-05 spec	readOnlyHint, destructiveHint, idempotentHint (added in the same spec revision, independent of transport)
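To make the single-endpoint model concrete, here is a minimal sketch of the HTTP request an MCP client sends for one tool call. The helper name buildToolCallRequest and the example host are illustrative assumptions; the JSON-RPC method tools/call and the Accept and Mcp-Session-Id headers follow the MCP specification.

```typescript
// JSON-RPC 2.0 request envelope as used by MCP
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

interface HttpRequestSketch {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the POST a client would send to the /mcp endpoint
function buildToolCallRequest(
  baseUrl: string,
  accessToken: string,
  toolName: string,
  args: Record<string, unknown>,
  sessionId?: string,
  id = 1,
): HttpRequestSketch {
  const payload: JsonRpcRequest = {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    // The server may reply with plain JSON or an SSE stream, so clients accept both
    Accept: "application/json, text/event-stream",
    Authorization: `Bearer ${accessToken}`,
  };
  // Only stateful servers hand out a session ID; stateless requests omit the header
  if (sessionId !== undefined) headers["Mcp-Session-Id"] = sessionId;
  return { url: `${baseUrl}/mcp`, method: "POST", headers, body: JSON.stringify(payload) };
}
```

Because every call is an ordinary POST to one path, any HTTP load balancer can route it without sticky sessions when the server runs stateless.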

3. OAuth 2.1 Authentication for Multi-Tenant SaaS

The MCP spec treats your MCP server as a resource server in OAuth terms. The authentication flow works like this:

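MCP client discovers your authorization server: as of spec version 2025-06-18 it fetches the MCP server's protected resource metadata at /.well-known/oauth-protected-resource, which names the authorization server, then reads that server's metadata at /.well-known/oauth-authorization-server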
Client redirects user to your authorization endpoint for consent
User authenticates with your SaaS (existing login flow)
Your auth server issues a scoped access token with tenant claims
Client includes the token in every request to your MCP server
Your MCP server validates the token and extracts tenant context on each tool invocation
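The discovery step can be sketched as follows. This assumes the 2025-06-18 revision of the MCP authorization spec, where the MCP server publishes protected resource metadata (RFC 9728) that points at a separate authorization server. The function names and the mcp.yourapp.com origin are hypothetical; the issuer and scopes reuse the values from this guide's token example.

```typescript
// Protected resource metadata fields defined by RFC 9728
interface ProtectedResourceMetadata {
  resource: string;
  authorization_servers: string[];
  scopes_supported: string[];
  bearer_methods_supported: string[];
}

// Document served at <origin>/.well-known/oauth-protected-resource
function buildResourceMetadata(origin: string, issuer: string): ProtectedResourceMetadata {
  return {
    resource: origin,
    // The MCP server only points at the authorization server; it never issues tokens itself
    authorization_servers: [issuer],
    scopes_supported: ["mcp:read", "mcp:write", "mcp:admin"],
    bearer_methods_supported: ["header"],
  };
}

// Value for the WWW-Authenticate header on a 401, telling the client where discovery starts
function unauthorizedChallenge(origin: string): string {
  return `Bearer resource_metadata="${origin}/.well-known/oauth-protected-resource"`;
}
```

A client that receives the 401 follows resource_metadata, reads authorization_servers, and then runs the normal authorization code flow against your existing auth server.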

Token Design for Multi-Tenancy

Your access tokens should include custom claims that scope what the AI agent can do:

// JWT payload for MCP access token
{
  "sub": "user_abc123",
  "tenant_id": "org_xyz789",
  "scope": "mcp:read mcp:write mcp:admin",
  "allowed_tools": ["list_projects", "create_task", "get_analytics"],
  "data_scope": "team:engineering",
  "rate_limit_tier": "standard",
  "exp": 1716000000,
  "iss": "https://auth.yourapp.com"
}
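As a rough sketch of how these claims might be consumed, the function below maps a decoded payload to a tenant context. It assumes the signature has already been verified elsewhere (for example by your JWT library against the issuer's keys); the names TenantContext and toTenantContext, and the fallback to a "free" tier, are illustrative assumptions rather than part of any SDK.

```typescript
interface TenantContext {
  userId: string;
  tenantId: string;
  scopes: string[];
  allowedTools: string[];
  rateLimitTier: string;
}

// Validates the claims of an ALREADY signature-verified JWT payload.
// Returns null when the token must be rejected.
function toTenantContext(
  payload: Record<string, unknown>,
  expectedIssuer: string,
  nowSeconds: number,
): TenantContext | null {
  const { sub, tenant_id, scope, allowed_tools, rate_limit_tier, exp, iss } = payload;
  if (typeof sub !== "string" || typeof tenant_id !== "string") return null;
  if (iss !== expectedIssuer) return null;
  if (typeof exp !== "number" || exp <= nowSeconds) return null; // expired or missing exp
  const scopes = typeof scope === "string" ? scope.split(" ").filter(Boolean) : [];
  // A missing allowed_tools claim becomes an empty list, i.e. no tools (fail closed)
  const allowedTools = Array.isArray(allowed_tools)
    ? allowed_tools.filter((t): t is string => typeof t === "string")
    : [];
  return {
    userId: sub,
    tenantId: tenant_id,
    scopes,
    allowedTools,
    rateLimitTier: typeof rate_limit_tier === "string" ? rate_limit_tier : "free",
  };
}
```

Returning a typed context object keeps raw claim names out of tool handlers, so a later change to the token format touches one function.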

Security Warning

Obsidian Security published research showing common MCP OAuth pitfalls that lead to one-click account takeover. The most common mistake: treating the MCP server as both the authorization server and resource server. Keep them separate. Your existing auth infrastructure issues tokens; your MCP server only validates them.

4. Designing Your Tool Schema

MCP tools are the functions your server exposes to AI agents. Good tool design is critical because AI agents use tool names and descriptions to decide which tool to call. Poor naming or vague descriptions lead to incorrect tool selection.

Tool Design Principles

Verb-noun naming: create_project, list_tasks, update_user. AI agents parse these semantically
Specific descriptions: "Creates a new project in the authenticated user's workspace with the given name and optional description" beats "Creates a project"
Flat input schemas: MCP uses JSON Schema for tool inputs. Keep parameters flat rather than deeply nested. AI agents handle flat schemas more reliably
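Tool annotations: Set readOnlyHint, destructiveHint, or idempotentHint on each tool so clients can apply appropriate confirmation flows. Annotations are hints, not security controls, so keep enforcing permissions on the server
Pagination: For list operations, include cursor and limit parameters, and return a nextCursor when more results exist. Say so in the tool description so agents know to request the next page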
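The pagination principle can be illustrated with a small sketch. It pages over an in-memory array using an offset-based cursor, which is an assumption made for brevity; a production API would more likely encode a database key in the cursor. The names Page, encodeCursor, decodeCursor, and paginate are hypothetical.

```typescript
interface Page<T> {
  items: T[];
  nextCursor?: string;
}

// Cursors are opaque to the client; here they simply wrap an offset
function encodeCursor(offset: number): string {
  return encodeURIComponent(JSON.stringify({ offset }));
}

function decodeCursor(cursor?: string): number {
  if (cursor === undefined) return 0; // first page
  let offset: unknown;
  try {
    offset = JSON.parse(decodeURIComponent(cursor)).offset;
  } catch {
    throw new Error("Invalid cursor");
  }
  if (typeof offset !== "number" || !Number.isInteger(offset) || offset < 0) {
    throw new Error("Invalid cursor");
  }
  return offset;
}

function paginate<T>(all: readonly T[], cursor: string | undefined, limit: number): Page<T> {
  const start = decodeCursor(cursor);
  const size = Math.min(Math.max(1, Math.floor(limit)), 100); // clamp page size to 1..100
  const items = all.slice(start, start + size);
  const next = start + items.length;
  // Omit nextCursor on the last page so the agent knows to stop
  return next < all.length ? { items, nextCursor: encodeCursor(next) } : { items };
}
```

Rejecting a malformed cursor with an explicit error, rather than silently restarting at the first page, gives the agent something it can report back to the user.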

Example Tool Definition

server.tool(
  "create_task",
  "Creates a new task in the specified project. " +
    "Returns the created task with its ID, status, and assignee.",
  {
    project_id: z.string().describe("The project ID to create the task in"),
    title: z.string().max(200).describe("Task title (max 200 characters)"),
    description: z.string().optional().describe("Task description in markdown"),
    assignee_email: z.string().email().optional()
      .describe("Email of the user to assign the task to"),
    priority: z.enum(["low", "medium", "high", "urgent"])
      .default("medium")
      .describe("Task priority level"),
    due_date: z.string().optional()
      .describe("Due date in ISO 8601 format (YYYY-MM-DD)"),
  },
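  // Tool annotations are passed directly as advisory hints (SDK 1.x tool() overload)
  { destructiveHint: false, idempotentHint: false },
  async ({ project_id, title, description, assignee_email, priority, due_date }, extra) => {
    // Tenant comes from the validated token, never from tool input. This assumes the
    // HTTP layer attached the verified claims to the request as req.auth, which the
    // SDK surfaces to handlers as extra.authInfo.
    const tenantId = extra.authInfo?.extra?.tenant_id as string;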
    const task = await yourApi.createTask(tenantId, {
      project_id, title, description, assignee_email, priority, due_date,
    });
    return {
      content: [{ type: "text", text: JSON.stringify(task, null, 2) }],
    };
  }
);

5. Implementation: TypeScript SDK + Express

The official @modelcontextprotocol/sdk package (available on npm) provides the server primitives. We pair it with Express for HTTP handling and Streamable HTTP transport.

Project Setup

mkdir your-saas-mcp-server && cd your-saas-mcp-server
npm init -y
npm install @modelcontextprotocol/sdk express zod jsonwebtoken
npm install -D typescript @types/express @types/node @types/jsonwebtoken tsx

Server Skeleton

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from
  "@modelcontextprotocol/sdk/server/streamableHttp.js";
import express from "express";
import { z } from "zod";
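import { verifyToken } from "./auth.js"; // your token validation: signature, exp, iss, audience
import { yourApi } from "./api.js";      // placeholder for your existing API client

const app = express();
app.use(express.json());

// Stateless mode: build a fresh MCP server per request so nothing is shared
// between requests or tenants
function createServer() {
  const server = new McpServer({
    name: "your-saas-mcp",
    version: "1.0.0",
  });

  // Register tools
  server.tool(
    "list_projects",
    "Lists all projects accessible to the authenticated user",
    {
      cursor: z.string().optional(),
      limit: z.number().int().min(1).max(100).default(20),
    },
    { readOnlyHint: true },
    async ({ cursor, limit }, extra) => {
      // Claims attached to req.auth below arrive here as extra.authInfo
      const tenantId = extra.authInfo?.extra?.tenant_id as string;
      const projects = await yourApi.listProjects(tenantId, { cursor, limit });
      return {
        content: [{ type: "text", text: JSON.stringify(projects, null, 2) }],
      };
    }
  );

  return server;
}

// Streamable HTTP endpoint
app.post("/mcp", async (req, res) => {
  // Validate OAuth token
  const authHeader = req.headers.authorization;
  if (!authHeader?.startsWith("Bearer ")) {
    return res.status(401).json({ error: "Missing bearer token" });
  }

  const token = authHeader.slice(7);
  // Assumed to return the decoded claims (as in the JWT payload above) or null
  const claims = await verifyToken(token);
  if (!claims) {
    // Invalid or expired credentials are 401; reserve 403 for insufficient scope
    return res.status(401).json({ error: "Invalid or expired token" });
  }

  // The transport reads req.auth and hands it to tool handlers as extra.authInfo
  Object.assign(req, {
    auth: {
      token,
      clientId: claims.sub,
      scopes: String(claims.scope ?? "").split(" "),
      extra: claims,
    },
  });

  const server = createServer();
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined, // undefined = stateless, no Mcp-Session-Id issued
  });
  res.on("close", () => {
    transport.close();
    server.close();
  });

  await server.connect(transport);
  // express.json() already consumed the body, so pass the parsed payload explicitly
  await transport.handleRequest(req, res, req.body);
});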

app.listen(3001, () => {
  console.log("MCP server running on port 3001");
});

Stateless vs. Stateful

For most SaaS MCP servers, stateless mode is preferred. Each request is independent, which makes horizontal scaling trivial. Use stateful sessions (with Mcp-Session-Id) only if your tools need to maintain conversation context across multiple invocations, like a multi-step wizard flow.

6. Multi-Tenancy and Data Isolation

The most critical production concern for SaaS MCP servers is ensuring one tenant's AI agent cannot access another tenant's data. This must be enforced at multiple layers:

Isolation Layers

Token-level: The OAuth token contains tenant_id. Every tool invocation extracts this and passes it to your API layer. Never trust a tenant_id from the tool input parameters
Tool-level: The allowed_tools claim restricts which tools a specific token can invoke. A read-only integration should not have access to delete_project
Data-level: Your API layer must filter all queries by tenant_id. This is the same row-level security you already implement for your web app
Rate-limit-level: Per-tenant rate limits prevent one customer's AI agent from consuming all your server capacity

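// Wrapper: enforce tenant isolation on every tool call
// Usage: server.tool(name, description, schema, enforceTenantIsolation(name, handler))
function enforceTenantIsolation(toolName, handler) {
  return async (params, extra) => {
    // Claims from the validated token, attached to the request by the HTTP layer
    const { tenant_id, allowed_tools } = extra.authInfo?.extra ?? {};
    if (!tenant_id) {
      throw new Error("Missing tenant context");
    }

    // Check tool-level access. A token without an allowed_tools claim may call
    // every tool; reject here instead if you prefer to fail closed.
    if (allowed_tools && !allowed_tools.includes(toolName)) {
      throw new Error(`Tool ${toolName} not authorized for this token`);
    }

    // Inject tenant context (never trust client-provided tenant_id)
    const enrichedParams = { ...params, _tenant_id: tenant_id };
    return handler(enrichedParams, extra);
  };
}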
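For the data-level layer, one possible shape is a repository that is constructed with the tenant ID and filters every read and write by it. The sketch below uses an in-memory array as a stand-in for your database; the Project type and the TenantScopedProjects class are hypothetical names for illustration.

```typescript
interface Project {
  id: string;
  tenantId: string;
  name: string;
}

// Every instance is bound to exactly one tenant; callers cannot pass a tenant ID per call
class TenantScopedProjects {
  private readonly rows: Project[];
  private readonly tenantId: string;

  constructor(rows: Project[], tenantId: string) {
    this.rows = rows;
    this.tenantId = tenantId;
  }

  list(): Project[] {
    return this.rows.filter((p) => p.tenantId === this.tenantId);
  }

  // Lookups go through list(), so a valid ID from another tenant resolves to undefined
  get(id: string): Project | undefined {
    return this.list().find((p) => p.id === id);
  }

  create(id: string, name: string): Project {
    const project: Project = { id, tenantId: this.tenantId, name };
    this.rows.push(project);
    return project;
  }
}
```

A tool handler would create the repository from the token's tenant ID at the start of each call, which leaves no code path where tool input can select the tenant.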

7. Billing and Usage Metering

MCP tool invocations are API calls. If your SaaS has usage-based pricing, MCP usage should count toward the customer's quota. If you offer a flat-rate plan, you need rate limiting to prevent AI agents from consuming disproportionate resources.

Metering Strategy

Per-tool metering: Track invocations per tool per tenant. Some tools (read operations) may be free while others (create/update) count toward usage
Async event emission: Emit usage events to your billing system (Stripe Meters, AWS Marketplace Metering, or custom) asynchronously so billing doesn't add latency to tool responses
Quota enforcement: Check remaining quota before executing expensive tools. Return a clear error message so the AI agent can inform the user
Tiered rate limits: Free tier: 100 calls/day. Pro: 10,000 calls/day. Enterprise: custom. Encode the tier in the OAuth token claims

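// Usage metering: call before executing a tool.
// redis, billingQueue, and BILLABLE_TOOLS come from your application.
import { McpError, ErrorCode } from "@modelcontextprotocol/sdk/types.js";

const TIER_LIMITS = { free: 100, pro: 10000, enterprise: Infinity };

async function meterUsage(toolName, tenantId, tier) {
  const day = new Date().toISOString().slice(0, 10); // UTC date, YYYY-MM-DD
  const key = `usage:${tenantId}:${day}`;

  // Increment first, then compare: INCR is atomic, so concurrent calls
  // cannot both pass a stale read of the counter
  const usage = await redis.incr(key);
  if (usage === 1) {
    await redis.expire(key, 60 * 60 * 48); // let old daily counters expire
  }

  const limit = TIER_LIMITS[tier] ?? TIER_LIMITS.free;
  if (usage > limit) {
    throw new McpError(
      ErrorCode.InvalidRequest,
      `Daily API quota exceeded (${limit} calls). Upgrade plan or wait until tomorrow.`
    );
  }

  // Emit billing event (async, non-blocking)
  billingQueue.emit({
    tenant_id: tenantId,
    tool: toolName,
    timestamp: Date.now(),
    billable: BILLABLE_TOOLS.includes(toolName),
  });
}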
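For unit tests it can help to have a quota counter that needs no Redis. The following is a minimal in-memory sketch using the tier limits listed above; the InMemoryQuota class and its consume method are hypothetical, and unlike a Redis counter this version does not count rejected calls.

```typescript
type Tier = "free" | "pro" | "enterprise";

const TIER_LIMITS: Record<Tier, number> = { free: 100, pro: 10000, enterprise: Infinity };

class InMemoryQuota {
  private readonly counts = new Map<string, number>();

  // Records one call and returns how many calls remain today; throws when over quota
  consume(tenantId: string, tier: Tier, day: string): number {
    const key = `usage:${tenantId}:${day}`; // same key layout as the Redis counter
    const used = (this.counts.get(key) ?? 0) + 1;
    const limit = TIER_LIMITS[tier];
    if (used > limit) {
      throw new Error(`Daily API quota exceeded (${limit} calls)`);
    }
    this.counts.set(key, used);
    return limit - used;
  }
}
```

Passing the day in as a parameter, rather than reading the clock inside the method, makes the daily rollover easy to test.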

8. Deployment on AWS (ECS Fargate)

For production SaaS MCP servers, ECS Fargate provides the right balance of scalability, cost, and operational simplicity. Your MCP server is a stateless HTTP service, which maps perfectly to container-based deployment.

Infrastructure Components

Application Load Balancer

HTTPS termination, path-based routing to /mcp endpoint, health checks

ECS Fargate Service

Auto-scaling 2-10 tasks based on request count, 0.5 vCPU / 1GB per task

ElastiCache (Redis)

Rate limiting counters, session state (if stateful), usage metering buffer

CloudWatch + X-Ray

Request logging, latency tracing, error alerting, usage dashboards

Estimated Monthly Cost

For a SaaS MCP server handling 50,000-200,000 tool invocations per day:

ECS Fargate (4 tasks avg): ~$120/month (0.5 vCPU, 1GB each)
ALB: ~$25/month, combining the fixed hourly charge with usage-based LCU (Load Balancer Capacity Unit) charges
ElastiCache (t4g.micro): ~$15/month
CloudWatch: ~$20/month
Total: ~$180-$250/month for moderate traffic
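To see how the estimate moves with the auto-scaling range, here is a small worked calculation built only from the line items above. It assumes the Fargate figure scales linearly with the average task count (about $30 per task, from $120 for 4 tasks) and that the other items stay fixed; usage-based load balancer charges are left out. The function name estimateMonthlyCost is illustrative.

```typescript
// Line items taken from the estimate above (USD per month)
const FARGATE_PER_TASK = 120 / 4; // ~$30 per 0.5 vCPU / 1GB task
const FIXED_COSTS = { alb: 25, elastiCache: 15, cloudWatch: 20 };

// Assumption: linear scaling in task count, fixed costs unchanged
function estimateMonthlyCost(avgTasks: number): number {
  const fixed = FIXED_COSTS.alb + FIXED_COSTS.elastiCache + FIXED_COSTS.cloudWatch;
  return avgTasks * FARGATE_PER_TASK + fixed;
}

const atMinimum = estimateMonthlyCost(2);  // 2 * 30 + 60 = 120
const atAverage = estimateMonthlyCost(4);  // 4 * 30 + 60 = 180
const atMaximum = estimateMonthlyCost(10); // 10 * 30 + 60 = 360
```

Under these assumptions the average case reproduces the low end of the stated range, while a service pinned at the 10-task ceiling would land well above it.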

9. Testing and Debugging MCP Servers

MCP servers are harder to test than REST APIs because the client behavior (AI agent deciding which tool to call) is non-deterministic. Here is the testing strategy:

Testing Layers

Unit tests: Test each tool handler in isolation with mocked API responses. Verify input validation, error handling, and response format
Integration tests: Use the MCP Inspector tool (official debugging client) to send JSON-RPC requests and verify responses
Auth tests: Verify token validation, tenant isolation, and tool-level access control with expired, invalid, and cross-tenant tokens
End-to-end tests: Connect Claude Desktop or Cursor to your server and verify real tool invocations work correctly
Load tests: Simulate concurrent multi-tenant usage to verify rate limiting and isolation under load

// Example: Testing tenant isolation
// (createTestToken and callTool are test helpers you provide)
describe("tenant isolation", () => {
  it("prevents cross-tenant data access", async () => {
    const tokenA = createTestToken({ tenant_id: "org_a" });
    const tokenB = createTestToken({ tenant_id: "org_b" });

    // Create a project as tenant A
    const project = await callTool("create_project", {
      name: "Secret Project",
    }, tokenA);

    // Attempt to list projects as tenant B
    const result = await callTool("list_projects", {}, tokenB);

    // Tenant B should NOT see tenant A's project
    expect(result.projects).not.toContainEqual(
      expect.objectContaining({ id: project.id })
    );
  });
});
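A test helper such as callTool has to turn the raw JSON-RPC response into the object the assertions use. One way to do that is sketched below, assuming the tool returns its payload as JSON in a text content item, as the handlers in this guide do. The parseToolResult name and the trimmed-down response types are illustrative.

```typescript
interface ToolResult {
  content: { type: string; text?: string }[];
  isError?: boolean;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number | string | null;
  result?: ToolResult;
  error?: { code: number; message: string };
}

// Unwraps a tools/call response and parses the first text item as JSON
function parseToolResult<T>(response: JsonRpcResponse): T {
  if (response.error) {
    throw new Error(`JSON-RPC error ${response.error.code}: ${response.error.message}`);
  }
  const result = response.result;
  if (!result) throw new Error("Response has neither result nor error");
  const first = result.content.find((c) => c.type === "text" && typeof c.text === "string");
  if (!first || first.text === undefined) throw new Error("Tool result has no text content");
  // Tool-level failures come back as a normal result flagged with isError
  if (result.isError) throw new Error(`Tool error: ${first.text}`);
  return JSON.parse(first.text) as T;
}
```

Distinguishing protocol errors from tool errors in the helper keeps individual tests short and makes failures easier to read.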

10. Why Lushbinary for Your MCP Server Build

Building a production MCP server involves more than the protocol itself. You need OAuth integration with your existing auth system, multi-tenant data isolation, billing metering, deployment infrastructure, and ongoing maintenance as the MCP spec evolves. Lushbinary has built MCP servers for SaaS products across project management, analytics, and developer tools.

What We Deliver

Full MCP server implementation: TypeScript, Streamable HTTP, OAuth 2.1, multi-tenant isolation, tool annotations
Integration with your existing API: We wrap your REST/GraphQL endpoints as MCP tools with proper error handling and pagination
Billing integration: Usage metering connected to Stripe, AWS Marketplace, or your custom billing system
AWS deployment: ECS Fargate, ALB, CloudWatch monitoring, auto-scaling, CI/CD pipeline
Testing suite: Unit, integration, auth, and load tests with CI integration
Documentation: Developer docs for your customers showing how to connect their AI agents to your MCP server

Pricing

Scope	Timeline	Cost
Basic (5-10 tools, single-tenant, OAuth)	2-4 weeks	$8K-$15K
Production (15-30 tools, multi-tenant, billing, monitoring)	6-12 weeks	$25K-$60K
Ongoing maintenance	Monthly	$2K-$5K/mo

Free Consultation

Have a SaaS product and want to add MCP support for Claude, Cursor, and other AI agents? Lushbinary will scope your tool surface, design the auth flow, and give you a realistic timeline. 30-minute call, no obligation.

Frequently Asked Questions

What is an MCP server for a SaaS product?

An MCP server exposes your SaaS product's APIs to AI agents like Claude, ChatGPT, and Gemini through a standardized JSON-RPC protocol. It lets AI agents read data, create records, and trigger workflows in your product without custom integrations per AI platform. MCP crossed 110 million monthly downloads in early 2026.

Should I build a local or remote MCP server for my SaaS?

Remote. Local MCP servers (stdio transport) require users to install and run your server on their machine. Remote servers (Streamable HTTP) run on your infrastructure, handle OAuth centrally, and work with any MCP client without installation. For B2B SaaS, remote is the only production-viable option.

What transport protocol should my MCP server use?

Streamable HTTP. It replaced SSE as the recommended remote transport in MCP spec version 2025-03-26. It uses a single /mcp endpoint, supports stateless and stateful sessions, and integrates with standard HTTP middleware. SSE is deprecated for new implementations.

How do I handle authentication in a multi-tenant MCP server?

Use OAuth 2.1 as specified in the MCP authorization extension. Your MCP server validates access tokens issued by your auth server. Each tenant gets scoped tokens with custom claims (tenant_id, allowed_tools, data_scope) that are validated on every tool invocation.

How much does it cost to build an MCP server for a SaaS product?

A basic server (5-10 tools, OAuth, single-tenant) costs $8K-$15K and takes 2-4 weeks. A production multi-tenant server with billing, rate limiting, and monitoring costs $25K-$60K and takes 6-12 weeks. Ongoing maintenance runs $2K-$5K/month.

Sources

Content was rephrased for compliance with licensing restrictions. Protocol specifications sourced from official MCP documentation as of May 2026. Pricing estimates based on Lushbinary project history and AWS pricing calculator. Always verify current pricing on vendor websites.

Build an MCP Server for Your SaaS: Claude Integration Guide with Auth, Billing & Multi-Tenancy