MCP Server Patterns for Tool-Augmented LLMs
Design patterns for building Model Context Protocol servers, from tool registration to transport options to authentication strategies.
Akhil Sharma
February 13, 2026
The Model Context Protocol (MCP) has become a widely adopted standard interface for giving LLMs access to external tools and data sources. Instead of building custom tool-calling integrations for each model provider, you build an MCP server once and any compatible client can use it.
But "build an MCP server" glosses over the design decisions that determine whether your server is reliable, secure, and fast. Here are the patterns that work.
MCP Architecture in 30 Seconds
An MCP server exposes three primitives:
- Tools: Functions the LLM can call (execute query, create file, send message)
- Resources: Data the LLM can read (file contents, database schemas, API docs)
- Prompts: Reusable prompt templates with parameters
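On the wire, each primitive maps to a small family of JSON-RPC 2.0 methods. Here is a minimal sketch of the request envelopes a client might send. The helper `make_request`, the tool name, the resource URI, and the prompt name are all hypothetical; only the method names come from the protocol.

```python
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def make_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope (hypothetical helper)."""
    request = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


# Tools: discover what exists, then call one by name with arguments.
list_tools = make_request("tools/list")
call_tool = make_request(
    "tools/call",
    {"name": "query_database", "arguments": {"sql": "SELECT 1"}},
)

# Resources: read by URI (this URI is made up for illustration).
read_resource = make_request("resources/read", {"uri": "db://schema/orders"})

# Prompts: fetch a template and fill in its parameters.
get_prompt = make_request(
    "prompts/get",
    {"name": "summarize_table", "arguments": {"table": "orders"}},
)
```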
Tool Design: The Most Important Decision
The quality of your tools determines the quality of LLM interactions. A poorly designed tool confuses the model and produces bad tool calls.
Rule 1: One tool, one action. Don't build a database tool that accepts an action parameter. Build query_database, insert_record, update_record, and delete_record separately. The model selects tools by name and description — distinct tools with clear names get called correctly far more often.
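A rough sketch of what Rule 1 looks like in code. The `tool` decorator and `TOOLS` registry are hypothetical stand-ins for whatever registration mechanism your SDK provides, and the in-memory dict stands in for a real database.

```python
from typing import Callable, Dict

TOOLS: Dict[str, Callable[..., dict]] = {}  # hypothetical registry: name -> handler
_rows: Dict[int, dict] = {}                 # in-memory stand-in for a real table


def tool(fn):
    """Register a function as a tool under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def query_database(record_id: int) -> dict:
    return {"found": record_id in _rows, "record": _rows.get(record_id)}


@tool
def insert_record(record_id: int, data: dict) -> dict:
    if record_id in _rows:
        return {"ok": False, "error": "record already exists"}
    _rows[record_id] = dict(data)
    return {"ok": True}


@tool
def update_record(record_id: int, data: dict) -> dict:
    if record_id not in _rows:
        return {"ok": False, "error": "record not found"}
    _rows[record_id].update(data)
    return {"ok": True}


@tool
def delete_record(record_id: int) -> dict:
    # pop() returns None when the record was absent, so "ok" reports whether
    # something was actually deleted.
    return {"ok": _rows.pop(record_id, None) is not None}
```

Each handler takes only the arguments its action needs, so there is no `action` parameter for the model to get wrong.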
Rule 2: Descriptions are prompts. The tool description is injected into the model's context. Write it like you're explaining the tool to a competent developer who has never seen your system: say what the tool does, when to use it, what each parameter means, and what it returns.
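As an illustration, here is what a tool definition with a prompt-quality description might look like. The `name`, `description`, and `inputSchema` keys follow the MCP tool definition shape; the database, the wording, and the `missing_descriptions` lint helper are assumptions made up for this sketch.

```python
QUERY_DATABASE_TOOL = {
    "name": "query_database",
    "description": (
        "Run a read-only SQL SELECT against the analytics database and "
        "return matching rows as JSON. Use this to answer questions about "
        "orders or customers. It never modifies data; use insert_record or "
        "update_record for writes. Results are capped at `limit` rows."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "sql": {
                "type": "string",
                "description": "A single SELECT statement.",
            },
            "limit": {
                "type": "integer",
                "description": "Maximum number of rows to return.",
                "default": 100,
            },
        },
        "required": ["sql"],
    },
}


def missing_descriptions(tool: dict) -> list:
    """Return 'tool' and/or parameter names that lack a description."""
    missing = [] if tool.get("description", "").strip() else ["tool"]
    props = tool.get("inputSchema", {}).get("properties", {})
    missing += [
        name for name, spec in props.items()
        if not spec.get("description", "").strip()
    ]
    return missing
```

A check like `missing_descriptions` is cheap to run in CI so that no tool ships with an undocumented parameter.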
Rule 3: Return structured data, not natural language. When a tool returns results, return JSON or a consistent text format. The model processes structured output more reliably than free-form text. Include metadata like row counts, truncation indicators, and error details.
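A small sketch of Rule 3, assuming the tool returns its payload as JSON inside a single text content item. The field names `row_count`, `total_rows`, and `truncated` are illustrative choices, not part of the protocol.

```python
import json


def format_query_result(rows: list, limit: int = 100) -> dict:
    """Wrap query rows in a structured, self-describing tool result."""
    shown = rows[:limit]
    payload = {
        "rows": shown,
        "row_count": len(shown),         # rows actually returned
        "total_rows": len(rows),         # rows that matched before the cap
        "truncated": len(rows) > limit,  # tells the model more data exists
        "error": None,
    }
    return {
        "content": [{"type": "text", "text": json.dumps(payload)}],
        "isError": False,
    }
```

With `truncated` set, the model can decide to narrow its query instead of assuming it has seen every row.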
Transport Selection
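MCP supports multiple transport mechanisms. The choice depends on your deployment model. Note that recent revisions of the specification define stdio and Streamable HTTP as the standard transports; the older HTTP+SSE transport is deprecated and kept mainly for backward compatibility with existing clients.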
stdio: The client launches the server as a subprocess and exchanges messages with it over stdin/stdout. Simplest setup, lowest latency, but limited to local execution. This is what Claude Code and most IDE integrations use.
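To make the stdio transport concrete, here is a minimal sketch of the read loop: one JSON-RPC message per line on stdin, one response per line on stdout. It only answers `ping` and rejects everything else, and the function names `handle_message` and `serve` are hypothetical. A real server would use an SDK rather than hand-rolling this.

```python
import json
import sys
from typing import Optional, TextIO


def _error(msg_id, code: int, message: str) -> str:
    body = {"jsonrpc": "2.0", "id": msg_id,
            "error": {"code": code, "message": message}}
    return json.dumps(body)


def handle_message(line: str) -> Optional[str]:
    """Return one JSON response line, or None when no reply is due."""
    try:
        msg = json.loads(line)
    except json.JSONDecodeError:
        return _error(None, -32700, "Parse error")
    if not isinstance(msg, dict):
        return _error(None, -32600, "Invalid Request")
    if "id" not in msg:
        return None  # notifications never get a response
    if msg.get("method") == "ping":
        return json.dumps({"jsonrpc": "2.0", "id": msg["id"], "result": {}})
    return _error(msg["id"], -32601, "Method not found")


def serve(stdin: TextIO = sys.stdin, stdout: TextIO = sys.stdout) -> None:
    """Process newline-delimited messages until stdin closes."""
    for raw in stdin:
        line = raw.strip()
        if not line:
            continue
        response = handle_message(line)
        if response is not None:
            # stdout carries protocol messages only; send logs to stderr.
            stdout.write(response + "\n")
            stdout.flush()
```

Calling `serve()` from the script's entry point is enough for a client to launch it as a subprocess. The streams are parameters so the loop can be exercised in tests without a real process.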
SSE (Server-Sent Events): HTTP-based transport. The client connects via HTTP, sends requests as POST, and receives responses as SSE events. Works across networks, easy to deploy behind a reverse proxy.
Streamable HTTP: The newest transport option. A single HTTP endpoint accepts POST requests and can stream responses back over SSE when a call needs it. Better for stateless deployments and serverless functions.
| Transport | Latency | Deployment | Statefulness | Best For |
|---|---|---|---|---|
| stdio | Lowest | Local only | Stateful | IDE plugins, CLI tools |
| SSE | Low | Network | Stateful | Internal services |
| Streamable HTTP | Low | Network/Serverless | Stateless option | Cloud deployments, multi-tenant |
Authentication Patterns
For remote MCP servers, you need authentication. The specification defines an authorization flow for HTTP transports based on OAuth 2.1, which consolidates OAuth 2.0 best practices such as PKCE, but the implementation patterns vary.
Pattern 1: Token passthrough. The MCP client includes a bearer token in the initial connection. The server validates it against your auth service. Simple, works for single-tenant deployments.
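A sketch of Pattern 1 under a few assumptions: headers arrive as a plain mapping with a normalized `Authorization` key, and `validate` is whatever callable talks to your auth service. The static validator below is a stand-in for that service, included only so the example runs.

```python
import hmac
from typing import Callable, Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Pull the token out of an 'Authorization: Bearer <token>' header."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def make_static_validator(expected_token: str) -> Callable[[str], bool]:
    """Stand-in for a real auth service lookup (assumption for this sketch)."""
    expected = expected_token.encode("utf-8")

    def validate(token: str) -> bool:
        # Constant-time comparison avoids leaking the token via timing.
        return hmac.compare_digest(token.encode("utf-8"), expected)

    return validate


def authorize_connection(headers: Mapping[str, str],
                         validate: Callable[[str], bool]) -> bool:
    """Reject the connection unless a well-formed, valid token is present."""
    token = extract_bearer_token(headers)
    return token is not None and validate(token)
```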
Pattern 2: OAuth 2.0 with PKCE. For multi-tenant servers where users authenticate via a browser flow. The MCP client initiates the OAuth flow, the user authenticates in a browser, and the client receives a token.
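The PKCE part of this flow is small enough to show in full. This sketch follows RFC 7636 with the S256 method: the client generates a random verifier, sends its hashed challenge in the authorization request, and later proves possession by sending the verifier with the token request. The function names are my own.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    """Base64url without padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_code_verifier(n_bytes: int = 32) -> str:
    """Generate a high-entropy verifier (43 characters for 32 bytes)."""
    return _b64url(secrets.token_bytes(n_bytes))


def make_code_challenge(verifier: str) -> str:
    """Derive the S256 challenge: BASE64URL(SHA256(verifier))."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


def verify_challenge(verifier: str, challenge: str) -> bool:
    """Server-side check performed when the client redeems its code."""
    return secrets.compare_digest(make_code_challenge(verifier), challenge)
```

The verifier never leaves the client until the token exchange, so an intercepted authorization code is useless on its own.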
Pattern 3: API key with scoping. Each client gets an API key that maps to a set of allowed tools and resources. Simple to implement, good for service-to-service communication.
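One way to sketch Pattern 3 is a lookup table from key to scope. The keys, tool names, and URI prefixes below are invented for illustration, and a real deployment would store hashed keys in a database rather than a module-level dict.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class KeyScope:
    tools: FrozenSet[str]              # tool names this key may call
    resource_prefixes: Tuple[str, ...]  # URI prefixes this key may read


# Hypothetical keys and scopes, hard-coded only to keep the sketch runnable.
API_KEYS: Dict[str, KeyScope] = {
    "key-reporting": KeyScope(frozenset({"db_query"}), ("db://schema/",)),
    "key-admin": KeyScope(frozenset({"db_query", "db_insert"}),
                          ("db://", "docs://")),
}


def can_call_tool(api_key: str, tool_name: str) -> bool:
    scope = API_KEYS.get(api_key)
    return scope is not None and tool_name in scope.tools


def can_read_resource(api_key: str, uri: str) -> bool:
    scope = API_KEYS.get(api_key)
    # str.startswith accepts a tuple and matches any of its prefixes.
    return scope is not None and uri.startswith(scope.resource_prefixes)


def visible_tools(api_key: str, all_tools: List[str]) -> List[str]:
    """Filter the tool list so a client only sees what it may call."""
    return [name for name in all_tools if can_call_tool(api_key, name)]
```

Filtering the tool list itself, not just rejecting calls, keeps out-of-scope tools from ever entering the model's context.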
Resource Management
Resources expose data that the LLM can read without executing a function. Think of them as a filesystem-like interface to your data.
Pattern: Lazy loading with caching. Don't load all resources at startup. Load them on first access and cache with a TTL. For large datasets (API docs, codebase indexes), use a resource template that loads specific sections on demand.
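A minimal sketch of lazy loading with a TTL, assuming resources are addressed by URI and that `loader` is the expensive function that fetches one. The class name and the injectable `clock` parameter are my own choices; the clock is injectable so expiry can be tested without sleeping.

```python
import time
from typing import Any, Callable, Dict, Tuple


class LazyResourceCache:
    """Load a resource on first access and reuse it until its TTL expires."""

    def __init__(self, loader: Callable[[str], Any],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # uri -> (expiry, value)

    def get(self, uri: str) -> Any:
        now = self._clock()
        entry = self._entries.get(uri)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no load
        value = self._loader(uri)  # first access or expired: load now
        self._entries[uri] = (now + self._ttl, value)
        return value

    def invalidate(self, uri: str) -> None:
        """Drop one entry, for example after the underlying data changes."""
        self._entries.pop(uri, None)
```

Because the cache is keyed by URI, a resource template that addresses individual sections gets per-section loading for free: only the sections a client actually reads are ever fetched.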
Pattern: Resource subscriptions. If your resources change (live database schema, updated docs), implement notifications so the client knows to re-fetch.
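Roughly, the server has to remember which sessions subscribed to which URIs and push a `notifications/resources/updated` message when one changes. In this sketch the registry class, the session ids, and the `send` callback are hypothetical; the notification method name and its `uri` parameter come from the protocol.

```python
import json
from collections import defaultdict
from typing import Callable, DefaultDict, Set


class SubscriptionRegistry:
    """Track which sessions want change notifications for which URIs."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, Set[str]] = defaultdict(set)

    def subscribe(self, session_id: str, uri: str) -> None:
        self._subs[uri].add(session_id)

    def unsubscribe(self, session_id: str, uri: str) -> None:
        self._subs[uri].discard(session_id)

    def notify_updated(self, uri: str,
                       send: Callable[[str, str], None]) -> int:
        """Send an update notification to every subscriber of `uri`."""
        # Notifications carry no "id" because no response is expected.
        message = json.dumps({
            "jsonrpc": "2.0",
            "method": "notifications/resources/updated",
            "params": {"uri": uri},
        })
        subscribers = sorted(self._subs.get(uri, ()))
        for session_id in subscribers:
            send(session_id, message)
        return len(subscribers)
```

The notification only says that the resource changed. The client still decides whether and when to read it again.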
Error Handling and Timeouts
MCP tool calls can fail in ways that the LLM needs to understand. Don't just throw exceptions — return error information the model can reason about.
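Here is one way to sketch that, combining a timeout with a structured error result. The `isError` flag and `content` list follow the MCP tool result shape; the error payload fields (`code`, `message`, `hint`, `retryable`) and both helper functions are conventions assumed for this example.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable

_executor = ThreadPoolExecutor(max_workers=4)


def tool_error(code: str, message: str, hint: str,
               retryable: bool = False) -> dict:
    """Build an error result the model can read and act on."""
    payload = {"error": {"code": code, "message": message,
                         "hint": hint, "retryable": retryable}}
    return {"content": [{"type": "text", "text": json.dumps(payload)}],
            "isError": True}


def call_with_timeout(fn: Callable[..., dict], arguments: dict,
                      timeout_seconds: float = 5.0) -> dict:
    """Run a tool handler, converting failures into structured errors."""
    future = _executor.submit(fn, **arguments)
    try:
        data = future.result(timeout=timeout_seconds)
    except FutureTimeout:
        # The worker thread is not killed; it finishes in the background.
        return tool_error(
            "timeout",
            f"Tool did not finish within {timeout_seconds} seconds.",
            hint="Narrow the request (fewer rows, tighter filter) and retry.",
            retryable=True,
        )
    except Exception as exc:
        return tool_error(
            "tool_failed",
            str(exc),
            hint="Check the arguments against the tool's input schema.",
        )
    return {"content": [{"type": "text", "text": json.dumps(data)}],
            "isError": False}
```

Returning the failure as a result, rather than letting the exception escape, keeps the error inside the conversation where the model can see it.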
Include a hint field in every structured error: a short, actionable suggestion for fixing the call. When the model sees a structured error with a hint, it can self-correct and retry with a modified tool call. Without it, the model often repeats the same failing call or gives up.
Composing Multiple MCP Servers
In production, you'll have multiple MCP servers — one for database access, one for file management, one for external APIs. The MCP client connects to all of them simultaneously, and the model sees a unified tool palette.
Keep servers focused on a single domain. A database server shouldn't also manage files. This makes each server simpler to test, deploy, and secure independently.
Name tools to avoid collisions across servers. Prefix tool names with the domain: db_query, db_insert, files_read, files_write, github_create_issue. The model uses these prefixes to understand which domain a tool belongs to, improving tool selection accuracy.
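A client-side sketch of the same idea: merge the tool lists from several servers into one palette, applying the domain prefix where it is missing and failing loudly on a collision. The function name and the shape of the `servers` argument are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def build_tool_palette(servers: Dict[str, List[str]]) -> Dict[str, Tuple[str, str]]:
    """Map each prefixed tool name to (server prefix, original tool name).

    `servers` maps a domain prefix such as "db" to that server's tool names.
    """
    palette: Dict[str, Tuple[str, str]] = {}
    for prefix, tools in servers.items():
        for name in tools:
            # Leave names alone if the server already prefixed them.
            if name.startswith(prefix + "_"):
                qualified = name
            else:
                qualified = f"{prefix}_{name}"
            if qualified in palette:
                raise ValueError(f"tool name collision: {qualified}")
            palette[qualified] = (prefix, name)
    return palette
```

The returned mapping also gives the client what it needs to route a call: look up the prefixed name, then forward the original name to the right server.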
MCP servers are the interface layer between LLMs and your systems. Treat them like APIs — design clear contracts, validate inputs, handle errors gracefully, and version them. The time you invest in tool design pays back every time the model makes a correct tool call instead of hallucinating an action.