What Is MCP (Model Context Protocol)? A Developer's Guide for 2026

MCP is the standard layer between LLMs and your stack. One protocol for tools, resources, and prompts across Cursor, Claude, and custom agents.

I asked Cursor to debug a failing webhook last month. It confidently blamed a race condition in code that did not exist, because it never saw our staging logs or the internal API schema. The model was smart. It was also blind.

That gap is the whole reason Model Context Protocol (MCP) exists. LLMs do not natively read your Postgres replica, call your on-call runbook API, or grep a monorepo unless something wires those systems in. Before MCP, every IDE and chat product invented its own plugin format. You rebuilt the same GitHub connector three times for three hosts. MCP is Anthropic's open answer: one JSON-RPC-shaped protocol for tools, resources, and prompts, with a defined handshake so hosts can discover what a server exposes.

If you are a developer trying to ship agent features without vendor lock-in, MCP is the vocabulary your team will use in 2026. This post is the technical map I wish I had before I wired my first server.

The problem MCP solves: capable models, zero system access

Function calling gave models a way to request structured actions. It did not give you a standard way to publish those actions across products.

In practice you ended up with:

  • Custom OpenAI function schemas duplicated in Python and TypeScript
  • IDE extensions that only work in one editor
  • RAG pipelines that dump text into context but cannot execute a scoped SQL query
  • Security models that differ per integration ("who approved this deploy?")

MCP does not replace your application logic. It standardizes the boundary between an AI host (Cursor, Claude Desktop, a cron-driven agent) and the outside world (files, tickets, databases, internal HTTP APIs).

Think of it like USB-C for agent integrations: the host speaks one client protocol; you ship adapters (MCP servers) for each backend.

What is MCP (Model Context Protocol)?

MCP is an open protocol for connecting AI applications to external data and actions. A host runs the model and UI. An MCP client inside the host maintains sessions with one or more MCP servers. Each server advertises:

| Primitive | What it is | Example |
| --- | --- | --- |
| Tools | Callable functions with JSON Schema inputs | search_issues, run_readonly_sql, read_file |
| Resources | Addressable context the model can read | OpenAPI spec, schema.sql, runbook markdown |
| Prompts | Pre-built prompt templates with arguments | /review-pr with diff and style_guide |

The model decides when to use a tool. The host decides whether the call is allowed. That split matters for production: you do not want an LLM silently running DROP TABLE because the tool existed.

Official spec and SDKs live at modelcontextprotocol.io. The protocol is intentionally boring (JSON-RPC 2.0 messages). Boring is good when you need interchangeable clients.

[Figure: MCP architecture diagram showing host, MCP client, and multiple servers exposing tools and resources]

How MCP works under the hood

Session lifecycle

When a host connects to a server:

  1. Initialize: protocol version, capabilities exchange
  2. Discover: tools/list, resources/list, prompts/list
  3. Operate: model invokes tools/call, reads resources/read, or fetches a prompt
  4. Notify (optional): server pushes list_changed when tools update

You can see the shape in mermaid if your renderer supports it:

sequenceDiagram
    participant Host as MCP Host (Cursor)
    participant Client as MCP Client
    participant Server as MCP Server
    Host->>Client: Start session
    Client->>Server: initialize
    Server-->>Client: capabilities
    Client->>Server: tools/list
    Server-->>Client: tool schemas
    Host->>Client: Model requests tool call
    Client->>Server: tools/call
    Server-->>Client: result JSON
    Client-->>Host: Inject result into context
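
As a rough sketch of the first two lifecycle steps, the messages a client sends can be modeled as plain JSON-RPC 2.0 objects. The helper name makeRequest, the client name, and the protocol version string are my own illustrative assumptions; use whatever version your host and SDK actually negotiate.

```typescript
// Minimal JSON-RPC 2.0 shapes as used by MCP.
// Helper names and the version string are illustrative assumptions.
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

type JsonRpcNotification = {
  jsonrpc: "2.0";
  method: string;
  params?: Record<string, unknown>;
};

let nextId = 1;

// Builds a request with a fresh, increasing id.
function makeRequest(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  const req: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method };
  if (params !== undefined) req.params = params;
  return req;
}

// Step 1: initialize, then confirm with the "initialized" notification.
const initializeRequest = makeRequest("initialize", {
  protocolVersion: "2025-03-26", // assumption: replace with the version you negotiate
  capabilities: {},
  clientInfo: { name: "example-client", version: "0.0.1" }, // hypothetical client
});
const initializedNotification: JsonRpcNotification = {
  jsonrpc: "2.0",
  method: "notifications/initialized",
};

// Step 2: discover what the server exposes.
const discoveryRequests = ["tools/list", "resources/list", "prompts/list"].map((m) => makeRequest(m));

console.log(JSON.stringify([initializeRequest, initializedNotification, ...discoveryRequests], null, 2));
```

Notifications carry no id because nobody replies to them. That detail is handy when you are reading raw logs and wondering why a message never got a response.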

Transports: stdio vs HTTP+SSE

MCP messages are transport-agnostic. Two transports show up in real deployments:

stdio: The host spawns the server as a subprocess and speaks JSON-RPC over stdin/stdout. This is what Cursor and Claude Desktop use for local servers. Low latency, no open port, dies with the IDE.

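HTTP with Server-Sent Events: The server runs remotely. The client POSTs messages and the server streams events back. Useful for shared team gateways, but you must handle auth, TLS, timeouts, and rate limits yourself. Note that spec revisions from 2025-03-26 onward replace this original HTTP+SSE transport with Streamable HTTP, which can still stream over SSE, so check which one your host and SDK version support.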

[Figure: Comparison of MCP stdio local transport versus HTTP SSE remote transport]

I default to stdio for anything on my laptop. I only reach for HTTP+SSE when multiple developers need the same managed connector.
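
To make the stdio transport concrete: each JSON-RPC message travels as one line of JSON terminated by a newline. Here is a minimal sketch of the framing logic, assuming a hypothetical LineFramer helper. The real SDK transports do this for you, so treat it as an illustration, not something to ship.

```typescript
// Splits arbitrary stdin chunks into complete newline-delimited JSON messages.
// Hypothetical helper for illustration only.
class LineFramer {
  private buffer = "";

  // Feed one chunk; returns every complete message it finished.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const messages: unknown[] = [];
    let newlineIndex = this.buffer.indexOf("\n");
    while (newlineIndex !== -1) {
      const line = this.buffer.slice(0, newlineIndex).trim();
      this.buffer = this.buffer.slice(newlineIndex + 1);
      // JSON.parse throws on stray non-JSON output such as a debug print.
      if (line.length > 0) messages.push(JSON.parse(line));
      newlineIndex = this.buffer.indexOf("\n");
    }
    return messages;
  }
}

const framer = new LineFramer();
// A message can arrive split across chunks; nothing is emitted until the newline.
const first = framer.push('{"jsonrpc":"2.0","id":1,"method":"tools/li');
const second = framer.push('st"}\n{"jsonrpc":"2.0","method":"notifications/initialized"}\n');
console.log(first.length, second.length);
```

The JSON.parse call is the reason a stray debug line on stdout is fatal: the reader on the other side hits text that is not a JSON-RPC message.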

JSON-RPC messages you will debug at 11 p.m.

Every interaction is a JSON-RPC 2.0 envelope. When a server "does not work," tcpdump is rarely the answer. The server's stderr logs usually are, and the usual culprit is malformed output on stdout.

A tools/call request looks like this:

{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "tools/call",
  "params": {
    "name": "read_file",
    "arguments": { "path": "src/lib/blogs.ts" }
  }
}

The server responds with content blocks, not raw strings:

{
  "jsonrpc": "2.0",
  "id": 4,
  "result": {
    "content": [
      { "type": "text", "text": "import fs from \"fs\";\n..." }
    ],
    "isError": false
  }
}

Errors belong in the protocol shape, not thrown stack traces on stdout. If you build a custom server, log diagnostics to stderr only. A single console.log("debug") on stdout corrupts the stream and kills the session. I learned that the hard way on a Windows machine where the agent silently stopped seeing tools.
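
One way to keep errors in the protocol shape is to wrap every tool body so a thrown error becomes an isError result, with diagnostics sent to stderr. This is a minimal sketch: toToolResult is a hypothetical helper of mine, and it is synchronous for brevity even though real handlers are usually async.

```typescript
type ToolResult = {
  content: { type: "text"; text: string }[];
  isError: boolean;
};

// Runs a tool body and converts thrown errors into a protocol-shaped result.
function toToolResult(run: () => string): ToolResult {
  try {
    return { content: [{ type: "text", text: run() }], isError: false };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // console.error writes to stderr in Node, so stdout stays clean for JSON-RPC.
    console.error(`[tool error] ${message}`);
    return { content: [{ type: "text", text: `Tool failed: ${message}` }], isError: true };
  }
}

const okResult = toToolResult(() => 'import fs from "fs";');
const failedResult = toToolResult(() => {
  throw new Error("ENOENT: no such file");
});
```

The model still sees a readable failure message it can react to, while the stack trace stays in your logs.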

Capability negotiation happens at initialize. The client and server declare supported features (tools, resources, prompts, logging). If you add dynamic tools at runtime, send notifications/tools/list_changed so the host refreshes its registry.
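
A small sketch of the dynamic-tools case, assuming a hypothetical in-memory ToolRegistry with an injected send callback standing in for the transport. It emits the list_changed notification only when the tool set actually changes.

```typescript
type ToolListChanged = { jsonrpc: "2.0"; method: "notifications/tools/list_changed" };

// Hypothetical registry: calls `send` whenever the tool set changes.
class ToolRegistry {
  private tools = new Set<string>();
  private send: (msg: ToolListChanged) => void;

  constructor(send: (msg: ToolListChanged) => void) {
    this.send = send;
  }

  add(toolName: string): void {
    if (this.tools.has(toolName)) return; // no change, no notification
    this.tools.add(toolName);
    this.send({ jsonrpc: "2.0", method: "notifications/tools/list_changed" });
  }

  list(): string[] {
    return Array.from(this.tools).sort();
  }
}

const sentNotifications: ToolListChanged[] = [];
const registry = new ToolRegistry((msg) => sentNotifications.push(msg));
registry.add("tail_service_log");
registry.add("tail_service_log"); // duplicate: ignored
registry.add("search_issues");
console.log(sentNotifications.length, registry.list());
```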

SDKs and language choices for MCP servers

You do not need to implement JSON-RPC by hand. Official SDKs exist for TypeScript, Python, and other languages via the MCP GitHub org.

| Language | When I pick it |
| --- | --- |
| TypeScript | Node shops, quick wrappers around npm APIs, matching frontend teams |
| Python | Data teams, ML services, FastAPI shops with script-heavy ops |
| Go/Rust | Long-running remote servers where memory footprint matters |

The SDK handles stdio transport, schema validation helpers, and type definitions for request handlers. Your job is the boring business logic: call GitHub, run SQL, read a file.

Hosts do not care which language wrote the server. They care that tools/list returns valid schemas and that tools/call responds within timeout budgets.

What a tool definition actually looks like

Tools are not magic strings. They are schema-documented functions the model can rank and fill:

{
  "name": "search_github_issues",
  "description": "Search issues in the org/repo. Use for bug triage questions only.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": { "type": "string", "description": "GitHub search query" },
      "limit": { "type": "integer", "minimum": 1, "maximum": 20 }
    },
    "required": ["query"]
  }
}

The description field is part of your API design. Vague descriptions cause wrong tool picks. I treat them like docstrings that a junior engineer would read under pressure.
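
The schema is also your first line of defense on the server side. Below is a hand-rolled sketch that checks arguments against a tiny subset of JSON Schema (string and integer types, minimum, maximum, required). The validateArgs helper is hypothetical and deliberately stricter than JSON Schema defaults, since it rejects unknown fields. In a real server I would reach for a proper JSON Schema validator.

```typescript
type PropertySchema = { type: "string" | "integer"; minimum?: number; maximum?: number };
type InputSchema = {
  type: "object";
  properties: Record<string, PropertySchema>;
  required: string[];
};

const searchIssuesSchema: InputSchema = {
  type: "object",
  properties: {
    query: { type: "string" },
    limit: { type: "integer", minimum: 1, maximum: 20 },
  },
  required: ["query"],
};

// Returns a list of problems; an empty list means the arguments are acceptable.
function validateArgs(schema: InputSchema, args: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const key of schema.required) {
    if (!(key in args)) problems.push(`missing required field: ${key}`);
  }
  for (const [key, value] of Object.entries(args)) {
    const prop = schema.properties[key];
    if (prop === undefined) {
      problems.push(`unknown field: ${key}`);
      continue;
    }
    if (prop.type === "string" && typeof value !== "string") problems.push(`${key} must be a string`);
    if (prop.type === "integer") {
      if (typeof value !== "number" || !Number.isInteger(value)) {
        problems.push(`${key} must be an integer`);
      } else {
        if (prop.minimum !== undefined && value < prop.minimum) problems.push(`${key} is below ${prop.minimum}`);
        if (prop.maximum !== undefined && value > prop.maximum) problems.push(`${key} is above ${prop.maximum}`);
      }
    }
  }
  return problems;
}

console.log(validateArgs(searchIssuesSchema, { query: "is:open label:bug", limit: 5 }));
console.log(validateArgs(searchIssuesSchema, { limit: 50 }));
```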

Resources vs RAG

Resources are first-class URIs (file:///docs/api.md, postgres://schema/users). The host fetches them through MCP instead of you pasting 40k tokens into chat.

RAG still has a place for fuzzy semantic search across huge corpora. I use resources when the answer must come from an authoritative file (schema, policy, versioned spec) and RAG when the user asks "anything similar to this error across five years of tickets."
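
For a feel of what "fetches them through MCP" means, here is a minimal sketch of a resources/read handler body. The resourceTable and readResource names and the placeholder text are assumptions for illustration. The result shape (a contents array with uri, mimeType, and text) follows the spec as I understand it.

```typescript
type ResourceContents = { uri: string; mimeType: string; text: string };

// Hypothetical resource table keyed by URI; the text is a placeholder.
const resourceTable = new Map<string, ResourceContents>([
  [
    "file:///docs/api.md",
    { uri: "file:///docs/api.md", mimeType: "text/markdown", text: "# API\n(placeholder)" },
  ],
]);

// Returns the resources/read result for one URI, or throws for unknown URIs.
function readResource(uri: string): { contents: ResourceContents[] } {
  const found = resourceTable.get(uri);
  if (found === undefined) throw new Error(`Unknown resource: ${uri}`);
  return { contents: [found] };
}

console.log(JSON.stringify(readResource("file:///docs/api.md")));
```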

Real developer use cases that ship

These are patterns I have seen work, not demo theater.

IDE grounding: Cursor connects to filesystem, git, and linter MCP servers so the agent reads real files instead of inventing paths. This is the highest ROI setup for daily coding.

On-call debugging: Read-only Postgres + log search tools. Natural language becomes a query UI, but row limits and read-only DB users keep blast radius small.

API-accurate answers: Expose your OpenAPI JSON as a resource. The model cites /v2/invoices instead of hallucinating /api/invoice/list.

Ticket and PR workflows: GitHub or Linear tools for "summarize open PRs touching auth/." Pair with human approval before create_issue or merge.

Internal glue: A 200-line MCP server wrapping your staging deploy script is often faster than waiting for a SaaS integration. I built one for toggling feature flags in a dev environment. Ugly code, saved hours.

For orchestration patterns once tools exist, see Building AI agent workflows with MCP.
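
For the on-call debugging pattern above, this is roughly how I would cap blast radius inside a run_readonly_sql style tool. It is a deliberately naive sketch: guardReadOnlySql and MAX_ROWS are hypothetical names, the check rejects anything that is not a single plain SELECT (including CTEs that start with WITH), and the read-only database role remains the real control.

```typescript
const MAX_ROWS = 100; // assumption: tune per table size

// Naive guard: accepts a single SELECT statement and wraps it with a row cap.
// Defense in depth only; the read-only database role is the real control.
function guardReadOnlySql(sql: string): string {
  const trimmed = sql.trim().replace(/;\s*$/, "");
  // Conservative: any remaining semicolon is rejected, even inside a string literal.
  if (trimmed.includes(";")) throw new Error("Multiple statements are not allowed");
  if (!/^select\b/i.test(trimmed)) throw new Error("Only SELECT statements are allowed");
  return `SELECT * FROM (${trimmed}) AS guarded LIMIT ${MAX_ROWS}`;
}

console.log(guardReadOnlySql("select id from users;"));
```

Wrapping the query in a subselect means the cap applies even when the model writes its own, larger LIMIT.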

How hosts map MCP into model tool calls

Cursor and Claude do not expose raw JSON-RPC to you. They:

  1. Connect to configured servers at session start
  2. Cache tools/list results and inject summaries into the model context
  3. Translate model tool-call requests into MCP tools/call
  4. Feed results back as assistant/tool messages

That translation layer is why the same Postgres MCP server works when you switch from Claude Sonnet to GPT-class models inside Cursor. The MCP surface stays constant; only the planner changes.
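
The translation step is mostly a field rename. Here is a sketch of how a host might map one MCP tool definition into the two vendor tool formats, as I understand their public chat APIs. The function names are mine, and real hosts do more (name prefixes, truncation, approvals).

```typescript
type McpTool = { name: string; description: string; inputSchema: Record<string, unknown> };

// OpenAI-style chat tool entry: the schema goes under function.parameters.
function toOpenAiTool(tool: McpTool) {
  return {
    type: "function" as const,
    function: { name: tool.name, description: tool.description, parameters: tool.inputSchema },
  };
}

// Anthropic-style tool entry: the schema goes under input_schema.
function toAnthropicTool(tool: McpTool) {
  return { name: tool.name, description: tool.description, input_schema: tool.inputSchema };
}

const runReadonlySql: McpTool = {
  name: "run_readonly_sql",
  description: "Run a read-only SQL query against the replica.",
  inputSchema: { type: "object", properties: { sql: { type: "string" } }, required: ["sql"] },
};

console.log(JSON.stringify(toOpenAiTool(runReadonlySql)));
console.log(JSON.stringify(toAnthropicTool(runReadonlySql)));
```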

When debugging "the model ignored my tool," check both sides:

  • Discovery: Did the host load the server? Is the tool name exactly what you think?
  • Description match: Does the tool description sound like the user task?
  • Approval gate: Did the UI block a write you did not notice?
  • Timeout: Did the server take longer than the host allows?

I keep a scratch file of three golden prompts per server. If those fail, the problem is config or schema, not "AI randomness."
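
My scratch file is nothing fancy. A sketch of the idea, with hypothetical names and a made-up example prompt: record the tool sequence you expect, then compare it with the tool names you copy out of the host's logs or UI.

```typescript
type GoldenPrompt = { prompt: string; expectedTools: string[] };

const goldenPrompts: GoldenPrompt[] = [
  { prompt: "Show the last 20 lines of the staging api-service log", expectedTools: ["tail_service_log"] },
];

// Returns null when the observed tool sequence matches, otherwise a readable mismatch.
function checkGolden(golden: GoldenPrompt, observedTools: string[]): string | null {
  const same =
    golden.expectedTools.length === observedTools.length &&
    golden.expectedTools.every((tool, i) => tool === observedTools[i]);
  return same ? null : `expected [${golden.expectedTools.join(", ")}] but saw [${observedTools.join(", ")}]`;
}

for (const golden of goldenPrompts) {
  console.log(golden.prompt, "->", checkGolden(golden, ["tail_service_log"]) ?? "ok");
}
```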

MCP vs alternatives: what I did not choose

Vendor-native function calling only

OpenAI, Anthropic, and Google each support JSON tool schemas in their chat APIs. That works for a single model provider and a single app you control.

It falls apart when:

  • You want the same GitHub connector in Cursor and a nightly cron agent
  • You switch models but keep integrations
  • You need discovery (tools/list) without hardcoding schemas in app config

MCP sits below model-specific adapters. Your server writes once; hosts adapt.

ChatGPT-style plugins (legacy shape)

Early plugin ecosystems used OpenAPI documents and OAuth per vendor. MCP generalizes the idea with a smaller, explicit surface: tools, resources, prompts, uniform errors.

If you already maintain OpenAPI, generating MCP tool wrappers is straightforward. You are not throwing away that work.
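
A sketch of what "generating wrappers" can look like, assuming a heavily simplified OpenAPI operation type (query parameters only, no request bodies or $ref resolution). The operationToTool helper and the list_invoices operation are hypothetical.

```typescript
type OpenApiParameter = { name: string; required?: boolean; schema: { type: string }; description?: string };
type OpenApiOperation = { operationId: string; summary: string; parameters: OpenApiParameter[] };

// Maps one simplified OpenAPI operation to an MCP tool definition.
function operationToTool(op: OpenApiOperation) {
  const properties: Record<string, { type: string; description?: string }> = {};
  const required: string[] = [];
  for (const p of op.parameters) {
    properties[p.name] =
      p.description === undefined
        ? { type: p.schema.type }
        : { type: p.schema.type, description: p.description };
    if (p.required === true) required.push(p.name);
  }
  return {
    name: op.operationId,
    description: op.summary,
    inputSchema: { type: "object" as const, properties, required },
  };
}

const listInvoices: OpenApiOperation = {
  operationId: "list_invoices", // hypothetical operation
  summary: "List invoices for a customer.",
  parameters: [
    { name: "customer_id", required: true, schema: { type: "string" }, description: "Customer identifier" },
    { name: "limit", schema: { type: "integer" } },
  ],
};

console.log(JSON.stringify(operationToTool(listInvoices), null, 2));
```

The summary becomes the tool description, so vague OpenAPI summaries turn into vague tool descriptions. Review them before you publish the server.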

"Just paste more context"

Dumping repo files into the prompt window hits token limits, leaks secrets you forgot to redact, and goes stale on the next commit. MCP lets the model pull just the file or schema it needs, when it needs it.

Fully autonomous agents without gates

MCP does not stop a reckless agent loop. Pair MCP with step limits, spend caps, and explicit approval for writes. The protocol gives you hooks; your policy engine still owns safety.
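
A policy gate does not need to be elaborate to be useful. A minimal sketch, with hypothetical type names and made-up limits, of the three checks named above: step limit, spend cap, and approval for writes.

```typescript
type Policy = { maxSteps: number; maxSpendUsd: number; writeTools: string[] };
type LoopState = { steps: number; spendUsd: number };
type Decision = "allow" | "needs_approval" | "stop";

// Decides what the agent loop may do before it forwards a tool call to MCP.
function decide(policy: Policy, state: LoopState, toolName: string): Decision {
  if (state.steps >= policy.maxSteps) return "stop";
  if (state.spendUsd >= policy.maxSpendUsd) return "stop";
  if (policy.writeTools.includes(toolName)) return "needs_approval";
  return "allow";
}

const nightlyPolicy: Policy = { maxSteps: 10, maxSpendUsd: 2, writeTools: ["create_issue", "merge"] };

console.log(decide(nightlyPolicy, { steps: 3, spendUsd: 0.4 }, "search_issues"));
console.log(decide(nightlyPolicy, { steps: 3, spendUsd: 0.4 }, "create_issue"));
console.log(decide(nightlyPolicy, { steps: 10, spendUsd: 0.4 }, "search_issues"));
```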

Trade-offs and limitations (read this before production)

Operational overhead: Every MCP server is another process or service to deploy, monitor, and version. A flaky server means the model silently loses a capability mid-session.

Schema discipline tax: Bad tool names and vague descriptions show up as production bugs ("why did it call delete_user for a read query?"). Budget time for schema review like you would for REST endpoints.

Auth is on you: MCP defines messages, not identity. You still need scoped tokens, network policies, and audit logs. A server running on your laptop with full filesystem access is convenient and terrifying.

Not an observability standard: MCP defines the shape of tool calls but does not trace them. You still need OpenTelemetry or structured logs around the server itself. See AI agents landscape in 2026 for how teams layer governance on top.

Ecosystem maturity: Reference servers exist (filesystem, fetch, memory, GitHub, Postgres). Enterprise-grade connectors vary. Expect to write custom servers for internal APIs.

Latency stacking: Each tool call is a network or process hop plus another model inference pass. A five-tool chain can feel slow in the IDE even when each step is "fast." Batch read-only queries where you can, and cap agent loops in automation contexts.

Version skew: If server v2 renames a tool while the host still holds cached v1 schemas, calls to the old name fail. Pin server versions in mcp.json and treat breaking schema changes like API migrations.
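
A cheap way to catch version skew before users do is to diff the tool names a host cached against a fresh tools/list. A sketch with a hypothetical diffToolNames helper and made-up tool names:

```typescript
// Compares cached tool names with a fresh tools/list result.
function diffToolNames(cached: string[], fresh: string[]): { removed: string[]; added: string[] } {
  const cachedSet = new Set(cached);
  const freshSet = new Set(fresh);
  return {
    removed: cached.filter((toolName) => !freshSet.has(toolName)),
    added: fresh.filter((toolName) => !cachedSet.has(toolName)),
  };
}

// A rename shows up as one removal plus one addition.
const skew = diffToolNames(["search_issues", "read_file"], ["search_github_issues", "read_file"]);
console.log(skew);
```

Anything in removed is a breaking change for prompts and automations that mention the old name.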

Security model: treat the LLM as an unpredictable client

I assume the model will eventually call the most dangerous tool with the worst arguments. Design for that.

  • Least privilege: Read-only DB roles, GitHub PATs scoped to one repo, filesystem servers chrooted to project root
  • Human-in-the-loop: Cursor and Claude can prompt before destructive tools run. Keep that on for writes
  • No secrets in resources: Never expose .env or kubeconfig as readable resources
  • Audit: Log tool_name, args, caller, duration, result status per invocation
  • Network egress: Remote MCP servers should not become open proxies to your VPC

MCP is not a security product. It is an integration standard. Your threat model still treats tool execution like microservice RPC from an untrusted planner.
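
For the audit bullet, a wrapper around each tool handler is usually enough to start. This sketch is synchronous for brevity, withAudit is a hypothetical helper, and the caller id is a placeholder. In practice you would also redact sensitive arguments before logging.

```typescript
type AuditRecord = {
  tool_name: string;
  args: unknown;
  caller: string;
  duration_ms: number;
  result_status: "ok" | "error";
};

// Wraps a tool handler so every invocation emits exactly one audit record.
// `sink` and `now` are injected so the sketch stays testable.
function withAudit<A, R>(
  toolName: string,
  caller: string,
  handler: (args: A) => R,
  sink: (record: AuditRecord) => void,
  now: () => number = Date.now,
): (args: A) => R {
  return (args: A): R => {
    const started = now();
    const emit = (result_status: "ok" | "error"): void =>
      sink({ tool_name: toolName, args, caller, duration_ms: now() - started, result_status });
    try {
      const result = handler(args);
      emit("ok");
      return result;
    } catch (err) {
      emit("error");
      throw err; // let the caller turn this into a protocol-shaped error
    }
  };
}

const auditTrail: AuditRecord[] = [];
const auditedTail = withAudit(
  "tail_service_log",
  "cursor:dev-laptop", // hypothetical caller id
  (args: { lines: number }) => `(last ${args.lines} lines...)`,
  (record) => auditTrail.push(record),
);
auditedTail({ lines: 20 });
console.error(JSON.stringify(auditTrail[0])); // audit output goes to stderr or a log pipeline
```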

How to implement your first MCP server

Start small. One workflow, one server, one host.

  1. Pick a pain point: "Explain this table" or "find usages of UserSession"
  2. Clone a reference server: @modelcontextprotocol/server-filesystem or the Python SDK template
  3. Define one tool and one resource: resist the urge to wrap your entire API on day one
  4. Connect from Cursor or Claude Desktop: verify tools/list in the host UI
  5. Run golden prompts: expected tool sequence, expected citations

Minimal TypeScript server sketch (illustrative, not production-hardened):

import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { CallToolRequestSchema, ListToolsRequestSchema } from "@modelcontextprotocol/sdk/types.js";

// Declare the server identity and advertise that it supports tools.
const server = new Server({ name: "staging-logs", version: "0.1.0" }, { capabilities: { tools: {} } });

// tools/list: describe every tool the host may offer to the model.
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [{
    name: "tail_service_log",
    description: "Return last N lines from staging api-service log",
    inputSchema: {
      type: "object",
      properties: { lines: { type: "integer", minimum: 1, maximum: 200 } },
      required: ["lines"],
    },
  }],
}));

// tools/call: dispatch by tool name and return content blocks.
server.setRequestHandler(CallToolRequestSchema, async (req) => {
  if (req.params.name !== "tail_service_log") throw new Error("Unknown tool");
  // Re-check the bounds server-side; do not rely on the advertised schema alone.
  const requested = Number(req.params.arguments?.lines ?? 50);
  const lines = Number.isInteger(requested) ? Math.min(Math.max(requested, 1), 200) : 50;
  // fetch logs from your internal endpoint here
  return { content: [{ type: "text", text: `(last ${lines} lines...)` }] };
});

// Speak JSON-RPC over stdin/stdout; send any diagnostics to stderr only.
const transport = new StdioServerTransport();
await server.connect(transport);

Wire it in the host config, restart, confirm the tool appears, then iterate.
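
For orientation, this is the general shape of the mcpServers entry that hosts such as Cursor and Claude Desktop read, as I understand it. I have written it as a TypeScript object that prints the JSON. The build path is a placeholder, and where the file lives depends on the host.

```typescript
// One stdio server entry: the host spawns `command` with `args`.
type StdioServerEntry = { command: string; args: string[]; env?: Record<string, string> };

const hostConfig: { mcpServers: Record<string, StdioServerEntry> } = {
  mcpServers: {
    "staging-logs": {
      command: "node",
      args: ["/absolute/path/to/staging-logs/build/index.js"], // placeholder path
    },
  },
};

// Paste the printed JSON into the host's MCP config file.
console.log(JSON.stringify(hostConfig, null, 2));
```

Use an absolute path: the host spawns the process from its own working directory, not from your project folder.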

For Cursor-specific steps and mcp.json examples, read How to use MCP with Cursor and Claude.

Where MCP fits in your career stack

Interviewers and teams increasingly ask: "How do you connect AI to our systems without rewriting integrations per vendor?" MCP is a credible answer because it names real primitives (tools, resources, prompts) and real transports (stdio, HTTP+SSE).

You do not need to become an "AI engineer" overnight. You need to know how to wrap domain expertise as well-scoped tools and to enforce approvals on side effects. That is software engineering with a new RPC layer.

FAQ

Is MCP only for Anthropic and Claude?

No. MCP is an open protocol. Cursor, Claude Desktop, and community agent frameworks implement MCP clients. Model providers still differ in reasoning quality, but your server porting story is cleaner than bespoke plugins per host.

Do I still need RAG if I use MCP?

Often yes, for different jobs. Use MCP resources and tools when you need authoritative, structured, or executable access. Use RAG when you need semantic search over large, unstructured corpora. Many production setups use both.

How is MCP different from OpenAI function calling?

Function calling is a model API feature. MCP is a host-to-server integration protocol with discovery, resources, and prompts. Hosts map MCP tools into whatever the model API expects. You can think of MCP as the integration layer one step below the model.

Can MCP servers perform writes in production?

They can, but you should gate writes behind human approval, idempotent design, and tight scopes. I keep first iterations read-only, then add write tools with explicit confirmation UI and audit logs.


Written by Rohit Singh, software developer in Jaipur. More on the blog. I also build Study Stream Black, an offline desktop app for course libraries and timestamped notes.