# The Model Context Protocol (MCP) Explained: Architecture and Internals

Before MCP, connecting an AI assistant to your tools was an N×M problem and everyone felt it. Every assistant (Claude, an IDE, some internal agent) needed a bespoke integration with every system it should reach (GitHub, Postgres, Jira, your internal API), and each integration was a one-off — different auth, different shapes, all of it reinvented per pair. Build five assistants that each talk to ten systems and you've written fifty integrations that don't compose. The **Model Context Protocol** is the boring, correct fix: turn N×M into N+M by standardizing the interface in the middle, the way USB-C did for chargers.

MCP is an **open protocol that standardizes how applications provide context and tools to large language models.** Introduced by Anthropic in late 2024, it spread through 2025 because the problem was universal and the design is deliberately simple: it borrows the proven shape of the Language Server Protocol, which did the same thing for editors and language tooling. I'll walk through the architecture, the three primitives, the transports, a real tool-call flow, and, most importantly, the security questions it forces you to answer.

## The architecture: host, client, server

MCP has three roles, and keeping them straight is most of the battle:

- **Host** — the AI application the user interacts with (a desktop assistant, an IDE, an agent runtime). It contains the LLM and decides what to connect to.

- **Client** — a connector living inside the host, one per server, that speaks the MCP protocol. The host spins up a client for each server it wants to use; the client maintains a 1:1 session with that server.

- **Server** — a separate program that exposes a specific system's capabilities (a GitHub server, a filesystem server, a database server) in MCP's standard vocabulary. Servers are the reusable, shareable unit — write one, and any MCP host can use it.

```mermaid
graph TD
    subgraph HOST["Host application (contains the LLM)"]
        LLM["LLM"]
        CL1["MCP client A"]
        CL2["MCP client B"]
    end
    S1["MCP server: GitHub<br/>(tools, resources, prompts)"]
    S2["MCP server: Postgres<br/>(tools, resources, prompts)"]
    EXT1["GitHub API"]
    EXT2["Database"]
    LLM --- CL1
    LLM --- CL2
    CL1 <-->|"MCP / JSON-RPC"| S1
    CL2 <-->|"MCP / JSON-RPC"| S2
    S1 --> EXT1
    S2 --> EXT2
```

MCP's three roles. The host holds the model and runs one client per server, each client keeping a 1:1 session with its server over JSON-RPC. Servers wrap real systems and expose them in MCP's standard primitives. Because the interface is standardized, a server written once works with any host — that's the N×M-to-N+M collapse.

## The three primitives: tools, resources, prompts

A server exposes its capabilities as three kinds of things, and the distinction between them is genuinely important — it encodes *who is in control* of each interaction.

| Primitive | What it is | Controlled by | Example |
| --- | --- | --- | --- |
| **Tools** | Functions the model can invoke to take action or fetch data | Model (decides when to call) | `create_issue`, `run_query` |
| **Resources** | Read-only data the host can load into context | Application | A file's contents, a record, a schema |
| **Prompts** | Reusable, parameterized prompt templates / workflows | User (invokes deliberately) | A "/review-pr" template |

**Tools** are model-controlled: the model sees their descriptions and chooses to call them — this is the action surface, and the one with teeth. **Resources** are application-controlled context the host pulls in (a file, a row, an API response) without the model deciding to "act." **Prompts** are user-controlled templates surfaced as slash-commands or menu items the user picks on purpose. The clean way to hold it: tools are for the model to *do*, resources are for the app to *show*, prompts are for the user to *start*. Mixing them up — exposing a destructive action as a casual "resource," say — is how MCP servers get designed badly.
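One way to keep the control distinction honest is to write it down as data and lint a server's catalog against it. The sketch below is illustrative only: `Capability`, `design_problems`, the `has_side_effects` flag, and the `schema://analytics/orders` URI are hypothetical names invented for this example, not part of MCP or any SDK.

```python
from dataclasses import dataclass
from enum import Enum


class Controller(Enum):
    MODEL = "model"              # tools: the model decides when to call
    APPLICATION = "application"  # resources: the host app decides what to load
    USER = "user"                # prompts: the user invokes deliberately


@dataclass(frozen=True)
class Capability:
    kind: str                    # "tool", "resource", or "prompt"
    name: str
    has_side_effects: bool = False  # hypothetical flag for this sketch


# Who is in control of each primitive, mirroring the table above.
CONTROLLED_BY = {
    "tool": Controller.MODEL,
    "resource": Controller.APPLICATION,
    "prompt": Controller.USER,
}


def design_problems(caps):
    """Return names of capabilities that act on the world but are not tools."""
    return [c.name for c in caps if c.has_side_effects and c.kind != "tool"]


catalog = [
    Capability("tool", "create_issue", has_side_effects=True),
    Capability("resource", "schema://analytics/orders"),
    Capability("prompt", "review-pr"),
    Capability("resource", "drop_table", has_side_effects=True),  # misclassified
]

for cap in catalog:
    print(f"{cap.kind:<9} {cap.name:<28} controlled by {CONTROLLED_BY[cap.kind].value}")
print("misclassified:", design_problems(catalog))
```

The check is deliberately crude. Its only job is to flag anything destructive that has been filed under a primitive the model does not explicitly call.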

## Transports: JSON-RPC over stdio or HTTP

Underneath, MCP is **JSON-RPC 2.0** — a simple request/response (and notification) message format. Client and server exchange JSON-RPC messages to negotiate capabilities, list tools/resources/prompts, and invoke them. There are two main transports:

- **stdio** — the host launches the server as a local subprocess and talks to it over standard input/output. Simple, fast, no network; ideal for local tools like a filesystem or a local database. This is the most common way local servers run.

- **HTTP-based transport** — for remote servers reached over the network (with Server-Sent Events for streaming server-to-client messages). This is what lets a hosted, multi-user MCP server exist, and it's where authentication and authorization (OAuth-style flows) become essential.
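To make the stdio transport concrete, here is a minimal sketch of the framing as I understand it: each JSON-RPC message travels as one line of JSON terminated by a newline. The helper names `write_message` and `read_messages` are made up for this example, and an in-memory `io.StringIO` stands in for the real subprocess pipe.

```python
import io
import json


def write_message(stream, message):
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the line contains no raw "\n".
    stream.write(json.dumps(message, separators=(",", ":")) + "\n")
    stream.flush()


def read_messages(stream):
    """Yield JSON-RPC messages from a text stream, one per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# In-memory stand-in for the pipe between the host and the server subprocess.
pipe = io.StringIO()
write_message(pipe, {"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
write_message(pipe, {"jsonrpc": "2.0", "id": 2, "method": "tools/call",
                     "params": {"name": "run_query",
                                "arguments": {"sql": "SELECT 1\nFROM orders"}}})
pipe.seek(0)

received = list(read_messages(pipe))
print([m["method"] for m in received])
```

In a real host the same two helpers would wrap the subprocess's stdin and stdout, for example the pipes of a `subprocess.Popen` started with `text=True`, rather than a `StringIO`.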

The session begins with a capability handshake, the `initialize` exchange: the client and server each declare what they support, so the protocol can evolve and features degrade gracefully. After that, the client can send `tools/list` to discover what's available and `tools/call` to invoke a tool. The shape is intentionally familiar to anyone who's implemented an LSP server.

```jsonc
// Client discovers tools
{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }

// Server responds with tool definitions (name, description, JSON-Schema inputs)
{ "jsonrpc": "2.0", "id": 1, "result": { "tools": [
  { "name": "run_query",
    "description": "Run a read-only SQL query against the analytics DB",
    "inputSchema": { "type": "object",
      "properties": { "sql": { "type": "string" } }, "required": ["sql"] } }
] } }

// Later: the host has its client ask the server to actually invoke it
{ "jsonrpc": "2.0", "id": 2, "method": "tools/call",
  "params": { "name": "run_query", "arguments": { "sql": "SELECT count(*) FROM orders" } } }
```
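The server side of that exchange can be sketched as a small dispatcher. This is a toy under stated assumptions, not a real SDK: `handle`, `error`, and the stubbed `run_query` are hypothetical, the `"2024-11-05"` version string and the `analytics-sketch` server name are placeholders, and the result shape (a `content` list plus `isError`) follows my reading of the spec.

```python
import json

PROTOCOL_VERSION = "2024-11-05"  # assumed revision string for this sketch

TOOLS = [{
    "name": "run_query",
    "description": "Run a read-only SQL query against the analytics DB",
    "inputSchema": {
        "type": "object",
        "properties": {"sql": {"type": "string"}},
        "required": ["sql"],
    },
}]


def run_query(sql):
    """Hypothetical stand-in for the real database call."""
    return f"(pretend result of: {sql})"


def error(req_id, code, message):
    """Build a JSON-RPC 2.0 error response."""
    return {"jsonrpc": "2.0", "id": req_id,
            "error": {"code": code, "message": message}}


def handle(request):
    """Map one JSON-RPC request dict to a response dict (None for notifications)."""
    if "id" not in request:
        return None  # notifications get no response
    method, req_id = request.get("method"), request["id"]
    params = request.get("params", {})

    if method == "initialize":
        result = {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {"tools": {}},
            "serverInfo": {"name": "analytics-sketch", "version": "0.1.0"},
        }
    elif method == "tools/list":
        result = {"tools": TOOLS}
    elif method == "tools/call":
        if params.get("name") != "run_query":
            return error(req_id, -32602, "Unknown tool")  # Invalid params
        text = run_query(params.get("arguments", {}).get("sql", ""))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return error(req_id, -32601, f"Method not found: {method}")

    return {"jsonrpc": "2.0", "id": req_id, "result": result}


call = {"jsonrpc": "2.0", "id": 2, "method": "tools/call",
        "params": {"name": "run_query",
                   "arguments": {"sql": "SELECT count(*) FROM orders"}}}
print(json.dumps(handle(call), indent=2))
```

A production server would also validate `arguments` against `inputSchema` and check the protocol version the client asked for. Both are omitted here to keep the control flow visible.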

## How a tool call actually flows

Putting it together end to end, because the control flow surprises people — the server never talks to the model directly:

1. On connect, the client calls `tools/list`; the host gives the model the available tool definitions (names, descriptions, input schemas) as part of its context.

1. The user asks something. The model decides a tool would help and emits a request to call it with arguments.

1. The **host** intercepts that — typically pausing for user approval on anything consequential — then tells the client to issue `tools/call` to the server.

1. The server executes against the real system (hits the GitHub API, runs the SQL) and returns the result over JSON-RPC.

1. The host feeds the result back into the model's context, and the model continues — answering, or calling another tool.

The load-bearing detail: **the model proposes, the host disposes.** The model only ever *requests* a tool call; the host is the gatekeeper that decides whether to actually execute it. That boundary is exactly where human-in-the-loop approval and policy enforcement live, and it's why the host/client split matters rather than being ceremony.
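A minimal sketch of that gate, assuming a host that keeps a set of tool names requiring approval. Everything here is hypothetical scaffolding: `ToolRequest`, `Host`, `needs_approval`, and `fake_server` are illustration names, and the `approve` callback stands in for whatever UI prompt or policy engine a real host would use.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToolRequest:
    """What the model emits: a proposal, not an execution."""
    name: str
    arguments: dict


@dataclass
class Host:
    call_tool: Callable[[str, dict], dict]   # stands in for the client's tools/call
    approve: Callable[[ToolRequest], bool]   # stands in for a user prompt or policy
    needs_approval: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)

    def dispatch(self, request):
        """Decide whether a proposed call runs; only approved calls reach the server."""
        if request.name in self.needs_approval and not self.approve(request):
            self.audit_log.append(("denied", request.name))
            return {"isError": True,
                    "content": [{"type": "text", "text": "Denied by host policy"}]}
        self.audit_log.append(("executed", request.name))
        return self.call_tool(request.name, request.arguments)


def fake_server(name, arguments):
    """Pretend server: always succeeds."""
    return {"isError": False, "content": [{"type": "text", "text": f"{name} ok"}]}


host = Host(call_tool=fake_server,
            approve=lambda req: False,          # this "user" declines everything
            needs_approval={"create_issue"})

read = host.dispatch(ToolRequest("run_query", {"sql": "SELECT 1"}))
write = host.dispatch(ToolRequest("create_issue", {"title": "Bug"}))
print(read["isError"], write["isError"])
print(host.audit_log)
```

The denied call never reaches `call_tool`, and the model still gets a result it can react to. Returning a denial as a tool result rather than raising is one reasonable design choice, not the only one.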

**MCP standardizes the plumbing, not the trust, and that's the part teams skip.** An MCP server is code you're letting an AI drive against real systems, and the protocol itself doesn't make that safe. Three things bite:

- *Prompt injection through tool results:* content a server returns (a web page, an email, a file) can carry instructions the model then follows. A tool result is untrusted data, not a command.

- *Over-broad tools:* a server with a "run any SQL" or "execute shell" tool hands the model the keys; scope tools tightly and prefer read-only.

- *Authentication on remote servers:* an HTTP MCP server exposing your systems needs real authn/authz, not an open port.

Treat every server you connect like a third-party integration with production access, because that's what it is. Vet it, sandbox it, and keep consequential actions behind explicit approval.
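As a small illustration of scoping a tool tightly, a `run_query`-style tool could refuse anything that does not look like a single read-only statement before it reaches the database. The function `check_read_only` and its keyword list are assumptions made up for this sketch. It is a naive, conservative filter, not a SQL parser, and it is no substitute for connecting with read-only database credentials.

```python
import re

_READ_ONLY = re.compile(r"^\s*(select|with)\b", re.IGNORECASE)
_FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|revoke)\b",
    re.IGNORECASE,
)


def check_read_only(sql):
    """Return None if the query looks read-only, else a reason string."""
    body = sql.strip().rstrip(";")
    if ";" in body:
        return "multiple statements are not allowed"
    if not _READ_ONLY.match(body):
        return "only SELECT/WITH queries are allowed"
    hit = _FORBIDDEN.search(body)
    if hit:
        # Catches writes hidden inside a CTE, e.g. WITH x AS (DELETE ...).
        return f"forbidden keyword: {hit.group(1).lower()}"
    return None


candidates = [
    "SELECT count(*) FROM orders",
    "SELECT 1; DROP TABLE orders",
    "DROP TABLE orders",
    "WITH x AS (DELETE FROM orders RETURNING *) SELECT * FROM x",
]
for sql in candidates:
    print(f"{sql!r}: {check_read_only(sql) or 'allowed'}")
```

The filter errs toward rejection: a harmless query with the word `delete` inside a string literal would be refused. That trade-off is acceptable for a defense-in-depth layer whose real backstop is the database role.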

This is the same governance problem that [designing auditable multi-agent systems](regulated-ai-multi-agent-design) grapples with — a standardized tool interface makes integration easy, which makes *controlling* what agents can do the actual hard part. MCP gives you the seam to enforce policy at (the host's approval gate); it doesn't enforce it for you.

## What to carry away

The Model Context Protocol is an open standard that turns the N×M mess of AI-to-tool integrations into N+M by fixing the interface in the middle. Its architecture is **host** (the app with the model), **client** (one per server, inside the host), and **server** (a reusable wrapper around a real system). Servers expose three primitives that encode who's in control — **tools** the model calls, **resources** the app loads, **prompts** the user invokes. It all rides **JSON-RPC** over stdio (local subprocess) or HTTP (remote), starting from a capability handshake.

The flow to remember is that the model only ever *proposes* a tool call and the host decides whether to run it, which is precisely where approval and policy belong. MCP makes connecting capabilities trivial; it deliberately leaves trust to you, so the security work (prompt-injection-aware handling of tool results, tightly scoped tools, real auth on remote servers) is the part that actually determines whether your setup is safe. Write a server once and every host can use it. That reusability is why, so soon after its late-2024 launch, MCP servers are everywhere.