Skills, Plugins, and MCP: The Three Layers of Claude Code Extensibility

Table of Contents

Claude Code has three extension layers, and on a casual reading they look interchangeable. Skills are markdown. Plugins are bundles. MCP servers are processes. All three “give the agent more capabilities.” The marketing material treats them as a continuum.

They are not a continuum. They live at different abstraction levels, run at different times in the agent loop, and have completely different operational profiles. Picking the wrong one means either ignored capability (the agent never invokes your thing) or a bloated context window (the agent invokes it every turn). This is the reference for picking right.

The thirty-second model
#

Skills are instructions for the agent. Markdown files that teach Claude Code how to do a specific task — when to do it, what tools to use, what to avoid.
Plugins are distributable bundles. A directory containing skills, MCP servers, slash commands, and metadata, installable via a URL.
MCP servers are external tool providers. Long-running processes that expose actual functions (read this DB, deploy that service, query Linear) the agent can call.

If you remember nothing else: skills tell the agent how to think, MCP servers give the agent new things to do, plugins are how you ship either to other people.

Skills: the instruction layer
#

A skill is a markdown file. That’s it. Place it in ~/.claude/skills/<name>/SKILL.md (user-level) or .claude/skills/<name>/SKILL.md (project-level), with optional supporting files in the directory, and Claude Code reads it during session bootstrap.

The structure that works:

---
name: deploy-staging
description: Deploy the current branch to the staging environment. Triggered by "deploy to staging" or "push to staging".
---

# Deploy to staging

When the user asks to deploy to staging:

1. Verify all tests pass with `npm test`
2. Build the production artifact: `npm run build`
3. Run `bin/deploy --env staging`
4. Tail logs for 60 seconds to confirm healthy startup

If the deploy fails, the rollback is automatic — do not retry without the user's explicit confirmation.

Never deploy `main` directly without a tagged release.

What makes a skill effective is what makes any spec effective: it tells the agent what to do, when to do it, and what success and failure look like. The description field is the trigger — Claude Code surfaces a skill when the user’s request matches its description, so the description has to be specific enough that the agent doesn’t activate it on every adjacent topic.

When to reach for a skill: any task that needs more than three sentences to explain, any workflow that has a fixed sequence, any operation where “do this exact thing in this exact order” matters more than “figure it out.” Deploys, releases, schema migrations, debugging playbooks, code review checklists, “how we structure a new microservice in this org.”

When skills fail silently: when the description is too generic (“helps with deployments” — gets activated on every deployment-adjacent question, eats context). When the instructions are too prescriptive (“run command X exactly” — but the codebase has moved on). When the skill duplicates information already in CLAUDE.md and the two contradict.

The scaling skills across an engineering org deep dive covers the team-level pattern: a shared skill marketplace where senior engineers package their playbooks once and the rest of the org consumes them.

Plugins: the distribution layer
#

A plugin is a packaged bundle of skills, MCP servers, slash commands, and metadata. Install with /plugin install <git-url-or-marketplace-name>. Plugins solve the “how do I share extensions across a team or the world” problem.

The directory structure is simple:

my-plugin/
├── plugin.json          # name, version, description, dependencies
├── skills/
│   ├── deploy-staging/SKILL.md
│   └── code-review/SKILL.md
├── mcp/
│   └── linear-server.json   # MCP server configuration
└── commands/
    └── status.md            # custom slash command

The plugin.json declares which skills/MCP servers/commands ship together, what version of Claude Code is required, and any external dependencies (npm packages, Python wheels, OS binaries) that need to be present before activation. Anthropic’s plugin marketplace publishes a few first-party plugins; private repos work too — /plugin install https://github.com/your-org/plugin-name.

When to reach for a plugin: when more than one person needs the same extension, when the extension has more than one moving part (a skill and an MCP server, for instance), when versioning matters (skill v2 changes the deploy flow, you don’t want stragglers stuck on v1).

When plugins are the wrong tool: a single skill for one developer’s personal workflow. A throwaway experiment. Anything you’d be embarrassed to find still installed in a year.

MCP servers: the capability layer
#

This is where the architecture changes shape. A skill is a string; a plugin is a directory; an MCP server is a process. It runs alongside Claude Code (locally or remotely), speaks the Model Context Protocol, and exposes functions the agent can call: linear.create_issue, s3.list_buckets, db.run_query. The MCP server does the actual work; the agent decides when to call it.

The protocol’s value proposition has played out: 97 million downloads, Microsoft Agent Framework, Salesforce, Google A2A, OpenAI all building on it. MCP let the AI tool ecosystem stop reinventing per-vendor function-calling for every integration.

Three flavors of MCP server:

Local stdio servers — a subprocess Claude Code spawns. Fastest, no network, useful for filesystem-bounded tools. Most Anthropic-built servers (filesystem, GitHub-via-token, sqlite) are stdio.
Remote HTTP/SSE servers — long-running services your team operates. Best for shared resources (databases, deploy systems, internal APIs) where you want central auth, audit, and rate-limiting.
Cloud-hosted servers — the Pinterest blueprint shows what this looks like at scale: a registry of internal MCP servers, two-layer JWT auth, observability, ROI dashboards.

Configure servers in ~/.claude/settings.json or per-project .claude/settings.json:

{
  "mcpServers": {
    "linear": {
      "command": "npx",
      "args": ["-y", "@linear/mcp-server"],
      "env": { "LINEAR_API_KEY": "$LINEAR_API_KEY" }
    },
    "company-db": {
      "url": "https://mcp.internal.company.com/db",
      "auth": { "type": "bearer", "token_env": "COMPANY_MCP_TOKEN" }
    }
  }
}

When to reach for an MCP server: when the agent needs to do something that doesn’t already exist as a CLI or filesystem operation. Read your Linear issues. Query your prod read-replica. Deploy via your custom CI system. Talk to Slack. Anything that crosses an authentication boundary, anything that requires a non-trivial library, anything that’d be unsafe to let the agent do via shell.

When MCP is overkill: anything that already has a CLI. Asking the agent to call gh, kubectl, psql, aws, terraform directly is faster, simpler, and uses the security model you already trust. Don’t build an MCP wrapper around a CLI just to feel professional — Claude Code is good at calling CLIs.

The decision matrix
#

Need	Use
Teach the agent a multi-step workflow	Skill
Give the agent a new external capability	MCP server
Wrap an existing CLI for the agent	Just let it call the CLI
Share a workflow with a team	Plugin (containing skills)
Share a capability with a team	Plugin (containing MCP server)
Quickly trigger a known sequence	Slash command (in plugin or `~/.claude/commands/`)
One-off “always remind the agent of X”	CLAUDE.md (not a skill)

The CLAUDE.md vs skill distinction trips people up. Rule of thumb: CLAUDE.md is for things the agent should always know (project conventions, build commands, file layout). Skills are for things the agent should know when triggered (a specific workflow, a niche operation). Putting a deploy procedure in CLAUDE.md eats context on every turn; putting “this codebase uses kebab-case” in a skill means the agent forgets it half the time.

The failure mode nobody warns you about
#

The seductive failure: stuffing everything into CLAUDE.md and skills until the agent’s context window is mostly preamble. CLAUDE.md is loaded on every session. Skills load when their description matches. If you have 40 skills with vague descriptions, half of them activate on any given session, and your effective context shrinks by thousands of tokens before you’ve said hello.

The check: every skill description should be a sentence the agent could plausibly answer “yes, that matches the user’s request” or “no, it doesn’t” with confidence. If you read your skill descriptions and they all sound the same, they will all activate the same.

The other failure mode: MCP servers without security boundaries. An MCP server that exposes db.run_query with a connection string to your prod read-write database is a foot-gun. Use read-replicas. Use scoped credentials. Use the Lucidworks pattern of one server per data domain with its own auth, not a single mega-server with admin access to everything.

Going further
#

The MCP topic hub — protocol roadmap, governance, production case studies
The CLAUDE.md guide — what to put in CLAUDE.md, what to leave out, the security pattern
Pinterest’s production MCP architecture — the registry, JWT auth, ROI numbers
Scaling Claude Code Skills — the team-level pattern
The CLAUDE.md trap — CVE-2026-21852, why provenance matters

The three layers are a feature, not a confusion. Skills give the agent judgment, MCP gives it capability, plugins give you distribution. Pick the smallest layer that does the job, and stop reaching for a heavier one until the lighter one runs out.

The thirty-second model#

Skills: the instruction layer#

Plugins: the distribution layer#

MCP servers: the capability layer#

The decision matrix#

The failure mode nobody warns you about#

Going further#

Related