Agents need procedural knowledge that no model possesses: how to deploy your specific system, how to debug your specific infrastructure, how to follow your team's specific conventions.
The common solution is to cram everything into the system prompt. This fails at scale: a 10,000-token system prompt wastes context on procedures irrelevant to the current task, and maintaining a monolithic prompt becomes increasingly painful.
Package procedural knowledge into modular Skills: self-contained SKILL.md files loaded on demand when a task matches the skill's description.
```
skills/
├── deploy-staging/
│   ├── SKILL.md
│   └── scripts/deploy.sh
├── fix-log-permissions/
│   └── SKILL.md
├── debug-docker/
│   └── SKILL.md
└── pr-review/
    └── SKILL.md
```
When an agent encounters a task, it checks available skill descriptions and loads the relevant one. After the task, the skill context is discarded.
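This load-use-discard loop can be sketched as follows. Everything here is hypothetical naming, not any framework's API: `handle_task`, the `skills` mapping, and the `agent` callable are stand-ins, and matching is reduced to a naive trigger-word check:

```python
def handle_task(task: str, skills: dict[str, dict], agent) -> str:
    """Sketch of the skill lifecycle: match -> load -> act -> discard."""
    context: list[str] = []
    # Match the task against each skill's trigger words (simplest possible check).
    match = next(
        (name for name, skill in skills.items()
         if any(word in task.lower() for word in skill["triggers"])),
        None,
    )
    if match:
        context.append(skills[match]["body"])   # load the SKILL.md on demand
    result = agent(task, context)               # the agent acts with it in context
    context.clear()                             # discard the skill after the task
    return result

# Example: a stand-in "agent" that just reports how many skills it saw.
skills = {
    "fix-log-permissions": {
        "triggers": ["permissions", "www-data", "storage/logs"],
        "body": "Fix file permission errors after deployment...",
    },
}
echo_agent = lambda task, ctx: f"{len(ctx)} skill(s) in context"
print(handle_task("fix permissions error on staging", skills, echo_agent))
# → 1 skill(s) in context
```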
Each skill has YAML frontmatter (name, description) and Markdown instructions:
```yaml
---
name: fix-log-permissions
description: >
  Fix file permission errors after deployment.
  Triggers on: permissions, www-data, storage/logs.
---
```

The description is the matching surface. Write it as "use when..." with specific trigger words.
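A minimal parser for this frontmatter might look like the following. It is hand-rolled to avoid a YAML dependency, and `parse_frontmatter` is a hypothetical helper that only handles simple `key: value` pairs plus indented continuation lines:

```python
def parse_frontmatter(text: str) -> dict[str, str]:
    """Extract key/value pairs from a SKILL.md's ----delimited frontmatter.

    Assumes simple `key: value` lines; indented lines (e.g. under a folded
    `>` block) are treated as continuations of the previous key.
    """
    _, raw, _ = text.split("---", 2)
    meta: dict[str, str] = {}
    key = None
    for line in raw.splitlines():
        if not line.strip():
            continue
        if not line.startswith((" ", "\t")) and ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            meta[key] = value.strip().lstrip(">").strip()
        elif key is not None:
            meta[key] = (meta[key] + " " + line.strip()).strip()
    return meta

skill_md = """---
name: fix-log-permissions
description: >
  Fix file permission errors after deployment.
  Triggers on: permissions, www-data, storage/logs.
---
Instructions follow here.
"""
meta = parse_frontmatter(skill_md)
print(meta["name"])  # → fix-log-permissions
```

A real loader would likely use a proper YAML library instead; the point is that the frontmatter is cheap to scan, so an orchestrator can read every skill's description without loading any skill's body.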
One skill = one concern. If a skill covers deployment AND monitoring, split it. Target under 2,000 tokens per SKILL.md.
```
Run `scripts/deploy.sh staging` to deploy.
```

Scripts encode the exact commands. The SKILL.md provides context and decision logic.
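Sketched in Python, the split of responsibilities might look like this; `run_skill_script` is a hypothetical helper (most agent frameworks ship their own shell tool), and the usage line substitutes `echo` for a real script:

```python
import subprocess

def run_skill_script(*command: str) -> str:
    """Run a skill's helper script and return its stdout.

    The SKILL.md decides *whether* and *with which arguments* to run it;
    the script itself encodes the exact commands.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()

# Usage (stand-in command; a real skill would run e.g. scripts/deploy.sh):
print(run_skill_script("echo", "deploying to staging"))  # → deploying to staging
```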
When a problem appears three times, create a skill for it:
- First time: solve it manually, log the fix
- Second time: recognize the pattern, refine the approach
- Third time: codify it as a skill
This prevents premature abstraction while ensuring recurring work gets automated.
The agent (or orchestrator) needs a mechanism to match tasks to skills. Options:
- Description matching: Compare task intent to skill descriptions (semantic similarity)
- Keyword triggers: Explicit trigger words in descriptions
- Framework-native: Most agent frameworks have tool/skill routing built in
Skills come with their own failure modes:
- Maintenance: Skills need updating when procedures change. Outdated skills are worse than no skills.
- Discovery failures: If the task doesn't match any skill description, the agent falls back to general knowledge. Good descriptions reduce false negatives.
- Fragmentation: Too many tiny skills increase discovery overhead. Too few large skills lose the benefits.
Skills aren't always worth the overhead. Skip them when:
- Tasks are always novel (research, creative work). Skills shine for repeatable procedures.
- Your agent operates in a single, well-defined domain. A focused system prompt may suffice.
- You have fewer than 3-4 distinct procedures. The overhead of skills isn't justified.
Related:
- Skill-Based Recovery — skills specifically for fixing known issues
- Identity as Architecture — identity shapes how skills are applied
- See also: AgentSkills Spec for the full format specification