For the past two years, the dominant paradigm in AI-assisted development has been Context Engineering: the art of giving agents access to the right knowledge. RAG and its variants, agentic search, and the rest were all answers to the same one-dimensional question: how do we get more relevant information into the context window?
This was important work. Claude Code and similar tools have now solved it convincingly. You can point an agent at a million-line codebase and it will find what it needs. Context is no longer the bottleneck.
But here's what I find fascinating: Context Engineering only solves the "what to know" problem. It's static. The agent gets knowledge, completes a task, and that's it. The knowledge doesn't improve. The agent doesn't evolve.
The higher-dimensional problem is different: how does an agent learn how to work—and get better at it over time, eventually surpassing human experts?
In my worldview, the next paradigm shift is from static knowledge retrieval to dynamic capability evolution. I'm calling this Compound Engineering.
Context Engineering asks: "What information does the agent need for this task?"
Compound Engineering asks: "How does the agent improve itself through every task it completes?"
The core insight: every task contains learnable structure. A code review isn't just a code review—it's a pattern that will repeat. A data migration isn't unique—it follows templates. The question is whether that structure gets captured, refined, and reused—or evaporates after each session.
Skills are the mechanism for this capture. The point isn't just efficiency—it's capability and liberation. Once a task becomes a Skill, humans don't need to do it anymore. The agent handles it autonomously. You move on to higher-level work. The organization's output multiplies while human involvement in routine tasks approaches zero.
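To make this concrete, here's a minimal sketch of what a captured Skill can look like. Claude Code Skills are markdown files with YAML frontmatter; the skill name, trigger, and checklist below are invented for illustration, not taken from a real project:

```markdown
---
name: review-db-migrations
description: Review database migration PRs for failure patterns we have hit
  before. Use when a PR touches files under migrations/.
---

# Reviewing database migrations

1. Check that every new column on a large table is nullable or has a default;
   otherwise flag a locking risk.
2. Verify a matching rollback migration exists.
3. Confirm the migration was tested against a production-sized snapshot.

## Learned exceptions
- Adding an index concurrently is safe; do not flag it.
  (Captured from earlier review feedback.)
```

Note the "Learned exceptions" section: that's where the compounding happens. Each time the Skill gets something wrong, the correction goes into the file instead of evaporating.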
The key difference isn't just "prompts vs Skills." It's the dimension of the problem.
Context Engineering operates in one dimension: more context → better single-task performance. It's a retrieval optimization problem.
Compound Engineering operates in a higher dimension: each task → improved capability → better future performance → eventually surpassing human expert baselines. It's an evolution problem.
What makes Skills work for this evolution: they're auditable, versionable, and composable. You can see what went wrong and fix it. Two Skills can combine to create workflows neither could achieve alone. Ten Skills don't give you 10x capability; because they chain and combine, ten Skills yield on the order of a hundred distinct workflows, closer to 100x. And critically, they can be tested against outcomes and iteratively improved.
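Here's a toy sketch of those three properties, under the assumption that a Skill can be modeled as a named, versioned transformation. This is my illustration of the idea, not how Claude Code actually implements Skills:

```python
# Hypothetical harness: treat each Skill as a function from task input to
# output, test it against recorded outcomes, and compose two Skills.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Skill:
    name: str
    version: str               # versionable: bump when the Skill is refined
    run: Callable[[str], str]  # auditable: the transformation is inspectable

def evaluate(skill: Skill, cases: list[tuple[str, str]]) -> float:
    """Score a Skill against recorded (input, expected) outcomes."""
    hits = sum(skill.run(x) == want for x, want in cases)
    return hits / len(cases)

def compose(a: Skill, b: Skill) -> Skill:
    """Composable: two Skills chain into a workflow neither does alone."""
    return Skill(f"{a.name}+{b.name}", f"{a.version}/{b.version}",
                 lambda x: b.run(a.run(x)))

# Toy Skills standing in for real agent behaviors.
extract = Skill("extract-todo", "1.0", lambda t: t.split("TODO:")[-1].strip())
shout   = Skill("uppercase",    "1.0", str.upper)

pipeline = compose(extract, shout)
print(pipeline.run("notes... TODO: add rollback test"))  # ADD ROLLBACK TEST
print(evaluate(pipeline, [("TODO: fix", "FIX")]))        # 1.0
```

The `evaluate` function is the part that matters: once outcomes are recorded, a Skill's quality becomes a number you can track across versions and improve against.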
I've seen several convincing demonstrations this year.
Dan Shipper's team at Every documented their transition in "Compound Engineering: How Every Codes With Agents". Their claim: one developer now produces what previously required five—not because AI writes faster code, but because every error, every optimization, every pattern gets captured and reused. The compounding effect becomes visible over months.
Nikunj Kothari ran a simple experiment: he asked Claude to analyze his chat history and suggest Skills. The model identified 12 repeating task patterns he hadn't consciously noticed. This is the discovery mechanism—AI finding the hidden structure in your own work.
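The experiment is easy to approximate with the Anthropic SDK. This is a hedged sketch of the idea; the file path, prompt wording, and truncation limit are mine, not his:

```python
# Feed an exported chat history to Claude and ask it to propose Skills.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

with open("chat_history_export.txt") as f:
    history = f.read()[:150_000]  # crude truncation to fit the context window

response = client.messages.create(
    model="claude-sonnet-4-5",  # any recent Claude model works here
    max_tokens=2000,
    messages=[{
        "role": "user",
        "content": (
            "Below is my chat history with you over the past months. "
            "Identify task patterns I repeat without noticing, and for each "
            "one propose a Skill: a name, when to trigger it, and "
            "step-by-step instructions.\n\n" + history
        ),
    }],
)
print(response.content[0].text)
```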
Boris Cherny's team implemented automatic knowledge capture during code review. Every PR that gets reviewed adds to a persistent knowledge base. The institutional memory doesn't walk out the door when people leave.
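I don't know how their pipeline is built, but the basic capture step is simple enough to sketch. Everything below, from the knowledge-file path to the PR number, is hypothetical:

```python
# Hypothetical capture step: after a PR review, append the review comments to
# a knowledge file that lives in the repo. A real system would add an LLM pass
# to distill comments into generalized rules before writing them down.
import json
import subprocess
from datetime import date
from pathlib import Path

KNOWLEDGE_FILE = Path("docs/review-knowledge.md")

def capture_review(pr_number: int) -> None:
    # Fetch the PR's reviews with the GitHub CLI.
    raw = subprocess.run(
        ["gh", "pr", "view", str(pr_number), "--json", "reviews"],
        capture_output=True, text=True, check=True,
    ).stdout
    reviews = json.loads(raw)["reviews"]

    # Append each substantive review body as a dated entry.
    with KNOWLEDGE_FILE.open("a") as f:
        for review in reviews:
            body = review.get("body", "").strip()
            if body:
                f.write(f"- {date.today()} PR #{pr_number}: {body}\n")

capture_review(1234)  # e.g., run from a merge hook or CI job
```

Because the knowledge file is versioned alongside the code, it survives team turnover by construction.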