Cognitive Orchestration

Free preview · Lesson 4

Context Engineering

As soon as systems run over many turns with tools and retrieved data, the prompt is no longer one message — it is a crowded, finite context window. Context engineering is the discipline of curating that window: choosing the right tokens to have present at each step, and managing everything else.

What you'll take away
  • Treat the context window as finite working memory using Karpathy's CPU/RAM analogy.
  • Use the four moves — select, compress, order, evict — to curate what the model sees.

What competes for space in the window: system prompt, tools, retrieved data, history, and the live request.

Anthropic frames context engineering as the natural progression of prompt engineering. Prompt engineering writes a good message; context engineering manages the entire context state (system instructions, tools, MCP connections, external data, and message history) across a long-running task. Andrej Karpathy's analogy makes the constraint concrete: the model is the CPU, and its context window is the RAM, a limited working memory you must load deliberately.

The CPU/RAM analogy: the discipline is to select, compress, order, and evict so the right information is resident at the right step.
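
To make the "finite RAM" framing concrete, here is a minimal sketch of a budget check across the five occupants of the window. The window size, the `ContextBudget` class, and the word-count tokenizer are all assumptions made for illustration; a real system would use the model's own tokenizer and limit.

```python
from dataclasses import dataclass

# Hypothetical window size, chosen only for illustration.
WINDOW_TOKENS = 8_000


def rough_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: count whitespace-separated words."""
    return len(text.split())


@dataclass
class ContextBudget:
    """The five things that compete for space in the window."""
    system_prompt: str
    tools: str
    retrieved: str
    history: str
    request: str

    def usage(self) -> dict[str, int]:
        """Estimated tokens consumed by each occupant of the window."""
        return {name: rough_tokens(value) for name, value in vars(self).items()}

    def remaining(self, window: int = WINDOW_TOKENS) -> int:
        """Tokens left over; a negative value means something must go."""
        return window - sum(self.usage().values())
```

Calling `usage()` before each model call shows which occupant is crowding out the others, which is usually the first thing to know before deciding what to compress or evict.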

The core moves

  • Select — retrieve or include only what this step needs.
  • Compress — summarise long history instead of carrying it verbatim.
  • Order — put the most decision-relevant material where the model attends to it most reliably, typically near the start or end of the window rather than buried in the middle.
  • Evict — drop or offload what no longer earns its place.
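
The four moves can be sketched as small functions over a list of candidate context items. This is an illustrative sketch under stated assumptions, not a reference implementation: the `Item` type, the `relevance` score, the thresholds, and the truncation used in place of a real summary are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    relevance: float  # hypothetical score in [0, 1] for the current step


def rough_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: count whitespace-separated words."""
    return len(text.split())


def select(items, min_relevance=0.3):
    """Select: keep only what this step needs."""
    return [i for i in items if i.relevance >= min_relevance]


def compress(item, max_words=20):
    """Compress: truncation here stands in for a model-written summary."""
    words = item.text.split()
    if len(words) <= max_words:
        return item
    return Item(" ".join(words[:max_words]) + " ...", item.relevance)


def evict(items, budget):
    """Evict: drop the least relevant items until the rest fits the budget."""
    kept = sorted(items, key=lambda i: i.relevance, reverse=True)
    while kept and sum(rough_tokens(i.text) for i in kept) > budget:
        kept.pop()  # the last element is the lowest-relevance survivor
    return kept


def order(items):
    """Order: most relevant last, closest to the live request (an assumption)."""
    return sorted(items, key=lambda i: i.relevance)


def curate(items, budget):
    """Apply the moves in sequence; ordering runs last, on what survives."""
    chosen = [compress(i) for i in select(items)]
    return order(evict(chosen, budget))
```

In this sketch ordering is applied after eviction so the final layout reflects only what survived. Placing the most relevant item last is one reasonable choice among several; placing it first is equally defensible.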

Example. A coding agent fifty turns deep cannot keep every file it has read in context. Good context engineering keeps a compact running summary, re-retrieves specific files on demand, and lets the rest go — trading perfect recall for a working memory that still fits.
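
A minimal sketch of that pattern, assuming a hypothetical `WorkingMemory` class: it keeps a one-line note per file seen, holds only a few files verbatim, and re-reads evicted files when asked again. The line-count note is a placeholder for a model-written summary, and the injectable `read_file` parameter is an assumption added so the sketch can run without touching disk.

```python
from pathlib import Path
from typing import Callable


def read_from_disk(path: str) -> str:
    return Path(path).read_text()


class WorkingMemory:
    """Compact running summary plus a small set of files held verbatim."""

    def __init__(self, max_open_files: int = 2,
                 read_file: Callable[[str], str] = read_from_disk):
        self.notes: dict[str, str] = {}       # path -> short note, kept for every file seen
        self.open_files: dict[str, str] = {}  # path -> contents, least recently used first
        self.max_open_files = max_open_files
        self.read_file = read_file

    def open(self, path: str) -> str:
        """Return file contents, re-retrieving them if they were evicted."""
        if path in self.open_files:
            # Move to the end of the dict to mark it as most recently used.
            self.open_files[path] = self.open_files.pop(path)
            return self.open_files[path]
        contents = self.read_file(path)
        if len(self.open_files) >= self.max_open_files:
            oldest = next(iter(self.open_files))
            del self.open_files[oldest]  # evict; its note stays in the summary
        self.open_files[path] = contents
        self.notes[path] = f"{len(contents.splitlines())} lines"
        return contents

    def render(self) -> str:
        """Build the text that would be placed in the context window."""
        lines = ["Files seen so far:"]
        lines += [f"- {path}: {note}" for path, note in self.notes.items()]
        for path, contents in self.open_files.items():
            lines += [f"--- {path} ---", contents]
        return "\n".join(lines)
```

The agent keeps a trace of every file it has read, but only the most recently used ones occupy the window in full. Anything evicted costs one extra read to bring back.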

References & further reading