How Orb reuses the LLM KV cache — and where the reasoning toggle forks it

A companion to docs/architecture/kv-cache.md. This walkthrough covers single-model mode: Director, Writer, and Editor all run on the same endpoint and model. Step through a full turn (Director → Writer → Editor), then across turns, then try the toggles at the end.

Concept
Reasoning per pass: click to flip · the passes below redraw live
An LLM call
Legend: served from cache (hit) · computed now (miss) · warm prefix on server
The inference server
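The cache reuses only a matching prefix of a prompt, compared from the top down: everything up to the first difference is a hit, and everything after it is computed again, even if later parts are identical.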
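
The prefix rule can be sketched in a few lines of Python. This is a minimal illustration, not Orb's or any inference server's actual implementation: the function names and the segment labels (`system`, `history`, `director_instructions`, and so on) are hypothetical, and a real server compares token IDs rather than named segments.

```python
from typing import List, Sequence, Tuple


def cached_prefix_len(warm: Sequence[str], prompt: Sequence[str]) -> int:
    """Count how many leading items of `prompt` match the warm prefix."""
    n = 0
    for cached_item, new_item in zip(warm, prompt):
        if cached_item != new_item:
            break  # first mismatch ends reuse; nothing after it can hit
        n += 1
    return n


def split_hit_miss(
    warm: Sequence[str], prompt: Sequence[str]
) -> Tuple[List[str], List[str]]:
    """Split `prompt` into the part served from cache and the part computed now."""
    n = cached_prefix_len(warm, prompt)
    return list(prompt[:n]), list(prompt[n:])


# Hypothetical segments: what the server holds after a Director call,
# and what a Writer call sends next.
warm = ["system", "history", "director_instructions"]
writer = ["system", "history", "writer_instructions"]

hit, miss = split_hit_miss(warm, writer)
print(hit)   # ['system', 'history']
print(miss)  # ['writer_instructions']

# A change at the very top invalidates everything below it,
# even though the later segments are unchanged.
changed_top = ["system_v2", "history", "writer_instructions"]
hit2, miss2 = split_hit_miss(warm, changed_top)
print(hit2)  # []
```

The second case is the one to keep in mind when reading the passes: anything that alters the top of the prompt costs the whole prompt, while a change near the bottom costs only the tail.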