Coarsest review cadence: the Builder self-certifies the build phases and the Adversary does ONE comprehensive cold-verification of the whole accumulated build in a final `review` phase (vs orig per-phase, lean per-gate). Full original prompts + a DEFERRED REVIEW CADENCE override, so it isolates verification cadence. Cheapest coordination; the trade-off is the independent check arrives late (late rework risk + self-certification drift on build phases). README spells it out. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
49 lines
2.7 KiB
Markdown
49 lines
2.7 KiB
Markdown
# Builder/Adversary example — deferred review (verify after a long segment)
|
||
|
||
The coarsest point on the **review-cadence spectrum**. Same pattern, same full original prompts as
|
||
`../builder-adversary` — only *when* the Adversary verifies changes:
|
||
|
||
| variant | the Adversary verifies… | handshakes (calculator task) |
|
||
|---|---|--:|
|
||
| `builder-adversary-lean` | per **gate** | ~12 claim/verify round-trips |
|
||
| `builder-adversary` (orig) | per **phase** | ~3 |
|
||
| **`builder-adversary-deferred`** | **once, after the whole build** | **1** |
|
||
|
||
## How it works
|
||
|
||
The Builder **self-certifies** the build phases (`wc`, then `json`) — builds to each phase's DoD, runs
|
||
its own tests until green, writes `## DONE`, and advances *without* waiting for the Adversary. The
|
||
Adversary stays out of the build. Only in the final **`review` phase** does it do **one comprehensive
|
||
cold-verification of the entire accumulated calculator** (`plans/review.md`): re-run every DoD item
|
||
from every phase from a fresh clone, plus cross-feature break-it probes, file all findings at once,
|
||
re-verify after fixes, then PASS. That single pass is the only adversary gate in the run.
|
||
|
||
## The trade-off
|
||
|
||
- **Cheapest coordination.** One handshake instead of 3–12 — no per-gate/per-phase round-trips, the
|
||
Builder isn't interrupted mid-build. (The benchmark showed coordination round-trips are a real
|
||
token cost; deferring to one pass minimises them.)
|
||
- **But the independent check arrives late.** Two risks the per-gate/per-phase cadences guard
|
||
against:
|
||
- **Late discovery / rework.** If the Builder built phase 2 on a wrong assumption from phase 1, an
|
||
early adversary would have caught it at gate 1; here it surfaces only at the end, after more work
|
||
was piled on the flaw — potentially a larger, costlier fix.
|
||
- **Self-certification drift.** The build phases are self-certified, so a bug the Builder
|
||
rubber-stamps survives until the final review. The comprehensive pass is the only safety net, so
|
||
it must be thorough.
|
||
- **Better at cross-feature bugs.** Because it verifies the whole system at once, it's positioned to
|
||
catch *interactions* (e.g. `--json` × every flag) that a per-gate view, looking at one item at a
|
||
time, can miss.
|
||
|
||
So `deferred` trades *early, incremental* assurance for *minimal coordination + one holistic pass*.
|
||
It suits work where features are independent and cheap to fix late; it's risky where early decisions
|
||
constrain later ones.
|
||
|
||
```bash
|
||
python3 ../../agents.py status --config agents.toml
|
||
python3 ../../agents.py up --config agents.toml # needs `claude` on PATH
|
||
```
|
||
|
||
> **Prompt base:** the full original `builder-adversary` prompts + a DEFERRED REVIEW CADENCE override
|
||
> — so comparing this to `builder-adversary`/`lean` isolates *only* the verification cadence.
|