A self-contained examples/builder-adversary/ that distills the cc-ci production loop pair into a tiny, fully-local task (build a `wc` CLI in two phases):

- agents.toml: builder + adversary loops, persistent orchestrator, on_complete reporter, cleanlogs service; phase machine with a per-phase model override
- prompts/: kickoff template + builder/adversary roles carrying the load-bearing protocol (claim()/review() handoff, machine-docs file-location rule, WHAT+HOW+EXPECTED+WHERE=STATUS / WHY=JOURNAL anti-anchoring, WAITING-UNTIL liveness)
- plans/: two phase plans (wc, json) each with a cold-verifiable Definition of Done
- README: how to run, the work-repo two-clone isolation model, how to adapt

Verified: `agents.py status --config agents.toml` parses and lists all agents.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

# Builder/Adversary example

A complete, self-contained instance of the **Builder/Adversary loop pair**, the pattern
[cc-ci](https://git.autonomic.zone) runs in production, distilled to a tiny, fully-local task so you
can read it end-to-end and run it without any infrastructure.

Two AI loops work the same plan but never trust each other; they coordinate **only through a git
repo**:

- **Builder** (`prompts/builder.md`): builds to the phase plan's Definition of Done, and *claims*
  each gate with a `claim(...)`-prefixed commit when it believes a DoD item is met.
- **Adversary** (`prompts/adversary.md`): *disbelieves* the Builder, cold-verifies every claim from
  its **own clone**, and records PASS/FAIL with a `review(...)`-prefixed commit. Holds veto.
- **Orchestrator** (persistent) supervises; **Reporter** (one-shot) writes a summary when the phase
  sequence finishes.

The watchdog keeps the loops alive, paces them, and turns those commit prefixes into the handoff:
a `claim(` commit pings the Adversary, a `review(` commit pings the Builder.

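The prefix-based handoff can be pictured as a small routing function. The sketch below is an illustration only, not the harness's own code: `route_commit`, the `PREFIX_TO_TARGET` table, and the sample commit subjects are hypothetical, and it assumes the watchdog looks only at the start of the commit subject line.

```python
from typing import Optional

# Hypothetical routing table (an assumption, not taken from agents.py):
# a claim wakes the Adversary, a review wakes the Builder.
PREFIX_TO_TARGET = {
    "claim(": "adversary",
    "review(": "builder",
}


def route_commit(subject: str) -> Optional[str]:
    """Return the loop to ping for a commit subject, or None for ordinary commits."""
    for prefix, target in PREFIX_TO_TARGET.items():
        if subject.startswith(prefix):
            return target
    return None


if __name__ == "__main__":
    # Made-up commit subjects, for illustration only.
    for subject in ("claim(wc): counts verified", "review(wc): PASS", "fix typo"):
        print(f"{subject!r} -> {route_commit(subject)}")
```

Matching on the start of the subject keeps ordinary commits that merely mention `claim(` somewhere in the message from triggering a ping.
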
## Files

```
agents.toml          the whole project: backends, the 4 agents + a service, the phase machine
prompts/
  kickoff.md         per-phase preamble (slots {phase_id}/{plan}/{status}/{role})
  builder.md         Builder role + loop protocol
  adversary.md       Adversary role + anti-anchoring verification discipline
plans/
  wc.md              phase 1: build a `wc` CLI (the single source of truth for that phase)
  json.md            phase 2: add `--json` (shows a per-phase model override)
machine-docs/        where the loops write STATUS / REVIEW / BACKLOG / JOURNAL at runtime
```

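The four kickoff slots in the listing above look like ordinary format fields. As a rough sketch of how such a preamble could be filled in per phase, assuming `str.format`-style substitution: the template text, `render_kickoff`, and the sample values (including the status path) are all hypothetical, and the real `prompts/kickoff.md` and harness may work differently.

```python
# Hypothetical template text; the real prompts/kickoff.md will differ.
KICKOFF_TEMPLATE = (
    "Phase: {phase_id}\n"
    "Plan: {plan}\n"
    "Status file: {status}\n"
    "Role: {role}\n"
)


def render_kickoff(template: str, phase_id: str, plan: str, status: str, role: str) -> str:
    """Fill the four slots named in the file listing: phase_id, plan, status, role."""
    return template.format(phase_id=phase_id, plan=plan, status=status, role=role)


if __name__ == "__main__":
    # Sample values are assumptions chosen for illustration.
    print(render_kickoff(KICKOFF_TEMPLATE, "wc", "plans/wc.md",
                         "machine-docs/STATUS.md", "builder"))
```
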
## The task

Build a small `wc` clone (`wc.py` + a `pytest` suite) in the **work repo**, in two phases. It is
deliberately trivial and offline: the point is to exercise the *protocol* (claim → cold-verify →
PASS/FAIL → advance), not to build anything hard. See `plans/wc.md` and `plans/json.md` for the
Definitions of Done.

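For a sense of scale, the counting core of such a tool fits in a few lines. This is a sketch only, not the contents of `plans/wc.md` or anything the Builder produces: the `Counts` type and `count` function are hypothetical names, and it assumes the usual `wc` convention of newline count, whitespace-separated words, and bytes.

```python
from typing import NamedTuple


class Counts(NamedTuple):
    lines: int
    words: int
    nbytes: int


def count(data: bytes) -> Counts:
    """Count newline characters, whitespace-separated words, and bytes."""
    return Counts(lines=data.count(b"\n"), words=len(data.split()), nbytes=len(data))


if __name__ == "__main__":
    print(count(b"hello world\nsecond line\n"))
    # prints: Counts(lines=2, words=4, nbytes=24)
```
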
## Run it

Needs `claude` on `PATH` (the loops are real agents). From this directory:

```bash
python3 ../../agents.py status --config agents.toml        # read-only: what would run
python3 ../../agents.py up --config agents.toml            # start builder + adversary + orchestrator + watchdog
python3 ../../agents.py logs builder --config agents.toml  # the Builder loop's logs
python3 ../../agents.py phase show --config agents.toml    # phase machine state
python3 ../../agents.py down --config agents.toml          # stop everything
```

To watch the **mechanics** without an agent CLI, set `defaults.backend = "demo"` in `agents.toml`
(the demo backend just idles) and run `up` / `status` / `down`: sessions start and the watchdog
ticks, but no real work happens. The repo's top-level `./smoke.sh` shows this end-to-end for the
sibling `agents.example.toml`.

## The work repo (and isolation)

The loops build in a **work repo**, set by `handoff.repo` in `agents.toml` (here `./work`). For this
quick start both loops can share it, but the pattern's real strength is **cold verification**: give
each loop its **own clone of the same remote** so the Adversary verifies from a genuinely
independent checkout (exactly what cc-ci does with separate `cc-ci` / `cc-ci-adv` clones).

To set that up:

1. Create the work repo with a remote both loops can push to and pull from (any git host, or a
   bare repo on the same box). Put `machine-docs/` in it.
2. Clone it twice: into `./work` (Builder's `dir`) and `./work-adv` (Adversary's `dir`).
3. Point `handoff.repo` at the Builder's clone (`./work`).

The watchdog then watches that repo's `origin/main` for `claim(`/`review(` commits and the two
`*-INBOX.md` files, and pings the right loop on each.

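The first two setup steps can be scripted. The following is a minimal sketch assuming a bare repo on the same machine: the `remote.git` name and the helper functions are hypothetical, and `--initial-branch` needs Git 2.28 or newer. By default it only prints the commands; nothing runs unless you call `run(...)`.

```python
import subprocess
from pathlib import Path
from typing import List


def setup_commands(root: Path) -> List[List[str]]:
    """Build the git commands for one bare remote and two clones of it."""
    remote = root / "remote.git"   # hypothetical name for the shared bare repo
    work = root / "work"           # Builder's clone
    work_adv = root / "work-adv"   # Adversary's clone
    return [
        # Name the branch main so that origin/main is the branch that exists.
        ["git", "init", "--bare", "--initial-branch=main", str(remote)],
        ["git", "clone", str(remote), str(work)],
        ["git", "clone", str(remote), str(work_adv)],
    ]


def run(commands: List[List[str]]) -> None:
    """Execute each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    for argv in setup_commands(Path(".")):
        print(" ".join(argv))   # dry run; call run(...) to actually execute
```

This leaves both clones empty: `machine-docs/` still has to be committed and pushed from one of them, and `handoff.repo` still has to point at `./work`.
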
## How to adapt it

- **Different task** → rewrite `plans/*.md` (each is one phase's source of truth + DoD) and adjust
  the `[loop].phases` list. Nothing else needs to change.
- **More/fewer phases** → add or remove entries in `[loop].phases`; the watchdog advances when a
  phase's `status` file contains `## DONE`.
- **Per-phase models** → `models = { builder = "...", adversary = "..." }` on a phase (see `json`).
- **A periodic supervisor nudge** → uncomment the `wake = { ... }` line on the `orchestrator` agent.

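The advance rule in the list above (move on once a phase's status file contains `## DONE`) can be sketched as follows. This illustrates the idea rather than the watchdog's actual logic: the function names are hypothetical, and it assumes the marker is matched as a heading at the start of a line, whereas the real check may be a plain substring match.

```python
from pathlib import Path
from typing import List, Optional

DONE_MARKER = "## DONE"


def phase_is_done(status_text: str) -> bool:
    """True if any line of the status text starts with the DONE marker."""
    return any(line.strip().startswith(DONE_MARKER) for line in status_text.splitlines())


def current_phase(status_files: List[Path]) -> Optional[Path]:
    """Return the first status file not yet marked DONE, or None if all are done."""
    for path in status_files:
        # A missing status file counts as "not done yet".
        text = path.read_text(encoding="utf-8") if path.exists() else ""
        if not phase_is_done(text):
            return path
    return None
```

With phases checked in order, the first status file without the marker identifies the active phase, and `None` means the whole sequence has finished.
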
This example carries **no** project-orchestrator/fleet metadata: like any project, it can be run by
hand and has no idea a fleet exists. See the repo root `README.md` for the full harness reference.