A self-contained examples/builder-adversary/ that distills the cc-ci production loop pair into a tiny, fully-local task (build a `wc` CLI in two phases):

- agents.toml: builder + adversary loops, persistent orchestrator, on_complete reporter, cleanlogs service; phase machine with a per-phase model override
- prompts/: kickoff template + builder/adversary roles carrying the load-bearing protocol (claim()/review() handoff, machine-docs file-location rule, WHAT+HOW+EXPECTED+WHERE=STATUS / WHY=JOURNAL anti-anchoring, WAITING-UNTIL liveness)
- plans/: two phase plans (wc, json) each with a cold-verifiable Definition of Done
- README: how to run, the work-repo two-clone isolation model, how to adapt

Verified: `agents.py status --config agents.toml` parses and lists all agents.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Builder/Adversary example
A complete, self-contained instance of the Builder/Adversary loop pair — the pattern cc-ci runs in production, distilled to a tiny, fully-local task so you can read it end-to-end and run it without any infrastructure.
Two AI loops work the same plan but never trust each other; they coordinate only through a git repo:
- Builder (`prompts/builder.md`): builds to the phase plan's Definition of Done, and claims each gate with a `claim(...)`-prefixed commit when it believes a DoD item is met.
- Adversary (`prompts/adversary.md`): disbelieves the Builder, cold-verifies every claim from its own clone, and records PASS/FAIL with a `review(...)`-prefixed commit. Holds veto.
- Orchestrator (persistent) supervises; Reporter (one-shot) writes a summary when the phase sequence finishes.
The watchdog keeps the loops alive, paces them, and turns those commit prefixes into the handoff: a `claim(` commit pings the Adversary, and a `review(` commit pings the Builder.
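The prefix-to-ping rule can be sketched in a few lines of Python. This is only an illustration of the routing idea, not the watchdog's actual code: the `HANDOFF` table, the `loop_to_ping` helper, and the exact subject format after the prefix (`claim(wc-1): ...`) are assumptions made for the example.

```python
from typing import Optional

# Hypothetical routing table: commit-subject prefix -> loop to wake.
# Only the two prefixes themselves come from the README.
HANDOFF = {"claim(": "adversary", "review(": "builder"}


def loop_to_ping(subject: str) -> Optional[str]:
    """Return the loop a commit subject should wake, or None for ordinary commits."""
    for prefix, target in HANDOFF.items():
        if subject.startswith(prefix):
            return target
    return None


# The subjects below are made-up examples of what a loop might commit.
print(loop_to_ping("claim(wc-1): line count matches"))
print(loop_to_ping("review(wc-1): PASS"))
print(loop_to_ping("fix typo in README"))
```

Commits without either prefix wake nobody, which is what lets the loops make ordinary working commits between handoffs.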
Files
agents.toml      the whole project: backends, the 4 agents + a service, the phase machine
prompts/
  kickoff.md     per-phase preamble (slots {phase_id}/{plan}/{status}/{role})
  builder.md     Builder role + loop protocol
  adversary.md   Adversary role + anti-anchoring verification discipline
plans/
  wc.md          phase 1 — build a `wc` CLI (the single source of truth for that phase)
  json.md        phase 2 — add `--json` (shows a per-phase model override)
machine-docs/    where the loops write STATUS / REVIEW / BACKLOG / JOURNAL at runtime
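To make the kickoff slots concrete, here is a minimal sketch of filling a template that uses the four slot names listed above. The template text, the `render_kickoff` helper, the `machine-docs/STATUS.md` path, and the use of `str.format` are all assumptions for illustration; the harness may substitute the slots differently.

```python
# Hypothetical stand-in for prompts/kickoff.md. Only the four slot names
# ({phase_id}, {plan}, {status}, {role}) are taken from the file listing.
KICKOFF = (
    "Phase: {phase_id}\n"
    "Plan: {plan}\n"
    "Status file: {status}\n"
    "Role: {role}\n"
)


def render_kickoff(template: str, phase_id: str, plan: str, status: str, role: str) -> str:
    """Fill the four kickoff slots; raises KeyError if the template names an unknown slot."""
    return template.format(phase_id=phase_id, plan=plan, status=status, role=role)


# The status path is an assumed file name inside machine-docs/.
preamble = render_kickoff(KICKOFF, "wc", "plans/wc.md", "machine-docs/STATUS.md", "builder")
print(preamble)
```

Because the same template is rendered once per phase and per role, a new phase needs a new plan file but no new prompt.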
The task
Build a small wc clone (wc.py + a pytest suite) in the work repo, in two phases. It is
deliberately trivial and offline — the point is to exercise the protocol (claim → cold-verify →
PASS/FAIL → advance), not to build anything hard. See plans/wc.md and plans/json.md for the
Definitions of Done.
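For a sense of scale, the core of a phase 1 result could be as small as the sketch below. It is not the contents of `plans/wc.md` or of any `wc.py` the Builder produces; the `count` function and its return shape are assumptions, chosen to mirror how the standard `wc` counts newlines, whitespace-separated words, and bytes.

```python
from typing import Tuple


def count(data: bytes) -> Tuple[int, int, int]:
    """Return (lines, words, bytes) for the given input.

    Lines are counted as newline characters, so a final line without
    a trailing newline does not add to the line count.
    """
    lines = data.count(b"\n")
    words = len(data.split())  # split() with no argument splits on ASCII whitespace
    return lines, words, len(data)


# A pytest-style check the Adversary could re-run from its own clone.
def test_count() -> None:
    assert count(b"hello world\nfoo\n") == (2, 3, 16)
    assert count(b"no newline") == (0, 2, 10)
    assert count(b"") == (0, 0, 0)


test_count()
```

A check like `test_count` is what makes a claim cold-verifiable: the Adversary does not need the Builder's word, only the repo and a test runner.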
Run it
Needs claude on PATH (the loops are real agents). From this directory:
python3 ../../agents.py status --config agents.toml # read-only: what would run
python3 ../../agents.py up --config agents.toml # start builder + adversary + orchestrator + watchdog
python3 ../../agents.py logs builder --config agents.toml
python3 ../../agents.py phase show --config agents.toml
python3 ../../agents.py down --config agents.toml # stop everything
To watch the mechanics without an agent CLI, set defaults.backend = "demo" in agents.toml
(the demo backend just idles) and run up / status / down — sessions start and the watchdog
ticks, but no real work happens. The repo's top-level ./smoke.sh shows this end-to-end for the
sibling agents.example.toml.
The work repo (and isolation)
The loops build in a work repo — handoff.repo in agents.toml, here ./work. For this
quick start both loops can share it, but the pattern's real strength is cold verification: give
each loop its own clone of the same remote so the Adversary verifies from a genuinely
independent checkout (exactly what cc-ci does with separate cc-ci / cc-ci-adv clones).
To set that up:
- Create the work repo with a remote both loops can push/pull (any git host, or a bare repo on the same box). Put `machine-docs/` in it.
- Clone it twice: into `./work` (Builder's `dir`) and `./work-adv` (Adversary's `dir`).
- Point `handoff.repo` at the Builder's clone (`./work`).
The watchdog then watches that repo's origin/main for claim(/review( commits and the two
*-INBOX.md files, and pings the right loop on each.
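The two-clone layout can be sketched as a short command plan. The helper below only builds and prints the git commands for the bare-repo-on-the-same-box variant; the `two_clone_commands` name and the `./work-remote.git` path are assumptions for illustration, while `./work` and `./work-adv` come from the steps above.

```python
from typing import List


def two_clone_commands(remote: str,
                       builder_dir: str = "./work",
                       adversary_dir: str = "./work-adv") -> List[List[str]]:
    """Return the git commands for one bare remote plus two independent clones."""
    return [
        ["git", "init", "--bare", remote],       # shared remote both loops push/pull
        ["git", "clone", remote, builder_dir],   # Builder's checkout
        ["git", "clone", remote, adversary_dir], # Adversary's independent checkout
    ]


# "./work-remote.git" is a hypothetical location for the bare repo.
for cmd in two_clone_commands("./work-remote.git"):
    print(" ".join(cmd))
```

To execute the plan rather than print it, each list can be passed to `subprocess.run(cmd, check=True)`. Adding `machine-docs/` and pushing the first commit is left out of the sketch.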
How to adapt it
- Different task → rewrite `plans/*.md` (each is one phase's source of truth + DoD) and adjust the `[loop].phases` list. Nothing else needs to change.
- More/fewer phases → add or remove entries in `[loop].phases`; the watchdog advances when a phase's `status` file contains `## DONE`.
- Per-phase models → `models = { builder = "...", adversary = "..." }` on a phase (see `json`).
- A periodic supervisor nudge → uncomment the `wake = { ... }` line on the `orchestrator` agent.
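The phase-advance rule in the list above can be sketched as follows. This is an illustration under stated assumptions, not the watchdog's implementation: `phase_is_done` and `next_phase` are hypothetical helpers, and the sketch treats `## DONE` as a heading on its own line, which is stricter than a plain substring check.

```python
from typing import List, Optional


def phase_is_done(status_text: str) -> bool:
    """True if some line of the status file is exactly the `## DONE` heading."""
    return any(line.strip() == "## DONE" for line in status_text.splitlines())


def next_phase(phases: List[str], current: str, status_text: str) -> Optional[str]:
    """Stay on the current phase until it is done, then move to the next one.

    Returns None once the last phase is done, i.e. the sequence has finished.
    """
    if not phase_is_done(status_text):
        return current
    i = phases.index(current)
    return phases[i + 1] if i + 1 < len(phases) else None


phases = ["wc", "json"]  # phase ids as named in this example
print(next_phase(phases, "wc", "# STATUS\nworking on gate 2\n"))
print(next_phase(phases, "wc", "# STATUS\n## DONE\n"))
print(next_phase(phases, "json", "## DONE\n"))
```

Adding or removing a phase then amounts to editing the list; the advance logic itself does not change.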
This example carries no project-orchestrator/fleet metadata — like any project, it can be run by
hand and has no idea a fleet exists. See the repo root README.md for the full harness reference.