agent-orchestrator/examples/builder-adversary
mfowler 7f237a522c docs(examples): add a Builder/Adversary loop-pair example (the cc-ci pattern)
A self-contained examples/builder-adversary/ that distills the cc-ci production
loop pair into a tiny, fully-local task (build a `wc` CLI in two phases):

- agents.toml: builder + adversary loops, persistent orchestrator, on_complete
  reporter, cleanlogs service; phase machine with a per-phase model override
- prompts/: kickoff template + builder/adversary roles carrying the load-bearing
  protocol (claim()/review() handoff, machine-docs file-location rule,
  WHAT+HOW+EXPECTED+WHERE=STATUS / WHY=JOURNAL anti-anchoring, WAITING-UNTIL liveness)
- plans/: two phase plans (wc, json) each with a cold-verifiable Definition of Done
- README: how to run, the work-repo two-clone isolation model, how to adapt

Verified: `agents.py status --config agents.toml` parses and lists all agents.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 17:50:42 +00:00

Builder/Adversary example

A complete, self-contained instance of the Builder/Adversary loop pair — the pattern cc-ci runs in production, distilled to a tiny, fully-local task so you can read it end-to-end and run it without any infrastructure.

Two AI loops work the same plan but never trust each other; they coordinate only through a git repo:

  • Builder (prompts/builder.md) — builds to the phase plan's Definition of Done, and claims each gate with a claim(...)-prefixed commit when it believes a DoD item is met.
  • Adversary (prompts/adversary.md) — disbelieves the Builder, cold-verifies every claim from its own clone, and records PASS/FAIL with a review(...)-prefixed commit. Holds veto.
  • Orchestrator (persistent) supervises; Reporter (one-shot) writes a summary when the phase sequence finishes.

The watchdog keeps the loops alive, paces them, and turns those commit prefixes into the handoff: a claim( commit pings the Adversary, a review( commit pings the Builder.
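The prefix-based handoff described above can be sketched as a small routing function. This is an illustration only, not the watchdog's actual code: `ROUTES`, `loop_to_ping`, and the example commit subjects are hypothetical names chosen for this sketch.

```python
from typing import Optional

# Hypothetical routing table: commit-subject prefix -> loop to ping.
# The prefixes come from the README; the table itself is an assumption.
ROUTES = {"claim(": "adversary", "review(": "builder"}


def loop_to_ping(subject: str) -> Optional[str]:
    """Return the loop a commit subject should wake, or None for ordinary commits."""
    for prefix, target in ROUTES.items():
        if subject.startswith(prefix):
            return target
    return None


# Example subjects are made up for illustration.
print(loop_to_ping("claim(wc): line count meets DoD item 1"))  # adversary
print(loop_to_ping("review(wc): PASS item 1"))                 # builder
print(loop_to_ping("docs: tidy README"))                       # None
```

Keying on the commit subject alone means neither loop has to call the other directly: the git history is the only channel between them.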

Files

agents.toml            the whole project: backends, the 4 agents + a service, the phase machine
prompts/
  kickoff.md           per-phase preamble (slots {phase_id}/{plan}/{status}/{role})
  builder.md           Builder role + loop protocol
  adversary.md         Adversary role + anti-anchoring verification discipline
plans/
  wc.md                phase 1 — build a `wc` CLI (the single source of truth for that phase)
  json.md              phase 2 — add `--json` (shows a per-phase model override)
machine-docs/          where the loops write STATUS / REVIEW / BACKLOG / JOURNAL at runtime

The task

Build a small wc clone (wc.py + a pytest suite) in the work repo, in two phases. It is deliberately trivial and offline — the point is to exercise the protocol (claim → cold-verify → PASS/FAIL → advance), not to build anything hard. See plans/wc.md and plans/json.md for the Definitions of Done.
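To give a sense of scale, here is a rough sketch of the kind of code the two phases might produce. It is not the contents of the plans: the function names `count` and `render`, the JSON key names, and the output layout are all assumptions made for illustration, and the real Definitions of Done live in plans/wc.md and plans/json.md.

```python
import json


def count(data: bytes) -> dict:
    """Count newlines, whitespace-separated words, and bytes, as default wc does."""
    return {
        "lines": data.count(b"\n"),   # wc -l counts newline characters
        "words": len(data.split()),   # split on runs of ASCII whitespace
        "bytes": len(data),
    }


def render(counts: dict, as_json: bool = False) -> str:
    """Format counts as plain text (phase 1) or JSON (phase 2, assumed shape)."""
    if as_json:
        return json.dumps(counts)
    return f"{counts['lines']} {counts['words']} {counts['bytes']}"


sample = b"hello world\nfoo\n"
print(render(count(sample)))                # 2 3 16
print(render(count(sample), as_json=True))  # {"lines": 2, "words": 3, "bytes": 16}
```

A task this small keeps each claim cheap to cold-verify, which is what lets the protocol itself be the thing under study.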

Run it

Requires the claude CLI on PATH (the loops are real agents). From this directory:

python3 ../../agents.py status --config agents.toml      # read-only: what would run
python3 ../../agents.py up     --config agents.toml      # start builder + adversary + orchestrator + watchdog
python3 ../../agents.py logs   builder  --config agents.toml
python3 ../../agents.py phase  show     --config agents.toml
python3 ../../agents.py down   --config agents.toml      # stop everything

To watch the mechanics without an agent CLI, set defaults.backend = "demo" in agents.toml (the demo backend just idles) and run up / status / down — sessions start and the watchdog ticks, but no real work happens. The repo's top-level ./smoke.sh shows this end-to-end for the sibling agents.example.toml.

The work repo (and isolation)

The loops build in a work repo, set by handoff.repo in agents.toml (here ./work). For this quick start both loops can share it, but the pattern's real strength is cold verification: give each loop its own clone of the same remote so the Adversary verifies from a genuinely independent checkout (exactly what cc-ci does with separate cc-ci / cc-ci-adv clones).

To set that up:

  1. Create the work repo with a remote both loops can push/pull (any git host, or a bare repo on the same box). Put machine-docs/ in it.
  2. Clone it twice: into ./work (Builder's dir) and ./work-adv (Adversary's dir).
  3. Point handoff.repo at the Builder's clone (./work).

The watchdog then watches that repo's origin/main for claim(/review( commits and the two *-INBOX.md files, and pings the right loop on each.
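The three setup steps above can be sketched with a local bare repo standing in for the shared remote. This is one possible way to do it, assuming git is installed; the helper names `git` and `make_two_clones`, the `remote.git` directory name, the `.gitkeep` placeholder, and the commit identity are all hypothetical choices for this sketch.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Tuple


def git(*args: str, cwd: Path) -> str:
    """Run a git command in cwd and return its trimmed stdout."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def make_two_clones(root: Path) -> Tuple[Path, Path]:
    """Create a bare remote under root plus Builder and Adversary clones of it."""
    remote = root / "remote.git"
    git("init", "--bare", str(remote), cwd=root)
    # Make main the default branch regardless of the local git configuration.
    git("symbolic-ref", "HEAD", "refs/heads/main", cwd=remote)

    work, work_adv = root / "work", root / "work-adv"
    git("clone", str(remote), str(work), cwd=root)
    git("symbolic-ref", "HEAD", "refs/heads/main", cwd=work)

    # Seed machine-docs/ so both clones start with it (step 1).
    (work / "machine-docs").mkdir()
    (work / "machine-docs" / ".gitkeep").write_text("")
    git("add", "machine-docs", cwd=work)
    git("-c", "user.name=example", "-c", "user.email=example@example.invalid",
        "commit", "-m", "chore: add machine-docs", cwd=work)
    git("push", "-u", "origin", "main", cwd=work)

    # Second, independent checkout for the Adversary (step 2).
    git("clone", str(remote), str(work_adv), cwd=root)
    return work, work_adv


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        builder_dir, adversary_dir = make_two_clones(Path(tmp))
        print(builder_dir.name, adversary_dir.name)
```

With this layout, handoff.repo would point at the Builder's clone (step 3), and the Adversary only ever sees what has actually been pushed to the remote.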

How to adapt it

  • Different task → rewrite plans/*.md (each is one phase's source of truth + DoD) and adjust the [loop].phases list. Nothing else needs to change.
  • More/fewer phases → add or remove entries in [loop].phases; the watchdog advances when a phase's status file contains ## DONE.
  • Per-phase models → models = { builder = "...", adversary = "..." } on a phase (see json).
  • A periodic supervisor nudge → uncomment the wake = { ... } line on the orchestrator agent.

This example carries no project-orchestrator/fleet metadata — like any project, it can be run by hand and has no idea a fleet exists. See the repo root README.md for the full harness reference.