Files

mfowler a0f7652e9e docs(examples): add builder-solo — single builder, no adversary (control)

A single Builder that builds AND self-verifies (same DoD rigor), with NO
independent Adversary and no claim/review handoff. The control for measuring
what the AI adversary costs (its tokens, ~half of a loop-pair run) and buys
(independent cold verification vs self-certification).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-15 02:34:50 +00:00

machine-docs
plans
prompts
agents.toml
README.md

README.md

Builder-solo example — no Adversary (self-verification baseline)

A single Builder agent with the same task spec as `../builder-adversary`, but with no Adversary: the Builder builds and verifies its own work, then self-certifies `## DONE`. There is no `claim(`/`review(` handoff, because there is nothing to hand off to.

This is the control for the AI-as-adversary design. Comparing it against builder-adversary on the same task answers two things:

- Cost: how much of a run's tokens goes to the independent Adversary? (In the loop-pair runs the Adversary is ~45–53% of the total; this variant removes that.)
- Quality: does an independent cold verifier catch things a self-checking builder misses? Self-certification has an obvious failure mode: the same agent that wrote the bug decides whether it's a bug. This variant measures what you give up by dropping the second pair of eyes.
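The cost question comes down to simple arithmetic over per-agent token totals. Here is a minimal sketch of that calculation. The function name `adversary_share` and every token count are hypothetical, chosen only to show the shape of the comparison; they are not measurements from any run or part of `agents.py`.

```python
def adversary_share(builder_tokens: int, adversary_tokens: int) -> float:
    """Fraction of a loop-pair run's tokens spent by the Adversary."""
    total = builder_tokens + adversary_tokens
    if total <= 0:
        raise ValueError("token counts must sum to a positive number")
    return adversary_tokens / total


# Hypothetical counts, for illustration only (not benchmark results).
pair_builder, pair_adversary = 520_000, 480_000
# Hypothetical: the solo Builder also self-verifies, so it may spend
# more than the Builder half of a pair run.
solo_builder = 560_000

share = adversary_share(pair_builder, pair_adversary)
solo_vs_pair = solo_builder / (pair_builder + pair_adversary)

print(f"adversary share of pair run: {share:.0%}")
print(f"solo run tokens relative to pair run: {solo_vs_pair:.0%}")
```

The second ratio matters because removing the Adversary does not necessarily save its full share: the solo Builder still runs the verification steps itself.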

The Builder's role prompt keeps the same verification rigor (run every DoD check, try to break it, paste observed output, no self-rubber-stamping) — the only thing removed is the independent adversary. So the comparison is "independent verification vs self-verification," not "verification vs none."

python3 ../../agents.py status --config agents.toml
python3 ../../agents.py up     --config agents.toml      # needs `claude` on PATH

The agent-orchestrator-benchmark repo runs this variant head-to-head against the others on the same multi-phase task and reports token counts plus the efficiency ratios.
