A standalone repo (engine vendored as a submodule at the examples commit) that runs a head-to-head between the builder-adversary and builder-adversary-min example variants: same task, independent headless runs, both on Sonnet, with token counts. Includes the roman-numeral test problem and run-bench.sh.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Phase roman — integer → Roman numeral
Mission. In the work repo, implement roman.py plus a test_roman.py unittest suite. Pure
stdlib, no dependencies. This file is the single source of truth for the phase.
Definition of Done
- D1: to_roman(n). Returns the Roman-numeral string for an int 1 ≤ n ≤ 3999 (e.g. to_roman(1994) == "MCMXCIV").
- D2: validation. to_roman raises ValueError for n < 1, n > 3999, or a non-int.
- D3: CLI. python roman.py 1994 prints MCMXCIV (and exits 0); a bad argument exits non-zero.
- D4: tests green. test_roman.py (stdlib unittest) passes under python -m unittest, with 0 failures, covering at least: 1→I, 4→IV, 9→IX, 40→XL, 90→XC, 400→CD, 900→CM, 1994→MCMXCIV, 3888→MMMDCCCLXXXVIII, 3999→MMMCMXCIX, and ValueError on 0, 4000, and "x".
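
One way D1 to D3 could be satisfied is sketched below. This is an illustration, not the Builder's actual roman.py. The greedy value/symbol table, the helper name main, the _NUMERALS constant, and the choice to reject bool are all assumptions the spec does not fix.

```python
import sys

# Value/symbol pairs in descending order. The subtractive forms (CM, CD, XC,
# XL, IX, IV) are listed explicitly so a single greedy pass produces them.
_NUMERALS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n):
    """Return the Roman numeral for an int in 1..3999 (D1); raise ValueError otherwise (D2)."""
    # Assumption: bool is a subclass of int, and this sketch treats it as a non-int.
    if not isinstance(n, int) or isinstance(n, bool):
        raise ValueError(f"expected an int, got {n!r}")
    if not 1 <= n <= 3999:
        raise ValueError(f"{n} is outside 1..3999")
    parts = []
    for value, symbol in _NUMERALS:
        count, n = divmod(n, value)  # how many times this symbol fits, and what is left
        parts.append(symbol * count)
    return "".join(parts)


def main(argv):
    """CLI body for D3: print the numeral and return 0, or return 1 on a bad argument."""
    if len(argv) != 1:
        print("usage: python roman.py N", file=sys.stderr)
        return 1
    try:
        # int("x") raises ValueError too, so one handler covers parsing and range errors.
        print(to_roman(int(argv[0])))
    except ValueError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    return 0
```

To act as the CLI, a real roman.py would end with the usual `if __name__ == "__main__":` guard calling `sys.exit(main(sys.argv[1:]))`. It is left out of the block so the sketch can be imported or pasted into a session without exiting.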
How to verify (cold)
From a fresh clone of the work repo:
python -m unittest -q # D4: must report OK (0 failures)
python roman.py 1994 # D3: expect MCMXCIV
python roman.py 3888 # expect MMMDCCCLXXXVIII
The expected output of each command is given in its trailing comment above. The Builder restates the exact commands, expected outputs, and commit SHA
in machine-docs/STATUS-roman.md; the Adversary re-runs them from its own clone and records
roman: PASS/FAIL in machine-docs/REVIEW-roman.md.
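
The re-run step could be automated along the following lines. This is a hypothetical sketch: the spec does not say how the Adversary executes the commands, and the names check, verdict, and CHECKS are invented for illustration. It also assumes that exit code 0 from python -m unittest is enough to count as "OK".

```python
import subprocess
import sys


def check(cmd, expected_stdout=None, cwd=None):
    """Run cmd; pass if it exits 0 and, when an expectation is given, stdout matches it."""
    result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    if result.returncode != 0:
        return False
    if expected_stdout is not None and result.stdout.strip() != expected_stdout:
        return False
    return True


def verdict(checks, cwd=None):
    """Return the 'roman: PASS' or 'roman: FAIL' line for a list of (cmd, expected) pairs."""
    ok = all(check(cmd, expected, cwd) for cmd, expected in checks)
    return "roman: PASS" if ok else "roman: FAIL"


# The three cold-verification commands; sys.executable stands in for `python`.
CHECKS = [
    ([sys.executable, "-m", "unittest", "-q"], None),
    ([sys.executable, "roman.py", "1994"], "MCMXCIV"),
    ([sys.executable, "roman.py", "3888"], "MMMDCCCLXXXVIII"),
]
```

Calling `verdict(CHECKS, cwd=...)` with the path of a fresh clone would produce the line to record in the review file.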