A standalone repo (engine vendored as a submodule at the examples commit) that runs a head-to-head between the builder-adversary and builder-adversary-min example variants: same task, independent headless runs, both on Sonnet, with token counts. Includes the roman-numeral test problem and run-bench.sh. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
# Phase `roman` — integer → Roman numeral

**Mission.** In the work repo, implement `roman.py` plus a `test_roman.py` unittest suite. Pure
stdlib, no dependencies. This file is the single source of truth for the phase.

## Definition of Done

- **D1 — `to_roman(n)`.** Returns the Roman-numeral string for an int `1 ≤ n ≤ 3999`
  (e.g. `to_roman(1994) == "MCMXCIV"`).
- **D2 — validation.** `to_roman` raises `ValueError` for `n < 1`, `n > 3999`, or a non-int.
- **D3 — CLI.** `python roman.py 1994` prints `MCMXCIV` (and exits 0); a bad argument exits non-zero.
- **D4 — tests green.** `test_roman.py` (stdlib `unittest`) passes under `python -m unittest`, with
  **0 failures**, covering at least: `1→I, 4→IV, 9→IX, 40→XL, 90→XC, 400→CD, 900→CM,
  1994→MCMXCIV, 3888→MMMDCCCLXXXVIII, 3999→MMMCMXCIX`, and `ValueError` on `0`, `4000`, and `"x"`.
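One way D1 through D3 could be met is a greedy walk over a value/symbol table. The sketch below is illustrative only, not a reference solution: the `_NUMERALS` table, the `main` helper, the specific exit codes (1 and 2), and the choice to reject `bool` are assumptions that the spec does not dictate.

```python
"""Illustrative sketch of roman.py (assumed layout, not the reference solution)."""
import sys

# Value/symbol pairs in descending order, including the six subtractive forms.
_NUMERALS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n):
    """Return the Roman numeral for an int n with 1 <= n <= 3999 (D1)."""
    # D2: reject non-ints. Rejecting bool is an assumption; the spec is silent on it.
    if isinstance(n, bool) or not isinstance(n, int):
        raise ValueError(f"expected an int, got {type(n).__name__}")
    if not 1 <= n <= 3999:
        raise ValueError(f"{n} is outside 1..3999")
    parts = []
    for value, symbol in _NUMERALS:
        # Take as many copies of this symbol as fit, then continue with the remainder.
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def main(argv):
    """CLI helper (D3): print the numeral and return 0, or return non-zero on bad input."""
    if len(argv) != 1:
        print("usage: python roman.py N", file=sys.stderr)
        return 2  # assumed exit code; the spec only requires non-zero
    try:
        print(to_roman(int(argv[0])))  # int() raises ValueError on input such as "x"
    except ValueError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1  # assumed exit code; the spec only requires non-zero
    return 0
```

In an actual `roman.py`, `main` would be wired up with the usual `if __name__ == "__main__": sys.exit(main(sys.argv[1:]))` guard. The greedy loop needs no special cases because the table lists the subtractive pairs (`CM`, `CD`, `XC`, `XL`, `IX`, `IV`) alongside the seven base symbols.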
## How to verify (cold)

From a fresh clone of the work repo:

```bash
python -m unittest -q     # D4: must report OK (0 failures)
python roman.py 1994      # D3: expect MCMXCIV
python roman.py 3888      # expect MMMDCCCLXXXVIII
```
Each command's expected output is given in its trailing comment above. The Builder restates the
exact commands, expected outputs, and commit SHA in `machine-docs/STATUS-roman.md`; the Adversary
re-runs them from its own clone and records `roman: PASS`/`FAIL` in `machine-docs/REVIEW-roman.md`.
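The Adversary's re-run could be scripted roughly as follows. This is a sketch under assumptions: `CHECKS` and `verdict` are hypothetical names, `sys.executable` stands in for whichever `python` is on the Adversary's PATH, and treating any non-zero exit or stdout mismatch as a failure (spelled here as `roman: FAIL`) is only one reading of the procedure above.

```python
import subprocess
import sys

# Commands from the "How to verify" block, each paired with the stdout it must produce.
# None means only the exit status is checked (unittest writes its report to stderr).
CHECKS = [
    ([sys.executable, "-m", "unittest", "-q"], None),
    ([sys.executable, "roman.py", "1994"], "MCMXCIV"),
    ([sys.executable, "roman.py", "3888"], "MMMDCCCLXXXVIII"),
]


def verdict(workdir, checks=CHECKS):
    """Run each check inside workdir and return 'roman: PASS' or 'roman: FAIL'."""
    for argv, expected in checks:
        result = subprocess.run(argv, cwd=workdir, capture_output=True, text=True)
        if result.returncode != 0:
            return "roman: FAIL"
        if expected is not None and result.stdout.strip() != expected:
            return "roman: FAIL"
    return "roman: PASS"
```

Calling `verdict("path/to/fresh/clone")` would yield the line to record in `machine-docs/REVIEW-roman.md`; the path shown is a placeholder.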