# calculators/: the artifacts the benchmark built

Every benchmark run had a Builder/Adversary loop pair (or a solo Builder) build a Python calculator to the spec in [`../plans/calc/`](../plans/calc/). This folder preserves the **actual calculators they produced**: the 5 canonical successful runs per variant (the N=5 the analysis is based on; the wedged/limit/superseded runs are not included). 30 calculators in all.

## Layout

```
calculators//run-NN/
  calc.py              the CLI entry point
  calc/                lexer.py, parser.py, evaluator.py + test_*.py (the built calculator)
  machine-docs/        the loop's coordination artifacts for this run:
    STATUS-.md         (Builder's claims: WHAT/HOW/EXPECTED/WHERE)
    REVIEW-.md         (Adversary's verdicts + findings)
    JOURNAL-.md        (Builder's reasoning, kept out of STATUS)
    BACKLOG/DECISIONS.md
  GIT-LOG.txt          the run's commit history: the claim()/review() handshake
  SOURCE.txt           the original /tmp run path
```

The variant directory (the path segment between `calculators/` and `run-NN/`) is one of the six: `builder-adversary`, `builder-adversary-min`, `builder-adversary-stateless`, `builder-adversary-lean`, `builder-adversary-deferred`, `builder-solo`.

These are **working-tree snapshots** (not nested git repos, which would confuse the parent repo). The commit history that shows *how* each was built (the per-gate/per-phase `claim(`/`review(` exchange) is captured in each `GIT-LOG.txt`. Compare, say, a `builder-adversary-lean` log (per-gate, ~28 commits) against a `builder-adversary-deferred` log (one comprehensive review at the end) to see the cadence difference in action.

## What they're good for

- **Inspect the deliverable** each variant produced (all verified behaviorally identical, but the code/test style and volume vary; e.g. `-min` runs have leaner test suites).
- **Read the actual review exchange** in `machine-docs/REVIEW-*.md` + `GIT-LOG.txt`: the Adversary's cold verdicts, findings, and the Builder's STATUS hand-offs.
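To compare review cadence between two variants, one option is to count the handshake commits in each `GIT-LOG.txt`. The sketch below is illustrative only: it assumes each handshake commit appears on its own line and contains the literal text `claim(` or `review(`, which may not match the exact log format. The `handshake_counts` helper is hypothetical, not part of the benchmark.

```python
import re
import sys
from collections import Counter
from pathlib import Path

# Assumption: a handshake commit line contains "claim(" or "review(".
HANDSHAKE = re.compile(r"\b(claim|review)\(")


def handshake_counts(log_text: str) -> Counter:
    """Count lines that mention claim( or review( in a git log dump."""
    counts = Counter()
    for line in log_text.splitlines():
        match = HANDSHAKE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Usage: python cadence.py path/to/GIT-LOG.txt [more logs...]
    for arg in sys.argv[1:]:
        text = Path(arg).read_text(encoding="utf-8", errors="replace")
        counts = handshake_counts(text)
        print(f"{arg}: claim={counts['claim']} review={counts['review']}")
```

Passing one `builder-adversary-lean` log and one `builder-adversary-deferred` log should make the per-gate versus end-of-run difference visible as a difference in `review` counts, provided the format assumption holds.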
Run any of them:

```bash
cd calculators/builder-adversary/run-01
python -m unittest -q      # tests pass
python calc.py "2+3*4"     # 14
```

See [`../FINDINGS.md`](../FINDINGS.md) for what the benchmark concluded and [`../RESULTS-campaign.md`](../RESULTS-campaign.md) for the per-run numbers.
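To repeat the commands above across every preserved run rather than one at a time, a small driver script can walk the folder. This is a sketch under stated assumptions: it takes the layout to be a variant directory containing `run-NN` directories, each with a `calc.py`, and the helper names `find_runs` and `check_run` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def find_runs(root: Path) -> list[Path]:
    """Return run directories (root/variant/run-NN) that contain calc.py."""
    return sorted(p for p in root.glob("*/run-*") if (p / "calc.py").is_file())


def check_run(run_dir: Path, expr: str = "2+3*4") -> tuple[bool, str]:
    """Run the unit tests and one expression inside a single run directory."""
    tests = subprocess.run(
        [sys.executable, "-m", "unittest", "-q"],
        cwd=run_dir, capture_output=True, text=True,
    )
    calc = subprocess.run(
        [sys.executable, "calc.py", expr],
        cwd=run_dir, capture_output=True, text=True,
    )
    return tests.returncode == 0, calc.stdout.strip()


if __name__ == "__main__":
    # Usage: python check_all.py [path/to/calculators]
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path("calculators")
    for run_dir in find_runs(root):
        ok, out = check_run(run_dir)
        print(f"{run_dir}: tests={'pass' if ok else 'FAIL'} 2+3*4 -> {out}")
```

Using `sys.executable` keeps the tests and the CLI on the same interpreter as the driver, which avoids surprises when several Python versions are installed.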