- plans/calc/{lex,parse,eval}.md: a 3-phase calculator (tokenizer →
recursive-descent parser → evaluator+CLI) with multiple gates per phase and
rich adversarial edge cases (precedence/associativity/unary/div-zero)
- run-harness-bench.sh: stands up a real Builder/Adversary loop pair + watchdog
(via agents.py up) over a shared work repo per variant, runs to
SEQUENCE-COMPLETE, and clocks tokens from the session transcripts
(AI-as-adversary kept intact)
- RESULTS.md: baseline single-pass roman-numeral run (prompt size had ~0 token
effect; cache-read of the working context dominates)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>