Files
mfowler 8c3f38dbf4 feat: multi-phase calculator problem + full-harness benchmark runner
- plans/calc/{lex,parse,eval}.md: a 3-phase calculator with multiple gates per
  phase (tokenizer → recursive-descent parser → evaluator+CLI), rich adversarial
  edge cases (precedence/associativity/unary/div-zero)
- run-harness-bench.sh: stands up a real agents.py up Builder/Adversary loop pair
  + watchdog over a shared work repo per variant, runs to SEQUENCE-COMPLETE, and
  clocks tokens from the session transcripts (AI-as-adversary kept intact)
- RESULTS.md: baseline single-pass roman-numeral run (prompt size had ~0 token
  effect; cache-read of the working context dominates)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 20:40:14 +00:00

1.8 KiB
Raw Permalink Blame History

Phase lex — tokenizer

Mission. Start a Python arithmetic calculator. In this phase build the lexer: calc/lexer.py exposing tokenize(src: str) -> list[Token], plus a unittest suite. Pure stdlib. This file is the single source of truth for the phase. (Later phases add the parser and evaluator — design the Token type so they can consume it.)

A Token has at least a kind and a value. Kinds: NUMBER, PLUS, MINUS, STAR, SLASH, LPAREN, RPAREN, and EOF as the final token.

Definition of Done (each Dn is a gate: Builder claims it, Adversary cold-verifies)

  • D1 — numbers. Integers (42) and floats (3.14, .5, 10.) tokenize to one NUMBER token whose value is the numeric value (int or float). tokenize("42")[NUMBER(42), EOF].
  • D2 — operators & parens. + - * / ( ) each tokenize to the right kind; tokenize("1+2*3") yields NUMBER PLUS NUMBER STAR NUMBER EOF.
  • D3 — whitespace & errors. Spaces/tabs between tokens are skipped; an invalid character (e.g. @, $, a letter) raises LexError (define it in the module) with the offending character and its position in the message.
  • D4 — tests green. calc/test_lexer.py (unittest) passes under python -m unittest, 0 failures, covering D1D3 including: " 12 + 3 ", "3.5*(1-2)", and that "1 @ 2" raises LexError.

Verify (cold)

python -m unittest -q                              # D4
python -c "from calc.lexer import tokenize; print([(t.kind,t.value) for t in tokenize('3.5*(1-2)')])"
python -c "from calc.lexer import tokenize; tokenize('1 @ 2')"   # must raise LexError

The Builder restates the exact commands + expected token lists + commit sha in machine-docs/STATUS-lex.md; the Adversary re-runs from its own clone and records lex/Dn: PASS|FAIL in machine-docs/REVIEW-lex.md.