agent-orchestrator-benchmark/plans/calc/parse.md
mfowler 8c3f38dbf4 feat: multi-phase calculator problem + full-harness benchmark runner
- plans/calc/{lex,parse,eval}.md: a 3-phase calculator with multiple gates per
  phase (tokenizer → recursive-descent parser → evaluator+CLI), rich adversarial
  edge cases (precedence/associativity/unary/div-zero)
- run-harness-bench.sh: stands up a real agents.py up Builder/Adversary loop pair
  + watchdog over a shared work repo per variant, runs to SEQUENCE-COMPLETE, and
  clocks tokens from the session transcripts (AI-as-adversary kept intact)
- RESULTS.md: baseline single-pass roman-numeral run (prompt size had ~0 token
  effect; cache-read of the working context dominates)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 20:40:14 +00:00

# Phase `parse` — recursive-descent parser
**Mission.** Build `calc/parser.py` exposing `parse(tokens) -> Node` (consuming the `lex` phase's
tokens) that produces an **AST** with correct arithmetic precedence and associativity, plus a
`unittest` suite. This plan is the single source of truth (SSOT) for this phase. Do NOT evaluate yet; just build the tree (the `eval` phase
consumes it). Represent nodes however you like (e.g. `Num(value)`, `BinOp(op, left, right)`, and
`Unary(op, operand)`), but expose a stable, documented shape the evaluator can walk.
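
One possible node shape, sketched with frozen dataclasses so that trees compare by structure. The plan leaves the representation open, so the field types, the `Node` alias, and the `shape` helper below are illustrative assumptions, not requirements:

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: float


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class Unary:
    op: str
    operand: "Node"


Node = Union[Num, BinOp, Unary]


def shape(node: Node) -> str:
    """Hypothetical helper: render a fully parenthesised string for structure tests."""
    if isinstance(node, Num):
        return f"{node.value:g}"
    if isinstance(node, Unary):
        return f"({node.op}{shape(node.operand)})"
    if isinstance(node, BinOp):
        return f"({shape(node.left)}{node.op}{shape(node.right)})"
    raise TypeError(f"not an AST node: {node!r}")


tree = BinOp("+", Num(1), BinOp("*", Num(2), Num(3)))
print(shape(tree))  # (1+(2*3))
```

Because the dataclasses generate `__eq__`, a test can compare a parsed tree directly against a hand-built expected tree, or compare `shape(...)` strings when a compact assertion is easier to read.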
## Definition of Done (each Dn is a gate)
- **D1 — precedence.** `*` and `/` bind tighter than `+` and `-`: `1+2*3` parses as `1+(2*3)`, not
`(1+2)*3`.
- **D2 — left associativity.** Same-precedence operators associate left: `8-3-2` parses as
`(8-3)-2`; `8/4/2` as `(8/4)/2`.
- **D3 — parentheses.** Parens override precedence: `(1+2)*3` parses with the `+` under the `*`.
- **D4 — unary minus.** Leading and nested unary minus parses: `-5`, `-(1+2)`, `3 * -2`.
- **D5 — errors.** Malformed input raises `ParseError` (define it): `"1 +"`, `"(1"`, `"1 2"`, `")("`,
and the empty string each raise `ParseError` (rather than crashing with a different exception).
- **D6 — tests green.** `calc/test_parser.py` (`unittest`) passes under `python -m unittest`, 0
failures, covering D1-D5. Assert on tree structure (e.g. a `repr`/shape helper), not on evaluation.
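
A minimal sketch of how gates D1 through D5 can fall out of a layered grammar. The token format is an assumption: this plan does not specify what the `lex` phase emits, so the sketch takes a list of `(kind, text)` tuples with the hypothetical kinds `NUM`, `OP`, `LPAREN`, and `RPAREN`. The `_Parser` class and its method names are likewise illustrative only:

```python
from dataclasses import dataclass


class ParseError(Exception):
    """Raised for any malformed token stream."""


@dataclass(frozen=True)
class Num:
    value: float


@dataclass(frozen=True)
class Unary:
    op: str
    operand: object


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


class _Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)  # assumed: (kind, text) tuples
        self.pos = 0

    def peek(self):
        # Current token, or (None, None) once the input is exhausted.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr := term (('+' | '-') term)*    the loop folds left (D2)
        node = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            _, op = self.take()
            node = BinOp(op, node, self.term())
        return node

    def term(self):
        # term := unary (('*' | '/') unary)*  one level down, so it binds tighter (D1)
        node = self.unary()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            _, op = self.take()
            node = BinOp(op, node, self.unary())
        return node

    def unary(self):
        # unary := '-' unary | atom           covers -5, -(1+2), 3 * -2 (D4)
        if self.peek() == ("OP", "-"):
            self.take()
            return Unary("-", self.unary())
        return self.atom()

    def atom(self):
        # atom := NUM | '(' expr ')'          parentheses restart at the top rule (D3)
        kind, text = self.take()
        if kind == "NUM":
            return Num(float(text))
        if kind == "LPAREN":
            node = self.expr()
            if self.take()[0] != "RPAREN":
                raise ParseError("expected ')'")
            return node
        raise ParseError(f"unexpected token: {text!r}")  # includes end of input (D5)


def parse(tokens):
    parser = _Parser(tokens)
    node = parser.expr()
    if parser.peek() != (None, None):
        raise ParseError(f"trailing input: {parser.peek()[1]!r}")  # e.g. "1 2"
    return node


tokens = [("NUM", "8"), ("OP", "-"), ("NUM", "3"), ("OP", "-"), ("NUM", "2")]
print(parse(tokens))
```

Each precedence level gets its own method, and each method loops rather than recursing on its right operand; that loop is what makes `8-3-2` come out as `(8-3)-2`. A real implementation would adapt `peek` and `take` to whatever token objects the `lex` phase actually exposes.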
## Verify (cold)
```bash
python -m unittest -q # D6
# D1/D3 differ in structure — the Builder's STATUS gives the exact shape assertion to re-run:
python -c "from calc.lexer import tokenize; from calc.parser import parse; print(parse(tokenize('1+2*3')))"
python -c "from calc.lexer import tokenize; from calc.parser import parse; parse(tokenize('1 +'))" # ParseError
```
The Builder documents the AST shape + exact assertions in `machine-docs/STATUS-parse.md`; the
Adversary cold-verifies and records `parse/Dn: PASS|FAIL` in `machine-docs/REVIEW-parse.md`. Watch
especially for a precedence/associativity bug that still passes a weak test — re-derive the expected
tree yourself from the plan.
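
To illustrate what a weak test looks like, the sketch below builds two candidate trees for `8-3-2` by hand and runs a root-only check and a full-structure check against both. The node classes and the `weak_check` / `strong_check` names are hypothetical, chosen only for this illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Num:
    value: float


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


# Two candidate trees for 8-3-2: the left-associative one required by D2,
# and the right-associative one a buggy parser might produce.
correct = BinOp("-", BinOp("-", Num(8), Num(3)), Num(2))  # (8-3)-2
buggy = BinOp("-", Num(8), BinOp("-", Num(3), Num(2)))    # 8-(3-2)


def weak_check(tree):
    # Only inspects the root operator, so it cannot tell the two trees apart.
    return isinstance(tree, BinOp) and tree.op == "-"


def strong_check(tree):
    # Compares the whole structure against a tree derived by hand from the plan.
    return tree == BinOp("-", BinOp("-", Num(8), Num(3)), Num(2))


print(weak_check(correct), weak_check(buggy))      # True True
print(strong_check(correct), strong_check(buggy))  # True False
```

The root-only assertion passes for both trees, which is exactly the kind of test a precedence or associativity bug can slip past; comparing the full tree does not have that blind spot.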