JOURNAL-parse

2026-06-15 — Initial implementation

Design choices

Grammar used:

expr    = term (('+' | '-') term)*
term    = unary (('*' | '/') unary)*
unary   = '-' unary | primary
primary = NUMBER | '(' expr ')'

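This grammar encodes precedence structurally: * and / are parsed in term, one level below + and - in expr, so they bind tighter. Left-associativity comes from the while loops in _expr and _term, which fold each new operand into a left-deep tree. Unary minus is right-recursive (_unary calls _unary), so chained negation such as --5 parses correctly, and because unary sits below term it binds tighter than any binary operator.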
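As an illustration of how the grammar above maps onto recursive-descent methods, here is a minimal, self-contained sketch. It is not the code in calc.parser: the stand-in tokenize (integers and single-character symbols only), the node classes with hand-written __repr__, the Parser class layout, and the error message wording are all assumptions made for the example. Only the method names _expr, _term and _unary and the printed node shapes follow the journal.

```python
import re


class ParseError(Exception):
    """Raised on malformed input (name from the journal; messages are illustrative)."""


class Num:
    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return f"Num({self.value})"


class Unary:
    def __init__(self, op, operand):
        self.op, self.operand = op, operand

    def __repr__(self):
        return f"Unary({self.op!r}, {self.operand!r})"


class BinOp:
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def __repr__(self):
        return f"BinOp({self.op!r}, {self.left!r}, {self.right!r})"


def tokenize(text):
    # Hypothetical stand-in lexer: tokens are plain strings, not kind/value pairs.
    pattern = re.compile(r"\s*(\d+|[-+*/()])")
    text, pos, tokens = text.rstrip(), 0, []
    while pos < len(text):
        match = pattern.match(text, pos)
        if match is None:
            raise ParseError(f"unexpected character {text[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def _peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def _advance(self):
        token = self._peek()
        self.pos += 1
        return token

    def _expr(self):  # expr = term (('+' | '-') term)*
        node = self._term()
        while self._peek() in ("+", "-"):
            op = self._advance()
            node = BinOp(op, node, self._term())  # fold into a left-deep tree
        return node

    def _term(self):  # term = unary (('*' | '/') unary)*
        node = self._unary()
        while self._peek() in ("*", "/"):
            op = self._advance()
            node = BinOp(op, node, self._unary())
        return node

    def _unary(self):  # unary = '-' unary | primary
        if self._peek() == "-":
            self._advance()
            return Unary("-", self._unary())  # right-recursive, so --5 nests
        return self._primary()

    def _primary(self):  # primary = NUMBER | '(' expr ')'
        token = self._advance()
        if token is None:
            raise ParseError("unexpected end of input")
        if token.isdigit():
            return Num(int(token))
        if token == "(":
            node = self._expr()
            if self._advance() != ")":
                raise ParseError("unclosed parenthesis, expected ')'")
            return node
        raise ParseError(f"unexpected token {token!r}")


def parse(tokens):
    if not tokens:
        raise ParseError("empty expression")
    parser = Parser(tokens)
    node = parser._expr()
    if parser._peek() is not None:  # leftover input, e.g. "1 2"
        raise ParseError(f"unexpected token {parser._peek()!r}")
    return node
```

With this sketch, parse(tokenize('--5')) produces a Unary node wrapping another Unary node, and each loop iteration in _expr or _term makes the tree built so far the left child of the new BinOp.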

Operator representation

The Adversary's pre-claim probes in REVIEW-parse.md wrote operators as symbols ('+', '-', '*', '/') rather than token kind names ('PLUS', 'MINUS', etc.). I aligned the implementation to use symbols so that its output matches what their cold verification expects.
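If the lexer reports operators by kind name, one possible way to get symbols into the tree is a small lookup applied when a node is built. This is a sketch under assumptions: only 'PLUS' and 'MINUS' are named in this journal, so 'STAR' and 'SLASH', the table name and the helper are hypothetical.

```python
# Hypothetical kind-to-symbol table; 'STAR' and 'SLASH' are assumed names.
KIND_TO_SYMBOL = {"PLUS": "+", "MINUS": "-", "STAR": "*", "SLASH": "/"}


def op_symbol(kind):
    """Return the symbol to store in an AST node for a given token kind."""
    try:
        return KIND_TO_SYMBOL[kind]
    except KeyError:
        raise ValueError(f"not an operator kind: {kind!r}") from None
```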

Test run output

$ python -m unittest -q
Ran 44 tests in 0.001s
OK

D1 shape verification

$ python -c "from calc.lexer import tokenize; from calc.parser import parse; print(parse(tokenize('1+2*3')))"
BinOp('+', Num(1), BinOp('*', Num(2), Num(3)))

D2 shape verification

$ python -c "from calc.lexer import tokenize; from calc.parser import parse; print(parse(tokenize('8-3-2')))"
BinOp('-', BinOp('-', Num(8), Num(3)), Num(2))

$ python -c "from calc.lexer import tokenize; from calc.parser import parse; print(parse(tokenize('8/4/2')))"
BinOp('/', BinOp('/', Num(8), Num(4)), Num(2))
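The left-deep shape matters because - and / are not associative, so the grouping changes the result. The sketch below makes that concrete with a tiny evaluator over plain tuples. The tuples are stand-ins for the BinOp and Num nodes, and evaluate is a hypothetical helper, not part of the calc package.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(node):
    """Evaluate a (op, left, right) tuple tree; bare numbers are leaves."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left), evaluate(right))
    return node


# 8-3-2: left-deep is (8-3)-2, right-deep would be 8-(3-2).
print(evaluate(("-", ("-", 8, 3), 2)))  # 3
print(evaluate(("-", 8, ("-", 3, 2))))  # 7

# 8/4/2: left-deep is (8/4)/2, right-deep would be 8/(4/2).
print(evaluate(("/", ("/", 8, 4), 2)))  # 1.0
print(evaluate(("/", 8, ("/", 4, 2))))  # 4.0
```

Only the left-deep trees give the conventional results, which is what the D2 shapes above confirm.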

D3 shape verification

$ python -c "from calc.lexer import tokenize; from calc.parser import parse; print(parse(tokenize('(1+2)*3')))"
BinOp('*', BinOp('+', Num(1), Num(2)), Num(3))

D4 shape verification

$ python -c "from calc.lexer import tokenize; from calc.parser import parse; print(parse(tokenize('-5')))"
Unary('-', Num(5))

$ python -c "from calc.lexer import tokenize; from calc.parser import parse; print(parse(tokenize('-(1+2)')))"
Unary('-', BinOp('+', Num(1), Num(2)))

$ python -c "from calc.lexer import tokenize; from calc.parser import parse; print(parse(tokenize('3 * -2')))"
BinOp('*', Num(3), Unary('-', Num(2)))

D5 error verification

All five required error cases raise ParseError specifically:

  • "1 +" → ParseError: unexpected end of input
  • "(1" → ParseError: unclosed parenthesis, expected ')'
  • "1 2" → ParseError: unexpected token 'NUMBER'
  • ")(" → ParseError: unexpected token 'RPAREN'
  • "" → ParseError: empty expression
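The five checks above could be automated along these lines. check_error_cases is a hypothetical helper, not something in the repository. It takes the parse callable and the exception type as parameters so that it assumes nothing about where calc defines ParseError.

```python
ERROR_INPUTS = ["1 +", "(1", "1 2", ")(", ""]


def check_error_cases(parse_text, error_type, inputs=ERROR_INPUTS):
    """Return the inputs that did not raise error_type (empty list means all pass)."""
    failures = []
    for text in inputs:
        try:
            parse_text(text)
        except Exception as exc:
            # Any other exception (IndexError, ValueError, ...) counts as a failure.
            if not isinstance(exc, error_type):
                failures.append(text)
        else:
            # Parsing succeeded where an error was required.
            failures.append(text)
    return failures
```

Assuming ParseError can be imported from wherever calc defines it, a call such as check_error_cases(lambda s: parse(tokenize(s)), ParseError) would be expected to return an empty list for the cases listed above.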