JOURNAL — phase: parse (Builder)

2026-06-15

Design decisions

Chose a classic recursive-descent parser with separate grammar levels for precedence:

  • _expr handles + and - (lowest precedence, left-associative via while-loop)
  • _term handles * and / (higher precedence, left-associative via while-loop)
  • _unary handles prefix - (right-associative via recursion)
  • _primary handles NUMBER and (expr)

Left-associativity falls out of the while-loop pattern: each iteration makes the node built so far the left child of a new BinOp, so 8-3-2 groups as (8-3)-2 without any extra bookkeeping.
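The four grammar levels above can be sketched roughly as follows. This is a minimal illustration, not the journal's actual source: the tuple-based tokens, the `tokenize` helper, the `Parser` class layout, and the dataclass field names are all assumptions; only the method names `_expr`, `_term`, `_unary`, `_primary`, the node names, and the error messages follow the journal.

```python
import re
from dataclasses import dataclass


class ParseError(Exception):
    """Raised for any malformed input."""


@dataclass
class Num:
    value: int


@dataclass
class Unary:
    op: str
    operand: object


@dataclass
class BinOp:
    op: str
    left: object
    right: object


# Assumed token kinds, named after the shapes recorded in the journal.
_KINDS = {"+": "PLUS", "-": "MINUS", "*": "STAR", "/": "SLASH",
          "(": "LPAREN", ")": "RPAREN"}


def tokenize(text):
    # Hypothetical tokenizer: ASCII integers and single-character operators.
    tokens = []
    for lexeme in re.findall(r"[0-9]+|\S", text):
        if "0" <= lexeme[0] <= "9":
            tokens.append(("NUMBER", int(lexeme)))
        elif lexeme in _KINDS:
            tokens.append((_KINDS[lexeme], lexeme))
        else:
            raise ParseError(f"unexpected character {lexeme!r}")
    tokens.append(("EOF", None))
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def _peek(self):
        return self.tokens[self.pos][0]

    def _advance(self):
        kind = self.tokens[self.pos][0]
        self.pos += 1
        return kind

    def _expr(self):
        # Lowest precedence: the loop nests earlier results on the left.
        node = self._term()
        while self._peek() in ("PLUS", "MINUS"):
            op = self._advance()
            node = BinOp(op, node, self._term())
        return node

    def _term(self):
        # Binds tighter than _expr because _expr calls down into it.
        node = self._unary()
        while self._peek() in ("STAR", "SLASH"):
            op = self._advance()
            node = BinOp(op, node, self._unary())
        return node

    def _unary(self):
        # Recursion (not a loop) makes prefix minus right-associative.
        if self._peek() == "MINUS":
            op = self._advance()
            return Unary(op, self._unary())
        return self._primary()

    def _primary(self):
        kind, value = self.tokens[self.pos]
        if kind == "NUMBER":
            self.pos += 1
            return Num(value)
        if kind == "LPAREN":
            self.pos += 1
            node = self._expr()
            if self._peek() != "RPAREN":
                raise ParseError(f"expected ')', got '{self._peek()}'")
            self.pos += 1
            return node
        raise ParseError(f"unexpected token '{kind}'")


def parse(text):
    parser = Parser(tokenize(text))
    node = parser._expr()
    if parser._peek() != "EOF":
        raise ParseError(f"unexpected token '{parser._peek()}' after expression")
    return node
```

In this sketch, precedence is encoded purely by call depth: `_expr` calls `_term`, which calls `_unary`, which calls `_primary`, so the deeper a rule sits, the tighter it binds. For example, `parse("8-3-2")` yields `BinOp("MINUS", BinOp("MINUS", Num(8), Num(3)), Num(2))`.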

Verification runs

$ python -m unittest -q
Ran 41 tests in 0.001s
OK

Key shape outputs verified:

1+2*3  → BinOp(PLUS, Num(1), BinOp(STAR, Num(2), Num(3)))  ✓ D1
8-3-2  → BinOp(MINUS, BinOp(MINUS, Num(8), Num(3)), Num(2)) ✓ D2
8/4/2  → BinOp(SLASH, BinOp(SLASH, Num(8), Num(4)), Num(2)) ✓ D2
(1+2)*3 → BinOp(STAR, BinOp(PLUS, Num(1), Num(2)), Num(3))  ✓ D3
-5     → Unary(MINUS, Num(5))                                ✓ D4
-(1+2) → Unary(MINUS, BinOp(PLUS, Num(1), Num(2)))          ✓ D4
3*-2   → BinOp(STAR, Num(3), Unary(MINUS, Num(2)))          ✓ D4

D5 errors all raise ParseError (not SyntaxError, ValueError, etc.):

  • "1 +" → ParseError: unexpected token 'EOF'
  • "(1" → ParseError: expected ')', got 'EOF'
  • "1 2" → ParseError: unexpected token 'NUMBER' after expression
  • ")(" → ParseError: unexpected token 'RPAREN'
  • "" → ParseError: empty input

Empty-string handling

tokenize('') returns [Token('EOF')]. The parse() function checks the first token; if it is EOF, it raises ParseError("empty input") immediately. This avoids the ambiguous "unexpected token 'EOF'" message, which a truncated expression such as "1 +" also produces.