agent-orchestrator-benchmark/calculators/builder-adversary-lean/run-02/machine-docs/REVIEW-lex.md

REVIEW — Phase lex (Adversary)

Verdicts

lex/D1: PASS @2026-06-15T05:05Z

Cold-run evidence:

  • tokenize('42') → NUMBER 42 int ✓ (int type confirmed)
  • tokenize('3.14') → NUMBER 3.14 float ✓ (float type confirmed)
  • tokenize('.5') → NUMBER 0.5 ✓ (leading dot)
  • tokenize('10.') → NUMBER 10.0 ✓ (trailing dot → float)
  • EOF appended in all cases ✓

lex/D2: PASS @2026-06-15T05:05Z

Cold-run evidence:

  • tokenize('1+2*3') → [('NUMBER', 1), ('PLUS', '+'), ('NUMBER', 2), ('STAR', '*'), ('NUMBER', 3), ('EOF', None)]
  • Matches expected exactly ✓
  • All 6 operator/paren kinds verified in test suite ✓

lex/D3: PASS @2026-06-15T05:05Z

Cold-run evidence:

  • tokenize(' 12 + 3 ') → ['NUMBER', 'PLUS', 'NUMBER', 'EOF'] ✓ (spaces skipped)
  • tokenize('1 @ 2') raises calc.lexer.LexError: unexpected character '@' at position 2
    • Offending character '@' in message ✓
    • Position 2 in message ✓
  • Letters (abc), $ also raise LexError per test suite ✓

lex/D4: PASS @2026-06-15T05:05Z

Cold-run evidence:

  • python -m unittest -q → Ran 18 tests in 0.000s, OK
  • tokenize('3.5*(1-2)') → [('NUMBER', 3.5), ('STAR', '*'), ('LPAREN', '('), ('NUMBER', 1), ('MINUS', '-'), ('NUMBER', 2), ('RPAREN', ')'), ('EOF', None)] ✓ (exact plan match)
  • tokenize('1 @ 2') raises LexError (exit 1) ✓
  • Required test cases present: " 12 + 3 ", "3.5*(1-2)", "1 @ 2"
  • 18 tests, 0 failures ✓
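
For readers without the repository at hand, the verdict evidence above can be reproduced against a small stand-in lexer. This is a minimal sketch, not the builder's calc.lexer source, which this review does not show. The SLASH kind is an assumption (the review confirms six operator/paren kinds but only names five), and skipping all whitespace rather than only spaces is also an assumption. The sketch deliberately keeps the greedy digits-and-dots scan, so malformed numbers still surface as ValueError.

```python
class LexError(Exception):
    """Stand-in for calc.lexer.LexError (the real class is not shown here)."""


# PLUS, MINUS, STAR, LPAREN and RPAREN appear in the review output.
# SLASH is an assumed name for the sixth operator/paren kind.
SINGLE_CHAR = {
    "+": "PLUS",
    "-": "MINUS",
    "*": "STAR",
    "/": "SLASH",
    "(": "LPAREN",
    ")": "RPAREN",
}


def tokenize(text):
    """Return a list of (kind, value) tuples ending with ('EOF', None)."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            # Assumption: any whitespace is skipped, not only ' '.
            i += 1
        elif ch.isdigit() or ch == ".":
            # Greedy scan over digits and dots, as described in the findings.
            start = i
            while i < len(text) and (text[i].isdigit() or text[i] == "."):
                i += 1
            raw = text[start:i]
            # A dot anywhere in the span makes the value a float.
            value = float(raw) if "." in raw else int(raw)
            tokens.append(("NUMBER", value))
        elif ch in SINGLE_CHAR:
            tokens.append((SINGLE_CHAR[ch], ch))
            i += 1
        else:
            raise LexError(f"unexpected character {ch!r} at position {i}")
    tokens.append(("EOF", None))
    return tokens
```

Calling tokenize('1 @ 2') on this sketch raises LexError with the offending character and its 0-based position in the message, matching the shape of the D3 evidence.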

Adversary findings (non-blocking for this phase)

AF-1: ValueError leaks on malformed number tokens

Repro:

python -c "from calc.lexer import tokenize; tokenize('1.2.3')"
# → ValueError: could not convert string to float: '1.2.3'

python -c "from calc.lexer import tokenize; tokenize('.')"
# → ValueError: could not convert string to float: '.'

The number-scanning loop (ch.isdigit() or ch == '.') greedily consumes all digits and dots, then hands the raw span to float(), which raises ValueError on malformed input such as '1.2.3' or a bare '.'. These inputs should raise LexError for consistency: a caller that catches only LexError will miss them, and cannot tell bad input apart from an unrelated Python error.

Severity: Not blocking. The DoD only requires LexError for invalid characters (@, $, letters); '1.2.3' and '.' are outside the explicit D1/D3 test cases. However, the parser phase will likely encounter these inputs and must handle them.

Recommendation: Wrap the float(raw) call in a try/except ValueError and re-raise as LexError. Flag for builder attention in BUILDER-INBOX.
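
The recommendation could look roughly like the sketch below. It assumes the conversion can be isolated in a helper. The names convert_number and start are hypothetical, the LexError class is a local stand-in for calc.lexer.LexError, and the message wording is illustrative rather than taken from the builder's code.

```python
class LexError(Exception):
    """Stand-in for calc.lexer.LexError (the real class is not shown here)."""


def convert_number(raw, start):
    """Convert a scanned span of digits and dots into an int or float.

    Hypothetical helper: `raw` is the greedy span, `start` is its
    0-based offset in the input, used only for the error message.
    """
    try:
        # Spans without a dot stay int; spans with a dot go through float().
        return float(raw) if "." in raw else int(raw)
    except ValueError:
        # Re-raise as LexError so callers only see one error type for bad input.
        raise LexError(f"malformed number {raw!r} at position {start}") from None
```

With this wrapper, '10.' and '.5' still convert as in D1, while '1.2.3' and '.' raise LexError instead of ValueError. Using `from None` hides the original ValueError traceback; dropping it would keep the cause chained, which may be preferable while debugging.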