agent-orchestrator-benchmark/calculators/builder-adversary/run-05/machine-docs/JOURNAL-lex.md

JOURNAL — phase: lex

2026-06-15 — Start

Read phase plan. Mission: build calc/lexer.py with tokenize(), Token type, LexError, and unittest suite.

Plan:

  • Token: dataclass with kind (str) and value (int | float | None)
  • Kinds: NUMBER, PLUS, MINUS, STAR, SLASH, LPAREN, RPAREN, EOF
  • LexError: custom Exception with char + position
  • tokenize(): iterate over chars, match numbers (int/float), operators, parens; skip whitespace; raise LexError on unknown char; append EOF at end
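The plan above can be sketched roughly as follows. This is a minimal illustration and not the contents of calc/lexer.py: the names SINGLE_CHAR_KINDS and DIGITS, the frozen dataclass, the 0-based position, and the treatment of a lone "." as an error are all assumptions. The error message format is copied from the verify output recorded later in this journal.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

DIGITS = "0123456789"  # ASCII digits only (assumption)

# Hypothetical lookup table: single-character lexemes and their token kinds.
SINGLE_CHAR_KINDS = {
    "+": "PLUS", "-": "MINUS", "*": "STAR", "/": "SLASH",
    "(": "LPAREN", ")": "RPAREN",
}


@dataclass(frozen=True)
class Token:
    kind: str
    value: Optional[Union[int, float]] = None  # only NUMBER carries a value


class LexError(Exception):
    """Raised when the scanner meets a character it cannot tokenize."""

    def __init__(self, char: str, position: int) -> None:
        super().__init__(f"unexpected character {char!r} at position {position}")
        self.char = char
        self.position = position  # 0-based index into the input (assumption)


def tokenize(text: str) -> List[Token]:
    tokens: List[Token] = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1  # whitespace is skipped, never emitted
        elif ch in SINGLE_CHAR_KINDS:
            tokens.append(Token(SINGLE_CHAR_KINDS[ch]))
            i += 1
        elif ch in DIGITS or ch == ".":
            start, seen_dot = i, False
            # Consume digits and at most one dot.
            while i < len(text) and (text[i] in DIGITS or (text[i] == "." and not seen_dot)):
                seen_dot = seen_dot or text[i] == "."
                i += 1
            lexeme = text[start:i]
            if lexeme == ".":  # a dot with no digits is not a number
                raise LexError(".", start)
            tokens.append(Token("NUMBER", float(lexeme) if seen_dot else int(lexeme)))
        else:
            raise LexError(ch, i)
    tokens.append(Token("EOF"))  # always terminate the stream
    return tokens


print([t.kind for t in tokenize("1+2*3")])
# ['NUMBER', 'PLUS', 'NUMBER', 'STAR', 'NUMBER', 'EOF']
```

Keeping the operator and paren kinds in one dictionary means the main loop has only four branches: whitespace, single-character token, number, error.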

2026-06-15 — Implementation

Created:

  • calc/__init__.py (empty package marker)
  • calc/lexer.py — token scanner using manual char-by-char iteration
  • calc/test_lexer.py — 22 unittest cases

Float parsing design: a scanned number is a float if its substring contains a ".", otherwise an int. The leading-dot (.5) and trailing-dot (10.) forms are handled by scanning the digits before and after the dot separately.
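One way the before/after-the-dot scan could look, as a hedged sketch: scan_digits and scan_number are hypothetical helper names, and raising ValueError when no digit is present is an assumption made here, not something this journal records.

```python
from typing import Tuple, Union

DIGITS = "0123456789"  # ASCII digits only (assumption)


def scan_digits(text: str, i: int) -> int:
    """Return the index just past the run of digits starting at i."""
    while i < len(text) and text[i] in DIGITS:
        i += 1
    return i


def scan_number(text: str, start: int) -> Tuple[Union[int, float], int]:
    """Scan one number at text[start:] and return (value, index after it)."""
    end = scan_digits(text, start)        # integer part; empty for ".5"
    if end < len(text) and text[end] == ".":
        end = scan_digits(text, end + 1)  # fraction part; empty for "10."
    lexeme = text[start:end]
    if not any(c in DIGITS for c in lexeme):
        # Covers both "" and a lone "." (assumed to be an error).
        raise ValueError(f"no digits at position {start}")
    # The presence of "." in the scanned substring decides float vs int.
    return (float(lexeme) if "." in lexeme else int(lexeme)), end


print(scan_number(".5", 0))    # (0.5, 2)
print(scan_number("10.", 0))   # (10.0, 3)
print(scan_number("42+1", 0))  # (42, 2)
```

Python's float() already accepts both ".5" and "10.", so the only special case is a substring with no digits at all.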

2026-06-15 — Test run

$ python -m unittest -q
----------------------------------------------------------------------
Ran 22 tests in 0.001s

OK

2026-06-15 — Verify commands

$ python -c "from calc.lexer import tokenize; print([(t.kind,t.value) for t in tokenize('42')])"
[('NUMBER', 42), ('EOF', None)]

$ python -c "from calc.lexer import tokenize; print([(t.kind,t.value) for t in tokenize('3.14')])"
[('NUMBER', 3.14), ('EOF', None)]

$ python -c "from calc.lexer import tokenize; print([(t.kind,t.value) for t in tokenize('.5')])"
[('NUMBER', 0.5), ('EOF', None)]

$ python -c "from calc.lexer import tokenize; print([(t.kind,t.value) for t in tokenize('10.')])"
[('NUMBER', 10.0), ('EOF', None)]

$ python -c "from calc.lexer import tokenize; print([t.kind for t in tokenize('1+2*3')])"
['NUMBER', 'PLUS', 'NUMBER', 'STAR', 'NUMBER', 'EOF']

$ python -c "from calc.lexer import tokenize; print([(t.kind,t.value) for t in tokenize('3.5*(1-2)')])"
[('NUMBER', 3.5), ('STAR', None), ('LPAREN', None), ('NUMBER', 1), ('MINUS', None), ('NUMBER', 2), ('RPAREN', None), ('EOF', None)]

$ python -c "from calc.lexer import tokenize; tokenize('1 @ 2')" 2>&1
calc.lexer.LexError: unexpected character '@' at position 2

All verify commands from the plan produce the expected output. Claiming D1-D4.