# JOURNAL - phase `lex`

## Implementation

Built `calc/lexer.py` with:

- `Token` dataclass with `kind: str` and `value: Union[int, float, str, None]`
- `LexError(Exception)` for invalid characters
- `tokenize(src: str) -> list[Token]` scanning left-to-right

Number handling: checks `ch.isdigit()` OR `ch == '.'` followed by a digit (for the `.5` case). Collects integer digits, then optionally a `.` and fractional digits. Result is `int` if no `.` seen, `float` otherwise; handles `10.` (trailing dot) correctly.

Operators: simple char-dispatch to the 6 operator/paren token kinds.

Whitespace: space and tab explicitly skipped via `continue`.

Errors: any unrecognised character raises `LexError` with `f"unexpected character {ch!r} at position {i}"`.

EOF appended unconditionally as the final token.

## Test run

```
$ python -m unittest -q
......................
Ran 21 tests in 0.000s
OK
```

## Verification

```
$ python -c "from calc.lexer import tokenize; print([(t.kind,t.value) for t in tokenize('3.5*(1-2)')])"
[('NUMBER', 3.5), ('STAR', '*'), ('LPAREN', '('), ('NUMBER', 1), ('MINUS', '-'), ('NUMBER', 2), ('RPAREN', ')'), ('EOF', None)]
$ python -c "from calc.lexer import tokenize; tokenize('1 @ 2')"
Traceback (most recent call last):
  ...
calc.lexer.LexError: unexpected character '@' at position 2
```
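
For reference, the scanning rules listed under Implementation can be sketched roughly as below. This is a reconstruction from the notes, not the contents of `calc/lexer.py`. The `OPERATORS` table is a hypothetical helper, and the kind names `PLUS` and `SLASH` are assumptions: only `NUMBER`, `STAR`, `MINUS`, `LPAREN`, `RPAREN` and `EOF` appear in the verification output.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Token:
    kind: str
    value: Union[int, float, str, None]


class LexError(Exception):
    """Raised when the scanner meets a character it cannot tokenize."""


# Hypothetical dispatch table. PLUS and SLASH are assumed names; the
# journal only shows STAR, MINUS, LPAREN and RPAREN in its output.
OPERATORS = {
    "+": "PLUS",
    "-": "MINUS",
    "*": "STAR",
    "/": "SLASH",
    "(": "LPAREN",
    ")": "RPAREN",
}


def tokenize(src: str) -> list[Token]:
    tokens: list[Token] = []
    i, n = 0, len(src)
    while i < n:
        ch = src[i]

        # Whitespace: only space and tab are skipped.
        if ch in " \t":
            i += 1
            continue

        # A number starts with a digit, or with '.' followed by a digit (".5").
        # Note: str.isdigit() also accepts some non-ASCII digit characters,
        # which this sketch does not special-case.
        leading_dot = ch == "." and i + 1 < n and src[i + 1].isdigit()
        if ch.isdigit() or leading_dot:
            start = i
            while i < n and src[i].isdigit():
                i += 1
            seen_dot = False
            if i < n and src[i] == ".":
                seen_dot = True
                i += 1
                while i < n and src[i].isdigit():
                    i += 1
            text = src[start:i]
            # int when no '.' was seen, float otherwise (so "10." is a float).
            tokens.append(Token("NUMBER", float(text) if seen_dot else int(text)))
            continue

        # Operators and parentheses: single-character dispatch.
        if ch in OPERATORS:
            tokens.append(Token(OPERATORS[ch], ch))
            i += 1
            continue

        raise LexError(f"unexpected character {ch!r} at position {i}")

    # EOF is appended unconditionally, even for empty input.
    tokens.append(Token("EOF", None))
    return tokens
```

Under these assumptions, `tokenize('3.5*(1-2)')` yields the same kind/value pairs as the verification run, and `tokenize('1 @ 2')` raises `LexError` with the same message. Building the number from the sliced text and choosing `int` or `float` by whether a `.` was consumed keeps the `.5` and `10.` cases on a single code path.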