# Journal — lex phase

## Build run

Implemented `calc/lexer.py` with:

- `Token` dataclass with `kind` (str) and `value` (int | float | str | None)
- `LexError(Exception)` for invalid characters
- `tokenize(src: str) -> list[Token]` scanning char-by-char

Design choices:

- `Token` is a plain dataclass so later phases (parser, evaluator) can pattern-match on `.kind`
- Numbers: scanned greedily while char is digit or `.`; cast to `int` if no `.` in raw string, else `float`
- Operators stored as their literal char as `value` (handy for error messages)
- EOF always appended as final token (parser-friendly sentinel)

## Test run output

```
$ python -m unittest -q
..............
----------------------------------------------------------------------
Ran 14 tests in 0.000s

OK
```

## Verify commands output

```
$ python -c "from calc.lexer import tokenize; print([(t.kind,t.value) for t in tokenize('3.5*(1-2)')])"
[('NUMBER', 3.5), ('STAR', '*'), ('LPAREN', '('), ('NUMBER', 1), ('MINUS', '-'), ('NUMBER', 2), ('RPAREN', ')'), ('EOF', None)]
$ python -c "from calc.lexer import tokenize; tokenize('1 @ 2')"
Traceback (most recent call last):
  ...
calc.lexer.LexError: unexpected character '@' at position 2
```