JOURNAL — phase `lex`

Implementation

Built calc/lexer.py with:

Token dataclass with kind: str and value: Union[int, float, str, None]
LexError(Exception) for invalid characters
tokenize(src: str) -> list[Token] scanning left-to-right

Number handling: a number starts when ch.isdigit() is true, or when ch == '.' and the next character is a digit (the .5 case). The scanner collects integer digits, then optionally a . and fractional digits. The result is an int if no . was seen and a float otherwise, so 10. (trailing dot) lexes as the float 10.0.
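For reference, a minimal sketch of the number rule described above. The helper names starts_number and scan_number are hypothetical and used only for illustration; the actual code in calc/lexer.py may do this inline inside tokenize.

```python
def starts_number(src: str, i: int) -> bool:
    """True if a numeric literal can begin at src[i] (hypothetical helper)."""
    ch = src[i]
    # A digit always starts a number; a '.' only does when a digit follows it.
    return ch.isdigit() or (ch == '.' and i + 1 < len(src) and src[i + 1].isdigit())


def scan_number(src: str, i: int):
    """Scan the literal starting at src[i]; return (value, next_index)."""
    start, n = i, len(src)
    # Integer part (empty for a literal such as ".5").
    while i < n and src[i].isdigit():
        i += 1
    seen_dot = i < n and src[i] == '.'
    if seen_dot:
        i += 1
        # Fractional part (empty for a trailing-dot literal such as "10.").
        while i < n and src[i].isdigit():
            i += 1
    text = src[start:i]
    # int when no '.' was seen, float otherwise.
    return (float(text) if seen_dot else int(text)), i
```

Under these assumptions, scan_number('10.', 0) yields the float 10.0 and scan_number('.5', 0) yields 0.5, because Python's float() accepts both spellings.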

Operators: simple char-dispatch to the 6 operator/paren token kinds.
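One way the char-dispatch could look, as a sketch only. MINUS, STAR, LPAREN and RPAREN appear in the verification output below; PLUS and SLASH are assumed names for the remaining two kinds, and the OPERATORS table and operator_kind helper are hypothetical.

```python
# Maps a single character to its token kind.
OPERATORS = {
    '+': 'PLUS',    # assumed kind name
    '-': 'MINUS',
    '*': 'STAR',
    '/': 'SLASH',   # assumed kind name
    '(': 'LPAREN',
    ')': 'RPAREN',
}


def operator_kind(ch: str):
    """Return the token kind for ch, or None if ch is not an operator/paren."""
    return OPERATORS.get(ch)
```

A dict lookup keeps the dispatch in one table, so adding a token kind later would be a one-line change.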

Whitespace: space and tab explicitly skipped via continue.

Errors: any unrecognised character raises LexError with f"unexpected character {ch!r} at position {i}".

EOF appended unconditionally as the final token.
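Putting the pieces together, a rough sketch of how the scan loop might be laid out, assuming the behaviour described in this section. It is not a copy of calc/lexer.py: the OPERATORS table and the PLUS and SLASH names are assumptions, and the real file may be organised differently.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Token:
    kind: str
    value: Union[int, float, str, None]


class LexError(Exception):
    """Raised for a character the lexer does not recognise."""


# PLUS and SLASH are assumed names; the other kinds appear in the journal output.
OPERATORS = {'+': 'PLUS', '-': 'MINUS', '*': 'STAR', '/': 'SLASH',
             '(': 'LPAREN', ')': 'RPAREN'}


def tokenize(src: str) -> list[Token]:
    tokens = []
    i, n = 0, len(src)
    while i < n:
        ch = src[i]
        # Whitespace: space and tab are skipped explicitly.
        if ch in ' \t':
            i += 1
            continue
        # Numbers: a digit, or a '.' immediately followed by a digit.
        if ch.isdigit() or (ch == '.' and i + 1 < n and src[i + 1].isdigit()):
            start = i
            while i < n and src[i].isdigit():
                i += 1
            is_float = i < n and src[i] == '.'
            if is_float:
                i += 1
                while i < n and src[i].isdigit():
                    i += 1
            text = src[start:i]
            tokens.append(Token('NUMBER', float(text) if is_float else int(text)))
            continue
        # Operators and parentheses: one character each.
        if ch in OPERATORS:
            tokens.append(Token(OPERATORS[ch], ch))
            i += 1
            continue
        # Anything else is an error, reported with its 0-based position.
        raise LexError(f"unexpected character {ch!r} at position {i}")
    # EOF is appended unconditionally, even for empty input.
    tokens.append(Token('EOF', None))
    return tokens
```

Because EOF is appended outside the loop, an empty or whitespace-only source still produces a single EOF token in this sketch.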

Test run

$ python -m unittest -q
......................
Ran 21 tests in 0.000s

OK

Verification

$ python -c "from calc.lexer import tokenize; print([(t.kind,t.value) for t in tokenize('3.5*(1-2)')])"
[('NUMBER', 3.5), ('STAR', '*'), ('LPAREN', '('), ('NUMBER', 1), ('MINUS', '-'), ('NUMBER', 2), ('RPAREN', ')'), ('EOF', None)]

$ python -c "from calc.lexer import tokenize; tokenize('1 @ 2')"
Traceback (most recent call last):
  ...
calc.lexer.LexError: unexpected character '@' at position 2
