STATUS — phase lex
DONE
claim(D1): numbers tokenize correctly
What: Integers and floats, including the leading-dot and trailing-dot forms .5 and 10., each produce a single NUMBER token whose value has the correct numeric type (int or float), followed by EOF.
How to verify:
python -c "from calc.lexer import tokenize; toks = tokenize('42'); print(toks[0].kind, toks[0].value, type(toks[0].value).__name__, toks[1].kind)"
python -c "from calc.lexer import tokenize; toks = tokenize('3.14'); print(toks[0].kind, toks[0].value, type(toks[0].value).__name__)"
python -c "from calc.lexer import tokenize; toks = tokenize('.5'); print(toks[0].kind, toks[0].value, type(toks[0].value).__name__)"
python -c "from calc.lexer import tokenize; toks = tokenize('10.'); print(toks[0].kind, toks[0].value, type(toks[0].value).__name__)"
Expected:
NUMBER 42 int EOF
NUMBER 3.14 float
NUMBER 0.5 float
NUMBER 10.0 float
Files: calc/lexer.py, calc/__init__.py
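Illustration only, not the contents of calc/lexer.py: a minimal sketch of one way the number forms in D1 could be scanned. NUMBER_RE and scan_number are hypothetical names, and the real lexer may be structured differently.

```python
import re

# Hypothetical pattern (an assumption, not taken from calc/lexer.py):
# digits with an optional fraction ("42", "3.14", "10.") or a
# leading-dot fraction (".5"). A lone "." does not match.
NUMBER_RE = re.compile(r"\d+(?:\.\d*)?|\.\d+")


def scan_number(text, pos=0):
    """Return (value, end) for the number starting at pos, or None."""
    match = NUMBER_RE.match(text, pos)
    if match is None:
        return None
    lexeme = match.group()
    # A dot anywhere in the lexeme makes the value a float; otherwise int.
    value = float(lexeme) if "." in lexeme else int(lexeme)
    return value, match.end()
```

Under this sketch, scan_number('42') returns (42, 2) and scan_number('10.') returns (10.0, 3), which matches the int/float split that D1 expects.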
claim(D2): operators and parens tokenize correctly
What: + - * / ( ) each produce their respective token kind; tokenize("1+2*3") yields NUMBER PLUS NUMBER STAR NUMBER EOF.
How to verify:
python -c "from calc.lexer import tokenize; print([t.kind for t in tokenize('1+2*3')])"
python -c "from calc.lexer import tokenize; print([t.kind for t in tokenize('+-*/()')])"
Expected:
['NUMBER', 'PLUS', 'NUMBER', 'STAR', 'NUMBER', 'EOF']
['PLUS', 'MINUS', 'STAR', 'SLASH', 'LPAREN', 'RPAREN', 'EOF']
Files: calc/lexer.py
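Illustration only: the one-character tokens in D2 lend themselves to a lookup table. The sketch below assumes such a table; SINGLE_CHAR_KINDS and operator_kinds are hypothetical names, and only the kind strings come from the expected output above.

```python
# Hypothetical lookup table (an assumption about the approach, not the
# actual calc/lexer.py). Kind names match the expected output in D2.
SINGLE_CHAR_KINDS = {
    "+": "PLUS",
    "-": "MINUS",
    "*": "STAR",
    "/": "SLASH",
    "(": "LPAREN",
    ")": "RPAREN",
}


def operator_kinds(text):
    """Map a string made only of operator/paren characters to token kinds."""
    # A character missing from the table raises KeyError in this sketch;
    # the real lexer reports such characters with LexError instead.
    return [SINGLE_CHAR_KINDS[ch] for ch in text] + ["EOF"]
```

Calling operator_kinds('+-*/()') gives the same list of kinds as the second expected line of D2.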
claim(D3): whitespace skipped, invalid chars raise LexError
What: Spaces and tabs between tokens are silently skipped. Any unrecognized character (e.g. @, $, or a letter) raises LexError, and the message names the offending character and its 0-based position in the input.
How to verify:
python -c "from calc.lexer import tokenize; print([t.kind for t in tokenize(' 12 + 3 ')])"
python -c "from calc.lexer import tokenize; tokenize('1 @ 2')"
Expected:
- First command:
['NUMBER', 'PLUS', 'NUMBER', 'EOF']
- Second command: raises
calc.lexer.LexError: unexpected character '@' at position 2
Files: calc/lexer.py
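Illustration only: a minimal sketch of a scan loop with the D3 behavior (skip spaces and tabs, raise on anything unrecognized). The LexError class here is a stand-in assumed to subclass Exception, token_kinds is a hypothetical helper, and numbers are reduced to plain digit runs to keep the sketch short.

```python
class LexError(Exception):
    """Stand-in for calc.lexer.LexError (assumed to subclass Exception)."""


# Hypothetical table of one-character tokens, using the kind names from D2.
KINDS = {"+": "PLUS", "-": "MINUS", "*": "STAR", "/": "SLASH",
         "(": "LPAREN", ")": "RPAREN"}
DIGITS = "0123456789"


def token_kinds(text):
    """Return the token kinds for text, ending with 'EOF'."""
    kinds = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch in " \t":
            # Whitespace separates tokens but produces none.
            pos += 1
        elif ch in DIGITS:
            # Consume the whole digit run as one NUMBER (integers only here).
            while pos < len(text) and text[pos] in DIGITS:
                pos += 1
            kinds.append("NUMBER")
        elif ch in KINDS:
            kinds.append(KINDS[ch])
            pos += 1
        else:
            # pos is the 0-based index of the offending character.
            raise LexError(f"unexpected character {ch!r} at position {pos}")
    kinds.append("EOF")
    return kinds
```

With this sketch, token_kinds(' 12 + 3 ') yields NUMBER, PLUS, NUMBER, EOF, and token_kinds('1 @ 2') raises LexError with the position reported as 2, the index of '@' counting from zero.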
claim(D4): all tests pass
What: calc/test_lexer.py (15 test cases covering D1–D3, including the three required expressions) passes with 0 failures.
How to verify:
python -m unittest -q
Expected:
Ran 15 tests in 0.000s (elapsed time may differ)
OK
Also run the plan's exact cold-verify commands:
python -c "from calc.lexer import tokenize; print([(t.kind,t.value) for t in tokenize('3.5*(1-2)')])"
python -c "from calc.lexer import tokenize; tokenize('1 @ 2')"
Expected:
[('NUMBER', 3.5), ('STAR', '*'), ('LPAREN', '('), ('NUMBER', 1), ('MINUS', '-'), ('NUMBER', 2), ('RPAREN', ')'), ('EOF', None)]
Traceback ... LexError: unexpected character '@' at position 2
Files: calc/test_lexer.py