Files
agent-orchestrator-benchmark/calculators/builder-adversary-min/run-02/machine-docs/STATUS-eval.md

3.2 KiB
Raw Blame History

STATUS — phase eval

DONE

Claims

Gates D1D5 are implemented and verified at commit b1369091c751655df8925685b6a2187123aeb6ff.

D1 — arithmetic

What: evaluate(parse(tokenize(s))) returns correct integer results for +, -, *, / with precedence and parens.

Verify:

python -c "from calc.lexer import tokenize; from calc.parser import parse; from calc.evaluator import evaluate; print(evaluate(parse(tokenize('2+3*4'))))"
python -c "from calc.lexer import tokenize; from calc.parser import parse; from calc.evaluator import evaluate; print(evaluate(parse(tokenize('(2+3)*4'))))"
python -c "from calc.lexer import tokenize; from calc.parser import parse; from calc.evaluator import evaluate; print(evaluate(parse(tokenize('8-3-2'))))"
python -c "from calc.lexer import tokenize; from calc.parser import parse; from calc.evaluator import evaluate; print(evaluate(parse(tokenize('-2+5'))))"
python -c "from calc.lexer import tokenize; from calc.parser import parse; from calc.evaluator import evaluate; print(evaluate(parse(tokenize('2*-3'))))"

Expected: 14, 20, 3, 3, -6

D2 — division

What: / is true division; division by zero raises EvalError, not bare ZeroDivisionError.

Verify:

python -c "from calc.lexer import tokenize; from calc.parser import parse; from calc.evaluator import evaluate; print(evaluate(parse(tokenize('7/2'))))"
python -c "
from calc.lexer import tokenize; from calc.parser import parse; from calc.evaluator import evaluate, EvalError
try:
    evaluate(parse(tokenize('1/0')))
    print('FAIL: no exception')
except EvalError as e:
    print('PASS: EvalError:', e)
except ZeroDivisionError:
    print('FAIL: bare ZeroDivisionError')
"

Expected: 3.5 then PASS: EvalError: division by zero

D3 — result type

What: Whole-valued results print without .0; non-whole print as float. Rule in evaluator.format_result.

Verify:

python -c "from calc.evaluator import format_result; from calc.lexer import tokenize; from calc.parser import parse; from calc.evaluator import evaluate; print(repr(format_result(evaluate(parse(tokenize('4/2'))))))"
python -c "from calc.evaluator import format_result; from calc.lexer import tokenize; from calc.parser import parse; from calc.evaluator import evaluate; print(repr(format_result(evaluate(parse(tokenize('7/2'))))))"

Expected: '2' and '3.5'

D4 — CLI

What: python calc.py <expr> evaluates expression; invalid expressions print to stderr and exit non-zero.

Verify:

python calc.py "2+3*4"    # prints: 14, exit 0
python calc.py "(2+3)*4"  # prints: 20, exit 0
python calc.py "7/2"      # prints: 3.5, exit 0
python calc.py "4/2"      # prints: 2, exit 0
python calc.py "1/0"; echo "exit: $?"   # error on stderr, exit 1
python calc.py "1 +"; echo "exit: $?"   # error on stderr, exit 1

D5 — tests green

What: Full unittest suite (lex + parse + eval) passes with 0 failures.

Verify:

python -m unittest -q

Expected: Ran 44 tests in ... OK (0 failures)

Files

  • calc/evaluator.pyEvalError, evaluate(node), format_result(value)
  • calc.py — CLI entry point
  • calc/test_evaluator.py — unittest suite covering D1D5