agent-orchestrator-benchmark/calculators/builder-adversary-lean/run-02/machine-docs/STATUS-eval.md

STATUS — Phase eval

Gate D1 CLAIMED — awaiting Adversary

WHAT: evaluate(parse(tokenize(s))) is correct for + - * /, precedence, parens, and unary minus.

HOW:

python -c "
from calc.lexer import tokenize; from calc.parser import parse; from calc.evaluator import evaluate
def c(s): return evaluate(parse(tokenize(s)))
assert c('2+3*4') == 14
assert c('(2+3)*4') == 20
assert c('8-3-2') == 3
assert c('-2+5') == 3
assert c('2*-3') == -6
print('D1 OK')
"

EXPECTED: prints D1 OK, no assertion errors.

WHERE: calc/evaluator.py @ commit 7167e33


Gate D2 CLAIMED — awaiting Adversary

WHAT: / is true division ("7/2" → 3.5). Division by zero raises EvalError, not ZeroDivisionError.

HOW:

python -c "
from calc.lexer import tokenize; from calc.parser import parse; from calc.evaluator import evaluate, EvalError
def c(s): return evaluate(parse(tokenize(s)))
assert c('7/2') == 3.5
try:
    c('1/0')
    assert False, 'no error raised'
except EvalError:
    pass
except ZeroDivisionError:
    assert False, 'bare ZeroDivisionError escaped'
print('D2 OK')
"

EXPECTED: prints D2 OK, no assertion errors.

WHERE: calc/evaluator.py @ commit 7167e33
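
One way the D2 behaviour could be obtained is by translating the built-in exception at the point of division. This is a minimal sketch only: the `EvalError` class below is a stand-in for the one exported by calc/evaluator.py, and `divide` is a hypothetical helper name, not taken from the codebase.

```python
class EvalError(Exception):
    """Stand-in for calc.evaluator.EvalError (the real class is not shown in this file)."""


def divide(left, right):
    """True division that reports a zero divisor as EvalError.

    Hypothetical helper: the actual evaluator may structure this differently.
    """
    try:
        return left / right
    except ZeroDivisionError as exc:
        # Re-raise as the calculator's own error type so callers never
        # see a bare ZeroDivisionError.
        raise EvalError("division by zero") from exc
```

With this shape, `divide(7, 2)` returns `3.5` and `divide(1, 0)` raises `EvalError`, which matches the two assertions in the check above.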


Gate D3 CLAIMED — awaiting Adversary

WHAT: Whole-valued results print without .0 ("4/2" → 2), non-whole as float ("7/2" → 3.5).

Rule: after division, if result == int(result), return int(result); otherwise return float.

HOW:

python -c "
from calc.lexer import tokenize; from calc.parser import parse; from calc.evaluator import evaluate
def c(s): return evaluate(parse(tokenize(s)))
r1 = c('4/2'); assert r1 == 2 and isinstance(r1, int), f'got {r1!r}'
r2 = c('7/2'); assert isinstance(r2, float) and r2 == 3.5, f'got {r2!r}'
print('D3 OK')
"
python calc.py '4/2'   # must print: 2
python calc.py '7/2'   # must print: 3.5

EXPECTED: D3 OK, then 2, then 3.5.

WHERE: calc/evaluator.py @ commit 7167e33
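
The D3 rule can be sketched as a small post-processing step. The function name `normalize` is an assumption made for illustration, and the sketch assumes a finite result (calling `int()` on infinity or NaN raises an exception in Python).

```python
def normalize(result):
    """Return an int when a float result is whole-valued, else the value unchanged.

    Hypothetical helper illustrating the stated rule; assumes a finite input.
    """
    if isinstance(result, float) and result == int(result):
        return int(result)
    return result


print(normalize(4 / 2))  # prints 2
print(normalize(7 / 2))  # prints 3.5
```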


Gate D4 CLAIMED — awaiting Adversary

WHAT: python calc.py "2+3*4" prints 14 to stdout and exits 0; an invalid expression prints an error message to stderr and exits non-zero (no traceback).

HOW:

python calc.py "2+3*4"    # stdout: 14, exit 0
python calc.py "(2+3)*4"  # stdout: 20, exit 0
python calc.py "7/2"      # stdout: 3.5, exit 0
python calc.py "4/2"      # stdout: 2, exit 0
python calc.py "1/0"      # stderr: error: ..., exit non-zero, stdout empty
python calc.py "1 +"      # stderr: error: ..., exit non-zero, stdout empty

EXPECTED: exactly as above — no Python traceback in stderr, error message starts with error:.

WHERE: calc.py @ commit 7167e33
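
The D4 commands above state the expected exit codes and streams only in comments, so checking them by hand means inspecting each run. A possible way to automate the check is sketched below; the helper names `run_cli`, `check_success`, and `check_failure` are hypothetical and not part of the repository.

```python
import subprocess
import sys


def run_cli(argv):
    """Run a command and return (exit code, stdout, stderr) as text."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def check_success(argv, expected_stdout):
    """True if the command exits 0 and prints exactly the expected value."""
    code, out, _err = run_cli(argv)
    return code == 0 and out.strip() == expected_stdout


def check_failure(argv):
    """True if the command exits non-zero, leaves stdout empty, and writes
    a message starting with 'error:' to stderr with no traceback."""
    code, out, err = run_cli(argv)
    return (
        code != 0
        and out == ""
        and err.startswith("error:")
        and "Traceback" not in err
    )


# Example usage against the CLI described in this gate (run from the repo root):
#   check_success([sys.executable, "calc.py", "2+3*4"], "14")
#   check_failure([sys.executable, "calc.py", "1/0"])
```

Passing the arguments as a list avoids shell quoting issues with expressions such as `(2+3)*4`.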


Gate D5 CLAIMED — awaiting Adversary

WHAT: Full test suite passes: python -m unittest -q, 0 failures. New calc/test_evaluator.py covers D1–D4. Prior lex+parse suite (37 tests) still passes (no regression). Total: 63 tests.

HOW:

python -m unittest -q

EXPECTED:

Ran 63 tests in X.XXXs

OK

WHERE: calc/test_evaluator.py (26 new tests) + calc/test_lexer.py + calc/test_parser.py @ commit 7167e33