agent-orchestrator-benchmark/calculators/builder-adversary-stateless/run-04/machine-docs/STATUS-eval.md

STATUS — phase eval

DONE

All gates D1-D5 Adversary-verified PASS @ 2026-06-15T04:28:26Z. No vetoes. Phase complete.


Gates: D1-D5 CLAIMED, awaiting Adversary

All five gates implemented and locally verified. Claiming all simultaneously.

Commit: (see git log — latest claim commit)


D1 — arithmetic (CLAIMED)

WHAT: evaluate(parse(tokenize(s))) correct for +, -, *, /, precedence, parens, unary minus.

HOW:

python calc.py "2+3*4"    # 14
python calc.py "(2+3)*4"  # 20
python calc.py "8-3-2"    # 3
python calc.py "-2+5"     # 3
python calc.py "2*-3"     # -6

EXPECTED:

14
20
3
3
-6

WHERE: calc/evaluator.py:evaluate() dispatches on node type; Unary negates its operand, BinOp applies the operator.
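
For readers without the source at hand, the dispatch described in WHERE can be sketched roughly as follows. This is an illustration only: the Num node, the field names (value, operand, op, left, right), and the operator table are assumptions, and the real calc/evaluator.py may be structured differently. Division is left out here because D2 covers it separately.

```python
import operator
from dataclasses import dataclass


# Hypothetical AST node shapes; the real parser's classes may differ.
@dataclass
class Num:
    value: float


@dataclass
class Unary:
    operand: object  # the node being negated


@dataclass
class BinOp:
    op: str          # one of "+", "-", "*"
    left: object
    right: object


# Assumed operator table; "/" is omitted on purpose (see D2).
_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(node):
    """Recursively evaluate an AST by dispatching on the node type."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Unary):
        return -evaluate(node.operand)
    if isinstance(node, BinOp):
        return _OPS[node.op](evaluate(node.left), evaluate(node.right))
    raise TypeError(f"unknown node type: {type(node).__name__}")


# "2+3*4" parsed with * binding tighter than +
print(evaluate(BinOp("+", Num(2), BinOp("*", Num(3), Num(4)))))  # 14
```

In this sketch precedence and associativity live in the tree shape produced by the parser, so the evaluator itself stays a plain recursive walk.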


D2 — division (CLAIMED)

WHAT: / is true division; division by zero raises EvalError, not bare ZeroDivisionError.

HOW:

python calc.py "7/2"   # 3.5
python calc.py "1/0"   # error to stderr, exit 1

EXPECTED:

3.5
error: division by zero   (stderr, exit code 1)

WHERE: calc/evaluator.py: SLASH branch uses Python / and guards right == 0.
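
The guard described in WHERE might look something like the sketch below. The helper name divide is hypothetical, and EvalError is redefined here only to keep the snippet self-contained; in the project it presumably lives alongside the evaluator.

```python
class EvalError(Exception):
    """Stand-in for the project's evaluation error type (assumed shape)."""


def divide(left, right):
    """True division that reports a zero divisor as EvalError."""
    # Check before dividing so ZeroDivisionError never escapes to the caller.
    if right == 0:
        raise EvalError("division by zero")
    return left / right


print(divide(7, 2))  # 3.5
try:
    divide(1, 0)
except EvalError as exc:
    print(f"error: {exc}")  # error: division by zero
```

Checking the divisor up front, rather than catching ZeroDivisionError afterwards, keeps the error message under the evaluator's control.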


D3 — result type (CLAIMED)

WHAT: Whole-valued results print without .0; non-whole as float. Rule in calc.py:fmt(): if isinstance(value, float) and value == int(value) → print as int.

HOW:

python calc.py "4/2"  # 2
python calc.py "7/2"  # 3.5

EXPECTED:

2
3.5

WHERE: calc.py:fmt() function.
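
A minimal sketch of the rule stated in WHAT, assuming fmt() returns a string for the caller to print (the real function may print directly instead):

```python
def fmt(value):
    """Render whole-valued floats without a trailing .0; leave other values as-is."""
    # Rule from the gate: a float equal to its integer truncation prints as an int.
    if isinstance(value, float) and value == int(value):
        return str(int(value))
    return str(value)


print(fmt(4 / 2))  # 2
print(fmt(7 / 2))  # 3.5
```

Note that int() raises on non-finite floats (inf, nan), so a version that can receive those values would need an extra check; whether the calculator can produce them is not stated in this document.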


D4 — CLI (CLAIMED)

WHAT: python calc.py "2+3*4" prints 14 and exits 0; python calc.py "1 +" prints an error to stderr and exits non-zero.

HOW:

python calc.py "2+3*4"; echo "exit:$?"
python calc.py "1 +" 2>&1; echo "exit:$?"

EXPECTED:

14
exit:0
error: unexpected token 'EOF'
exit:1

WHERE: calc.py:main() catches LexError|ParseError|EvalError, prints to stderr, exits 1.
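
The error handling described in WHERE could be sketched as below. The three exception classes are stand-ins named after the ones listed above, and the compute parameter is a hypothetical hook standing in for the tokenize/parse/evaluate/format pipeline; the real main() presumably calls that pipeline directly and passes its return code to sys.exit().

```python
import sys


# Stand-ins for the project's error types (assumed to be plain Exception subclasses).
class LexError(Exception):
    pass


class ParseError(Exception):
    pass


class EvalError(Exception):
    pass


def main(argv, compute):
    """Run compute on argv[1]; return 0 on success, 1 on a calculator error."""
    try:
        result = compute(argv[1])
    except (LexError, ParseError, EvalError) as exc:
        # Errors go to stderr so stdout carries results only.
        print(f"error: {exc}", file=sys.stderr)
        return 1
    print(result)
    return 0


def fake_compute(expr):
    """Hypothetical pipeline stub: fails on the incomplete input used in HOW."""
    if expr == "1 +":
        raise ParseError("unexpected token 'EOF'")
    return "14"


print("exit:", main(["calc.py", "2+3*4"], fake_compute))
print("exit:", main(["calc.py", "1 +"], fake_compute))
```

Only the three calculator errors are caught in this sketch, so an unexpected bug would still surface as a traceback rather than being reported as a user error.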


D5 — tests green + end-to-end (CLAIMED)

WHAT: 50 tests total (17 lex + 22 parse + 11 eval), 0 failures under python -m unittest -q.

HOW:

python -m unittest -q

EXPECTED:

----------------------------------------------------------------------
Ran 50 tests in ...s

OK

WHERE: calc/test_evaluator.py — 11 tests across 3 classes (TestArithmetic, TestDivision, TestResultType).