STATUS-eval — Builder

DONE

All five eval gates D1-D5 Adversary-verified PASS @2026-06-15T01:12:53Z (commit 070dc92). No findings, no VETO. This is the last phase; sequence complete.


Gate: D1-D5 CLAIMED (closed, all PASS)

Commit: see git log --oneline -1 after push

What is claimed

All five eval phase gates (D1-D5):

  • D1 arithmetic — correct results for +, -, *, /, precedence, parens, unary minus
  • D2 division — true division; EvalError (not ZeroDivisionError) on divide-by-zero
  • D3 result type — whole-valued floats returned as int; non-whole as float
  • D4 CLI — calc.py prints result to stdout/exit-0 on success; error to stderr/exit-1 on failure; no traceback
  • D5 tests green — 68 tests pass (18 lex + 26 parse + 24 eval), 0 failures; CLI checks included
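The evaluator source is not reproduced in this report. As a rough illustration of what gates D1 to D3 require, the sketch below evaluates a small tree, raises `EvalError` on divide-by-zero, and returns whole-valued floats as `int`. The node classes `Num`, `Neg`, and `BinOp` and the helper `_normalize` are hypothetical stand-ins; only the names `EvalError` and `evaluate` come from this report, and the real implementation may differ.

```python
from dataclasses import dataclass


class EvalError(Exception):
    """Raised for evaluation failures such as division by zero (D2)."""


# Hypothetical AST node types; the real parser's classes are not shown here.
@dataclass
class Num:
    value: object


@dataclass
class Neg:
    operand: object


@dataclass
class BinOp:
    op: str
    left: object
    right: object


def _normalize(value):
    # D3: whole-valued floats come back as int; other values are unchanged.
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


def evaluate(node):
    """Evaluate a node tree to an int or float, raising EvalError on failure."""
    if isinstance(node, Num):
        return _normalize(node.value)
    if isinstance(node, Neg):
        return _normalize(-evaluate(node.operand))
    if isinstance(node, BinOp):
        left = evaluate(node.left)
        right = evaluate(node.right)
        if node.op == "+":
            return _normalize(left + right)
        if node.op == "-":
            return _normalize(left - right)
        if node.op == "*":
            return _normalize(left * right)
        if node.op == "/":
            # D2: check explicitly so ZeroDivisionError never escapes.
            if right == 0:
                raise EvalError("division by zero")
            return _normalize(left / right)
        raise EvalError(f"unknown operator: {node.op!r}")
    raise EvalError(f"unknown node: {node!r}")
```

Under these assumptions, `evaluate(BinOp("/", Num(4), Num(2)))` yields the int `2`, while `evaluate(BinOp("/", Num(7), Num(2)))` yields the float `3.5`.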

How to verify (exact commands, run from work-adv clone root)

python -m unittest -q

Expected: Ran 68 tests in ...s / OK / exit 0

python calc.py "2+3*4"

Expected stdout: 14 / exit 0

python calc.py "(2+3)*4"

Expected stdout: 20 / exit 0

python calc.py "7/2"

Expected stdout: 3.5 / exit 0

python calc.py "4/2"

Expected stdout: 2 / exit 0

python calc.py "1/0"

Expected: error message on stderr / exit non-zero / no traceback

python calc.py "1 +"

Expected: error message on stderr / exit non-zero / no traceback
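The CLI checks above can also be scripted. The following is a minimal sketch, not part of the benchmark repository: the helper names `run_cli`, `check_success`, `check_failure`, and `verify` are hypothetical, and the expectations are copied from the commands listed in this section.

```python
import subprocess
import sys

# Expression -> expected stdout, taken from the manual checks in this section.
CASES_OK = {"2+3*4": "14", "(2+3)*4": "20", "7/2": "3.5", "4/2": "2"}
# Expressions that must fail cleanly.
CASES_ERR = ["1/0", "1 +"]


def run_cli(args):
    """Run a command; return (exit_code, stdout, stderr) with whitespace stripped."""
    proc = subprocess.run(args, capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip(), proc.stderr.strip()


def check_success(args, expected_stdout):
    """True if the command exits 0 and prints exactly the expected result."""
    code, out, _ = run_cli(args)
    return code == 0 and out == expected_stdout


def check_failure(args):
    """True if the command exits non-zero with a message on stderr and no traceback."""
    code, _, err = run_cli(args)
    return code != 0 and err != "" and "Traceback" not in err


def verify(base_cmd):
    """Return the expressions whose observed behaviour differs from the expectation."""
    failed = [e for e, want in CASES_OK.items() if not check_success(base_cmd + [e], want)]
    failed += [e for e in CASES_ERR if not check_failure(base_cmd + [e])]
    return failed
```

Run from the work-adv clone root, `verify([sys.executable, "calc.py"])` should return an empty list if the D4 behaviour holds.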

Where

  • calc/evaluator.py: EvalError, evaluate(node) -> int | float
  • calc/test_evaluator.py: 24 new unittest tests covering D1-D4
  • calc.py: top-level CLI (work root)
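For orientation, the D4 contract (result on stdout with exit 0, error on stderr with exit 1, no traceback) could be met by a wrapper along these lines. This is a sketch under assumptions, not the contents of calc.py: the `compute` callable, the `HANDLED_ERRORS` tuple, and the usage message are hypothetical, and the real CLI would also need to catch whatever error types the lexer and parser raise, which this report does not name.

```python
import sys


class EvalError(Exception):
    """Stand-in for the evaluator's error type."""


# Assumption: lexer and parser error types would be added to this tuple.
HANDLED_ERRORS = (EvalError,)


def main(argv, compute, out=None, err=None):
    """Return the process exit code: 0 on success, 1 on any handled error."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    if len(argv) != 2:
        print("usage: calc.py EXPR", file=err)
        return 1
    try:
        result = compute(argv[1])
    except HANDLED_ERRORS as exc:
        # Print only the message so no traceback reaches the user.
        print(f"error: {exc}", file=err)
        return 1
    print(result, file=out)
    return 0
```

A script entry point would then call `sys.exit(main(sys.argv, compute))`, where `compute` chains the lexer, parser, and evaluator.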