agent-orchestrator-benchmark/calculators/builder-adversary-stateless/run-05/machine-docs/STATUS-eval.md


STATUS-eval.md

DONE

All eval gates D1-D5 Adversary-verified PASS. Phase complete.


Gate: D1-D5 CLAIMED → ALL PASS

All gates are implemented and self-verified. Claiming D1-D5 together.

WHAT is claimed

  • D1 — arithmetic: evaluate() correct for +, -, *, /, precedence, parens, unary minus
  • D2 — division: true division, EvalError on divide-by-zero (no bare ZeroDivisionError)
  • D3 — result type: whole-valued → int, non-whole → float
  • D4 — CLI: python calc.py <expr> prints result and exits 0; errors to stderr, non-zero exit
  • D5 — tests green + no regression
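
Gates D1 through D3 can be illustrated with a small recursive-descent evaluator. This is a minimal sketch under stated assumptions, not the code in calc/evaluator.py: only the names evaluate and EvalError and the two error messages appear in this status file, while the tokenizer regex, the grammar split, and the helper names (peek, advance, expr, term, factor) are hypothetical.

```python
import re


class EvalError(Exception):
    """Single error type for every evaluation failure (stand-in for the real class)."""


def evaluate(text):
    # Tokens are numbers or single non-space characters (assumed tokenizer).
    tokens = re.findall(r"\d+\.?\d*|\.\d+|\S", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise EvalError("unexpected end of input")
        pos += 1
        return tok

    def expr():  # lowest precedence: + and -
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():  # higher precedence: * and /
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                divisor = factor()
                if divisor == 0:
                    # D2: never let a bare ZeroDivisionError escape.
                    raise EvalError("division by zero")
                value /= divisor
        return value

    def factor():  # unary minus, parentheses, numbers
        tok = advance()
        if tok == "-":
            return -factor()
        if tok == "(":
            value = expr()
            if advance() != ")":
                raise EvalError("expected ')'")
            return value
        try:
            return float(tok)
        except ValueError:
            raise EvalError(f"unexpected token {tok!r}") from None

    result = expr()
    if peek() is not None:
        raise EvalError(f"unexpected token {peek()!r}")
    # D3: whole-valued results become int, everything else stays float.
    return int(result) if result.is_integer() else result
```

Computing in float and normalizing once at the end keeps the D3 rule in a single place; a real implementation might instead track int and float separately to avoid precision loss on very large integers.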

HOW to verify (cold, from fresh clone)

cd <repo-root>
python -m unittest -q                   # must be 0 failures (65 tests)
python calc.py "2+3*4"                  # expect: 14
python calc.py "(2+3)*4"                # expect: 20
python calc.py "7/2"                    # expect: 3.5
python calc.py "4/2"                    # expect: 2
python calc.py "1/0"                    # expect: non-zero exit, error on stderr
python calc.py "1 +"                    # expect: non-zero exit, error on stderr

EXPECTED outcomes

  • unittest: Ran 65 tests in ~0.001s → OK
  • 2+3*4 → 14
  • (2+3)*4 → 20
  • 7/2 → 3.5
  • 4/2 → 2
  • 1/0 → exit 1, stderr: error: division by zero
  • 1 + → exit 1, stderr: error: unexpected end of input
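
The manual checks above could also be scripted. The following is a rough sketch, assuming it is run from the repo root where calc.py lives; the helper names check, CASES, and main are hypothetical and not part of the repository. Error cases are checked only for a non-zero exit and non-empty stderr, not for the exact message.

```python
import subprocess
import sys


def check(args, expected_stdout=None, expect_failure=False):
    """Run `python <args...>` and report whether the outcome matches."""
    proc = subprocess.run(
        [sys.executable, *args], capture_output=True, text=True
    )
    if expect_failure:
        # Error path: non-zero exit code and something written to stderr.
        return proc.returncode != 0 and proc.stderr.strip() != ""
    # Success path: exit 0 and stdout equal to the expected text.
    return proc.returncode == 0 and proc.stdout.strip() == expected_stdout


# (expression, expected stdout); None marks an expected failure.
CASES = [
    ("2+3*4", "14"),
    ("(2+3)*4", "20"),
    ("7/2", "3.5"),
    ("4/2", "2"),
    ("1/0", None),
    ("1 +", None),
]


def main():
    failures = 0
    for expression, expected in CASES:
        ok = check(["calc.py", expression], expected, expect_failure=expected is None)
        print(f"{'PASS' if ok else 'FAIL'}  {expression!r}")
        failures += not ok
    return 1 if failures else 0
```

Calling sys.exit(main()) from the repo root would run all six cases and exit non-zero if any of them deviates.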

WHERE inputs live

  • calc/evaluator.py — evaluator + EvalError
  • calc/test_evaluator.py — unittest suite covering D1-D3
  • calc.py — CLI entry point
  • Commit: (see git log — latest commit on main)
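
For orientation, the D4 contract of the CLI entry point can be sketched as follows. This is an illustration under assumptions, not the contents of calc.py: the evaluator is passed in as a parameter so the sketch stays self-contained, the EvalError class here is a stand-in, and the usage message with exit code 2 is a guess. Only the "error: ..." prefix on stderr and exit code 1 are taken from the expected outcomes listed in this file.

```python
import sys


class EvalError(Exception):
    """Stand-in for the EvalError defined in calc/evaluator.py."""


def main(argv, evaluate, out=None, err=None):
    """Evaluate argv[0]; return the process exit code."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    if len(argv) != 1:
        # Assumed behavior: the status file does not say what happens here.
        print("usage: python calc.py <expr>", file=err)
        return 2
    try:
        result = evaluate(argv[0])
    except EvalError as exc:
        # D4: errors go to stderr with a non-zero exit code.
        print(f"error: {exc}", file=err)
        return 1
    print(result, file=out)
    return 0
```

A real entry point would presumably end with something like sys.exit(main(sys.argv[1:], evaluate)), importing evaluate from the evaluator module.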