agent-orchestrator-benchmark/calculators/builder-adversary-min/run-04/machine-docs/REVIEW-eval.md

REVIEW — phase eval

Adversary cold-verify against commit 0a56046.

Results

| Gate | Verdict | Evidence |
|------|---------|----------|
| D1 | PASS | `2+3*4`→14, `(2+3)*4`→20, `8-3-2`→3, `-2+5`→3, `2*-3`→-6 |
| D2 | PASS | `7/2`→3.5 (true div); `1/0` raises EvalError, not bare ZeroDivisionError |
| D3 | PASS | `4/2`→2 (int, no .0); `7/2`→3.5 (float) |
| D4 | PASS | `python calc.py "2+3*4"` prints 14, exit 0; `python calc.py "1 +"` prints error to stderr only, exit 1; `python calc.py "1/0"` same |
| D5 | PASS | `python -m unittest -q` → 42 tests, 0 failures; covers D1-D3 and full prior suite (lex + parse) |
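
For readers without the source at hand, the behavior checked by D2 and D3 can be sketched roughly as follows. This is an illustration only, not the code under review: `divide` and `format_result` are hypothetical helper names, and the `EvalError` class here is a stand-in for the one the evaluator actually defines.

```python
class EvalError(Exception):
    """Stand-in for the evaluator's error type (the real class is in the reviewed code)."""


def divide(a, b):
    # D2: true division; ZeroDivisionError is wrapped instead of leaking out bare.
    try:
        return a / b
    except ZeroDivisionError as exc:
        raise EvalError("division by zero") from exc


def format_result(value):
    # D3: whole-valued floats print without ".0"; everything else prints as-is.
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)
```

Under these assumptions, `format_result(divide(4, 2))` gives `"2"` and `format_result(divide(7, 2))` gives `"3.5"`, matching the D3 evidence.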

Verification commands run

python -m unittest -q                   # 42 OK
python calc.py "2+3*4"                  # 14, exit 0
python calc.py "(2+3)*4"               # 20, exit 0
python calc.py "8-3-2"                  # 3
python calc.py "-2+5"                   # 3
python calc.py "2*-3"                   # -6
python calc.py "7/2"                    # 3.5, exit 0
python calc.py "4/2"                    # 2, exit 0
python calc.py "1/0"   (2>/dev/null)   # stdout empty, exit 1; stderr: "error: division by zero"
python calc.py "1 +"   (2>/dev/null)   # stdout empty, exit 1; stderr: "error: unexpected token 'EOF' (None)"
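
The manual CLI checks above could be scripted along these lines. This is a minimal sketch, not part of the reviewed repository: `run_case` and `check_d4` are hypothetical names, and the pass criteria (stdout and exit code match; error cases leave stdout empty and write to stderr) are taken from the D4 evidence.

```python
import subprocess
import sys


def run_case(argv):
    """Run one command and return (stdout, stderr, exit code), with output stripped."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.stdout.strip(), proc.stderr.strip(), proc.returncode


def check_d4(calc_cmd, expr, want_stdout, want_exit):
    """Return True if stdout and exit code match the expectation for one expression."""
    out, err, code = run_case(calc_cmd + [expr])
    if code != want_exit or out != want_stdout:
        return False
    # Failing expressions must report on stderr only, leaving stdout empty.
    return want_exit == 0 or (out == "" and err != "")
```

For example, `check_d4([sys.executable, "calc.py"], "2+3*4", "14", 0)` would cover the first CLI check, assuming `calc.py` is in the current directory.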

Verdict

eval/D1: PASS @2026-06-15T00:00:00Z
eval/D2: PASS @2026-06-15T00:00:00Z
eval/D3: PASS @2026-06-15T00:00:00Z
eval/D4: PASS @2026-06-15T00:00:00Z
eval/D5: PASS @2026-06-15T00:00:00Z

All five gates PASS. No VETO. Builder may write ## DONE to machine-docs/STATUS-eval.md.