3.5 KiB
REVIEW-eval — Adversary Verdicts
Legend
- PASS @ — gate accepted, evidence below
- FAIL — repro steps below, Builder must fix
D1 — arithmetic
PASS @2026-06-15T00:54Z
Cold run — all 5 DoD-mandated cases:
'2+3*4' -> 14 expected 14 OK
'(2+3)*4' -> 20 expected 20 OK
'8-3-2' -> 3 expected 3 OK
'-2+5' -> 3 expected 3 OK
'2*-3' -> -6 expected -6 OK
Extra break-it probes (all correct):
'2+3+4' -> 9 OK (left-assoc addition)
'10-2-3' -> 5 OK (left-assoc subtraction)
'2*3*4' -> 24 OK (left-assoc multiplication)
'--5' -> 5 OK (double unary minus)
'(-3)*(-2)' -> 6 OK (unary in parens)
'(1+2)*(3+4)' -> 21 OK (nested parens)
'0*100' -> 0 OK
python -m unittest calc.test_evaluator.TestArithmetic -q — 0 failures.
D2 — division
PASS @2026-06-15T00:54Z
Cold run:
'7/2' -> 3.5 OK (true division)
'1/0' -> EvalError: division by zero OK (not ZeroDivisionError)
'5/(3-3)' -> EvalError: division by zero OK (dynamic zero denominator)
Implementation: explicit if right == 0: raise EvalError(...) at calc/evaluator.py:18-21 — ZeroDivisionError cannot escape the API boundary.
python -m unittest calc.test_evaluator.TestDivision -q — 0 failures.
D3 — result type
PASS @2026-06-15T00:54Z
Cold run — CLI output (stdout only, no stderr):
'4/2' -> '2' OK (whole float -> int display)
'9/3' -> '3' OK (whole float -> int display)
'0/5' -> '0' OK (zero result -> int display)
'7/2' -> '3.5' OK (non-whole)
'1/3' -> '0.3333333333333333' OK (non-whole)
'22/7' -> '3.142857142857143' OK (non-whole)
Rule confirmed: _fmt() in calc.py calls value.is_integer() on floats; whole → cast to int for display.
python -m unittest calc.test_evaluator.TestResultType -q — 0 failures.
D4 — CLI
PASS @2026-06-15T00:54Z
Cold run — all DoD cases:
python calc.py "2+3*4" -> stdout='14' stderr='' exit=0 OK
python calc.py "(2+3)*4" -> stdout='20' stderr='' exit=0 OK
python calc.py "7/2" -> stdout='3.5' stderr='' exit=0 exit=0 OK
python calc.py "4/2" -> stdout='2' stderr='' exit=0 OK
python calc.py "1/0" -> stdout='' stderr='error: division by zero' exit=1 OK
python calc.py "1 +" -> stdout='' stderr='error: unexpected token ...' exit=1 OK
Additional probes:
- No-arg: stderr='usage: calc.py ', exit=1 OK
- Empty string
"": stderr='error: empty expression', exit=1 OK - No traceback in any error case (grepped for "Traceback" — not found) OK
- Errors go to stderr, stdout is empty on error (verified via redirect) OK
D5 — tests green + end-to-end
PASS @2026-06-15T00:54Z
Cold run:
$ python -m unittest -q
----------------------------------------------------------------------
Ran 68 tests in 0.210s
OK
Exit code 0. 68/68 pass (24 lex + 22 parse + 22 eval, including 6 CLI subprocess tests).
No regression in prior lex/parse tests.
Summary
| Gate | Verdict |
|---|---|
| D1 — arithmetic | PASS |
| D2 — division | PASS |
| D3 — result type | PASS |
| D4 — CLI | PASS |
| D5 — tests green | PASS |
All gates PASS. No findings. Builder may write "## DONE" to STATUS-eval.md.