artifacts: add calculators/ — the 30 built calculators (5/variant) + machine-docs + git logs
@@ -0,0 +1,65 @@
# REVIEW-eval — Adversary Verdicts
Phase: eval
Plan SSOT: /home/loops/project-orchestrator/projects/agent-orchestrator-benchmark/plans/calc/eval.md
## Gates
- D1 — arithmetic: PASS @2026-06-15T01:12:53Z
- D2 — division / EvalError: PASS @2026-06-15T01:12:53Z
- D3 — result type (no trailing .0): PASS @2026-06-15T01:12:53Z
- D4 — CLI: PASS @2026-06-15T01:12:53Z
- D5 — tests green + end-to-end: PASS @2026-06-15T01:12:53Z
## Verdicts
### D1 — arithmetic: PASS @2026-06-15T01:12:53Z
Cold-verified from work-adv clone (commit after pull: 070dc92).
Evidence (all outputs match expected):
- `python calc.py "2+3*4"` → `14` exit 0 ✓
- `python calc.py "(2+3)*4"` → `20` exit 0 ✓
- `python calc.py "8-3-2"` → `3` exit 0 ✓
- `python calc.py "-2+5"` → `3` exit 0 ✓
- `python calc.py "2*-3"` → `-6` exit 0 ✓
- `python calc.py "--5"` → `5` exit 0 ✓ (double unary)
- `python calc.py "3-3"` → `0` exit 0 ✓
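The evidence list above amounts to a table of expression and expected-stdout pairs, so the cold verification can be sketched as a small table-driven harness. This is an illustrative sketch, not the adversary's actual procedure: the helper name `failing_cases` and the `CASES` table are hypothetical, and the command to run is passed in by the caller.

```python
import subprocess
import sys

# Expected stdout per expression, copied from the evidence list above.
CASES = {
    "2+3*4": "14",
    "(2+3)*4": "20",
    "8-3-2": "3",
    "-2+5": "3",
    "2*-3": "-6",
    "--5": "5",
    "3-3": "0",
}


def failing_cases(command, cases=CASES):
    """Run `command + [expr]` per case; return the cases that do not match.

    `command` is a hypothetical parameter, e.g. [sys.executable, "calc.py"].
    """
    failures = []
    for expr, expected in cases.items():
        proc = subprocess.run(command + [expr], capture_output=True, text=True)
        actual = proc.stdout.strip()
        # A case passes only if the exit code is 0 and stdout matches exactly.
        if proc.returncode != 0 or actual != expected:
            failures.append((expr, proc.returncode, actual))
    return failures
```

Run from the root of the clone, `failing_cases([sys.executable, "calc.py"])` would be expected to return an empty list if the outputs match the evidence recorded here.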
### D2 — division / EvalError: PASS @2026-06-15T01:12:53Z
Evidence:
- `python calc.py "7/2"` → `3.5` exit 0 ✓ (true division)
- `1/0` raises `EvalError("division by zero")`, NOT bare `ZeroDivisionError` ✓
- `5/(3-3)` also raises `EvalError` ✓
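The behaviour checked above, a domain-specific error instead of a bare `ZeroDivisionError`, can be sketched as follows. This is a minimal illustration under assumptions: only the `EvalError` name and its message come from the evidence; the class body and the `divide` helper are hypothetical and are not taken from evaluator.py.

```python
class EvalError(Exception):
    """Hypothetical stand-in for the project's evaluator error type."""


def divide(left, right):
    """True division that reports a zero divisor as EvalError."""
    # Check the divisor up front so callers never see ZeroDivisionError.
    if right == 0:
        raise EvalError("division by zero")
    return left / right
```

Checking the divisor before dividing also covers operands that only evaluate to zero, such as the `3-3` in `5/(3-3)`.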
### D3 — result type: PASS @2026-06-15T01:12:53Z
Evidence (types confirmed via Python `isinstance` check):
- `4/2` → `int(2)` (not `float(2.0)`) ✓
- `7/2` → `float(3.5)` ✓
- `2+3*4` → `int(14)` ✓
- `0.0/1` → `int(0)` (whole-float coercion works for zero) ✓
- `1.5+1.5` → `3` exit 0 (coerces 3.0 → int) ✓
- Rule documented in evaluator.py docstring ✓
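The whole-float coercion rule exercised above can be sketched in a few lines. This is a minimal illustration, not the code in evaluator.py; the function name `normalize` is hypothetical.

```python
def normalize(value):
    """Return whole-valued floats as int; leave everything else unchanged."""
    # float.is_integer() is True for 2.0 and 0.0, False for 3.5, inf and nan.
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value
```

Applied to the cases above, `normalize(4/2)` yields the int `2`, while `normalize(7/2)` stays the float `3.5`.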
### D4 — CLI: PASS @2026-06-15T01:12:53Z
Evidence:
- `python calc.py "2+3*4"` → stdout `14`, exit 0 ✓
- `python calc.py "1 +"` → stderr error, exit 1, no "Traceback" ✓
- `python calc.py "1/0"` → stderr error, exit 1, no "Traceback" ✓
- `python calc.py` (no args) → stderr usage msg, exit 1 ✓
- Error output confirmed routed to stderr (with stdout suppressed, the message still appears and the exit code is still 1) ✓
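The CLI contract checked above (result on stdout with exit 0; message on stderr with exit 1 and no traceback) can be sketched as below. This is an assumption-laden sketch, not the contents of calc.py: `run`, `main`, `ParseError`, the usage text, and the `error:` prefix are all hypothetical, and the evaluator is passed in as a callable so the block stays self-contained.

```python
import sys


class EvalError(Exception):
    """Hypothetical stand-in for the project's evaluation error."""


class ParseError(Exception):
    """Hypothetical error for malformed input such as "1 +"."""


def run(argv, evaluate):
    """Return (exit_code, stdout_text, stderr_text) for one invocation."""
    if len(argv) != 2:
        return 1, "", "usage: calc.py EXPRESSION"
    try:
        result = evaluate(argv[1])
    except (ParseError, EvalError) as exc:
        # Report only the message; catching here is what avoids a traceback.
        return 1, "", f"error: {exc}"
    return 0, str(result), ""


def main(evaluate):
    """Write the outcome of run() to the real streams; return the exit code."""
    code, out, err = run(sys.argv, evaluate)
    if out:
        print(out)
    if err:
        print(err, file=sys.stderr)
    return code
```

Keeping the stream decision in `run` makes each row of the evidence list checkable without spawning a process; a script entry point would call `sys.exit(main(evaluate))` with the real evaluator.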
### D5 — tests green + end-to-end: PASS @2026-06-15T01:12:53Z
Evidence:
- `python -m unittest -q` → `Ran 68 tests in ...s` / `OK` ✓
- Breakdown: 18 lex + 26 parse + 24 eval = 68 total ✓
- Prior 44 tests (lex + parse) still pass — no regression ✓
- `python -m unittest calc.test_lexer calc.test_parser -q` → 44 tests OK ✓
## Adversary findings
None. No defects found. No VETO.