# REVIEW — eval phase (Adversary)

## Gates

| Gate | Status | Verified at |
|------|--------|-------------|
| D1 (arithmetic) | **PASS** | 2026-06-15T04:28:26Z |
| D2 (division / EvalError) | **PASS** | 2026-06-15T04:28:26Z |
| D3 (result type) | **PASS** | 2026-06-15T04:28:26Z |
| D4 (CLI) | **PASS** | 2026-06-15T04:28:26Z |
| D5 (tests green + end-to-end) | **PASS** | 2026-06-15T04:28:26Z |

No VETO.

---

## D1 — arithmetic: PASS @2026-06-15T04:28:26Z

Cold-run all plan-specified cases:
```
python calc.py "2+3*4"    → 14    ✓
python calc.py "(2+3)*4"  → 20    ✓
python calc.py "8-3-2"    → 3     ✓
python calc.py "-2+5"     → 3     ✓
python calc.py "2*-3"     → -6    ✓
```
Also tested: `--5` → 5 (double unary minus), `-(2+3)` → -5, and deeply nested parens `((((1+2)*3)-4)/5)` → 1. All correct.
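For reference, the precedence and unary-minus behaviour these cases probe can be reproduced with a small recursive-descent evaluator. This is only an illustrative sketch, not the code under review: `tokenize` and `sketch_eval` are hypothetical names, and the grammar in the comments is an assumption about how such a calculator is typically structured.

```python
import re

# One token per match: a number or a single-character operator/paren.
_TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[-+*/()])")


def tokenize(text):
    """Split an expression into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def sketch_eval(text):
    """Evaluate +, -, *, / with parentheses and unary minus (hypothetical sketch)."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():  # expr := term (('+' | '-') term)*   (left-associative)
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():  # term := unary (('*' | '/') unary)*  (binds tighter than +/-)
        value = unary()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= unary()
            else:
                value /= unary()
        return value

    def unary():  # unary := '-' unary | atom   (recursion allows "--5")
        if peek() == "-":
            advance()
            return -unary()
        return atom()

    def atom():  # atom := NUMBER | '(' expr ')'
        tok = peek()
        if tok is None:
            raise ValueError("unexpected token 'EOF'")
        advance()
        if tok == "(":
            value = expr()
            if peek() != ")":
                raise ValueError("expected ')'")
            advance()
            return value
        if tok[0].isdigit():
            return float(tok) if "." in tok else int(tok)
        raise ValueError(f"unexpected token {tok!r}")

    result = expr()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result
```

Placing `unary` below `term` in the grammar is what lets `2*-3` parse, and the left-to-right loops in `expr` and `term` give `8-3-2` its left-associative result.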

---

## D2 — division / EvalError: PASS @2026-06-15T04:28:26Z

```
python calc.py "7/2"  → 3.5            ✓
python calc.py "1/0"  → stderr: "error: division by zero", exit 1  ✓
```
Verified `EvalError` (not bare `ZeroDivisionError`) is raised at the API level:
```python
from calc.evaluator import evaluate, EvalError
# 1/0 → EvalError("division by zero")  ✓
```
Also tested `5/(3-3)` — raises `EvalError`. Error output confirmed on stderr only (stdout empty).
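A minimal sketch of the behaviour checked here, assuming the evaluator translates Python's `ZeroDivisionError` into its own exception. The `EvalError` class below is a stand-in for the real one in `calc.evaluator`, and `safe_divide` is a hypothetical helper; the actual implementation may detect the zero divisor differently.

```python
class EvalError(Exception):
    """Stand-in for calc.evaluator.EvalError (real definition not shown here)."""


def safe_divide(left, right):
    """Divide, surfacing a zero divisor as EvalError (hypothetical helper)."""
    try:
        return left / right
    except ZeroDivisionError:
        # Suppress the original exception so callers only ever see EvalError.
        raise EvalError("division by zero") from None


def raises_eval_error(left, right):
    """Return the EvalError message if one is raised, else None."""
    try:
        safe_divide(left, right)
    except EvalError as exc:
        return str(exc)
    return None
```

Under these assumptions, `raises_eval_error(1, 0)` and `raises_eval_error(5, 3 - 3)` both yield the message, while `safe_divide(7, 2)` returns `3.5`.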

---

## D3 — result type: PASS @2026-06-15T04:28:26Z

```
python calc.py "4/2"  → "2"    (not "2.0")  ✓
python calc.py "7/2"  → "3.5"              ✓
```
Note: `evaluate()` returns the float `2.0` for `4/2`; `fmt()` in `calc.py` converts whole-valued floats to `int` for display. The rule is correct and applied consistently. Also tested `6/2` → `3`, `9/3` → `3`, `0/5` → `0`, `1/1` → `1`. All print without `.0`.
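The display rule described above can be sketched as follows. The body is an assumption about what `fmt()` does, reconstructed from the observed output rather than copied from `calc.py`.

```python
def fmt(value):
    """Render a result, dropping the trailing .0 from whole-valued floats.

    Assumed behaviour, inferred from the outputs recorded in this review.
    """
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)
```

With this rule, `fmt(4 / 2)` gives `"2"` and `fmt(7 / 2)` gives `"3.5"`, matching the two cases recorded for D3.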

---

## D4 — CLI: PASS @2026-06-15T04:28:26Z

```
python calc.py "2+3*4"   → stdout: "14", exit 0   ✓
python calc.py "1 +"     → stderr: "error: unexpected token 'EOF'", exit 1  ✓
```
No-argument case: prints usage to stderr and exits 1, which is acceptable. Empty string: raises `ParseError`, prints the error to stderr, and exits 1.
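The stdout, stderr, and exit-code checks above can be scripted with a small harness. This is a sketch only: `run_cli` is a hypothetical helper, and the commented usage assumes `calc.py` sits in the working directory.

```python
import subprocess
import sys


def run_cli(argv):
    """Run a command and return (stdout, stderr, exit code), streams stripped."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.stdout.strip(), proc.stderr.strip(), proc.returncode


# Example usage (assumes calc.py is in the current directory):
#   out, err, code = run_cli([sys.executable, "calc.py", "1 +"])
#   assert out == "" and err.startswith("error:") and code == 1
```

Capturing the two streams separately is what makes it possible to confirm that errors go to stderr only and that stdout stays empty on failure.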

---

## D5 — tests green + end-to-end: PASS @2026-06-15T04:28:26Z

```
python -m unittest -q
→ Ran 50 tests in 0.002s — OK   ✓
```
Test count breakdown: 17 lex + 22 parse + 11 eval = 50. No regressions.

Test coverage verified:
- `TestArithmetic` (5 tests): covers D1 plan cases
- `TestDivision` (3 tests): covers D2, including division by zero through a computed denominator (`5/(3-3)`)
- `TestResultType` (3 tests): covers D3 including integer arithmetic type preservation