Files
agent-orchestrator-benchmark/calculators/builder-adversary-deferred/run-02/machine-docs/STATUS-eval.md

2.4 KiB

STATUS — eval phase

Current state: ALL GATES SELF-CERTIFIED

Per DEFERRED review cadence: build phases self-certify. All DoD gates pass.


Gate D1 — arithmetic (SELF-CERTIFIED PASS)

WHAT: evaluate(parse(tokenize(s))) correct for + - * /, precedence, parens, unary minus.

HOW:

python -m unittest calc.test_evaluator.TestArithmetic -v

EXPECTED: All 5 tests pass (0 failures).

WHERE: calc/evaluator.py, calc/test_evaluator.py — commit to be pushed.


Gate D2 — division (SELF-CERTIFIED PASS)

WHAT: / is true division; division by zero raises EvalError (not bare ZeroDivisionError).

HOW:

python -m unittest calc.test_evaluator.TestDivision -v
python calc.py "7/2"    # expect 3.5
python calc.py "1/0"    # expect error to stderr, exit 1

EXPECTED: All 3 tests pass; 7/23.5; 1/0 → stderr error: division by zero, exit 1.

WHERE: calc/evaluator.py evaluate()/ case with EvalError guard.


Gate D3 — result type (SELF-CERTIFIED PASS)

WHAT: Whole-valued results print without .0; non-whole as float.

HOW:

python -m unittest calc.test_evaluator.TestResultType -v
python calc.py "4/2"    # expect 2
python calc.py "7/2"    # expect 3.5

EXPECTED: All 4 tests pass; 4/22; 7/23.5.

WHERE: evaluate() normalises result: if isinstance(result, float) and result.is_integer(): return int(result).


Gate D4 — CLI (SELF-CERTIFIED PASS)

WHAT: python calc.py "2+3*4" prints 14 and exits 0; python calc.py "1 +" prints error to stderr and exits non-zero.

HOW:

python calc.py "2+3*4";    echo "exit: $?"   # 14 / exit: 0
python calc.py "(2+3)*4";  echo "exit: $?"   # 20 / exit: 0
python calc.py "7/2";      echo "exit: $?"   # 3.5 / exit: 0
python calc.py "4/2";      echo "exit: $?"   # 2 / exit: 0
python calc.py "1/0";      echo "exit: $?"   # error to stderr / exit: 1
python calc.py "1 +";      echo "exit: $?"   # error to stderr / exit: 1

WHERE: calc.py at repo root.


Gate D5 — tests green + end-to-end (SELF-CERTIFIED PASS)

WHAT: Full suite (lex + parse + eval) passes, 0 failures.

HOW:

python -m unittest -q

EXPECTED: Ran 56 tests in 0.00Xs / OK

WHERE: calc/test_lexer.py, calc/test_parser.py, calc/test_evaluator.py


DONE