agent-orchestrator-benchmark/calculators/builder-adversary/run-03/machine-docs/JOURNAL-eval.md

JOURNAL-eval — Builder

Build log

Approach

AST walker in calc/evaluator.py:

  • Num → return _coerce(node.value)
  • Unary('-', ...) → _coerce(-evaluate(operand))
  • BinOp → evaluate both sides; for /, check right == 0 before dividing; apply _coerce to result

_coerce(value): if isinstance(value, float) and value == int(value) → int(value), else pass-through. This keeps the API return clean (no 2.0 leaking out) and is applied consistently at every node evaluation site.
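A minimal sketch of the walker and _coerce as described above. The node classes (Num, Unary, BinOp as dataclasses), their field names, and the choice of ZeroDivisionError are assumptions made for illustration; the real definitions in calc/ may differ.

```python
import operator
from dataclasses import dataclass
from typing import Union

Number = Union[int, float]


# Hypothetical AST node shapes; the parser's real classes may differ.
@dataclass
class Num:
    value: Number


@dataclass
class Unary:
    op: str
    operand: "Node"


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, Unary, BinOp]

# Operators other than "/" need no special handling.
_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def _coerce(value: Number) -> Number:
    """Turn whole-valued floats into int; pass everything else through."""
    if isinstance(value, float) and value == int(value):
        return int(value)
    return value


def evaluate(node: Node) -> Number:
    """Walk the tree, applying _coerce at every evaluation site."""
    if isinstance(node, Num):
        return _coerce(node.value)
    if isinstance(node, Unary):
        if node.op != "-":
            raise ValueError(f"unsupported unary operator: {node.op!r}")
        return _coerce(-evaluate(node.operand))
    if isinstance(node, BinOp):
        left = evaluate(node.left)
        right = evaluate(node.right)
        if node.op == "/":
            # Check the divisor before dividing, as the journal describes.
            if right == 0:
                raise ZeroDivisionError("division by zero")  # assumed error type
            return _coerce(left / right)
        if node.op not in _OPS:
            raise ValueError(f"unsupported binary operator: {node.op!r}")
        return _coerce(_OPS[node.op](left, right))
    raise TypeError(f"unknown node type: {type(node).__name__}")
```

With these definitions, evaluate(BinOp("/", Num(4), Num(2))) returns the int 2, while evaluate(BinOp("/", Num(7), Num(2))) returns 3.5.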

Test run (local)

python -m unittest -v 2>&1
...
Ran 68 tests in 0.270s
OK

All 68 tests pass:

  • 18 lexer tests (unchanged)
  • 26 parser tests (unchanged)
  • 24 evaluator + CLI tests (new)

CLI spot-check

python calc.py "2+3*4"    → 14
python calc.py "(2+3)*4"  → 20
python calc.py "7/2"      → 3.5
python calc.py "4/2"      → 2
python calc.py "1/0"      → error: division by zero (stderr, exit 1)
python calc.py "1 +"      → error: unexpected end of input (stderr, exit 1)
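The error behaviour in the spot-check (message on stderr, exit code 1) could be wired up roughly as follows. This is a sketch only: CalcError, run, and fake_evaluate are hypothetical names, and the journal does not show how calc.py actually handles errors.

```python
import io
from typing import Callable, TextIO, Union


class CalcError(Exception):
    """Hypothetical error type standing in for whatever the calculator raises."""


def run(expr: str,
        evaluate_text: Callable[[str], Union[int, float]],
        out: TextIO,
        err: TextIO) -> int:
    """Print the result to out, or 'error: ...' to err; return the exit code."""
    try:
        result = evaluate_text(expr)
    except CalcError as exc:
        print(f"error: {exc}", file=err)
        return 1
    print(result, file=out)
    return 0


def fake_evaluate(expr: str) -> Union[int, float]:
    # Stand-in for lex + parse + evaluate, just enough to exercise both paths.
    if expr == "1/0":
        raise CalcError("division by zero")
    return 14


# Capture both streams in memory instead of touching the real stdout/stderr.
out, err = io.StringIO(), io.StringIO()
ok_code = run("2+3*4", fake_evaluate, out, err)
bad_code = run("1/0", fake_evaluate, out, err)
```

A real entry point would pass sys.stdout and sys.stderr and hand the returned code to sys.exit. Passing the streams in explicitly keeps the stderr and exit-code behaviour easy to check from a unit test.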

D3 rule rationale

Python's / operator always returns a float for int and float operands, even when the quotient is whole. Applying _coerce at every evaluate site means:

  • 4/2 → 2.0 → int(2) = 2
  • 7/2 → 3.5 (not whole → stays float)
  • 2+3 → 5 (int arithmetic → already int, _coerce is a no-op)

This is documented in the calc/evaluator.py module docstring.
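The three cases in the D3 rationale can be checked with plain Python, independent of the calculator code. The variable names here are illustrative only.

```python
whole = 4 / 2   # true division of two ints still yields a float
frac = 7 / 2
total = 2 + 3   # int arithmetic stays int

print(type(whole).__name__, whole)   # float 2.0
print(type(frac).__name__, frac)     # float 3.5
print(type(total).__name__, total)   # int 5

# The condition used by the coercion rule: a float equal to its own int value.
whole_is_integral = isinstance(whole, float) and whole == int(whole)
frac_is_integral = isinstance(frac, float) and frac == int(frac)
```

Here whole_is_integral is True, so 4/2 is reported as 2, and frac_is_integral is False, so 7/2 stays 3.5.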