# REVIEW — phase aotest (Adversary log)

**Phase plan:** `/srv/cc-ci/cc-ci-plan/plan-phase-aotest-verify.md`

**Deliverable repo:** `recipe-maintainers/agent-orchestrator` on git.autonomic.zone

---

## Adversary orientation @2026-06-13T18:44Z

**Mission:** Verify the agent-orchestrator harness runs a real project generically on BOTH claude and opencode backends, fully isolated, with a committed test suite.

**DoD items to verify (from phase plan):**

1. Unit tests PASS, run from a clean /tmp checkout inside `nix develop`
2. claude smoke test PASSES via the harness (isolated, cleaned up)
3. opencode smoke test PASSES or SKIPs with a clear, justified reason recorded here
4. No leftover `aotest-*` tmux sessions or held ports after the run; live cc-ci sessions (cc-ci-orchestrator/watchdog/assistant3) untouched
5. Test suite + runner committed and documented in README

**Key guardrails for my verification:**

- Must use a non-`cc-ci-` session prefix (`aotest-*` is correct)
- opencode port must ≠ 4096 (the live cc-ci port)
- Do NOT touch the live launch system: `/srv/cc-ci/cc-ci-plan/agents.py`, `agents.toml`, `cc-ci-plan/state/`, or running tmux sessions
- Verify from COLD START: fresh shell, /tmp checkout, no cached state

**Repo state at orientation:** v0.1.0 (commit `289ef07`) — no tests/ dir present yet. Awaiting Builder to push the aotest deliverable.

**Code orientation @2026-06-13T18:44Z (from clean /tmp/ao-adv-check clone):**

Key functions the unit tests MUST exercise (from reading agents.py, 929 lines):

- `load_config`: session_prefix required → hard die; log_dir required → hard die; defaults merge; project_dir resolution; agents inherit defaults; services inherit defaults
- `build_loop_kickoff`: reads `[loop].kickoff_template`, fills `{phase_id}/{plan}/{status}/{role}`, then appends `/.md`. No project text in code — must test slot substitution.
- `phase_done`: reads `status_basename` from `handoff_repo(cfg)`, looks for the `done_marker` line; skips `DONE_PLACEHOLDER_RE` lines. Must test: file absent → False, no marker → False, marker present → True, placeholder line → False.
- `phase_advance_check`: auto-advance on DONE marker; idempotent when SEQUENCE-COMPLETE exists; appending a phase clears the SEQUENCE-COMPLETE marker and resumes.
- `_parse_reset_epoch`: AM/PM handling (12pm=12:00, 12am=00:00), 24h format, invalid hour/minute returns None, no match returns None. Takes the LAST match.
- `_parse_waiting_until`: footer_ui branch uses the last non-empty line only; non-footer scans the whole pane. ISO-8601 with Z suffix. Invalid format returns None.
- `pane_active`: claude backend uses `active_re` match; opencode uses the `footer_ui` branch (only last line of 3 matters); limit banner + idle = not active (tested in selftest).

**Live smoke isolation requirements (DoD verification):**

- claude smoke: session prefix must be `aotest-` (NOT `cc-ci-`), isolated log dir under /tmp
- opencode smoke: port must ≠ 4096 (live cc-ci port is 4096), own server, own prefix
- Post-run: `tmux ls | grep aotest` → zero results; live sessions intact

**Specific break-it checks I will run:**

1. `tmux ls | grep aotest` before AND after: no leakage
2. `ss -ltn | grep 4096`: opencode test must NOT use this port
3. Check cc-ci sessions: cc-ci-orchestrator, cc-ci-watchdog, cc-ci-assistant3 still present
4. Try to interrupt the live smoke mid-run (if isolatable) — cleanup still fires
5. Unit test edge cases:
   - `load_config` with missing session_prefix → expect `die()`
   - `load_config` with missing log_dir → expect `die()`
   - `phase_done` with `## DONE` followed only by placeholder → expect False
   - `_parse_reset_epoch("resets Jun 16, 12pm")` → 12:00 (NOT 24:00, which is invalid)
   - `_parse_reset_epoch("resets Jun 16, 12am")` → 00:00 (not 12:00)
   - `_parse_waiting_until` with footer_ui=True: only the last non-empty line is checked
6. Confirm selftest (DoD-3 of aoeng) still passes after any test infrastructure changes

---

## Verdicts

*(none yet — awaiting Builder push of tests/ dir)*
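
**Reference sketch for the 12am/12pm edge cases (break-it check 5):** The conversion I expect the tests to pin down can be written as below. This is a minimal sketch, not the code in `agents.py`: the names `to_24h`, `last_clock_time`, and the regex are hypothetical, and it covers only the 12-hour form, not the 24h format that `_parse_reset_epoch` also accepts.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern for illustration; the real regex in agents.py may differ.
_TIME_RE = re.compile(r"\b(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b", re.IGNORECASE)


def to_24h(hour: int, minute: int, meridiem: str) -> Optional[Tuple[int, int]]:
    """Convert a 12-hour clock reading to (hour, minute) in 24-hour form."""
    if not (1 <= hour <= 12) or not (0 <= minute <= 59):
        return None  # invalid hour or minute
    hour %= 12  # 12am -> 0, 12pm -> 0 (then +12 below)
    if meridiem.lower() == "pm":
        hour += 12
    return hour, minute


def last_clock_time(text: str) -> Optional[Tuple[int, int]]:
    """Return the LAST 12-hour clock time found in text, or None."""
    matches = list(_TIME_RE.finditer(text))
    if not matches:
        return None
    m = matches[-1]
    return to_24h(int(m.group(1)), int(m.group(2) or 0), m.group(3))
```

The `hour %= 12` step is what keeps 12pm at 12:00 and sends 12am to 00:00; a naive "add 12 for pm" would produce the invalid 24:00.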
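
**Reference sketch for the leakage checks (break-it checks 1 and 3):** The before/after comparison of tmux sessions can be expressed as two pure functions over a list of session names. This is an assumed helper for my own verification, not part of the deliverable; the function names are hypothetical, and the input is assumed to be one session name per line.

```python
from typing import Iterable, List

# Live sessions named in this log; they must survive the test run.
LIVE_SESSIONS = ("cc-ci-orchestrator", "cc-ci-watchdog", "cc-ci-assistant3")


def parse_session_names(listing: str) -> List[str]:
    """Split a one-name-per-line session listing into clean names."""
    return [line.strip() for line in listing.splitlines() if line.strip()]


def leaked_sessions(names: Iterable[str], prefix: str = "aotest-") -> List[str]:
    """Sessions with the test prefix that are still present after the run."""
    return [n for n in names if n.startswith(prefix)]


def missing_live_sessions(names: Iterable[str],
                          expected: Iterable[str] = LIVE_SESSIONS) -> List[str]:
    """Live sessions that should exist but are absent from the listing."""
    present = set(names)
    return [n for n in expected if n not in present]
```

A one-name-per-line listing can be produced with `tmux ls -F '#{session_name}'`. Both functions returning an empty list after the smoke run is the pass condition for DoD item 4 on the tmux side; held ports still need the separate `ss -ltn` check.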
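
**Reference sketch for the footer_ui scoping (break-it check 5, last item):** The behaviour I will probe in `_parse_waiting_until` is the difference between scanning only the last non-empty line and scanning the whole pane. The sketch below illustrates that difference under my own assumptions: the names `parse_iso_z` and `find_timestamp` are hypothetical, the timestamp regex is a guess at the shape, and taking the first match in whole-pane mode is an arbitrary choice that the real function may not share.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Hypothetical shape: ISO-8601 UTC timestamp with a Z suffix, seconds optional.
_ISO_Z_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(?::\d{2})?Z")


def parse_iso_z(stamp: str) -> Optional[datetime]:
    """Parse an ISO-8601 'Z' timestamp into an aware UTC datetime, else None."""
    for fmt in ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%dT%H:%MZ"):
        try:
            return datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # try the next accepted format
    return None


def find_timestamp(pane_text: str, footer_only: bool) -> Optional[datetime]:
    """Search the last non-empty line (footer_only) or the whole pane."""
    lines = [ln for ln in pane_text.splitlines() if ln.strip()]
    if not lines:
        return None
    scope = lines[-1] if footer_only else "\n".join(lines)
    match = _ISO_Z_RE.search(scope)
    return parse_iso_z(match.group(0)) if match else None
```

The test case that separates the two branches is a pane where the timestamp sits above the final non-empty line: footer mode must return None while whole-pane mode must return the parsed time.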