Files
cc-ci/machine-docs/JOURNAL-aotest.md
autonomic-bot 034e85d786
Some checks failed
continuous-integration/drone/push Build is failing
chore(aotest): Adversary JOURNAL — all DoD PASS, phase complete
2026-06-13 19:02:32 +00:00

4.2 KiB

JOURNAL — phase aotest (Adversary)


2026-06-13T18:44Z — Phase orientation + initial files created

  • Read plan-phase-aotest-verify.md: mission is to verify agent-orchestrator has a committed tests/ dir covering unit tests + isolated live smoke tests on both claude and opencode backends.
  • Checked agent-orchestrator repo: current state is v0.1.0 (commit 289ef07), no tests/ dir.
  • Created phase-namespaced files: STATUS-aotest.md, REVIEW-aotest.md, BACKLOG-aotest.md, JOURNAL-aotest.md.
  • Builder has not yet pushed any aotest work. Entering polling stance.

Next: poll agent-orchestrator for new commits every ~10 min.


2026-06-13T18:56Z — (Builder) test suite built, all DoD met, gate CLAIMED

Approach. The harness (agents.py) is mostly pure functions with a thin tmux shell-out layer, so I split testing into (a) unit tests that exercise the pure logic directly and (b) live smokes that drive agents.py end-to-end on each real backend.

Unit tests (tests/test_unit.py, stdlib unittest, 51 tests). Each builds a throwaway project (config + prompts + machine-docs) in a tempdir and calls the harness functions directly — no agents, no live tmux. The one function that would spawn sessions, phase_advance_check, calls module-level stop_loops/start_loops/handoff_reset; I monkeypatch those three to recorders so the phase-machine logic (advance, idempotent sequence-complete, append-a-phase resumes + clears the stale marker) is covered without launching anything. I also load the shipped agents.example.toml so an example regression is caught.

  • Gotcha: my BASE_TOML fixture had \d+/· regexes; in a normal triple-quoted string those collapse to single backslashes and tomllib rejects the invalid escape. Fixed by making the fixture a raw string (r"""…""") so the on-disk TOML keeps the doubled backslash, like the real agents.example.toml.

Live smokes. smoke_claude.sh / smoke_opencode.sh each spin up a throwaway persistent "probe" through agents.py up in a sandbox with a unique session_prefix and temp log_dir, confirm the session attaches (pane command claude/opencode), status shows RUNNING, down removes it; a cleanup trap (EXIT INT TERM) kills everything. claude uses the cheap claude-haiku-4-5. opencode generalizes cc-ci test-opencode.sh onto this repo with its own server on :4097 (a guard refuses 4096).

  • Gotcha: the opencode server runs in a subshell ( … serve … ) &, so $SERVER_PID is the subshell, not the listener — killing it left :4097 held (a DoD-4 leftover-port failure I caught on the first standalone run). Fixed cleanup to also pkill -f "opencode serve.*--port ${PORT}" and wait for the port to free. Re-ran: freed.

Verification. Cold-cloned to /tmp/aotest-cold and ran inside nix develop (python311) — the Adversary's exact path: unit=PASS (51) claude=PASS opencode=PASS isolation=PASS, rc=0; afterwards no aotest-* sessions, :4097 free, cc-ci-orchestrator/watchdog/assistant3 present. Pushed the deliverable as cdcece9; clean tree; claimed the gate.


2026-06-13T19:00Z — Adversary cold verification COMPLETE — ALL DoD PASS

Independent cold verification from /tmp/ao-adv-check clone (cloned before reading Builder STATUS):

  • DoD-1 Unit tests: Ran 51 testsOK, rc=0 inside nix develop
  • DoD-2 claude smoke: === CLAUDE BACKEND SMOKE: PASS === — isolated prefix aotest-c-681472-, pane command claude, TUI alive, status RUNNING, down cleans up ✓
  • DoD-3 opencode smoke: === OPENCODE BACKEND SMOKE: PASS === — dedicated port :4097 (not 4096), isolated prefix aotest-o-681566-, TUI attached, status RUNNING, down cleans up + port freed ✓
  • DoD-4 Isolation: no aotest-* sessions; port 4097 free; cc-ci-orchestrator/watchdog/assistant3 all present ✓
  • DoD-5 Committed + documented: tests/ in commit cdcece9, README ## Testing section covers invocation, layers, env vars, skip conditions, and safety ✓
  • Full suite via run.sh: SUMMARY: unit=PASS claude=PASS opencode=PASS isolation=PASS — rc=0 ✓

Verdict written to REVIEW-aotest.md. Committed with review(aotest) prefix → watchdog pings Builder. Phase aotest DONE (Adversary side). Awaiting Builder to write ## DONE to STATUS-aotest.md.