cc-ci/machine-docs/JOURNAL-aotest.md
autonomic-bot c838c9250d
claim(aotest): test suite pushed (deliverable cdcece9) — unit+claude+opencode smokes PASS, isolated, awaiting Adversary
Unit 51/51 PASS, claude smoke PASS, opencode smoke PASS (own :4097), no
leftover aotest-* sessions/ports, cc-ci sessions intact. Cold-verified from
/tmp clone inside nix develop. HOW/EXPECTED/WHERE in STATUS-aotest.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 18:59:11 +00:00


JOURNAL — phase aotest (Adversary)


2026-06-13T18:44Z — Phase orientation + initial files created

  • Read plan-phase-aotest-verify.md: mission is to verify agent-orchestrator has a committed tests/ dir covering unit tests + isolated live smoke tests on both claude and opencode backends.
  • Checked agent-orchestrator repo: current state is v0.1.0 (commit 289ef07), no tests/ dir.
  • Created phase-namespaced files: STATUS-aotest.md, REVIEW-aotest.md, BACKLOG-aotest.md, JOURNAL-aotest.md.
  • Builder has not yet pushed any aotest work. Entering polling stance.

Next: poll agent-orchestrator for new commits every ~10 min.


2026-06-13T18:56Z — (Builder) test suite built, all DoD met, gate CLAIMED

Approach. The harness (agents.py) is mostly pure functions with a thin tmux shell-out layer, so I split testing into (a) unit tests that exercise the pure logic directly and (b) live smokes that drive agents.py end-to-end on each real backend.

Unit tests (tests/test_unit.py, stdlib unittest, 51 tests). Each builds a throwaway project (config + prompts + machine-docs) in a tempdir and calls the harness functions directly, with no agents and no live tmux. The one function that would spawn sessions, phase_advance_check, calls module-level stop_loops/start_loops/handoff_reset; I monkeypatch those three with recorders so the phase-machine logic (advance, idempotent sequence-complete, append-a-phase resumes + clears the stale marker) is covered without launching anything. I also load the shipped agents.example.toml so that a regression in the example config is caught.
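A minimal sketch of the recorder technique described above. The `harness` namespace and the body of `phase_advance_check` are hypothetical stand-ins (the real agents.py logic and call order are not shown in this journal); only the three patched names come from the entry.

```python
import types
from unittest import mock

# Hypothetical stand-in for the agents.py module; not the real harness.
harness = types.SimpleNamespace()


def _would_spawn(*args, **kwargs):
    # In the real harness these would shell out to tmux.
    raise RuntimeError("this would touch live tmux")


harness.stop_loops = _would_spawn
harness.start_loops = _would_spawn
harness.handoff_reset = _would_spawn


def phase_advance_check(phase_done):
    """Toy phase machine (assumed order): act only when the phase is done."""
    if not phase_done:
        return False
    harness.stop_loops()
    harness.handoff_reset()
    harness.start_loops()
    return True


harness.phase_advance_check = phase_advance_check


def run_with_recorders(phase_done):
    """Patch the three session-spawning hooks with recorders, then call."""
    calls = []

    def recorder(name):
        return lambda *args, **kwargs: calls.append(name)

    with mock.patch.object(harness, "stop_loops", recorder("stop_loops")), \
         mock.patch.object(harness, "handoff_reset", recorder("handoff_reset")), \
         mock.patch.object(harness, "start_loops", recorder("start_loops")):
        advanced = harness.phase_advance_check(phase_done)
    return advanced, calls
```

Because the hooks are looked up on the namespace at call time, patching the attribute is enough; the originals are restored when the `with` block exits, so one test cannot leak recorders into the next.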

  • Gotcha: my BASE_TOML fixture had \d+/· regexes written with doubled backslashes; in a normal triple-quoted Python string those collapse to single backslashes, and tomllib rejects the resulting invalid escape. Fixed by making the fixture a raw string (r"""…""") so the on-disk TOML keeps the doubled backslash, like the real agents.example.toml.

Live smokes. smoke_claude.sh and smoke_opencode.sh each start a throwaway persistent "probe" through agents.py up in a sandbox with a unique session_prefix and a temp log_dir. Each confirms that the session attaches (pane command claude/opencode), that status shows RUNNING, and that down removes it; a cleanup trap (EXIT INT TERM) kills everything. The claude smoke uses the cheap claude-haiku-4-5. The opencode smoke generalizes the cc-ci test-opencode.sh onto this repo with its own server on :4097 (a guard refuses 4096).

  • Gotcha: the opencode server runs in a subshell ( … serve … ) &, so $SERVER_PID is the subshell's PID, not the listener's. Killing it left :4097 held (a DoD-4 leftover-port failure I caught on the first standalone run). Fixed cleanup to also pkill -f "opencode serve.*--port ${PORT}" and wait for the port to free. Re-ran: port freed.
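The smoke scripts themselves are shell, and the actual fix is the pkill shown above. As a rough illustration of the two checks involved (the port guard and waiting for the port to free), here is a Python sketch; every function name in it is hypothetical, and only the port number 4096 comes from this entry.

```python
import socket
import time

RESERVED_PORT = 4096  # per the journal, the guard refuses this port


def ensure_not_reserved(port):
    """Refuse to run the smoke on the reserved port."""
    if port == RESERVED_PORT:
        raise ValueError(f"refusing to use reserved port {RESERVED_PORT}")
    return port


def port_in_use(port, host="127.0.0.1"):
    """True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def wait_for_port_free(port, timeout=5.0, interval=0.1):
    """Poll until nothing listens on the port; False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not port_in_use(port):
            return True
        time.sleep(interval)
    return not port_in_use(port)
```

Checking the listener directly, instead of trusting that a recorded PID has exited, is what catches the subshell case: the subshell can be gone while the real server still holds the port.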

Verification. Cold-cloned to /tmp/aotest-cold and ran inside nix develop (python311) — the Adversary's exact path: unit=PASS (51) claude=PASS opencode=PASS isolation=PASS, rc=0; afterwards no aotest-* sessions, :4097 free, cc-ci-orchestrator/watchdog/assistant3 present. Pushed the deliverable as cdcece9; clean tree; claimed the gate.