Unit 51/51 PASS, claude smoke PASS, opencode smoke PASS (own :4097), no leftover aotest-* sessions/ports, cc-ci sessions intact. Cold-verified from /tmp clone inside nix develop. HOW/EXPECTED/WHERE in STATUS-aotest.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
53 lines
3.1 KiB
Markdown
# JOURNAL — phase aotest (Adversary)

---

## 2026-06-13T18:44Z — Phase orientation + initial files created

- Read plan-phase-aotest-verify.md: mission is to verify agent-orchestrator has a committed
  tests/ dir covering unit tests + isolated live smoke tests on both claude and opencode backends.
- Checked agent-orchestrator repo: current state is v0.1.0 (commit 289ef07), no tests/ dir.
- Created phase-namespaced files: STATUS-aotest.md, REVIEW-aotest.md, BACKLOG-aotest.md,
  JOURNAL-aotest.md.
- Builder has not yet pushed any aotest work. Entering polling stance.

Next: poll agent-orchestrator for new commits every ~10 min.

---

## 2026-06-13T18:56Z — (Builder) test suite built, all DoD met, gate CLAIMED

**Approach.** The harness (`agents.py`) is mostly pure functions with a thin tmux shell-out layer,
so I split testing into (a) unit tests that exercise the pure logic directly and (b) live smokes
that drive `agents.py` end-to-end on each real backend.

**Unit tests (`tests/test_unit.py`, stdlib `unittest`, 51 tests).** Each test builds a throwaway
project (config + prompts + machine-docs) in a tempdir and calls the harness functions directly:
no agents, no live tmux. The one function that *would* spawn sessions, `phase_advance_check`,
calls module-level `stop_loops`/`start_loops`/`handoff_reset`; I monkeypatch those three to
recorders so the phase-machine logic (advance, idempotent sequence-complete, and appending a
phase, which resumes and clears the stale marker) is covered without launching anything. I also
load the shipped `agents.example.toml` so a regression in the example config is caught.
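
The recorder technique can be sketched as follows. This is a minimal illustration, not the code in `tests/test_unit.py`: the three stub functions, the toy `phase_advance_check` body, its signature, and the helper `run_with_recorders` are all assumptions made for the example. Only the function names come from the harness.

```python
from unittest import mock


# Hypothetical stand-ins: the real agents.py is not reproduced here. They raise
# so that any call that escapes the patch is obvious.
def stop_loops(phase):
    raise RuntimeError("would touch tmux")


def start_loops(phase):
    raise RuntimeError("would touch tmux")


def handoff_reset(phase):
    raise RuntimeError("would touch tmux")


def phase_advance_check(phases, current, done):
    """Toy phase machine (assumed shape): advance when the current phase is done."""
    if not done:
        return current
    idx = phases.index(current)
    if idx + 1 >= len(phases):
        return current  # sequence complete: repeated calls change nothing
    nxt = phases[idx + 1]
    stop_loops(current)
    handoff_reset(nxt)
    start_loops(nxt)
    return nxt


def run_with_recorders(phases, current, done):
    """Call phase_advance_check with the three side-effecting names replaced by recorders."""
    calls = []

    def recorder(name):
        return lambda phase: calls.append((name, phase))

    names = ("stop_loops", "start_loops", "handoff_reset")
    patched = {name: recorder(name) for name in names}
    # patch.dict swaps the module-level names and restores them on exit.
    with mock.patch.dict(phase_advance_check.__globals__, patched):
        result = phase_advance_check(phases, current, done)
    return result, calls
```

In a test module that imports the harness as a module, `mock.patch.object(module, "stop_loops", recorder)` would likely be the more conventional spelling; patching the function's globals is used here only to keep the sketch in one file.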

- Gotcha: my `BASE_TOML` fixture had `\d+`/`·` regexes, written with the doubled backslashes
  that TOML basic strings require; in a normal (non-raw) triple-quoted Python string those
  collapse to single backslashes and tomllib rejects the invalid escape. Fixed by making the
  fixture a raw string (`r"""…"""`) so the on-disk TOML keeps the doubled backslash, like the real
  `agents.example.toml`.

**Live smokes.** `smoke_claude.sh` / `smoke_opencode.sh` each spin up a throwaway persistent
"probe" through `agents.py up` in a sandbox with a unique `session_prefix` and temp `log_dir`,
then confirm that the session attaches (pane command `claude`/`opencode`), that `status` shows
RUNNING, and that `down` removes it; a cleanup trap (EXIT INT TERM) kills everything. The claude
smoke uses the cheap `claude-haiku-4-5`. The opencode smoke generalizes cc-ci's
`test-opencode.sh` to this repo and runs its own server on `:4097` (a guard refuses `4096`).

- Gotcha: the opencode server runs in a subshell `( … serve … ) &`, so `$SERVER_PID` is the
  subshell, not the listener; killing it left `:4097` held (a DoD-4 leftover-port failure I caught
  on the first standalone run). Fixed cleanup to also `pkill -f "opencode serve.*--port ${PORT}"`
  and wait for the port to free. Re-ran: freed.
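
The "wait for the port to free" step is done in shell in the smoke script; the same check can be sketched in Python as below. The function names, the loopback host, and the timeout values are assumptions for illustration and do not mirror the script's actual implementation.

```python
import socket
import time


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def wait_for_port_free(port, timeout=5.0, interval=0.1):
    """Poll until no listener answers on the port; return False if it is still held."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not port_in_use(port):
            return True
        time.sleep(interval)
    return not port_in_use(port)


if __name__ == "__main__":
    # 4097 is the port named in the journal entry above.
    print("free" if wait_for_port_free(4097, timeout=2.0) else "still held")
```

Checking the port rather than the process is what catches the subshell case: the recorded PID can be gone while the real listener keeps the port.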

**Verification.** Cold-cloned to `/tmp/aotest-cold` and ran inside `nix develop` (python311), the
Adversary's exact path: `unit=PASS (51) claude=PASS opencode=PASS isolation=PASS`, rc=0. Afterwards
there were no `aotest-*` sessions, `:4097` was free, and
`cc-ci-orchestrator/watchdog/assistant3` were still present. Pushed the deliverable as `cdcece9`;
clean tree; claimed the gate.