claim(aotest): test suite pushed (deliverable cdcece9) — unit+claude+opencode smokes PASS, isolated, awaiting Adversary
Some checks failed
continuous-integration/drone/push Build is failing
Unit 51/51 PASS, claude smoke PASS, opencode smoke PASS (own :4097), no leftover aotest-* sessions/ports, cc-ci sessions intact. Cold-verified from /tmp clone inside nix develop. HOW/EXPECTED/WHERE in STATUS-aotest.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -12,3 +12,41 @@
- Builder has not yet pushed any aotest work. Entering polling stance.

Next: poll agent-orchestrator for new commits every ~10 min.

---

## 2026-06-13T18:56Z — (Builder) test suite built, all DoD met, gate CLAIMED
**Approach.** The harness (`agents.py`) is mostly pure functions with a thin tmux shell-out layer,
so I split testing into (a) unit tests that exercise the pure logic directly and (b) live smokes
that drive `agents.py` end-to-end on each real backend.

**Unit tests (`tests/test_unit.py`, stdlib `unittest`, 51 tests).** Each test builds a throwaway
project (config + prompts + machine-docs) in a tempdir and calls the harness functions directly,
with no agents and no live tmux. The one function that *would* spawn sessions, `phase_advance_check`,
calls module-level `stop_loops`/`start_loops`/`handoff_reset`; I monkeypatch those three to
recorders so the phase-machine logic (advance, idempotent sequence-complete, and appending a phase,
which resumes and clears the stale marker) is covered without launching anything. I also load the
shipped `agents.example.toml` so that a regression in the example config is caught.

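The recorder approach described above can be sketched roughly as follows. This is an illustration only, not the contents of `tests/test_unit.py`: `harness` is a hypothetical stand-in module, and the toy `phase_advance_check` signature and the phase names are assumptions. Only the three patched function names come from the text.

```python
import types
import unittest
from unittest import mock

# Hypothetical stand-in for agents.py; the real module's internals are not shown here.
harness = types.ModuleType("harness_standin")


def _would_spawn(*args):
    raise RuntimeError("would shell out to tmux")


harness.stop_loops = _would_spawn
harness.start_loops = _would_spawn
harness.handoff_reset = _would_spawn


def _phase_advance_check(phases, current, done):
    """Toy phase machine (assumed signature): return the phase to run next."""
    if not done:
        return current
    idx = phases.index(current)
    if idx + 1 >= len(phases):
        return current  # sequence complete: no side effects, so repeat calls are safe
    # Looked up on the module at call time, which is what makes patching work.
    harness.stop_loops(current)
    harness.handoff_reset(current)
    harness.start_loops(phases[idx + 1])
    return phases[idx + 1]


harness.phase_advance_check = _phase_advance_check


class PhaseAdvanceTest(unittest.TestCase):
    def setUp(self):
        self.calls = []
        for name in ("stop_loops", "start_loops", "handoff_reset"):
            patcher = mock.patch.object(harness, name, self._recorder(name))
            patcher.start()
            self.addCleanup(patcher.stop)  # restore the originals after each test

    def _recorder(self, name):
        def record(*args):
            self.calls.append((name, args))
        return record

    def test_advance_calls_stop_reset_start(self):
        nxt = harness.phase_advance_check(["build", "verify"], "build", done=True)
        self.assertEqual(nxt, "verify")
        self.assertEqual(
            self.calls,
            [("stop_loops", ("build",)),
             ("handoff_reset", ("build",)),
             ("start_loops", ("verify",))],
        )

    def test_sequence_complete_is_idempotent(self):
        for _ in range(2):
            nxt = harness.phase_advance_check(["build", "verify"], "verify", done=True)
            self.assertEqual(nxt, "verify")
        self.assertEqual(self.calls, [])
```

Saved as a test module, this runs under `python -m unittest`. Patching works here because the function under test resolves the three names on the module at call time rather than holding direct references to them.
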
- Gotcha: my `BASE_TOML` fixture contains regexes such as `\d+`/`·`, written with doubled
  backslashes as TOML requires; in a normal triple-quoted Python string the doubled backslashes
  collapse to single ones and tomllib rejects the resulting invalid escape. Fixed by making the
  fixture a raw string (`r"""…"""`) so the on-disk TOML keeps the doubled backslash, like the real
  `agents.example.toml`.

**Live smokes.** `smoke_claude.sh` / `smoke_opencode.sh` each spin up a throwaway persistent
"probe" through `agents.py up` in a sandbox with a unique `session_prefix` and a temp `log_dir`,
then confirm that the session attaches (pane command `claude`/`opencode`), that `status` shows
RUNNING, and that `down` removes it; a cleanup trap (EXIT INT TERM) kills everything. The claude
smoke uses the cheap `claude-haiku-4-5`. The opencode smoke generalizes cc-ci `test-opencode.sh`
onto this repo with its own server on `:4097` (a guard refuses `4096`).

- Gotcha: the opencode server runs in a subshell `( … serve … ) &`, so `$SERVER_PID` is the
  subshell, not the listener; killing it left `:4097` held (a DoD-4 leftover-port failure I caught
  on the first standalone run). Fixed cleanup to also `pkill -f "opencode serve.*--port ${PORT}"`
  and wait for the port to free. Re-ran: the port was freed.

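The "wait for the port to free" step and the `4096` guard can be sketched in Python as below. The actual smokes are shell scripts, so this is an analogous illustration rather than their code; all function names, the timeout values, and the choice of a TCP connect probe are assumptions.

```python
import socket
import time


def check_port_allowed(port, reserved=(4096,)):
    """Refuse ports reserved for other servers (hypothetical guard)."""
    if port in reserved:
        raise ValueError(f"port {port} is reserved; pick another one")
    return port


def port_is_free(port, host="127.0.0.1"):
    """True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) != 0


def wait_for_port_free(port, timeout=5.0, interval=0.1):
    """Poll until the port stops accepting connections or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if port_is_free(port):
            return True
        time.sleep(interval)
    return port_is_free(port)
```

A cleanup routine would kill the listener first and then call `wait_for_port_free(4097)`, treating a `False` result as a leftover-port failure.
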
**Verification.** Cold-cloned to `/tmp/aotest-cold` and ran inside `nix develop` (python311), which
is the Adversary's exact path: `unit=PASS (51) claude=PASS opencode=PASS isolation=PASS`, rc=0.
Afterwards there were no `aotest-*` sessions, `:4097` was free, and
`cc-ci-orchestrator/watchdog/assistant3` were still present. Pushed the deliverable as `cdcece9`;
clean tree; claimed the gate.

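The leftover-session part of the isolation check could look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, the session names in the usage note are examples, and the only external command used is `tmux list-sessions -F '#{session_name}'`.

```python
import subprocess


def leftover_sessions(session_names, prefix="aotest-"):
    """Return the sessions that still carry the test prefix, sorted."""
    return sorted(name for name in session_names if name.startswith(prefix))


def list_tmux_sessions():
    """List tmux session names; empty when no tmux server is running."""
    proc = subprocess.run(
        ["tmux", "list-sessions", "-F", "#{session_name}"],
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:
        return []  # tmux exits non-zero when there is no server
    return proc.stdout.splitlines()
```

For example, `leftover_sessions(["aotest-12345-probe", "cc-ci-orchestrator"])` flags only the `aotest-` session, so unrelated sessions are never counted as leftovers. `list_tmux_sessions()` requires tmux on `PATH`.
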