From c838c9250dfc621e30af1619cbb10100c00171a1 Mon Sep 17 00:00:00 2001
From: autonomic-bot
Date: Sat, 13 Jun 2026 18:59:11 +0000
Subject: [PATCH] =?UTF-8?q?claim(aotest):=20test=20suite=20pushed=20(deliv?=
 =?UTF-8?q?erable=20cdcece9)=20=E2=80=94=20unit+claude+opencode=20smokes?=
 =?UTF-8?q?=20PASS,=20isolated,=20awaiting=20Adversary?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Unit 51/51 PASS, claude smoke PASS, opencode smoke PASS (own :4097), no leftover aotest-* sessions/ports, cc-ci sessions intact. Cold-verified from /tmp clone inside nix develop. HOW/EXPECTED/WHERE in STATUS-aotest.md.

Co-Authored-By: Claude Opus 4.8
---
 machine-docs/BACKLOG-aotest.md | 15 ++++++-
 machine-docs/JOURNAL-aotest.md | 38 ++++++++++++++++
 machine-docs/STATUS-aotest.md  | 81 +++++++++++++++++++++++++---------
 3 files changed, 113 insertions(+), 21 deletions(-)

diff --git a/machine-docs/BACKLOG-aotest.md b/machine-docs/BACKLOG-aotest.md
index 8dd04bc..f28bedf 100644
--- a/machine-docs/BACKLOG-aotest.md
+++ b/machine-docs/BACKLOG-aotest.md
@@ -1,4 +1,17 @@
-# BACKLOG — phase aotest (Adversary section)
+# BACKLOG — phase aotest
+
+## Build backlog
+
+- [x] Unit tests for: config load + defaults merge, kickoff-template assembly, phase machine
+  (advance/idempotent-complete/append-resumes), limit reset-banner parsing, WAITING-UNTIL/stall
+  parsing, claude+opencode activity detectors. — `tests/test_unit.py` (51 tests)
+- [x] Isolated live claude smoke through the harness (attach + status + down, cleaned up). —
+  `tests/smoke_claude.sh`
+- [x] Isolated live opencode smoke through the harness, dedicated non-4096 port, cleaned up. —
+  `tests/smoke_opencode.sh`
+- [x] Test runner: unit always + live smokes when backends available; README documented. —
+  `tests/run.sh`, README `## Testing`
+- All items complete at deliverable commit `cdcece9`; gate CLAIMED 2026-06-13T18:56Z.
 
 ## Adversary findings
 
diff --git a/machine-docs/JOURNAL-aotest.md b/machine-docs/JOURNAL-aotest.md
index 3f29e4f..30ac98d 100644
--- a/machine-docs/JOURNAL-aotest.md
+++ b/machine-docs/JOURNAL-aotest.md
@@ -12,3 +12,41 @@
 - Builder has not yet pushed any aotest work. Entering polling stance.
 
 Next: poll agent-orchestrator for new commits every ~10 min.
+
+---
+
+## 2026-06-13T18:56Z — (Builder) test suite built, all DoD met, gate CLAIMED
+
+**Approach.** The harness (agents.py) is mostly pure functions with a thin tmux shell-out layer,
+so I split testing into (a) unit tests that exercise the pure logic directly and (b) live smokes
+that drive `agents.py` end-to-end on each real backend.
+
+**Unit tests (`tests/test_unit.py`, stdlib `unittest`, 51 tests).** Each builds a throwaway
+project (config + prompts + machine-docs) in a tempdir and calls the harness functions directly —
+no agents, no live tmux. The one function that *would* spawn sessions, `phase_advance_check`,
+calls module-level `stop_loops`/`start_loops`/`handoff_reset`; I monkeypatch those three to
+recorders so the phase-machine logic (advance, idempotent sequence-complete, append-a-phase
+resumes + clears the stale marker) is covered without launching anything. I also load the shipped
+`agents.example.toml` so an example regression is caught.
+
+- Gotcha: my `BASE_TOML` fixture had `\d+`/`·` regexes; in a normal triple-quoted string those
+  collapse to single backslashes and tomllib rejects the invalid escape. Fixed by making the
+  fixture a raw string (`r"""…"""`) so the on-disk TOML keeps the doubled backslash, like the real
+  `agents.example.toml`.
+
+**Live smokes.** `smoke_claude.sh` / `smoke_opencode.sh` each spin up a throwaway persistent
+"probe" through `agents.py up` in a sandbox with a unique `session_prefix` and temp `log_dir`,
+confirm the session attaches (pane command `claude`/`opencode`), `status` shows RUNNING, `down`
+removes it; a cleanup trap (EXIT INT TERM) kills everything. claude uses the cheap
+`claude-haiku-4-5`. opencode generalizes cc-ci `test-opencode.sh` onto this repo with its own
+server on `:4097` (a guard refuses `4096`).
+
+- Gotcha: the opencode server runs in a subshell `( … serve … ) &`, so `$SERVER_PID` is the
+  subshell, not the listener — killing it left `:4097` held (a DoD-4 leftover-port failure I caught
+  on the first standalone run). Fixed cleanup to also `pkill -f "opencode serve.*--port ${PORT}"`
+  and wait for the port to free. Re-ran: freed.
+
+**Verification.** Cold-cloned to `/tmp/aotest-cold` and ran inside `nix develop` (python311) — the
+Adversary's exact path: `unit=PASS (51) claude=PASS opencode=PASS isolation=PASS`, rc=0; afterwards
+no `aotest-*` sessions, `:4097` free, `cc-ci-orchestrator/watchdog/assistant3` present. Pushed the
+deliverable as `cdcece9`; clean tree; claimed the gate.
diff --git a/machine-docs/STATUS-aotest.md b/machine-docs/STATUS-aotest.md
index 495f35f..e739ca9 100644
--- a/machine-docs/STATUS-aotest.md
+++ b/machine-docs/STATUS-aotest.md
@@ -1,33 +1,74 @@
-# STATUS — phase aotest (Adversary view)
+# STATUS — phase aotest (Builder)
 
 **Phase plan:** `/srv/cc-ci/cc-ci-plan/plan-phase-aotest-verify.md`
-**Adversary clone:** `/srv/cc-ci/cc-ci-adv`
-**Phase start:** 2026-06-13T18:44Z
+**Deliverable repo:** `recipe-maintainers/agent-orchestrator` on `git.autonomic.zone`
+**Builder working clone:** `/home/loops/aoeng/agent-orchestrator` (outside the cc-ci tracked tree)
 
 ---
 
-## Current state: AWAITING BUILDER
+## Gate: aotest CLAIMED, awaiting Adversary
 
-The `agent-orchestrator` repo (commit `289ef07`, v0.1.0) contains NO `tests/` directory yet.
-Builder has not yet pushed the aotest deliverable (tests + runner + README doc).
+The committed test suite is in `tests/` of the deliverable repo. All 5 Definition-of-Done items
+are satisfied; cold-verify per the HOW/EXPECTED/WHERE below.
 
-Last checked: 2026-06-13T18:44Z
+### WHERE (verification inputs)
+- Repo: `https://git.autonomic.zone/recipe-maintainers/agent-orchestrator.git`
+- `main` HEAD → `cdcece9a9ac64b458103194025f2c22ba830ce15` (commit `cdcece9`, on top of `289ef07` v0.1.0)
+- New files: `tests/test_unit.py`, `tests/smoke_claude.sh`, `tests/smoke_opencode.sh`,
+  `tests/run.sh`; README updated (file-map line + a new `## Testing` section).
+- Backends present on this host: `claude` → `/home/loops/.local/bin/claude` (v2.1.177);
+  `opencode` → `/home/loops/.local/bin/opencode`; creds at `/srv/cc-ci/.testenv`.
+
+### HOW to cold-verify (fresh /tmp clone, exactly as the plan specifies)
+```
+cd /tmp && rm -rf aotest-cold
+git clone https://git.autonomic.zone/recipe-maintainers/agent-orchestrator.git aotest-cold
+cd aotest-cold && git rev-parse HEAD  # → cdcece9a9ac6...
+nix develop -c python3 -m unittest discover -s tests  # DoD-1: unit tests
+nix develop -c ./tests/run.sh  # full suite: unit + both smokes + isolation
+```
+Individual smokes (each is also invoked by run.sh):
+```
+nix develop -c bash tests/smoke_claude.sh  # DoD-2
+nix develop -c bash tests/smoke_opencode.sh  # DoD-3 (own server on :4097, ≠ live :4096)
+```
+Post-run isolation check (DoD-4):
+```
+tmux ls | grep '^aotest-'  # EXPECTED: no output (no leftover sessions)
+ss -ltn | grep ':4097 '  # EXPECTED: no output (port freed)
+tmux ls | grep -E 'cc-ci-orchestrator|cc-ci-watchdog|cc-ci-assistant3'  # EXPECTED: all 3 present
+```
+
+### EXPECTED outcomes (from my cold run @2026-06-13T18:55Z on cdcece9, /tmp clone, nix develop)
+- **DoD-1 Unit tests:** `Ran 51 tests` … `OK`, rc=0. Pure logic — no agents spawned, no tmux
+  sessions created. Covers: config load + defaults merge; kickoff-template assembly; phase machine
+  (advance on `## DONE`, idempotent sequence-complete, append-a-phase resumes); limit reset-banner
+  parsing; `WAITING-UNTIL`/stall parsing; claude + opencode activity detectors; the shipped
+  `agents.example.toml` loads.
+- **DoD-2 claude smoke:** `=== CLAUDE BACKEND SMOKE: PASS ===`, rc=0 — probe brought up THROUGH
+  `agents.py` (pane command `claude`), `status` shows it RUNNING, `down` removes it. Isolated
+  prefix `aotest-c--`; trivial probe on `claude-haiku-4-5`.
+- **DoD-3 opencode smoke:** `=== OPENCODE BACKEND SMOKE: PASS ===`, rc=0 — dedicated opencode
+  server on **:4097** (not 4096); probe attaches THROUGH `agents.py` (pane command `opencode`),
+  `status` RUNNING, `down` removes it; cleanup kills the server and waits for the port to free.
+  (SKIPs gracefully with rc=0 if `opencode`/creds are absent — not the case on this host.)
+- **DoD-4 isolation:** runner prints `PASS: no leftover aotest-* tmux sessions` and lists
+  `cc-ci-orchestrator cc-ci-watchdog cc-ci-assistant3` as present; `:4097` free afterwards.
+- **DoD-5 committed + documented:** the four `tests/` files are committed at `cdcece9`; README
+  `## Testing` section documents `nix develop -c ./tests/run.sh` and what each layer covers.
+- **Runner summary line:** `SUMMARY: unit=PASS claude=PASS opencode=PASS isolation=PASS` →
+  `ALL RUN TESTS PASSED (skips are OK)`, rc=0.
+
+Working tree of the deliverable clone is clean and pushed.
 
 ---
 
 ## Gate status
 
-| Gate | Status | Last checked |
+| Gate | Status | Claimed |
 |---|---|---|
-| Unit tests PASS (clean /tmp, nix develop) | PENDING | — |
-| Claude smoke test PASSES via harness | PENDING | — |
-| opencode smoke test PASSES or SKIPS (justified) | PENDING | — |
-| No leftover aotest-* sessions or ports | PENDING | — |
-| Test suite + runner committed + documented | PENDING | — |
-
----
-
-## Monitoring
-
-Polling agent-orchestrator for new commits (tests/ dir push).
-No gate formally claimed yet.
+| DoD-1 Unit tests PASS (clean /tmp, nix develop) | CLAIMED | 2026-06-13T18:56Z |
+| DoD-2 Claude smoke PASSES via harness | CLAIMED | 2026-06-13T18:56Z |
+| DoD-3 opencode smoke PASSES (dedicated port) | CLAIMED | 2026-06-13T18:56Z |
+| DoD-4 No leftover aotest-* sessions/ports; cc-ci intact | CLAIMED | 2026-06-13T18:56Z |
+| DoD-5 Test suite + runner committed + documented | CLAIMED | 2026-06-13T18:56Z |