claim(aotest): test suite pushed (deliverable cdcece9) — unit+claude+opencode smokes PASS, isolated, awaiting Adversary

Unit 51/51 PASS, claude smoke PASS, opencode smoke PASS (own :4097), no
leftover aotest-* sessions/ports, cc-ci sessions intact. Cold-verified from
/tmp clone inside nix develop. HOW/EXPECTED/WHERE in STATUS-aotest.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
autonomic-bot
2026-06-13 18:59:11 +00:00
parent 1c15cbb934
commit c838c9250d
3 changed files with 113 additions and 21 deletions


@@ -1,4 +1,17 @@
# BACKLOG — phase aotest (Adversary section)
# BACKLOG — phase aotest
## Build backlog
- [x] Unit tests for: config load + defaults merge, kickoff-template assembly, phase machine
(advance/idempotent-complete/append-resumes), limit reset-banner parsing, WAITING-UNTIL/stall
parsing, claude+opencode activity detectors. — `tests/test_unit.py` (51 tests)
- [x] Isolated live claude smoke through the harness (attach + status + down, cleaned up). —
`tests/smoke_claude.sh`
- [x] Isolated live opencode smoke through the harness, dedicated non-4096 port, cleaned up. —
`tests/smoke_opencode.sh`
- [x] Test runner: unit always + live smokes when backends available; README documented. —
`tests/run.sh`, README `## Testing`
- All items complete at deliverable commit `cdcece9`; gate CLAIMED 2026-06-13T18:56Z.
## Adversary findings


@@ -12,3 +12,41 @@
- Builder has not yet pushed any aotest work. Entering polling stance.
Next: poll agent-orchestrator for new commits every ~10 min.
---
## 2026-06-13T18:56Z — (Builder) test suite built, all DoD met, gate CLAIMED
**Approach.** The harness (agents.py) is mostly pure functions with a thin tmux shell-out layer,
so I split testing into (a) unit tests that exercise the pure logic directly and (b) live smokes
that drive `agents.py` end-to-end on each real backend.
**Unit tests (`tests/test_unit.py`, stdlib `unittest`, 51 tests).** Each builds a throwaway
project (config + prompts + machine-docs) in a tempdir and calls the harness functions directly —
no agents, no live tmux. The one function that *would* spawn sessions, `phase_advance_check`,
calls module-level `stop_loops`/`start_loops`/`handoff_reset`; I monkeypatch those three to
recorders so the phase-machine logic (advance, idempotent sequence-complete, append-a-phase
resumes + clears the stale marker) is covered without launching anything. I also load the shipped
`agents.example.toml` so an example regression is caught.
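The recorder approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual test code: `fake_agents` is a hypothetical stand-in module, and the signature and body of its `phase_advance_check` are assumptions. Only the three patched names (`stop_loops`, `start_loops`, `handoff_reset`) come from the entry above.

```python
import types
from unittest import mock

# Hypothetical stand-in for the harness module (the real agents.py is not shown here).
fake_agents = types.ModuleType("fake_agents")


def _would_touch_tmux(*args, **kwargs):
    raise RuntimeError("real implementation would shell out to tmux")


fake_agents.stop_loops = _would_touch_tmux
fake_agents.start_loops = _would_touch_tmux
fake_agents.handoff_reset = _would_touch_tmux


def phase_advance_check(phase_done):
    """Assumed shape: when the phase is done, stop loops, reset, restart."""
    if not phase_done:
        return False
    # Attributes are looked up on the module at call time, so patches take effect.
    fake_agents.stop_loops()
    fake_agents.handoff_reset()
    fake_agents.start_loops()
    return True


fake_agents.phase_advance_check = phase_advance_check


def run_with_recorders(phase_done):
    """Run phase_advance_check with the three side-effecting functions replaced by recorders."""
    calls = []
    with mock.patch.object(fake_agents, "stop_loops", lambda: calls.append("stop_loops")), \
         mock.patch.object(fake_agents, "handoff_reset", lambda: calls.append("handoff_reset")), \
         mock.patch.object(fake_agents, "start_loops", lambda: calls.append("start_loops")):
        advanced = fake_agents.phase_advance_check(phase_done)
    return advanced, calls
```

Because the patched names are resolved through the module object at call time, the recorders replace the tmux-touching functions for the duration of the `with` block and the originals are restored afterwards.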
- Gotcha: my `BASE_TOML` fixture had `\d+`/`·` regexes written with doubled backslashes; in a
normal (non-raw) triple-quoted Python string the doubled backslashes collapse to single ones, and
tomllib rejects the resulting invalid escape. Fixed by making the fixture a raw string
(`r"""…"""`) so the on-disk TOML keeps the doubled backslash, like the real `agents.example.toml`.
**Live smokes.** `smoke_claude.sh` / `smoke_opencode.sh` each spin up a throwaway persistent
"probe" through `agents.py up` in a sandbox with a unique `session_prefix` and temp `log_dir`,
confirm the session attaches (pane command `claude`/`opencode`), `status` shows RUNNING, `down`
removes it; a cleanup trap (EXIT INT TERM) kills everything. The claude smoke uses the cheap
`claude-haiku-4-5` model. The opencode smoke adapts cc-ci's `test-opencode.sh` to this repo, with
its own server on `:4097` (a guard refuses `4096`).
- Gotcha: the opencode server runs in a subshell `( … serve … ) &`, so `$SERVER_PID` is the
subshell, not the listener — killing it left `:4097` held (a DoD-4 leftover-port failure I caught
on the first standalone run). Fixed cleanup to also `pkill -f "opencode serve.*--port ${PORT}"`
and wait for the port to free. Re-ran: freed.
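The "wait for the port to free" step is done in bash in the smoke script, which is not reproduced here. As a rough Python sketch of the same idea, assuming a TCP listener on localhost (the function names, timeout, and polling interval are illustrative choices, not the script's):

```python
import socket
import time


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def wait_for_port_free(port, timeout=10.0, interval=0.2):
    """Poll until nothing listens on the port; return False if it is still held at timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not port_in_use(port):
            return True
        time.sleep(interval)
    return not port_in_use(port)
```

Polling the port itself, rather than trusting that a kill succeeded, is what catches the case described above, where the signalled PID was the subshell and the real listener survived.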
**Verification.** Cold-cloned to `/tmp/aotest-cold` and ran inside `nix develop` (python311) — the
Adversary's exact path: `unit=PASS (51) claude=PASS opencode=PASS isolation=PASS`, rc=0; afterwards
no `aotest-*` sessions, `:4097` free, `cc-ci-orchestrator/watchdog/assistant3` present. Pushed the
deliverable as `cdcece9`; clean tree; claimed the gate.


@@ -1,33 +1,74 @@
# STATUS — phase aotest (Adversary view)
# STATUS — phase aotest (Builder)
**Phase plan:** `/srv/cc-ci/cc-ci-plan/plan-phase-aotest-verify.md`
**Adversary clone:** `/srv/cc-ci/cc-ci-adv`
**Phase start:** 2026-06-13T18:44Z
**Deliverable repo:** `recipe-maintainers/agent-orchestrator` on `git.autonomic.zone`
**Builder working clone:** `/home/loops/aoeng/agent-orchestrator` (outside the cc-ci tracked tree)
---
## Current state: AWAITING BUILDER
## Gate: aotest CLAIMED, awaiting Adversary
The `agent-orchestrator` repo (commit `289ef07`, v0.1.0) contains NO `tests/` directory yet.
Builder has not yet pushed the aotest deliverable (tests + runner + README doc).
The committed test suite is in `tests/` of the deliverable repo. All 5 Definition-of-Done items
are satisfied; cold-verify per the HOW/EXPECTED/WHERE below.
Last checked: 2026-06-13T18:44Z
### WHERE (verification inputs)
- Repo: `https://git.autonomic.zone/recipe-maintainers/agent-orchestrator.git`
- `main` HEAD → `cdcece9a9ac64b458103194025f2c22ba830ce15` (commit `cdcece9`, on top of `289ef07` v0.1.0)
- New files: `tests/test_unit.py`, `tests/smoke_claude.sh`, `tests/smoke_opencode.sh`,
`tests/run.sh`; README updated (file-map line + a new `## Testing` section).
- Backends present on this host: `claude` → `/home/loops/.local/bin/claude` (v2.1.177);
`opencode` → `/home/loops/.local/bin/opencode`; creds at `/srv/cc-ci/.testenv`.
### HOW to cold-verify (fresh /tmp clone, exactly as the plan specifies)
```
cd /tmp && rm -rf aotest-cold
git clone https://git.autonomic.zone/recipe-maintainers/agent-orchestrator.git aotest-cold
cd aotest-cold && git rev-parse HEAD # → cdcece9a9ac6...
nix develop -c python3 -m unittest discover -s tests # DoD-1: unit tests
nix develop -c ./tests/run.sh # full suite: unit + both smokes + isolation
```
Individual smokes (each is also invoked by run.sh):
```
nix develop -c bash tests/smoke_claude.sh # DoD-2
nix develop -c bash tests/smoke_opencode.sh # DoD-3 (own server on :4097, ≠ live :4096)
```
Post-run isolation check (DoD-4):
```
tmux ls | grep '^aotest-' # EXPECTED: no output (no leftover sessions)
ss -ltn | grep ':4097 ' # EXPECTED: no output (port freed)
tmux ls | grep -E 'cc-ci-orchestrator|cc-ci-watchdog|cc-ci-assistant3' # EXPECTED: all 3 present
```
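The same isolation check could be scripted over captured `tmux ls` output. The following is a small sketch under the assumption that each line starts with the session name followed by a colon (the default `tmux ls` format); the helper names and the sample text in the usage note are hypothetical. Unlike the `grep` above, it matches session names exactly rather than by substring.

```python
def session_names(tmux_ls_output):
    """Extract session names from default-format `tmux ls` output."""
    return [
        line.split(":", 1)[0]
        for line in tmux_ls_output.splitlines()
        if line.strip()
    ]


def leftover_sessions(tmux_ls_output, prefix="aotest-"):
    """Sessions that should have been cleaned up (expected: empty list)."""
    return [name for name in session_names(tmux_ls_output) if name.startswith(prefix)]


def missing_sessions(tmux_ls_output, required):
    """Required sessions that are absent (expected: empty list)."""
    present = set(session_names(tmux_ls_output))
    return [name for name in required if name not in present]
```

For example, feeding it the output of `tmux ls` after a run, `leftover_sessions(output)` should be `[]` and `missing_sessions(output, ["cc-ci-orchestrator", "cc-ci-watchdog", "cc-ci-assistant3"])` should be `[]`.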
### EXPECTED outcomes (from my cold run @2026-06-13T18:55Z on cdcece9, /tmp clone, nix develop)
- **DoD-1 Unit tests:** `Ran 51 tests` → `OK`, rc=0. Pure logic: no agents spawned, no tmux
sessions created. Covers: config load + defaults merge; kickoff-template assembly; phase machine
(advance on `## DONE`, idempotent sequence-complete, append-a-phase resumes); limit reset-banner
parsing; `WAITING-UNTIL`/stall parsing; claude + opencode activity detectors; the shipped
`agents.example.toml` loads.
- **DoD-2 claude smoke:** `=== CLAUDE BACKEND SMOKE: PASS ===`, rc=0 — probe brought up THROUGH
`agents.py` (pane command `claude`), `status` shows it RUNNING, `down` removes it. Isolated
prefix `aotest-c-<pid>-`; trivial probe on `claude-haiku-4-5`.
- **DoD-3 opencode smoke:** `=== OPENCODE BACKEND SMOKE: PASS ===`, rc=0 — dedicated opencode
server on **:4097** (not 4096); probe attaches THROUGH `agents.py` (pane command `opencode`),
`status` RUNNING, `down` removes it; cleanup kills the server and waits for the port to free.
(SKIPs gracefully with rc=0 if `opencode`/creds are absent — not the case on this host.)
- **DoD-4 isolation:** runner prints `PASS: no leftover aotest-* tmux sessions` and lists
`cc-ci-orchestrator cc-ci-watchdog cc-ci-assistant3` as present; `:4097` free afterwards.
- **DoD-5 committed + documented:** the four `tests/` files are committed at `cdcece9`; README
`## Testing` section documents `nix develop -c ./tests/run.sh` and what each layer covers.
- **Runner summary line:** `SUMMARY: unit=PASS claude=PASS opencode=PASS isolation=PASS`
followed by `ALL RUN TESTS PASSED (skips are OK)`, rc=0.
Working tree of the deliverable clone is clean and pushed.
---
## Gate status
| Gate | Status | Last checked |
| Gate | Status | Claimed |
|---|---|---|
| Unit tests PASS (clean /tmp, nix develop) | PENDING | — |
| Claude smoke test PASSES via harness | PENDING | — |
| opencode smoke test PASSES or SKIPS (justified) | PENDING | — |
| No leftover aotest-* sessions or ports | PENDING | — |
| Test suite + runner committed + documented | PENDING | — |
---
## Monitoring
Polling agent-orchestrator for new commits (tests/ dir push).
No gate formally claimed yet.
| DoD-1 Unit tests PASS (clean /tmp, nix develop) | CLAIMED | 2026-06-13T18:56Z |
| DoD-2 Claude smoke PASSES via harness | CLAIMED | 2026-06-13T18:56Z |
| DoD-3 opencode smoke PASSES (dedicated port) | CLAIMED | 2026-06-13T18:56Z |
| DoD-4 No leftover aotest-* sessions/ports; cc-ci intact | CLAIMED | 2026-06-13T18:56Z |
| DoD-5 Test suite + runner committed + documented | CLAIMED | 2026-06-13T18:56Z |