claim(aotest): test suite pushed (deliverable cdcece9) — unit+claude+opencode smokes PASS, isolated, awaiting Adversary
Some checks failed
continuous-integration/drone/push Build is failing
Unit 51/51 PASS, claude smoke PASS, opencode smoke PASS (own :4097), no leftover aotest-* sessions/ports, cc-ci sessions intact. Cold-verified from /tmp clone inside nix develop. HOW/EXPECTED/WHERE in STATUS-aotest.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -1,4 +1,17 @@
# BACKLOG — phase aotest (Adversary section)
# BACKLOG — phase aotest

## Build backlog

- [x] Unit tests for: config load + defaults merge, kickoff-template assembly, phase machine
  (advance/idempotent-complete/append-resumes), limit reset-banner parsing, WAITING-UNTIL/stall
  parsing, claude+opencode activity detectors. — `tests/test_unit.py` (51 tests)
- [x] Isolated live claude smoke through the harness (attach + status + down, cleaned up). —
  `tests/smoke_claude.sh`
- [x] Isolated live opencode smoke through the harness, dedicated non-4096 port, cleaned up. —
  `tests/smoke_opencode.sh`
- [x] Test runner: unit always + live smokes when backends available; README documented. —
  `tests/run.sh`, README `## Testing`
- All items complete at deliverable commit `cdcece9`; gate CLAIMED 2026-06-13T18:56Z.

## Adversary findings

@@ -12,3 +12,41 @@
- Builder has not yet pushed any aotest work. Entering polling stance.

Next: poll agent-orchestrator for new commits every ~10 min.

---

## 2026-06-13T18:56Z — (Builder) test suite built, all DoD met, gate CLAIMED

**Approach.** The harness (agents.py) is mostly pure functions with a thin tmux shell-out layer,
so I split testing into (a) unit tests that exercise the pure logic directly and (b) live smokes
that drive `agents.py` end-to-end on each real backend.

**Unit tests (`tests/test_unit.py`, stdlib `unittest`, 51 tests).** Each builds a throwaway
project (config + prompts + machine-docs) in a tempdir and calls the harness functions directly —
no agents, no live tmux. The one function that *would* spawn sessions, `phase_advance_check`,
calls module-level `stop_loops`/`start_loops`/`handoff_reset`; I monkeypatch those three to
recorders so the phase-machine logic (advance, idempotent sequence-complete, append-a-phase
resumes + clears the stale marker) is covered without launching anything. I also load the shipped
`agents.example.toml` so an example regression is caught.
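
The recorder substitution described above can be sketched as follows. This is a minimal illustration, not the contents of `tests/test_unit.py`: the `harness` namespace, the function signatures, and the toy body of `phase_advance_check` are assumptions made up for the example. Only the three patched names come from the journal entry.

```python
import types
import unittest
from unittest import mock


def _would_spawn(*args):
    # Stands in for the real tmux shell-out; tests must never reach it.
    raise RuntimeError("would touch live tmux")


# Hypothetical stand-in for the agents.py module namespace.
harness = types.SimpleNamespace(
    stop_loops=_would_spawn,
    start_loops=_would_spawn,
    handoff_reset=_would_spawn,
)


def phase_advance_check(done, phases, current):
    """Toy phase machine (assumed shape): advance when the current phase is done."""
    if not done:
        return current
    idx = phases.index(current)
    if idx + 1 >= len(phases):
        return current  # sequence complete: repeated calls are a no-op
    harness.stop_loops(current)
    harness.handoff_reset(current)
    harness.start_loops(phases[idx + 1])
    return phases[idx + 1]


class PhaseAdvanceTest(unittest.TestCase):
    def setUp(self):
        self.calls = []
        # Replace each session-spawning function with a recorder for this test only.
        for name in ("stop_loops", "start_loops", "handoff_reset"):
            patcher = mock.patch.object(harness, name, self._recorder(name))
            patcher.start()
            self.addCleanup(patcher.stop)

    def _recorder(self, name):
        def record(*args):
            self.calls.append((name, args))
        return record

    def test_advance_calls_recorders_in_order(self):
        self.assertEqual(phase_advance_check(True, ["a", "b"], "a"), "b")
        self.assertEqual(
            [name for name, _ in self.calls],
            ["stop_loops", "handoff_reset", "start_loops"],
        )

    def test_sequence_complete_is_idempotent(self):
        for _ in range(2):
            self.assertEqual(phase_advance_check(True, ["a", "b"], "b"), "b")
        self.assertEqual(self.calls, [])
```

Because the patches are undone through `addCleanup`, the originals are restored after every test, so a forgotten patch cannot leak into a later test and silently hide a real spawn.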

- Gotcha: my `BASE_TOML` fixture had `\d+`/`·` regexes; in a normal (non-raw) triple-quoted string
  the doubled backslashes collapse to single ones, and tomllib rejects the resulting invalid
  escape. Fixed by making the fixture a raw string (`r"""…"""`) so the on-disk TOML keeps the
  doubled backslash, like the real `agents.example.toml`.

**Live smokes.** `smoke_claude.sh` / `smoke_opencode.sh` each spin up a throwaway persistent
"probe" through `agents.py up` in a sandbox with a unique `session_prefix` and temp `log_dir`,
confirm the session attaches (pane command `claude`/`opencode`), `status` shows RUNNING, `down`
removes it; a cleanup trap (EXIT INT TERM) kills everything. claude uses the cheap
`claude-haiku-4-5`. opencode generalizes cc-ci `test-opencode.sh` onto this repo with its own
server on `:4097` (a guard refuses `4096`).

- Gotcha: the opencode server runs in a subshell `( … serve … ) &`, so `$SERVER_PID` is the
  subshell, not the listener — killing it left `:4097` held (a DoD-4 leftover-port failure I caught
  on the first standalone run). Fixed cleanup to also `pkill -f "opencode serve.*--port ${PORT}"`
  and wait for the port to free. Re-ran: freed.
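
The "wait for the port to free" step can be illustrated on its own. The actual cleanup lives in the shell smoke script and is not reproduced here; this is a hypothetical Python equivalent of the polling logic only, with invented function names and timeouts.

```python
import socket
import time


def port_is_listening(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0


def wait_for_port_free(port, timeout=10.0, interval=0.2):
    """Poll until nothing listens on the port; return False if it is still held at the deadline."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not port_is_listening(port):
            return True
        time.sleep(interval)
    return not port_is_listening(port)
```

A caller would kill the server first and then treat a `False` result from `wait_for_port_free(4097)` as a leftover-port failure, which is the condition the first standalone run hit.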

**Verification.** Cold-cloned to `/tmp/aotest-cold` and ran inside `nix develop` (python311) — the
Adversary's exact path: `unit=PASS (51) claude=PASS opencode=PASS isolation=PASS`, rc=0; afterwards
no `aotest-*` sessions, `:4097` free, `cc-ci-orchestrator/watchdog/assistant3` present. Pushed the
deliverable as `cdcece9`; clean tree; claimed the gate.

@@ -1,33 +1,74 @@
# STATUS — phase aotest (Adversary view)
# STATUS — phase aotest (Builder)

**Phase plan:** `/srv/cc-ci/cc-ci-plan/plan-phase-aotest-verify.md`
**Adversary clone:** `/srv/cc-ci/cc-ci-adv`
**Phase start:** 2026-06-13T18:44Z
**Deliverable repo:** `recipe-maintainers/agent-orchestrator` on `git.autonomic.zone`
**Builder working clone:** `/home/loops/aoeng/agent-orchestrator` (outside the cc-ci tracked tree)

---

## Current state: AWAITING BUILDER
## Gate: aotest CLAIMED, awaiting Adversary

The `agent-orchestrator` repo (commit `289ef07`, v0.1.0) contains NO `tests/` directory yet.
Builder has not yet pushed the aotest deliverable (tests + runner + README doc).
The committed test suite is in `tests/` of the deliverable repo. All 5 Definition-of-Done items
are satisfied; cold-verify per the HOW/EXPECTED/WHERE below.

Last checked: 2026-06-13T18:44Z
### WHERE (verification inputs)
- Repo: `https://git.autonomic.zone/recipe-maintainers/agent-orchestrator.git`
- `main` HEAD → `cdcece9a9ac64b458103194025f2c22ba830ce15` (commit `cdcece9`, on top of `289ef07` v0.1.0)
- New files: `tests/test_unit.py`, `tests/smoke_claude.sh`, `tests/smoke_opencode.sh`,
  `tests/run.sh`; README updated (file-map line + a new `## Testing` section).
- Backends present on this host: `claude` → `/home/loops/.local/bin/claude` (v2.1.177);
  `opencode` → `/home/loops/.local/bin/opencode`; creds at `/srv/cc-ci/.testenv`.

### HOW to cold-verify (fresh /tmp clone, exactly as the plan specifies)
```
cd /tmp && rm -rf aotest-cold
git clone https://git.autonomic.zone/recipe-maintainers/agent-orchestrator.git aotest-cold
cd aotest-cold && git rev-parse HEAD                   # → cdcece9a9ac6...
nix develop -c python3 -m unittest discover -s tests   # DoD-1: unit tests
nix develop -c ./tests/run.sh                          # full suite: unit + both smokes + isolation
```
Individual smokes (each is also invoked by run.sh):
```
nix develop -c bash tests/smoke_claude.sh     # DoD-2
nix develop -c bash tests/smoke_opencode.sh   # DoD-3 (own server on :4097, ≠ live :4096)
```
Post-run isolation check (DoD-4):
```
tmux ls | grep '^aotest-'     # EXPECTED: no output (no leftover sessions)
ss -ltn | grep ':4097 '       # EXPECTED: no output (port freed)
tmux ls | grep -E 'cc-ci-orchestrator|cc-ci-watchdog|cc-ci-assistant3'   # EXPECTED: all 3 present
```
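
For scripted verification, the two `tmux ls` checks can be combined into one pure function over the command's text output. This is a sketch under assumptions, not part of `tests/run.sh`: the function names are invented, the sample session list is illustrative, and it relies on the default `tmux ls` line format of `name: N windows (...)`. Unlike the `grep -E` above, it matches the three session names exactly rather than as substrings.

```python
REQUIRED = ("cc-ci-orchestrator", "cc-ci-watchdog", "cc-ci-assistant3")


def session_names(tmux_ls_output):
    """Extract session names: everything before the first colon on each non-empty line."""
    return [
        line.split(":", 1)[0]
        for line in tmux_ls_output.splitlines()
        if line.strip()
    ]


def isolation_report(tmux_ls_output, prefix="aotest-", required=REQUIRED):
    """Return (leftover test sessions, missing required sessions)."""
    names = session_names(tmux_ls_output)
    leftovers = [name for name in names if name.startswith(prefix)]
    missing = [name for name in required if name not in names]
    return leftovers, missing


# Illustrative input only; not captured from the real host.
sample = (
    "aotest-c-123-probe: 1 windows (created Sat Jun 13 18:50:00 2026)\n"
    "cc-ci-orchestrator: 1 windows (created Sat Jun 13 09:00:00 2026)\n"
    "cc-ci-watchdog: 1 windows (created Sat Jun 13 09:00:01 2026)\n"
)
print(isolation_report(sample))
# (['aotest-c-123-probe'], ['cc-ci-assistant3'])
```

A passing run corresponds to both lists being empty. On a live host the input could come from `subprocess.run(["tmux", "ls"], capture_output=True, text=True).stdout`.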

### EXPECTED outcomes (from my cold run @2026-06-13T18:55Z on cdcece9, /tmp clone, nix develop)
- **DoD-1 Unit tests:** `Ran 51 tests` … `OK`, rc=0. Pure logic — no agents spawned, no tmux
  sessions created. Covers: config load + defaults merge; kickoff-template assembly; phase machine
  (advance on `## DONE`, idempotent sequence-complete, append-a-phase resumes); limit reset-banner
  parsing; `WAITING-UNTIL`/stall parsing; claude + opencode activity detectors; the shipped
  `agents.example.toml` loads.
- **DoD-2 claude smoke:** `=== CLAUDE BACKEND SMOKE: PASS ===`, rc=0 — probe brought up THROUGH
  `agents.py` (pane command `claude`), `status` shows it RUNNING, `down` removes it. Isolated
  prefix `aotest-c-<pid>-`; trivial probe on `claude-haiku-4-5`.
- **DoD-3 opencode smoke:** `=== OPENCODE BACKEND SMOKE: PASS ===`, rc=0 — dedicated opencode
  server on **:4097** (not 4096); probe attaches THROUGH `agents.py` (pane command `opencode`),
  `status` RUNNING, `down` removes it; cleanup kills the server and waits for the port to free.
  (SKIPs gracefully with rc=0 if `opencode`/creds are absent — not the case on this host.)
- **DoD-4 isolation:** runner prints `PASS: no leftover aotest-* tmux sessions` and lists
  `cc-ci-orchestrator cc-ci-watchdog cc-ci-assistant3` as present; `:4097` free afterwards.
- **DoD-5 committed + documented:** the four `tests/` files are committed at `cdcece9`; README
  `## Testing` section documents `nix develop -c ./tests/run.sh` and what each layer covers.
- **Runner summary line:** `SUMMARY: unit=PASS claude=PASS opencode=PASS isolation=PASS` →
  `ALL RUN TESTS PASSED (skips are OK)`, rc=0.
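
A verifier that wants to check the runner summary line mechanically could parse it as below. This is a sketch with invented function names. The `SUMMARY:` line format is taken from the expected output above, but the `SKIP` token for a skipped layer is an assumption, since only `PASS` values appear in this run.

```python
def parse_summary(line):
    """Parse 'SUMMARY: key=value key=value ...' into a dict."""
    prefix = "SUMMARY:"
    if not line.startswith(prefix):
        raise ValueError("not a summary line")
    return dict(item.split("=", 1) for item in line[len(prefix):].split())


def all_run_tests_passed(results, ok=("PASS", "SKIP")):
    """True when every layer passed or was skipped ('SKIP' is an assumed token)."""
    return all(value in ok for value in results.values())


results = parse_summary("SUMMARY: unit=PASS claude=PASS opencode=PASS isolation=PASS")
print(results["opencode"])            # PASS
print(all_run_tests_passed(results))  # True
```

The process exit code remains the authoritative signal; parsing the line is only useful for reporting which layer failed or was skipped.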

Working tree of the deliverable clone is clean and pushed.

---

## Gate status

| Gate | Status | Last checked |
|---|---|---|
| Unit tests PASS (clean /tmp, nix develop) | PENDING | — |
| Claude smoke test PASSES via harness | PENDING | — |
| opencode smoke test PASSES or SKIPS (justified) | PENDING | — |
| No leftover aotest-* sessions or ports | PENDING | — |
| Test suite + runner committed + documented | PENDING | — |

---

## Monitoring

Polling agent-orchestrator for new commits (tests/ dir push).
No gate formally claimed yet.

| Gate | Status | Claimed |
|---|---|---|
| DoD-1 Unit tests PASS (clean /tmp, nix develop) | CLAIMED | 2026-06-13T18:56Z |
| DoD-2 Claude smoke PASSES via harness | CLAIMED | 2026-06-13T18:56Z |
| DoD-3 opencode smoke PASSES (dedicated port) | CLAIMED | 2026-06-13T18:56Z |
| DoD-4 No leftover aotest-* sessions/ports; cc-ci intact | CLAIMED | 2026-06-13T18:56Z |
| DoD-5 Test suite + runner committed + documented | CLAIMED | 2026-06-13T18:56Z |