cc-ci/machine-docs/REVIEW-poe2e.md
autonomic-bot 6e07b3c8e4
review(poe2e): ALL DoD PASS @2026-06-13T19:46Z — phase DONE
2026-06-13 19:47:59 +00:00

# REVIEW — phase poe2e (Adversary)
**Phase plan:** `/srv/cc-ci/cc-ci-plan/plan-phase-poe2e-end-to-end.md`
**Initialized:** 2026-06-13T19:25Z
## Orientation
Phase mission: prove the whole model works end-to-end — PO scaffolds, runs (isolated), and tears
down a throwaway project; cc-ci is modeled as a project in STAGING; live cc-ci is provably untouched.
### Definition of Done (poe2e)
| # | DoD item | Status |
|---|---|---|
| D1 | PO scaffolded, ran (isolated), and tore down a throwaway project — evidence in REVIEW | **PASS @2026-06-13T19:46Z** |
| D2 | Staged `cc-ci` project: engine submodule pinned + migrated `agents.toml`; `agents.py status` MATCHES live cc-ci (side-by-side shown) | **PASS @2026-06-13T19:46Z** |
| D3 | Staged cc-ci registered in `fleet.toml` | **PASS @2026-06-13T19:46Z** |
| D4 | Written, reviewed operator cutover runbook | **PASS @2026-06-13T19:46Z** |
| D5 | Live cc-ci provably untouched: tmux sessions + `/srv/cc-ci/cc-ci-plan/agents.{py,toml}` + `state/` unchanged; no second watchdog started | **PASS @2026-06-13T19:46Z** |
## Verdicts
### ALL DoD PASS @2026-06-13T19:46Z — phase DONE
Cold-verified from the Adversary's own clone (`/srv/cc-ci/cc-ci-adv`) in a fresh shell. No VETO.
---
#### D1 PASS @2026-06-13T19:46Z
Re-ran the full PO scratch lifecycle independently:
```
cd /home/loops/porepo/project-orchestrator
bash scripts/create-project.sh scratch-e2e --dir /tmp/poe2e-scratch --ref v0.1.0 --prefix poe2e-scratch-
```
Scaffold output: `engine pinned at 289ef07df40a8264f3a36b4e91b923d1424c4658 (v0.1.0)`, `config: agents.toml (session_prefix = poe2e-scratch-)`.
Tracked files: `.gitignore`, `.gitmodules`, `agents.toml`, `engine` — no PO/fleet metadata.
Injected the demo backend with `prompt_delivery = "exec"` (required: with the default `"arg"`, `sleep` receives the kickoff as an argument and exits), then ran the lifecycle:
- `python3 engine/agents.py status` → worker=stopped, watchdog=stopped
- `python3 engine/agents.py up` → `starting poe2e-scratch-worker (demo, ...)` + `starting watchdog`
- `tmux ls | grep poe2e-scratch` → both sessions present
- `python3 engine/agents.py status` → `worker RUNNING [sleep]`, `watchdog RUNNING`
- Live cc-ci sessions during run: exactly 8 cc-ci-* sessions unchanged
- `python3 engine/agents.py down` → `killing poe2e-scratch-worker`, `killing poe2e-scratch-watchdog`
- `tmux ls | grep poe2e-scratch || echo "torn down"` → torn down
- `python3 engine/agents.py status` → both stopped
- `rm -rf /tmp/poe2e-scratch` → throwaway deleted
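
The `tmux ls | grep poe2e-scratch` checks in this lifecycle can also be expressed as a small filter over captured `tmux ls` output. This is a minimal sketch, not part of the engine: the function name is hypothetical, the sample timestamps are made up for illustration, and it assumes the default `tmux ls` line shape (`name: N windows (created ...)`).

```python
def sessions_with_prefix(tmux_ls_output: str, prefix: str) -> list[str]:
    """Return session names from `tmux ls` output that start with `prefix`."""
    names = []
    for line in tmux_ls_output.splitlines():
        # Default `tmux ls` lines look like "name: 1 windows (created ...)".
        name, sep, _rest = line.partition(":")
        if sep and name.startswith(prefix):
            names.append(name)
    return names


# Illustrative capture only; creation times here are invented.
sample = """\
cc-ci-builder: 1 windows (created Sat Jun 13 19:25:01 2026)
cc-ci-watchdog: 1 windows (created Sat Jun 13 08:00:00 2026)
poe2e-scratch-worker: 1 windows (created Sat Jun 13 19:40:00 2026)
poe2e-scratch-watchdog: 1 windows (created Sat Jun 13 19:40:01 2026)
"""

scratch = sessions_with_prefix(sample, "poe2e-scratch-")
live = sessions_with_prefix(sample, "cc-ci-")
print("scratch sessions:", scratch)
print("live sessions:", live)
```

After `down`, the same filter over a fresh capture should return an empty list for `poe2e-scratch-` while the `cc-ci-` list stays unchanged.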
**Note:** The demo backend in `agents.example.toml` uses `prompt_delivery = "exec"` (not the default "arg"). Any cold-verify that injects the demo backend must include this field — otherwise the sleep process receives the kickoff file content as args and exits immediately.
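
The failure mode in the note can be reproduced outside the engine. The sketch below assumes that `"arg"` delivery ends up passing the kickoff text to `sleep` as an extra argument (the exact position is an assumption), and it uses a 1-second duration instead of the demo backend's long sleep so the call cannot hang. The helper name and kickoff text are hypothetical.

```python
import shutil
import subprocess


def run_sleep_with_extra_arg(kickoff_text: str):
    """Run `sleep 1 <kickoff_text>`; return its exit code, or None if `sleep` is absent."""
    if shutil.which("sleep") is None:
        return None
    # The trailing argument mimics kickoff text handed to the backend command as an arg.
    result = subprocess.run(
        ["sleep", "1", kickoff_text],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.returncode


rc = run_sleep_with_extra_arg("You are the worker agent. Begin.")
print("exit code:", rc)
```

With GNU coreutils, `sleep` is expected to reject the non-numeric argument and exit non-zero at once, which matches the "exits immediately" behaviour described above; other `sleep` implementations may behave differently.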
---
#### D2 PASS @2026-06-13T19:46Z
Cold clone: `git clone --recurse-submodules /home/loops/poe2e/cc-ci /tmp/poe2e-ccci-cold`
- HEAD: `38e5c907b9e37b8aebbfccb2e1ad8de7e2d880cb`
- Submodule: `289ef07df40a8264f3a36b4e91b923d1424c4658 engine (v0.1.0)`
- (a) Phase list: `phases: 19 19 | identical: True`
- (b) Phase seq: `rcust shot lvl5 bsky dstamp mailu kuma drone cfold cf55 pvfix pvcheck ghost cf48 pxgate aoeng aotest porepo poe2e`
- (c) After `phase set 18` (poe2e): `diff /tmp/s.txt /tmp/l.txt` → **STATUS BYTE-IDENTICAL**
- Both print: `phase: poe2e [19/19] plan=plan-phase-poe2e-end-to-end.md (in progress)` + identical 8-agent table
- STATE column shows RUNNING for live sessions because `agents.py status` uses read-only `tmux has-session` — the staged project started nothing; both configs point at the same live tmux sessions, which is why status is byte-identical
- (d) `builder kickoff identical: True`, `adversary kickoff identical: True`
Cold clone deleted.
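
The byte-identical check in (c) can be sketched in Python as well. This is an illustrative equivalent of the `diff` call, not the command actually run; the function name is hypothetical and the temp files stand in for `/tmp/s.txt` and `/tmp/l.txt`.

```python
import difflib
import tempfile
from pathlib import Path


def compare_captures(staged: Path, live: Path) -> tuple[bool, list[str]]:
    """Return (byte_identical, diff_lines) for two saved status captures."""
    staged_bytes = staged.read_bytes()
    live_bytes = live.read_bytes()
    if staged_bytes == live_bytes:  # the verdict rests on raw bytes only
        return True, []
    # The text diff is only a reading aid for a human reviewer.
    diff = difflib.unified_diff(
        staged_bytes.decode("utf-8", errors="replace").splitlines(keepends=True),
        live_bytes.decode("utf-8", errors="replace").splitlines(keepends=True),
        fromfile=str(staged),
        tofile=str(live),
    )
    return False, list(diff)


with tempfile.TemporaryDirectory() as tmp:
    staged = Path(tmp, "s.txt")
    live = Path(tmp, "l.txt")
    line = "phase: poe2e [19/19] plan=plan-phase-poe2e-end-to-end.md (in progress)\n"
    staged.write_text(line)
    live.write_text(line)
    same, same_diff = compare_captures(staged, live)

    # Simulate drift in one capture to show the failing case.
    live.write_text(line.replace("19/19", "18/19"))
    same_after_edit, diff_lines = compare_captures(staged, live)

print("identical:", same)
print("identical after edit:", same_after_edit)
```

Comparing raw bytes first keeps the verdict independent of text decoding; a trailing-newline difference, for example, still counts as a mismatch.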
---
#### D3 PASS @2026-06-13T19:46Z
```
cd /home/loops/porepo/project-orchestrator
python3 scripts/fleet.py validate → fleet: OK — 2 project(s), schema v1
python3 scripts/fleet.py status → cc-ci [disabled] agent-orchestrator@v0.1.0 /home/loops/poe2e/cc-ci
total=2 enabled=1 disabled=1
```
`cc-ci` is registered as disabled — correct, it must not be started by the PO (that would conflict with the live system). Operator cutover enables it per runbook §6.
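
The two properties that matter here (the `cc-ci` entry is disabled, and the totals add up) can be checked mechanically from the status text. This is a minimal sketch that assumes the raw output lines look like the ones quoted above; it does not read `fleet.toml` itself, and the function name is hypothetical.

```python
import re


def check_fleet_status(output: str, project: str) -> dict:
    """Parse `fleet.py status`-style text; return the project's state and the totals."""
    info = {"state": None}
    for line in output.splitlines():
        fields = line.split()
        # A project line is assumed to start with the name, then "[state]".
        if len(fields) >= 2 and fields[0] == project:
            info["state"] = fields[1].strip("[]")
        for key, value in re.findall(r"\b(total|enabled|disabled)=(\d+)", line):
            info[key] = int(value)
    return info


sample = """\
cc-ci [disabled] agent-orchestrator@v0.1.0 /home/loops/poe2e/cc-ci
total=2 enabled=1 disabled=1
"""

info = check_fleet_status(sample, "cc-ci")
ok = info["state"] == "disabled" and info["enabled"] + info["disabled"] == info["total"]
print(info, "ok:", ok)
```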
---
#### D4 PASS @2026-06-13T19:46Z
Read `/home/loops/poe2e/cc-ci/docs/cutover-runbook.md`. Covers all expected sections:
- §0: What-stays/what-changes table with exact config deltas
- §1: Pre-flight + parity gate (`engine/agents.py status` on project must match live before proceeding)
- §2: Quiesce live — `systemctl stop cc-ci-loops.service` + `agents.py down` + confirm zero `cc-ci-` sessions (critical: prevents double watchdog on shared namespace)
- §3: Reuse vs fresh start decision (reuse recommended — preserves phase-idx + resume ids)
- §4: Production config delta: change `log_dir` from `.ao-state` back to `/srv/cc-ci/.cc-ci-logs`
- §5: Re-point `launch.py`/`launch.sh` at `engine/agents.py --config agents.toml` (keeps systemd + orchestrator's prompt working unchanged; rollback copy preserved as `launch.py.preproject`)
- §6: Start + validate (launch.py status parity, single watchdog, handoff ping, flip fleet entry to enabled)
- §7: Fast rollback (re-point `launch.py`, restart)
- Appendix: explicitly notes no ACME/DNS/prod-domain work (out of scope)
Runbook is operator-supervised and explicitly states loops MUST NOT perform this cutover themselves.
---
#### D5 PASS @2026-06-13T19:46Z
Final check (vs baseline @19:25Z):
- `agents.toml` SHA256: `0d78ba55329705055bbb39722292b6d131cdd30f37eb814e50316f7c0e222b88` ✓ unchanged
- `agents.py` SHA256: `b4567b73099a587b5727a194f80a5e908d1a1589691294230e6ad1492fb9fe9a` ✓ unchanged
- `state/phase-idx`: `18` ✓ unchanged
- tmux sessions: exactly 8 `cc-ci-*` sessions, all with same creation times as baseline ✓
- `cc-ci-watchdog` count: exactly 1 ✓ (no second watchdog started)
- cc-ci host: `no tmux sessions` ✓ unchanged
The staged project (`/home/loops/poe2e/cc-ci`) uses `session_prefix = "cc-ci-"` for fidelity but the Builder ran ONLY `status`/`phase show`/`phase set` against it — none of which start or kill sessions. The scratch D1 demo ran under `poe2e-scratch-` namespace. No live cc-ci file or session was touched.
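
The checksum half of this check is straightforward to script. The sketch below is illustrative and uses a temp file rather than the live paths; the function names are hypothetical. A real run would map `/srv/cc-ci/cc-ci-plan/agents.toml` and `/srv/cc-ci/cc-ci-plan/agents.py` to the baseline digests recorded in this review.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(baseline: dict[str, str]) -> list[str]:
    """Return the paths whose current digest differs from the recorded baseline."""
    return [p for p, expected in baseline.items() if sha256_of(Path(p)) != expected]


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp, "agents.toml")  # stand-in file, not the live config
    target.write_bytes(b"abc")
    abc_digest = sha256_of(target)
    baseline = {str(target): abc_digest}
    before = changed_files(baseline)   # nothing touched yet
    target.write_bytes(b"abcd")        # simulate an unwanted modification
    after = changed_files(baseline)
    target_name = str(target)

print("changed before edit:", before)
print("changed after edit:", after)
```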
## D5 — Live cc-ci baseline snapshot @2026-06-13T19:25Z (pre-Builder)
Taken before Builder started any poe2e work. Will diff against this on cold-verify.
**agents.toml SHA256:** `0d78ba55329705055bbb39722292b6d131cdd30f37eb814e50316f7c0e222b88`
**agents.py SHA256:** `b4567b73099a587b5727a194f80a5e908d1a1589691294230e6ad1492fb9fe9a`
**state/phase-idx:** `18` (poe2e — index 18 in the phases array)
**tmux sessions (orchestrator host, pre-Builder):**
```
cc-ci-adv (just started)
cc-ci-assistant3 (pre-existing since 2026-06-09)
cc-ci-builder (just started)
cc-ci-cleanlogs (pre-existing since 2026-06-02)
cc-ci-orchestrator (pre-existing since 2026-06-13)
cc-ci-report (pre-existing since 2026-06-12)
cc-ci-upgrader (pre-existing since 2026-06-11)
cc-ci-watchdog (pre-existing since 2026-06-13)
```
**cc-ci host tmux:** `no tmux sessions` (cc-ci has no tmux sessions at phase start)
D5 PASS criterion: after all Builder work, agents.toml + agents.py checksums unchanged,
state/phase-idx still 18, no new cc-ci-*-prefixed watchdog sessions started, cc-ci host tmux
still empty (or unchanged).
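
The PASS criterion above can be written as a predicate over two snapshots. This is a sketch under stated assumptions: the `Snapshot` type and field names are hypothetical, the baseline values are the ones recorded in this section, and `cc-ci-watchdog-2` is an invented session name used only to show a failing case.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Snapshot:
    agents_toml_sha256: str
    agents_py_sha256: str
    phase_idx: int
    orchestrator_sessions: tuple
    host_sessions: tuple


def watchdog_count(sessions, prefix="cc-ci-"):
    """Count sessions in the live namespace whose name mentions a watchdog."""
    return sum(1 for name in sessions if name.startswith(prefix) and "watchdog" in name)


def d5_failures(baseline: Snapshot, current: Snapshot) -> list:
    """Return the list of violated D5 conditions; empty means PASS."""
    failures = []
    if current.agents_toml_sha256 != baseline.agents_toml_sha256:
        failures.append("agents.toml checksum changed")
    if current.agents_py_sha256 != baseline.agents_py_sha256:
        failures.append("agents.py checksum changed")
    if current.phase_idx != baseline.phase_idx:
        failures.append("state/phase-idx changed")
    if watchdog_count(current.orchestrator_sessions) > watchdog_count(baseline.orchestrator_sessions):
        failures.append("new cc-ci- watchdog session started")
    if current.host_sessions != baseline.host_sessions:
        failures.append("cc-ci host tmux changed")
    return failures


baseline = Snapshot(
    agents_toml_sha256="0d78ba55329705055bbb39722292b6d131cdd30f37eb814e50316f7c0e222b88",
    agents_py_sha256="b4567b73099a587b5727a194f80a5e908d1a1589691294230e6ad1492fb9fe9a",
    phase_idx=18,
    orchestrator_sessions=(
        "cc-ci-adv", "cc-ci-assistant3", "cc-ci-builder", "cc-ci-cleanlogs",
        "cc-ci-orchestrator", "cc-ci-report", "cc-ci-upgrader", "cc-ci-watchdog",
    ),
    host_sessions=(),
)

# Hypothetical failing case: a second watchdog appears in the live namespace.
tampered = replace(
    baseline,
    orchestrator_sessions=baseline.orchestrator_sessions + ("cc-ci-watchdog-2",),
)

print("unchanged:", d5_failures(baseline, baseline))
print("tampered:", d5_failures(baseline, tampered))
```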
**Note on JOURNAL:** The system-reminder auto-surfaced JOURNAL-poe2e.md contents during git pull
(Builder had overwritten the file). I noted the live `agents.py status` capture therein — I will
re-run this independently during cold-verify and will NOT use the Builder's capture as my verdict.
## Break-it probes
(will log independent probes here as they run)
## D2 — Live agents.py status (Adversary independent capture @2026-06-13T19:36Z)
Run from scratch: `cd /srv/cc-ci/cc-ci-plan && python3 agents.py status`
```
phase: poe2e [19/19] plan=plan-phase-poe2e-end-to-end.md (in progress)
AGENT KIND BACKEND MODEL WATCH STATE
orchestrator persistent claude claude-opus-4-8 heal RUNNING [claude]
builder loop claude claude-opus-4-8 heal+stall RUNNING [claude]
adversary loop claude claude-sonnet-4-6 heal+stall RUNNING [claude]
assistant persistent claude claude-sonnet-4-6 none stopped (disabled)
upgrader task claude claude-sonnet-4-6 none RUNNING (disabled) [claude]
report task claude claude-opus-4-8 none RUNNING (disabled) [claude]
cleanlogs service - - - RUNNING
watchdog service - - - RUNNING
```
This is the parity target for D2. The staged cc-ci `agents.py status` must match the AGENT/KIND/BACKEND/MODEL/WATCH columns (STATE will differ — staged is never started, so all agents will show `stopped`).
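
A parity check that ignores STATE, as described above, can be sketched by projecting each row onto its first five columns. This assumes the table is whitespace-separated with single-token values in AGENT through WATCH, as in the capture above; the function name and the `stopped` rows in the staged sample are illustrative.

```python
def config_columns(status_text: str) -> list:
    """Return (AGENT, KIND, BACKEND, MODEL, WATCH) tuples, dropping the STATE column."""
    rows = []
    seen_header = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "AGENT":
            seen_header = True  # rows of interest follow the header line
            continue
        if seen_header and len(fields) >= 5:
            rows.append(tuple(fields[:5]))
    return rows


live_capture = """\
phase: poe2e [19/19] plan=plan-phase-poe2e-end-to-end.md (in progress)
AGENT KIND BACKEND MODEL WATCH STATE
builder loop claude claude-opus-4-8 heal+stall RUNNING [claude]
cleanlogs service - - - RUNNING
"""

# Illustrative staged capture: same configuration, different STATE.
staged_capture = """\
phase: poe2e [19/19] plan=plan-phase-poe2e-end-to-end.md (in progress)
AGENT KIND BACKEND MODEL WATCH STATE
builder loop claude claude-opus-4-8 heal+stall stopped
cleanlogs service - - - stopped
"""

parity = config_columns(live_capture) == config_columns(staged_capture)
print("config parity:", parity)
```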
Also noted: PO scripts exist at `/home/loops/porepo/project-orchestrator/scripts/` (create, start, stop, update, fleet.py). The `demo` backend is defined in `agents.example.toml` as `bin = "echo '[demo] ...' ; exec sleep 1000000"` — starts a sleeping process the engine tracks as RUNNING. This is what D1 will use for the isolated run.