# REVIEW — phase poe2e (Adversary)

**Phase plan:** `/srv/cc-ci/cc-ci-plan/plan-phase-poe2e-end-to-end.md`
**Initialized:** 2026-06-13T19:25Z

## Orientation

Phase mission: prove the whole model works end-to-end — PO scaffolds, runs (isolated), and tears
down a throwaway project; cc-ci is modeled as a project in STAGING; live cc-ci is provably untouched.

### Definition of Done (poe2e)

| # | DoD item | Status |
|---|---|---|
| D1 | PO scaffolded, ran (isolated), and tore down a throwaway project — evidence in REVIEW | **PASS @2026-06-13T19:46Z** |
| D2 | Staged `cc-ci` project: engine submodule pinned + migrated `agents.toml`; `agents.py status` MATCHES live cc-ci (side-by-side shown) | **PASS @2026-06-13T19:46Z** |
| D3 | Staged cc-ci registered in `fleet.toml` | **PASS @2026-06-13T19:46Z** |
| D4 | Written, reviewed operator cutover runbook | **PASS @2026-06-13T19:46Z** |
| D5 | Live cc-ci provably untouched: tmux sessions + `/srv/cc-ci/cc-ci-plan/agents.{py,toml}` + `state/` unchanged; no second watchdog started | **PASS @2026-06-13T19:46Z** |

## Verdicts

### ALL DoD PASS @2026-06-13T19:46Z — phase DONE

Cold-verified from the Adversary's own clone (`/srv/cc-ci/cc-ci-adv`) in a fresh shell. No VETO.

---

#### D1 PASS @2026-06-13T19:46Z

Re-ran the full PO scratch lifecycle independently:

```
cd /home/loops/porepo/project-orchestrator
bash scripts/create-project.sh scratch-e2e --dir /tmp/poe2e-scratch --ref v0.1.0 --prefix poe2e-scratch-
```

Scaffold output: `engine pinned at 289ef07df40a8264f3a36b4e91b923d1424c4658 (v0.1.0)`, `config: agents.toml (session_prefix = poe2e-scratch-)`.
Tracked files: `.gitignore`, `.gitmodules`, `agents.toml`, `engine` — no PO/fleet metadata.

Injected demo backend (`prompt_delivery = "exec"` — required; "arg" default causes sleep to receive kickoff as arg and exit):

- `python3 engine/agents.py status` → worker=stopped, watchdog=stopped
- `python3 engine/agents.py up` → `starting poe2e-scratch-worker (demo, ...)` + `starting watchdog`
- `tmux ls | grep poe2e-scratch` → both sessions present
- `python3 engine/agents.py status` → `worker RUNNING [sleep]`, `watchdog RUNNING`
- Live cc-ci sessions during run: exactly 8 cc-ci-* sessions unchanged
- `python3 engine/agents.py down` → `killing poe2e-scratch-worker`, `killing poe2e-scratch-watchdog`
- `tmux ls | grep poe2e-scratch || echo "torn down"` → torn down
- `python3 engine/agents.py status` → both stopped
- `rm -rf /tmp/poe2e-scratch` → throwaway deleted

**Note:** The demo backend in `agents.example.toml` uses `prompt_delivery = "exec"` (not the default "arg"). Any cold-verify that injects the demo backend must include this field — otherwise the sleep process receives the kickoff file content as args and exits immediately.


---

#### D2 PASS @2026-06-13T19:46Z

Cold clone: `git clone --recurse-submodules /home/loops/poe2e/cc-ci /tmp/poe2e-ccci-cold`

- HEAD: `38e5c907b9e37b8aebbfccb2e1ad8de7e2d880cb` ✓
- Submodule: `289ef07df40a8264f3a36b4e91b923d1424c4658 engine (v0.1.0)` ✓
- (a) Phase list: `phases: 19 19 | identical: True` ✓
- (b) Phase seq: `rcust shot lvl5 bsky dstamp mailu kuma drone cfold cf55 pvfix pvcheck ghost cf48 pxgate aoeng aotest porepo poe2e` ✓
- (c) After `phase set 18` (poe2e): `diff /tmp/s.txt /tmp/l.txt` → **STATUS BYTE-IDENTICAL** ✓
  - Both print: `phase: poe2e [19/19] plan=plan-phase-poe2e-end-to-end.md (in progress)` + identical 8-agent table
  - STATE column shows RUNNING for live sessions because `agents.py status` uses read-only `tmux has-session` — the staged project started nothing; both configs point at the same live tmux sessions, which is why status is byte-identical
- (d) `builder kickoff identical: True`, `adversary kickoff identical: True` ✓

Cold clone deleted.

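
The byte-identical comparison in (c) can be reproduced without `diff`. This is a minimal sketch, assuming the two status captures are available as bytes (in the check above they were the files `/tmp/s.txt` and `/tmp/l.txt`); `compare_captures` is an illustrative helper, not part of `agents.py`. Identity is decided on the raw bytes, and the unified diff is produced only as a reading aid when they differ.

```python
import difflib


def compare_captures(staged: bytes, live: bytes) -> tuple[bool, list[str]]:
    """Return (identical, diff_lines); diff_lines is empty when the bytes match."""
    if staged == live:
        return True, []
    # Decode leniently: the diff is for human inspection only, the verdict
    # above was already taken on the raw bytes.
    diff = difflib.unified_diff(
        live.decode(errors="replace").splitlines(),
        staged.decode(errors="replace").splitlines(),
        fromfile="live",
        tofile="staged",
        lineterm="",
    )
    return False, list(diff)


if __name__ == "__main__":
    # In a real check these would be Path("/tmp/l.txt").read_bytes() and
    # Path("/tmp/s.txt").read_bytes(); the literal below is the phase line
    # quoted in this review.
    live = b"phase: poe2e [19/19] plan=plan-phase-poe2e-end-to-end.md (in progress)\n"
    staged = live
    identical, diff_lines = compare_captures(staged, live)
    print("STATUS BYTE-IDENTICAL" if identical else "\n".join(diff_lines))
```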

---

#### D3 PASS @2026-06-13T19:46Z

```
cd /home/loops/porepo/project-orchestrator
python3 scripts/fleet.py validate → fleet: OK — 2 project(s), schema v1
python3 scripts/fleet.py status → cc-ci [disabled] agent-orchestrator@v0.1.0 /home/loops/poe2e/cc-ci
total=2 enabled=1 disabled=1
```

`cc-ci` is registered as disabled — correct, it must not be started by the PO (that would conflict with the live system). Operator cutover enables it per runbook §6.

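
The "registered but disabled" condition can also be asserted from the captured `fleet.py status` text. This is a rough sketch that parses only the two line shapes quoted above; the `[enabled]` marker is an assumption (only `[disabled]` appears in this review), and `parse_fleet_status` and `is_parked` are hypothetical helpers, not part of `fleet.py`.

```python
import re

# Row shape taken from the capture above: "<name> [<state>] <engine> <path>".
# The "[enabled]" alternative is an assumption; only "[disabled]" is shown.
ROW = re.compile(r"^\s*(\S+)\s+\[(enabled|disabled)\]")
SUMMARY = re.compile(r"total=(\d+)\s+enabled=(\d+)\s+disabled=(\d+)")


def parse_fleet_status(text: str) -> tuple[dict[str, str], dict[str, int]]:
    """Return ({project: state}, {"total": n, "enabled": n, "disabled": n})."""
    states: dict[str, str] = {}
    summary: dict[str, int] = {}
    for line in text.splitlines():
        row = ROW.match(line)
        if row:
            states[row.group(1)] = row.group(2)
            continue
        totals = SUMMARY.search(line)
        if totals:
            keys = ("total", "enabled", "disabled")
            summary = dict(zip(keys, (int(v) for v in totals.groups())))
    return states, summary


def is_parked(text: str, name: str = "cc-ci") -> bool:
    """True if the project is listed as disabled and the summary counts add up."""
    states, summary = parse_fleet_status(text)
    consistent = bool(summary) and (
        summary["total"] == summary["enabled"] + summary["disabled"]
    )
    return states.get(name) == "disabled" and consistent


if __name__ == "__main__":
    capture = (
        "cc-ci [disabled] agent-orchestrator@v0.1.0 /home/loops/poe2e/cc-ci\n"
        "total=2 enabled=1 disabled=1\n"
    )
    print(is_parked(capture))
```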

---

#### D4 PASS @2026-06-13T19:46Z

Read `/home/loops/poe2e/cc-ci/docs/cutover-runbook.md`. Covers all expected sections:

- §0: What-stays/what-changes table with exact config deltas
- §1: Pre-flight + parity gate (`engine/agents.py status` on project must match live before proceeding)
- §2: Quiesce live — `systemctl stop cc-ci-loops.service` + `agents.py down` + confirm zero `cc-ci-` sessions (critical: prevents double watchdog on shared namespace)
- §3: Reuse vs fresh start decision (reuse recommended — preserves phase-idx + resume ids)
- §4: Production config delta: change `log_dir` from `.ao-state` back to `/srv/cc-ci/.cc-ci-logs`
- §5: Re-point `launch.py`/`launch.sh` at `engine/agents.py --config agents.toml` (keeps systemd + orchestrator's prompt working unchanged; rollback copy preserved as `launch.py.preproject`)
- §6: Start + validate (launch.py status parity, single watchdog, handoff ping, flip fleet entry to enabled)
- §7: Fast rollback (re-point `launch.py`, restart)
- Appendix: explicitly notes no ACME/DNS/prod-domain work (out of scope)

Runbook is operator-supervised and explicitly states loops MUST NOT perform this cutover themselves.

---

#### D5 PASS @2026-06-13T19:46Z

Final check (vs baseline @19:25Z):

- `agents.toml` SHA256: `0d78ba55329705055bbb39722292b6d131cdd30f37eb814e50316f7c0e222b88` ✓ unchanged
- `agents.py` SHA256: `b4567b73099a587b5727a194f80a5e908d1a1589691294230e6ad1492fb9fe9a` ✓ unchanged
- `state/phase-idx`: `18` ✓ unchanged
- tmux sessions: exactly 8 `cc-ci-*` sessions, all with same creation times as baseline ✓
- `cc-ci-watchdog` count: exactly 1 ✓ (no second watchdog started)
- cc-ci host: `no tmux sessions` ✓ unchanged

The staged project (`/home/loops/poe2e/cc-ci`) uses `session_prefix = "cc-ci-"` for fidelity but the Builder ran ONLY `status`/`phase show`/`phase set` against it — none of which start or kill sessions. The scratch D1 demo ran under `poe2e-scratch-` namespace. No live cc-ci file or session was touched.

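
The checksum part of this check is easy to script. This is a minimal sketch using only `hashlib` and `pathlib`; the paths and digests in the `__main__` block are the ones recorded in this review, while `sha256_of` and `changed_files` are illustrative helper names. A missing file is reported as changed rather than raising.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(baseline: dict[str, str]) -> list[str]:
    """Return the paths whose current digest differs from the baseline.

    A path that no longer exists as a regular file counts as changed.
    """
    changed = []
    for path, expected in baseline.items():
        candidate = Path(path)
        if not candidate.is_file() or sha256_of(candidate) != expected:
            changed.append(path)
    return changed


if __name__ == "__main__":
    # Baseline digests as recorded @2026-06-13T19:25Z in this review.
    baseline = {
        "/srv/cc-ci/cc-ci-plan/agents.toml":
            "0d78ba55329705055bbb39722292b6d131cdd30f37eb814e50316f7c0e222b88",
        "/srv/cc-ci/cc-ci-plan/agents.py":
            "b4567b73099a587b5727a194f80a5e908d1a1589691294230e6ad1492fb9fe9a",
    }
    print(changed_files(baseline) or "unchanged")
```

An empty result corresponds to the two "✓ unchanged" checksum lines above; the session and watchdog checks still need the tmux listing.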

## D5 — Live cc-ci baseline snapshot @2026-06-13T19:25Z (pre-Builder)

Taken before Builder started any poe2e work. Will diff against this on cold-verify.

**agents.toml SHA256:** `0d78ba55329705055bbb39722292b6d131cdd30f37eb814e50316f7c0e222b88`
**agents.py SHA256:** `b4567b73099a587b5727a194f80a5e908d1a1589691294230e6ad1492fb9fe9a`
**state/phase-idx:** `18` (poe2e — index 18 in the phases array)

**tmux sessions (orchestrator host, pre-Builder):**
```
cc-ci-adv           (just started)
cc-ci-assistant3    (pre-existing since 2026-06-09)
cc-ci-builder       (just started)
cc-ci-cleanlogs     (pre-existing since 2026-06-02)
cc-ci-orchestrator  (pre-existing since 2026-06-13)
cc-ci-report        (pre-existing since 2026-06-12)
cc-ci-upgrader      (pre-existing since 2026-06-11)
cc-ci-watchdog      (pre-existing since 2026-06-13)
```

**cc-ci host tmux:** `no tmux sessions` (cc-ci has no tmux sessions at phase start)

D5 PASS criterion: after all Builder work, agents.toml + agents.py checksums unchanged,
state/phase-idx still 18, no new cc-ci-*-prefixed watchdog sessions started, cc-ci host tmux
still empty (or unchanged).

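
The session half of the criterion can be compared in the same spirit. This is a sketch under stated assumptions: both snapshots were taken with `tmux list-sessions -F '#{session_name} #{session_created}'`, session names contain no trailing whitespace, and the epoch values in the example are placeholders rather than data from this phase. `parse_sessions` and `live_untouched` are hypothetical helpers. Sessions outside the prefix, such as the `poe2e-scratch-` ones, are ignored by design.

```python
def parse_sessions(listing: str, prefix: str = "cc-ci-") -> dict[str, int]:
    """Map session name -> creation epoch for sessions whose name starts with prefix."""
    sessions: dict[str, int] = {}
    for line in listing.splitlines():
        parts = line.rsplit(None, 1)  # "<name> <epoch seconds>"
        if len(parts) == 2 and parts[0].startswith(prefix) and parts[1].isdigit():
            sessions[parts[0]] = int(parts[1])
    return sessions


def live_untouched(baseline: str, current: str, prefix: str = "cc-ci-") -> bool:
    """True when the prefixed sessions and their creation times are unchanged
    and exactly one watchdog session exists under the prefix."""
    before = parse_sessions(baseline, prefix)
    after = parse_sessions(current, prefix)
    watchdogs = [name for name in after if "watchdog" in name]
    return before == after and len(watchdogs) == 1


if __name__ == "__main__":
    # Placeholder epochs for illustration only; real values come from tmux.
    baseline = "cc-ci-builder 1000\ncc-ci-watchdog 2000\n"
    current = baseline + "poe2e-scratch-worker 3000\npoe2e-scratch-watchdog 3001\n"
    print(live_untouched(baseline, current))  # scratch sessions are outside the prefix
```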

**Note on JOURNAL:** The system-reminder auto-surfaced JOURNAL-poe2e.md contents during git pull
(Builder had overwritten the file). I noted the live `agents.py status` capture therein — I will
re-run this independently during cold-verify and will NOT use the Builder's capture as my verdict.

## Break-it probes

(will log independent probes here as they run)

## D2 — Live agents.py status (Adversary independent capture @2026-06-13T19:36Z)

Run from scratch: `cd /srv/cc-ci/cc-ci-plan && python3 agents.py status`

```
phase: poe2e [19/19] plan=plan-phase-poe2e-end-to-end.md (in progress)
AGENT         KIND        BACKEND  MODEL              WATCH       STATE
orchestrator  persistent  claude   claude-opus-4-8    heal        RUNNING [claude]
builder       loop        claude   claude-opus-4-8    heal+stall  RUNNING [claude]
adversary     loop        claude   claude-sonnet-4-6  heal+stall  RUNNING [claude]
assistant     persistent  claude   claude-sonnet-4-6  none        stopped (disabled)
upgrader      task        claude   claude-sonnet-4-6  none        RUNNING (disabled) [claude]
report        task        claude   claude-opus-4-8    none        RUNNING (disabled) [claude]
cleanlogs     service     -        -                  -           RUNNING
watchdog      service     -        -                  -           RUNNING
```

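
This is the parity target for D2. The staged cc-ci `agents.py status` must match the AGENT/KIND/BACKEND/MODEL/WATCH columns. At capture time the expectation was that STATE would differ, with every staged agent showing `stopped` because the staged project is never started. The D2 verdict records that STATE matched as well: `agents.py status` only reads `tmux has-session`, and the staged config shares the live `cc-ci-` session prefix.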
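
A parity check restricted to the five configuration columns might look like the sketch below. It assumes the whitespace-separated layout of the capture above and that no value in the first five columns contains a space; `config_columns` and `config_parity` are illustrative helpers, not engine functions. STATE and anything after it is ignored, so the check holds whether or not the staged agents are running.

```python
HEADER = ["AGENT", "KIND", "BACKEND", "MODEL", "WATCH"]


def config_columns(status_text: str) -> list[tuple[str, ...]]:
    """Return the AGENT/KIND/BACKEND/MODEL/WATCH fields of each table row."""
    rows = []
    seen_header = False
    for line in status_text.splitlines():
        fields = line.split()
        if not seen_header:
            # Skip everything (e.g. the "phase:" line) up to and including the header.
            seen_header = fields[:5] == HEADER
            continue
        if len(fields) >= 6:  # five config columns plus at least one STATE token
            rows.append(tuple(fields[:5]))
    return rows


def config_parity(staged: str, live: str) -> bool:
    """True when both captures list the same agents with the same config columns."""
    staged_rows = config_columns(staged)
    return bool(staged_rows) and staged_rows == config_columns(live)


if __name__ == "__main__":
    live = (
        "AGENT KIND BACKEND MODEL WATCH STATE\n"
        "builder loop claude claude-opus-4-8 heal+stall RUNNING [claude]\n"
        "watchdog service - - - RUNNING\n"
    )
    # Same rows with a different STATE column, as originally expected for staging.
    staged = live.replace("RUNNING [claude]", "stopped").replace("RUNNING", "stopped")
    print(config_parity(staged, live))
```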

Also noted: PO scripts exist at `/home/loops/porepo/project-orchestrator/scripts/` (create, start, stop, update, fleet.py). The `demo` backend is defined in `agents.example.toml` as `bin = "echo '[demo] ...' ; exec sleep 1000000"` — starts a sleeping process the engine tracks as RUNNING. This is what D1 will use for the isolated run.