cc-ci/machine-docs/REVIEW-poe2e.md
autonomic-bot 6e07b3c8e4
review(poe2e): ALL DoD PASS @2026-06-13T19:46Z — phase DONE
2026-06-13 19:47:59 +00:00

REVIEW — phase poe2e (Adversary)

Phase plan: /srv/cc-ci/cc-ci-plan/plan-phase-poe2e-end-to-end.md

Initialized: 2026-06-13T19:25Z

Orientation

Phase mission: prove the whole model works end-to-end — PO scaffolds, runs (isolated), and tears down a throwaway project; cc-ci is modeled as a project in STAGING; live cc-ci is provably untouched.

Definition of Done (poe2e)

| # | DoD item | Status |
|---|---|---|
| D1 | PO scaffolded, ran (isolated), and tore down a throwaway project; evidence in REVIEW | PASS @2026-06-13T19:46Z |
| D2 | Staged cc-ci project: engine submodule pinned + migrated agents.toml; agents.py status MATCHES live cc-ci (side-by-side shown) | PASS @2026-06-13T19:46Z |
| D3 | Staged cc-ci registered in fleet.toml | PASS @2026-06-13T19:46Z |
| D4 | Written, reviewed operator cutover runbook | PASS @2026-06-13T19:46Z |
| D5 | Live cc-ci provably untouched: tmux sessions + /srv/cc-ci/cc-ci-plan/agents.{py,toml} + state/ unchanged; no second watchdog started | PASS @2026-06-13T19:46Z |

Verdicts

ALL DoD PASS @2026-06-13T19:46Z — phase DONE

Cold-verified from the Adversary's own clone (/srv/cc-ci/cc-ci-adv) and fresh shell. No VETO.


D1 PASS @2026-06-13T19:46Z

Re-ran the full PO scratch lifecycle independently:

cd /home/loops/porepo/project-orchestrator
bash scripts/create-project.sh scratch-e2e --dir /tmp/poe2e-scratch --ref v0.1.0 --prefix poe2e-scratch-

Scaffold output: engine pinned at 289ef07df40a8264f3a36b4e91b923d1424c4658 (v0.1.0), config: agents.toml (session_prefix = poe2e-scratch-). Tracked files: .gitignore, .gitmodules, agents.toml, engine — no PO/fleet metadata.

Injected the demo backend with prompt_delivery = "exec". This setting is required: with the default "arg", sleep receives the kickoff text as an argument and exits immediately. Lifecycle steps and observed results:

  • python3 engine/agents.py status → worker=stopped, watchdog=stopped
  • python3 engine/agents.py up → starting poe2e-scratch-worker (demo, ...) + starting watchdog
  • tmux ls | grep poe2e-scratch → both sessions present
  • python3 engine/agents.py status → worker RUNNING [sleep], watchdog RUNNING
  • Live cc-ci sessions during run: exactly 8 cc-ci-* sessions unchanged
  • python3 engine/agents.py down → killing poe2e-scratch-worker, killing poe2e-scratch-watchdog
  • tmux ls | grep poe2e-scratch || echo "torn down" → torn down
  • python3 engine/agents.py status → both stopped
  • rm -rf /tmp/poe2e-scratch → throwaway deleted

Note: The demo backend in agents.example.toml uses prompt_delivery = "exec" (not the default "arg"). Any cold-verify that injects the demo backend must include this field — otherwise the sleep process receives the kickoff file content as args and exits immediately.
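
The failure mode in the note above can be reproduced outside the engine. This is a minimal sketch, not the engine's launch code: it assumes only that "arg" delivery places the kickoff text on the backend's command line, as the note describes. The kickoff string is a hypothetical placeholder, and a zero-second duration stands in for the demo backend's long sleep so the sketch returns quickly.

```python
import subprocess

# Hypothetical kickoff text; the real kickoff file content is not shown in this review.
KICKOFF = "You are the worker agent. Begin phase work."


def run_sleep(args):
    """Run `sleep` with the given arguments and return its exit code."""
    proc = subprocess.run(
        ["sleep", *args],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode


# Mimics "arg" delivery: the kickoff text ends up in sleep's argv.
# sleep cannot parse it as a duration, so it exits at once with an error.
rc_arg = run_sleep([KICKOFF])

# Mimics the intended demo backend: sleep sees only a numeric duration.
rc_plain = run_sleep(["0"])

print("kickoff passed as argument, exit code:", rc_arg)
print("numeric duration only, exit code:", rc_plain)
```

A non-zero exit code in the first case corresponds to the worker session dying right after `up`, which is why the injected backend needs prompt_delivery = "exec".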


D2 PASS @2026-06-13T19:46Z

Cold clone: git clone --recurse-submodules /home/loops/poe2e/cc-ci /tmp/poe2e-ccci-cold

  • HEAD: 38e5c907b9e37b8aebbfccb2e1ad8de7e2d880cb
  • Submodule: 289ef07df40a8264f3a36b4e91b923d1424c4658 engine (v0.1.0)
  • (a) Phase list: phases: 19 19 | identical: True
  • (b) Phase seq: rcust shot lvl5 bsky dstamp mailu kuma drone cfold cf55 pvfix pvcheck ghost cf48 pxgate aoeng aotest porepo poe2e
  • (c) After phase set 18 (poe2e): diff /tmp/s.txt /tmp/l.txt → STATUS BYTE-IDENTICAL
    • Both print: phase: poe2e [19/19] plan=plan-phase-poe2e-end-to-end.md (in progress) + identical 8-agent table
    • STATE column shows RUNNING for live sessions because agents.py status uses read-only tmux has-session — the staged project started nothing; both configs point at the same live tmux sessions, which is why status is byte-identical
  • (d) builder kickoff identical: True, adversary kickoff identical: True

Cold clone deleted.


D3 PASS @2026-06-13T19:46Z

cd /home/loops/porepo/project-orchestrator
python3 scripts/fleet.py validate   → fleet: OK — 2 project(s), schema v1
python3 scripts/fleet.py status     → cc-ci [disabled] agent-orchestrator@v0.1.0  /home/loops/poe2e/cc-ci
                                       total=2 enabled=1 disabled=1

cc-ci is registered as disabled — correct, it must not be started by the PO (that would conflict with the live system). Operator cutover enables it per runbook §6.


D4 PASS @2026-06-13T19:46Z

Read /home/loops/poe2e/cc-ci/docs/cutover-runbook.md. Covers all expected sections:

  • §0: What-stays/what-changes table with exact config deltas
  • §1: Pre-flight + parity gate (engine/agents.py status on project must match live before proceeding)
  • §2: Quiesce live — systemctl stop cc-ci-loops.service + agents.py down + confirm zero cc-ci- sessions (critical: prevents double watchdog on shared namespace)
  • §3: Reuse vs fresh start decision (reuse recommended — preserves phase-idx + resume ids)
  • §4: Production config delta: change log_dir from .ao-state back to /srv/cc-ci/.cc-ci-logs
  • §5: Re-point launch.py/launch.sh at engine/agents.py --config agents.toml (keeps systemd + orchestrator's prompt working unchanged; rollback copy preserved as launch.py.preproject)
  • §6: Start + validate (launch.py status parity, single watchdog, handoff ping, flip fleet entry to enabled)
  • §7: Fast rollback (re-point launch.py, restart)
  • Appendix: explicitly notes no ACME/DNS/prod-domain work (out of scope)

Runbook is operator-supervised and explicitly states loops MUST NOT perform this cutover themselves.


D5 PASS @2026-06-13T19:46Z

Final check (vs baseline @19:25Z):

  • agents.toml SHA256: 0d78ba55329705055bbb39722292b6d131cdd30f37eb814e50316f7c0e222b88 ✓ unchanged
  • agents.py SHA256: b4567b73099a587b5727a194f80a5e908d1a1589691294230e6ad1492fb9fe9a ✓ unchanged
  • state/phase-idx: 18 ✓ unchanged
  • tmux sessions: exactly 8 cc-ci-* sessions, all with same creation times as baseline ✓
  • cc-ci-watchdog count: exactly 1 ✓ (no second watchdog started)
  • cc-ci host: no tmux sessions ✓ unchanged

The staged project (/home/loops/poe2e/cc-ci) uses session_prefix = "cc-ci-" for fidelity but the Builder ran ONLY status/phase show/phase set against it — none of which start or kill sessions. The scratch D1 demo ran under poe2e-scratch- namespace. No live cc-ci file or session was touched.
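
The checksum part of the final check can be sketched as follows. This is an illustrative helper, not the command the Adversary actually ran (the review does not record it); the function names are hypothetical. The digests are the baseline values recorded in this review, and the paths are the expansion of /srv/cc-ci/cc-ci-plan/agents.{py,toml}.

```python
import hashlib
from pathlib import Path

# Baseline digests as recorded in this review (snapshot @2026-06-13T19:25Z).
BASELINE = {
    "/srv/cc-ci/cc-ci-plan/agents.toml":
        "0d78ba55329705055bbb39722292b6d131cdd30f37eb814e50316f7c0e222b88",
    "/srv/cc-ci/cc-ci-plan/agents.py":
        "b4567b73099a587b5727a194f80a5e908d1a1589691294230e6ad1492fb9fe9a",
}


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(baseline):
    """Return the paths whose current digest differs from the baseline.

    `baseline` maps a file path to the hex digest recorded earlier.
    A missing file counts as changed.
    """
    changed = []
    for path, expected in baseline.items():
        try:
            actual = sha256_of(path)
        except FileNotFoundError:
            actual = None
        if actual != expected:
            changed.append(path)
    return changed
```

On the orchestrator host, `changed_files(BASELINE)` returning an empty list would correspond to the two "unchanged" checkmarks above.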

D5 — Live cc-ci baseline snapshot @2026-06-13T19:25Z (pre-Builder)

Taken before Builder started any poe2e work. Will diff against this on cold-verify.

agents.toml SHA256: 0d78ba55329705055bbb39722292b6d131cdd30f37eb814e50316f7c0e222b88
agents.py SHA256: b4567b73099a587b5727a194f80a5e908d1a1589691294230e6ad1492fb9fe9a
state/phase-idx: 18 (poe2e, index 18 in the phases array)

tmux sessions (orchestrator host, pre-Builder):

cc-ci-adv (just started)
cc-ci-assistant3 (pre-existing since 2026-06-09)
cc-ci-builder (just started)
cc-ci-cleanlogs (pre-existing since 2026-06-02)
cc-ci-orchestrator (pre-existing since 2026-06-13)
cc-ci-report (pre-existing since 2026-06-12)
cc-ci-upgrader (pre-existing since 2026-06-11)
cc-ci-watchdog (pre-existing since 2026-06-13)

cc-ci host tmux: no tmux sessions (cc-ci has no tmux sessions at phase start)

D5 PASS criterion: after all Builder work, agents.toml + agents.py checksums unchanged, state/phase-idx still 18, no new cc-ci-*-prefixed watchdog sessions started, cc-ci host tmux still empty (or unchanged).
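
The session part of this criterion can be sketched as a comparison of two tmux snapshots. The review does not say how creation times were captured, so this sketch assumes each snapshot is the text output of `tmux list-sessions -F '#{session_name} #{session_created}'`, which prints one "name epoch-seconds" pair per line. The function names are hypothetical.

```python
def parse_snapshot(text):
    """Map session name -> creation time from 'name epoch' lines."""
    sessions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The epoch is the last space-separated field on the line.
        name, created = line.rsplit(" ", 1)
        sessions[name] = int(created)
    return sessions


def compare_snapshots(before, after, prefix="cc-ci-"):
    """Report sessions under `prefix` that were started, stopped, or restarted.

    A session counts as restarted when its name survives but its
    creation time differs between the two snapshots.
    """
    old = {n: t for n, t in parse_snapshot(before).items() if n.startswith(prefix)}
    new = {n: t for n, t in parse_snapshot(after).items() if n.startswith(prefix)}
    return {
        "started": sorted(set(new) - set(old)),
        "stopped": sorted(set(old) - set(new)),
        "restarted": sorted(n for n in old.keys() & new.keys() if old[n] != new[n]),
    }
```

Three empty lists would match the criterion: no new cc-ci-* session (so no second watchdog), none killed, and every surviving session keeping its baseline creation time. Sessions under other prefixes, such as poe2e-scratch-, are ignored by the filter.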

Note on JOURNAL: The system-reminder auto-surfaced JOURNAL-poe2e.md contents during git pull (Builder had overwritten the file). I noted the live agents.py status capture therein — I will re-run this independently during cold-verify and will NOT use the Builder's capture as my verdict.

Break-it probes

(will log independent probes here as they run)

D2 — Live agents.py status (Adversary independent capture @2026-06-13T19:36Z)

Run from scratch: cd /srv/cc-ci/cc-ci-plan && python3 agents.py status

  phase: poe2e [19/19]  plan=plan-phase-poe2e-end-to-end.md  (in progress)
  AGENT          KIND        BACKEND   MODEL                WATCH      STATE
  orchestrator   persistent  claude    claude-opus-4-8      heal       RUNNING  [claude]
  builder        loop        claude    claude-opus-4-8      heal+stall RUNNING  [claude]
  adversary      loop        claude    claude-sonnet-4-6    heal+stall RUNNING  [claude]
  assistant      persistent  claude    claude-sonnet-4-6    none       stopped (disabled)
  upgrader       task        claude    claude-sonnet-4-6    none       RUNNING (disabled)  [claude]
  report         task        claude    claude-opus-4-8      none       RUNNING (disabled)  [claude]
  cleanlogs      service     -         -                    -          RUNNING
  watchdog       service     -         -                    -          RUNNING

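This is the parity target for D2. The staged cc-ci agents.py status must match the AGENT/KIND/BACKEND/MODEL/WATCH columns. STATE was expected to differ, on the assumption that the staged project, never started, would show every agent as stopped; the D2 verdict above records that STATE matched as well, because both configs share the cc-ci- session prefix and status only reads tmux has-session.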
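
A parity check on the configuration columns alone can be sketched as below. It assumes the status layout shown in the capture above: a header row starting with AGENT, then one whitespace-separated row per agent with no spaces inside the first five column values. The function names are hypothetical and are not part of agents.py.

```python
def config_columns(status_text):
    """Extract (AGENT, KIND, BACKEND, MODEL, WATCH) for each agent row.

    Lines before the header (such as the 'phase:' line) are skipped,
    and everything from the STATE column onward is dropped.
    """
    rows = []
    seen_header = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "AGENT":
            seen_header = True
            continue
        if seen_header and len(fields) >= 6:
            rows.append(tuple(fields[:5]))
    return rows


def config_parity(live_text, staged_text):
    """True when both captures list the same agents with the same config columns."""
    return config_columns(live_text) == config_columns(staged_text)
```

Comparing only the first five columns keeps the check meaningful even when session state differs between the two captures; a plain diff of the full output, as used in D2 (c), is the stricter test.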

Also noted: PO scripts exist at /home/loops/porepo/project-orchestrator/scripts/ (create, start, stop, update, fleet.py). The demo backend is defined in agents.example.toml as bin = "echo '[demo] ...' ; exec sleep 1000000" — starts a sleeping process the engine tracks as RUNNING. This is what D1 will use for the isolated run.