diff --git a/machine-docs/BACKLOG-poe2e.md b/machine-docs/BACKLOG-poe2e.md
index 8b9eb3d..ed16808 100644
--- a/machine-docs/BACKLOG-poe2e.md
+++ b/machine-docs/BACKLOG-poe2e.md
@@ -4,31 +4,31 @@
 
 (Builder-owned)
 
-- [ ] **B1 — PO scratch project full lifecycle (D1).** Use the PO's `scripts/create-project.sh` to
+- [x] **B1 — PO scratch project full lifecycle (D1).** Use the PO's `scripts/create-project.sh` to
   scaffold a throwaway scratch project under an isolated parent dir; switch it to the engine's
   dependency-free `demo` backend on a unique `session_prefix`; `up` it, confirm `status` shows the
   sessions RUNNING through the harness; `down` it; delete the throwaway. Capture full transcript.
-- [ ] **B2 — Staged cc-ci project skeleton (D2).** Scaffold a local git repo `cc-ci` (staging) with
+- [x] **B2 — Staged cc-ci project skeleton (D2).** Scaffold a local git repo `cc-ci` (staging) with
   `engine/` submodule pinned at v0.1.0 (`289ef07`). Initial commit.
-- [ ] **B3 — Migrate `agents.toml` (D2).** Translate the live `/srv/cc-ci/cc-ci-plan/agents.toml`
+- [x] **B3 — Migrate `agents.toml` (D2).** Translate the live `/srv/cc-ci/cc-ci-plan/agents.toml`
   to the engine v0.1.0 schema: all agents + services, both backends, defaults (+ required
   `session_prefix`/`log_dir`), the full `[loop]` phases array (19 phases) with per-phase model
   overrides, handoff, on_complete, plus `kickoff_template` + `roles_dir`.
-- [ ] **B4 — Migrate `prompts/` (D2).** Copy `prompts/{builder,adversary}.md` verbatim from live;
+- [x] **B4 — Migrate `prompts/` (D2).** Copy `prompts/{builder,adversary}.md` verbatim from live;
   author `prompts/kickoff.md` reproducing the live `build_loop_kickoff()` preamble via the engine's
   `{phase_id}/{plan}/{status}/{role}` slots.
-- [ ] **B5 — Parity verification (D2).** Run `engine/agents.py status` on the staged config from a
+- [x] **B5 — Parity verification (D2).** Run `engine/agents.py status` on the staged config from a
   clean checkout inside `nix develop`; diff agents/models/phases against the live status; produce a
   side-by-side in STATUS. Must match (modulo the STATE column, which differs because staged is
   never started).
-- [ ] **B6 — Register staged cc-ci in `fleet.toml` (D3).** Add a `[[project]]` entry in the PO
+- [x] **B6 — Register staged cc-ci in `fleet.toml` (D3).** Add a `[[project]]` entry in the PO
   repo's `fleet.toml`; `scripts/fleet.py validate` passes.
-- [ ] **B7 — Operator cutover runbook (D4).** Write the exact, reviewed operator-supervised cutover
+- [x] **B7 — Operator cutover runbook (D4).** Write the exact, reviewed operator-supervised cutover
   steps (stop live → point systemd/shims at the project's engine → start), with rollback.
-- [ ] **B8 — Prove live untouched (D5).** Re-checksum live `agents.{py,toml}`, `state/phase-idx`,
+- [x] **B8 — Prove live untouched (D5).** Re-checksum live `agents.{py,toml}`, `state/phase-idx`,
   and tmux session list; confirm unchanged vs the Adversary's baseline; confirm no `cc-ci-`-prefixed
   watchdog/loop was started by me.
-- [ ] **B9 — Claim the gate.** Clean tree (commit + push everything), STATUS `## Gate CLAIMED` with
+- [x] **B9 — Claim the gate.** Clean tree (commit + push everything), STATUS `## Gate CLAIMED` with
   WHAT/HOW/EXPECTED/WHERE; await Adversary.
 
 ## Adversary findings
diff --git a/machine-docs/JOURNAL-poe2e.md b/machine-docs/JOURNAL-poe2e.md
index b14e188..377cbcb 100644
--- a/machine-docs/JOURNAL-poe2e.md
+++ b/machine-docs/JOURNAL-poe2e.md
@@ -54,6 +54,44 @@ Established facts:
   tmux sessions (provable RUNNING) without spending tokens or risking any collision, on a unique
   isolated `session_prefix`. Then `down` + delete the throwaway.
 
+## 2026-06-13T19:41Z — All 5 DoD built + cold-verified; claiming gate
+
+Built and verified end to end. The WHY behind the STATUS facts:
+
+- **D1 (lifecycle).** Used the PO's `create-project.sh` to scaffold `/tmp/poe2e-scratch/scratch-e2e`
+  (engine pinned `289ef07`; tracked files exactly `.gitignore .gitmodules agents.toml engine` — no
+  PO/fleet metadata), switched it to the `demo` backend so `up` really starts tmux sessions with no
+  token spend and on the isolated `poe2e-scratch-` namespace. Observed: `up` → both sessions; `status`
+  → RUNNING; `down` → killed; `status` → stopped; deleted. The 8 live `cc-ci-*` sessions never moved.
+- **D2 (migration + parity).** The migration is faithful: `role_model()` and `cmd_status()` render
+  byte-identical between the live engine and v0.1.0 (I diffed `role_model` — IDENTICAL — and read
+  `cmd_status`). I copied the `phases` array verbatim (incl. the `"opus"` shorthand for dstamp and all
+  per-phase `models`), so `tomllib`-comparing the two configs' phase arrays gives `True`. The biggest
+  confidence boost: rendering the staged builder/adversary kickoffs via the engine and diffing against
+  the *live generated* `kickoff-cc-ci-*.txt` → **byte-identical**, proving prompts/kickoff.md +
+  prompts/{builder,adversary}.md reproduce the live `build_loop_kickoff()` exactly. The staged
+  `status` is byte-identical to live including STATE, because `session_prefix="cc-ci-"` means
+  `session_alive()` (read-only `tmux has-session`) sees the live sessions — the staged project starts
+  nothing. **Critical safety finding:** the engine's `load_config()` does
+  `Path(log_dir/state).mkdir(exist_ok=True)` on EVERY invocation incl. `status` — so the staged
+  `log_dir` must be the isolated `.ao-state`, never the live `/srv/cc-ci/.cc-ci-logs` (the cutover
+  runbook flips it back). That's why staging uses an isolated state dir.
+- **D3.** Registered `cc-ci` in the PO `fleet.toml` as `enabled=false` (the PO must never start it —
+  shared namespace would collide with live). `fleet.py validate` → OK, 2 projects.
+- **D4.** Cutover runbook derived from the *actual* live boot chain I inspected
+  (`cc-ci-loops.service → cc-ci-loops-start → launch.sh start → launch.py [shim] → agents.py up`,
+  cwd `/srv/cc-ci/cc-ci`, `RESUME_PHASE=1`). The cutover is one indirection change (re-point
+  `launch.py` at the project engine) + one config delta (`log_dir` → live path to resume phase/ids) +
+  quiesce-then-start to avoid a double watchdog; rollback is just restoring the old shim. The
+  in-place `agents.{py,toml}` stay present throughout → trivial rollback.
+- **D5.** Re-checksummed live `agents.{py,toml}` (both == baseline), `phase-idx`=18, the 8 baseline
+  sessions, exactly 1 `cc-ci-watchdog`, cc-ci host has no tmux. Nothing I did wrote live files/state
+  or started a `cc-ci-` session.
+
+Deliverable SHAs: staged cc-ci `/home/loops/poe2e/cc-ci` @ `38e5c90` (engine `289ef07` v0.1.0);
+PO `recipe-maintainers/project-orchestrator` @ `6cc3ed4` (pushed). Cleaned up `/tmp` scratch +
+cold-clone artifacts. Claiming the gate.
+
 ## Adversary pre-Builder D5 baseline (preserved verbatim from the Adversary's init)
 
 > The Adversary recorded this in JOURNAL-poe2e.md at phase start, before I took ownership. Kept here
diff --git a/machine-docs/STATUS-poe2e.md b/machine-docs/STATUS-poe2e.md
index 9ec496f..63b5978 100644
--- a/machine-docs/STATUS-poe2e.md
+++ b/machine-docs/STATUS-poe2e.md
@@ -1,24 +1,163 @@
 # STATUS — phase poe2e (Builder)
 
 **Phase plan:** `/srv/cc-ci/cc-ci-plan/plan-phase-poe2e-end-to-end.md`
-**Consumes:** PO repo `recipe-maintainers/project-orchestrator` (`main` @ `346ed31`) + engine
-`recipe-maintainers/agent-orchestrator` @ `v0.1.0` (`289ef07`).
 
 ---
 
-## Gate: NOT YET CLAIMED — build in progress
+## Gate: CLAIMED — all 5 DoD built + cold-verified @2026-06-13T19:41Z — awaiting Adversary
 
-Building D1–D5. No `## DONE` and no gate claim until every DoD item is built and cold-verified, then
-Adversary-PASS in REVIEW-poe2e.md.
+### Deliverables (WHERE)
+- **Staged cc-ci project** (local staging git repo, the phase's sanctioned "staging dir"):
+  `/home/loops/poe2e/cc-ci`, `main` HEAD `38e5c907b9e37b8aebbfccb2e1ad8de7e2d880cb`.
+  `engine/` submodule pinned `289ef07df40a8264f3a36b4e91b923d1424c4658` = tag `v0.1.0` of
+  `recipe-maintainers/agent-orchestrator` (public; `.gitmodules` URL is the public Gitea URL, so a
+  recursive clone fetches the engine without creds). Tracked files: `agents.toml`,
+  `prompts/{kickoff,builder,adversary}.md`, `ai-progress-monitor-prompt.txt`, `docs/cutover-runbook.md`,
+  `.gitignore`, `.gitmodules`, `engine` (gitlink). Runtime state (`.ao-state/`) is gitignored.
+- **PO fleet registry**: `recipe-maintainers/project-orchestrator` on `git.autonomic.zone`, `main`
+  HEAD `6cc3ed4` (pushed). `fleet.toml` now has the `cc-ci` `[[project]]` entry (`enabled = false`).
+- **Live cc-ci** (the parity target / must-be-untouched): `/srv/cc-ci/cc-ci-plan/agents.{py,toml}`,
+  `/srv/cc-ci/.cc-ci-logs/state/`, and the `cc-ci-*` tmux sessions on the orchestrator host.
 
-### DoD progress (Builder self-track; authoritative verification is the Adversary's REVIEW)
+### Nothing live was started or modified
+The staged config uses `session_prefix = "cc-ci-"` (faithful to live). I ran ONLY `status` / `phase
+show` / `phase set` on it — all read-only or writing the staged repo's own gitignored `.ao-state`.
+I never ran `up`/`down`/`watchdog` on the staged config (which would target the live `cc-ci-`
+sessions). The staged `status` STATE column reads RUNNING because `session_alive()` is a read-only
+`tmux has-session` query that sees the *live* sessions — the staged project started nothing.
 
-| # | DoD item | Build state |
-|---|---|---|
-| D1 | PO scaffolded, ran (isolated), tore down a throwaway project | not started |
-| D2 | Staged cc-ci: engine submodule pinned + migrated agents.toml; `agents.py status` MATCHES live (side-by-side) | not started |
-| D3 | Staged cc-ci registered in `fleet.toml` | not started |
-| D4 | Operator cutover runbook | not started |
-| D5 | Live cc-ci provably untouched | not started |
+---
+
+## DoD verification (WHAT / HOW / EXPECTED)
+
+### D1 — PO scaffolded, ran (isolated), and tore down a throwaway project
+**HOW** (re-runnable):
+```bash
+cd /home/loops/porepo/project-orchestrator
+rm -rf /tmp/poe2e-scratch
+bash scripts/create-project.sh scratch-e2e --dir /tmp/poe2e-scratch --ref v0.1.0 --prefix poe2e-scratch-
+# switch the scaffold to the dependency-free `demo` backend (no token spend, isolated namespace):
+# edit /tmp/poe2e-scratch/scratch-e2e/agents.toml → backend="demo" + [backend.demo] + one demo agent
+cd /tmp/poe2e-scratch/scratch-e2e
+python3 engine/agents.py status  # worker+watchdog: stopped
+python3 engine/agents.py up  # starts poe2e-scratch-worker + poe2e-scratch-watchdog
+tmux ls | grep poe2e-scratch  # both sessions present
+python3 engine/agents.py status  # worker RUNNING [sleep], watchdog RUNNING
+python3 engine/agents.py down  # kills both
+tmux ls | grep poe2e-scratch || echo "torn down"
+cd / && rm -rf /tmp/poe2e-scratch  # delete throwaway
+```
+**EXPECTED**: scaffold reports `engine pinned at 289ef07 (v0.1.0)`; tracked files exactly
+`.gitignore .gitmodules agents.toml engine` (no PO/fleet metadata). `up` prints
+`starting poe2e-scratch-worker (demo, …)` + `starting watchdog`; post-up `status` shows both
+`RUNNING`; `down` prints `killing …`; post-down `status` shows both `stopped`; throwaway deleted; the
+8 live `cc-ci-*` sessions untouched throughout (the demo used the isolated `poe2e-scratch-`
+namespace). I executed exactly this @19:31Z (transcript in JOURNAL-poe2e.md).
+
+### D2 — Staged cc-ci: engine submodule pinned + migrated agents.toml; `agents.py status` MATCHES live
+**HOW** (cold, from a fresh recursive clone of the staging repo):
+```bash
+cd /tmp && rm -rf poe2e-ccci-cold
+git clone --recurse-submodules /home/loops/poe2e/cc-ci poe2e-ccci-cold
+cd poe2e-ccci-cold
+git rev-parse HEAD  # 38e5c90…
+git submodule status  # 289ef07… engine (v0.1.0)
+
+# (a) phase LIST + per-phase models are byte-identical (index-independent, strongest proof):
+python3 - <<'PY'
+import tomllib
+live = tomllib.load(open('/srv/cc-ci/cc-ci-plan/agents.toml','rb'))['loop']['phases']
+stg = tomllib.load(open('agents.toml','rb'))['loop']['phases']
+print('phases:', len(live), len(stg), '| identical:', live == stg)
+PY
+
+# (b) full phase sequence:
+python3 engine/agents.py phase show
+
+# (c) exact status side-by-side at the live phase (set the staged index to poe2e=18):
+python3 engine/agents.py phase set 18
+python3 engine/agents.py status > /tmp/s.txt
+( cd /srv/cc-ci/cc-ci-plan && python3 agents.py status ) > /tmp/l.txt
+diff /tmp/s.txt /tmp/l.txt && echo "STATUS BYTE-IDENTICAL"
+
+# (d) the loop kickoff each agent would receive is byte-identical to the live generated one:
+python3 - <<'PY'
+import sys; sys.path.insert(0,'engine'); import agents
+cfg=agents.load_config('agents.toml')  # phase-idx already 18 from (c)
+for nm,live in [('builder','/srv/cc-ci/.cc-ci-logs/state/kickoff-cc-ci-builder.txt'),
+                ('adversary','/srv/cc-ci/.cc-ci-logs/state/kickoff-cc-ci-adv.txt')]:
+    got=agents.build_loop_kickoff(cfg,cfg['agents'][nm]); exp=open(live).read()
+    print(nm,'kickoff identical:', got==exp)
+PY
+cd / && rm -rf /tmp/poe2e-ccci-cold
+```
+**EXPECTED**: `HEAD 38e5c90`; submodule `289ef07 (v0.1.0)`. (a) `phases: 19 19 | identical: True`.
+(b) `seq: rcust shot lvl5 bsky dstamp mailu kuma drone cfold cf55 pvfix pvcheck ghost cf48 pxgate
+aoeng aotest porepo poe2e`. (c) **`STATUS BYTE-IDENTICAL`** — both print
+`phase: poe2e [19/19] plan=plan-phase-poe2e-end-to-end.md (in progress)` and the same 8-row agent
+table (orchestrator opus, builder opus, adversary sonnet, assistant sonnet/disabled, upgrader
+sonnet/disabled, report opus/disabled, cleanlogs + watchdog services). The STATE column matches
+because both read the same live `cc-ci-` sessions (read-only `tmux has-session`). (d) both
+`kickoff identical: True`. Migration deltas vs live are documented inline in the staged `agents.toml`
+("MIGRATE:" comments): added `session_prefix`, isolated staging `log_dir`, backend `process_name`/TUI
+fields, `cleanlogs` → `engine/agent-log.py`, `[loop].kickoff_template`/`roles_dir`. None affect the
+agents/models/phases columns.
+
+### D3 — Staged cc-ci registered in `fleet.toml`
+**HOW**:
+```bash
+cd /home/loops/porepo/project-orchestrator  # or: git clone --recurse-submodules \
+                                            #   https://git.autonomic.zone/recipe-maintainers/project-orchestrator.git
+python3 scripts/fleet.py validate
+python3 scripts/fleet.py status
+```
+**EXPECTED**: `fleet: OK — 2 project(s), schema v1`. `status` lists `cc-ci [disabled]
+agent-orchestrator@v0.1.0 /home/loops/poe2e/cc-ci` plus the sample `example-recipe-ci [enabled]`;
+`total=2 enabled=1 disabled=1`. `enabled=false` is deliberate — the PO must never start cc-ci
+(it would collide with the running live system); going live is the operator cutover.
+
+### D4 — Operator cutover runbook
+**HOW**: `cat /home/loops/poe2e/cc-ci/docs/cutover-runbook.md` (also reachable from a recursive
+clone). **EXPECTED**: a written, operator-supervised runbook: §0 what-stays/what-changes table +
+the exact config deltas; §1 pre-flight + parity gate; §2 quiesce live (stop `cc-ci-loops.service`,
+`agents.py down`, confirm zero `cc-ci-` sessions — prevents a double watchdog on the shared
+namespace); §3 reuse live state (`log_dir` → `/srv/cc-ci/.cc-ci-logs`); §4 production config deltas;
+§5 re-point `launch.py`/`launch.sh` at `/engine/agents.py --config /agents.toml`
+(keeps the systemd boot chain + the orchestrator's startup prompt working unchanged; `launch.py.orig`
+already preserved); §6 start + validate (`launch.py status` parity, single watchdog, handoff ping,
+flip fleet entry to enabled); §7 fast rollback (re-point `launch.py`, restart). Derived from the real
+live boot chain `cc-ci-loops.service → cc-ci-loops-start → launch.sh start → launch.py → agents.py up`.
+
+### D5 — Live cc-ci provably untouched
+**HOW** (compare to the Adversary's pre-Builder baseline @19:25Z):
+```bash
+sha256sum /srv/cc-ci/cc-ci-plan/agents.toml /srv/cc-ci/cc-ci-plan/agents.py
+cat /srv/cc-ci/.cc-ci-logs/state/phase-idx
+tmux ls | grep '^cc-ci' | sort
+tmux ls | grep -c 'cc-ci-watchdog'  # exactly 1
+ssh cc-ci 'tmux ls 2>/dev/null || echo "no tmux sessions"'
+```
+**EXPECTED** (all match baseline):
+- `agents.toml` SHA256 = `0d78ba55329705055bbb39722292b6d131cdd30f37eb814e50316f7c0e222b88` (unchanged).
+- `agents.py` SHA256 = `b4567b73099a587b5727a194f80a5e908d1a1589691294230e6ad1492fb9fe9a` (unchanged).
+- `state/phase-idx` = `18` (unchanged).
+- exactly the 8 baseline `cc-ci-*` sessions (orchestrator, builder, adv, assistant3, cleanlogs,
+  upgrader, report, watchdog); **exactly 1** `cc-ci-watchdog` (no second watchdog started by me).
+- cc-ci host: `no tmux sessions`.
+I verified all of the above @19:41Z. The staged config + scratch demo never wrote live `agents.*` /
+`state/` and never started a `cc-ci-`-prefixed session (the scratch demo ran under
+`poe2e-scratch-`).
+
+---
+
+## DoD summary
+
+| # | DoD item | Build state | Cold-verified |
+|---|---|---|---|
+| D1 | PO scaffolded, ran (isolated), tore down a throwaway project | DONE | 19:31Z |
+| D2 | Staged cc-ci: engine pinned + migrated agents.toml; status MATCHES live | DONE | 19:40Z |
+| D3 | Staged cc-ci registered in `fleet.toml` (disabled) | DONE | 19:40Z |
+| D4 | Operator cutover runbook | DONE | 19:41Z |
+| D5 | Live cc-ci provably untouched (files/state/sessions = baseline) | DONE | 19:41Z |
 
 (Reasoning / design rationale → JOURNAL-poe2e.md, kept out of STATUS to preserve anti-anchoring.)
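
The file half of the D5 check (re-checksum live `agents.{py,toml}` and compare against the baseline) could be scripted roughly as below. This is only a sketch and not part of the patch: `sha256_of`, `drifted`, and `BASELINE` are hypothetical names, and the two digests are the ones quoted in the D5 EXPECTED list. The `phase-idx` and tmux session checks are left to the shell commands in the diff.

```python
import hashlib
from pathlib import Path

# Baseline digests copied from the D5 EXPECTED list in the STATUS diff.
# The mapping itself is an illustration, not something the repo defines.
BASELINE = {
    "/srv/cc-ci/cc-ci-plan/agents.toml":
        "0d78ba55329705055bbb39722292b6d131cdd30f37eb814e50316f7c0e222b88",
    "/srv/cc-ci/cc-ci-plan/agents.py":
        "b4567b73099a587b5727a194f80a5e908d1a1589691294230e6ad1492fb9fe9a",
}


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def drifted(baseline):
    """Return the paths that are missing or whose digest differs from the baseline."""
    changed = []
    for path, expected in baseline.items():
        if not Path(path).is_file() or sha256_of(path) != expected:
            changed.append(path)
    return changed
```

On the orchestrator host, `drifted(BASELINE)` returning an empty list would correspond to the "unchanged" result claimed for D5; any returned path is a file that no longer matches the Adversary's baseline.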