# Phase `ghost`: re-evaluate ghost after proxy fix and leave one clean PR

**Mission:** re-evaluate the `ghost` upgrade failure after the Swarm proxy/IPAM infra confound has been removed, then leave exactly one operator-ready ghost PR: green if the recipe is sound, or clearly explained with the minimum required recipe fix/comment if a real Ghost/MySQL upgrade issue remains.

State files live under `machine-docs/`: `STATUS-ghost.md`, `BACKLOG-ghost.md`, `REVIEW-ghost.md`, `JOURNAL-ghost.md`.

## Context

The 2026-06-12 `/upgrade-all` recorded `ghost` as the only failed recipe, but the evidence was mixed:

- One failure was definitely infra: shared `proxy` overlay VIP exhaustion left tasks stuck in Swarm `New` state.
- A later failure may be recipe-specific: MySQL 8.0 to 8.4 data-dir upgrade timing under Swarm's default update monitor, producing `UpdateStatus=paused` under load.
- A previous run on 2026-06-05 passed the Ghost/MySQL path under lighter load.
- Duplicate ghost subagent churn may have left branch/PR/comment state messy.

Existing focused plan/background: `/srv/cc-ci/cc-ci-plan/plan-ghostpr-debug-fix.md`.

## Required Work

1. **Inventory PR state.** On `recipe-maintainers/ghost`, list all open PRs and branches related to the upgrade. Identify the correct PR, expected to be ghost PR `#4`, and close or clearly mark any duplicate only if it is truly superseded. Never merge recipe PRs.
2. **Separate infra from recipe behavior.** After `pvfix` and `pvcheck`, trigger a fresh `!testme` on the correct ghost PR and watch the run. Do not count pre-proxy failures as current recipe evidence.
3. **If green:** record that the prior failure was infra/timing-confounded, ensure no stale stacks/volumes remain, and leave the PR ready for operator review.
4. **If red for a real recipe reason:** make the smallest recipe PR change needed. The suspected fix is a longer Swarm update monitor/start grace around the MySQL 8.0 to 8.4 data-dir migration, e.g. `update_config.monitor: 300s` and related minimal service health timing. Validate the hypothesis with logs; do not cargo-cult timing knobs.
5. **If the test is genuinely stale:** default recipe-upgrade policy applies: leave an explanatory PR comment for the operator. Do not edit cc-ci tests in this phase unless the operator explicitly asks for a test-update phase.
6. **Deduplicate and clean up.** Ensure exactly one relevant open ghost upgrade PR remains, comments explain the final state, and no `ghos-*`/`dev-ghost` stacks or volumes leak.

## Gates

**M1: State inventory and clean retry.** Builder documents PR/branch/comment/build state, identifies the correct PR, and runs one clean post-proxy `!testme`. Adversary verifies that pre-proxy infra failures were not misclassified as current recipe failures.

**M2: Operator-ready outcome.** The ghost PR is green, or it has the minimal justified recipe fix/comment and a clear current blocker. Duplicate PR/branch mess is resolved and no ghost resources leak. Adversary verifies live PR state, build evidence, and cleanup.

## Guardrails

- Recipe PRs are never merged by agents.
- Do not weaken tests to get green.
- Do not re-run ghost during proxy maintenance or while `cfold` owns a broad CI sweep.
- Keep iterations bounded: at most three fresh post-proxy `!testme` attempts unless the operator authorizes more.
- Preserve useful failure evidence in PR comments and `machine-docs/STATUS-ghost.md`.

## Definition of Done

Exactly one ghost upgrade PR is operator-ready, with a fresh post-proxy verdict and clear classification of the 2026-06-12 failure. Any real recipe fix is minimal and verified; otherwise the PR is green or has a precise operator-facing explanation. Adversary has signed off on M1 and M2 in `machine-docs/REVIEW-ghost.md`; Builder writes `## DONE` only after both gates have fresh Adversary PASSes.
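
For reference, the suspected fix named in Required Work item 4 might look roughly like the Compose fragment below. This is only a sketch under stated assumptions: the service name `db`, the image tag, the healthcheck command, and every timing value other than `monitor: 300s` are hypothetical and are not taken from the actual ghost recipe. The keys themselves (`deploy.update_config.monitor`, `failure_action`, `order`, and `healthcheck.start_period`) are standard Compose/Swarm settings.

```yaml
# Hypothetical fragment: service name, image tag, healthcheck command, and all
# timings except `monitor: 300s` are illustrative assumptions, not recipe facts.
services:
  db:
    image: mysql:8.4
    healthcheck:
      # Assumes mysqladmin is present in the image; ping reports whether the
      # server is accepting connections.
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
      interval: 10s
      timeout: 5s
      retries: 5
      # Health failures inside this window do not count toward `retries`,
      # which gives the 8.0 to 8.4 data-dir upgrade room to finish.
      start_period: 300s
    deploy:
      update_config:
        # How long Swarm watches an updated task for failure.
        monitor: 300s
        # Pausing on failure is what surfaces as UpdateStatus=paused.
        failure_action: pause
        order: stop-first
```

Any such change should only go into the PR if the logs from a fresh post-proxy run show the data-dir upgrade outlasting the current monitor window; otherwise the timing knobs stay untouched.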