# Plan — debug & fix the ghost recipe upgrade PR **Context:** during the 2026-06-12 weekly upgrade, ghost (ghost 6.42.0→6.44.1 + mysql bump) was the recipe whose `!testme` kept wedging. Its test deploys (`ghos-bdd2f3` etc.) hung at 0/1 in Swarm `New` state — which we now know was the **`proxy` VIP exhaustion** (see [[proxy-vip-exhaustion-runbook]] / `plan-proxy-vip-exhaustion-fix.md`), NOT necessarily a ghost defect. It also got run by a DUPLICATE subagent during the interrupt churn, so the PR/branch state may be messy. This plan figures out what actually went wrong and leaves the ghost PR clean + green. **Execute AFTER** the proxy VIP fix (so the infra confound is gone) and the current upgrade settles. Owner: orchestrator, or a focused `/recipe-upgrade ghost` re-run. ## What the upgrader already found (2026-06-12 summary — start here) ghost 6.42.0→6.44.1 (+ mysql 8.0→8.4). PR is **recipe-maintainers/ghost #4**, open with an analysis comment. Three `!testme` attempts: - run 1 (2026-06-05): **PASS** at 6.44.0 + mysql:8.4 (under lighter load). - run 2 (2026-06-12): FAIL — **IPAM/proxy-VIP exhaustion** (infra; the [[proxy-vip-exhaustion-runbook]] issue). - run 3 (2026-06-12): FAIL — **a REAL issue**: Swarm `UpdateStatus=paused` on the **mysql 8.0→8.4 data-dir upgrade race** — the default 5s `update_config.monitor` is too tight for the mysql data-dir migration under load, so Swarm marks the update paused/failed. The upgrader's suggested fix: add **`update_config.monitor: 300s`** (and likely `failure_action: continue` / a longer `start_period`) to the ghost **app** service so the mysql data-dir upgrade has time to converge. So the likely real fix is a small recipe-PR change to ghost's `update_config`, then re-verify — NOT a test change. ## Steps 1. **Inventory the ghost PR state.** On recipe-maintainers/ghost: list open PRs — is there ONE upgrade PR or a DUPLICATE (two branches/PRs from the two ghost subagents)? Capture each PR's branch, diff (image tag + version-label bumps), and its `!testme` comment history / build results. Read the upgrader transcript for both ghost subagents to see what each did. 2. **Separate infra failure from real failure.** The deploy wedges were proxy-VIP exhaustion (infra). Determine whether ghost ALSO has a genuine upgrade problem: does ghost 6.44.1 + the mysql bump deploy + pass its tests on a HEALTHY swarm? Re-run `!testme` on the ghost PR now that the box is healthy (post docker-restart / post proxy fix) and watch the real result. 3. **Dedup.** If two ghost PRs/branches exist, keep the correct one (right version bump, clean diff), close the duplicate with a note, and ensure no leftover `dev-ghost`/`ghos-*` stacks remain (reap). 4. **Fix forward to green.** If `!testme` is RED for a REAL reason (e.g. ghost 6.44.1 needs a config change, or the mysql major bump needs a migration step / a genuinely-stale test): apply the minimal recipe fix per `/recipe-upgrade` rules — recipe PR changes only; if a cc-ci TEST is genuinely stale, leave an explanatory PR COMMENT (do NOT edit tests in default mode). Iterate `!testme` ≤3× to green. 5. **Leave it operator-ready.** One clean ghost PR, `!testme` GREEN (or a clear comment explaining a legitimately-deferred issue), no duplicate, no leaked deploys. NEVER merge — operator merges. ## Acceptance The ghost upgrade is represented by exactly one PR with a clear, green (or clearly-explained) `!testme`, the duplicate-subagent mess cleaned, and a one-line note on whether ghost's original failure was purely the proxy-VIP infra issue or a real upgrade problem (and how it was fixed). ## Guardrails Recipe mirror = PR only, never merge / never push main. Reap any `dev-ghost`/`ghos-*` test stacks on exit. No secrets in logs/commits. Don't run while the proxy recreate (maintenance window) is in progress.