Files
cc-ci-orchestrator/memory/ghost-pr-debug.md
autonomic-bot ca02a0dd6f upgrade-all: proxy VIP-exhaustion guard in Step 0; runbooks for proxy /16 enlarge + ghost PR debug
Root-caused (empirically, dockerd logs) the discourse/ghost deploy wedges:
the shared proxy overlay (/24=254 VIPs) exhausts as concurrent stack rm leaks
endpoints over many days -> tasks stuck in Swarm 'New'. Add a per-run safety
net to Step 0 (network prune + docker restart when VIP-allocation failures are
logged). Plans + memory for the durable fix (enlarge proxy to /16 in swarm.nix,
maintenance window) and for debugging/fixing the ghost PR afterward.
2026-06-12 03:30:00 +00:00

21 lines
1.2 KiB
Markdown

---
name: ghost-pr-debug
description: TODO after proxy fix — debug & fix the ghost recipe upgrade PR (its !testme kept wedging; possible duplicate PR from interrupt churn)
metadata:
node_type: memory
type: project
originSessionId: 85355980-5e4f-4f90-b1ca-d0e4fe82f04b
---
During the 2026-06-12 weekly upgrade, **ghost** (6.42.0→6.44.1 + mysql bump) was the recipe whose
`!testme` kept wedging — its deploys hung at 0/1 in Swarm `New`, which was the **proxy VIP
exhaustion** infra issue ([[proxy-vip-exhaustion-runbook]]), not necessarily a ghost defect. It also
got run by a DUPLICATE subagent during the interrupt churn, so the ghost PR/branch state may be messy.
**TODO (after the proxy fix removes the infra confound):** inventory the ghost PR(s) on
recipe-maintainers/ghost (one or duplicate?), separate infra-failure from a real upgrade problem by
re-running `!testme` on a HEALTHY swarm, dedup any duplicate PR, fix-forward to green (recipe PR only;
comment on genuinely-stale tests, never edit them in default mode), and leave exactly one clean,
operator-ready ghost PR. NEVER merge. Plan: `cc-ci-plan/plan-ghostpr-debug-fix.md`. Delete this memory
once the ghost PR is clean + green (or clearly explained).