Root-caused (empirically, dockerd logs) the discourse/ghost deploy wedges: the shared proxy overlay (/24=254 VIPs) exhausts as concurrent stack rm leaks endpoints over many days -> tasks stuck in Swarm 'New'. Add a per-run safety net to Step 0 (network prune + docker restart when VIP-allocation failures are logged). Plans + memory for the durable fix (enlarge proxy to /16 in swarm.nix, maintenance window) and for debugging/fixing the ghost PR afterward.
21 lines
1.2 KiB
Markdown
21 lines
1.2 KiB
Markdown
---
|
|
name: ghost-pr-debug
|
|
description: TODO after proxy fix — debug & fix the ghost recipe upgrade PR (its !testme kept wedging; possible duplicate PR from interrupt churn)
|
|
metadata:
|
|
node_type: memory
|
|
type: project
|
|
originSessionId: 85355980-5e4f-4f90-b1ca-d0e4fe82f04b
|
|
---
|
|
|
|
During the 2026-06-12 weekly upgrade, **ghost** (6.42.0→6.44.1 + mysql bump) was the recipe whose
|
|
`!testme` kept wedging — its deploys hung at 0/1 in Swarm `New`, which was the **proxy VIP
|
|
exhaustion** infra issue ([[proxy-vip-exhaustion-runbook]]), not necessarily a ghost defect. It also
|
|
got run by a DUPLICATE subagent during the interrupt churn, so the ghost PR/branch state may be messy.
|
|
|
|
**TODO (after the proxy fix removes the infra confound):** inventory the ghost PR(s) on
|
|
recipe-maintainers/ghost (one or duplicate?), separate infra-failure from a real upgrade problem by
|
|
re-running `!testme` on a HEALTHY swarm, dedup any duplicate PR, fix-forward to green (recipe PR only;
|
|
comment on genuinely-stale tests, never edit them in default mode), and leave exactly one clean,
|
|
operator-ready ghost PR. NEVER merge. Plan: `cc-ci-plan/plan-ghostpr-debug-fix.md`. Delete this memory
|
|
once the ghost PR is clean + green (or clearly explained).
|