Root-caused (empirically, from dockerd logs) the discourse/ghost deploy wedges: the shared proxy overlay (/24 = 254 VIPs) runs out of addresses as concurrent stack rm leaks endpoints over many days, leaving tasks stuck in Swarm 'New'. Add a per-run safety net to Step 0 (network prune + docker restart when VIP-allocation failures are logged). Add plans + memory entries for the durable fix (enlarge proxy to /16 in swarm.nix, during a maintenance window) and for debugging/fixing the ghost PR afterward.
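For reference, the /24 versus /16 capacity figures can be checked with a few lines of Python. This is a minimal sketch: the `10.0.0.0` base address and the `usable_hosts` helper are placeholders for illustration, not the real proxy subnet, and Docker reserves some addresses for itself (gateway, per-node load-balancer endpoints), so the practical ceiling is likely a little lower than the raw count.

```python
import ipaddress


def usable_hosts(cidr: str) -> int:
    """Count addresses in a subnet, excluding the network and broadcast addresses."""
    net = ipaddress.ip_network(cidr)
    return net.num_addresses - 2


# Hypothetical subnets: only the prefix lengths come from the notes above.
current = usable_hosts("10.0.0.0/24")
enlarged = usable_hosts("10.0.0.0/16")

print(current)   # 254
print(enlarged)  # 65534
```

At /16 the same slow endpoint leak would have far more headroom before it exhausts the pool, which is why the durable fix is the enlargement and the Step 0 prune is only a safety net.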
Memory index
- Orchestrator host: Hetzner — runs on Hetzner cpx22; rebuild cmd, loops-service bounce, git-identity gotcha
- Push commits to remote — push to git.autonomic.zone right after every commit in this repo
- Regression canary cadence — server E2E canaries run on polish/review/release, not every commit
- Recipe-mirrors public / org blocker — mirrors public but recipe-maintainers ORG is private → live PR-STATUS column dark until operator flips org public
- abra chaos-deploy checkout gotcha — abra app new moves recipe checkout to release tag; checkout PR branch after, or chaos deploys wrong tree
- Shared recipe-checkout race — never git-checkout ~/.abra/recipes/ on cc-ci while its CI build runs; harness deploys from that tree
- immich pgvecto.rs DROP DATABASE panic — DROP DATABASE crashes immich's postgres image; use pg_dump --clean --if-exists + search_path rewrite
- Drone sqlite log extraction — copy /data/database.sqlite from drone container, query builds→stages→steps→logs for full step output
- plausible upgrade-base trap — RESOLVED: PR#3 GREEN L4; lessons: check harness base version pre-!testme; backupbot v2 label syntax; TinyLog not FREEZEable; BEAM exit-0 needs restart_policy any
- Swarm UpdateStatus convergence gotchas — N/N is not converged mid stop-first update; paused flag persists forever; only updating/rollback_started are active
- Weekly upgrade queued after phases — 06-12 cron skipped; auto-runs /upgrade-all when phase queue (drone) finishes; don't systemctl start the timer
- cfold paused pending upgrade — cfold phase loops+watchdog STOPPED until /upgrade-all (cc-ci-upgrader) finishes; resume = restart watchdog (phase-idx 9)
- proxy VIP exhaustion runbook — TODO after upgrade: enlarge proxy overlay to /16 (exhausts at /24=254 VIPs); root cause of discourse/ghost deploy wedges
- ghost PR debug — TODO after proxy fix: debug+fix the ghost upgrade PR (wedged on proxy VIP exhaustion; possible duplicate PR)
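
The Swarm UpdateStatus convergence entry in the index above can be sketched as a small predicate. This is a hedged illustration only: the function names and the idea of passing replica counts and the update state as plain arguments are assumptions, not the harness's actual code. It encodes the three gotchas as written: N/N replicas alone is not enough while an update is active, a lingering `paused` state must not block forever, and only `updating` and `rollback_started` count as active.

```python
# States treated as "an update is still in flight", per the memory entry.
ACTIVE_UPDATE_STATES = {"updating", "rollback_started"}


def update_in_progress(update_state):
    """True only for states that mean Swarm is still changing tasks."""
    return update_state in ACTIVE_UPDATE_STATES


def is_converged(running, desired, update_state=None):
    """Hypothetical convergence check for one service.

    running/desired are replica counts; update_state is the service's
    UpdateStatus state string, or None if the service was never updated.
    """
    if update_in_progress(update_state):
        # N/N can be observed mid stop-first update, so counts alone are not trusted.
        return False
    # "paused" (and any other inactive state) falls through to the count check,
    # because the paused flag can persist indefinitely.
    return running == desired


print(is_converged(3, 3, "updating"))  # False
print(is_converged(3, 3, "paused"))    # True
```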