3.2 KiB
3.2 KiB
REVIEW — phase redfix (Adversary)
Phase SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-redfix-canon-sweep-failures.md
Mission: investigate every canon-sweep failure (discourse, mattermost-lts, mumble, bluesky-pds,
gitea, keycloak), isolate → root-cause → classify (flake vs genuine; recipe vs test vs
warm-machinery vs load) → FIX each via a recipe PR or harness improvement → verify green.
No standing exceptions. Nothing merged.
Gates:
- M1 — all six investigated in isolation, classified with evidence. Adversary cold-verifies: claimed flake = reproducibly green in isolation (and red under load); claimed recipe defect = genuinely the recipe (not a stale test / harness artifact); claimed warm-machinery bug = in cc-ci.
- M2 — all six FIXED + verified green (recipe PR via
!testme; harness/cc-ci PR via the harness; flake-stabilization green under load). All six promote/pass. No standing exception. Nothing merged.
DONE = Builder writes ## DONE only after M1+M2 fresh Adversary PASS here.
Verdicts
(none yet — awaiting Builder bootstrap + first gate claim)
Adversary verification log
- 2026-06-17T23:18Z — Phase redfix opened. Refreshed phase plan + plan.md §6.1. Cold access to cc-ci
confirmed (
ssh cc-ci: hostnixos, uptime 4d,systemctl --failedempty, load ~0.8). No Builder state files (STATUS/BACKLOG/JOURNAL-redfix.md) yet; no gate claimed. Idling for the first claim. - 2026-06-18T00:10Z — Non-contending pre-staging (M1 NOT yet claimed; Builder mid-investigation:
gitea isolation running, keycloak pending). Stayed OFF the swarm to avoid contaminating the
Builder's isolation runs. Independently corroborated two deterministic static claims via pure
code reads on cc-ci (no deploys):
- mattermost-lts (recipe @
2.1.9+10.11.15): postgres svc hasbackupbot.backup.pre-hook(pg_dump → /var/lib/postgresql/data/postgres-backup.sql),backup.post-hook(rm dump),backup.path=/var/lib/postgresql/data/(hot live PGDATA) — and NObackupbot.restore.post-hook. immich (passes) uses dump-onlybackup.volumes.postgres.path: backup.sql+restore.post-hook: /pg_backup.sh restore. Corroborates "genuine recipe defect — no restore round-trip." ✔ pre-staged. - discourse (recipe @
0.8.1+3.5.0=bitnamilegacy/discourse:3.5.0+ sidekiq): overlaytests/discourse/test_upgrade.pyis a phase-prevb PR-faithfulness test asserting app image == officialdiscourse/discourse:3.5.3AND sidekiq dropped — only true on an unreleased PR head, not the latest release the canon sweep deploys. So it red-by-construction in the sweep. Corroborates "stale/PR-specific overlay test, not flake/timeout/recipe-deploy." ✔ pre-staged. - STILL OWED before any M1 PASS: my OWN cold isolation run of discourse to confirm the re-classification from the original canon hypothesis ("cold-deploy timeout, ~51-min wedge") to "deploys+serves fine, only the overlay test reds." Will run when M1 is claimed and the swarm is free (Builder not deploying). Same for bluesky app-alias collision (needs live caddy/getent diag). These are NOT verdicts — formal M1 PASS/FAIL awaits the Builder's gate claim.
- mattermost-lts (recipe @