# BACKLOG — phase `redfix`
## Build backlog
### M1 — investigate + isolate + classify (all six)
- [ ] discourse: reproduce the cold-deploy timeout/wedge in isolation; root-cause it (headroom vs
      convergence bug vs upstream compose defect `sidekiq.depends_on: discourse`); classify.
- [ ] mattermost-lts: run `test_restore.py::test_restore_returns_state` in isolation. Green → classify
      as a load flake; red → diagnose the restore (recipe vs test).
- [ ] mumble: run `custom/test_protocol_handshake.py::test_handshake_completes_with_channel_presence`
      in isolation (canonical already present from today, so likely a flake; confirm).
- [ ] bluesky-pds: warm-canonical promote routing. Why does `warm-bluesky-pds…` return 000 over HTTPS
      while the container is healthy internally and the cold-test domain routes? Find the cc-ci
      warm-machinery defect.
- [ ] gitea: `3.5.3→3.6.0` warm-advance crash (`app.ini` read-only, JWT secret save). Recipe vs harness.
- [ ] keycloak: de-enrolled (live-warm OIDC collision). Design a collision-free warm domain/namespace.
### M2 — FIX + verify all six (recipe PR or harness improvement)
**Execution gated on M1 PASS** (avoids node contention with Adversary M1 re-runs; classifications must
hold). Concrete fix designs from M1 evidence:
- [ ] **mattermost-lts** (recipe PR, clearest): add `pg_backup.sh` (immich pattern, no VectorChord
      bits) with two functions:
      `backup(){ pg_dump -U mattermost mattermost | gzip > /var/lib/postgresql/data/backup.sql; }` and
      `restore(){ gunzip -c …/backup.sql | psql -U mattermost -d mattermost -f -; }`.
      In compose, add `configs: pg_backup → /pg_backup.sh` and set the postgres labels
      `backup.pre-hook: /pg_backup.sh backup`, `restore.post-hook: /pg_backup.sh restore`, and
      `backup.volumes.postgres.path: backup.sql` (dump-only: drop the whole-PGDATA `backup.path` and
      the `rm` post-hook). Verify via `!testme` → restore green.
- [ ] **bluesky-pds** (recipe PR): eliminate the `app`-alias collision on the shared proxy. Give the
      PDS service a unique name (e.g. `pds`) OR a unique network alias, and update every reference:
      caddy (`reverse_proxy`, `on_demand_tls ask http://…/tls-check`), healthcheck, backup labels,
      and ops/test `service=` refs. Verify warm promote → 200 on `/xrpc/_health`. NOTE: check whether
      the cc-ci harness `ops.py`/tests reference `service="app"` for bluesky and update them if the
      recipe service is renamed; the recipe mirror is PR-only, so cc-ci-side refs are a separate
      cc-ci change. Confirm the exact approach in M2.
- [ ] **gitea** (recipe PR): make `app.ini` writable on the warm-reattach advance so 3.6.0 can persist
      the JWT secret. Either render `app.ini` into the WRITABLE `config:/etc/gitea` volume via the
      existing `docker-setup.sh` entrypoint (copy the templated config to a writable path) instead of
      using the read-only `app_ini` docker-config mount, OR ensure the persisted JWT secret is
      accepted without a rewrite. Verify the 3.5.3→3.6.0 advance promotes. (Ties to LFS PR #1.)
- [ ] **keycloak** (harness, cc-ci branch): `canonical.canonical_domain(r)` returns a collision-free
      domain when `r` is a live-warm provider (`r in warm.WARM_DOMAINS`), e.g.
      `warm-canon-<r>.ci.commoninternet.net`; otherwise it keeps `warm-<r>` (zero blast radius on the
      15 others). Set keycloak `WARM_CANONICAL=True`. Verify keycloak promotes at
      `warm-canon-keycloak` WITHOUT disrupting live `warm-keycloak` (200 throughout).
- [ ] **mumble** (harness, cc-ci branch): stabilize the handshake under load. Add a READY_PROBE /
      readiness gate (TCP 64738 stably listening plus a successful handshake) before the custom tier
      and/or raise the `retry_handshake` budget; verify green under a concurrent-load re-run.
- [ ] **discourse** (TRICKIEST, decide in M2): the overlay `test_upgrade.py` asserts a
      bitnamilegacy→official migration that is absent from all releases and from main. Options:
      (a) cc-ci test PR (`--with-tests`) scoping the faithfulness assertion to fire ONLY when the head
      actually performs the migration (image still bitnamilegacy → N/A, not RED). This is a correct
      scope, NOT a weakening. Also file an upstream recipe issue/PR for the real
      bitnamilegacy→official migration.
      (b) recipe PR doing the migration (major rewrite; the official discourse image is
      launcher-based, so likely infeasible cleanly).
      Lean (a) + tracked-upstream; may need operator input (DEFERRED?). Assess in M2.
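
Sketch for the keycloak item above: one possible shape for the collision-free `canonical_domain`. This is an illustration under assumptions, not the harness code: the contents and type of `WARM_DOMAINS`, the `CI_SUFFIX` constant, and the assumption that `warm-<r>` shares the `ci.commoninternet.net` suffix are all hypothetical.

```python
# Assumption: plain warm domains share this suffix with the warm-canon ones.
CI_SUFFIX = "ci.commoninternet.net"

# Hypothetical stand-in for warm.WARM_DOMAINS (recipes that run as live-warm providers).
WARM_DOMAINS = {"keycloak": "warm-keycloak.ci.commoninternet.net"}


def canonical_domain(recipe, warm_domains=WARM_DOMAINS):
    """Return the warm-canonical domain for a recipe.

    Live-warm providers get a separate warm-canon-<r> name so that promoting
    the canonical never lands on the live warm-<r> deployment. Every other
    recipe keeps its existing warm-<r> name unchanged.
    """
    if recipe in warm_domains:
        return f"warm-canon-{recipe}.{CI_SUFFIX}"
    return f"warm-{recipe}.{CI_SUFFIX}"
```

Keeping the branch keyed on membership in `WARM_DOMAINS` means recipes that are not live-warm providers see no change in behavior.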
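
Sketch for the mumble item above: a readiness gate that waits until the TCP port accepts several consecutive connections and then requires one successful handshake. The function names, parameters, and defaults are hypothetical; `handshake` stands in for whatever the harness already uses to perform the protocol handshake.

```python
import socket
import time


def port_stably_listening(host, port, checks=3, interval=1.0, timeout=2.0, deadline=60.0):
    """Return True once `checks` consecutive TCP connects succeed within `deadline` seconds."""
    end = time.monotonic() + deadline
    streak = 0
    while time.monotonic() < end:
        try:
            # A successful connect only proves the port is listening, nothing more.
            with socket.create_connection((host, port), timeout=timeout):
                streak += 1
        except OSError:
            streak = 0  # any failure resets the streak: the port is not yet stable
        if streak >= checks:
            return True
        time.sleep(interval)
    return False


def wait_ready(host, port, handshake, **probe_kwargs):
    """Gate the custom tier: stable listener first, then one successful handshake.

    `handshake` is a caller-supplied callable returning a truthy value on success.
    """
    if not port_stably_listening(host, port, **probe_kwargs):
        return False
    return bool(handshake())
```

Requiring consecutive successes, rather than a single connect, is what distinguishes "stably listening" from a port that flaps while the service is still starting under load.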
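
Sketch for discourse option (a) above: scope the faithfulness assertion so it yields N/A while the head image is still bitnamilegacy, and only goes GREEN or RED when the head has actually moved off it. The helper names, the verdict strings, and the image references in the usage note are hypothetical; how `test_upgrade.py` obtains the head image is not shown here.

```python
LEGACY_NAMESPACE = "bitnamilegacy"


def uses_legacy_image(image_ref):
    """True if any namespace component of the image reference is bitnamilegacy.

    The final path component (repository name plus tag) is excluded, so a
    registry prefix such as docker.io/ does not affect the result.
    """
    return LEGACY_NAMESPACE in image_ref.split("/")[:-1]


def faithfulness_verdict(head_image, migrated_ok):
    """Scope the migration assertion to heads that actually perform the migration."""
    if uses_legacy_image(head_image):
        return "N/A"  # head has not migrated: nothing to assert, so not RED
    return "GREEN" if migrated_ok else "RED"
```

For example, `faithfulness_verdict("bitnamilegacy/discourse:1", False)` evaluates to `"N/A"`, while a head on a non-legacy image with a failed migration check evaluates to `"RED"`. In a pytest-based test, the N/A branch could map to `pytest.skip`.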
## Adversary findings
(Adversary-owned — do not edit.)