journal(redfix): M1 bluesky-pds — 000 reproduces deterministically; root cause = caddy↔app cross-stack 'app' alias collision on shared proxy (recipe defect)
Some checks failed
continuous-integration/drone/push Build is failing

This commit is contained in:
autonomic-bot
2026-06-18 00:02:26 +00:00
parent 41e161a433
commit f8ba0c3a1f
2 changed files with 48 additions and 3 deletions

View File

@ -38,9 +38,9 @@ flake source per phase plan §2.1). Runs execute on cc-ci from `/etc/cc-ci`.
| discourse | DONE @23:40Z (`/tmp/redfix-discourse.log` on cc-ci) | install/backup/restore/custom PASS; **upgrade overlay FAIL**. Deploys+serves fine — NOT a timeout/FATA. | cc-ci overlay `tests/discourse/test_upgrade.py` asserts head runs official `discourse/discourse:3.5.3` + drops sidekiq; latest tag `0.8.1+3.5.0` AND main both still `bitnamilegacy/discourse:3.5.0`+sidekiq (migration exists in no release/main). The `depends_on discourse` string is a non-fatal prepull-only warning, not the deploy. | **stale/PR-specific cc-ci OVERLAY test** mismatched to canonical-sweep context (not flake/timeout/recipe-deploy/warm-machinery) |
| mattermost-lts | DONE @00:05Z (`/tmp/redfix-mattermost-lts.log`) | install/upgrade/backup/custom PASS; **restore FAIL** `ci_marker does not exist`**deterministic in isolation** (not a load race) | recipe `postgres` svc backup labels: backs up hot live PGDATA + dump but has **NO `backupbot.restore.post-hook`** to replay the dump → restore doesn't round-trip postgres. Contrast immich (passes): dump-only `backup.volumes.postgres.path: backup.sql` + `restore.post-hook: /pg_backup.sh restore`. | **genuine RECIPE defect** at latest → recipe PR (adopt immich-style dump+restore-post-hook) |
| mumble | DONE @00:18Z (`/tmp/redfix-mumble.log`) | **ALL tiers PASS** incl. handshake; no orphans. Canon red under load; canonical written green TODAY | handshake (TLS+ServerSync) not completing within ~60s retry under heavy concurrent sweep load; fine in isolation | **load/timing FLAKE** → harness stabilization (readiness gate / retry). (1 isolation green; will repeat 1-2× before M1 claim) |
| bluesky-pds | running (isolation; promote at end) | — | — | — |
| gitea | pending | — | — | — |
| keycloak | pending | — | — | — |
| bluesky-pds | DONE @00:45Z (`/tmp/redfix-bluesky-pds.log` + live diag) | cold lifecycle GREEN; **WC5 promote 000** reproduces (warm /xrpc/_health last status 0). NOT a flake | caddy on-demand TLS (`ask http://app:3000/tls-check`) can't reach app: caddy resolves bare `app` to OTHER stacks' app endpoints on shared `proxy` net (getent app→only 10.10.0.X, never internal 10.0.3.3; proxy has drone/traefik/keycloak/ccci `app` aliases) → no cert → 000. Promote machinery correct (refused to write canonical). | **genuine routing/RECIPE defect** (cross-stack `app`-alias collision on shared proxy) → recipe PR: unique PDS service name/alias. NOT promote-machinery, NOT flake |
| gitea | running (isolation; warm 3.6.0 advance) | — | — | — |
| keycloak | pending (mostly design; verify collision) | — | — | — |
Gate: M1 not yet claimed.