status(2): discourse — 2 bugs root-caused (post-upgrade backup race + mint_admin ruby PATH), fixes in full4 validation
This commit is contained in:
@ -94,17 +94,25 @@ Standing VETO on DONE (REVIEW-2 @16:22:07Z) requires: ghost + discourse + mumble
|
||||
**upgrade-to-latest** green with justified `compose.ccci.yml` overlays. Current cycle:
|
||||
- **ghost F2-14b — ✅ Adversary PASS @2026-05-30T22:42Z (REVIEW-2, COLD, `/root/adv-ghost-f214b.log`).**
|
||||
Closes the GHOST portion of the DONE VETO checklist. DONE.
|
||||
- **discourse Q4.6 — restore-hook fix, RE-RUNNING.** full1 (`/root/ccci-discourse-full1.log`):
|
||||
install/upgrade/backup PASS; **restore FAIL** (`test_restore_returns_state`: ci_marker gone) +
|
||||
**custom FAIL** (both gate on `/site.json` 200, which never converged). ROOT CAUSE (single):
|
||||
the pg_backup.sh restore hook only did a one-shot `pg_terminate_backend` — the discourse app +
|
||||
sidekiq reconnect over TCP within ms and interfered with the drop/recreate/reimport, breaking the
|
||||
DB → ci_marker lost AND `/site.json` 500 in the post-restore custom tier. FIX (recipe-PR
|
||||
`recipe-maintainers/discourse#1`, new head `3758522`): block all non-local connections via
|
||||
`pg_hba.conf` (`local all all trust` + reload) before drop, restore on exit — mirrors the PROVEN
|
||||
matrix-synapse restore hook (identical backupbot wiring, restore PASSED there). Harness now echoes
|
||||
abra restore output (backupbot post-hook) into the run log (cc-ci `4a29ca6`) so restore is no longer
|
||||
opaque. Run shape full `install,upgrade,backup,restore,custom`. PR head `3758522` (was `7a2e0e0`).
|
||||
- **discourse Q4.6 — TWO bugs root-caused + fixed, VALIDATING (full4).** Investigation across full1-3:
|
||||
- **(A) backup race — backup.sql not captured after the upgrade tier.** restic snapshots of full1/full2
|
||||
(WITH upgrade) lacked `postgresql_data/backup.sql` entirely (only discourse_data+redis_data); the
|
||||
recipe's backupbot db pre-hook `/pg_backup.sh backup` didn't produce the dump at backup time, so
|
||||
restore reimported nothing → ci_marker lost AND `/site.json` 500 in the post-restore custom tier.
|
||||
Proven NOT a script bug: manual `bash -c 'set -o pipefail;/pg_backup.sh backup'` on the live db
|
||||
yields a valid 922KB dump (exit 0); matrix-synapse uses the identical pattern and its snapshots DO
|
||||
contain `postgres/_data/backup.sql`. full3 (WITHOUT upgrade) ran the pre-hook fine + restore PASSED.
|
||||
Conclusion: the immediately-preceding UPGRADE chaos-redeploy cycles the db; pg_dump races that cycle
|
||||
→ dump truncated/absent (same race ghost F2-14b hit). FIX: `BACKUP_VERIFY` probe in
|
||||
`tests/discourse/recipe_meta.py` (gzip-valid + non-empty backup.sql; False → harness re-runs the
|
||||
whole backup, caps 3 then proceeds → non-masking; restore stays the real gate). Also kept the pg_hba
|
||||
connection-block restore hook (recipe-PR head `3758522`) — correct hardening regardless.
|
||||
- **(B) create_topic — `mint_admin` ruby not on PATH.** `bin/rails runner` (`#!/usr/bin/env ruby`) under
|
||||
`bash -lc` (login shell resets PATH) → `env: 'ruby': No such file or directory` (rc=127). FIX: `bash -c`
|
||||
(inherit image ENV) + discover ruby (`command -v ruby || /opt/bitnami/ruby/bin/ruby`) + invoke explicitly.
|
||||
- Harness now echoes abra backup+restore output into the run log (cc-ci `4a29ca6`,`2f6a684`) — backup/
|
||||
restore no longer opaque. cc-ci fixes `8d689d6`. Validation run `/root/ccci-discourse-full4.log`
|
||||
(full `install,upgrade,backup,restore,custom`, PR head `3758522`).
|
||||
- mumble F2-14c + plausible Q4.7b still open.
|
||||
|
||||
## Gate F2-14b — CLAIMED @2026-05-30T22:10Z (ghost upgrade-to-latest + reliable P4 backup-integrity)
|
||||
|
||||
Reference in New Issue
Block a user