journal(2): ghost full4 timeout root-cause (mysql init + migration > 1200s) + DEPLOY_TIMEOUT bump

autonomic-bot
2026-05-30 19:55:33 +00:00
parent 4a160f6121
commit 3a706bd96e


@@ -1348,3 +1348,23 @@ dump is restored to disk but never reimported → dropped `ci_marker` doesn't re
(backup PASS with marker, restore RED). Same class as immich#1 / mattermost-lts#1. FIX = recipe-PR
adding a mysql dump+reimport hook (mirror mattermost `pg_backup.sh` → `mysql_backup.sh`). Ghost not
yet mirrored on gitea (404) → mirror first (plan §0b), then PR, then final green run, then claim.
## 2026-05-30T19:53Z — ghost F2-14b full4 timeout → DEPLOY_TIMEOUT bump (full5)
full4 (`/root/ccci-ghost-full4.log`, committed db-grace overlay 3ca45c7) FAILED at the base deploy:
`abra app deploy ghos-9431a1... -o -n -C` timed out after 1200s; RUN SUMMARY install:fail, rest skip.
Root cause (inspected live swarm, not guessed): db (mysql:8.0) converged 1/1 healthy — the db-grace
overlay (15m start_period) successfully prevented the prior mysql-redo-corruption deadlock. But the
app crash-looped 4-5× with exit(2) = `connect ECONNREFUSED 10.0.5.5:3306` (knex-migrator can't reach
mysql) during mysql's ~6min fresh-dir init; once mysql was ready (~19:36) the app task `hwfixm5`
started a clean migration (`Creating table: email_recipients` @19:46:45, `email_recipient_failures`
@19:47:38 — late-stage tables). abra's deploy subprocess (DEPLOY_TIMEOUT=1200, started ~19:31) was
killed at ~19:51 while migration was still finishing (app 0/1). So wall-time = mysql init (~6min) +
schema migration (~9-15min under load) exceeded the 20min window. full3 (17:23) squeaked under it;
full4 was slower (host load variance). The crash-loops lose NO migration progress (they all precede
the migration and are pure can't-connect failures), so their only cost is the wait for mysql init.
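
The budget arithmetic behind this root cause can be checked with a short sketch. The figures are the approximate ones from this entry (~6min mysql init, ~9-15min migration, the 1200s window, and the 2400s value it is compared against); the constant and function names are illustrative and not part of the harness.

```python
# Approximate figures from this journal entry; names are illustrative only.
MYSQL_INIT_S = 6 * 60          # ~6min fresh-dir mysql init
MIGRATION_FAST_S = 9 * 60      # ~9min schema migration (lighter host load)
MIGRATION_SLOW_S = 15 * 60     # ~15min schema migration (heavier host load)


def deploy_wall_time(init_s: int, migration_s: int) -> int:
    """Serial wall time: the migration cannot start until mysql accepts connections."""
    return init_s + migration_s


def fits(timeout_s: int, wall_s: int) -> bool:
    """True if the deploy would converge before the timeout kills it."""
    return wall_s < timeout_s


fast = deploy_wall_time(MYSQL_INIT_S, MIGRATION_FAST_S)   # 900s
slow = deploy_wall_time(MYSQL_INIT_S, MIGRATION_SLOW_S)   # 1260s

for timeout_s in (1200, 2400):
    print(timeout_s, "fast:", fits(timeout_s, fast), "slow:", fits(timeout_s, slow))
```

With these inputs the 1200s window passes the fast case and fails the slow one, which is consistent with one run squeaking under and the next timing out; at 2400s both cases fit.
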
Fix (4a160f6): bump ghost DEPLOY_TIMEOUT + EXTRA_ENV TIMEOUT 1200→2400s (matches discourse). Not a
test weakening — the wait is bounded; a genuine hang still fails at 40min. Teardown after full4 was
clean (no leftover stack/volume/secret). Re-running as full5.
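
The "bounded wait" argument can be illustrated with a minimal sketch. How the harness actually enforces DEPLOY_TIMEOUT is not shown in this entry, so the following assumes a plain subprocess timeout; `run_bounded` is a hypothetical helper, not harness code.

```python
import subprocess
import sys


def run_bounded(cmd, timeout_s):
    """Run cmd and return 'ok', 'fail', or 'timeout'.

    Assumption: the deploy step is a child process guarded by a wall-clock
    timeout. Raising timeout_s only moves the deadline; a command that never
    finishes still ends up as 'timeout'.
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills and reaps the child before re-raising.
        return "timeout"
    return "ok" if proc.returncode == 0 else "fail"
```

For example, `run_bounded([sys.executable, "-c", "import time; time.sleep(5)"], 0.5)` reports a timeout, while the same command with a 30s limit completes normally. The same holds at 1200s versus 2400s: a slow-but-converging deploy passes under the larger limit, and a true hang still fails.
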