Root cause: Ghost's first boot on a fresh DB runs a ~6-9min schema migration (round-trip-bound, not CPU-bound). The recipe healthcheck's start_period: 1m (~6min total grace) kills the still-migrating task, leaving a stale migrations_lock, so every later task deadlocks (MigrationsAreLockedError). Hit on both 2- and 4-vCPU. Fix (cc-ci deploy overlay, NOT a recipe/test change): compose.ccci-health.yml raises the app healthcheck start_period to 900s, wired via recipe_meta COMPOSE_FILE + install_steps.sh (+ CHAOS_BASE_DEPLOY for the untracked overlay). No assertion weakened. Budget 1200s = migration + convergence. Only the install tier needs it (the upgrade tier redeploys on the populated DB → fast boot).
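The timing argument can be checked with a little arithmetic. This is a minimal sketch, assuming the grace window is roughly start_period + retries × interval (the same approximation behind the "≈ 6min" figure in the overlay's comments) and assuming the overlay leaves the upstream interval (30s) and retries (10) untouched. `grace_seconds` is a hypothetical helper for illustration, not part of the recipe or cc-ci.

```python
def grace_seconds(start_period_s: int, interval_s: int, retries: int) -> int:
    """Approximate time before a never-passing task is marked unhealthy.

    Failures inside start_period are ignored; after that, `retries`
    consecutive failures (one per `interval`) are needed.
    """
    return start_period_s + interval_s * retries


# Upstream recipe values quoted in the overlay comments: 1m start, 10 x 30s.
upstream = grace_seconds(60, 30, 10)

# Overlay raises only start_period; interval/retries are ASSUMED unchanged.
overlay = grace_seconds(900, 30, 10)

# Upper end of the ~6-9min migration estimate.
migration_worst_s = 9 * 60

print(upstream, upstream >= migration_worst_s)  # 360 False
print(overlay, overlay >= migration_worst_s)    # 1200 True
```

Under these assumptions the upstream window (360s) falls short of the slow end of the migration, while the overlay window comes out at 1200s. That figure matches the quoted budget, though the description does not say the budget was derived this way.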
# cc-ci deploy overlay (NOT a recipe change): raises ONLY the app healthcheck start_period.
#
# Ghost's first-boot runs a full schema migration (dozens of CREATE TABLEs, each a separate MySQL
# round-trip → ~6-9min on cc-ci) against the fresh `ghost` DB. The upstream recipe healthcheck uses
# `start_period: 1m` (+ 10×30s retries ≈ 6min grace); on cc-ci the migration regularly exceeds that,
# so swarm marks the still-migrating task unhealthy and KILLS it mid-migration, which leaves a stale
# `migrations_lock` row, and every later task then refuses to boot (`MigrationsAreLockedError`
# deadlock). This is round-trip-bound, so more vCPU does not close the gap.
#
# Raising the START_PERIOD (failures ignored during it; a PASS still marks healthy immediately) lets
# the fresh migration finish + release the lock, after which Ghost serves and the (unchanged) check
# passes. This is DEPLOY/infra tuning, not a test change: no assertion is weakened, and the app's
# real healthcheck still gates readiness. Applied via recipe_meta COMPOSE_FILE; only the install
# tier's fresh migration needs it (the upgrade redeploy boots on the already-populated DB → fast).
services:
  app:
    healthcheck:
      start_period: 900s