cc-ci/tests/discourse/compose.ccci-health.yml
autonomic-bot a432058aca fix(2): discourse healthcheck start_period overlay (slow Rails boot) + CHAOS_BASE_DEPLOY + TIMEOUT 2400
Install timed out at 1800s: discourse's 15-25min Rails cold boot overran both the deploy timeout and
the recipe healthcheck start_period:5m (swarm killed the booting app). Add compose.ccci-health.yml
(app healthcheck start_period 1200s) via install_steps.sh + recipe_meta COMPOSE_FILE + CHAOS_BASE_DEPLOY,
bump DEPLOY_TIMEOUT/TIMEOUT to 2400. Image re-pin (bitnamilegacy) already proven working. NO test weakened.
2026-05-30 11:48:18 +01:00

19 lines
1.2 KiB
YAML


# cc-ci deploy overlay (NOT a recipe change) — raises ONLY the app healthcheck start_period.
#
# Discourse (bitnamilegacy/discourse) is a slow-booting Rails app: its first cold boot does DB
# migrate + asset precompile + bootstrap, which on cc-ci's single node regularly takes 15-25min. The
# upstream recipe healthcheck on the `app` service uses `start_period: 5m` (+ 6×30s retries ≈ 8min
# grace); on cc-ci the boot exceeds that, so swarm marks the still-booting task unhealthy and KILLS
# it mid-boot, it restarts, and the deploy never converges within the timeout (observed: deploy timed
# out at 1800s with the app task still Running).
#
# Raising the START_PERIOD (failures ignored during it; a PASS still marks healthy immediately) lets
# the cold boot finish, after which discourse serves /srv/status and the (unchanged) check passes.
# This is DEPLOY/infra tuning, not a test change — no assertion is weakened, and the app's real
# healthcheck still gates readiness. Applied via recipe_meta COMPOSE_FILE. The `app` service name is
# verified against the PR-head compose (ci/bitnamilegacy-repin: services.app holds the healthcheck).
services:
  app:
    healthcheck:
      start_period: 1200s
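
For orientation, here is a rough sketch of what the effective `app` healthcheck might look like once this overlay is merged over the upstream recipe compose. Only `start_period` comes from this file. `interval: 30s` and `retries: 6` are inferred from the "6×30s retries" note in the header comment, and the `test` command (including the port) is a hypothetical placeholder, since the upstream check is not shown here beyond the fact that it targets /srv/status.

```yaml
# Sketch only: assumed merged result, not the literal upstream recipe.
services:
  app:
    healthcheck:
      # HYPOTHETICAL test command; the real one lives in the upstream recipe and is unchanged.
      test: ["CMD", "curl", "-f", "http://localhost:3000/srv/status"]
      interval: 30s        # inferred from the "6×30s retries" note
      retries: 6           # inferred from the "6×30s retries" note
      start_period: 1200s  # the only value this overlay sets (upstream: 5m)
```

Under those assumed values, the grace before swarm can mark the task unhealthy moves from roughly 300s + 6×30s = 480s (8 min) to 1200s + 6×30s = 1380s (23 min), while a passing check during the start period still marks the task healthy immediately.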