fix(2): discourse base-deploy timeout — prepull-enable (sidekiq depends_on app, valid compose) + 3600s timeout

full4 base deploy timed out at 2400s on the 7-GiB single node. Root causes and fixes:

(1) sidekiq.depends_on referenced the undefined service 'discourse' (the main service is 'app'), so abra config --images exited with rc=15, prepull was SKIPPED, and the 2.4GB image was pulled inline during deploy, eating the convergence budget. The overlay now overrides sidekiq.depends_on:[app] (swarm ignores depends_on, so this is a no-op at runtime and masks nothing), which lets prepull resolve and pre-pull images on both base and head deploys.

(2) The 2400s budget left too little headroom for the RAM/CPU-constrained Rails cold boot; bumped DEPLOY_TIMEOUT/TIMEOUT 2400→3600.

Also pre-cached bitnamilegacy/discourse:3.3.1 by tag on cc-ci (was dangling <none>).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 01:23:38 +00:00
parent bcc32d997b
commit 04cc44c15e
2 changed files with 18 additions and 2 deletions
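
The first root cause in the commit message (a depends_on entry naming a service that does not exist) is the kind of error a small pre-flight check could catch before prepull runs. The following is a minimal sketch, not part of this commit: the function name `undefined_depends_on` and the two sample dicts are hypothetical, and the dicts model only the two services the message names (`app` and `sidekiq`), assuming the compose file has already been parsed into a plain dict.

```python
from typing import Dict, List


def undefined_depends_on(compose: dict) -> Dict[str, List[str]]:
    """Map each service to the depends_on targets that are not defined services.

    Hypothetical helper for illustration; assumes `compose` is an
    already-parsed compose document (e.g. loaded from YAML elsewhere).
    """
    services = compose.get("services") or {}
    problems: Dict[str, List[str]] = {}
    for name, spec in services.items():
        # depends_on can be a list (short form) or a mapping (long form);
        # iterating either one yields the referenced service names.
        deps = (spec or {}).get("depends_on") or []
        missing = [dep for dep in deps if dep not in services]
        if missing:
            problems[name] = missing
    return problems


# Assumed minimal model of the state before the fix: sidekiq points at 'discourse'.
broken = {"services": {"app": {}, "sidekiq": {"depends_on": ["discourse"]}}}
# Assumed minimal model after the overlay override: sidekiq points at 'app'.
fixed = {"services": {"app": {}, "sidekiq": {"depends_on": ["app"]}}}

print(undefined_depends_on(broken))
print(undefined_depends_on(fixed))
```

A non-empty result would flag the config as invalid up front, instead of the problem surfacing indirectly as a skipped prepull and a slow inline image pull.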
--- a/tests/discourse/recipe_meta.py
+++ b/tests/discourse/recipe_meta.py
@@ -6,7 +6,8 @@
 # app is actually serving (the canonical "is discourse up" signal — NOT "/", which may redirect to setup).
 HEALTH_PATH = "/srv/status"
 HEALTH_OK = (200,)
-DEPLOY_TIMEOUT = 2400  # slow Rails cold boot (15-25min); matches the EXTRA_ENV TIMEOUT below
+DEPLOY_TIMEOUT = 3600  # slow Rails cold boot (15-25min) on the 7-GiB single node; bumped 2400→3600 for
+# headroom after full4's base deploy timed out at 2400s (RAM/CPU-constrained boot + image re-pull).
 HTTP_TIMEOUT = 1200

 # Slow-cold-boot handling: the recipe-PR (recipe-maintainers/discourse#1) bumps the app healthcheck
@@ -33,7 +34,7 @@ HTTP_TIMEOUT = 1200
 CHAOS_BASE_DEPLOY = True
 UPGRADE_BASE_VERSION = "0.7.0+3.3.1"
 EXTRA_ENV = {
-    "TIMEOUT": "2400",
+    "TIMEOUT": "3600",  # abra's internal convergence wait; matches DEPLOY_TIMEOUT (slow Rails boot headroom)
     "COMPOSE_FILE": "compose.yml:compose.ccci.yml",
 }
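
The comments in this diff say the two timeouts are meant to match: `DEPLOY_TIMEOUT` is an int, while `EXTRA_ENV["TIMEOUT"]` is a string. A small guard could keep them from drifting apart on the next bump. This is a sketch under assumptions, not code from the commit: the helper name `timeouts_in_sync` is hypothetical, and the values are copied from the post-change side of the diff.

```python
# Values as they appear on the "+" side of the diff above.
DEPLOY_TIMEOUT = 3600
EXTRA_ENV = {
    "TIMEOUT": "3600",
    "COMPOSE_FILE": "compose.yml:compose.ccci.yml",
}


def timeouts_in_sync(deploy_timeout: int, extra_env: dict) -> bool:
    """Return True when the env TIMEOUT string equals the harness timeout.

    Hypothetical helper: a missing or non-numeric TIMEOUT counts as out of sync.
    """
    raw = extra_env.get("TIMEOUT")
    try:
        return raw is not None and int(raw) == deploy_timeout
    except ValueError:
        return False


print(timeouts_in_sync(DEPLOY_TIMEOUT, EXTRA_ENV))
```

Deriving the env value from the constant (`str(DEPLOY_TIMEOUT)`) would remove the duplication entirely; the check above is the less invasive option if the two values need to stay independently editable.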