feat(2): discourse start_period via literal recipe-PR bump (abra can't env-interpolate start_period)
abra rejects env-interpolation in healthcheck start_period (FATA 'Does not match
format duration' for both ${VAR} and quoted forms — validates the literal compose
duration before .env substitution). So §9 pt1's env-var route is impossible for
this field; the §9-compliant fix is a LITERAL start_period:20m bump in the
recipe-PR (recipe everyone runs, not a cc-ci overlay; strictly safer). Remove
APP_START_PERIOD from recipe_meta EXTRA_ENV; record the finding in DECISIONS
(ghost E1 must use the same approach); STATUS-2 → new PR head 7a2e0e0.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -1033,3 +1033,24 @@ Orchestrator policy (plan.md §9 + cc-ci-plan/plan-prefer-env-over-compose-overl

Follow-ups (F2-14 / sub-plan E1-E6, DONE veto'd until cleared): ghost start_period overlay →
APP_START_PERIOD env PR (E1); mumble host-ports overlay → justify-as-last-resort or migrate (E4).

## 2026-05-30 — FINDING: abra rejects env-interpolated healthcheck start_period → literal recipe-PR bump (§9)
While migrating discourse off its cc-ci compose overlay per plan §9 (prefer an upstream env-var
recipe-PR over a cc-ci `compose.*.yml`), we found that abra CANNOT env-interpolate the healthcheck
`start_period` field: both `start_period: ${APP_START_PERIOD:-5m}` and the quoted
`start_period: "${APP_START_PERIOD:-5m}"` abort `abra app new` with the FATA error
`services.app.healthcheck.start_period Does not match format 'duration'`. abra validates the compose
schema's duration format on the LITERAL template string before any `.env` substitution, and NO recipe
in the catalogue env-interpolates start_period (a grep for it came back empty).
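
The ordering problem described above can be illustrated with a small Python sketch. It is not abra's code: `DURATION_RE`, `ENV_REF_RE` and both function names are hypothetical, and the duration pattern is only an assumed approximation of the compose schema's `duration` format. It shows why a template string fails a check that runs before substitution, even though the substituted value would pass.

```python
import re

# Assumed approximation of a compose duration ("5m", "20m", "1m30s", "1200s").
# The real schema check used by abra may be stricter or looser.
DURATION_RE = re.compile(r"(\d+(ns|us|ms|s|m|h))+")
# Handles only the ${VAR:-default} form quoted in the finding.
ENV_REF_RE = re.compile(r"\$\{(\w+):-([^}]*)\}")


def is_duration(value: str) -> bool:
    return DURATION_RE.fullmatch(value) is not None


def substitute(template: str, env: dict) -> str:
    # ${VAR:-default}: use the default when VAR is unset or empty.
    return ENV_REF_RE.sub(lambda m: env.get(m.group(1)) or m.group(2), template)


def validate_then_substitute(template: str, env: dict) -> str:
    """Order the finding describes: the literal template is validated first."""
    if not is_duration(template):
        raise ValueError("start_period Does not match format 'duration'")
    return substitute(template, env)


def substitute_then_validate(template: str, env: dict) -> str:
    """Order that would be needed for the env-var route to work."""
    value = substitute(template, env)
    if not is_duration(value):
        raise ValueError("start_period Does not match format 'duration'")
    return value


if __name__ == "__main__":
    template = "${APP_START_PERIOD:-5m}"
    print(substitute_then_validate(template, {"APP_START_PERIOD": "1200s"}))
    try:
        validate_then_substitute(template, {"APP_START_PERIOD": "1200s"})
    except ValueError as err:
        print("rejected:", err)
    print(validate_then_substitute("20m", {}))  # a literal value passes either way
```

Under this model a literal such as `20m` is the only form that survives validation, which is the route the finding settles on.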

**Consequence for §9 pt1:** "expose the cc-ci-tuned value as an env var" is NOT achievable for
`start_period` specifically. The §9-compliant alternative is a **LITERAL bump in the upstream
recipe-PR**. That is still NOT a cc-ci compose overlay (the change lives in the recipe everyone
runs), and a larger start_period is strictly safer for all users: it only widens the startup
failure-grace, and a healthy check still marks healthy immediately, so fast hosts are unaffected.
Precedent: the sub-plan's own lasuite-drive collabora "start_period [KEYSTONE]" recipe-PR.

- **discourse**: recipe-PR `recipe-maintainers/discourse#1` sets `start_period: 20m` (covers the
  15-25min Rails first-boot; default was 5m). cc-ci recipe_meta no longer sets APP_START_PERIOD.
- **ghost (E1)**: must use the SAME literal-bump approach, NOT an env var (same abra limitation).
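
The "only widens the startup grace" argument can be checked with a toy model of a container healthcheck. This is a simplified sketch, not swarm's implementation: `first_verdict` is a hypothetical helper, the `interval` and `retries` defaults are illustrative and not taken from the discourse recipe, and the app is assumed to pass every probe once its boot has finished. Probe failures inside `start_period` are not counted; a passing probe marks the container healthy straight away.

```python
def first_verdict(boot_seconds, start_period, interval=30, retries=3):
    """Return (state, seconds) for the first healthy/unhealthy transition.

    Simplified model: one probe per `interval`; probes fail until the app has
    finished booting; failures during `start_period` do not count towards
    `retries`; the first passing probe marks the container healthy.
    """
    failures = 0
    t = interval  # the first probe runs one interval after container start
    while True:
        if t >= boot_seconds:
            return ("healthy", t)
        if t > start_period:
            failures += 1
            if failures >= retries:
                return ("unhealthy", t)
        t += interval


if __name__ == "__main__":
    slow_boot = 15 * 60  # lower end of the 15-25min cold boot quoted above
    fast_boot = 60       # a fast host

    print(first_verdict(slow_boot, start_period=5 * 60))   # killed before boot ends
    print(first_verdict(slow_boot, start_period=20 * 60))  # survives the boot
    # A fast host goes healthy at the same moment with either setting.
    print(first_verdict(fast_boot, start_period=5 * 60))
    print(first_verdict(fast_boot, start_period=20 * 60))
```

In this model the larger value changes the outcome only for a boot that outlasts the old grace window; the fast-host result is identical for both settings.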
@@ -9,16 +9,20 @@ HEALTH_OK = (200,)
 DEPLOY_TIMEOUT = 2400 # slow Rails cold boot (15-25min); matches the EXTRA_ENV TIMEOUT below
 HTTP_TIMEOUT = 1200
 
-# Slow-cold-boot handling via env, NOT a cc-ci compose overlay (plan.md §9 anti-drift guardrail):
-# discourse's 15-25min Rails cold boot exceeds the recipe healthcheck's default start_period (5m) +
-# grace, so swarm would kill the still-booting app and the deploy never converges. Rather than fork
-# the recipe with a compose.*.yml overlay (which drifts from what ships), the recipe-PR
-# (recipe-maintainers/discourse#1) parameterizes the app healthcheck as
-# `start_period: ${APP_START_PERIOD:-5m}` (default unchanged for real users); cc-ci just sets a larger
-# value here. TIMEOUT (abra's internal convergence wait) is raised to outlast the boot.
+# Slow-cold-boot handling via a LITERAL recipe-PR start_period bump, NOT a cc-ci compose overlay
+# (plan.md §9 anti-drift guardrail). discourse's 15-25min Rails cold boot exceeds the recipe
+# healthcheck's default start_period (5m) + grace, so swarm would kill the still-booting app and the
+# deploy never converges. §9 pt1 prefers exposing such a value as an env var, but abra REJECTS
+# env-interpolation in healthcheck `start_period` (`FATA ...Does not match format 'duration'` for both
+# `${VAR}` and quoted `"${VAR:-5m}"`; it validates the literal compose duration before substitution,
+# and no catalogue recipe env-interpolates start_period). So the §9-compliant fix is a LITERAL bump in
+# the recipe-PR (recipe-maintainers/discourse#1): `start_period: 20m` on the app healthcheck, a change
+# to the recipe EVERYONE runs (not a cc-ci fork), and strictly safer (start_period only widens the
+# startup grace; a healthy check still marks healthy immediately, so fast hosts are unaffected).
+# Precedent: the lasuite-drive collabora start_period recipe-PR. (See DECISIONS.md 2026-05-30.)
+# TIMEOUT (abra's internal convergence wait) is raised to outlast the boot.
 EXTRA_ENV = {
     "TIMEOUT": "2400",
-    "APP_START_PERIOD": "1200s",
 }
 
# Upgrade tier — N/A (declared NOT-TESTABLE under cc-ci; Adversary §7.1 sign-off GRANTED, REVIEW-2
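
The two invariants this hunk relies on (no leftover `APP_START_PERIOD` override, and `TIMEOUT` matching `DEPLOY_TIMEOUT`) could be guarded by a small check along these lines. This is a sketch only: `check_recipe_meta` is a hypothetical helper, not part of cc-ci, and the values are copied from the hunk above.

```python
DEPLOY_TIMEOUT = 2400
EXTRA_ENV = {
    "TIMEOUT": "2400",
}


def check_recipe_meta(deploy_timeout, extra_env):
    """Return a list of problems; an empty list means the meta is consistent."""
    problems = []
    # abra cannot interpolate start_period, so this key would have no effect.
    if "APP_START_PERIOD" in extra_env:
        problems.append("APP_START_PERIOD is set but abra cannot interpolate start_period")
    timeout = extra_env.get("TIMEOUT")
    if timeout is None or not timeout.isdigit():
        problems.append("TIMEOUT is missing or not an integer number of seconds")
    elif int(timeout) != deploy_timeout:
        problems.append(f"TIMEOUT={timeout} does not match DEPLOY_TIMEOUT={deploy_timeout}")
    return problems


if __name__ == "__main__":
    print(check_recipe_meta(DEPLOY_TIMEOUT, EXTRA_ENV))
```

Run against the pre-change values (`"APP_START_PERIOD": "1200s"` still present), the same helper would report one problem.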