feat(2): discourse start_period via literal recipe-PR bump (abra can't env-interpolate start_period)

abra rejects env-interpolation in healthcheck start_period (FATA 'Does not match
format duration' for both ${VAR} and quoted forms — validates the literal compose
duration before .env substitution). So §9 pt1's env-var route is impossible for
this field; the §9-compliant fix is a LITERAL start_period:20m bump in the
recipe-PR (recipe everyone runs, not a cc-ci overlay; strictly safer). Remove
APP_START_PERIOD from recipe_meta EXTRA_ENV; record the finding in DECISIONS
(ghost E1 must use the same approach); STATUS-2 → new PR head 7a2e0e0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 16:24:45 +01:00
parent 5c2d4c2af3
commit fb20321bd9
2 changed files with 33 additions and 8 deletions


@@ -1033,3 +1033,24 @@ Orchestrator policy (plan.md §9 + cc-ci-plan/plan-prefer-env-over-compose-overl
Follow-ups (F2-14 / sub-plan E1-E6, DONE veto'd until cleared): ghost start_period overlay →
APP_START_PERIOD env PR (E1); mumble host-ports overlay → justify-as-last-resort or migrate (E4).
## 2026-05-30 — FINDING: abra rejects env-interpolated healthcheck start_period → literal recipe-PR bump (§9)
While migrating discourse off its cc-ci compose overlay per plan §9 (prefer an upstream env-var
recipe-PR over a cc-ci `compose.*.yml`), discovered abra CANNOT env-interpolate the healthcheck
`start_period` field: both `start_period: ${APP_START_PERIOD:-5m}` and the quoted
`start_period: "${APP_START_PERIOD:-5m}"` FATA at `abra app new` with
`services.app.healthcheck.start_period Does not match format 'duration'`. abra validates the compose
schema's duration format on the LITERAL template string before any `.env` substitution, and NO recipe
in the catalogue env-interpolates start_period (grep confirmed empty).
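
A minimal sketch of why the ordering matters, assuming a simplified duration check. `DURATION_RE`, `is_duration`, `substitute` and `validate_then_substitute` are hypothetical names for illustration; this is not abra's code or the real compose schema validator. It only shows that the unexpanded template fails a duration format check which the expanded value would pass:

```python
import re

# ASSUMPTION: simplified stand-in for a compose "duration" format check.
DURATION_RE = re.compile(r"(\d+(\.\d+)?(ns|us|ms|s|m|h))+")
# Matches ${VAR} and ${VAR:-default}.
ENV_RE = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")


def is_duration(value: str) -> bool:
    """True if the whole string looks like a duration such as 20m or 1200s."""
    return DURATION_RE.fullmatch(value) is not None


def substitute(template: str, env: dict) -> str:
    """Expand ${VAR} / ${VAR:-default}; the default applies when VAR is unset or empty."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        return env.get(name) or (default or "")
    return ENV_RE.sub(repl, template)


def validate_then_substitute(template: str, env: dict) -> str:
    """Order described in the finding: the literal template is validated first."""
    if not is_duration(template):
        raise ValueError(f"{template!r} Does not match format 'duration'")
    return substitute(template, env)
```

Under this model `validate_then_substitute("${APP_START_PERIOD:-5m}", {})` raises, while `validate_then_substitute("20m", {})` returns `20m`. That is the reason a literal value is the only form that gets through.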
**Consequence for §9 pt1:** "expose the cc-ci-tuned value as an env var" is NOT achievable for
`start_period` specifically. The §9-compliant alternative is a **LITERAL bump in the upstream
recipe-PR** — still NOT a cc-ci compose overlay (the change lives in the recipe everyone runs), and a
larger start_period is strictly safer for all users (it only widens the startup failure-grace; a
healthy check still marks healthy immediately, so fast hosts are unaffected). Precedent: the
sub-plan's own lasuite-drive collabora "start_period [KEYSTONE]" recipe-PR.
- **discourse**: recipe-PR `recipe-maintainers/discourse#1` sets `start_period: 20m` (covers the
15-25min Rails first-boot; default was 5m). cc-ci recipe_meta no longer sets APP_START_PERIOD.
- **ghost (E1)**: must use the SAME literal-bump approach, NOT an env var (same abra limitation).
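
The "strictly safer" argument above can be sketched with a toy model of healthcheck timing. It assumes the documented Docker behaviour that probe failures inside `start_period` do not count towards `retries` and that a passing probe marks the container healthy at once. The `interval`, `retries` and `ready_at` values are hypothetical, not the discourse recipe's actual settings:

```python
def health_outcome(ready_at: int, start_period: int, interval: int = 30, retries: int = 3):
    """Return (status, seconds) for an app that starts passing probes at `ready_at`.

    ASSUMPTION: simplified model. One probe every `interval` seconds,
    failures inside `start_period` are ignored, and `retries` consecutive
    counted failures mark the container unhealthy.
    """
    failures = 0
    t = interval
    while True:
        if t >= ready_at:
            # A passing probe marks the container healthy immediately,
            # regardless of how much start_period is left.
            return ("healthy", t)
        if t > start_period:
            failures += 1
            if failures >= retries:
                return ("unhealthy", t)
        t += interval


# Fast host: ready after 60s, same result under 5m and 20m.
fast_5m = health_outcome(ready_at=60, start_period=300)
fast_20m = health_outcome(ready_at=60, start_period=1200)

# Slow cold boot: ready after 1000s, only survives under 20m.
slow_5m = health_outcome(ready_at=1000, start_period=300)
slow_20m = health_outcome(ready_at=1000, start_period=1200)
```

In this model the fast host is marked healthy at the same moment under both settings, while the slow boot is marked unhealthy under 5m and healthy under 20m.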


@@ -9,16 +9,20 @@ HEALTH_OK = (200,)
 DEPLOY_TIMEOUT = 2400 # slow Rails cold boot (15-25min); matches the EXTRA_ENV TIMEOUT below
 HTTP_TIMEOUT = 1200
-# Slow-cold-boot handling via env, NOT a cc-ci compose overlay (plan.md §9 anti-drift guardrail):
-# discourse's 15-25min Rails cold boot exceeds the recipe healthcheck's default start_period (5m) +
-# grace, so swarm would kill the still-booting app and the deploy never converges. Rather than fork
-# the recipe with a compose.*.yml overlay (which drifts from what ships), the recipe-PR
-# (recipe-maintainers/discourse#1) parameterizes the app healthcheck as
-# `start_period: ${APP_START_PERIOD:-5m}` (default unchanged for real users); cc-ci just sets a larger
-# value here. TIMEOUT (abra's internal convergence wait) is raised to outlast the boot.
+# Slow-cold-boot handling via a LITERAL recipe-PR start_period bump, NOT a cc-ci compose overlay
+# (plan.md §9 anti-drift guardrail). discourse's 15-25min Rails cold boot exceeds the recipe
+# healthcheck's default start_period (5m) + grace, so swarm would kill the still-booting app and the
+# deploy never converges. §9 pt1 prefers exposing such a value as an env var — but abra REJECTS
+# env-interpolation in healthcheck `start_period` (`FATA ...Does not match format 'duration'` for both
+# `${VAR}` and quoted `"${VAR:-5m}"`; it validates the literal compose duration before substitution,
+# and no catalogue recipe env-interpolates start_period). So the §9-compliant fix is a LITERAL bump in
+# the recipe-PR (recipe-maintainers/discourse#1): `start_period: 20m` on the app healthcheck — a change
+# to the recipe EVERYONE runs (not a cc-ci fork), and strictly safer (start_period only widens the
+# startup grace; a healthy check still marks healthy immediately, so fast hosts are unaffected).
+# Precedent: the lasuite-drive collabora start_period recipe-PR. (See DECISIONS.md 2026-05-30.)
+# TIMEOUT (abra's internal convergence wait) is raised to outlast the boot.
 EXTRA_ENV = {
     "TIMEOUT": "2400",
-    "APP_START_PERIOD": "1200s",
 }
# Upgrade tier — N/A (declared NOT-TESTABLE under cc-ci; Adversary §7.1 sign-off GRANTED, REVIEW-2