feat(2): discourse start_period via literal recipe-PR bump (abra can't env-interpolate start_period)

abra rejects env-interpolation in healthcheck start_period (FATA 'Does not match
format duration' for both ${VAR} and quoted forms — validates the literal compose
duration before .env substitution). So §9 pt1's env-var route is impossible for
this field; the §9-compliant fix is a LITERAL start_period:20m bump in the
recipe-PR (recipe everyone runs, not a cc-ci overlay; strictly safer). Remove
APP_START_PERIOD from recipe_meta EXTRA_ENV; record the finding in DECISIONS
(ghost E1 must use the same approach); STATUS-2 → new PR head 7a2e0e0.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -1033,3 +1033,24 @@ Orchestrator policy (plan.md §9 + cc-ci-plan/plan-prefer-env-over-compose-overl
 
 Follow-ups (F2-14 / sub-plan E1-E6, DONE veto'd until cleared): ghost start_period overlay →
 APP_START_PERIOD env PR (E1); mumble host-ports overlay → justify-as-last-resort or migrate (E4).
+
+## 2026-05-30 — FINDING: abra rejects env-interpolated healthcheck start_period → literal recipe-PR bump (§9)
+
+While migrating discourse off its cc-ci compose overlay per plan §9 (prefer an upstream env-var
+recipe-PR over a cc-ci `compose.*.yml`), discovered abra CANNOT env-interpolate the healthcheck
+`start_period` field: both `start_period: ${APP_START_PERIOD:-5m}` and the quoted
+`start_period: "${APP_START_PERIOD:-5m}"` FATA at `abra app new` with
+`services.app.healthcheck.start_period Does not match format 'duration'`. abra validates the compose
+schema's duration format on the LITERAL template string before any `.env` substitution, and NO recipe
+in the catalogue env-interpolates start_period (grep confirmed empty).
+
+**Consequence for §9 pt1:** "expose the cc-ci-tuned value as an env var" is NOT achievable for
+`start_period` specifically. The §9-compliant alternative is a **LITERAL bump in the upstream
+recipe-PR** — still NOT a cc-ci compose overlay (the change lives in the recipe everyone runs), and a
+larger start_period is strictly safer for all users (it only widens the startup failure-grace; a
+healthy check still marks healthy immediately, so fast hosts are unaffected). Precedent: the
+sub-plan's own lasuite-drive collabora "start_period [KEYSTONE]" recipe-PR.
+
+- **discourse**: recipe-PR `recipe-maintainers/discourse#1` sets `start_period: 20m` (covers the
+  15-25min Rails first-boot; default was 5m). cc-ci recipe_meta no longer sets APP_START_PERIOD.
+- **ghost (E1)**: must use the SAME literal-bump approach, NOT an env var (same abra limitation).
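
The ordering problem recorded in the finding above can be illustrated with a minimal Python sketch. It is not abra's code: `DURATION_RE`, `VAR_RE`, `is_duration` and `substitute` are hypothetical stand-ins for a schema "duration" check and a `${VAR:-default}` expander, assumed here only to show why validating the literal template fails while the substituted value would pass.

```python
import re

# Assumption: simplified stand-in for the compose schema's 'duration' format
# (e.g. "5m", "1200s", "1m30s"); not abra's actual validator.
DURATION_RE = re.compile(r"(\d+(ns|us|ms|s|m|h))+")

# Assumption: handles only the ${VAR} and ${VAR:-default} forms.
VAR_RE = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")


def is_duration(value: str) -> bool:
    """Return True if the whole string looks like a duration literal."""
    return DURATION_RE.fullmatch(value) is not None


def substitute(template: str, env: dict[str, str]) -> str:
    """Expand ${VAR} / ${VAR:-default} using env, falling back to the default."""
    def repl(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        return env.get(name, default if default is not None else "")
    return VAR_RE.sub(repl, template)


template = "${APP_START_PERIOD:-5m}"
env = {"APP_START_PERIOD": "20m"}

# Validate-then-substitute (the order the finding describes): the literal
# template is not a duration, so the check fails before .env is consulted.
print(is_duration(template))                   # False

# Substitute-then-validate would have passed, but that is not what happens.
print(is_duration(substitute(template, env)))  # True

# A literal value passes in either order, hence the literal recipe-PR bump.
print(is_duration("20m"))                      # True
```

Under these assumptions only a literal duration can satisfy a check that runs before substitution, which is consistent with the decision to set `start_period: 20m` directly in the recipe.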
@@ -9,16 +9,20 @@ HEALTH_OK = (200,)
 DEPLOY_TIMEOUT = 2400 # slow Rails cold boot (15-25min); matches the EXTRA_ENV TIMEOUT below
 HTTP_TIMEOUT = 1200
 
-# Slow-cold-boot handling via env, NOT a cc-ci compose overlay (plan.md §9 anti-drift guardrail):
-# discourse's 15-25min Rails cold boot exceeds the recipe healthcheck's default start_period (5m) +
-# grace, so swarm would kill the still-booting app and the deploy never converges. Rather than fork
-# the recipe with a compose.*.yml overlay (which drifts from what ships), the recipe-PR
-# (recipe-maintainers/discourse#1) parameterizes the app healthcheck as
-# `start_period: ${APP_START_PERIOD:-5m}` (default unchanged for real users); cc-ci just sets a larger
-# value here. TIMEOUT (abra's internal convergence wait) is raised to outlast the boot.
+# Slow-cold-boot handling via a LITERAL recipe-PR start_period bump, NOT a cc-ci compose overlay
+# (plan.md §9 anti-drift guardrail). discourse's 15-25min Rails cold boot exceeds the recipe
+# healthcheck's default start_period (5m) + grace, so swarm would kill the still-booting app and the
+# deploy never converges. §9 pt1 prefers exposing such a value as an env var — but abra REJECTS
+# env-interpolation in healthcheck `start_period` (`FATA ...Does not match format 'duration'` for both
+# `${VAR}` and quoted `"${VAR:-5m}"`; it validates the literal compose duration before substitution,
+# and no catalogue recipe env-interpolates start_period). So the §9-compliant fix is a LITERAL bump in
+# the recipe-PR (recipe-maintainers/discourse#1): `start_period: 20m` on the app healthcheck — a change
+# to the recipe EVERYONE runs (not a cc-ci fork), and strictly safer (start_period only widens the
+# startup grace; a healthy check still marks healthy immediately, so fast hosts are unaffected).
+# Precedent: the lasuite-drive collabora start_period recipe-PR. (See DECISIONS.md 2026-05-30.)
+# TIMEOUT (abra's internal convergence wait) is raised to outlast the boot.
 EXTRA_ENV = {
     "TIMEOUT": "2400",
-    "APP_START_PERIOD": "1200s",
 }
 
 # Upgrade tier — N/A (declared NOT-TESTABLE under cc-ci; Adversary §7.1 sign-off GRANTED, REVIEW-2
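
The "strictly safer" argument in the comment above rests on how a healthcheck start period behaves: probe failures inside the window are not counted toward the retry limit, while a successful probe marks the container healthy at once. The following is a rough simulation of that behaviour, not Docker's or swarm's implementation. The `simulate` helper, the 60-second probe interval, the 3 retries, and the 18-minute and 90-second boot times are all assumptions chosen for illustration; the recipe's real interval and retry values are not shown in this diff.

```python
def simulate(first_healthy_at: int, start_period: int,
             interval: int = 60, retries: int = 3, horizon: int = 3600):
    """Simulate periodic health probes; all times are in seconds.

    Assumptions (hypothetical, for illustration only):
    - the probe runs every `interval` seconds and succeeds once the app has
      been up for `first_healthy_at` seconds;
    - failures at or before `start_period` are ignored;
    - `retries` counted failures mark the container unhealthy.
    """
    failures = 0
    t = interval
    while t <= horizon:
        if t >= first_healthy_at:
            return ("healthy", t)        # success always wins immediately
        if t > start_period:
            failures += 1                # only failures after the grace count
            if failures >= retries:
                return ("unhealthy", t)
        t += interval
    return ("starting", horizon)


slow_boot = 18 * 60   # hypothetical boot inside the 15-25min range
fast_boot = 90        # hypothetical fast host

print(simulate(slow_boot, start_period=5 * 60))    # ('unhealthy', 480)
print(simulate(slow_boot, start_period=20 * 60))   # ('healthy', 1080)
print(simulate(fast_boot, start_period=5 * 60))    # ('healthy', 120)
print(simulate(fast_boot, start_period=20 * 60))   # ('healthy', 120)
```

In this model the fast host reports healthy at the same moment under both settings, while the slow boot only survives with the larger window, which matches the reasoning given for the literal `start_period: 20m` bump.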