feat(2): discourse Q4.6 policy-compliant shape (plan §9) — env-var start_period, delete cc-ci overlay, upgrade N/A

Migrate discourse off the cc-ci compose overlay per plan §9 / plan-prefer-env-over-compose-overlay.md:
- recipe_meta: drop UPGRADE_BASE_VERSION + COMPOSE_FILE + CHAOS_BASE_DEPLOY; set APP_START_PERIOD=1200s
  via EXTRA_ENV (the recipe-PR exposes start_period: ${APP_START_PERIOD:-5m}); declare upgrade tier N/A
  (both published prev bases pin removed bitnami images; Adversary §7.1 granted, REVIEW-2 efe3790).
- delete tests/discourse/compose.ccci-health.yml + install_steps.sh (existed only to copy the overlay).
- DECISIONS.md + STATUS-2 record the §9 guardrail + discourse shape (upgrade N/A, env start_period,
  pg_backup restore-hook recipe-PR = 5th data-loss recipe cc-ci caught).
recipe-PR head now 8b8df17 (start_period env var added). Not a gate claim yet; next: run STAGES=install,backup,restore,custom.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 15:47:28 +01:00
parent a389bd0832
commit c346b9763b
5 changed files with 71 additions and 99 deletions


@@ -1001,3 +1001,35 @@ run when ClickHouse fails to boot. NOT weakening anything.
**Re-entry:** when the ClickHouse boot is stabilised (e.g. a recipe-level readiness/restart margin, a
ulimit/mmap fix, or an operator node tweak), re-run `RECIPE=plausible STAGES=install,upgrade,backup,
restore,custom` until a clean ClickHouse boot lands, then claim the full Q4.7 gate. Filed in DEFERRED.md.
## 2026-05-30 — plan §9 anti-overlay guardrail + discourse Q4.6 policy-compliant shape
Orchestrator policy (plan.md §9 + cc-ci-plan/plan-prefer-env-over-compose-overlay.md): AVOID cc-ci
`compose.*.yml` overlays (a private fork that drifts from what ships). Preferred fixes:
1. cc-ci-tuned value (e.g. healthcheck start_period) → UPSTREAM recipe-PR exposing it as an env var
(current value kept as the default, documented in .env.sample); cc-ci sets it via `recipe_meta` EXTRA_ENV. No new compose.
2. Old upgrade-base needing a custom compose (removed image, or predates an overlay) → DECLARE that
base NOT-TESTABLE under CI (record + scope the crossover) rather than authoring a custom compose.
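
Illustration of fix 1 (a sketch, not cc-ci or Compose code): Compose-style `${VAR:-default}` interpolation falls back to the default when the variable is unset or empty, so real users keep the shipped value and only cc-ci's EXTRA_ENV changes it. The `resolve` helper and `HEALTHCHECK_LINE` constant are hypothetical names; the values are the discourse ones recorded in this entry.

```python
import re

# Handles only the ${NAME:-default} form; other interpolation forms are out of scope.
_PATTERN = re.compile(r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*):-(?P<default>[^}]*)\}")


def resolve(template: str, env: dict) -> str:
    """Replace ${NAME:-default} with env[NAME] if set and non-empty, else the default."""
    def _sub(match):
        value = env.get(match.group("name"), "")
        return value if value else match.group("default")
    return _PATTERN.sub(_sub, template)


# Hypothetical one-line excerpt of the parameterized healthcheck.
HEALTHCHECK_LINE = "start_period: ${APP_START_PERIOD:-5m}"

print(resolve(HEALTHCHECK_LINE, {}))                             # start_period: 5m
print(resolve(HEALTHCHECK_LINE, {"APP_START_PERIOD": "1200s"}))  # start_period: 1200s
```

The first call models a real user with no override; the second models cc-ci passing `APP_START_PERIOD=1200s` through EXTRA_ENV.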
**discourse Q4.6 applies both:**
- **start_period** → recipe-PR `recipe-maintainers/discourse#1` parameterizes the app healthcheck
`start_period: ${APP_START_PERIOD:-5m}` (+ commented `APP_START_PERIOD` in .env.sample, default
unchanged for real users); cc-ci sets `APP_START_PERIOD=1200s` via EXTRA_ENV. The cc-ci overlay
`tests/discourse/compose.ccci-health.yml` + `install_steps.sh` + `COMPOSE_FILE`/`CHAOS_BASE_DEPLOY`
are DELETED.
- **upgrade tier N/A** (Adversary §7.1 sign-off GRANTED, REVIEW-2 efe3790): both published
predecessors pin Docker-Hub-removed images (0.7.0+3.3.1→bitnami/discourse:3.3.1 404, 0.6.3+3.1.2→
bitnami/discourse:3.1.2 404). Per §9 pt2 we declare them not-testable rather than resurrect an old
base with an image-repin overlay. So discourse runs the maximal subset install,backup,restore,custom.
(The earlier "honest 0.7.0→0.8.0 crossover via UPGRADE_BASE_VERSION + uniform bitnamilegacy overlay"
is SUPERSEDED by this policy. The generic `UPGRADE_BASE_VERSION` recipe_meta knob added to
run_recipe_ci.py stays as a harmless unused generic hook.)
- **postgres restore-hook** (recipe-PR, policy-neutral): the published recipe ran pg_dump on backup but
had NO restore hook, so a restore silently left the live (un-restored) database state in place. cc-ci's P4
overlay caught it (the seeded ci_marker was gone after restore). The PR adds `pg_backup.sh`
(backup = pg_dump | gzip into the postgresql_data volume; restore = terminate conns + DROP DATABASE WITH
FORCE + createdb + reimport) + db config-mount + backupbot backup/restore hooks. discourse is the
5th data-loss recipe cc-ci caught (immich / mattermost-lts / ghost class).
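
Illustration of the restore-hook failure mode (a minimal sketch, not the cc-ci harness): a marker is seeded, a backup is taken, the marker is removed, and a restore must bring it back. `FakeApp` and `restore_roundtrip_ok` are hypothetical names, and the exact order of steps in the real check is an assumption.

```python
class FakeApp:
    """Hypothetical stand-in for a deployed recipe with a database."""

    def __init__(self, has_restore_hook: bool):
        self.has_restore_hook = has_restore_hook
        self.rows = set()   # live database contents
        self.dump = set()   # contents captured by the last backup

    def backup(self) -> None:
        # Analogous to a dump taken on backup.
        self.dump = set(self.rows)

    def restore(self) -> None:
        # With a restore hook the dump replaces the live state.
        # Without one, the restore "succeeds" but the live state is untouched.
        if self.has_restore_hook:
            self.rows = set(self.dump)


def restore_roundtrip_ok(app: FakeApp, marker: str = "ci_marker") -> bool:
    """Seed a marker, back up, remove the marker, restore, and check it is back."""
    app.rows.add(marker)
    app.backup()
    app.rows.discard(marker)
    app.restore()
    return marker in app.rows


print(restore_roundtrip_ok(FakeApp(has_restore_hook=False)))  # False: marker gone after restore
print(restore_roundtrip_ok(FakeApp(has_restore_hook=True)))   # True: marker restored
```

The point of the round trip is that a backup which dumps correctly can still hide data loss; only a check on post-restore state exposes a missing restore hook.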
Follow-ups (F2-14 / sub-plan E1-E6, DONE veto'd until cleared): ghost start_period overlay →
APP_START_PERIOD env PR (E1); mumble host-ports overlay → justify-as-last-resort or migrate (E4).