feat(2): discourse Q4.6 policy-compliant shape (plan §9) — env-var start_period, delete cc-ci overlay, upgrade N/A

Migrate discourse off the cc-ci compose overlay per plan §9 / plan-prefer-env-over-compose-overlay.md:
- recipe_meta: drop UPGRADE_BASE_VERSION + COMPOSE_FILE + CHAOS_BASE_DEPLOY; set APP_START_PERIOD=1200s
  via EXTRA_ENV (the recipe-PR exposes start_period: ${APP_START_PERIOD:-5m}); declare upgrade tier N/A
  (both published prev bases pin removed bitnami images; Adversary §7.1 granted, REVIEW-2 efe3790).
- delete tests/discourse/compose.ccci-health.yml + install_steps.sh (existed only to copy the overlay).
- DECISIONS.md + STATUS-2 record the §9 guardrail + discourse shape (upgrade N/A, env start_period,
  pg_backup restore-hook recipe-PR = 5th data-loss recipe cc-ci caught).

recipe-PR head now 8b8df17 (start_period env var added). Not a claim — run STAGES=install,backup,restore,custom next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
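
Illustration (not part of the commit): a minimal sketch of how `start_period: ${APP_START_PERIOD:-5m}` is expected to resolve, assuming standard Compose `:-` interpolation (the default applies when the variable is unset or empty). `resolve_default` is a hypothetical helper written for this note; it is not a cc-ci or Compose function.

```python
import re

# Matches ${NAME:-default}; other Compose interpolation forms are ignored here.
_DEFAULT_FORM = re.compile(r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*):-(?P<default>[^}]*)\}")


def resolve_default(template: str, env: dict[str, str]) -> str:
    """Substitute ${NAME:-default} using env, falling back when NAME is unset or empty."""

    def _substitute(match: re.Match) -> str:
        value = env.get(match.group("name"), "")
        return value if value else match.group("default")

    return _DEFAULT_FORM.sub(_substitute, template)


line = "start_period: ${APP_START_PERIOD:-5m}"
print(resolve_default(line, {}))                              # start_period: 5m
print(resolve_default(line, {"APP_START_PERIOD": "1200s"}))   # start_period: 1200s
```

Real users who never set the variable keep the 5m default; only a deployment that passes `APP_START_PERIOD=1200s` (here via EXTRA_ENV) sees the longer margin, which is why no separate compose file is needed.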
@@ -1001,3 +1001,35 @@ run when ClickHouse fails to boot. NOT weakening anything.
 **Re-entry:** when the ClickHouse boot is stabilised (e.g. a recipe-level readiness/restart margin, a
 ulimit/mmap fix, or an operator node tweak), re-run `RECIPE=plausible STAGES=install,upgrade,backup,
 restore,custom` until a clean ClickHouse boot lands, then claim the full Q4.7 gate. Filed in DEFERRED.md.
+
+## 2026-05-30 — plan §9 anti-overlay guardrail + discourse Q4.6 policy-compliant shape
+
+Orchestrator policy (plan.md §9 + cc-ci-plan/plan-prefer-env-over-compose-overlay.md): AVOID cc-ci
+`compose.*.yml` overlays (a private fork that drifts from what ships). Preferred fixes:
+1. cc-ci-tuned value (e.g. healthcheck start_period) → UPSTREAM recipe-PR exposing it as an env var
+(current value as default in env.sample); cc-ci sets it via `recipe_meta` EXTRA_ENV. No new compose.
+2. Old upgrade-base needing a custom compose (removed image, or predates an overlay) → DECLARE that
+base NOT-TESTABLE under CI (record + scope the crossover) rather than authoring a custom compose.
+
+**discourse Q4.6 applies both:**
+- **start_period** → recipe-PR `recipe-maintainers/discourse#1` parameterizes the app healthcheck
+`start_period: ${APP_START_PERIOD:-5m}` (+ commented `APP_START_PERIOD` in .env.sample, default
+unchanged for real users); cc-ci sets `APP_START_PERIOD=1200s` via EXTRA_ENV. The cc-ci overlay
+`tests/discourse/compose.ccci-health.yml` + `install_steps.sh` + `COMPOSE_FILE`/`CHAOS_BASE_DEPLOY`
+are DELETED.
+- **upgrade tier N/A** (Adversary §7.1 sign-off GRANTED, REVIEW-2 efe3790): both published
+predecessors pin Docker-Hub-removed images (0.7.0+3.3.1→bitnami/discourse:3.3.1 404, 0.6.3+3.1.2→
+bitnami/discourse:3.1.2 404). Per §9 pt2 we declare them not-testable rather than resurrect an old
+base with an image-repin overlay. So discourse runs the maximal subset install,backup,restore,custom.
+(The earlier "honest 0.7.0→0.8.0 crossover via UPGRADE_BASE_VERSION + uniform bitnamilegacy overlay"
+is SUPERSEDED by this policy. The generic `UPGRADE_BASE_VERSION` recipe_meta knob added to
+run_recipe_ci.py stays as a harmless unused generic hook.)
+- **postgres restore-hook** (recipe-PR, policy-neutral): the published recipe pg_dumped on backup but
+had NO restore hook → a restored backup silently kept the live (un-restored) state. cc-ci's P4
+overlay caught it (seeded ci_marker gone after restore). The PR adds `pg_backup.sh`
+(backup=pg_dump|gzip into the postgresql_data volume; restore=terminate conns + DROP DATABASE WITH
+FORCE + createdb + reimport) + db config-mount + backupbot backup/restore hooks. discourse is the
+5th data-loss recipe cc-ci caught (immich / mattermost-lts / ghost class).
+
+Follow-ups (F2-14 / sub-plan E1-E6, DONE veto'd until cleared): ghost start_period overlay →
+APP_START_PERIOD env PR (E1); mumble host-ports overlay → justify-as-last-resort or migrate (E4).
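
Illustration (not part of the diff): a minimal sketch of the §9 pt2 decision as applied to discourse, assuming the only input is whether each published predecessor's pinned image can still be pulled. `select_stages` and `FULL_STAGES` are hypothetical names for this note; the sketch does not model the Adversary §7.1 sign-off that the source also requires before declaring the tier N/A.

```python
FULL_STAGES = ["install", "upgrade", "backup", "restore", "custom"]


def select_stages(base_image_pullable: dict[str, bool]) -> list[str]:
    """Keep the upgrade stage only if at least one published base deploys as shipped.

    Keys are published predecessor versions; values say whether the image that
    version pins can still be pulled without a cc-ci compose overlay.
    """
    if any(base_image_pullable.values()):
        return list(FULL_STAGES)
    # No servable base: declare the upgrade tier N/A and run the maximal subset.
    return [stage for stage in FULL_STAGES if stage != "upgrade"]


# Both discourse predecessors pin images that return 404, per the entry above.
discourse_bases = {"0.7.0+3.3.1": False, "0.6.3+3.1.2": False}
print(",".join(select_stages(discourse_bases)))  # install,backup,restore,custom
```

The point of the rule is that the stage list shrinks instead of the recipe being patched: nothing cc-ci-private is authored to make an unservable base deployable.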
@@ -66,24 +66,24 @@ tree must carry:
 the running `drone_…` stack is the platform's OWN CI engine (infra), NOT the recipe-under-test (false
 alarm cleared). Deferral SOUND; maximal subset (declarative fix + scoped gitea+drone suite) ready for
 post-rebuild run.
-- **discourse (Q4.6)** — IN PROGRESS @2026-05-30. Re-pin **PR `recipe-maintainers/discourse#1`**
-(branch `ci/bitnamilegacy-repin`, head `7b7ddd70bc753608d086884b8de1ad3c327d9ac5`) re-pins both
-`bitnami/discourse:3.3.1` → `bitnamilegacy/discourse:3.3.1` (legacy=200, bitnami=404) + bumps version
-0.7.0→0.8.0. install+custom GREEN (pr5, healthcheck-overlay + re-pin both work); P3 authored (§4.3
-create-topic + site config). **UPGRADE TIER — implementing the HONEST crossover (Adversary §7.1 leans
-DENY on a skip-with-sign-off; agreed).** Honest 0.7.0+3.3.1 → 0.8.0+3.3.1 is achievable: harness
-default upgrade base = `recipe_versions[-2]` = 0.6.3+3.1.2 (img 3.1.2 — hollow, ≠ head's 3.3.1), but
-the PR's TRUE predecessor is [-1] = 0.7.0+3.3.1 (shares head's 3.3.1). Implemented cc-ci-side (commit
-a750937): (a) `recipe_meta.UPGRADE_BASE_VERSION="0.7.0+3.3.1"` + generic override in `run_recipe_ci.py`
-(`prev = meta.get("UPGRADE_BASE_VERSION") or previous_version`); (b) `compose.ccci-health.yml` re-pins
-`services.{app,sidekiq}.image: bitnamilegacy/discourse:3.3.1` (servable base 0.7.0 whose compose pins
-the 404 bitnami:3.3.1; idempotent on head). → real HC1 crossover (version-label 0.7.0→0.8.0, same
-servable discourse 3.3.1; namespace-only re-pin = the PR's change). **FULL run install,upgrade,backup,
-restore,custom IN FLIGHT** on cc-ci `/root/builder-clone`, log `/root/ccci-discourse-maxsub.log`,
-`RECIPE=discourse PR=1 REF=7b7ddd70... SRC=recipe-maintainers/discourse`. On green → CLAIM Q4.6 (no §7.1
-deferral). If restore (P4) RED → discourse postgres restore-hook recipe-PR (immich/mattermost/ghost
-class). **POLL with `ssh -T` (no PTY).** **THEN:** plausible Q4.7b recipe-PR (`entrypoint.clickhouse.sh`
-wget restart-storm) → plausible-full green → CLAIM Q4.7.
+- **discourse (Q4.6)** — IN PROGRESS @2026-05-30, **policy-compliant shape (plan §9 anti-overlay)**.
+recipe-PR `recipe-maintainers/discourse#1` (branch `ci/bitnamilegacy-repin`, head
+`8b8df1730f48e4f8e8d1d7e2c0a7c9b5e4f3a2d1`): (1) re-pins app+sidekiq `bitnami/discourse:3.3.1` →
+`bitnamilegacy/discourse:3.3.1` (bitnami 404; legit upstream fix); (2) parameterizes the app
+healthcheck `start_period: ${APP_START_PERIOD:-5m}` + `.env.sample` default (cc-ci sets
+`APP_START_PERIOD=1200s` via EXTRA_ENV — NO cc-ci compose overlay); (3) adds `pg_backup.sh` +
+db config-mount + backupbot backup/restore hooks (P4 restore-hook — published recipe had pg_dump
+backup but no restore → silent data loss; cc-ci caught it: 5th data-loss recipe, immich/mattermost/
+ghost class). **UPGRADE TIER = N/A** (Adversary §7.1 sign-off GRANTED, REVIEW-2 `efe3790`): both
+published predecessors pin Docker-Hub-removed images (0.7.0→bitnami:3.3.1 404, 0.6.3→bitnami:3.1.2
+404); per §9 pt2 declared NOT-TESTABLE rather than image-repin overlay. cc-ci overlay
+(`compose.ccci-health.yml` + `install_steps.sh` + COMPOSE_FILE/CHAOS_BASE_DEPLOY) **DELETED**;
+`UPGRADE_BASE_VERSION` removed from recipe_meta (the generic harness knob stays, unused). **Run shape:
+`STAGES=install,backup,restore,custom`** (no upgrade). **NEXT:** run
+`RECIPE=discourse PR=1 REF=8b8df1730f48e4f8e8d1d7e2c0a7c9b5e4f3a2d1 SRC=recipe-maintainers/discourse
+STAGES=install,backup,restore,custom` on `/root/builder-clone` → on all-green CLAIM Q4.6. **POLL with
+`ssh -T` (no PTY).** **THEN:** ghost E1 (start_period→APP_START_PERIOD env PR) + plausible Q4.7b +
+mumble E4 → Q5 (these + the overlay migrations gate the DONE veto F2-14).
 - authentik / various --extra-flag tests — DEFERRED (Phase-2 DONE NOT gated on them per operator policy).
 DoD P2/P5/P6/P7/P8 broadly satisfied; remaining is P1 coverage of the above + Q5 docs/sample re-verify.
 
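
Illustration (not part of the diff): a minimal sketch of the generic `UPGRADE_BASE_VERSION` override that stays in the harness, reconstructed only from the expression quoted in the removed status entry (`prev = meta.get("UPGRADE_BASE_VERSION") or previous_version`) and its note that the default base is `recipe_versions[-2]`. `pick_upgrade_base` is a hypothetical wrapper; the actual code in `run_recipe_ci.py` may be structured differently.

```python
from typing import Optional


def pick_upgrade_base(meta: dict, recipe_versions: list[str]) -> Optional[str]:
    """Return the version to deploy before upgrading to the PR head.

    Default: the second-to-last published version, as the status entry describes.
    Override: a non-empty UPGRADE_BASE_VERSION in recipe_meta wins.
    """
    previous_version = recipe_versions[-2] if len(recipe_versions) >= 2 else None
    return meta.get("UPGRADE_BASE_VERSION") or previous_version


published = ["0.6.3+3.1.2", "0.7.0+3.3.1"]
print(pick_upgrade_base({}, published))                                        # 0.6.3+3.1.2
print(pick_upgrade_base({"UPGRADE_BASE_VERSION": "0.7.0+3.3.1"}, published))   # 0.7.0+3.3.1
```

With `UPGRADE_BASE_VERSION` dropped from the discourse recipe_meta, the override branch is never taken for this recipe, which matches the "stays, unused" wording in the new entry.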