diff --git a/cc-ci-plan/plan-prefer-env-over-compose-overlay.md b/cc-ci-plan/plan-prefer-env-over-compose-overlay.md
deleted file mode 100644
index ace9357..0000000
--- a/cc-ci-plan/plan-prefer-env-over-compose-overlay.md
+++ /dev/null
@@ -1,76 +0,0 @@
-# Sub-plan — prefer upstream env-parameterization over cc-ci compose overlays
-
-**Status:** QUEUED — a policy + a set of recipe-PR / harness follow-ups. Picks up as near-term Phase-2
-units (no phase-pause). **Owner:** Builder + Adversary loops. **This file:**
-`/srv/cc-ci/cc-ci-plan/plan-prefer-env-over-compose-overlay.md`
-**Codifies:** the `plan.md §9` guardrail "Don't fork the recipe's compose — parameterize upstream,
-tune via env."
-
----
-
-## 0. Principle (operator, 2026-05-30)
-
-**A cc-ci-authored compose file/overlay must be avoided wherever possible** — every extra
-`compose.*.yml` we layer via `COMPOSE_FILE` is a private fork of the deployment that can **drift from
-the recipe users actually run**, so we'd stop testing what ships. Two strictly-preferred alternatives:
-
-1. **Need a value tuned for cc-ci's env (e.g. a longer healthcheck `start_period`)** → open an
-   **upstream recipe PR** that exposes it as an **env var** (current value as the default in
-   `env.sample`), e.g. `APP_START_PERIOD`. cc-ci then sets that env in the app `.env` (via
-   `recipe_meta` `EXTRA_ENV`) — **no new compose**. Bonus: real operators on slow hosts get the same
-   knob.
-2. **Need a custom compose only to make the UPGRADE tier work from an older base version** (a
-   since-removed image tag, or an overlay the old version predates) → **prefer declaring that older
-   version not-testable under this CI env** (record it + skip/scope that crossover) over authoring a
-   custom compose for it.
-
-A cc-ci compose overlay is a **last resort** only when neither works; it must be Adversary-confirmed
-non-drifting and paired with the upstream-env PR that will obsolete it.
-
-## 1. The current debt to migrate (3 overlays)
-
-- **ghost `compose.ccci-health.yml`** (app `start_period: 900s`) — the fresh-DB MySQL migration
-  (~6–9 min) exceeds the recipe's `start_period: 1m` → swarm kills it mid-migration → `migrations_lock`
-  deadlock.
-- **discourse `compose.ccci-health.yml`** (app `start_period: 1200s` + image re-pin
-  `bitnami/discourse:3.3.1`→`bitnamilegacy/discourse:3.3.1` on app+sidekiq) — Rails cold boot 15–25 min
-  exceeds the recipe's `start_period: 5m`; the image re-pin is to make the **old base version**
-  deployable after Docker Hub dropped the `bitnami/discourse` tags.
-- **mumble `compose.host-ports.yml`** — a copy of the *upstream* host-ports overlay, provided only so
-  the **older base version (0.2.0+)** that predates it can resolve `COMPOSE_FILE`. (The current version
-  ships it natively — that part is fine and stays.)
-
-## 2. Definition of Done (Adversary cold-verifies)
-
-- [ ] **E1 — ghost: `start_period` env PR.** Recipe PR to ghost exposing the app healthcheck
-  `start_period` as an env var (e.g. `APP_START_PERIOD`), **default = the current recipe value** in
-  `env.sample` (no behavior change for existing users). Verified green on cc-ci (the recipe-PR
-  dogfood). Then cc-ci sets `APP_START_PERIOD` via `recipe_meta` `EXTRA_ENV` and **removes
-  `tests/ghost/compose.ccci-health.yml` + its `COMPOSE_FILE`/`install_steps` wiring** — full ghost
-  suite still green (install migration completes, no deadlock).
-- [ ] **E2 — discourse: `start_period` env PR.** Same pattern for discourse's app `start_period`;
-  remove the `start_period` half of `compose.ccci-health.yml` once CI tunes it by env.
-- [ ] **E3 — discourse old-base image re-pin → declare untestable instead.** Do NOT keep the
-  `bitnami→bitnamilegacy` re-pin in a cc-ci compose. Either (a) the recipe PR re-pins the image
-  upstream (if that's the genuine recipe fix), OR (b) **declare the old base version not-testable
-  under cc-ci** (its image is gone from the registry) and scope the upgrade crossover accordingly
-  — recorded in `DECISIONS.md`. Remove the re-pin from any cc-ci compose.
-- [ ] **E4 — mumble old-base host-ports → declare untestable instead.** Drop the cc-ci copy of
-  `compose.host-ports.yml`; for the old base that predates the upstream overlay, **declare that
-  version not-testable under cc-ci's on-host port requirement** rather than shipping a copy. The
-  current version (ships the overlay natively) tests normally.
-- [ ] **E5 — No cc-ci compose overlays remain** (`tests/**/compose.*.yml` that cc-ci authored/copied
-  are gone), OR any that genuinely cannot be replaced is Adversary-justified + paired with a filed
-  upstream-env PR. The guardrail (`plan.md §9`) holds going forward.
-- [ ] **E6 — No test weakened.** Every affected recipe's full suite still passes with real assertions
-  + real healthcheck gating; the only change is *how* the value is supplied (env, not a forked
-  compose) or that an un-runnable old crossover is honestly skipped — Adversary cold-verified.
-
-## 3. Notes / guardrails
-- Env PRs follow the standard recipe-PR rule: "working" only when cc-ci verifies the full suite green
-  (`/recipe-upgrade`'s `!testme`-on-PR path), operator-merged. Mirrors the recipe-robustness PR pattern
-  (lasuite-drive collabora, plausible Q4.7b, immich).
-- **Declaring a version untestable is a first-class, honest outcome** — record which version + why
-  (registry tag gone / predates a required overlay) in `DECISIONS.md`; it is NOT a test weakening.
-- Until an env PR is merged upstream, cc-ci may need the recipe-PR *branch* (via `SRC`+`REF`) to test
-  green — that's fine (it's the recipe under test), unlike a private cc-ci compose fork.