cc-ci-orchestrator/cc-ci-plan/plan-ccci-compose-overlay-policy.md
autonomic-bot fd08a977d0 overlay policy: standardize the ccci overlay filename to compose.ccci.yml
Operator: use a single uniform filename `compose.ccci.yml` per recipe (one file
holding all cc-ci-side deploy tweaks) rather than per-purpose suffixes like
compose.ccci-health.yml. Updated §9 + plan-ccci-compose-overlay-policy.md; added
a DoD item to rename tests/{ghost,discourse}/compose.ccci-health.yml ->
compose.ccci.yml and update their install_steps.sh cp target + recipe_meta
COMPOSE_FILE.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 17:25:48 +01:00

# Policy + cleanup — cc-ci compose overlays (when they're justified) & upgrade-tier from-version
**Status:** POLICY (codifies `plan.md §9`) + a small set of follow-ups. **Owner:** Builder + Adversary.
**This file:** `/srv/cc-ci/cc-ci-plan/plan-ccci-compose-overlay-policy.md`
**Supersedes** the earlier `plan-prefer-env-over-compose-overlay.md`. Its premise (parameterize
`start_period` via an env var) is **wrong: abra does not support an env value for `start_period`**.

---
## 0. Policy (operator, 2026-05-30)
A cc-ci-authored compose overlay (the single `compose.ccci.yml`, layered via `COMPOSE_FILE`) risks
**drift** from the recipe users run — so **avoid where possible and justify each use**. But it is a
**legitimate, uniform fallback pattern**, not forbidden:
- **Prefer an upstream recipe PR** in most cases — a real robustness fix, or exposing a knob the recipe
should expose. That's where a fix usually belongs.
- **A ccci overlay is the right tool when the value can't be supplied any other way** — notably a
healthcheck **`start_period`**, which **abra cannot take from an env var**. The ghost/discourse
`start_period` bumps therefore **stay as overlays** (an env PR is impossible for that field).
- **Uniform pattern (acceptable fallback):** a single, fixed-name **`compose.ccci.yml`** per recipe
(NOT per-purpose suffixes — one file holds all cc-ci-side deploy tweaks for that recipe), provided
into the checkout by `install_steps.sh`, wired by `recipe_meta` `COMPOSE_FILE`
(`compose.yml:compose.ccci.yml`), kept as an untracked file so it survives the upgrade
`git checkout -f` (`CHAOS_BASE_DEPLOY=True`; `assert_upgraded` strips the `+U` marker — see
DECISIONS 2026-05-30).
- **Each overlay must:** be **minimal + single-purpose**, **document WHY** in its header (the exact
abra/upstream limitation that forces it), and be **Adversary-confirmed** to not weaken a test or mask
a recipe defect. Where the fix also belongs upstream (e.g. a `start_period` too tight for any slow
host), **file the upstream PR too** — the overlay is the cc-ci-side fallback, not a reason to skip it.
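
The uniform pattern above can be sketched as follows. This is a minimal illustration, not the actual cc-ci `install_steps.sh`: the scratch directories, the service name `app`, and the `version` line are assumptions, and the 900s value is borrowed from the ghost overlay discussed in §2.

```shell
#!/bin/sh
set -eu

# Hypothetical paths: a scratch dir stands in for the recipe checkout.
checkout="$(mktemp -d)"
overlay_src="$(mktemp -d)/compose.ccci.yml"

# The overlay header documents WHY it exists (the limitation that forces it).
cat > "$overlay_src" <<'EOF'
# compose.ccci.yml: cc-ci-side deploy tweaks for this recipe.
# WHY: abra cannot take a healthcheck start_period from an env var,
#      and the recipe's default grace is too short on a slow CI host.
# Tunes deploy/infra only; no test assertion is weakened.
version: "3.8"
services:
  app:  # assumed service name
    healthcheck:
      start_period: 900s
EOF

# Provide the overlay into the checkout (the install_steps.sh role).
cp "$overlay_src" "$checkout/compose.ccci.yml"

# Layer it over the recipe's own compose file (the recipe_meta role).
COMPOSE_FILE="compose.yml:compose.ccci.yml"
export COMPOSE_FILE
echo "COMPOSE_FILE=$COMPOSE_FILE"
```

The overlay carries only the one field it needs to change, so everything else in the recipe's `compose.yml` stays exactly as users run it.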
## 1. Upgrade tier: always test the upgrade to LATEST
Don't drop the upgrade test because the *from* (older) version is awkward.
- **Always perform the upgrade to the latest version and run the full assertions on the latest.**
- If the older from-version can't be fully deployed/tested (image tag removed from the registry, or it
predates an overlay/feature), you do **NOT** need that older version's **custom tests** to run.
Deploy it minimally (a justified overlay is fine) or upgrade from the nearest deployable prior
version; skip only the from-version's custom assertions, and **record** the skip.
- Skipping a from-version's custom tests = honest, recorded. Skipping upgrade-to-latest = not OK.
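
The rule can be read as a small decision flow. The sketch below is illustrative only: `plan_upgrade` and its `yes`/`no` argument are hypothetical names, not part of the cc-ci harness. It shows that only the from-version's custom tests are conditional, while the upgrade to latest and the full assertions on latest always run.

```shell
#!/bin/sh
set -eu

# plan_upgrade <from_fully_testable: yes|no>
# Prints the steps of an upgrade-tier run, one per line.
plan_upgrade() {
    from_testable="$1"
    echo "deploy from-version"
    if [ "$from_testable" = "yes" ]; then
        echo "run from-version custom tests"
    else
        # Honest skip: it is recorded, never silent.
        echo "skip from-version custom tests (recorded)"
    fi
    # These two steps are unconditional: upgrade-to-latest is never dropped.
    echo "upgrade to latest"
    echo "run full assertions on latest"
}

plan_upgrade no
```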
## 2. Disposition of the current overlays
- [ ] **RENAME the overlay files to the uniform `compose.ccci.yml`.** `tests/ghost/compose.ccci-health.yml`
and `tests/discourse/compose.ccci-health.yml` → `compose.ccci.yml`; update each recipe's
`install_steps.sh` (the `cp` target) and `recipe_meta` `COMPOSE_FILE`
(`compose.yml:compose.ccci-health.yml` → `compose.yml:compose.ccci.yml`). One fixed filename per
recipe going forward.
- [ ] **ghost `compose.ccci.yml` (start_period 900s) — KEEP, justified.** abra can't env-param
`start_period`; the fresh-DB migration needs the larger grace or swarm kills it → deadlock.
Confirm the header documents this; consider an upstream PR raising ghost's `start_period` (it's a
real slow-host fragility) — but the overlay stays regardless.
- [ ] **discourse `compose.ccci.yml` — KEEP, justified (both parts).** (a) `start_period 1200s`
(same reason as ghost). (b) The `bitnami/discourse:3.3.1 → bitnamilegacy/discourse:3.3.1` re-pin
makes the from-version (0.7.0, whose `bitnami/discourse` tag Docker Hub now 404s) **deployable so
the upgrade-to-latest test can run** — namespace-only, identical discourse version, applied to
base+head. This is the §1 case: keep the upgrade-to-latest test; the 0.7.0 custom tests need not
run. Document it; if a deployable prior without the re-pin exists, prefer upgrading from that.
- [ ] **mumble `compose.host-ports.yml` (cc-ci copy for the old base) — DROP it.** Deploying mumble
0.2.0 does NOT need host-ports (that overlay only *publishes* 64738 for on-host tests). Per §1:
deploy 0.2.0 without it, **skip 0.2.0's voice/on-host custom tests**, then upgrade to the latest
version (which ships `compose.host-ports.yml` natively) and run the voice tests on the latest.
Remove the cc-ci copy + its `install_steps`/`COMPOSE_FILE` wiring for the old base; the current
version's native overlay is untouched.
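
The rename item for ghost and discourse could be carried out roughly as below. This sketch runs against a scratch tree with made-up file contents; it assumes `install_steps.sh` and `recipe_meta` are plain-text files that mention the overlay by name, which may not match the real `recipe_meta` name or format.

```shell
#!/bin/sh
set -eu

# Scratch tree standing in for tests/<recipe>/; the file contents are made up.
root="$(mktemp -d)"
for recipe in ghost discourse; do
    dir="$root/tests/$recipe"
    mkdir -p "$dir"
    : > "$dir/compose.ccci-health.yml"
    echo 'cp compose.ccci-health.yml "$dest/"' > "$dir/install_steps.sh"
    echo 'COMPOSE_FILE=compose.yml:compose.ccci-health.yml' > "$dir/recipe_meta"
done

# The rename itself: move the overlay, then rewrite every reference to the old name.
for recipe in ghost discourse; do
    dir="$root/tests/$recipe"
    mv "$dir/compose.ccci-health.yml" "$dir/compose.ccci.yml"
    for f in install_steps.sh recipe_meta; do
        sed 's/compose\.ccci-health\.yml/compose.ccci.yml/g' "$dir/$f" > "$dir/$f.tmp"
        mv "$dir/$f.tmp" "$dir/$f"
    done
done

# Nothing should still mention the old filename.
if grep -r 'ccci-health' "$root"; then
    echo "stale reference found" >&2
    exit 1
fi
echo "rename complete"
```

In the real repository, `git mv` would be the natural replacement for the plain `mv`, so the rename shows up in history.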
## 3. Definition of Done (Adversary cold-verifies)
- [ ] Every surviving cc-ci overlay is minimal, header-documents its justification (the abra/upstream
limitation), and is Adversary-confirmed to not weaken a test or mask a defect.
- [ ] The mumble old-base cc-ci host-ports copy is removed; mumble still **upgrades to latest** and runs
its voice tests **on the latest** (0.2.0's voice tests skipped + recorded).
- [ ] ghost + discourse still pass full suites; discourse still tests the upgrade to latest.
- [ ] Any upstream PR opened (e.g. ghost/discourse `start_period`) follows the recipe-PR rule
(cc-ci-green via `!testme` before operator merge); the overlay remains as the cc-ci fallback.
- [ ] No upgrade-to-latest test was dropped to avoid an awkward from-version.
## 4. Guardrails
- **Correctness first** — never weaken/skip/soften a check to make a deploy or upgrade pass; an
overlay tunes deploy/infra only (its header must say how); the real assertions stand.
- **Justify + document every overlay**; prefer the upstream PR where the fix belongs.
- **Real abra path** throughout.