From 65ee741869fcc2f182fe7e761b0dc3367a2a2067 Mon Sep 17 00:00:00 2001
From: autonomic-bot
Date: Tue, 16 Jun 2026 23:55:28 +0000
Subject: [PATCH] plan: queue prevb (dynamic upgrade base + previous/ config, opus) + regall (all-recipe regression, sonnet)

Operator 2026-06-16. Replaces the static UPGRADE_BASE_VERSION + leaky single
compose.ccci.yml overlay model: dynamic base = last-green(warm canonical) ->
main fallback -> skip; optional minimal per-recipe previous/ folder for
base-only version repairs (ignored for head, version-guarded, removable when
stale). Validated on discourse PR #4 (official-image switch the current overlay
masks). regall then sweeps all recipes for regressions on sonnet.
---
 cc-ci-plan/agents.toml                             |   4 +
 .../plan-phase-prevb-previous-dynamic-base.md      | 120 ++++++++++++++++++
 .../plan-phase-regall-recipe-regression.md         |  65 ++++++++++
 3 files changed, 189 insertions(+)
 create mode 100644 cc-ci-plan/plan-phase-prevb-previous-dynamic-base.md
 create mode 100644 cc-ci-plan/plan-phase-regall-recipe-regression.md

diff --git a/cc-ci-plan/agents.toml b/cc-ci-plan/agents.toml
index 1cb8eda..eff998c 100644
--- a/cc-ci-plan/agents.toml
+++ b/cc-ci-plan/agents.toml
@@ -154,4 +154,8 @@ phases = [
   { id = "poe2e", plan = "plan-phase-poe2e-end-to-end.md", status = "STATUS-poe2e.md", models = { builder = "claude-opus-4-8", adversary = "claude-sonnet-4-6" } },
   # gitea full-test enrollment + LFS PR #1 verification — see plan-phase-gtea-gitea-fulltests.md (operator 2026-06-15)
   { id = "gtea", plan = "plan-phase-gtea-gitea-fulltests.md", status = "STATUS-gtea.md", models = { builder = "claude-sonnet-4-6", adversary = "claude-sonnet-4-6" } },
+  # dynamic upgrade base + per-recipe previous/ config (opus); validated on discourse PR #4 — see plan-phase-prevb-*.md (operator 2026-06-16)
+  { id = "prevb", plan = "plan-phase-prevb-previous-dynamic-base.md", status = "STATUS-prevb.md", models = { builder = "claude-opus-4-8", adversary = "claude-opus-4-8" } },
+  # full all-recipe regression after prevb (sonnet) — see plan-phase-regall-*.md (operator 2026-06-16)
+  { id = "regall", plan = "plan-phase-regall-recipe-regression.md", status = "STATUS-regall.md", models = { builder = "claude-sonnet-4-6", adversary = "claude-sonnet-4-6" } },
 ]
diff --git a/cc-ci-plan/plan-phase-prevb-previous-dynamic-base.md b/cc-ci-plan/plan-phase-prevb-previous-dynamic-base.md
new file mode 100644
index 0000000..da8f705
--- /dev/null
+++ b/cc-ci-plan/plan-phase-prevb-previous-dynamic-base.md
@@ -0,0 +1,120 @@
+# Phase `prevb` — dynamic upgrade base + per-recipe `previous/` config
+
+**Mission (operator-specified 2026-06-16):** fix how cc-ci handles version-specific needs in the
+**upgrade tier**. Today a single per-recipe overlay (`compose.ccci.yml`) conflates two different things —
+*environmental* tweaks (cc-ci node is slow/memory-tight) and *version-specific repairs* (an old base's
+image reference rotted) — and applies BOTH to EVERY deploy, including the PR head. That silently
+overrides the head and masks the real change. Proven live on **discourse PR #4**
+(`recipe-maintainers/discourse#4`, `discourse-official-image → main`): the overlay re-pins `app.image`
+to `bitnamilegacy/discourse:3.3.1` and re-adds the dropped `sidekiq` service, so `!testme` deploys the
+OLD image instead of the PR's new official `discourse/discourse:3.5.3` — the migration is never tested.
+
+Replace that model with two changes, then prove them on discourse PR #4:
+
+1. **Dynamic upgrade base** (no hardcoded `UPGRADE_BASE_VERSION`): the base the head upgrades from is
+   resolved at run time as **last-green (warm canonical)** → fallback **target-branch (`main`) tip** →
+   else **skip the upgrade tier** (recorded reason).
+2. **Optional per-recipe `previous/` folder** holding the **minimal** config needed to deploy the
+   *previous* (last-green) version, **applied only to the base deploy and ignored for the head**.
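
The resolution order in item 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the harness's actual code: `BaseChoice`, `resolve_upgrade_base`, and both parameters are hypothetical names, and the caller is assumed to have already looked up the last-green version and the `main` tip.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BaseChoice:
    """Outcome of resolving the upgrade base for one recipe (hypothetical type)."""
    source: str          # "last-green", "main", or "skip"
    ref: Optional[str]   # version or git ref to deploy as the base; None when skipped
    reason: str          # recorded explanation, so a skip is never silent


def resolve_upgrade_base(last_green: Optional[str], main_tip: Optional[str]) -> BaseChoice:
    """Pick the base the head upgrades from: last-green, else main tip, else skip."""
    if last_green:
        # Primary: the last version cc-ci recorded green for this recipe.
        return BaseChoice("last-green", last_green, "recorded last-green version")
    if main_tip:
        # Fallback: the target-branch tip the PR merges on top of.
        return BaseChoice("main", main_tip, "no last-green recorded; using main tip")
    # Structural skip, declared rather than passed silently.
    return BaseChoice("skip", None, "EXPECTED_NA: new recipe / no predecessor")
```

Returning the reason together with the choice keeps the skip case auditable: a caller can record `reason` verbatim instead of treating a missing base as a pass.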
+
+State files: `STATUS-prevb.md`, `BACKLOG-prevb.md`, `REVIEW-prevb.md`, `JOURNAL-prevb.md`. DECISIONS.md shared.
+
+## 1. Root cause (read first)
+
+`tests//compose.ccci.yml` + `EXTRA_ENV.COMPOSE_FILE = "compose.yml:compose.ccci.yml"` is applied
+to every deploy in the recipe-under-test flow. For discourse it carries BOTH:
+- **environmental** (`start_period: 20m` grace, `order: stop-first`) — depends on the cc-ci node, must
+  apply to all deploys incl. head; and
+- **version-specific repair** (`app`/`sidekiq` → `bitnamilegacy/discourse:3.3.1`) — depends on the old
+  0.7.0 base whose published `bitnami/discourse:3.3.1` 404s; must apply ONLY to that base.
+Fusing them + applying to all deploys is the bug: the version-specific half leaks onto the head
+(scalar `image:` last-file-wins override; additive service merge re-adds dropped `sidekiq`).
+
+## 2. Design
+
+**Decompose the overlay into two layers** — the harness applies them to different deploys:
+
+- **Environmental overlay (all deploys, incl. head).** Node-reality tweaks the recipe itself doesn't
+  encode (e.g. rollout `order`). Keep it MINIMAL and shrink over time (a well-formed recipe head ships
+  its own grace — PR #4 already has `start_period: 20m`). It must contain **no version-specific image
+  pins or service add/drop**.
+- **`tests//previous/` (base deploy ONLY, ignored for head).** The minimal bundle needed to bring
+  up the **previous (last-green) version** when it can't deploy as-published — e.g. an image relocation
+  (`bitnami/* → bitnamilegacy/*`), or an era-specific service/step/env. Mirror the recipe-under-test
+  layout but scoped to "deploy the previous version" (typically just a `compose.previous.yml`; add an
+  `install_steps.sh`/`ops.py`/env override only if that version genuinely needs it). **Keep it as small
+  and simple as possible — add one only where necessary.**
+
+**Dynamic base resolution (replace static `UPGRADE_BASE_VERSION`):**
+1. **Primary: last-green (warm canonical).** Upgrade from the last version cc-ci recorded green for this
+   recipe (prefer the warm-canonical snapshot where one exists — it's already data-warm, giving a
+   realistic data-survival signal and avoiding a from-scratch old-version deploy).
+2. **Fallback: target-branch (`main`) tip** when there is no last-green (e.g. a recipe with no recorded
+   green predecessor yet). This is the real predecessor the PR merges on top of.
+3. **Else skip** the upgrade tier with a recorded reason (new recipe / no predecessor). Structural skip,
+   declared (`EXPECTED_NA`), not a silent pass.
+
+**`previous/` is for the current previous version, and is removable when stale.** To stop a stale folder
+silently overriding a non-matching base, `previous/` **declares the version it targets** (simplest: a
+one-line marker, or the `coop-cloud.*.version` label in its `compose.previous.yml`). The harness applies
+it **only when the resolved base version matches**; on mismatch it **skips it and flags it stale
+("previous/ targets X, base is Y — remove it")**. After a recipe upgrade PR merges (new last-green), the
+now-stale `previous/` should be removed — keep it to roughly one version's worth, never an accumulating pile.
+
+## 3. Discourse as the first real case
+
+- **main today is `bitnamilegacy/discourse:3.3.1`** (deployable — bitnamilegacy exists). So with a dynamic
+  base, the base = last-green (≈ main) **deploys cleanly with NO `previous/` needed**: the rotted-base
+  treadmill evaporates because we no longer resurrect the frozen 0.7.0 tag. (Confirm main's image; if the
+  last-green base genuinely still needs a repair to deploy, add a **minimal** `previous/` — but expect not.)
+- Move discourse's environmental tweaks (rollout `order`, any grace the head lacks) into the environmental
+  overlay; **delete the `bitnamilegacy` image pins and the `sidekiq` block from the all-deploys overlay**;
+  **remove `UPGRADE_BASE_VERSION`**.
+- **PR #4 head now deploys UNMODIFIED** → the chaos redeploy runs the real `discourse/discourse:3.5.3`
+  with no `sidekiq`, so the upgrade tier finally exercises the actual official-image migration
+  (last-green bitnamilegacy → official head) the PR claims to support.
+
+## 4. Gates
+
+**M1 — implemented + green locally.** Harness: dynamic base resolution (last-green → `main` → skip);
+`previous/` discovery + base-only application + version-guard/stale-flag; environmental overlay separated
+from version-specific config; `UPGRADE_BASE_VERSION` removed. Discourse migrated. Unit tests for the new
+harness surface (base resolution, `previous/` match/skip, overlay layering). Discourse upgrade tier green
+locally with **proof the head runs the real head image** — assert the deployed `app` image is
+`discourse/discourse:3.5.3` (NOT bitnamilegacy) and that no `sidekiq` service exists post-deploy. Adversary
+cold-verifies from a clean checkout: the overlay no longer touches the head; a deliberately-broken head
+still fails the upgrade tier (teeth — base resolution didn't paper over it); base falls back to `main`
+correctly when last-green is absent; `previous/` is ignored for the head; **no test weakened**.
+
+**M2 — proven in real CI + a representative spot-check.** discourse PR #4 `!testme` **GREEN**, with
+evidence the head genuinely ran `discourse/discourse:3.5.3` (not the old bitnami image) and the migration
+was exercised. Spot-check ≥3 other recipes with upgrade tiers (e.g. one warm-canonical recipe, one with a
+published predecessor, one that previously relied on a `.ccci` overlay — keycloak/cryptpad/ghost) to
+confirm dynamic base works generally and nothing obvious broke. (FULL all-recipe regression is the
+**next phase `regall`** — do not attempt it here; just don't ship something obviously broken.) Levels /
+records reconciled. Fresh Adversary PASS on both milestones → `## DONE`.
+
+## 5. Guardrails (binding)
+
+- **Make the test FAITHFUL, never weaker.** The goal is that the head runs the head's real image; never
+  resolve the base or apply `previous/` in a way that hides a genuinely broken head. A broken upgrade
+  must still go RED.
+- **`previous/` minimal + non-accumulating.** Only what's strictly needed to deploy the previous base;
+  version-guarded; removable when stale. No `previous/` at all if the last-green base deploys clean.
+- **Don't regress other recipes.** Dynamic base must work for recipes with/without warm canonicals and
+  with/without published predecessors. (The `regall` phase is the exhaustive proof; here, don't break the
+  spot-check set.)
+- **Recipe mirrors are PR-only.** We VERIFY discourse PR #4 (run the harness / post `!testme`); we do NOT
+  merge it (operator's call). A recipe defect found → PR comment, not a test weakened.
+- Commit author `autonomic-bot `; push every commit; abra over a
+  pseudo-TTY. Host changes are coordinated (loops self-rebuilding the host is acceptable if clean — verify
+  host health after; but this phase likely needs none).
+
+## 6. Definition of Done
+
+Dynamic upgrade-base resolution (last-green → `main` → skip) and the optional minimal `previous/` folder
+shipped and unit-tested; the environmental vs version-specific layers cleanly separated; discourse
+migrated off the static base + leaky overlay; **discourse PR #4 verified GREEN in real CI with the head
+genuinely running the official `discourse/discourse:3.5.3` image** (the migration actually tested), and a
+representative recipe spot-check still green; nothing merged on the mirror; M1 + M2 fresh Adversary PASSes
+in REVIEW-prevb.md. (Exhaustive all-recipe regression handed to phase `regall`.)
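
The overlay layering and the `previous/` version guard from the `prevb` design can be sketched as one small function. It is an illustration only: `compose_file_value`, the `role` argument, the `ENV_OVERLAY` file name, and the `previous/compose.previous.yml` relative path are assumptions made for the example; the plan fixes only the colon-joined `COMPOSE_FILE` form, the `compose.previous.yml` name, and the rule that `previous/` applies to a matching base and never to the head.

```python
from typing import Optional, Tuple

ENV_OVERLAY = "compose.ccci.yml"                    # assumed name for the environmental overlay
PREVIOUS_OVERLAY = "previous/compose.previous.yml"  # assumed path inside the recipe's test dir


def compose_file_value(
    role: str,                       # "base" or "head" (hypothetical parameter)
    base_version: str,               # version the dynamic base resolved to
    previous_targets: Optional[str], # version declared by previous/, None if no folder
) -> Tuple[str, Optional[str]]:
    """Return (COMPOSE_FILE value, stale warning or None) for one deploy."""
    if role not in ("base", "head"):
        raise ValueError(f"unknown deploy role: {role!r}")

    # The environmental layer applies to every deploy, including the head.
    files = ["compose.yml", ENV_OVERLAY]
    warning = None

    # The version-specific layer is considered for the base deploy only.
    if role == "base" and previous_targets is not None:
        if previous_targets == base_version:
            files.append(PREVIOUS_OVERLAY)
        else:
            # Mismatch: skip the folder and flag it stale instead of applying it.
            warning = f"previous/ targets {previous_targets}, base is {base_version}; remove it"

    return ":".join(files), warning
```

Because the head branch never reads `previous_targets`, an image pin in `previous/` cannot leak onto the head through this path, which is the failure the plan describes for the old single overlay.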
diff --git a/cc-ci-plan/plan-phase-regall-recipe-regression.md b/cc-ci-plan/plan-phase-regall-recipe-regression.md
new file mode 100644
index 0000000..72a8aab
--- /dev/null
+++ b/cc-ci-plan/plan-phase-regall-recipe-regression.md
@@ -0,0 +1,65 @@
+# Phase `regall` — full all-recipe regression after the dynamic-base / `previous/` change
+
+**Mission (operator-specified 2026-06-16):** the `prevb` phase changed the upgrade tier for EVERY recipe
+(dynamic base resolution: last-green → `main` → skip; new `previous/` lookup; environmental-vs-version
+overlay split). Run the **entire recipe suite** through cc-ci to ensure **nothing regressed**, and **fix
+anything that did**. This is the safety net for a cross-cutting harness change.
+
+State files: `STATUS-regall.md`, `BACKLOG-regall.md`, `REVIEW-regall.md`, `JOURNAL-regall.md`. DECISIONS.md shared.
+
+## 1. Scope
+
+- **Every recipe cc-ci tests** — the `weekly` + `external` rows in `cc-ci-plan/used-recipes.md` that have a
+  `tests//` dir (all the enrolled recipes, ~21). Drone's gitea-dep path counts too.
+- **All tiers**, with focus on the **upgrade tier** (changed most by `prevb`): install / upgrade / backup
+  / restore / custom / lint / screenshot, via the real harness / CI path.
+- **Baseline = the recorded pre-`prevb` green levels** per recipe (cc-ci's level records). A regression =
+  a recipe that dropped a level, or a tier that newly fails, **relative to that baseline** — not relative
+  to "perfect."
+
+## 2. Method
+
+1. Run each recipe through the harness (parallelize within the shared-swarm budget — ≤2–3 concurrent
+   deploys; tear down every deploy on every exit path). Capture level + per-tier pass/fail.
+2. Build a results table: `recipe | baseline level | new level | per-tier delta | verdict`.
+3. **Classify each regression**: caused by the `prevb` change (dynamic base resolution / `previous/`
+   lookup / overlay split), or **pre-existing / flaky / environmental** (was already red, or fails on
+   re-run independent of `prevb`). Be honest — a recipe that was already red stays red; say so, don't
+   claim a fix.
+4. **Fix the `prevb`-caused regressions:**
+   - Harness-logic regressions → fix in `runner/**` (e.g. a base that resolves wrong, a `previous/`
+     mis-match, an overlay layering bug).
+   - A recipe whose **last-green base no longer deploys** under dynamic resolution → add a **MINIMAL**
+     `previous/` folder for it (same rules as `prevb`: smallest thing that brings the base up,
+     version-guarded, removable when stale). Do NOT over-add `previous/` folders — only where a base
+     genuinely won't deploy.
+5. Re-verify each fix green. Never weaken a test to clear a regression.
+
+## 3. Gates
+
+**M1 — full sweep done + classified.** Every in-scope recipe run; results table complete with
+baseline-vs-new levels; each regression classified `prevb-caused` vs `pre-existing/flaky`, with evidence.
+Adversary cold-verifies the classification (spot-re-runs a sample, confirms a claimed flake really is one,
+confirms a claimed `prevb`-cause is real).
+
+**M2 — regressions fixed, suite back to baseline.** Every `prevb`-caused regression fixed + re-verified
+green (harness fix and/or a minimal `previous/` folder); no recipe below its pre-`prevb` baseline; any
+added `previous/` folders are minimal + version-guarded; pre-existing reds documented (not silently
+absorbed). Fresh Adversary PASS → `## DONE`.
+
+## 4. Guardrails
+
+- **Never weaken a test** to clear a regression; fix the harness or add a minimal `previous/`.
+- **`previous/` stays minimal + non-accumulating** — add only where a base won't deploy.
+- **Report honestly** — pre-existing/flaky reds are labelled as such with evidence, never claimed fixed;
+  any recipe left below baseline is called out explicitly with the reason.
+- Shared swarm: ≤2–3 concurrent deploys; tear down every deploy on every exit path; `/upgrade-all`'s
+  `dev-*` reaper is a backstop, not a substitute. Recipe mirrors PR-only; never merge. Commit author
+  `autonomic-bot`; push every commit; abra over a pseudo-TTY.
+
+## 5. Definition of Done
+
+Every enrolled recipe run through cc-ci after the `prevb` change; a complete baseline-vs-new results table;
+all `prevb`-caused regressions fixed + re-verified green (no recipe below its pre-`prevb` baseline); any
+needed `previous/` folders added minimally; pre-existing reds documented honestly; M1 + M2 fresh Adversary
+PASSes recorded in REVIEW-regall.md.
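
The baseline comparison behind the `regall` results table can be sketched as below. This is a hedged illustration, not cc-ci's record format: `RecipeResult`, its fields, and the use of integers for levels are assumptions. It only detects a regression relative to the baseline; deciding whether one is `prevb`-caused or pre-existing/flaky still needs the re-run evidence the plan asks for.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RecipeResult:
    """One recipe's pre-prevb baseline and post-prevb run (hypothetical shape)."""
    recipe: str
    baseline_level: int              # assumed: levels are comparable integers
    new_level: int
    baseline_tiers: Dict[str, bool]  # tier name -> passed at baseline
    new_tiers: Dict[str, bool]       # tier name -> passed in the new run


def newly_failing_tiers(result: RecipeResult) -> List[str]:
    """Tiers that passed at baseline and fail (or are missing) in the new run."""
    return sorted(
        tier
        for tier, passed in result.baseline_tiers.items()
        if passed and not result.new_tiers.get(tier, False)
    )


def verdict(result: RecipeResult) -> str:
    """REGRESSION means a dropped level or a newly failing tier, relative to baseline."""
    if result.new_level < result.baseline_level or newly_failing_tiers(result):
        return "REGRESSION"
    return "OK"


def table_row(result: RecipeResult) -> str:
    """Format one row: recipe | baseline level | new level | per-tier delta | verdict."""
    delta = ",".join(newly_failing_tiers(result)) or "-"
    return (
        f"{result.recipe} | {result.baseline_level} | {result.new_level} "
        f"| {delta} | {verdict(result)}"
    )
```

A tier that was already red at baseline and is still red does not appear in the delta, which matches the rule that a pre-existing red stays labelled as pre-existing rather than counted as a new regression.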