From bb79e9140eda7c096239da7c375b72b0c97ff049 Mon Sep 17 00:00:00 2001
From: autonomic-bot
Date: Wed, 17 Jun 2026 00:37:23 +0000
Subject: [PATCH] =?UTF-8?q?claim(prevb):=20M1=20=E2=80=94=20dynamic=20base?=
 =?UTF-8?q?=20+=20previous/=20+=20discourse=20migration;=20discourse=20upg?=
 =?UTF-8?q?rade=20GREEN=20locally=20(head=3Dofficial=203.5.3,=20sidekiq=20?=
 =?UTF-8?q?pruned)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 machine-docs/BACKLOG-prevb.md | 27 ++++++++----------
 machine-docs/JOURNAL-prevb.md | 20 ++++++++++++++
 machine-docs/STATUS-prevb.md  | 52 +++++++++++++++++++++++++++++++++--
 3 files changed, 81 insertions(+), 18 deletions(-)

diff --git a/machine-docs/BACKLOG-prevb.md b/machine-docs/BACKLOG-prevb.md
index a4fbb44..9b4a627 100644
--- a/machine-docs/BACKLOG-prevb.md
+++ b/machine-docs/BACKLOG-prevb.md
@@ -4,22 +4,17 @@ SSOT: `/srv/cc-ci/cc-ci-plan/plan-phase-prevb-previous-dynamic-base.md`.
 
 ## Build backlog
 
-### M1 — implemented + green locally
-- [ ] B1. Dynamic upgrade-base resolution: last-green (warm canonical registry version) → fallback
-      target-branch (`main`) tip → else skip (declared reason). Replace the static
-      `previous_version(vers[-2])` default in `run_recipe_ci.upgrade_base`. Wire into `main()` deploy.
-- [ ] B2. `tests//previous/` mechanism: discovery, declared-target-version marker, base-only
-      application (added to base deploy's COMPOSE_FILE), head exclusion (never applied to PR head),
-      version-guard + stale-flag on mismatch.
-- [ ] B3. Discourse migration: shrink `compose.ccci.yml` to environmental-only
-      (`order: stop-first`), delete bitnamilegacy image pins + sidekiq block; remove
-      `UPGRADE_BASE_VERSION` from `tests/discourse/recipe_meta.py`. (Expect NO `previous/`.)
-- [ ] B4. Unit tests for the new surface: base resolution (last-green / main-tip / skip), `previous/`
-      match / skip / stale, environmental-vs-version overlay layering. Update `test_upgrade_base.py`
-      to the new resolver API without weakening coverage.
-- [ ] B5. Discourse upgrade tier GREEN locally: base (bitnamilegacy:3.5.0) → head; assert deployed
-      `app` image == `discourse/discourse:3.5.3` (NOT bitnamilegacy) and no `sidekiq` service post-deploy.
-- [ ] B6. CLAIM M1 (clean tree + STATUS verification block).
+### M1 — implemented + green locally [CLAIMED @2026-06-17T00:40Z, awaiting Adversary]
+- [x] B1. Dynamic upgrade-base resolution (last-green → main-tip → skip): `resolve_upgrade_base`/`BasePlan`.
+- [x] B2. `tests//previous/` mechanism: discovery, VERSION marker, base-only application,
+      head exclusion (stripped before head redeploy), version-guard + stale-flag. Unit-tested.
+- [x] B3. Discourse migration: `compose.ccci.yml` environmental-only (`order: stop-first`); bitnamilegacy
+      pins + sidekiq removed; `UPGRADE_BASE_VERSION` removed. No `previous/` (base deploys clean).
+- [x] B4. Unit tests: resolver matrix + `previous/` apply/skip/stale + COMPOSE_FILE layering.
+- [x] B5. Discourse upgrade tier GREEN locally (run-prevb-disc2): app image official 3.5.3 (not
+      bitnamilegacy), no sidekiq (pruned), version 0.8.1+3.5.0→1.0.0+3.5.3, install+upgrade pass.
+      (Found+fixed: docker stack deploy no-prune left sidekiq orphaned → `prune_orphan_services`.)
+- [x] B6. CLAIM M1 (clean tree + STATUS WHAT/HOW/EXPECTED/WHERE/TEETH).
 
 ### M2 — proven in real CI + spot-check
 - [ ] B7. discourse PR #4 `!testme` GREEN in real CI; head ran `discourse/discourse:3.5.3`, migration exercised.
diff --git a/machine-docs/JOURNAL-prevb.md b/machine-docs/JOURNAL-prevb.md
index 07362b9..725960d 100644
--- a/machine-docs/JOURNAL-prevb.md
+++ b/machine-docs/JOURNAL-prevb.md
@@ -50,3 +50,23 @@ B5 e2e launched on cc-ci (/root/prevb-deploy @ bb2e3c6), STAGES=install,upgrade,
 (bitnamilegacy:3.5.0), env overlay provided. Base now in slow Rails cold boot (15-25min). Polling ~5min.
 (lint rung fail R011 = recipe-level, a rung not a gate; prepull skipped on the known sidekiq-depends-on
 config rc=15 — non-fatal.)
+
+## 2026-06-17T00:40Z — M1 GREEN locally; claiming
+
+discourse install,upgrade e2e GREEN (2nd run, after the prune fix). Evidence in run-prevb-disc2.log on
+cc-ci /root/prevb-deploy. The dynamic main-tip base worked first try (kind=ref f87c612d) — crucial,
+because main (0.8.1+3.5.0) is AHEAD of the newest published tag (0.7.0+3.3.1), so the OLD vers[-2]
+default (=0.6.3) would have been the wrong predecessor entirely. The upgrade moved
+0.8.1+3.5.0 (bitnamilegacy, main-tip) → 1.0.0+3.5.3 (official, PR head), chaos-version=ae5a8180+U.
+
+**The one real bug found+fixed (WHY):** first run, `test_head_runs_official_image` PASSED (head app =
+official 3.5.3 — the leak is gone) but `test_sidekiq_service_dropped` FAILED: `docker stack deploy`
+(what `abra app deploy` runs) only adds/updates services, it does NOT prune ones the new compose dropped,
+so the base's sidekiq orphaned on the old image. This is a swarm mechanic, not a head-deploy failure, but
+it means the deployed stack didn't faithfully reflect the head. Fix = `prune_orphan_services` in
+perform_upgrade: reconcile the live stack to the head compose's `config --services` set (remove orphans).
+Faithful (deployed stack == head), no-op when service sets match / compose unresolvable, weakens nothing.
+
+Decided to CLAIM with the e2e green + image/sidekiq proof and leave the deliberately-broken-head teeth
+probe to the Adversary's cold acceptance (its explicit M1 check; I can't push a broken commit to the
+recipe mirror per guardrails). STATUS spells out where the teeth hold so they know where to probe.
diff --git a/machine-docs/STATUS-prevb.md b/machine-docs/STATUS-prevb.md
index 7543b52..92c2094 100644
--- a/machine-docs/STATUS-prevb.md
+++ b/machine-docs/STATUS-prevb.md
@@ -7,8 +7,56 @@ State files: this + BACKLOG-prevb.md, REVIEW-prevb.md (Adversary), JOURNAL-prevb
 Started 2026-06-17. Gates: **M1** (implemented + green locally), **M2** (proven in real CI + spot-check).
 
 ## Now
-- In flight: M1 implementation (dynamic base resolution + `previous/` mechanism + discourse migration + unit tests).
-- No gate CLAIMED yet.
+- **Gate: M1 CLAIMED, awaiting Adversary.** (claim commit below.)
+
+## Gate: M1 — CLAIMED @2026-06-17T00:40Z (HEAD e1b32ea)
+
+**WHAT (DoD §4 M1):** dynamic upgrade-base resolution (last-green → main-tip → skip); `previous/`
+discovery + base-only application + version-guard/stale-flag; environmental overlay separated from
+version-specific config; `UPGRADE_BASE_VERSION` removed from discourse; discourse migrated; unit tests
+for the new surface; discourse upgrade tier GREEN locally with proof the head ran the real official
+image (`discourse/discourse:3.5.3`, NOT bitnamilegacy) and no `sidekiq` service post-deploy.
+
+**WHERE (commit e1b32ea on origin/main):**
+- `runner/run_recipe_ci.py`: `BasePlan` + `resolve_upgrade_base(stages, meta, recipe, head_ref)`
+  (override → last-green via `canonical.read_registry` → main-tip via `lifecycle.recipe_branch_commit`
+  → skip); wired in `main()` (deploy `base_ref`/`apply_previous`, gate upgrade tier on `base_plan.runs`).
+- `runner/harness/lifecycle.py`: `previous_*` surface (`has_previous`, `previous_target_version`,
+  `previous_status`, `provide/remove_previous_overlay`, `compose_file_add/remove`),
+  `recipe_branch_commit`, `stack_service_names`, `compose_services`, `prune_orphan_services`;
+  `deploy_app` `base_ref`/`apply_previous` paths.
+- `runner/harness/generic.py` `perform_upgrade`: strip `previous/` overlay + COMPOSE_FILE entry before
+  head redeploy; `prune_orphan_services` after convergence (reconcile stack to head compose).
+- `tests/discourse/compose.ccci.yml`: ENVIRONMENTAL-only (`app.deploy.update_config.order: stop-first`;
+  bitnamilegacy pins + `sidekiq` removed). `tests/discourse/recipe_meta.py`: `UPGRADE_BASE_VERSION` removed.
+- `tests/discourse/test_upgrade.py`: asserts head image == official 3.5.3 (not bitnamilegacy) + no sidekiq.
+- Unit: `tests/unit/test_upgrade_base.py` (resolver matrix), `tests/unit/test_previous.py` (previous/
+  COMPOSE_FILE layering).
+
+**HOW to verify (cold, from a fresh clone at e1b32ea):**
+1. Unit (prevb surface): `cc-ci-run -m pytest tests/unit/test_upgrade_base.py tests/unit/test_previous.py tests/unit/test_meta.py -q`.
+2. e2e: `RECIPE=discourse SRC=recipe-maintainers/discourse REF=ae5a81802b4d1d6cd1b449ac46cfa16d80730aaa PR=4 STAGES=install,upgrade cc-ci-run runner/run_recipe_ci.py` (HOME=/root).
+3. Inspect: `grep -vE '^\s*#' tests/discourse/compose.ccci.yml` (env-only); `grep UPGRADE_BASE_VERSION tests/discourse/recipe_meta.py` (none).
+
+**EXPECTED:**
+- Unit: all pass (38 across the 2 prevb files; test_meta clean). NOTE scope: the prevb surface is green;
+  the FULL `tests/unit/` suite has **1 PRE-EXISTING unrelated fail** —
+  `test_warm_reconcile.py::test_traefik_spec_is_stateless_with_setup` (KeyError 'health_domain') — which
+  fails identically at gtea-DONE (778720c) and was not touched by prevb (pxgate 0e9fd38 refactored the
+  spec without updating the test). Out of scope for prevb; flagged to the operator/next phase.
+- e2e log shows: `upgrade base: kind=ref ref=f87c612d71b4 (target-branch (main) tip)`; base = main-tip
+  chaos deploy; `prune-orphans: removed 'sidekiq'`;
+  `upgrade→PR-head: head_ref=ae5a8180 chaos-version=ae5a8180+U version=0.8.1+3.5.0→1.0.0+3.5.3`;
+  RUN SUMMARY `deploy-count = 1 (expect 1)`, `install : pass`, `upgrade : pass`; both
+  `tests/discourse/test_upgrade.py` asserts PASS (app image official 3.5.3 not bitnamilegacy; no sidekiq);
+  teardown leaves no stacks/volumes/secrets. (Level caps at 2/5 because only install,upgrade ran — not a fail.)
+
+**TEETH (where a broken head still goes RED — for the Adversary's break-it probe):** the upgrade tier
+gates on the REAL head deploy — `assert_upgrade_converged` (rejects silent swarm rollback/pause) +
+`wait_healthy` on HEALTH_PATH + HC1 `chaos-version`==head commit + the discourse image/sidekiq asserts.
+Base resolution/`prune`/`previous` never deploy the head's code, so a deliberately-broken head cannot be
+papered over: it won't converge/serve → RED. `previous/` is base-only (stripped before the head redeploy,
+proven by `remove_previous_overlay` + COMPOSE_FILE strip in `perform_upgrade`); discourse ships no `previous/`.
 
 ## Ground-truth facts (verified 2026-06-17, recorded for Adversary)
 - `recipe-maintainers/discourse` PR **#4** (`discourse-official-image` `ae5a8180` → `main` `f87c612d`), open.