diff --git a/docs/recipe-customization.md b/docs/recipe-customization.md
index 4ada56b..1916cf9 100644
--- a/docs/recipe-customization.md
+++ b/docs/recipe-customization.md
@@ -62,7 +62,10 @@ tests// # cc-ci side (repo-local mirrors the same s
 ├── ops.py # pre_(ctx) seed hooks (§5.2)
 ├── custom/test_*.py # custom tier: parity ports + recipe-specific + UI flows (§5.3)
 ├── install_steps.sh # pre-deploy shell hook (the ONLY shell hook) (§5.4)
-├── compose.ccci.yml # CI-only compose overlay (first-class) (§5.5)
+├── compose.ccci.yml # CI-only ENVIRONMENTAL compose overlay (all deploys) (§5.5)
+├── previous/ # version-specific base-only repair (optional) (§5.5b)
+│ ├── compose.previous.yml # minimal compose to deploy the previous version
+│ └── VERSION # the published version it targets (version-guard)
 └── PARITY.md # enrollment contract doc (human-read only)
 ```
 
@@ -119,7 +122,7 @@ _This table is GENERATED from the `runner/harness/meta.py` KEYS registry by `scr
 | `BACKUP_CAPABLE` | `bool` | `None` | Override the backup-tier capability auto-detect (compose `backupbot.backup` labels). `False` forces an intentional skip of the backup/restore rung; `True` forces the tier on; unset = auto-detect. |
 | `EXPECTED_NA` | `dict` | `None` | Declare a non-run rung an INTENTIONAL skip: `{rung: reason}` — the level climbs past it; an undeclared non-run rung is *unverified* and blocks the level above it (classification table: machine-docs/DECISIONS.md phase lvl5). Never overrides an exercised pass/fail; the `lint` rung has no escape hatch. Declaring `upgrade` also suppresses the upgrade-tier BASE deploy — the single deploy is the PR head itself — for recipes whose published versions exist but are genuinely undeployable (phase bsky). |
 | `READY_PROBE` | `hook` | `None` | Callable `(ctx) -> [probe, ...]` returning extra readiness probes, run after install AND after upgrade: HTTP `{host, path, ok}` or TCP `{tcp_host, tcp_port, stable}`.
|
-| `UPGRADE_BASE_VERSION` | `str` | `None` | Exact published tag overriding the upgrade tier's base (default: `recipe_versions[-2]`). |
+| `UPGRADE_BASE_VERSION` | `str` | `None` | Optional explicit override pinning the upgrade tier's base to an exact published tag (rare; for a PR that adds a version *above* the newest tag). When unset (the norm) the base is resolved DYNAMICALLY (phase prevb): last-green (warm canonical) → target-branch (`main`) tip → else skip. See `run_recipe_ci.resolve_upgrade_base` + DECISIONS. |
 | `BACKUP_VERIFY` | `hook` | `None` | Callable `(ctx) -> bool` post-backup data-capture check; `False` re-runs the backup (truncated-dump race guard), retried up to 3 attempts. |
 | `UPGRADE_EXTRA_ENV` | `dict_or_hook` | `None` | Extra `.env` keys applied after the PR-head checkout, before the chaos redeploy (env that exists only at head). Dict, or callable `(ctx) -> dict`. |
 | `EXTRA_ENV` | `dict_or_hook` | `{}` | Extra `.env` keys applied at EVERY deploy (base install AND upgrade old-app). Dict, or callable `(ctx) -> dict` deriving values from the per-run domain (`ctx.domain`). |
@@ -229,9 +232,29 @@ that deploy (the untracked file would otherwise trip abra's clean-tree gate). No
 `install_steps.sh` copy boilerplate, no flag to remember (the old `CHAOS_BASE_DEPLOY` ⇄ overlay
 coupling is gone). The overlay is cc-ci-owned only.
 
-Policy unchanged: overlays are a minimal, justified fallback (ghost's is a 15m `start_period`
-grace — a literal, because abra validates `start_period` before env substitution). Reference the
-overlay from `EXTRA_ENV`'s `COMPOSE_FILE` as usual. Users: ghost, discourse.
+Policy (phase prevb): `compose.ccci.yml` is **ENVIRONMENTAL-only** — node-reality tweaks that must
+apply to EVERY deploy including the PR head (e.g. ghost's 15m `start_period` grace — a literal,
+because abra validates `start_period` before env substitution; discourse's `order: stop-first` for
+the memory-tight upgrade crossover).
It MUST NOT carry version-specific image pins or service
+add/drop — those leak onto the head and mask the change under test. Version-specific base repairs go
+in `previous/` (§5.5b). Reference the overlay from `EXTRA_ENV`'s `COMPOSE_FILE` as usual.
+
+### 5.5b Previous-version base repair — `tests//previous/`
+
+Optional. The MINIMAL config to deploy the *previous (last-green) version* when it can't deploy
+as-published (e.g. an image relocation `bitnami/* → bitnamilegacy/*`, or an era-specific
+service/env). Applied to the **base deploy ONLY** and stripped before the head redeploy, so the PR
+head runs UNMODIFIED.
+
+- Layout: `tests//previous/compose.previous.yml` (+ a one-line `previous/VERSION` marker
+ declaring the published version it targets). Appended to the base deploy's `COMPOSE_FILE`.
+- **Version-guarded:** applied only when the resolved base equals `previous/VERSION`. On a main-tip
+ (ref) base or a version mismatch it is **skipped and flagged stale** (`previous/ targets X, base is
+ Y — remove it`). After an upgrade PR merges (new last-green), remove the now-stale folder — keep it
+ to ~one version, never an accumulating pile.
+- Keep it minimal and add one only where necessary. Most recipes (incl. discourse) need NONE — the
+ dynamic base (last-green/main-tip) deploys clean. Symbols: `lifecycle.previous_status` /
+ `provide_previous_overlay` / `remove_previous_overlay`.
 
 ### 5.6 Environment & fixture contract (what custom code can read)
 
@@ -262,10 +285,12 @@ One deploy chain per run (full detail: `docs/testing.md` §2):
 
 ```
 [DEPS?
provision deps FIRST → $CCCI_DEPS_FILE]
-deploy BASE (UPGRADE_BASE_VERSION or recipe_versions[-2]; EXTRA_ENV; install_steps.sh;
- compose.ccci.yml auto-copied + auto-chaos)
+deploy BASE (dynamic: last-green → main-tip → skip, or UPGRADE_BASE_VERSION override; EXTRA_ENV;
+ install_steps.sh; compose.ccci.yml [environmental] auto-copied + auto-chaos;
+ tests//previous/ [version-specific, base-ONLY] applied if it matches the base)
 → INSTALL tier (READY_PROBE; generic + overlay asserts)
- → pre_upgrade(ctx) → chaos-deploy PR HEAD (UPGRADE_EXTRA_ENV)
+ → pre_upgrade(ctx) → strip previous/ + chaos-deploy PR HEAD (UPGRADE_EXTRA_ENV)
+ → reconcile stack to head compose (prune services the head dropped)
 → UPGRADE tier (READY_PROBE; version-label == head_ref)
 → pre_backup(ctx) → backup (BACKUP_CAPABLE; BACKUP_VERIFY)
 → BACKUP tier
@@ -354,6 +379,8 @@ fixtures deleted).
 | HC2 allowlist | `tests/repo-local-approved.txt` |
 | Generic assertions + `BACKUP_CAPABLE` detect | `runner/harness/generic.py` |
 | `compose.ccci.yml` auto-copy + auto-chaos | `runner/harness/lifecycle.py` (`provide_ccci_overlay`, `deploy_app`) |
+| Dynamic upgrade base (last-green → main-tip → skip) | `runner/run_recipe_ci.py` (`resolve_upgrade_base`, `BasePlan`); `runner/harness/lifecycle.py` (`recipe_branch_commit`) |
+| `previous/` discovery + version-guard + base-only apply + head strip | `runner/harness/lifecycle.py` (`previous_status`, `provide/remove_previous_overlay`); `tests/unit/test_previous.py` |
 | `READY_PROBE` consumption | `runner/harness/lifecycle.py` (`wait_ready_probes`) |
 | `EXPECTED_NA` reporting | `runner/harness/results.py` |
 | `SCREENSHOT` consumer | `runner/harness/screenshot.py` |
diff --git a/docs/runbook.md b/docs/runbook.md
index 764c70b..208c9de 100644
--- a/docs/runbook.md
+++ b/docs/runbook.md
@@ -32,9 +32,11 @@ curl -s -H "Authorization: Bearer $DT" --proxy socks5h://localhost:1055 \
 from the private mirror origin.
All recipe-touching harness calls pass `-C -o` (chaos+offline);
 `recipe_versions`/upgrade use the upstream tags fetched read-only at clone time. If you see this, a
 new abra call is missing `-o`.
-- **upgrade stage SKIPPED ("no previous published version"):** the recipe clone has no version tags.
- `fetch_recipe` read-only-fetches them from the public upstream (`git.coopcloud.tech/coop-cloud/`);
- confirm the upstream has ≥2 tags (`git ls-remote --tags`).
+- **upgrade stage SKIPPED:** the dynamic base resolved to `skip` (phase prevb) — no last-green warm
+ canonical AND no resolvable `main` tip, or `head == main tip` (no predecessor delta), or a declared
+ `EXPECTED_NA[upgrade]`. The run log prints the exact reason (`upgrade base: kind=skip … SKIP: `).
+ For a recipe that should upgrade from `main`, confirm the per-run clone has `origin/main` (or
+ `origin/master`) and that it differs from the PR head (`resolve_upgrade_base` in `run_recipe_ci.py`).
 - **health wait hangs / 502:** the app isn't answering `HEALTH_PATH` yet. Slow apps (keycloak JVM +
 Liquibase, lasuite 9-service) just need time; raise `DEPLOY_TIMEOUT`/`HTTP_TIMEOUT` in
 `recipe_meta.py`. A persistent 502 with services 1/1 = wrong `HEALTH_PATH` (e.g. keycloak needs
diff --git a/docs/testing.md b/docs/testing.md
index 20a2080..beba673 100644
--- a/docs/testing.md
+++ b/docs/testing.md
@@ -48,8 +48,9 @@ once**; the assertion files (generic and overlay) evaluate the *post-op* state a
 op themselves. Asserted every run: **`deploy-count = 1`** (one `abra app new`).
 
 ```
-deploy ONCE (base version: the previous published version when an upgrade tier will run and one
- exists — so upgrade is a real previous→PR-head; else the target / current PR head)
+deploy ONCE (base version, resolved DYNAMICALLY when the upgrade tier runs: last-green (warm
+ canonical) → target-branch `main` tip → else skip — so upgrade is a real
+ predecessor→PR-head; else the target / current PR head.
phase prevb)
 → INSTALL [optional pre_install seed] then generic + overlay assertions (no op)
 → UPGRADE [optional pre_upgrade seed] then abra app deploy --chaos to PR-head (op once)
 then generic + overlay assertions
@@ -201,7 +202,11 @@ server's content volume — without it the generic install fails 404, with it it
 
 Concretely, the upgrade tier:
 
-1. base deployment is the **previous published version** (a clean pinned-tag deploy).
+1. base deployment is the **dynamically-resolved predecessor** (phase prevb): last-green (warm
+ canonical, pinned-tag deploy) → else the target-branch `main` tip (chaos deploy of the branch
+ HEAD — the real predecessor the PR merges onto) → else the upgrade tier is skipped. An optional
+ `tests//previous/` supplies version-specific repair to the base ONLY (stripped before the
+ head redeploy). `UPGRADE_BASE_VERSION` may still pin an explicit tag override.
 2. orchestrator captures `head_ref` (preferring `$REF` — the PR head sha; falls back to the recipe
 checkout HEAD for non-PR `!testme`).
 3. on the upgrade tier: re-checkout the recipe to `head_ref` (the prev-tag base deploy reset the