# Phase `samever`: step back to an older base when last-green == head (no same-version upgrade)

**Mission (operator-specified 2026-06-17):** close a gap in the `prevb` dynamic upgrade-base resolver.
When the resolved **last-green (warm-canonical) base version equals the PR head version**, the upgrade
tier currently deploys the **same version** as base and head: a vacuous, non-upgrade "upgrade." Instead
of skipping the tier, **step back to a genuinely older base**: the **newest published version strictly
older than the head version**. The upgrade must always cross a real version delta when an older version
exists. (This is design **A**; design B "canonical history" is deferred, see `cc-ci-plan/IDEAS.md`.)

State files: `STATUS-samever.md`, `BACKLOG-samever.md`, `REVIEW-samever.md`, `JOURNAL-samever.md`. `DECISIONS.md` is shared.

## 1. Background / root cause

`resolve_upgrade_base` (`runner/run_recipe_ci.py:111`) resolves the upgrade base. Its two paths are
guarded **unequally**:
- **ref (main-tip) path, guarded:** `if main_tip == head_ref → skip "head == main tip (no predecessor
  delta)"`.
- **version (last-green canonical) path, NOT guarded:** it returns `BasePlan("version", rec["version"],
  …)` without checking that the canonical version differs from the head. The resolver isn't even given
  the head's *version* (only `head_ref`, a commit), so it currently can't compare.

When does the canonical equal the head version? The canonical advances **only** on a GREEN + COLD +
LATEST run of a `WARM_CANONICAL`-enrolled recipe (`should_promote_canonical` = `is_enrolled and
overall==0 and not quick and not ref`; a PR `!testme` carries `ref` so it NEVER promotes). So the
canonical always holds the version that was latest-published when a cold sweep last passed.

**This is the STEADY STATE of the nightly sweep, not a rare edge (operator insight 2026-06-17).** The
nightly cold-on-latest run is exactly what *promotes* the canonical to LATEST. So once one green nightly
run promotes `canonical → vX`, **every subsequent nightly run, until a new version ships, finds
`canonical == latest == the version under test`**, and its upgrade tier resolves `base == head` → a
same-version no-op → effectively no upgrade test from the second night onward. (A non-version-bump PR
hits the same collision, but less often.) So the step-back is the *common* nightly path, not an oddball:
the resolver must handle it as the norm. With the fix, the second nightly run tests `vX-1 → vX`, where
`vX-1` is the previous published version (a repeat of the first night's upgrade, but real, not vacuous).

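
The steady-state collision described in this section can be illustrated with a small simulation. This is a minimal sketch, not the runner's code: the predicate body is the one quoted above, but the function signature, the `simulate_nightlies` helper, and the version strings are assumptions made up for illustration. It models the pre-fix version path, where the base is simply whatever the canonical holds.

```python
from typing import List, Optional, Tuple


def should_promote_canonical(is_enrolled: bool, overall: int,
                             quick: bool, ref: Optional[str]) -> bool:
    # Predicate as quoted in the text: enrolled, green (overall == 0),
    # cold (not quick), and on latest (no ref). The signature is an assumption.
    return bool(is_enrolled and overall == 0 and not quick and not ref)


def simulate_nightlies(latest_by_night: List[str],
                       canonical: Optional[str]) -> List[Tuple[Optional[str], str]]:
    """Return the (base, head) pair each night for an enrolled, always-green recipe.

    Hypothetical helper: base is the current canonical (pre-fix behavior),
    head is the latest published version that night.
    """
    pairs = []
    for latest in latest_by_night:
        pairs.append((canonical, latest))
        # A green, cold, latest nightly run promotes the canonical to LATEST.
        if should_promote_canonical(True, 0, False, None):
            canonical = latest
    return pairs


# Illustrative versions only: 1.1.0 stays latest for three nights, then 1.2.0 ships.
pairs = simulate_nightlies(["1.1.0", "1.1.0", "1.1.0", "1.2.0"], canonical="1.0.0")
for night, (base, head) in enumerate(pairs, start=1):
    kind = "same-version no-op" if base == head else "real upgrade"
    print(f"night {night}: {base} -> {head} ({kind})")
```

In this toy run only the first night after a release crosses a real delta; nights 2 and 3 resolve `base == head`, which is the gap this phase closes.
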
## 2. Design (A)

Give the resolver the **head version** (read the `coop-cloud.*.version` label from the head's
`compose.yml`; the head checkout already exists), and extend the chain:

1. explicit `UPGRADE_BASE_VERSION` override → use it (unchanged).
2. **last-green canonical, IF its version ≠ head version** → use it (`kind="version"`; the green-verified
   primary).
3. **last-green canonical version == head version** → **do NOT skip.** Step back: from the recipe's
   published version tags (`warm_reconcile.recipe_tags` + the existing version-ordering used by
   `latest_version`), pick the **newest published version STRICTLY OLDER than the head version** and use
   it (`kind="version"`). `previous/` still applies, version-guarded against whichever base version is
   chosen.
4. no canonical at all → existing **main-tip ref** path (use if `main_tip ≠ head_ref`, else skip),
   unchanged.
5. **only if step 3 finds no older published version** (genuinely the first version / no predecessor) →
   skip with a declared reason (`"base == head and no older published predecessor"`).

Constraints:
- "Strictly older": exclude any tag equal to the head version; reuse the existing coop-cloud version
  ordering, do not hand-roll semver. If the head version isn't in the published tag list (a brand-new
  version above all tags), the canonical-≠-head branch already handles it; the step-back only triggers
  when canonical == head.
- Preserve the **F1d-2** protections: the chosen older base must actually deploy that *pinned* version
  (checkout the tag so the on-disk tree matches), never LATEST.
- Pure resolver change where possible; keep the `ref` and `skip` paths' behavior identical for all other
  cases (don't perturb discourse #4 or any version-bump PR).

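
The chain in this section could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `runner/run_recipe_ci.py`: the `BasePlan` fields, the `resolve_base` signature, and `newest_strictly_older` are hypothetical, and the version ordering is injected as a `key` callable so that the existing coop-cloud ordering can be reused rather than reimplemented. The `toy_key` used in the demo handles plain dotted versions only and stands in for that ordering.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class BasePlan:
    """Illustrative stand-in for the runner's BasePlan; the real fields may differ."""
    kind: str              # "version", "ref", or "skip"
    value: Optional[str]   # version tag or git ref; None for a skip
    reason: str


def newest_strictly_older(tags: Sequence[str], head_version: str,
                          key: Callable[[str], Tuple]) -> Optional[str]:
    """Newest tag ordered strictly below head_version, or None if there is none."""
    head_key = key(head_version)
    older = [t for t in tags if key(t) < head_key]  # '<' also excludes tags equal to head
    return max(older, key=key) if older else None


def resolve_base(head_ref: str, head_version: str, canonical_version: Optional[str],
                 main_tip: str, tags: Sequence[str], key: Callable[[str], Tuple],
                 override: Optional[str] = None) -> BasePlan:
    if override:                                     # step 1: explicit override
        return BasePlan("version", override, "explicit UPGRADE_BASE_VERSION override")
    if canonical_version is not None:
        if canonical_version != head_version:        # step 2: canonical differs from head
            return BasePlan("version", canonical_version, "last-green canonical")
        older = newest_strictly_older(tags, head_version, key)
        if older is not None:                        # step 3: step back
            return BasePlan("version", older, "canonical == head; stepped back")
        # step 5: nothing older has been published
        return BasePlan("skip", None, "base == head and no older published predecessor")
    if main_tip != head_ref:                         # step 4: main-tip ref path
        return BasePlan("ref", main_tip, "main tip")
    return BasePlan("skip", None, "head == main tip (no predecessor delta)")


def toy_key(version: str) -> Tuple[int, ...]:
    """Demo-only ordering for plain dotted versions; NOT the coop-cloud ordering."""
    return tuple(int(part) for part in version.split("."))


tags = ["1.0.0", "1.1.0", "1.2.0"]
plan = resolve_base("abc123", "1.2.0", "1.2.0", "def456", tags, toy_key)
print(plan.kind, plan.value)  # version 1.1.0
```

Passing the ordering in as a parameter keeps the sketch honest about the "do not hand-roll semver" constraint: only the demo key is hand-written, and it would be replaced by the ordering already used by `latest_version`.
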
## 3. Gates

**M1: implemented + unit-tested.** Resolver reads the head version and implements the chain above.
Unit tests (extend `tests/unit/test_upgrade_base.py`): canonical==head → resolves to the newest-older
published version (assert it's strictly older); canonical≠head → uses canonical (unchanged); no
older-published → declared skip with the new reason; head-version parsed from compose; version ordering
picks the correct strictly-older tag; override + ref + existing-skip paths unchanged. Adversary
cold-verifies from a clean checkout: a same-version PR now upgrades from a **real older version** (base
version < head version, evidenced), not a no-op and not a skip; the tier keeps its teeth (a broken head
is still RED); the version-bump path (canonical→head) is untouched.

**M2: proven in real CI.** Demonstrate on the **realistic trigger, the nightly steady state**: a
**cold-on-latest run of an enrolled recipe whose canonical already == latest** (i.e. simulate the
second consecutive nightly with no new version: seed/point the canonical at LATEST, then run
cold-on-latest). Show its upgrade tier **steps back to the previous published version** (evidence
`base_version < latest`, a genuine delta; not a same-version no-op, not a skip). Also cover the
PR form (a non-version-bump PR where head version == canonical) the same way. Confirm a normal
version-bump PR (re-run **discourse #4** or equivalent) is **UNAFFECTED** (canonical, which is older,
→ head). Spot-check ≥1 other enrolled recipe. Fresh Adversary PASS on both milestones → `## DONE`.

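
One of the M1 unit-test targets is "head-version parsed from compose". A minimal sketch of such a parser follows, assuming the label appears in `compose.yml` as a string of the form `coop-cloud.<stack>.version=<version>`. The function name, the regex, and the sample file content are hypothetical; the real resolver may well parse the YAML properly instead of scanning text.

```python
import re
from typing import Optional

# Assumed label shape: coop-cloud.<stack>.version=<version>, typically a quoted list entry.
# The middle segment is matched loosely so a templated stack name also works.
_VERSION_LABEL = re.compile(r"""coop-cloud\.[^.\s=]+\.version=([^\s"']+)""")


def head_version_from_compose(compose_text: str) -> Optional[str]:
    """Return the first coop-cloud version label value found, or None if absent."""
    match = _VERSION_LABEL.search(compose_text)
    return match.group(1) if match else None


# Made-up compose content for the demo; the version value is illustrative.
SAMPLE = """\
services:
  app:
    deploy:
      labels:
        - "coop-cloud.${STACK_NAME}.version=1.2.0+3.4.5"
"""

print(head_version_from_compose(SAMPLE))  # 1.2.0+3.4.5
```

Returning `None` when the label is missing lets the caller decide how to handle a head without a readable version, instead of comparing against an empty string by accident.
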
## 4. Guardrails

- **Never a same-version no-op, and never a needless skip when an older base exists.** Skip only when
  there is genuinely no older published predecessor.
- **The base must be strictly older than the head version.**
- **Don't regress the version-bump path:** the common upgrade-PR case (canonical → head) must behave
  exactly as before; discourse #4 must still test the official-image migration.
- Never weaken a test; minimal, well-scoped resolver change; `previous/` stays the last resort.
- Commit author `autonomic-bot <autonomic-bot@noreply.git.autonomic.zone>`; push every commit; run abra
  over a pseudo-TTY. Recipe mirrors are PR-only; never merge.

## 5. Definition of Done

The resolver steps back to the newest-published-version-older-than-head whenever the last-green canonical
equals the head version: never a same-version no-op, never a needless skip when an older base exists;
unit-tested; proven in real CI on a same-version scenario with evidence of a real base<head delta, and the
version-bump path (discourse #4) confirmed unaffected; M1 + M2 fresh Adversary PASSes in `REVIEW-samever.md`.
Design B (canonical history) recorded in `cc-ci-plan/IDEAS.md`, out of scope here.