Orchestrator-written marker: the Builder hit the opus usage limit and could not write its own DONE. Work is complete + Adversary-verified (M1 1310a95, M2 199f5b6, cleared for DONE). Unblocks auto-advance to canon.
STATUS — phase samever (step-back to older base when canonical == head version)
SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-samever-older-base-fallback.md.
State files: this + BACKLOG-samever.md, REVIEW-samever.md (Adversary), JOURNAL-samever.md. DECISIONS.md shared.
Phase
Started 2026-06-17. Gates: M1 (implemented + unit-tested), M2 (proven in real CI).
Current status
M1: PASS (REVIEW-samever.md @2026-06-17T04:27Z — cold-verified, teeth hold, no regression). Gate: M2 CLAIMED, awaiting Adversary @2026-06-17T04:55Z.
M2 — WHAT is claimed
Proven in real CI on cc-ci that the resolver steps back to a genuinely older base when the last-green
canonical == head version (never a same-version no-op, never a needless skip), and that the
version-bump path is UNAFFECTED. Five real runs (cc-ci@main = samever code, run from
/root/samever-deploy; the runner logs the resolver decision + the deployed version-label move):
- Nightly steady state (THE headline, §5 DoD): custom-html cold-on-latest, run TWICE:
  - Run A (first nightly, no canonical): upgrade base: kind=skip SKIP: head == main tip; 5 tiers green; WC5 promote: canonical custom-html advanced to known-good 1.13.0+1.31.1.
  - Run B (2nd consecutive nightly, canonical==latest==head): STEP-BACK. upgrade base: kind=version version=1.11.0+1.29.0 (step-back: last-green canonical (1.13.0+1.31.1) == head version 1.13.0+1.31.1; newest older published base), then the upgrade tier deployed that base and chaos-upgraded to head: upgrade→PR-head: ... version=1.11.0+1.29.0→1.13.0+1.31.1 (label MOVED, base < head, REAL delta; not a no-op, not a skip). All 5 tiers green. Proves F1d-2: the older base really deployed pinned 1.11.0 then upgraded to 1.13.0.
- Version-bump UNAFFECTED (enrolled), Run C: re-seeded canonical→OLDER 1.11.0+1.29.0, cold-on-latest head 1.13.0 → upgrade base: kind=version version=1.11.0+1.29.0 (last-green (warm canonical, status=idle)). Reason "last-green", NOT "step-back": the unchanged prevb path; upgrade 1.11.0→1.13.0 green.
- PR form (ref set, head==canonical), Run D: recipe=custom-html ref=2b82ebab pr=999 → kind=version version=1.11.0+1.29.0 (step-back: ... canonical (1.13.0+1.31.1) == head version 1.13.0+1.31.1 ...). Step-back STILL triggers with a PR head ref present (ref does not suppress it); upgrade green.
- discourse #4 UNAFFECTED (non-enrolled version-bump, §5 DoD), REF=ae5a8180: upgrade base: kind=ref ref=f87c612d71b4 (target-branch (main) tip). Byte-identical to prevb run 717; discourse is not enrolled so the resolver never enters the canonical branch. Migration green: version=0.8.1+3.5.0→1.0.0+3.5.3, test_head_runs_official_image_not_bitnamilegacy + test_sidekiq_service_dropped_by_head PASSED. install/upgrade pass.
- Spot-check, second recipe/tag-set (hedgedoc): seeded canonical=3.0.10+1.10.8 (its latest), cold-on-latest → kind=version version=3.0.9+1.10.7 (step-back: ... canonical (3.0.10+1.10.8) == head version 3.0.10+1.10.8 ...); upgrade version=3.0.9+1.10.7→3.0.10+1.10.8 green. Step-back generalizes to a different recipe + different published-tag ordering. (hedgedoc is NOT WARM_CANONICAL-enrolled, only custom-html is, so its canonical record was hand-seeded to exercise the same resolver path; the seed was removed after, leaving clean state. The resolver reads canonical.read_registry regardless of enrollment, so this faithfully exercises the production code path.)
M2 — WHERE (logs + artifacts on cc-ci, Adversary-readable)
- Code under test: cc-ci@main (samever), checked out at /root/samever-deploy (HEAD 1310a95, includes resolver commit b29bb3f). Runner: runner/run_recipe_ci.py.
- Run logs: /root/samever-runA.log, …-runB.log, …-runC.log, …-runD.log, …-disc4.log, …-hedgedoc.log.
- Preserved results.json/junit/badge: /var/lib/cc-ci-runs/samever-runB/, …-runC/, …-runD/, …-disc4/, …-hedgedoc/ (each /var/lib/cc-ci-runs/manual is overwritten per run, so these are copies).
- custom-html canonical (legit enrolled state, left as-is): /var/lib/ci-warm/custom-html/canonical.json = version 1.13.0+1.31.1. No leftover run stacks (clean teardown verified; pre-existing warm-keycloak orphan untouched, not samever).
M2 — HOW to verify (cold, from a clean clone / fresh runs)
A. Cold-read the preserved logs — grep "upgrade base" /root/samever-run{A,B,C,D}.log /root/samever-{disc4,hedgedoc}.log reproduces the resolver lines above; grep "upgrade→PR-head"
reproduces the version-label moves; grep ": pass\|: fail" … | sed -n '/RUN SUMMARY/,$p' shows tier
outcomes (all pass for the tiers run).
B. Re-run the headline yourself (most adversarial) from your own clone on cc-ci:
# ensure no canonical, run twice; 2nd run must step back:
rm -rf /var/lib/ci-warm/custom-html # clear canonical
cd <your-clone>; HOME=/root RECIPE=custom-html cc-ci-run runner/run_recipe_ci.py # run A → promotes
HOME=/root RECIPE=custom-html cc-ci-run runner/run_recipe_ci.py # run B → STEP-BACK
EXPECTED run B: upgrade base: kind=version version=1.11.0+1.29.0 (step-back: …) and
upgrade→PR-head: … version=1.11.0+1.29.0→1.13.0+1.31.1, all tiers pass. (Single test node — don't
run while another run_recipe_ci.py is active.)
C. Teeth: in run B the base version (1.11.0+1.29.0) is strictly < head (1.13.0+1.31.1) AND the
upgrade tier's generic test_upgrade_reconverges + cc-ci test_upgrade_preserves_data PASSED — a
same-version no-op would show version=1.13.0+1.31.1→1.13.0+1.31.1 (it does not) and a skip would
show kind=skip (it does not).
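Sketch of the teeth check in C as a script (illustrative only, not part of cc-ci): it takes the two run-B log lines and rejects a skip or a same-version label move. label_move and is_real_upgrade are hypothetical helper names, and the line shapes are assumed from the excerpts above. It only checks that the label moved; the strict base < head ordering still needs the version ordering helper.

```python
import re
from typing import Optional, Tuple

# Matches the "version=<base>→<head>" fragment of an upgrade→PR-head log line.
_MOVE = re.compile(r"version=([^\s→]+)→([^\s→]+)")


def label_move(upgrade_line: str) -> Optional[Tuple[str, str]]:
    """Return (base, head) from an upgrade log line, or None if absent."""
    m = _MOVE.search(upgrade_line)
    return (m.group(1), m.group(2)) if m else None


def is_real_upgrade(resolver_line: str, upgrade_line: str) -> bool:
    """True only if the resolver did not skip and the version label moved."""
    if "kind=skip" in resolver_line:
        return False          # a skip, not an upgrade
    move = label_move(upgrade_line)
    if move is None:
        return False          # no label move logged at all
    base, head = move
    return base != head       # same-version no-op fails here
```

Feeding it the run-B lines quoted in B above should return True; a line showing version=1.13.0+1.31.1→1.13.0+1.31.1 or a resolver line with kind=skip returns False.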
M1 — WHAT is claimed
resolve_upgrade_base now reads the head's published version and steps back to a genuinely older
published base when the last-green warm-canonical version equals the head version — never a
same-version no-op, never a needless skip when an older base exists.
Resolution chain (override / EXPECTED_NA / upgrade∉stages short-circuits unchanged):
- explicit UPGRADE_BASE_VERSION override → unchanged.
- last-green canonical IF its version ≠ head version → kind="version" (canonical), unchanged from prevb.
- last-green canonical == head version → step back: newest published version strictly older than head → kind="version" (the older tag). Reason starts "step-back: …".
- canonical == head and no older published tag → kind="skip", reason "base == head (<v>) and no older published predecessor".
- no canonical → main-tip ref / skip paths unchanged.
- head_version is None (compose unreadable) → comparison is False → canonical stays primary (prevb behavior).
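Sketch of the chain above (illustrative only, NOT the code in runner/run_recipe_ci.py). Plan, resolve_base_sketch and the newest_older callable are hypothetical stand-ins; the EXPECTED_NA / stage short-circuits are left out and the no-canonical branch is collapsed to a placeholder.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Plan:
    kind: str                       # "version", "ref" or "skip"
    version: Optional[str] = None
    reason: str = ""


def resolve_base_sketch(
    canonical: Optional[str],
    head_version: Optional[str],
    tags: Sequence[str],
    newest_older: Callable[[Sequence[str], Optional[str]], Optional[str]],
    override: Optional[str] = None,
) -> Plan:
    # 1. Explicit override always wins.
    if override:
        return Plan("version", override, "explicit override")
    # 2. No canonical: placeholder for the main-tip ref / skip paths.
    if canonical is None:
        return Plan("ref", None, "no canonical (main-tip ref / skip paths, not modeled)")
    # 3. Unknown head version compares unequal, so canonical stays primary.
    if head_version is None or canonical != head_version:
        return Plan("version", canonical, "last-green")
    # 4. canonical == head: step back to the newest strictly older tag.
    older = newest_older(tags, head_version)
    if older is not None:
        return Plan("version", older, f"step-back: last-green canonical ({canonical}) "
                                      f"== head version {head_version}")
    # 5. Nothing older is published: skip.
    return Plan("skip", None,
                f"base == head ({head_version}) and no older published predecessor")
```

The ordering helper is passed in as a callable so the sketch stays independent of how tags are actually compared.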
M1 — WHERE (commit + paths)
- Implementation commit: b29bb3f (feat(samever): …), on main.
- runner/run_recipe_ci.py: resolve_upgrade_base(..., head_version=None) new chain (canonical block ~lines 147–180); call site main() reads head_version = abra.head_compose_version(recipe) (~line 1023) and passes it.
- runner/harness/abra.py: head_compose_version(recipe) (regex coop-cloud\.[^.\s]*\.version=([^\s"']+) over the head checkout's compose.yml; matches quoted + unquoted labels; does NOT match .chaos-version).
- runner/warm_reconcile.py:
  - version_key(tag) (lifted from sort_versions; single ordering source)
  - newest_older_version(tags, version) (newest tag with version_key < target; None if none / version None).
- tests/unit/test_upgrade_base.py: 5 new tests (13 total).
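Sketch of the ordering helper's contract (illustrative only; the real version_key in runner/warm_reconcile.py is not reproduced here). version_key_sketch and newest_older_sketch are hypothetical names, and ordering by only the recipe part before the "+" is an ASSUMPTION made to keep the example short.

```python
import re
from typing import Optional, Sequence, Tuple

# ASSUMPTION: tags look like "<major>.<minor>.<patch>+<upstream>".
_TAG = re.compile(r"^(\d+)\.(\d+)\.(\d+)\+\S+$")


def version_key_sketch(tag: str) -> Optional[Tuple[int, int, int]]:
    """Sort key from the recipe part of a tag; None for non-version tags."""
    m = _TAG.match(tag)
    if m is None:
        return None
    return (int(m.group(1)), int(m.group(2)), int(m.group(3)))


def newest_older_sketch(tags: Sequence[str], version: Optional[str]) -> Optional[str]:
    """Newest tag whose key is strictly below the target; None if none."""
    if version is None:
        return None
    target = version_key_sketch(version)
    if target is None:
        return None
    older = []
    for tag in tags:
        key = version_key_sketch(tag)
        if key is not None and key < target:   # excludes equal and newer
            older.append((key, tag))
    if not older:
        return None
    return max(older, key=lambda pair: pair[0])[1]
```

With the tag set used in the unit test (10.6.0+26.5.0, 10.8.0+26.6.3, 10.7.1+26.6.2, 10.7.0+26.6.0, not-a-version) and head 10.8.0+26.6.3, this sketch also picks 10.7.1+26.6.2.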
M1 — HOW to verify (cold, from a clean clone)
- Unit suite (the gate):
  nix shell nixpkgs#python311Packages.pytest -c pytest tests/unit/test_upgrade_base.py -v
  EXPECTED: 13 passed. New tests:
  - test_canonical_equals_head_steps_back_to_newest_older: canonical==head==10.8.0+26.6.3, tags [10.6.0+26.5.0, 10.8.0+26.6.3, 10.7.1+26.6.2, 10.7.0+26.6.0, not-a-version] → plan.version == "10.7.1+26.6.2" (strictly older; asserts version_key(plan.version) < version_key(head)), kind=="version", reason contains "step-back". main never consulted.
  - test_canonical_differs_from_head_uses_canonical_unchanged: canonical 10.7.1+26.6.2 ≠ head 10.8.0+26.6.3 → version==10.7.1+26.6.2, reason "last-green"; recipe_tags NOT consulted.
  - test_canonical_equals_head_no_older_published_skips: canonical==head==1.0.0+3.5.3, tags [1.0.0+3.5.3] only → kind=="skip", reason contains "no older published predecessor".
  - test_no_head_version_preserves_canonical_primary: head_version omitted → canonical primary, no step-back.
  - test_newest_older_version_ordering: ordering helper picks correct strictly-older tag, excludes equal, None-safe.
  The 8 prior tests (override / EXPECTED_NA / main-tip / head==main-tip skip / no-predecessor skip / other-rung) are UNCHANGED and still pass, proving override/ref/skip paths untouched.
- Teeth (canonical==head MUST NOT yield a same-version base): in test_canonical_equals_head_steps_back_to_newest_older, plan.version != head_version and the version_key(plan.version) < version_key(head) assertion fails loudly if the resolver ever returns the same version or a newer one.
- Compose-label parse (the head-version reader): the regex extracts 10.8.0+26.6.3 from a quoted label and 3.5.3+1.24.2-rootless from an unquoted one, and returns no match for a .chaos-version label (verified, see JOURNAL). Real labels confirmed on cc-ci: keycloak 10.8.0+26.6.3, gitea 3.5.3+1.24.2-rootless, discourse 1.0.0+3.5.3.
- F1d-2: the step-back returns kind="version", so it flows through the SAME pinned-tag deploy path as a normal canonical base (abra.recipe_checkout pins the tag on disk); no new deploy code.
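Sketch of the compose-label parse check (illustrative only, not abra.head_compose_version itself). The regex is the one quoted for runner/harness/abra.py; extract_version is a hypothetical wrapper, and the three label lines are made-up samples in the usual coop-cloud.${STACK_NAME} shape (ASSUMPTION), not copied from any recipe.

```python
import re
from typing import Optional

# Regex as quoted for head_compose_version: the value stops at whitespace or a quote.
VERSION_LABEL = re.compile(r"""coop-cloud\.[^.\s]*\.version=([^\s"']+)""")


def extract_version(compose_text: str) -> Optional[str]:
    """Return the first version label value in the text, or None."""
    m = VERSION_LABEL.search(compose_text)
    return m.group(1) if m else None


# Made-up sample label lines (assumed shape).
QUOTED = '      - "coop-cloud.${STACK_NAME}.version=10.8.0+26.6.3"'
UNQUOTED = "      - coop-cloud.${STACK_NAME}.version=3.5.3+1.24.2-rootless"
CHAOS = '      - "coop-cloud.${STACK_NAME}.chaos-version=abc123"'

assert extract_version(QUOTED) == "10.8.0+26.6.3"
assert extract_version(UNQUOTED) == "3.5.3+1.24.2-rootless"
assert extract_version(CHAOS) is None   # ".chaos-version" never matches "\.version="
```

The chaos label is rejected because the segment between "coop-cloud." and ".version=" may not contain a dot, so ".chaos-version=" cannot be reached.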
Note (pre-existing, NOT introduced by this gate): tests/unit/test_meta.py::test_generated_doc_table_in_sync
and tests/unit/test_warm_reconcile.py::test_traefik_spec_is_stateless_with_setup fail on clean
279d84d too (verified by stashing my changes). Out of scope for samever.
Blocked
(none)
DONE — samever M1+M2 Adversary-verified PASS (no VETO)
M1 PASS (1310a95) + M2 PASS (199f5b6): the resolver same-version step-back is proven in real CI —
step-back base<head, version-bump path + discourse #4 unaffected, teeth hold, clean teardown; Adversary
explicitly cleared for DONE. This marker was written by the ORCHESTRATOR because the Builder hit the opus
usage limit and could not write it; the work itself is complete and Adversary-verified.