cc-ci/machine-docs/STATUS-samever.md
autonomic-bot 96c4ad9ef3
claim(M2): samever proven in real CI — step-back base<head, version-bump unaffected, discourse #4 + hedgedoc spot-check
5 real cc-ci runs (samever-deploy @ cc-ci main): Run B nightly steady-state step-back
custom-html 1.11.0+1.29.0→1.13.0+1.31.1 (base<head real delta, 5 tiers green); Run C
version-bump UNAFFECTED (last-green path); Run D PR-form step-back (ref set); discourse #4
kind=ref main-tip unaffected (migration 0.8.1→1.0.0 green); hedgedoc spot-check step-back
3.0.9→3.0.10 green. WHAT/HOW/EXPECTED/WHERE in STATUS-samever.md; logs /root/samever-*.log,
artifacts /var/lib/cc-ci-runs/samever-*/ on cc-ci.
2026-06-17 04:58:48 +00:00

STATUS — phase samever (step-back to older base when canonical == head version)

SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-samever-older-base-fallback.md. State files: this + BACKLOG-samever.md, REVIEW-samever.md (Adversary), JOURNAL-samever.md. DECISIONS.md shared.

Phase

Started 2026-06-17. Gates: M1 (implemented + unit-tested), M2 (proven in real CI).

Current status

M1: PASS (REVIEW-samever.md @2026-06-17T04:27Z — cold-verified, teeth hold, no regression). Gate: M2 CLAIMED, awaiting Adversary @2026-06-17T04:55Z.

M2 — WHAT is claimed

Proven in real CI on cc-ci that the resolver steps back to a genuinely older base when the last-green canonical == head version (never a same-version no-op, never a needless skip), and that the version-bump path is UNAFFECTED. Five real runs (cc-ci@main = samever code, run from /root/samever-deploy; the runner logs the resolver decision + the deployed version-label move):

  1. Nightly steady state (THE headline, §5 DoD) — custom-html cold-on-latest, run TWICE:
    • Run A (first nightly, no canonical): upgrade base: kind=skip SKIP: head == main tip; 5 tiers green; WC5 promote: canonical custom-html advanced to known-good 1.13.0+1.31.1.
    • Run B (2nd consecutive nightly, canonical==latest==head): STEP-BACK → upgrade base: kind=version version=1.11.0+1.29.0 (step-back: last-green canonical (1.13.0+1.31.1) == head version 1.13.0+1.31.1; newest older published base), then the upgrade tier deployed that base and chaos-upgraded to head: upgrade→PR-head: ... version=1.11.0+1.29.0→1.13.0+1.31.1 (label MOVED, base < head, REAL delta: not a no-op, not a skip). All 5 tiers green. Proves F1d-2: the older base was really deployed pinned at 1.11.0, then upgraded to 1.13.0.
  2. Version-bump UNAFFECTED (enrolled) — Run C: re-seeded canonical→OLDER 1.11.0+1.29.0, cold-on-latest head 1.13.0 → upgrade base: kind=version version=1.11.0+1.29.0 (last-green (warm canonical, status=idle)) — reason "last-green", NOT "step-back": the unchanged prevb path; upgrade 1.11.0→1.13.0 green.
  3. PR form (ref set, head==canonical), Run D: recipe=custom-html ref=2b82ebab pr=999 → kind=version version=1.11.0+1.29.0 (step-back: ... canonical (1.13.0+1.31.1) == head version 1.13.0+1.31.1 ...). Step-back STILL triggers with a PR head ref present (the ref does not suppress it); upgrade green.
  4. discourse #4 UNAFFECTED (non-enrolled version-bump, §5 DoD) — REF=ae5a8180: upgrade base: kind=ref ref=f87c612d71b4 (target-branch (main) tip) — byte-identical to prevb run 717; discourse is not enrolled so the resolver never enters the canonical branch. Migration green: version=0.8.1+3.5.0→1.0.0+3.5.3, test_head_runs_official_image_not_bitnamilegacy + test_sidekiq_service_dropped_by_head PASSED. install/upgrade pass.
  5. Spot-check, second recipe/tag-set — hedgedoc: seeded canonical=3.0.10+1.10.8 (its latest), cold-on-latest → kind=version version=3.0.9+1.10.7 (step-back: ... canonical (3.0.10+1.10.8) == head version 3.0.10+1.10.8 ...); upgrade version=3.0.9+1.10.7→3.0.10+1.10.8 green. Step-back generalizes to a different recipe + different published-tag ordering. (hedgedoc is NOT WARM_CANONICAL-enrolled — only custom-html is — so its canonical record was hand-seeded to exercise the same resolver path; the seed was removed after, leaving clean state. The resolver reads canonical.read_registry regardless of enrollment, so this faithfully exercises the production code path.)

M2 — WHERE (logs + artifacts on cc-ci, Adversary-readable)

  • Code under test: cc-ci@main (samever), checked out at /root/samever-deploy (HEAD 1310a95, includes resolver commit b29bb3f). Runner: runner/run_recipe_ci.py.
  • Run logs: /root/samever-runA.log, …-runB.log, …-runC.log, …-runD.log, …-disc4.log, …-hedgedoc.log.
  • Preserved results.json/junit/badge: /var/lib/cc-ci-runs/samever-runB/, …-runC/, …-runD/, …-disc4/, …-hedgedoc/ (each /var/lib/cc-ci-runs/manual is overwritten per run, so these are copies).
  • custom-html canonical (legit enrolled state, left as-is): /var/lib/ci-warm/custom-html/canonical.json = version 1.13.0+1.31.1. No leftover run stacks (clean teardown verified; pre-existing warm-keycloak orphan untouched — not samever).

M2 — HOW to verify (cold, from a clean clone / fresh runs)

A. Cold-read the preserved logs: grep "upgrade base" /root/samever-run{A,B,C,D}.log /root/samever-{disc4,hedgedoc}.log reproduces the resolver lines above; grep "upgrade→PR-head" reproduces the version-label moves; grep ": pass\|: fail" … | sed -n '/RUN SUMMARY/,$p' shows tier outcomes (all pass for the tiers run).

B. Re-run the headline yourself (most adversarial) from your own clone on cc-ci:

# ensure no canonical, run twice; 2nd run must step back:
rm -rf /var/lib/ci-warm/custom-html        # clear canonical
cd <your-clone>; HOME=/root RECIPE=custom-html cc-ci-run runner/run_recipe_ci.py   # run A → promotes
HOME=/root RECIPE=custom-html cc-ci-run runner/run_recipe_ci.py                    # run B → STEP-BACK

EXPECTED run B: upgrade base: kind=version version=1.11.0+1.29.0 (step-back: …) and upgrade→PR-head: … version=1.11.0+1.29.0→1.13.0+1.31.1, all tiers pass. (Single test node: don't run while another run_recipe_ci.py is active.)

C. Teeth: in run B the base version (1.11.0+1.29.0) is strictly < head (1.13.0+1.31.1) AND the upgrade tier's generic test_upgrade_reconverges + cc-ci test_upgrade_preserves_data PASSED. A same-version no-op would show version=1.13.0+1.31.1→1.13.0+1.31.1 (it does not) and a skip would show kind=skip (it does not).
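The teeth check in C can also be applied mechanically to the two log lines. The following is a minimal sketch, assuming the lines have the shape quoted in this document; the helper names (`resolver_decision`, `label_move`, `is_real_step_back`) are hypothetical and not part of the runner. It only rejects a skip or a same-version no-op; it does not by itself prove base < head ordering.

```python
import re
from typing import Optional, Tuple

# Patterns modelled on the log lines quoted in this document (assumed shape).
BASE_RE = re.compile(r"upgrade base: kind=(\w+)(?: version=(\S+))?")
MOVE_RE = re.compile(r"upgrade→PR-head: .*?version=([^\s→]+)→(\S+)")


def resolver_decision(line: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (kind, version-or-None) from an 'upgrade base:' log line."""
    m = BASE_RE.search(line)
    return (m.group(1), m.group(2)) if m else None


def label_move(line: str) -> Optional[Tuple[str, str]]:
    """Return (base, head) from an 'upgrade→PR-head:' log line."""
    m = MOVE_RE.search(line)
    return (m.group(1), m.group(2)) if m else None


def is_real_step_back(base_line: str, move_line: str) -> bool:
    """True only if the resolver chose a version base and the label moved."""
    decision = resolver_decision(base_line)
    move = label_move(move_line)
    if decision is None or move is None:
        return False
    kind, version = decision
    base, head = move
    # A skip has kind != "version"; a no-op has base == head.
    return kind == "version" and version == base and base != head
```

Fed the run B lines above, `is_real_step_back` should return True; fed a `kind=skip` line or a `1.13.0+1.31.1→1.13.0+1.31.1` move, it returns False.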

M1 — WHAT is claimed

resolve_upgrade_base now reads the head's published version and steps back to a genuinely older published base when the last-green warm-canonical version equals the head version — never a same-version no-op, never a needless skip when an older base exists.

Resolution chain (override / EXPECTED_NA / upgrade∉stages short-circuits unchanged):

  1. explicit UPGRADE_BASE_VERSION override → unchanged.
  2. last-green canonical IF its version ≠ head version → kind="version" (canonical), unchanged from prevb.
  3. last-green canonical == head version → step back: newest published version strictly older than head → kind="version" (the older tag). Reason starts "step-back: …".
  4. canonical == head and no older published tag → kind="skip", reason "base == head (<v>) and no older published predecessor".
  5. no canonical → main-tip ref / skip paths unchanged. head_version is None (compose unreadable) → comparison is False → canonical stays primary (prevb behavior).
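The chain above can be sketched as a small pure function. This is an illustration under stated assumptions, not the code in runner/run_recipe_ci.py: the `Plan` type, the function name `resolve_canonical_block`, its parameters, and the override reason text are hypothetical, and the "newest older" lookup is injected as a callable so the sketch stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Plan:
    """Hypothetical result type; the real runner's plan object may differ."""
    kind: str                       # "version" or "skip" in this sketch
    version: Optional[str] = None
    reason: str = ""


def resolve_canonical_block(
    canonical: Optional[str],
    head_version: Optional[str],
    tags: Sequence[str],
    newest_older: Callable[[Sequence[str], Optional[str]], Optional[str]],
    override: Optional[str] = None,
) -> Optional[Plan]:
    # Rule 1: an explicit override always wins.
    if override:
        return Plan("version", override, "explicit UPGRADE_BASE_VERSION override")
    # Rule 5: no canonical, so the caller falls through to main-tip ref / skip.
    if canonical is None:
        return None
    # Rule 2, plus the head_version=None case: canonical stays primary.
    if head_version is None or canonical != head_version:
        return Plan("version", canonical, "last-green")
    # Rule 3: canonical == head, so step back to the newest strictly older tag.
    older = newest_older(tags, head_version)
    if older is not None:
        reason = (f"step-back: last-green canonical ({canonical}) == "
                  f"head version {head_version}; newest older published base")
        return Plan("version", older, reason)
    # Rule 4: nothing older is published, so skip rather than run a no-op.
    return Plan("skip", None,
                f"base == head ({head_version}) and no older published predecessor")
```

Keeping the ordering helper as a parameter mirrors the document's point that ordering has a single source; the sketch never compares versions itself beyond equality.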

M1 — WHERE (commit + paths)

  • Implementation commit: b29bb3f (feat(samever): …), on main.
  • runner/run_recipe_ci.py: resolve_upgrade_base(..., head_version=None) new chain (canonical block ~lines 147–180); call site main() reads head_version = abra.head_compose_version(recipe) (~line 1023) and passes it.
  • runner/harness/abra.py: head_compose_version(recipe) (regex coop-cloud\.[^.\s]*\.version=([^\s"']+) over the head checkout's compose.yml; matches quoted + unquoted labels; does NOT match .chaos-version).
  • runner/warm_reconcile.py: version_key(tag) (lifted from sort_versions; single ordering source)
    • newest_older_version(tags, version) (newest tag with version_key < target; None if none / version None).
  • tests/unit/test_upgrade_base.py — 5 new tests (13 total).
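For orientation, here is a minimal sketch of the ordering helper's contract ("newest tag with version_key < target; None if none / version None"). The function names follow the list above, but the bodies are assumptions: in particular, this `version_key` orders only by the three numeric components before the `+` and treats anything else as a non-version, which may differ from the real implementation in runner/warm_reconcile.py.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed tag shape: "<major>.<minor>.<patch>+<app version>".
_TAG = re.compile(r"^(\d+)\.(\d+)\.(\d+)\+(.+)$")


def version_key(tag: str) -> Optional[Tuple[int, int, int]]:
    """Sort key for a published tag, or None if the tag is not a version."""
    m = _TAG.match(tag)
    if m is None:
        return None
    return (int(m.group(1)), int(m.group(2)), int(m.group(3)))


def newest_older_version(tags: Iterable[str], version: Optional[str]) -> Optional[str]:
    """Newest tag whose key is strictly below the key of `version`."""
    if version is None:
        return None
    target = version_key(version)
    if target is None:
        return None
    best: Optional[str] = None
    best_key: Optional[Tuple[int, int, int]] = None
    for tag in tags:
        key = version_key(tag)
        # Skip non-versions and anything equal to or newer than the target.
        if key is None or key >= target:
            continue
        if best_key is None or key > best_key:
            best, best_key = tag, key
    return best
```

With the unordered tag list used in the unit tests (including a `not-a-version` entry), the sketch picks the newest strictly older tag regardless of input order.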

M1 — HOW to verify (cold, from a clean clone)

  1. Unit suite (the gate):

    nix shell nixpkgs#python311Packages.pytest -c pytest tests/unit/test_upgrade_base.py -v
    

    EXPECTED: 13 passed. New tests:

    • test_canonical_equals_head_steps_back_to_newest_older: canonical==head==10.8.0+26.6.3, tags [10.6.0+26.5.0, 10.8.0+26.6.3, 10.7.1+26.6.2, 10.7.0+26.6.0, not-a-version] → plan.version == "10.7.1+26.6.2" (strictly older; asserts version_key(plan.version) < version_key(head)), kind=="version", reason contains "step-back". main never consulted.
    • test_canonical_differs_from_head_uses_canonical_unchanged: canonical 10.7.1+26.6.2 ≠ head 10.8.0+26.6.3 → version==10.7.1+26.6.2, reason "last-green"; recipe_tags NOT consulted.
    • test_canonical_equals_head_no_older_published_skips — canonical==head==1.0.0+3.5.3, tags [1.0.0+3.5.3] only → kind=="skip", reason contains "no older published predecessor".
    • test_no_head_version_preserves_canonical_primary — head_version omitted → canonical primary, no step-back.
    • test_newest_older_version_ordering: ordering helper picks the correct strictly-older tag, excludes equal, None-safe.

    The 8 prior tests (override / EXPECTED_NA / main-tip / head==main-tip skip / no-predecessor skip / other-rung) are UNCHANGED and still pass, proving the override/ref/skip paths are untouched.
  2. Teeth (canonical==head MUST NOT yield a same-version base): in test_canonical_equals_head_steps_back_to_newest_older, plan.version != head_version and the version_key(plan.version) < version_key(head) assertion fails loudly if the resolver ever returns the same version or a newer one.

  3. Compose-label parse (the head-version reader): the regex extracts 10.8.0+26.6.3 from a quoted label and 3.5.3+1.24.2-rootless from an unquoted one, and returns no match for a .chaos-version label (verified — see JOURNAL). Real labels confirmed on cc-ci: keycloak 10.8.0+26.6.3, gitea 3.5.3+1.24.2-rootless, discourse 1.0.0+3.5.3.

  4. F1d-2: the step-back returns kind="version", so it flows through the SAME pinned-tag deploy path as a normal canonical base (abra.recipe_checkout pins the tag on disk) — no new deploy code.
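The compose-label parse in item 3 can be reproduced in isolation. The sketch below uses the regex exactly as quoted in the WHERE section; the function name `head_version_from_text` and the sample label lines (including the `${STACK_NAME}` placeholder) are assumptions for illustration, not taken from runner/harness/abra.py.

```python
import re
from typing import Optional

# Regex as quoted in this document for head_compose_version.
LABEL_RE = re.compile(r"""coop-cloud\.[^.\s]*\.version=([^\s"']+)""")


def head_version_from_text(compose_text: str) -> Optional[str]:
    """Return the first version label value in compose text, or None."""
    m = LABEL_RE.search(compose_text)
    return m.group(1) if m else None


# Assumed sample label lines, shaped like typical compose.yml deploy labels.
QUOTED = '      - "coop-cloud.${STACK_NAME}.version=10.8.0+26.6.3"\n'
UNQUOTED = "      - coop-cloud.${STACK_NAME}.version=3.5.3+1.24.2-rootless\n"
CHAOS = '      - "coop-cloud.${STACK_NAME}.chaos-version=abc1234"\n'

print(head_version_from_text(QUOTED))    # 10.8.0+26.6.3
print(head_version_from_text(UNQUOTED))  # 3.5.3+1.24.2-rootless
print(head_version_from_text(CHAOS))     # None
```

The `.chaos-version` label is rejected because the pattern requires the literal `.version=` immediately after a single dot-free segment, so `.chaos-version=` cannot satisfy it.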

Note (pre-existing, NOT introduced by this gate): tests/unit/test_meta.py::test_generated_doc_table_in_sync and tests/unit/test_warm_reconcile.py::test_traefik_spec_is_stateless_with_setup fail on clean 279d84d too (verified by stashing my changes). Out of scope for samever.

Blocked

(none)