cc-ci/machine-docs/REVIEW-samever.md


REVIEW — phase samever (Adversary writes here)

Phase: samever (step back to an older base when canonical == head version; no same-version upgrade)
SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-samever-older-base-fallback.md
Adversary loop started: 2026-06-17T04:09Z
Adversary clone: /srv/cc-ci/cc-ci-adv


Gate verdicts

M2: PASS @2026-06-17T05:04Z

Proven in real CI. Cold-read the Builder's preserved logs AND — the strongest check — independently reproduced the headline from my OWN fresh clone on cc-ci (git clone … /root/adv-verify @ 96c4ad9, NOT the Builder's /root/samever-deploy), so the step-back is not an artifact of the Builder's tree.

Independent reproduction (my clone, my runs /root/adv-runA.log, /root/adv-runB.log):

  • Run A (canonical cleared): upgrade base: kind=skip SKIP: head == main tip → promotes canonical→1.13.0+1.31.1.
  • Run B (canonical==head==1.13.0+1.31.1): STEP-BACK kind=version version=1.11.0+1.29.0 (step-back: last-green canonical (1.13.0+1.31.1) == head version 1.13.0+1.31.1; newest older published base) then upgrade→PR-head: … version=1.11.0+1.29.0 → 1.13.0+1.31.1. All 5 tiers pass. base 1.11.0 < head 1.13.0: a REAL delta, not a no-op, not a skip. ✓

Cold-read of Builder's 5 runs (corroborates, all consistent with verified resolver logic):

  1. Headline runA/runB — identical to my independent repro above. F1d-2 confirmed: base tier prepulled nginx:1.29.0 (pinned 1.11.0+1.29.0), upgrade tier prepulled nginx:1.31.1 (head 1.13.0+1.31.1) — distinct images ⇒ the older base really deployed pinned, not LATEST.
  2. Version-bump UNAFFECTED (runC): canonical re-seeded to OLDER 1.11.0+1.29.0 → reason "last-green" NOT "step-back" (the unchanged prevb path); upgrade 1.11.0→1.13.0 green. Corroborates my M1 direct probe (canonical≠head → last-green, recipe_tags not consulted).
  3. PR form (runD, ref=2b82ebab pr=999): step-back STILL triggers with a PR head ref present (ref does not suppress it); upgrade green. ✓
  4. discourse #4 UNAFFECTED (disc4, REF=ae5a8180): kind=ref ref=f87c612d71b4 (target-branch (main) tip) — discourse is non-enrolled so the resolver never enters the canonical branch; migration 0.8.1+3.5.0→1.0.0+3.5.3 green, test_head_runs_official_image_not_bitnamilegacy + test_sidekiq_service_dropped_by_head PASSED. The official-image migration is untouched. ✓
  5. Spot-check hedgedoc: kind=version version=3.0.9+1.10.7 (step-back: … canonical (3.0.10+1.10.8) == head 3.0.10+1.10.8 …), upgrade 3.0.9→3.0.10 green. I independently confirmed via newest_older_version that 3.0.9+1.10.7 IS the newest-older for hedgedoc's tag-set ⇒ step-back generalizes to a different recipe + ordering. ✓

Teeth: in both my Run B and the Builder's, base version 1.11.0+1.29.0 is strictly < head 1.13.0+1.31.1; a same-version no-op would log …→1.13.0+1.31.1 from 1.13.0+1.31.1 (it does not), a needless skip would log kind=skip (it does not). Distinct base/head app images seal it.

Hygiene (cold-checked): canonical restored to legit 1.13.0+1.31.1 (byte-diff vs pre-verify snapshot = unchanged); no leftover custom-html run stacks (clean teardown); hedgedoc hand-seed removed (no /var/lib/ci-warm/hedgedoc); pre-existing warm-keycloak orphan untouched (not samever). My own verify clone/script removed afterward.

Verdict: M2 PASS. Resolver steps back to a genuinely older base in real CI (headline reproduced from my own clone), version-bump path + discourse #4 demonstrably unaffected, generalizes to a 2nd recipe, teeth hold, clean teardown. (Consulted JOURNAL only after writing this verdict.)

Both M1 + M2 are fresh Adversary PASSes. No VETO. The Builder is cleared to write ## DONE to STATUS-samever.md per the §6.1 handshake.

M1: PASS @2026-06-17T04:27Z

Cold-verified from own clone /srv/cc-ci/cc-ci-adv @ b29bb3f (claim c5a0d20). Implemented + unit-tested gate. Independent (not trusting Builder's tests) — re-ran the suite AND wrote my own break-it probes.

Evidence:

  1. Unit suite cold: pytest tests/unit/test_upgrade_base.py -v → 13 passed (8 prior unchanged + 5 new). The 8 prior (override / EXPECTED_NA / main-tip / head==main-tip skip / no-predecessor / other-rung) still green ⇒ override/ref/skip paths untouched.
  2. My own primitive probes (direct import, adversarial inputs):
    • newest_older_version strictly-older semantics: suffix tags (-rootless) ordered correctly; head-version BETWEEN tags → newest strictly older; equal-key tag EXCLUDED (1.0.0+3.5.3 vs 1.0.0+3.5.3 → None); head-is-oldest → None; None/empty safe; recipe-major ordering beats app (9.9.9+99.0.0 < 10.0.0+1.0.0). ✓
    • _VERSION_LABEL_RE: parses quoted, unquoted, single-quoted labels; .chaos-version → None (not matched); chaos-then-real picks the real label. ✓
  3. My own resolver-chain probes (monkeypatched canonical + recipe_tags, direct resolve_upgrade_base):
    • canonical==head (TEETH): 10.8.0+26.6.3 → base 10.7.1+26.6.2, kind=version, reason="step-back: …"; asserted version != head AND version_key(base) < version_key(head). Never a same-version no-op; strictly older.
    • canonical≠head (version-bump path): uses canonical unchanged AND recipe_tags is NOT consulted (patched it to raise — no raise) ⇒ discourse #4 / version-bump PRs cannot be perturbed by this gate. ✓
    • canonical==head, no older tag: kind=skip, reason "base == head (…) and no older published predecessor" ⇒ declared, not silent. ✓
    • head_version=None (compose unreadable): canonical stays primary (prevb behavior preserved). ✓
  4. sort_versions refactor behavior-preserving: version_key lifted verbatim from the old inline key; test_warm_reconcile.py version-ordering tests pass (8 passed; single failure unrelated).
  5. Pre-existing failures disclosed honestly: test_meta::test_generated_doc_table_in_sync and test_warm_reconcile::test_traefik_spec_is_stateless_with_setup FAIL on parent 279d84d too (re-ran in a temp worktree — both fail there); samever diff touches neither SPECS nor the doc table. Out of scope, NOT a regression.
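
For readers who want the ordering semantics from item 2 in concrete form, here is a minimal sketch. It is not the code under review: the function names match the ones probed, but both bodies are assumptions reconstructed from the probe results, and in particular the choice to ignore suffixes such as -rootless when comparing is a guess at what "ordered correctly" means.

```python
import re


def version_key(tag):
    """Sort key for 'RECIPE+APP[-suffix]' tags (assumed format).

    The recipe part is compared first, so recipe-major ordering beats the
    app version. Any '-suffix' is ignored here (an assumption of this sketch).
    """
    recipe, _, app = tag.partition("+")

    def nums(part):
        # Drop a trailing '-suffix', then collect the numeric components.
        return tuple(int(n) for n in re.findall(r"\d+", part.split("-", 1)[0]))

    return (nums(recipe), nums(app))


def newest_older_version(tags, head_version):
    """Return the newest tag strictly older than head_version, or None."""
    if not tags or not head_version:
        return None  # None/empty inputs are safe
    head_key = version_key(head_version)
    # Strictly older: a tag whose key equals the head's key is excluded.
    older = [t for t in tags if version_key(t) < head_key]
    return max(older, key=version_key) if older else None
```

The strict `<` is what gives the gate its teeth: an equal-key tag can never be returned, so the base cannot silently collapse into a same-version no-op.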

F1d-2: step-back returns kind="version" ⇒ inherits the same pinned-tag deploy path as any canonical base (no new deploy code) — the on-disk tree is checked out at the pinned older tag. This is an M1 (unit) claim; the REAL pinned-deploy proof belongs to M2 (live CI, evidenced base<head delta).

Verdict: M1 PASS. Implementation matches plan §2 chain exactly; teeth hold; no regression to override/ref/skip/version-bump paths. (Consulted JOURNAL only after writing this — did not need it.)


Orientation @2026-06-17T04:09Z

Phase samever plan created 2026-06-17T03:56Z. Builder has not yet started (no STATUS-samever.md).

Root cause confirmed (cold-read of resolver, lines 133–148 of run_recipe_ci.py):

rec = canonical.read_registry(recipe)
if rec and rec.get("version"):
    return BasePlan(
        "version",
        rec["version"],
        None,
        f"last-green (warm canonical, status={rec.get('status')})",
    )

The warm-canonical path returns canonical["version"] WITHOUT checking if it equals the head version. The resolver is not passed the head's semantic version (only head_ref, a commit sha), so it cannot compare.

Current unit tests (8 tests in tests/unit/test_upgrade_base.py) — none cover canonical==head:

  • test_upgrade_not_in_stages_skip
  • test_expected_na_upgrade_skip_even_with_canonical_and_override
  • test_explicit_override_wins_over_canonical
  • test_last_green_warm_canonical_is_primary ← uses canonical["version"]="0.6.0+3.1.1", HEAD="aaaa1111head" (different version — correct but doesn't test the same-version edge)
  • test_main_tip_fallback_when_no_last_green
  • test_head_equals_main_tip_skip
  • test_no_canonical_no_main_skip
  • test_expected_na_other_rung_does_not_suppress_upgrade

Key utilities available for the fix:

  • warm_reconcile.recipe_tags(recipe) — returns all git tags from recipe clone
  • warm_reconcile.sort_versions(tags) — ascending sort of version tags (coop-cloud semver)
  • warm_reconcile.latest_version(tags) — the newest tag
  • Head version read from compose.yml: coop-cloud.${STACK_NAME}.version label at abra.recipe_dir(recipe)/compose.yml (head checkout already at that path when resolver runs)
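
One plausible way to read the head version label is a plain regex over the compose.yml text. This is a sketch under assumptions: the pattern and the helper name read_head_version are hypothetical (the real pattern is whatever the Builder ships), and it assumes the label appears literally as coop-cloud.${STACK_NAME}.version=<value>, optionally wrapped in single or double quotes.

```python
import re

# Hypothetical pattern: '.version=' must follow '${STACK_NAME}' directly,
# so a '.chaos-version=' label on the same service is not matched.
_LABEL_RE = re.compile(r"""coop-cloud\.\$\{STACK_NAME\}\.version=([^\s"']+)""")


def read_head_version(compose_text):
    """Return the first version label value found in compose_text, or None."""
    if not compose_text:
        return None
    match = _LABEL_RE.search(compose_text)
    return match.group(1) if match else None
```

A regex avoids a YAML parser dependency and tolerates quoting differences, at the cost of also matching a commented-out label; whether that trade-off is acceptable is a design choice for the Builder.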

M1 verification plan (what I'll cold-verify when claimed):

  1. Resolver reads head version from compose.yml (inspect the parsing — look for compose YAML read + coop-cloud.*version label extraction)
  2. New chain: override → (canonical if canonical≠head_version) → (newest older published if canonical==head_version) → main-tip → skip
  3. Unit tests added: at minimum canonical==head→step_back, canonical≠head→unchanged, no_older_published→skip, version ordering correct
  4. Run python -m pytest tests/unit/test_upgrade_base.py -v cold from own clone
  5. Confirm OVERRIDE, EXPECTED_NA, main-tip, skip paths are untouched (regression: existing 8 tests still pass)
  6. Teeth check: a "broken base" scenario should still fail (unit test or from plan F1d-2 evidence)
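
The chain in step 2 can be sketched as a pure function, which is roughly the shape I expect to probe. Everything here is illustrative: resolve_base, its parameters, the BasePlan field names, and the override's kind are assumptions, and find_older stands in for a lookup over the recipe's published tags. It is called lazily so that the canonical≠head path never touches the tag list.

```python
from typing import Callable, NamedTuple, Optional


class BasePlan(NamedTuple):
    # Field names are hypothetical; the excerpted code only shows four positional values.
    kind: str
    version: Optional[str]
    ref: Optional[str]
    reason: str


def resolve_base(
    override: Optional[str],
    canonical: Optional[str],
    head_version: Optional[str],
    find_older: Callable[[str], Optional[str]],
    main_tip: Optional[str],
    head_ref: Optional[str],
) -> BasePlan:
    if override:
        return BasePlan("version", override, None, "explicit override")
    if canonical:
        # Unknown head version or a real version bump: canonical stays primary.
        if head_version is None or canonical != head_version:
            return BasePlan("version", canonical, None, "last-green")
        # canonical == head: step back to the newest strictly older tag.
        older = find_older(head_version)
        if older:
            reason = f"step-back: last-green canonical ({canonical}) == head version {head_version}"
            return BasePlan("version", older, None, reason)
        reason = f"base == head ({head_version}) and no older published predecessor"
        return BasePlan("skip", None, None, reason)
    if main_tip and main_tip != head_ref:
        return BasePlan("ref", None, main_tip, "target-branch (main) tip")
    return BasePlan("skip", None, None, "no usable base")
```

Passing a find_older that raises is a cheap way to prove the version-bump path is unperturbed: if the resolver returns without raising when canonical≠head, the tag lookup was never consulted.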

M2 verification plan:

  1. Cold-on-latest run on an enrolled recipe whose canonical == latest (seed the canonical to latest, then trigger cold run)
  2. Evidence in logs: base_version < head_version (not a no-op, not a skip)
  3. Re-run discourse #4 or equivalent version-bump PR → UNAFFECTED (canonical→head path still uses canonical)
  4. Spot-check ≥1 other recipe

Adversary findings

(empty — phase not yet started)


Break-it probes log

(none yet)