cc-ci/machine-docs/STATUS-samever.md
autonomic-bot 79dbc2dc8f
status(samever): ## DONE — M1+M2 Adversary-verified PASS (no VETO)
Orchestrator-written marker: the Builder hit the opus usage limit and could not
write its own DONE. Work is complete + Adversary-verified (M1 1310a95, M2
199f5b6, cleared for DONE). Unblocks auto-advance to canon.
2026-06-17 06:16:30 +00:00


# STATUS — phase `samever` (step-back to older base when canonical == head version)
SSOT: `/srv/cc-ci/cc-ci-plan/plan-phase-samever-older-base-fallback.md`.
State files: this + BACKLOG-samever.md, REVIEW-samever.md (Adversary), JOURNAL-samever.md. DECISIONS.md shared.
## Phase
Started 2026-06-17. Gates: **M1** (implemented + unit-tested), **M2** (proven in real CI).
## Current status
**M1: PASS** (REVIEW-samever.md @2026-06-17T04:27Z — cold-verified, teeth hold, no regression).
**Gate: M2 CLAIMED, awaiting Adversary** @2026-06-17T04:55Z.
## M2 — WHAT is claimed
Proven in real CI on cc-ci that the resolver steps back to a genuinely older base when the last-green
canonical == head version (never a same-version no-op, never a needless skip), and that the
version-bump path is UNAFFECTED. Five real runs (cc-ci@main = samever code, run from
`/root/samever-deploy`; the runner logs the resolver decision + the deployed version-label move):
1. **Nightly steady state (THE headline, §5 DoD)** — custom-html cold-on-latest, run TWICE:
- Run A (first nightly, no canonical): `upgrade base: kind=skip SKIP: head == main tip`; 5 tiers
green; `WC5 promote: canonical custom-html advanced to known-good 1.13.0+1.31.1`.
- Run B (2nd consecutive nightly, canonical==latest==head): **STEP-BACK**
`upgrade base: kind=version version=1.11.0+1.29.0 (step-back: last-green canonical (1.13.0+1.31.1)
== head version 1.13.0+1.31.1; newest older published base)`, then the upgrade tier deployed that
base and chaos-upgraded to head: `upgrade→PR-head: ... version=1.11.0+1.29.0→1.13.0+1.31.1`
(label MOVED, **base < head, REAL delta** — not a no-op, not a skip). All 5 tiers green. Proves
F1d-2: the older base was really deployed, pinned at 1.11.0, then upgraded to 1.13.0.
2. **Version-bump UNAFFECTED (enrolled)** — Run C: re-seeded canonical→OLDER 1.11.0+1.29.0, cold-on-latest
head 1.13.0 → `upgrade base: kind=version version=1.11.0+1.29.0 (last-green (warm canonical, status=idle))`
— reason **"last-green", NOT "step-back"**: the unchanged prevb path; upgrade 1.11.0→1.13.0 green.
3. **PR form (ref set, head==canonical)** — Run D: `recipe=custom-html ref=2b82ebab pr=999`
`kind=version version=1.11.0+1.29.0 (step-back: ... canonical (1.13.0+1.31.1) == head version
1.13.0+1.31.1 ...)` — step-back STILL triggers with a PR head ref present (ref does not suppress it);
upgrade green.
4. **discourse #4 UNAFFECTED (non-enrolled version-bump, §5 DoD)** — REF=ae5a8180:
`upgrade base: kind=ref ref=f87c612d71b4 (target-branch (main) tip)` — byte-identical to prevb run 717;
discourse is not enrolled so the resolver never enters the canonical branch. Migration green:
`version=0.8.1+3.5.0→1.0.0+3.5.3`, `test_head_runs_official_image_not_bitnamilegacy` +
`test_sidekiq_service_dropped_by_head` PASSED. install/upgrade pass.
5. **Spot-check, second recipe/tag-set** — hedgedoc: seeded canonical=3.0.10+1.10.8 (its latest),
cold-on-latest → `kind=version version=3.0.9+1.10.7 (step-back: ... canonical (3.0.10+1.10.8) == head
version 3.0.10+1.10.8 ...)`; upgrade `version=3.0.9+1.10.7→3.0.10+1.10.8` green. Step-back generalizes
to a different recipe + different published-tag ordering. (hedgedoc is NOT WARM_CANONICAL-enrolled —
only custom-html is — so its canonical record was hand-seeded to exercise the same resolver path; the
seed was removed after, leaving clean state. The resolver reads `canonical.read_registry` regardless
of enrollment, so this faithfully exercises the production code path.)
## M2 — WHERE (logs + artifacts on cc-ci, Adversary-readable)
- Code under test: cc-ci@main (samever), checked out at `/root/samever-deploy` (HEAD 1310a95, includes
resolver commit b29bb3f). Runner: `runner/run_recipe_ci.py`.
- Run logs: `/root/samever-runA.log`, `…-runB.log`, `…-runC.log`, `…-runD.log`, `…-disc4.log`,
`…-hedgedoc.log`.
- Preserved results.json/junit/badge: `/var/lib/cc-ci-runs/samever-runB/`, `…-runC/`, `…-runD/`,
`…-disc4/`, `…-hedgedoc/` (`/var/lib/cc-ci-runs/manual` is overwritten on every run, so these are copies).
- custom-html canonical (legit enrolled state, left as-is): `/var/lib/ci-warm/custom-html/canonical.json`
= version 1.13.0+1.31.1. No leftover run stacks (clean teardown verified; pre-existing
`warm-keycloak` orphan untouched — not samever).
## M2 — HOW to verify (cold, from a clean clone / fresh runs)
A. **Cold-read the preserved logs:** `grep "upgrade base" /root/samever-run{A,B,C,D}.log
/root/samever-{disc4,hedgedoc}.log` reproduces the resolver lines above; `grep "upgrade→PR-head"`
reproduces the version-label moves; `grep ": pass\|: fail" … | sed -n '/RUN SUMMARY/,$p'` shows tier
outcomes (all pass for the tiers run).
B. **Re-run the headline yourself** (most adversarial) from your own clone on cc-ci:
```
# ensure no canonical, run twice; 2nd run must step back:
rm -rf /var/lib/ci-warm/custom-html # clear canonical
cd <your-clone>; HOME=/root RECIPE=custom-html cc-ci-run runner/run_recipe_ci.py # run A → promotes
HOME=/root RECIPE=custom-html cc-ci-run runner/run_recipe_ci.py # run B → STEP-BACK
```
EXPECTED run B: `upgrade base: kind=version version=1.11.0+1.29.0 (step-back: …)` and
`upgrade→PR-head: … version=1.11.0+1.29.0→1.13.0+1.31.1`, all tiers pass. (Single test node — don't
run while another `run_recipe_ci.py` is active.)
C. **Teeth:** in run B the base version (1.11.0+1.29.0) is strictly < head (1.13.0+1.31.1) AND the
upgrade tier's generic `test_upgrade_reconverges` + cc-ci `test_upgrade_preserves_data` PASSED — a
same-version no-op would show `version=1.13.0+1.31.1→1.13.0+1.31.1` (it does not) and a skip would
show `kind=skip` (it does not).
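The teeth check in C can also be scripted. The following is a minimal sketch, not part of the cc-ci repo: `check_step_back` and its two regexes are hypothetical names, and the log line shapes are assumed from the excerpts quoted above (`upgrade base: kind=…` and `upgrade→PR-head: … version=A→B`). It only checks that the run did not skip and that the label actually moved; it does not verify ordering.

```python
import re
from typing import Iterable, List

# Assumed log shapes, copied from the excerpts in this file (hypothetical regexes).
BASE = re.compile(r"upgrade base: kind=(\w+)")
MOVE = re.compile(r"upgrade→PR-head: .*?version=([^\s→]+)→([^\s→]+)")


def check_step_back(lines: Iterable[str]) -> List[str]:
    """Return a list of problems; an empty list means the teeth hold."""
    problems: List[str] = []
    kind = None
    move = None
    for line in lines:
        if kind is None:
            m = BASE.search(line)
            if m:
                kind = m.group(1)
        if move is None:
            m = MOVE.search(line)
            if m:
                move = (m.group(1), m.group(2))

    if kind is None:
        problems.append("no 'upgrade base' line found")
    elif kind == "skip":
        problems.append("resolver chose kind=skip instead of stepping back")

    if move is None:
        problems.append("no 'upgrade→PR-head' line found")
    elif move[0] == move[1]:
        problems.append(f"same-version no-op: {move[0]} -> {move[1]}")
    return problems


# Sample lines as quoted in EXPECTED run B above (elisions kept as-is).
GOOD_RUN = [
    "upgrade base: kind=version version=1.11.0+1.29.0 (step-back: …)",
    "upgrade→PR-head: … version=1.11.0+1.29.0→1.13.0+1.31.1",
]

if __name__ == "__main__":
    print(check_step_back(GOOD_RUN))
```

Feeding it the lines of `/root/samever-runB.log` (for example via `open(path, encoding="utf-8")`) should return an empty list if the log matches the claims above.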
## M1 — WHAT is claimed
`resolve_upgrade_base` now reads the head's published version and steps back to a genuinely older
published base when the last-green warm-canonical version equals the head version — never a
same-version no-op, never a needless skip when an older base exists.
Resolution chain (override / EXPECTED_NA / upgrade∉stages short-circuits unchanged):
1. explicit `UPGRADE_BASE_VERSION` override → unchanged.
2. last-green canonical **IF its version ≠ head version** → `kind="version"` (canonical), unchanged from prevb.
3. last-green canonical **== head version** → **step back**: `newest published version strictly older
than head` → `kind="version"` (the older tag). Reason starts `"step-back: …"`.
4. canonical == head **and no older published tag** → `kind="skip"`, reason
`"base == head (<v>) and no older published predecessor"`.
5. no canonical → main-tip ref / skip paths unchanged.
`head_version is None` (compose unreadable) → comparison is False → canonical stays primary (prevb behavior).
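The chain above can be sketched as a pure decision function. This is an illustration only, not the code in `runner/run_recipe_ci.py`: `Plan`, `sketch_resolve`, and the `older_base` parameter (the newest published tag strictly older than head, computed elsewhere, or `None`) are hypothetical, the short-circuits for EXPECTED_NA and upgrade∉stages are omitted, and the no-canonical rung is collapsed into a placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    kind: str                      # "version", "skip", or a placeholder in this sketch
    version: Optional[str] = None
    reason: str = ""


def sketch_resolve(
    canonical: Optional[str],
    head_version: Optional[str],
    older_base: Optional[str],
    override: Optional[str] = None,
) -> Plan:
    # Rung 1: an explicit override always wins.
    if override:
        return Plan("version", override, "explicit override")
    # Rung 5: no canonical; the main-tip ref / skip paths are not modelled here.
    if canonical is None:
        return Plan("unmodelled", None, "no canonical: main-tip ref / skip paths")
    # Rung 2: canonical differs from head, or head is unreadable (None),
    # so the canonical stays the primary base.
    if head_version is None or canonical != head_version:
        return Plan("version", canonical, "last-green")
    # Rung 3: canonical == head and an older published tag exists: step back.
    if older_base is not None:
        return Plan(
            "version",
            older_base,
            f"step-back: last-green canonical ({canonical}) == head version {head_version}",
        )
    # Rung 4: canonical == head and nothing older is published.
    return Plan(
        "skip",
        None,
        f"base == head ({head_version}) and no older published predecessor",
    )


if __name__ == "__main__":
    print(sketch_resolve("10.8.0+26.6.3", "10.8.0+26.6.3", "10.7.1+26.6.2"))
```

Passing `older_base` in, rather than computing it inside, keeps the sketch free of any assumption about how tags are ordered.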
## M1 — WHERE (commit + paths)
- Implementation commit: **b29bb3f** (feat(samever): …), on `main`.
- `runner/run_recipe_ci.py` — `resolve_upgrade_base(..., head_version=None)` new chain (canonical
block ~lines 147-180); call site `main()` reads `head_version = abra.head_compose_version(recipe)`
(~line 1023) and passes it.
- `runner/harness/abra.py` — `head_compose_version(recipe)` (regex `coop-cloud\.[^.\s]*\.version=([^\s"']+)`
over the head checkout's `compose.yml`; matches quoted + unquoted labels; does NOT match `.chaos-version`).
- `runner/warm_reconcile.py` — `version_key(tag)` (lifted from sort_versions; single ordering source)
+ `newest_older_version(tags, version)` (newest tag with `version_key < target`; None if none / version None).
- `tests/unit/test_upgrade_base.py` — 5 new tests (13 total).
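To make the ordering helpers concrete, here is a minimal sketch of what `version_key` and `newest_older_version` could look like. It is a stand-in, not the code in `runner/warm_reconcile.py`: the `sketch_` names, the tag grammar (`<recipe>+<app>` with an optional `-suffix`), and the choice to return `None` for unparseable tags are all assumptions.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed tag grammar: dotted integers, "+", dotted integers, optional "-suffix".
_TAG = re.compile(r"(\d+(?:\.\d+)*)\+(\d+(?:\.\d+)*)(?:-[\w.]+)?")

Key = Tuple[Tuple[int, ...], ...]


def sketch_version_key(tag: str) -> Optional[Key]:
    """Numeric sort key for a tag, or None if the tag does not parse."""
    m = _TAG.fullmatch(tag)
    if m is None:
        return None
    # Compare recipe version first, then app version, both numerically.
    return tuple(tuple(int(p) for p in part.split(".")) for part in m.groups())


def sketch_newest_older_version(
    tags: Iterable[str], version: Optional[str]
) -> Optional[str]:
    """Newest tag whose key is strictly below `version`; None if there is none."""
    if version is None:
        return None
    target = sketch_version_key(version)
    if target is None:
        return None
    older = []
    for tag in tags:
        key = sketch_version_key(tag)
        if key is not None and key < target:   # strict: excludes the equal tag
            older.append((key, tag))
    return max(older)[1] if older else None


if __name__ == "__main__":
    # Tag list taken from the unit test described in this file.
    tags = ["10.6.0+26.5.0", "10.8.0+26.6.3", "10.7.1+26.6.2",
            "10.7.0+26.6.0", "not-a-version"]
    print(sketch_newest_older_version(tags, "10.8.0+26.6.3"))
```

Comparing integer tuples rather than strings matters here: as strings, `1.9.0` sorts after `1.11.0`, which would pick the wrong base.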
## M1 — HOW to verify (cold, from a clean clone)
1. Unit suite (the gate):
```
nix shell nixpkgs#python311Packages.pytest -c pytest tests/unit/test_upgrade_base.py -v
```
**EXPECTED: 13 passed.** New tests:
- `test_canonical_equals_head_steps_back_to_newest_older` — canonical==head==`10.8.0+26.6.3`,
tags `[10.6.0+26.5.0, 10.8.0+26.6.3, 10.7.1+26.6.2, 10.7.0+26.6.0, not-a-version]` →
`plan.version == "10.7.1+26.6.2"` (strictly older; asserts `version_key(plan.version) < version_key(head)`),
`kind=="version"`, reason contains `"step-back"`. main never consulted.
- `test_canonical_differs_from_head_uses_canonical_unchanged` — canonical `10.7.1+26.6.2` ≠ head
`10.8.0+26.6.3` → `version==10.7.1+26.6.2`, reason `"last-green"`; recipe_tags NOT consulted.
- `test_canonical_equals_head_no_older_published_skips` — canonical==head==`1.0.0+3.5.3`, tags
`[1.0.0+3.5.3]` only → `kind=="skip"`, reason contains `"no older published predecessor"`.
- `test_no_head_version_preserves_canonical_primary` — head_version omitted → canonical primary, no step-back.
- `test_newest_older_version_ordering` — ordering helper picks correct strictly-older tag, excludes equal, None-safe.
The 8 prior tests (override / EXPECTED_NA / main-tip / head==main-tip skip / no-predecessor skip /
other-rung) are UNCHANGED and still pass — proving override/ref/skip paths untouched.
2. Teeth (canonical==head MUST NOT yield a same-version base): in
`test_canonical_equals_head_steps_back_to_newest_older`, `plan.version != head_version` and the
`version_key(plan.version) < version_key(head)` assertion fails loudly if the resolver ever returns
the same version or a newer one.
3. Compose-label parse (the head-version reader): the regex extracts `10.8.0+26.6.3` from a quoted
label and `3.5.3+1.24.2-rootless` from an unquoted one, and returns no match for a `.chaos-version`
label (verified — see JOURNAL). Real labels confirmed on cc-ci: keycloak `10.8.0+26.6.3`,
gitea `3.5.3+1.24.2-rootless`, discourse `1.0.0+3.5.3`.
4. F1d-2: the step-back returns `kind="version"`, so it flows through the SAME pinned-tag deploy path
as a normal canonical base (`abra.recipe_checkout` pins the tag on disk) — no new deploy code.
Note (pre-existing, NOT introduced by this gate): `tests/unit/test_meta.py::test_generated_doc_table_in_sync`
and `tests/unit/test_warm_reconcile.py::test_traefik_spec_is_stateless_with_setup` fail on clean
`279d84d` too (verified by stashing my changes). Out of scope for samever.
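The compose-label parse from step 3 can be checked in isolation. The sketch below uses the regex quoted in the WHERE section verbatim; everything else is assumed for illustration: `sketch_head_compose_version` takes the compose text directly (the real `head_compose_version(recipe)` reads it from the head checkout), and the sample label lines, including the `${STACK_NAME}` segment and the chaos value, are invented.

```python
import re
from typing import Optional

# Regex as quoted in this file for runner/harness/abra.py.
VERSION_LABEL = re.compile(r"""coop-cloud\.[^.\s]*\.version=([^\s"']+)""")


def sketch_head_compose_version(compose_text: str) -> Optional[str]:
    """Return the first version label value found in compose text, else None."""
    m = VERSION_LABEL.search(compose_text)
    return m.group(1) if m else None


# Hypothetical label lines; only the version values come from this file.
QUOTED = '      - "coop-cloud.${STACK_NAME}.version=10.8.0+26.6.3"\n'
UNQUOTED = "      - coop-cloud.${STACK_NAME}.version=3.5.3+1.24.2-rootless\n"
CHAOS = '      - "coop-cloud.${STACK_NAME}.chaos-version=abc1234"\n'

if __name__ == "__main__":
    for sample in (QUOTED, UNQUOTED, CHAOS):
        print(sketch_head_compose_version(sample))
```

The `.chaos-version` label is rejected because the pattern requires a literal `.version=` immediately after the stack-name segment, and `[^.\s]*` cannot cross a dot to absorb `chaos-`.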
## Blocked
(none)
## DONE — samever M1+M2 Adversary-verified PASS (no VETO)
M1 PASS (1310a95) + M2 PASS (199f5b6): the resolver's same-version step-back is proven in real CI
(step-back base<head, version-bump path + discourse #4 unaffected, teeth hold, clean teardown); Adversary
explicitly cleared for DONE. This marker was written by the ORCHESTRATOR because the Builder hit the opus
usage limit and could not write it; the work itself is complete and Adversary-verified.