cc-ci/machine-docs/REVIEW-samever.md

# REVIEW — phase `samever` (Adversary writes here)
**Phase:** samever — step back to older base when canonical == head version (no same-version upgrade)
**SSOT:** `/srv/cc-ci/cc-ci-plan/plan-phase-samever-older-base-fallback.md`
**Adversary loop started:** 2026-06-17T04:09Z
**Adversary clone:** /srv/cc-ci/cc-ci-adv

---
## Gate verdicts
### M2: PASS @2026-06-17T05:04Z
Proven in real CI. Cold-read the Builder's preserved logs AND — the strongest check — **independently
reproduced the headline from my OWN fresh clone** on cc-ci (`git clone … /root/adv-verify` @ 96c4ad9,
NOT the Builder's `/root/samever-deploy`), so the step-back is not an artifact of the Builder's tree.
**Independent reproduction (my clone, my runs `/root/adv-runA.log`,`/root/adv-runB.log`):**
- Run A (canonical cleared): `upgrade base: kind=skip SKIP: head == main tip` → promotes
canonical→`1.13.0+1.31.1`.
- Run B (canonical==head==`1.13.0+1.31.1`): **STEP-BACK**
`kind=version version=1.11.0+1.29.0 (step-back: last-green canonical (1.13.0+1.31.1) == head version
1.13.0+1.31.1; newest older published base)` then `upgrade→PR-head: … version=1.11.0+1.29.0→
1.13.0+1.31.1`. **All 5 tiers pass.** base `1.11.0` < head `1.13.0`: a REAL delta, not a no-op,
not a skip.
**Cold-read of Builder's 5 runs (corroborates, all consistent with verified resolver logic):**
1. Headline runA/runB identical to my independent repro above. F1d-2 confirmed: base tier
prepulled `nginx:1.29.0` (pinned `1.11.0+1.29.0`), upgrade tier prepulled `nginx:1.31.1`
(head `1.13.0+1.31.1`): **distinct images ⇒ the older base really deployed pinned, not LATEST.**
2. **Version-bump UNAFFECTED (runC):** canonical re-seeded to OLDER `1.11.0+1.29.0` → reason
**`"last-green"` NOT `"step-back"`** (the unchanged prevb path); upgrade `1.11.0→1.13.0` green.
Corroborates my M1 direct probe (canonical≠head ⇒ last-green, `recipe_tags` not consulted).
3. **PR form (runD, ref=2b82ebab pr=999):** step-back STILL triggers with a PR head ref present
(ref does not suppress it); upgrade green.
4. **discourse #4 UNAFFECTED (disc4, REF=ae5a8180):** `kind=ref ref=f87c612d71b4 (target-branch
(main) tip)` — discourse is non-enrolled so the resolver never enters the canonical branch;
migration `0.8.1+3.5.0→1.0.0+3.5.3` green, `test_head_runs_official_image_not_bitnamilegacy` +
`test_sidekiq_service_dropped_by_head` PASSED. The official-image migration is untouched. ✓
5. **Spot-check hedgedoc:** `kind=version version=3.0.9+1.10.7 (step-back: canonical (3.0.10+1.10.8)
== head 3.0.10+1.10.8 …)`, upgrade `3.0.9→3.0.10` green. I independently confirmed via
`newest_older_version` that `3.0.9+1.10.7` IS the newest-older for hedgedoc's tag-set ⇒ step-back
generalizes to a different recipe + ordering. ✓
**Teeth:** in both my Run B and the Builder's, base version `1.11.0+1.29.0` is strictly `<` head
`1.13.0+1.31.1`; a same-version no-op would log `…→1.13.0+1.31.1` from `1.13.0+1.31.1` (it does not),
a needless skip would log `kind=skip` (it does not). Distinct base/head app images seal it.
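The teeth check above can be mechanized roughly as follows. This is a minimal sketch, not part of the CI tooling: `classify` and `numeric_key` are hypothetical names, the sample strings in the usage note are assembled from the log fragments quoted in this review (the real log layout may differ), and the numeric comparison is a simplification of the project's own `version_key`.

```python
import re

# Matches the "version=<base>→<head>" fragment quoted from the upgrade tier log.
_TRANSITION_RE = re.compile(r"version=(\S+?)→(\S+)")


def numeric_key(version):
    """Simplified ordering key: every integer in the tag, in order.

    Assumption: good enough for tags shaped like "1.11.0+1.29.0";
    the real resolver uses its own version_key.
    """
    return tuple(int(n) for n in re.findall(r"\d+", version))


def classify(log_text):
    """Label a run as skip, no-op, real-delta, not-older, or unknown."""
    if "kind=skip" in log_text:
        return "skip"
    match = _TRANSITION_RE.search(log_text)
    if match is None:
        return "unknown"
    base, head = match.group(1), match.group(2)
    if base == head:
        return "no-op"  # same-version upgrade: the failure this phase removes
    if numeric_key(base) < numeric_key(head):
        return "real-delta"
    return "not-older"
```

A run passes the teeth check only when `classify` returns `"real-delta"`; for example, `classify("version=1.11.0+1.29.0→1.13.0+1.31.1")` does, while a log containing `kind=skip` or a transition from `1.13.0+1.31.1` to itself does not.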
**Hygiene (cold-checked):** canonical restored to legit `1.13.0+1.31.1` (byte-diff vs pre-verify
snapshot = unchanged); no leftover custom-html run stacks (clean teardown); hedgedoc hand-seed
removed (no `/var/lib/ci-warm/hedgedoc`); pre-existing `warm-keycloak` orphan untouched (not samever).
My own verify clone/script removed afterward.
Verdict: **M2 PASS.** Resolver steps back to a genuinely older base in real CI (headline reproduced
from my own clone), version-bump path + discourse #4 demonstrably unaffected, generalizes to a 2nd
recipe, teeth hold, clean teardown. (Consulted JOURNAL only after writing this verdict.)
**Both M1 + M2 are fresh Adversary PASSes. No VETO. The Builder is cleared to write `## DONE` to
STATUS-samever.md per the §6.1 handshake.**
### M1: PASS @2026-06-17T04:27Z
Cold-verified from own clone `/srv/cc-ci/cc-ci-adv` @ b29bb3f (claim c5a0d20). Implemented + unit-tested
gate. Independent (not trusting Builder's tests) — re-ran the suite AND wrote my own break-it probes.
**Evidence:**
1. **Unit suite cold:** `pytest tests/unit/test_upgrade_base.py -v` → **13 passed** (8 prior unchanged
+ 5 new). The 8 prior (override / EXPECTED_NA / main-tip / head==main-tip skip / no-predecessor /
other-rung) still green ⇒ override/ref/skip paths untouched.
2. **My own primitive probes** (direct import, adversarial inputs):
- `newest_older_version` strictly-older semantics: suffix tags (`-rootless`) ordered correctly;
head-version BETWEEN tags → newest strictly older; **equal-key tag EXCLUDED** (1.0.0+3.5.3 vs
1.0.0+3.5.3 → None); head-is-oldest → None; None/empty safe; recipe-major ordering beats app
(9.9.9+99.0.0 < 10.0.0+1.0.0). ✓
- `_VERSION_LABEL_RE`: parses quoted, unquoted, single-quoted labels; **`.chaos-version` → None**
(not matched); chaos-then-real picks the real label. ✓
3. **My own resolver-chain probes** (monkeypatched canonical + recipe_tags, direct `resolve_upgrade_base`):
- **canonical==head (TEETH):** `10.8.0+26.6.3` → base `10.7.1+26.6.2`, `kind=version`,
`reason="step-back: …"`; asserted `version != head` AND `version_key(base) < version_key(head)`.
**Never a same-version no-op; strictly older.** ✓
- **canonical≠head (version-bump path):** uses canonical unchanged AND `recipe_tags` is NOT consulted
(patched it to raise — no raise) ⇒ discourse #4 / version-bump PRs cannot be perturbed by this gate. ✓
- **canonical==head, no older tag:** `kind=skip`, reason `"base == head (…) and no older published
predecessor"` ⇒ declared, not silent. ✓
- **head_version=None (compose unreadable):** canonical stays primary (prevb behavior preserved). ✓
4. **sort_versions refactor behavior-preserving:** `version_key` lifted verbatim from the old inline
key; `test_warm_reconcile.py` version-ordering tests pass (8 passed; single failure unrelated).
5. **Pre-existing failures disclosed honestly:** `test_meta::test_generated_doc_table_in_sync` and
`test_warm_reconcile::test_traefik_spec_is_stateless_with_setup` FAIL on **parent 279d84d** too
(re-ran in a temp worktree — both fail there); samever diff touches neither SPECS nor the doc table.
Out of scope, NOT a regression.
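For readers without the repository at hand, the strictly-older semantics probed in item 2 can be sketched as below. This is an illustrative reimplementation, not the code under review: the bodies of `version_key` and `newest_older_version` are assumptions, and in particular the handling of suffix tags (here a suffix such as `-rootless` is simply ignored by the key) may differ from the real helper.

```python
import re
from typing import Iterable, Optional, Tuple

Key = Tuple[Tuple[int, ...], Tuple[int, ...]]


def _leading_numbers(part: str) -> Tuple[int, ...]:
    """Parse the leading dotted-number run, ignoring any trailing suffix."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", part)
    if match is None:
        return ()
    return tuple(int(n) for n in match.group(1).split("."))


def version_key(tag: str) -> Key:
    """Order by recipe version first, then app version ("recipe+app")."""
    recipe, _, app = tag.partition("+")
    return (_leading_numbers(recipe), _leading_numbers(app))


def newest_older_version(
    tags: Optional[Iterable[str]], head: Optional[str]
) -> Optional[str]:
    """Return the newest tag strictly older than head, or None."""
    if not head or not tags:
        return None
    head_key = version_key(head)
    # Strict "<" excludes any tag whose key equals the head's key.
    older = [tag for tag in tags if version_key(tag) < head_key]
    return max(older, key=version_key) if older else None
```

Comparing integer tuples rather than strings is what makes `3.0.9` sort below `3.0.10` and `9.9.9+99.0.0` below `10.0.0+1.0.0`.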
**F1d-2:** step-back returns `kind="version"` ⇒ inherits the same pinned-tag deploy path as any
canonical base (no new deploy code) — the on-disk tree is checked out at the pinned older tag. This is
an M1 (unit) claim; the REAL pinned-deploy proof belongs to **M2** (live CI, evidenced base<head delta).
Verdict: **M1 PASS.** Implementation matches plan §2 chain exactly; teeth hold; no regression to
override/ref/skip/version-bump paths. (Consulted JOURNAL only after writing this — did not need it.)

---
## Orientation @2026-06-17T04:09Z
Phase `samever` plan created 2026-06-17T03:56Z. Builder has not yet started (no STATUS-samever.md).
**Root cause confirmed (cold-read of resolver, lines 133-148 of run_recipe_ci.py):**
```python
rec = canonical.read_registry(recipe)
if rec and rec.get("version"):
    return BasePlan(
        "version",
        rec["version"],
        None,
        f"last-green (warm canonical, status={rec.get('status')})",
    )
```
The warm-canonical path returns `canonical["version"]` WITHOUT checking if it equals the head version.
The resolver is not passed the head's semantic version (only `head_ref`, a commit sha), so it cannot compare.
**Current unit tests (8 tests in tests/unit/test_upgrade_base.py) — none cover canonical==head:**
- test_upgrade_not_in_stages_skip
- test_expected_na_upgrade_skip_even_with_canonical_and_override
- test_explicit_override_wins_over_canonical
- test_last_green_warm_canonical_is_primary ← uses canonical["version"]="0.6.0+3.1.1", HEAD="aaaa1111head" (different version — correct but doesn't test the same-version edge)
- test_main_tip_fallback_when_no_last_green
- test_head_equals_main_tip_skip
- test_no_canonical_no_main_skip
- test_expected_na_other_rung_does_not_suppress_upgrade
**Key utilities available for the fix:**
- `warm_reconcile.recipe_tags(recipe)` — returns all git tags from recipe clone
- `warm_reconcile.sort_versions(tags)` — ascending sort of version tags (coop-cloud semver)
- `warm_reconcile.latest_version(tags)` — the newest tag
- Head version read from compose.yml: `coop-cloud.${STACK_NAME}.version` label at `abra.recipe_dir(recipe)/compose.yml` (head checkout already at that path when resolver runs)
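One plausible way to pull the head version out of `compose.yml` is a regex over the label text, sketched below. The pattern and the `read_head_version` name are assumptions for illustration (the fix may parse the YAML instead); the only facts taken from this document are the label shape `coop-cloud.${STACK_NAME}.version` and the requirement that a `.chaos-version` label must not be mistaken for it.

```python
import re
from typing import Optional

# The middle segment (e.g. "${STACK_NAME}") may not contain a dot, so
# "coop-cloud.<stack>.chaos-version=..." cannot satisfy "\.version=".
VERSION_LABEL_RE = re.compile(
    r"""coop-cloud\.[^.\s'"=]+\.version=([^\s'"]+)"""
)


def read_head_version(compose_text: str) -> Optional[str]:
    """Return the first version label value in the text, or None."""
    match = VERSION_LABEL_RE.search(compose_text)
    return match.group(1) if match else None
```

Because the value capture stops at whitespace or a quote, the same pattern covers double-quoted, single-quoted, and unquoted labels; returning `None` on a miss lets the caller fall back to the existing canonical-first behavior when the compose file is unreadable.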
**M1 verification plan (what I'll cold-verify when claimed):**
1. Resolver reads head version from compose.yml (inspect the parsing — look for compose YAML read + `coop-cloud.*version` label extraction)
2. New chain: override → (canonical if canonical≠head_version) → (newest older published if canonical==head_version) → main-tip → skip
3. Unit tests added: at minimum canonical==head→step_back, canonical≠head→unchanged, no_older_published→skip, version ordering correct
4. Run `python -m pytest tests/unit/test_upgrade_base.py -v` cold from own clone
5. Confirm OVERRIDE, EXPECTED_NA, main-tip, skip paths are untouched (regression: existing 8 tests still pass)
6. Teeth check: a "broken base" scenario should still fail (unit test or from plan F1d-2 evidence)
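The chain in item 2 can be sketched as a pure function, which also makes the regression checks in items 3 and 5 easy to express. Everything here is hypothetical scaffolding (`Plan`, `resolve_base`, the parameter names, the reason strings); it also assumes that an exhausted step-back yields a declared skip rather than falling through to main-tip. The older-base lookup is passed as a callable so that it is only invoked when canonical equals the head version.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Plan:
    kind: str              # "version", "ref", or "skip"
    value: Optional[str]   # version tag or commit ref; None for skip
    reason: str


def resolve_base(
    override: Optional[str],
    canonical: Optional[str],
    head_version: Optional[str],
    head_ref: str,
    main_tip: Optional[str],
    older_base: Callable[[], Optional[str]],
) -> Plan:
    """override -> canonical (if != head) -> step-back -> main-tip -> skip."""
    if override:
        return Plan("version", override, "explicit override")
    if canonical:
        # Unknown head version: keep canonical primary, as before the fix.
        if head_version is None or canonical != head_version:
            return Plan("version", canonical, "last-green")
        older = older_base()  # consulted only on the canonical == head branch
        if older:
            return Plan("version", older, "step-back: newest older published base")
        return Plan("skip", None, "base == head and no older published predecessor")
    if main_tip and main_tip != head_ref:
        return Plan("ref", main_tip, "target-branch tip")
    return Plan("skip", None, "no usable base")
```

Passing an `older_base` that raises is a cheap way to prove the version-bump path never touches the tag list: if canonical differs from the head version, the call must return the canonical plan without raising.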
**M2 verification plan:**
1. Cold-on-latest run on an enrolled recipe whose canonical == latest (seed the canonical to latest, then trigger cold run)
2. Evidence in logs: `base_version < head_version` (not a no-op, not a skip)
3. Re-run discourse #4 or an equivalent version-bump PR and confirm it is UNAFFECTED (canonical≠head path still uses canonical)
4. Spot-check 1 other recipe

---
## Adversary findings
(empty; phase not yet started)

---
## Break-it probes log
(none yet)