plan(canon): enroll ALL recipes, weekly cadence, and REQUIRE proof it plays nice with samever

Operator 2026-06-17: all recipes WARM_CANONICAL (watch warm-volume disk),
weekly timer, and an explicit M2 requirement to prove the sweep<->samever
interaction — skip uses COMMIT equality, samever uses VERSION equality; M2
must demonstrate all 3 sweep paths (skip / version-bump upgrade / same-version-
label step-back) and the commit-vs-version boundary.
2026-06-17 04:09:56 +00:00
parent 05e2635019
commit 9a81fe88e8


@ -12,9 +12,11 @@ nightly sweep, with two additions the operator wants:
2. **Skip a recipe whose `main` is unchanged** vs its current canonical (no rerun needed).
…then **run CI cold-on-`main` for each recipe and actually promote the canonical for any that pass**
and **prove the whole thing works**. **The schedule/cadence is NOT the point** (nightly vs weekly is
trivially retunable later) — *correctness of the machinery, verified end-to-end,* is the deliverable.
Keep the existing nightly timer slot; this REPLACES the hollow sweep, it is not a parallel job.
and **prove the whole thing works**. **The deliverable is correctness, verified end-to-end** — and the
operator specifically wants confidence it **plays nicely with the `samever` upgrade-base work** (§2
"Plays-nice-with-samever"). Operator decisions (2026-06-17): **all recipes enrolled** (§2.B), and the
**cadence is weekly** (change the existing daily timer to weekly — a one-line `OnCalendar` tune; exact
day/time is not critical). This REPLACES the hollow nightly sweep; it is not a parallel job.
State files: `STATUS-canon.md`, `BACKLOG-canon.md`, `REVIEW-canon.md`, `JOURNAL-canon.md`. DECISIONS.md shared.
@ -37,15 +39,16 @@ subsequent `--quick` warm-reattach uses it (`deploy_canonical` reattaches the re
doesn't happen today, find and fix why (this is the real defect behind the hollow sweep). A canonical
must demonstrably exist and be reusable before anything else is meaningful.
**B. Realize "promote the canonical for any recipe that passes."** Today only custom-html is enrolled, so
"each recipe" is vacuous. Decide + document the **enrollment scope**:
- Default intent (operator): broaden `WARM_CANONICAL` so the sweep tracks the real recipe set — at
minimum prove it across **several** recipes, not one.
- **Flag the resource cost:** each warm canonical retains a data volume on the single node — enrolling
everything has a disk budget. If "promote for all that pass" is wanted but warm-volumes-for-all is too
much disk, consider **decoupling** the cheap *version record* (last-green {version,commit}, promote for
all) from the expensive *warm volume* (retain selectively). Surface this in DECISIONS; get the
operator's enrollment set rather than silently enrolling all.
**B. Enroll ALL recipes (operator decision 2026-06-17).** Set `WARM_CANONICAL = True` for **every** recipe
cc-ci tracks (the `used-recipes.md` set) — the sweep promotes a canonical for each that passes, not just
custom-html.
- **Watch the warm-volume disk budget:** ~21 recipes each retaining a data volume on the single node is
real disk. Verify headroom, lean on the existing WC8 disk-hygiene / `ci-docker-prune`, and if disk
becomes the binding limit, **raise it** rather than silently dropping recipes (a fallback if needed:
decouple the cheap last-green *version record* — kept for all — from the expensive retained *volume*).
Default remains all-enrolled.
- If a specific recipe genuinely cannot be enrolled (e.g. unbounded data, no stable health), record the
exception + reason in DECISIONS — don't silently skip it.
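The "verify headroom" step above can be sketched as a small pre-flight check. This is a minimal illustration, not cc-ci's actual code: the function names, the 2 GiB per-volume estimate, and the 10 GiB reserve are all hypothetical placeholders; only the recipe count (~21) comes from the plan.

```python
import shutil

GIB = 1024 ** 3


def warm_volume_headroom(free_bytes, recipe_count, est_volume_bytes, reserve_bytes=0):
    """Bytes left over if every enrolled recipe retains one warm volume.

    A negative result means the all-enrolled default does not fit.
    """
    needed = recipe_count * est_volume_bytes + reserve_bytes
    return free_bytes - needed


def check_disk(path="/", recipe_count=21, est_volume_bytes=2 * GIB, reserve_bytes=10 * GIB):
    """Return (fits, headroom_bytes) for the filesystem holding `path`.

    est_volume_bytes and reserve_bytes are assumed figures for illustration;
    real values would have to be measured on the node.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    headroom = warm_volume_headroom(usage.free, recipe_count, est_volume_bytes, reserve_bytes)
    return headroom >= 0, headroom
```

If `check_disk` reports that the set does not fit, the plan's rule still applies: surface the shortfall rather than quietly dropping recipes.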
**C. Add the upstream mirror-sync step.** Before the per-recipe CI, reconcile each mirror's `main` + tags
to coopcloud upstream — reuse `recipe-upgrade`'s `open-recipe-pr.sh <recipe> --reconcile-only` (handles
@ -59,9 +62,24 @@ ask and is also the determinism property (see M2 run-twice proof).
**E. Keep it deterministic + AI-free at runtime** (it already is — a script + timer). The additions must
stay pure code: no AI calls during the run. AI (the loops) only authors + verifies.
**Note — exercises `samever`:** the sweep's cold-on-latest upgrade tier hits the same-version case as its
steady state (canonical == latest after a promotion); this phase is also a real-world validation that the
`samever` step-back (previous-published base) behaves under the sweep.
**F. Make the timer weekly** (operator preference): change the existing daily `OnCalendar` to weekly. The
exact day/time is not critical; pick a low-traffic slot (it's a one-line tune). Set `Persistent = true` so
a missed run is caught up. This is the only schedule work; do not over-invest in it.
**Plays-nice-with-`samever` (operator wants this CONFIRMED, not assumed).** In the sweep, two distinct
guards keep the upgrade tier from a vacuous same-version run — and they use **deliberately different
keys**, so verify them together:
- **skip-when-unchanged uses COMMIT equality** (`main` commit == canonical commit) → if literally
nothing changed, the recipe is skipped *before* the upgrade tier runs. This is the primary
same-version avoider in the sweep.
- **`samever` uses VERSION equality** → for the case where `main` *changed* (new commit) but the version
LABEL still equals the canonical's version (a non-version-bump recipe change), the upgrade tier's base
would equal the head version, and `samever` steps back to the previous published version so a real
delta is still tested.
These are complementary: commit-equality (skip) is the coarse filter; version-equality (`samever`) is
the backstop for "changed commit, same version label." Together the sweep must **never** run a vacuous
`vX → vX` upgrade **and never** wrongly skip a real change. M2 must prove all three sweep paths
explicitly (see Gates).
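The two guards described above can be sketched as one decision function. This is a hedged illustration of the commit-vs-version boundary only: `Ref`, `plan_recipe`, the action names, and the `published` list (assumed ordered oldest to newest) are hypothetical, and the `no-previous-version` branch is an assumption, since the plan does not say what happens when no earlier published version exists.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Ref:
    """A (version label, commit) pair, for either `main` or the canonical."""
    version: str
    commit: str


def plan_recipe(main: Ref, canonical: Ref, published: Sequence[str]) -> Tuple[str, Optional[str]]:
    """Return (action, upgrade_base_version) for one recipe."""
    # Guard 1, COMMIT equality: nothing changed, skip before the upgrade tier.
    if main.commit == canonical.commit:
        return ("skip", None)
    # Commit changed and the version label was bumped: the canonical is a real base.
    if main.version != canonical.version:
        return ("upgrade", canonical.version)
    # Guard 2, VERSION equality: commit changed but the label did not, so step
    # back to the previously published version to keep a real delta.
    idx = list(published).index(main.version) if main.version in published else -1
    if idx > 0:
        return ("samever-step-back", published[idx - 1])
    # Assumption: no earlier published version to step back to.
    return ("no-previous-version", None)
```

Note that the function never returns an upgrade whose base equals the head version, and it only skips on equal commits, which are the two properties the paragraph above asks for.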
## 3. Gates
@ -79,12 +97,23 @@ exist with correct version+commit), red recipes left intact, unchanged recipes s
per-recipe results log. **Determinism proof: run the sweep a SECOND time immediately → it SKIPS every
recipe** (all `main` == the canonicals just promoted) = a clean no-op, no CI rerun. Confirm the **deployed
timer fires the real (non-hollow) job** — after a fire, canonicals have advanced (evidence), not exit-0
on an empty set. No AI in the loop. Fresh Adversary PASS on both milestones → `## DONE`.
on an empty set.
**`samever` interaction proven (operator-required).** Demonstrate, with evidence, all three sweep paths:
(1) `main` commit == canonical commit → recipe SKIPPED (no upgrade tier run); (2) `main` changed +
version bumped → upgrade tier runs `canonical(older) → head(new)`, a real delta; (3) `main` changed but
version label == canonical version → `samever` steps back to the previous published version (base version
< head version), NOT a vacuous `vX→vX` and NOT skipped. Confirm the boundary explicitly: a
same-version-label-but-different-commit recipe is **not** skipped (commits differ) and **does** hit the
`samever` step-back. Construct the scenarios if the natural recipe set doesn't cover all three.
No AI in the loop. Fresh Adversary PASS on both milestones → `## DONE`.
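The run-twice determinism proof in this gate can be modelled with a toy in-memory sweep. This is a sketch under stated assumptions, not the real script: `sweep`, its arguments, and the result labels are hypothetical, and `ci_passes` is a stand-in for the cold-on-`main` CI run.

```python
def sweep(mains, canonicals, ci_passes):
    """One serial sweep pass over in-memory state.

    mains:      recipe -> commit currently on `main`
    canonicals: recipe -> commit of the promoted canonical (mutated in place)
    ci_passes:  callable(recipe) -> bool, stand-in for the cold CI run
    Returns recipe -> "skipped" | "promoted" | "red".
    """
    results = {}
    for recipe in sorted(mains):  # stable order, one recipe at a time
        commit = mains[recipe]
        if canonicals.get(recipe) == commit:
            results[recipe] = "skipped"    # unchanged: no CI rerun
        elif ci_passes(recipe):
            canonicals[recipe] = commit    # green: promote the canonical
            results[recipe] = "promoted"
        else:
            results[recipe] = "red"        # red: canonical left intact
    return results


mains = {"custom-html": "c1", "other-recipe": "c2"}
canonicals = {"custom-html": "c0"}
first = sweep(mains, canonicals, lambda recipe: True)
second = sweep(mains, canonicals, lambda recipe: True)
```

With every recipe green, `first` promotes both and `second` skips both, which is the clean no-op the gate asks for. In this sketch a red recipe would not be skipped on the second pass, because its `main` commit still differs from its canonical; the plan does not state how the real sweep should treat that case.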
## 4. Guardrails
- **Correctness over cadence.** The schedule is incidental; do not spend effort tuning nightly-vs-weekly.
The bar is: the machinery *demonstrably promotes canonicals, syncs mirrors, and skips unchanged.*
- **Correctness over cadence.** The bar is the machinery *demonstrably promotes canonicals, syncs mirrors,
skips unchanged, and plays nicely with `samever`.* The cadence is decided (**weekly**): set it in one
`OnCalendar` line and move on; don't agonize over the exact slot.
- **No AI at runtime:** pure script + systemd timer; AI only builds/verifies.
- **Single-node safety:** serial; skip the whole run if a Drone/test build is in flight (reuse the
existing nightly guard); tear down every deploy; bound total runtime; mind the warm-volume disk budget.