plan(canon): enroll ALL recipes, weekly cadence, and REQUIRE proof it plays nice with samever
Operator 2026-06-17: all recipes WARM_CANONICAL (watch warm-volume disk), weekly timer, and an explicit M2 requirement to prove the sweep<->samever interaction. Skip uses COMMIT equality; samever uses VERSION equality. M2 must demonstrate all 3 sweep paths (skip / version-bump upgrade / same-version-label step-back) and the commit-vs-version boundary.
@@ -12,9 +12,11 @@ nightly sweep, with two additions the operator wants:
 2. **Skip a recipe whose `main` is unchanged** vs its current canonical (no rerun needed).

 …then **run CI cold-on-`main` for each recipe and actually promote the canonical for any that pass** —
-and **prove the whole thing works**. **The schedule/cadence is NOT the point** (nightly vs weekly is
-trivially retunable later) — *correctness of the machinery, verified end-to-end,* is the deliverable.
-Keep the existing nightly timer slot; this REPLACES the hollow sweep, it is not a parallel job.
+and **prove the whole thing works**. **The deliverable is correctness, verified end-to-end** — and the
+operator specifically wants confidence it **plays nicely with the `samever` upgrade-base work** (§2
+"Plays-nice-with-samever"). Operator decisions (2026-06-17): **all recipes enrolled** (§2.B), and the
+**cadence is weekly** (change the existing daily timer to weekly — a one-line `OnCalendar` tune; exact
+day/time is not critical). This REPLACES the hollow nightly sweep; it is not a parallel job.

 State files: `STATUS-canon.md`, `BACKLOG-canon.md`, `REVIEW-canon.md`, `JOURNAL-canon.md`. DECISIONS.md shared.

@@ -37,15 +39,16 @@ subsequent `--quick` warm-reattach uses it (`deploy_canonical` reattaches the re
 doesn't happen today, find and fix why (this is the real defect behind the hollow sweep). A canonical
 must demonstrably exist and be reusable before anything else is meaningful.

-**B. Realize "promote the canonical for any recipe that passes."** Today only custom-html is enrolled, so
-"each recipe" is vacuous. Decide + document the **enrollment scope**:
-- Default intent (operator): broaden `WARM_CANONICAL` so the sweep tracks the real recipe set — at
-minimum prove it across **several** recipes, not one.
-- **Flag the resource cost:** each warm canonical retains a data volume on the single node — enrolling
-everything has a disk budget. If "promote for all that pass" is wanted but warm-volumes-for-all is too
-much disk, consider **decoupling** the cheap *version record* (last-green {version,commit}, promote for
-all) from the expensive *warm volume* (retain selectively). Surface this in DECISIONS; get the
-operator's enrollment set rather than silently enrolling all.
+**B. Enroll ALL recipes (operator decision 2026-06-17).** Set `WARM_CANONICAL = True` for **every** recipe
+cc-ci tracks (the `used-recipes.md` set) — the sweep promotes a canonical for each that passes, not just
+custom-html.
+- **Watch the warm-volume disk budget:** ~21 recipes each retaining a data volume on the single node is
+real disk. Verify headroom, lean on the existing WC8 disk-hygiene / `ci-docker-prune`, and if disk
+becomes the binding limit, **raise it** rather than silently dropping recipes (a fallback if needed:
+decouple the cheap last-green *version record* — kept for all — from the expensive retained *volume*).
+Default remains all-enrolled.
+- If a specific recipe genuinely cannot be enrolled (e.g. unbounded data, no stable health), record the
+exception + reason in DECISIONS — don't silently skip it.

 **C. Add the upstream mirror-sync step.** Before the per-recipe CI, reconcile each mirror's `main` + tags
 to coopcloud upstream — reuse `recipe-upgrade`'s `open-recipe-pr.sh <recipe> --reconcile-only` (handles
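Reviewer sketch for the "verify headroom" step in item B: a minimal pre-sweep check that all retained warm volumes fit on the node. Only the ~21 recipe count comes from the plan. The function name `volumes_fit`, the 2 GiB per-volume estimate, the 20 GiB reserve, and the assumption that volumes live on the root filesystem are all hypothetical and would need real measurements.

```python
import shutil

GIB = 1024 ** 3


def volumes_fit(free_bytes: int, recipe_count: int,
                est_volume_bytes: int, reserve_bytes: int) -> bool:
    """Return True if every retained volume fits while leaving reserve_bytes free."""
    needed = recipe_count * est_volume_bytes + reserve_bytes
    return needed <= free_bytes


if __name__ == "__main__":
    # Assumption: warm volumes are stored on the root filesystem.
    free = shutil.disk_usage("/").free
    ok = volumes_fit(free, recipe_count=21,
                     est_volume_bytes=2 * GIB,   # hypothetical per-recipe estimate
                     reserve_bytes=20 * GIB)     # hypothetical safety reserve
    print("headroom ok" if ok else "disk is the binding limit: raise it in DECISIONS")
```

A check of this shape fails loudly when disk becomes the binding limit, which matches the plan's intent to raise the issue instead of silently dropping recipes.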
@@ -59,9 +62,24 @@ ask and is also the determinism property (see M2 run-twice proof).
 **E. Keep it deterministic + AI-free at runtime** (it already is — a script + timer). The additions must
 stay pure code: no AI calls during the run. AI (the loops) only authors + verifies.

-**Note — exercises `samever`:** the sweep's cold-on-latest upgrade tier hits the same-version case as its
-steady state (canonical == latest after a promotion); this phase is also a real-world validation that the
-`samever` step-back (previous-published base) behaves under the sweep.
+**F. Make the timer weekly** (operator preference): change the existing daily `OnCalendar` to weekly. The
+exact day/time is not critical — pick a low-traffic slot; it's a one-line tune. `Persistent = true` to
+catch up a missed run. This is the only schedule work; do not over-invest in it.
+
+**Plays-nice-with-`samever` (operator wants this CONFIRMED, not assumed).** In the sweep, two distinct
+guards keep the upgrade tier from a vacuous same-version run — and they use **deliberately different
+keys**, so verify them together:
+- **skip-when-unchanged uses COMMIT equality** (`main` commit == canonical commit) → if literally
+nothing changed, the recipe is skipped *before* the upgrade tier runs. This is the primary
+same-version avoider in the sweep.
+- **`samever` uses VERSION equality** → for the case where `main` *changed* (new commit) but the version
+LABEL still equals the canonical's version (a non-version-bump recipe change), the upgrade tier's base
+would equal the head version, and `samever` steps back to the previous published version so a real
+delta is still tested.
+These are complementary: commit-equality (skip) is the coarse filter; version-equality (`samever`) is
+the backstop for "changed commit, same version label." Together the sweep must **never** run a vacuous
+`vX → vX` upgrade **and never** wrongly skip a real change. M2 must prove all three sweep paths
+explicitly (see Gates).

 ## 3. Gates

@@ -79,12 +97,23 @@ exist with correct version+commit), red recipes left intact, unchanged recipes s
 per-recipe results log. **Determinism proof: run the sweep a SECOND time immediately → it SKIPS every
 recipe** (all `main` == the canonicals just promoted) = a clean no-op, no CI rerun. Confirm the **deployed
 timer fires the real (non-hollow) job** — after a fire, canonicals have advanced (evidence), not exit-0
-on an empty set. No AI in the loop. Fresh Adversary PASS on both milestones → `## DONE`.
+on an empty set.
+
+**`samever` interaction proven (operator-required).** Demonstrate, with evidence, all three sweep paths:
+(1) `main` commit == canonical commit → recipe SKIPPED (no upgrade tier run); (2) `main` changed +
+version bumped → upgrade tier runs `canonical(older) → head(new)`, a real delta; (3) `main` changed but
+version label == canonical version → `samever` steps back to the previous published version (base version
+< head version), NOT a vacuous `vX→vX` and NOT skipped. Confirm the boundary explicitly: a
+same-version-label-but-different-commit recipe is **not** skipped (commits differ) and **does** hit the
+`samever` step-back. Construct the scenarios if the natural recipe set doesn't cover all three.
+
+No AI in the loop. Fresh Adversary PASS on both milestones → `## DONE`.

 ## 4. Guardrails

-- **Correctness over cadence.** The schedule is incidental; do not spend effort tuning nightly-vs-weekly.
-The bar is: the machinery *demonstrably promotes canonicals, syncs mirrors, and skips unchanged.*
+- **Correctness over cadence.** The bar is the machinery *demonstrably promotes canonicals, syncs mirrors,
+skips unchanged, and plays nicely with `samever`.* The cadence is decided (**weekly**) — set it in one
+`OnCalendar` line and move on; don't agonize over the exact slot.
 - **No AI at runtime** — pure script + systemd timer; AI only builds/verifies.
 - **Single-node safety:** serial; skip the whole run if a Drone/test build is in flight (reuse the
 existing nightly guard); tear down every deploy; bound total runtime; mind the warm-volume disk budget.
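Reviewer sketch of the three sweep paths required by the Gates text: the decision reduces to a commit comparison followed by a version-label comparison. This is a minimal illustration of that ordering, not the sweep's actual code. The `Ref` type, the `sweep_path` function, the path names, and the sample commits and versions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Ref:
    """A (commit, version label) pair; hypothetical stand-in for a canonical record."""
    commit: str
    version: str


def sweep_path(main: Ref, canonical: Ref,
               previous_published: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (path, upgrade_base_version) for one recipe."""
    # Guard 1: skip keys on COMMIT equality and runs before the upgrade tier.
    if main.commit == canonical.commit:
        return ("skip", None)
    # Commits differ and the label moved: the canonical is a genuinely older base.
    if main.version != canonical.version:
        return ("upgrade", canonical.version)
    # Guard 2: commits differ but the label is unchanged, so samever steps back.
    if previous_published is None:
        raise ValueError("same version label and no previous published version")
    return ("samever-step-back", previous_published)


canon = Ref(commit="aaa111", version="1.2.0")
print(sweep_path(Ref("aaa111", "1.2.0"), canon, "1.1.0"))  # path 1: unchanged
print(sweep_path(Ref("bbb222", "1.3.0"), canon, "1.2.0"))  # path 2: version bump
print(sweep_path(Ref("ccc333", "1.2.0"), canon, "1.1.0"))  # path 3: same label
```

Checking commits first is what makes the boundary case work: a recipe with the same version label but a different commit falls through the skip guard and reaches the step-back, so it is neither skipped nor tested as a vacuous same-version upgrade.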