From 9a81fe88e889d333fc57f0559e4efaeef9d516bd Mon Sep 17 00:00:00 2001
From: autonomic-bot
Date: Wed, 17 Jun 2026 04:09:56 +0000
Subject: [PATCH] plan(canon): enroll ALL recipes, weekly cadence, and REQUIRE proof it plays nice with samever
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Operator 2026-06-17: all recipes WARM_CANONICAL (watch warm-volume disk),
weekly timer, and an explicit M2 requirement to prove the sweep<->samever
interaction — skip uses COMMIT equality, samever uses VERSION equality; M2
must demonstrate all 3 sweep paths (skip / version-bump upgrade /
same-version-label step-back) and the commit-vs-version boundary.
---
 .../plan-phase-canon-canonical-sweep.md | 65 ++++++++++++++-----
 1 file changed, 47 insertions(+), 18 deletions(-)

diff --git a/cc-ci-plan/plan-phase-canon-canonical-sweep.md b/cc-ci-plan/plan-phase-canon-canonical-sweep.md
index bb0a5d7..befae06 100644
--- a/cc-ci-plan/plan-phase-canon-canonical-sweep.md
+++ b/cc-ci-plan/plan-phase-canon-canonical-sweep.md
@@ -12,9 +12,11 @@ nightly sweep, with two additions the operator wants:
 2. **Skip a recipe whose `main` is unchanged** vs its current canonical (no rerun needed).
 
 …then **run CI cold-on-`main` for each recipe and actually promote the canonical for any that pass** —
-and **prove the whole thing works**. **The schedule/cadence is NOT the point** (nightly vs weekly is
-trivially retunable later) — *correctness of the machinery, verified end-to-end,* is the deliverable.
-Keep the existing nightly timer slot; this REPLACES the hollow sweep, it is not a parallel job.
+and **prove the whole thing works**. **The deliverable is correctness, verified end-to-end** — and the
+operator specifically wants confidence it **plays nicely with the `samever` upgrade-base work** (§2
+"Plays-nice-with-samever"). Operator decisions (2026-06-17): **all recipes enrolled** (§2.B), and the
+**cadence is weekly** (change the existing daily timer to weekly — a one-line `OnCalendar` tune; exact
+day/time is not critical). This REPLACES the hollow nightly sweep; it is not a parallel job.
 
 State files: `STATUS-canon.md`, `BACKLOG-canon.md`, `REVIEW-canon.md`, `JOURNAL-canon.md`. DECISIONS.md shared.
 
@@ -37,15 +39,16 @@ subsequent `--quick` warm-reattach uses it (`deploy_canonical` reattaches the re
 doesn't happen today, find and fix why (this is the real defect behind the hollow sweep). A canonical must
 demonstrably exist and be reusable before anything else is meaningful.
 
-**B. Realize "promote the canonical for any recipe that passes."** Today only custom-html is enrolled, so
-"each recipe" is vacuous. Decide + document the **enrollment scope**:
-  - Default intent (operator): broaden `WARM_CANONICAL` so the sweep tracks the real recipe set — at
-    minimum prove it across **several** recipes, not one.
-  - **Flag the resource cost:** each warm canonical retains a data volume on the single node — enrolling
-    everything has a disk budget. If "promote for all that pass" is wanted but warm-volumes-for-all is too
-    much disk, consider **decoupling** the cheap *version record* (last-green {version,commit}, promote for
-    all) from the expensive *warm volume* (retain selectively). Surface this in DECISIONS; get the
-    operator's enrollment set rather than silently enrolling all.
+**B. Enroll ALL recipes (operator decision 2026-06-17).** Set `WARM_CANONICAL = True` for **every** recipe
+cc-ci tracks (the `used-recipes.md` set) — the sweep promotes a canonical for each that passes, not just
+custom-html.
+  - **Watch the warm-volume disk budget:** ~21 recipes each retaining a data volume on the single node is
+    real disk. Verify headroom, lean on the existing WC8 disk-hygiene / `ci-docker-prune`, and if disk
+    becomes the binding limit, **raise it** rather than silently dropping recipes (a fallback if needed:
+    decouple the cheap last-green *version record* — kept for all — from the expensive retained *volume*).
+    Default remains all-enrolled.
+  - If a specific recipe genuinely cannot be enrolled (e.g. unbounded data, no stable health), record the
+    exception + reason in DECISIONS — don't silently skip it.
 
 **C. Add the upstream mirror-sync step.** Before the per-recipe CI, reconcile each mirror's `main` + tags
 to coopcloud upstream — reuse `recipe-upgrade`'s `open-recipe-pr.sh --reconcile-only` (handles
@@ -59,9 +62,24 @@ ask and is also the determinism property (see M2 run-twice proof).
 **E. Keep it deterministic + AI-free at runtime** (it already is — a script + timer). The additions must
 stay pure code: no AI calls during the run. AI (the loops) only authors + verifies.
 
-**Note — exercises `samever`:** the sweep's cold-on-latest upgrade tier hits the same-version case as its
-steady state (canonical == latest after a promotion); this phase is also a real-world validation that the
-`samever` step-back (previous-published base) behaves under the sweep.
+**F. Make the timer weekly** (operator preference): change the existing daily `OnCalendar` to weekly. The
+exact day/time is not critical — pick a low-traffic slot; it's a one-line tune. `Persistent = true` to
+catch up a missed run. This is the only schedule work; do not over-invest in it.
+
+**Plays-nice-with-`samever` (operator wants this CONFIRMED, not assumed).** In the sweep, two distinct
+guards keep the upgrade tier from a vacuous same-version run — and they use **deliberately different
+keys**, so verify them together:
+  - **skip-when-unchanged uses COMMIT equality** (`main` commit == canonical commit) → if literally
+    nothing changed, the recipe is skipped *before* the upgrade tier runs. This is the primary
+    same-version avoider in the sweep.
+  - **`samever` uses VERSION equality** → for the case where `main` *changed* (new commit) but the version
+    LABEL still equals the canonical's version (a non-version-bump recipe change), the upgrade tier's base
+    would equal the head version, and `samever` steps back to the previous published version so a real
+    delta is still tested.
+  These are complementary: commit-equality (skip) is the coarse filter; version-equality (`samever`) is
+  the backstop for "changed commit, same version label." Together the sweep must **never** run a vacuous
+  `vX → vX` upgrade **and never** wrongly skip a real change. M2 must prove all three sweep paths
+  explicitly (see Gates).
 
 ## 3. Gates
 
@@ -79,12 +97,23 @@ exist with correct version+commit), red recipes left intact, unchanged recipes s
 per-recipe results log. **Determinism proof: run the sweep a SECOND time immediately → it SKIPS every
 recipe** (all `main` == the canonicals just promoted) = a clean no-op, no CI rerun. Confirm the **deployed
 timer fires the real (non-hollow) job** — after a fire, canonicals have advanced (evidence), not exit-0
-on an empty set. No AI in the loop. Fresh Adversary PASS on both milestones → `## DONE`.
+on an empty set.
+
+**`samever` interaction proven (operator-required).** Demonstrate, with evidence, all three sweep paths:
+(1) `main` commit == canonical commit → recipe SKIPPED (no upgrade tier run); (2) `main` changed +
+version bumped → upgrade tier runs `canonical(older) → head(new)`, a real delta; (3) `main` changed but
+version label == canonical version → `samever` steps back to the previous published version (base version
+< head version), NOT a vacuous `vX→vX` and NOT skipped. Confirm the boundary explicitly: a
+same-version-label-but-different-commit recipe is **not** skipped (commits differ) and **does** hit the
+`samever` step-back. Construct the scenarios if the natural recipe set doesn't cover all three.
+
+No AI in the loop. Fresh Adversary PASS on both milestones → `## DONE`.
 
 ## 4. Guardrails
 
-- **Correctness over cadence.** The schedule is incidental; do not spend effort tuning nightly-vs-weekly.
-  The bar is: the machinery *demonstrably promotes canonicals, syncs mirrors, and skips unchanged.*
+- **Correctness over cadence.** The bar is the machinery *demonstrably promotes canonicals, syncs mirrors,
+  skips unchanged, and plays nicely with `samever`.* The cadence is decided (**weekly**) — set it in one
+  `OnCalendar` line and move on; don't agonize over the exact slot.
 - **No AI at runtime** — pure script + systemd timer; AI only builds/verifies.
 - **Single-node safety:** serial; skip the whole run if a Drone/test build is in flight (reuse the
   existing nightly guard); tear down every deploy; bound total runtime; mind the warm-volume disk budget.
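
The three sweep paths that the plan asks M2 to demonstrate (§2 "Plays-nice-with-`samever`", §3) can be sketched as a small decision function. This is only an illustrative sketch, not the sweep's actual code: `Ref`, `SweepPath`, `classify`, and `upgrade_base` are hypothetical names, version labels are treated as opaque strings, and the first-run case (no canonical yet) is deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SweepPath(Enum):
    """The three paths named in the plan (hypothetical enum, for illustration)."""
    SKIP = "skip"                            # nothing changed, no upgrade tier run
    UPGRADE = "upgrade"                      # canonical(older) -> head(new)
    SAMEVER_STEP_BACK = "samever-step-back"  # new commit, same version label


@dataclass(frozen=True)
class Ref:
    """A {version, commit} pair, as recorded for a canonical or read from `main`."""
    version: str
    commit: str


def classify(canonical: Ref, head: Ref) -> SweepPath:
    # Guard 1 (coarse filter): COMMIT equality decides skipping.
    if head.commit == canonical.commit:
        return SweepPath.SKIP
    # Guard 2 (backstop): VERSION equality with a different commit
    # means the upgrade base must step back to avoid a vX -> vX run.
    if head.version == canonical.version:
        return SweepPath.SAMEVER_STEP_BACK
    return SweepPath.UPGRADE


def upgrade_base(canonical: Ref, head: Ref,
                 previous_published: Optional[str]) -> Optional[str]:
    """Return the version the upgrade tier should start from, or None if skipped."""
    path = classify(canonical, head)
    if path is SweepPath.SKIP:
        return None
    if path is SweepPath.SAMEVER_STEP_BACK:
        if previous_published is None:
            # Assumption: without an earlier published version there is no
            # real delta to test, so surface it instead of running vX -> vX.
            raise ValueError("no previous published version to step back to")
        return previous_published
    return canonical.version


if __name__ == "__main__":
    canon = Ref(version="1.0.0", commit="aaa")
    print(classify(canon, Ref("1.0.0", "aaa")).value)
    print(classify(canon, Ref("1.1.0", "bbb")).value)
    print(classify(canon, Ref("1.0.0", "bbb")).value)
```

Checking commit equality first mirrors the boundary the plan calls out: a recipe with the same version label but a different commit is not skipped and does reach the step-back branch.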