plan: queue canon — make the canonical sweep actually work (substitute for hollow nightly sweep)
Operator 2026-06-17. The nightly-sweep timer fires green but is a no-op: only custom-html is WARM_CANONICAL and zero canonical.json records exist -> no canonical has ever been promoted end-to-end. canon makes it real + proven: fix/prove the promote path, broaden enrollment, add upstream mirror-sync + skip-when-unchanged, verify end-to-end (incl. run-twice no-op). Schedule is incidental; correctness is the deliverable. Replaces the hollow sweep. opus.
@@ -160,4 +160,6 @@ phases = [
{ id = "regall", plan = "plan-phase-regall-recipe-regression.md", status = "STATUS-regall.md", models = { builder = "claude-sonnet-4-6", adversary = "claude-sonnet-4-6" } },
# same-version upgrade-base gap: step back to newest-older-published when last-green==head (opus, design A; B in IDEAS) — see plan-phase-samever-*.md (operator 2026-06-17)
{ id = "samever", plan = "plan-phase-samever-older-base-fallback.md", status = "STATUS-samever.md", models = { builder = "claude-opus-4-8", adversary = "claude-opus-4-8" } },
# make the canonical sweep ACTUALLY work (substitute for the hollow nightly sweep) + upstream-sync + skip-unchanged; verify end-to-end (opus) — see plan-phase-canon-*.md (operator 2026-06-17)
{ id = "canon", plan = "plan-phase-canon-canonical-sweep.md", status = "STATUS-canon.md", models = { builder = "claude-opus-4-8", adversary = "claude-opus-4-8" } },
]

107
cc-ci-plan/plan-phase-canon-canonical-sweep.md
Normal file
@@ -0,0 +1,107 @@
# Phase `canon` — make the canonical sweep actually work (the real "nightly sweep") + verify it

**Mission (operator-specified 2026-06-17):** the "nightly sweep" was specified in theory but **was never
actually doing anything** — confirmed live: `nightly-sweep.timer` is deployed and fires green
(`nightly_sweep.py`, last run 2026-06-17 03:09 UTC exit 0), but **only `custom-html` is `WARM_CANONICAL`
-enrolled and ZERO `canonical.json` records exist** — i.e. the machinery has **never actually promoted a
canonical end-to-end**. This phase makes it **real and proven**, as the **substitute for** that hollow
nightly sweep, with two additions the operator wants:

1. **Sync each recipe mirror's `main`** on `git.autonomic.zone/recipe-maintainers/<recipe>` to its
**upstream** (`git.coopcloud.tech/coop-cloud/<recipe>`) first, so the sweep tests true upstream latest.
2. **Skip a recipe whose `main` is unchanged** vs its current canonical (no rerun needed).

…then **run CI cold-on-`main` for each recipe and actually promote the canonical for any that pass** —
and **prove the whole thing works**. **The schedule/cadence is NOT the point** (nightly vs weekly is
trivially retunable later) — *correctness of the machinery, verified end-to-end,* is the deliverable.
Keep the existing nightly timer slot; this REPLACES the hollow sweep, it is not a parallel job.

State files: `STATUS-canon.md`, `BACKLOG-canon.md`, `REVIEW-canon.md`, `JOURNAL-canon.md`. DECISIONS.md shared.

## 1. Verified starting state (2026-06-17)

- `nightly-sweep.timer` enabled + active (next ~03:00 UTC); `nightly_sweep.py` runs and exits 0. The
timer/service plumbing already works — **reuse it, don't rebuild it.**
- **Only `custom-html` sets `WARM_CANONICAL = True`.** The sweep iterates `canonical.enrolled_recipes()`
→ essentially one recipe → near-no-op across the fleet.
- **No `canonical.json` exists** on the host → the promote path (`should_promote_canonical` →
`promote_canonical` → `write_registry`) has **never successfully produced a canonical**, even for
custom-html. This is the crux of "theory, not actually doing it."
- The sweep does **not** reconcile mirrors to upstream, and does **not** skip-when-unchanged.

## 2. The work

**A. Prove + fix the promote path FIRST (the core).** On `custom-html` (already enrolled), make a green
cold-on-latest run **actually write `canonical.json`** (recipe/version/commit/status) AND prove a
subsequent `--quick` warm-reattach uses it (`deploy_canonical` reattaches the retained volume). If it
doesn't happen today, find and fix why (this is the real defect behind the hollow sweep). A canonical
must demonstrably exist and be reusable before anything else is meaningful.

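As an illustration of what item A asks the promote path to produce, here is a minimal sketch of writing and reading back a `canonical.json` record. The field names follow the recipe/version/commit/status list above; the function names, the per-recipe file location, and the `"green"` status value are assumptions for illustration and are not the signatures of the existing `promote_canonical` / `write_registry` code.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def write_canonical_record(path: Path, recipe: str, version: str,
                           commit: str, status: str) -> dict:
    """Atomically write a canonical record (hypothetical schema and helper)."""
    record = {"recipe": recipe, "version": version,
              "commit": commit, "status": status}
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target,
    # so an interrupted run never leaves a truncated canonical.json behind.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=".canonical-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(record, fh, indent=2, sort_keys=True)
            fh.write("\n")
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return record


def read_canonical_record(path: Path) -> Optional[dict]:
    """Return the stored record, or None if nothing has been promoted yet."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return None
```

A check in this shape ("the file is absent before the green run, present and readable after it") is one way to make the "demonstrably exists" requirement testable without touching a real deploy.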
**B. Realize "promote the canonical for any recipe that passes."** Today only custom-html is enrolled, so
"each recipe" is vacuous. Decide + document the **enrollment scope**:

- Default intent (operator): broaden `WARM_CANONICAL` so the sweep tracks the real recipe set — at
minimum prove it across **several** recipes, not one.
- **Flag the resource cost:** each warm canonical retains a data volume on the single node — enrolling
everything has a disk cost. If "promote for all that pass" is wanted but warm-volumes-for-all is too
much disk, consider **decoupling** the cheap *version record* (last-green {version,commit}, promote for
all) from the expensive *warm volume* (retain selectively). Surface this in DECISIONS; get the
operator's enrollment set rather than silently enrolling all.

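The decoupling option in item B can be pictured as two independent facts per recipe: a cheap last-green record and a separate warm-volume flag. The sketch below is only one possible shape under that assumption; `LastGreen`, `RecipeState`, and `plan_promotion` are hypothetical names, and `recipe-b` / `recipe-c` are placeholder recipes.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class LastGreen:
    """Cheap version record: recorded for every recipe that passes."""
    version: str
    commit: str


@dataclass(frozen=True)
class RecipeState:
    recipe: str
    last_green: Optional[LastGreen]
    warm_volume: bool  # expensive: only for the operator-chosen subset


def plan_promotion(recipe: str, version: str, commit: str, passed: bool,
                   warm_enrolled: FrozenSet[str],
                   prior: Optional[RecipeState] = None) -> RecipeState:
    """Decide the next state for one recipe after a cold run (illustrative)."""
    if not passed:
        # A red run never overwrites the prior known-good state.
        return prior if prior is not None else RecipeState(recipe, None, False)
    return RecipeState(recipe, LastGreen(version, commit),
                       warm_volume=recipe in warm_enrolled)
```

With this split, the enrollment decision recorded in DECISIONS only has to name the warm set; the version record can advance for everything that passes.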
**C. Add the upstream mirror-sync step.** Before the per-recipe CI, reconcile each mirror's `main` + tags
to coopcloud upstream — reuse `recipe-upgrade`'s `open-recipe-pr.sh <recipe> --reconcile-only` (handles
go-git private-mirror auth, fetches coopcloud via an `upstream` remote, closes already-merged-upstream
PRs, leaves unrelated PRs). This is a **faithful mirror sync, not a push of our own changes.**

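A per-recipe sync step along the lines of item C might look like the following sketch. The command line `open-recipe-pr.sh <recipe> --reconcile-only` is taken from the text above; everything else is an assumption: that the script is reachable on `PATH`, that exit code 0 means the mirror was reconciled, and the 600-second timeout. The `run` parameter exists only so the step can be unit-tested without invoking the real script.

```python
import subprocess
from typing import Callable, Dict, Iterable, List

# Assumed to be on PATH; the script's real location is not stated in this plan.
RECONCILE_SCRIPT = "open-recipe-pr.sh"


def reconcile_command(recipe: str) -> List[str]:
    """Build the reconcile-only invocation for one recipe."""
    return [RECONCILE_SCRIPT, recipe, "--reconcile-only"]


def sync_mirrors(recipes: Iterable[str],
                 run: Callable = subprocess.run) -> Dict[str, bool]:
    """Reconcile each mirror serially; map recipe -> True if the sync succeeded."""
    results: Dict[str, bool] = {}
    for recipe in recipes:
        try:
            proc = run(reconcile_command(recipe),
                       capture_output=True, text=True, timeout=600)
            # Assumption: a zero exit code means main + tags now match upstream.
            results[recipe] = proc.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            results[recipe] = False
    return results
```

One plausible policy is to exclude a recipe whose sync failed from that night's CI, since a run against a stale mirror would not be testing true upstream latest; whether to do that is a decision for the phase, not something this sketch settles.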
**D. Add skip-when-unchanged.** After sync, if the recipe's `main` commit == its canonical record's commit
(no change since the last promotion) → **skip** (log `SKIP unchanged`). This is the operator's efficiency
ask and is also the determinism property (see M2 run-twice proof).

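The skip rule in item D reduces to a commit-equality check, sketched below. The helper names are hypothetical, and the sketch assumes both sides are full commit hashes in the same format (it normalizes only whitespace and case). A recipe with no canonical record is never skipped, since there is nothing to compare against.

```python
from typing import Optional


def should_skip(main_commit: str, canonical_commit: Optional[str]) -> bool:
    """True when main has not moved since the last promotion."""
    if not canonical_commit:
        # No canonical yet: the recipe must run at least once.
        return False
    return main_commit.strip().lower() == canonical_commit.strip().lower()


def sweep_decision(main_commit: str, canonical_commit: Optional[str]) -> str:
    """Return the log token for this recipe: 'SKIP unchanged' or 'RUN'."""
    return "SKIP unchanged" if should_skip(main_commit, canonical_commit) else "RUN"
```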
**E. Keep it deterministic + AI-free at runtime** (it already is — a script + timer). The additions must
stay pure code: no AI calls during the run. AI (the loops) only authors + verifies.

**Note — exercises `samever`:** the sweep's cold-on-latest upgrade tier hits the same-version case as its
steady state (canonical == latest after a promotion); this phase is also a real-world validation that the
`samever` step-back (previous-published base) behaves under the sweep.

## 3. Gates

**M1 — machinery works locally, each piece proven.** (A) a real `canonical.json` is produced by a green
cold run on ≥1 recipe and reused by a warm reattach — **demonstrated, not assumed**. (C) mirror-sync and
(D) skip-when-unchanged implemented, reusing the existing reconcile + sweep code, with unit tests
(skip = commit-equality; sync invoked per recipe; promote still gated on green+cold+latest+enrolled).
(B) enrollment scope decided + recorded (≥several recipes enrolled, or the decouple decision). Adversary
cold-verifies: a canonical actually exists + reattaches; skip-logic correct; sync is faithful-mirror-only;
a RED recipe does NOT promote (prior known-good intact); no AI at runtime.

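The promote gate named in M1 (green + cold + latest + enrolled) can be expressed as a single conjunction, which makes it straightforward to unit-test exhaustively. This is a sketch only: `RunResult` and `may_promote` are hypothetical names and do not describe the real `should_promote_canonical` signature.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class RunResult:
    recipe: str
    green: bool      # every test passed
    cold: bool       # deployed from scratch, not a warm reattach
    on_latest: bool  # ran against the latest synced main


def may_promote(result: RunResult, enrolled: FrozenSet[str]) -> bool:
    """Promote only when all four conditions hold; anything else keeps the prior."""
    return (result.green and result.cold and result.on_latest
            and result.recipe in enrolled)
```

Because the gate is a pure function of four inputs, a test can cover every failing combination and confirm that a red recipe never promotes.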
**M2 — proven end-to-end in real CI (the heart of this phase).** A full sweep run across the enrolled set
on cc-ci: mirrors synced to upstream, **canonicals actually promoted for the green recipes** (records
exist with correct version+commit), red recipes left intact, unchanged recipes skipped — with a
per-recipe results log. **Determinism proof: run the sweep a SECOND time immediately → it SKIPS every
recipe** (all `main` == the canonicals just promoted) = a clean no-op, no CI rerun. Confirm the **deployed
timer fires the real (non-hollow) job** — after a fire, canonicals have advanced (evidence), not exit-0
on an empty set. No AI in the loop. Fresh Adversary PASS on both milestones → `## DONE`.

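The run-twice determinism proof in M2 can be rehearsed in memory before it is run on cc-ci. The sketch below models the sweep as a pure loop over recipe heads and canonical commits; `sweep`, the log tokens other than `SKIP unchanged`, and the dictionaries are illustrative assumptions, with `run_ci` standing in for the real cold CI run.

```python
from typing import Callable, Dict, List, Tuple


def sweep(heads: Dict[str, str], canon: Dict[str, str],
          run_ci: Callable[[str], bool]) -> List[Tuple[str, str]]:
    """One serial pass: skip unchanged, run CI on the rest, promote on green."""
    log: List[Tuple[str, str]] = []
    for recipe in sorted(heads):  # fixed order keeps the results log stable
        head = heads[recipe]
        if canon.get(recipe) == head:
            log.append((recipe, "SKIP unchanged"))
        elif run_ci(recipe):
            canon[recipe] = head  # promotion: canonical now points at main
            log.append((recipe, "PROMOTED"))
        else:
            log.append((recipe, "RED kept prior"))  # canonical left untouched
    return log
```

Under pure commit-equality, a recipe that went red in the first pass is not skipped in the second, because its canonical still points at the older commit. The "skips every recipe" proof therefore holds as stated when every enrolled recipe was promoted in the first pass.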
## 4. Guardrails

- **Correctness over cadence.** The schedule is incidental; do not spend effort tuning nightly-vs-weekly.
The bar is: the machinery *demonstrably promotes canonicals, syncs mirrors, and skips unchanged.*
- **No AI at runtime** — pure script + systemd timer; AI only builds/verifies.
- **Single-node safety:** serial; skip the whole run if a Drone/test build is in flight (reuse the
existing nightly guard); tear down every deploy; bound total runtime; mind the warm-volume disk budget.
- **Never force-promote / never weaken:** promote only on green-cold-latest-enrolled; a red recipe keeps
its prior known-good. Never weaken a test to make a recipe promote.
- **Faithful mirror sync only:** force-sync `main`/tags to coopcloud upstream; never push our own changes
to mirror `main`; never merge/disturb unrelated PRs.
- **Nix/host changes** (enrollment is recipe-meta; any timer/module tweak is a nixos-rebuild): loops may
deploy if clean and **verify host health after**; else file for the orchestrator. Commit author
`autonomic-bot <autonomic-bot@noreply.git.autonomic.zone>`; push every commit; abra over a pseudo-TTY.

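The single-node safety guardrail (serial, skip the whole run if a build is in flight, bounded total runtime) could be wrapped around the per-recipe work roughly as follows. This is a sketch under stated assumptions: `guarded_sweep`, the log strings, and the injected `build_in_flight` / `process` callables are hypothetical, and the existing nightly guard would take the place of `build_in_flight`.

```python
import time
from typing import Callable, Iterable, List, Tuple


def guarded_sweep(recipes: Iterable[str],
                  process: Callable[[str], str],
                  build_in_flight: Callable[[], bool],
                  budget_seconds: float,
                  clock: Callable[[], float] = time.monotonic
                  ) -> List[Tuple[str, str]]:
    """Run recipes one at a time, honouring the in-flight guard and a time budget."""
    if build_in_flight():
        # Another build owns the node: do nothing at all this run.
        return [("*", "SKIP run: build in flight")]
    deadline = clock() + budget_seconds
    log: List[Tuple[str, str]] = []
    for recipe in recipes:  # strictly serial; no parallel deploys
        if clock() >= deadline:
            log.append((recipe, "DEFERRED: runtime budget exhausted"))
            continue
        # process() is expected to tear down its own deploy (e.g. in a finally).
        log.append((recipe, process(recipe)))
    return log
```

The budget here is checked only between recipes, so it bounds when new work starts rather than interrupting a run already in progress; a hard per-recipe timeout would need to live inside `process`.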
## 5. Definition of Done

The canonical sweep **actually works and is proven**: a green cold-on-latest run produces a real,
reusable `canonical.json`; the sweep reconciles each recipe mirror's `main` to upstream, skips recipes
whose `main` is unchanged vs canonical, runs CI on the rest, and promotes the canonical for any that pass
— across a real multi-recipe set, demonstrated end-to-end in CI, including the run-twice no-op determinism
proof and a real (non-hollow) timer fire. Enrollment scope + the warm-volume budget decided/recorded; the
runtime job is AI-free; it is the substitute for the hollow nightly sweep (not a parallel job). M1 + M2
fresh Adversary PASSes in REVIEW-canon.md.