M1 (machinery works locally, each piece proven) — code HEAD d4cc9e4, unit suite 295 passed:
- M1.1 tagged-promote gate + promote-tested-version: live proof-A wrote a fresh canonical
(commit df2e273 = the tag commit, correcting samever's main-HEAD 2b82eba); live proof-C
green-untagged → 0 promotes, canonical byte-identical (tagged-gate blocks untagged).
- M1.2 sweep_decision (version-keyed trigger) + vendored faithful recipe-mirror-sync.sh
(smoke-tested: faithful no-op main/tags push, closed merged-upstream PR #2, left PR #5);
nightly_sweep rewritten (mirror_sync -> trigger -> run_on_tag). Live SKIP demo on custom-html.
- M1.3 all 21 used-recipes enrolled. M1.4 hollow-sweep fix (CCCI_REPO=/etc/cc-ci). M1.5 weekly timer.
- M1(A) reattach: live proof-B --quick reused the retained volume green; known-good unchanged.
Evidence + verify recipes in STATUS-canon.md; reasoning in JOURNAL-canon.md; DECISIONS appended.
Gate: M1 CLAIMED, awaiting Adversary.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
JOURNAL — phase canon (canonical sweep, make it real)
Builder reasoning log. WHY lives here; WHAT/HOW/EXPECTED/WHERE live in STATUS-canon.md.
2026-06-17 — bootstrap / code survey
Read the phase canon (plan-phase-canon-canonical-sweep.md) + plan.md §6.1/§7/§9. Surveyed the
existing canonical/sweep machinery before designing. Key findings:
Clone identity
/srv/cc-ci is a symlink → /srv/cc-ci-orch; the env's two "working dirs" are the same directory.
This IS the Builder clone (reflog shows the claim(M2)/status(samever) ## DONE commits). The
Adversary cold-verifies from its own fresh clones. No collision.
What already works (phase doc is partly stale)
- The phase doc says "ZERO canonical.json exist". Not true any more: a real canonical for
custom-html exists on the host at /var/lib/ci-warm/custom-html/canonical.json (version 1.13.0+1.31.1, commit 2b82eba…, status idle, ts 20260617T050314Z) with its retained data volume warm-custom-html_..._content. It was produced by a manual cold run during the samever phase, NOT by the timer. So the promote primitive (seed_canonical → write_registry + warmsnap) demonstrably works; the sweep that should drive it is what's hollow.
The real "hollow sweep" defect (root cause, confirmed live)
The deployed nightly-sweep.timer fired 2026-06-17 03:09 and logged:
===== nightly cold sweep: enrolled canonicals = [] ===== → a true no-op.
Cause: nightly_sweep.py does REPO = os.environ.get("CCCI_REPO", "/root/cc-ci") then
sys.path.insert(0, REPO/runner); from harness import canonical. The systemd unit
(nix/modules/nightly-sweep.nix) sets no CCCI_REPO, and /root/cc-ci does not exist on the
host. So the import falls through to the harness packaged in the nix store (runnerSrc=../../runner
— runner/ only, NO tests/). meta.TESTS_DIR = ROOT/tests then points at a nonexistent dir →
enrolled_recipes() swallows the OSError → []. Even though custom-html is enrolled in the repo,
the deployed timer never sees it. This is the machinery that was "specified but never doing
anything." Fix: point the sweep at a real, current checkout that has tests/.
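The failure mode above can be reproduced in isolation. This is a minimal sketch, not the harness code: enrolled_recipes here is a hypothetical stand-in that assumes one enrolled recipe per subdirectory of tests/, and demo is an illustrative helper. It only shows how a swallowed OSError makes "tests/ is missing" indistinguishable from "nothing is enrolled".

```python
import os
import tempfile


def enrolled_recipes(tests_dir):
    """Hypothetical stand-in: one enrolled recipe per subdirectory of tests_dir."""
    try:
        names = os.listdir(tests_dir)
    except OSError:
        # The silent failure: a missing tests/ looks like "nothing enrolled".
        return []
    return sorted(n for n in names if os.path.isdir(os.path.join(tests_dir, n)))


def demo():
    """Compare a store-like layout (runner/ only) with a checkout-like one."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "runner"))
        tests_dir = os.path.join(root, "tests")
        store_like = enrolled_recipes(tests_dir)  # tests/ does not exist yet
        os.makedirs(os.path.join(tests_dir, "custom-html"))
        checkout_like = enrolled_recipes(tests_dir)  # tests/ now present
    return store_like, checkout_like


print(demo())  # ([], ['custom-html'])
```

The same function returns an empty list for the store-like layout and the enrolled recipe for the checkout-like one, which is why pointing the sweep at a checkout that has tests/ is sufficient.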
How current code stays live on the host
- Normal recipe CI: Drone exec pipeline auto-clones cc-ci per build into its workspace, then runs cc-ci-run runner/run_recipe_ci.py from that fresh clone → tests/runner always current.
- /etc/cc-ci is a git clone (the nixos flake source: nixos-rebuild --flake /etc/cc-ci#…). It is currently STALE (e60415d, far behind main) because recent phases only touched runner/ (picked up by Drone's fresh clone) and needed no nixos-rebuild. The sweep is the first thing that needs /etc/cc-ci current.
- Plan: sweep service sets CCCI_REPO=/etc/cc-ci and runs nightly_sweep.py FROM the checkout (change the nix to exec $CCCI_REPO/runner/nightly_sweep.py, not the store copy) → after a deploy that does git -C /etc/cc-ci pull && nixos-rebuild, the sweep reads current tests/ + runner. This reuses the flake-source checkout (declarative, reproducible) rather than inventing a new clone.
Promote path (the core, §2.A)
- should_promote_canonical(recipe, ref, overall, quick) = enrolled & green & cold (not quick) & not-ref (no PR head).
- promote_canonical deploys latest_version(recipe_tags(recipe)) (the latest git tag) fresh/in-place, waits healthy, undeploys, seed_canonical (snapshot + write_registry).
- Tagged-promote addition needed: the green gate currently tests whatever fetch_recipe checked out (catalogue main HEAD for a cold run), which can be untagged-ahead of the latest tag, while promote always writes the latest TAG. Per operator: a canonical must only ever be a real release. Add a tagged requirement: the tested head version (abra.head_compose_version, the compose version label) must equal a published release tag (recipe_tags). When main HEAD == latest release (the common just-cut case) head_version == latest tag → promote; when main is untagged-ahead → no promote.
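The gate with the tagged requirement can be sketched as a pure predicate. The signatures below are assumptions for illustration only: the real should_promote_canonical takes a recipe name, whereas this sketch takes a precomputed enrolled boolean and an extra tagged argument so that it stays self-contained. The version strings are the ones that appear in this journal.

```python
def is_released_version(head_version, tags):
    """True when the tested compose version label equals a published release tag."""
    return head_version in set(tags)


def should_promote_canonical(enrolled, ref, overall_green, quick, tagged):
    """Pure gate: enrolled & green & cold (not quick) & no PR ref & tagged."""
    return bool(enrolled and overall_green and not quick and not ref and tagged)


# Assumed tag list for the example.
tags = ["1.11.0+1.29.0", "1.13.0+1.31.1"]

# main HEAD == latest release (just-cut case): promote.
just_cut = should_promote_canonical(
    enrolled=True, ref="", overall_green=True, quick=False,
    tagged=is_released_version("1.13.0+1.31.1", tags),
)

# main is untagged-ahead of the latest tag: no promote, even when green.
untagged_ahead = should_promote_canonical(
    enrolled=True, ref="", overall_green=True, quick=False,
    tagged=is_released_version("1.13.1+1.31.1", tags),
)

print(just_cut, untagged_ahead)  # True False
```

Because I/O (reading the tags) happens only in is_released_version, the gate itself can be unit-tested with plain booleans.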
Trigger on a NEW RELEASE TAG (§2.D) + test the tag (not main)
- Version ordering is centralized in warm_reconcile.version_key / latest_version / newest_older_version (already used by samever step-back). Reuse them.
- Trigger (pure, in the sweep, per recipe): after mirror-sync, latest = latest_version(recipe_tags); canon = read_registry(recipe).version. No tag → SKIP (never released). latest <= canon (by version_key) → SKIP no-new-version (even if main has untagged commits: we compare tags, not commits). latest > canon → run cold on the tag.
- Test the TAG cold: to honour "run CI cold on that tagged version" (and so a green gate proves the exact thing that gets promoted), check out the latest tag in ~/.abra/recipes/<recipe> and run with CCCI_SKIP_FETCH=1 (the existing staging mechanism) → head_version = tag, head_ref = tag commit, REF empty (so not ref still holds → promote allowed). The upgrade-base resolver then sees canonical(older) < head(new tag) → real delta (samever step-back never fires: tag > canon by construction).
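The trigger rules above reduce to a small pure function. In this sketch, version_key is a hypothetical ordering (every integer run in the version string, compared as a tuple) and is not the real warm_reconcile.version_key. Treating a missing canonical as "run" is also an assumption, as are the return shapes.

```python
import re


def version_key(version):
    """Hypothetical key: '1.13.0+1.31.1' -> (1, 13, 0, 1, 31, 1)."""
    return tuple(int(n) for n in re.findall(r"\d+", version))


def latest_version(tags):
    """Highest tag by version_key, or None when the recipe has no tags."""
    return max(tags, key=version_key) if tags else None


def sweep_decision(tags, canonical_version):
    """Decide per recipe: compare release tags, never commits."""
    latest = latest_version(tags)
    if latest is None:
        return ("SKIP", "never-released")
    if canonical_version is not None and (
        version_key(latest) <= version_key(canonical_version)
    ):
        return ("SKIP", "no-new-version")
    return ("RUN", latest)  # run cold on this tag


tags = ["1.11.0+1.29.0", "1.13.0+1.31.1"]
print(sweep_decision([], "1.13.0+1.31.1"))    # ('SKIP', 'never-released')
print(sweep_decision(tags, "1.13.0+1.31.1"))  # ('SKIP', 'no-new-version')
print(sweep_decision(tags, "1.11.0+1.29.0"))  # ('RUN', '1.13.0+1.31.1')
```

A RUN result always carries a tag strictly newer than the canonical, which is the property that keeps the same-version step-back from firing in the sweep.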
samever orthogonality (operator-required)
The release-tag trigger guarantees, in the sweep, version-under-test > canonical, so the upgrade
base is strictly older → samever's same-version step-back never fires. (a) no new tag → SKIP, no
upgrade-tier run; (b) new tag → canonical(older)→new, real delta, promote. samever's same-version
behaviour stays owned by the samever phase on the PR path. Will demonstrate both in M2.
Enroll-all set (§2.B)
Authoritative inventory = cc-ci-plan/used-recipes.md (21 rows: 20 weekly + uptime-kuma
external). NOT the test fixtures (custom-html-bkp-bad / -rst-bad, concurrency, regression,
_generic). custom-html-tiny IS in used-recipes (weekly) → enroll it too.
Disk budget (§2.B watch-item)
Host /: 150G total, 104G used, 40G free (73%). du of /var/lib/ci-warm today: custom-html 32K,
keycloak 159M. Retaining ~21 fresh-install data volumes should be a few GB; immich/matrix/mailu are
the ones to watch. Will measure during the M2 full sweep and record the real budget; raise the VM
disk (orchestrator) rather than silently drop recipes if it binds.
§2.G UPGRADE_BASE_VERSION retirement — gated on M2
plausible pins UPGRADE_BASE_VERSION="3.0.1+v2.0.0"; bluesky-pds only references it in a comment.
Retirement requires plausible's canonical to actually land at its latest green release so the dynamic
resolver picks the right base — so this is sequenced AFTER M2 promotes plausible. Keep the pin if
plausible can't go green dynamically (record why).
2026-06-17 — M1 built + live-proven (CLAIMED)
All M1 code landed (HEAD d4cc9e4). Reasoning behind the choices:
- Tagged-gate computes tagged at the call site, not inside the gate, which keeps should_promote_canonical pure (the Adversary anti-anchoring + the existing unit-test contract). is_released_version lives in warm_reconcile (owns version logic + recipe_tags I/O).
- Promote the TESTED version (divergence fix, d4cc9e4): the Adversary's pre-claim probe flagged that the gate checks head_version but promote recorded latest_version(recipe_tags). Live proof-A made this concrete and favourable: the OLD record had commit 2b82eba (a merge-to-main commit), but the tag 1.13.0+1.31.1 actually points to df2e273. Recording the tested version's head_ref now writes the TAG commit, which is strictly more correct. Sweep path was already safe (head==tag), but the manual RECIPE=<r> path needed it.
- Why a vendored mirror-sync script, not the nix-store open-recipe-pr.sh: the recipe clones on cc-ci have INCONSISTENT remotes (n8n: origin=mirror; mumble: origin=coopcloud; ghost/discourse: origin=mirror, no upstream). open-recipe-pr.sh assumes origin=coopcloud → would force-sync mirror main to mirror main (no-op) for most. The vendored scripts/recipe-mirror-sync.sh pins an explicit coopcloud upstream remote from the recipe name, syncs main+TAGS (canon needs upstream tags for the trigger), and authenticates via the bot token (self-contained, not host .git-credentials). Behaviour matches the phase's described open-recipe-pr.sh --reconcile-only (faithful, close merged-upstream PRs, leave unrelated). See DECISIONS.
- Why test the TAG via checkout+CCCI_SKIP_FETCH (run_on_tag), not just REF=tag: REF alone (no SRC) takes fetch_recipe's abra recipe fetch branch (ignores REF) AND would set ref → should_promote blocks. Staging the tag in the clone + CCCI_SKIP_FETCH makes head=tag with REF empty → promote allowed, and exercises the real "cold on the tagged release" path.
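The staging step in the last bullet can be sketched as a plan builder that executes nothing. plan_run_on_tag is a hypothetical helper, not the real run_on_tag. It assumes the clone lives under ~/.abra/recipes/<recipe> and that the run reads RECIPE, REF and CCCI_SKIP_FETCH from the environment, as the notes above suggest.

```python
import os


def plan_run_on_tag(recipe, tag, base_env=None):
    """Return (checkout_cmd, env) for a cold run on a staged release tag."""
    env = dict(os.environ if base_env is None else base_env)
    env["RECIPE"] = recipe
    env["CCCI_SKIP_FETCH"] = "1"  # use the staged clone, skip the fetch step
    env.pop("REF", None)          # REF stays empty so the not-ref gate holds
    clone = os.path.join(os.path.expanduser("~"), ".abra", "recipes", recipe)
    # Detached checkout of the tag, so the tested head is the tag commit.
    checkout_cmd = ["git", "-C", clone, "checkout", "--detach", f"refs/tags/{tag}"]
    return checkout_cmd, env


cmd, env = plan_run_on_tag("custom-html", "1.13.0+1.31.1", {"REF": "some-pr-head"})
print(cmd[-1], env["CCCI_SKIP_FETCH"], "REF" in env)
```

Keeping the plan separate from execution makes the two properties that matter here (tag staged, REF empty) checkable without touching a real clone.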
Live proof evidence (cc-ci, /root/canon-verify @ d4cc9e4)
- proof-A (promote): canonical.json fresh ts 065027Z, commit df2e273 (=tag commit). Note: because custom-html canonical already == latest, run_on_tag here re-promoted an EQUAL version → the samever step-back fired (base 1.11.0+1.29.0). That is an artifact of bypassing the trigger for the proof; the REAL sweep SKIPs equal-version (sweep_decision), so the step-back never fires in the sweep — to be shown live in M2 (canonical(older)→new tag, base=canonical, no step-back).
- proof-B (reattach): --quick reattached the retained volume, green (4 tests passed), known-good version+commit UNCHANGED (df2e273); ts re-stamped only by the idle-status write (write_registry stamps ts on every status write) — NOT a promote.
- proof-C (untagged→no-promote): green cold run (level 5/5) on an untagged head (label 1.13.1+1.31.1) → 0 promote log lines, canonical.json byte-identical before/after. Tagged-gate works live.
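A byte-identical claim like the one in proof-C can be checked by hashing the file before and after the run. This is a generic sketch, not the verify recipe recorded in STATUS-canon.md; file_digest and unchanged_across are hypothetical helpers, and the demo uses a temporary file in place of the real canonical.json.

```python
import hashlib
import os
import tempfile


def file_digest(path):
    """SHA-256 of the file's raw bytes."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def unchanged_across(path, action):
    """Run action() and report whether the file is byte-identical afterwards."""
    before = file_digest(path)
    action()
    return file_digest(path) == before


with tempfile.TemporaryDirectory() as root:
    canonical = os.path.join(root, "canonical.json")
    with open(canonical, "w") as fh:
        fh.write('{"version": "1.13.0+1.31.1"}')

    def no_promote():
        pass  # stands in for a run that must not touch the registry

    def restamp():
        with open(canonical, "a") as fh:
            fh.write("\n")  # any write, even a one-byte one, is detected

    print(unchanged_across(canonical, no_promote))  # True
    print(unchanged_across(canonical, restamp))     # False
```

A digest comparison is stricter than comparing version and commit fields: it would also catch a re-stamped ts of the kind noted under proof-B.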