Commit Graph

1240 Commits

Author SHA1 Message Date
655a9998be fix(canon): release cold-run app/dep locks before promote (cold-dep self-deadlock)
All checks were successful
continuous-integration/drone/push Build is passing
drone (DEPS=[gitea], a COLD dep) deadlocked in promote: the cold test holds the gitea dep's
app-lock for the whole process lifetime, and promote's _provision_deps re-acquires the same lock
in the same process → blocks forever. By promote time the cold test + its deps are torn down
(dep teardown runs in the run finally, before promote), so the locks are stale. New
lifecycle.release_app_locks() frees them at promote start; the serial sweep guarantees no
concurrent run relies on them. lasuite-* (warm keycloak dep) were unaffected (no cold deploy).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 10:04:14 +00:00
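
The self-deadlock fixed in 655a9998be can be reproduced in miniature. This is a minimal sketch, assuming the app-locks behave like non-reentrant per-app locks tracked in a process-wide registry. `AppLocks`, `acquire`, and the timeout are hypothetical names for illustration; only `release_app_locks` is named in the commit, and the real locking mechanism may differ.

```python
import threading


class AppLocks:
    """Hypothetical registry of non-reentrant per-app locks held by this process."""

    def __init__(self) -> None:
        self._locks = {}    # app name -> threading.Lock
        self._held = set()  # app names this process currently holds

    def acquire(self, app: str, timeout: float = 0.1) -> bool:
        # A second acquire in the same process cannot succeed while the first
        # holder keeps the lock; the timeout stands in for "blocks forever".
        lock = self._locks.setdefault(app, threading.Lock())
        got = lock.acquire(timeout=timeout)
        if got:
            self._held.add(app)
        return got

    def release_app_locks(self) -> list:
        # Free every lock this process still holds (stale once the cold run
        # and its deps have been torn down).
        released = sorted(self._held)
        for app in released:
            self._locks[app].release()
        self._held.clear()
        return released


locks = AppLocks()
cold_run_got = locks.acquire("gitea")         # cold test takes the dep's app-lock
promote_blocked = not locks.acquire("gitea")  # promote re-acquires: would deadlock
freed = locks.release_app_locks()             # the fix: release at promote start
promote_got = locks.acquire("gitea")          # dep provisioning can now proceed
```

Releasing everything at promote start is only safe under the condition the commit states: the sweep is serial, so no concurrent run depends on those locks.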
24579383f4 fix(canon): mirror-sync detects upstream default branch (master vs main)
All checks were successful
continuous-integration/drone/push Build is passing
Adversary-flagged: drone/gitea mirror-sync hit rc=128 ('couldn't find remote ref main') —
coopcloud/coop-cloud/{drone,gitea} use `master`, not `main`. The script hardcoded
`git fetch upstream main` → sync skipped (non-fatal) so the mirror wasn't reconciled (the trigger
still used correct upstream tags from the local abra-fetch clone, so the version tested was right;
only the mirror push was missed). Now resolves the upstream HEAD symref and fetches that branch,
force-pushing it to the mirror's `main`. Consumes BUILDER-INBOX.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 09:37:24 +00:00
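
The default-branch detection described in 24579383f4 can be sketched as follows. The actual fix lives in a shell script; this Python version is an illustration only, and the function names `parse_default_branch` and `upstream_default_branch` are hypothetical. It relies on `git ls-remote --symref <remote> HEAD`, which prints a `ref: refs/heads/<branch>` line for the remote's HEAD.

```python
import subprocess
from typing import Optional


def parse_default_branch(ls_remote_output: str) -> Optional[str]:
    """Extract the branch name from `git ls-remote --symref <remote> HEAD` output."""
    for line in ls_remote_output.splitlines():
        # The symref line looks like: "ref: refs/heads/master\tHEAD"
        if line.startswith("ref: ") and line.endswith("\tHEAD"):
            ref = line[len("ref: "):].split("\t", 1)[0]
            prefix = "refs/heads/"
            if ref.startswith(prefix):
                return ref[len(prefix):]
    return None


def upstream_default_branch(remote: str = "upstream") -> Optional[str]:
    # Ask the remote which branch its HEAD points at instead of assuming `main`.
    result = subprocess.run(
        ["git", "ls-remote", "--symref", remote, "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return parse_default_branch(result.stdout)


# Sample output for a repository whose default branch is `master`.
sample = (
    "ref: refs/heads/master\tHEAD\n"
    "0123456789abcdef0123456789abcdef01234567\tHEAD\n"
)
branch = parse_default_branch(sample)
```

Returning `None` when no symref line is present lets the caller decide whether to skip the sync or fall back to a fixed branch name.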
d9987a0fbf inbox(canon): heads-up to Builder before M2 claim — (1) drone mirror-sync rc=128 swallowed (clarify §2.C); (2) determinism run-twice-skip-all vs red/promote-failed recipes (reconcile in claim evidence)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 09:35:35 +00:00
4accd22d50 review(canon): pre-claim observations — DEFECT-1 label fix live/honest; NEW mirror-sync drone rc=128 swallowed (scrutinise §2.C); determinism M2.3 run-twice-skip-all at risk for red/promote-failed recipes
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 09:35:11 +00:00
df26041307 chore(canon): consume ADVERSARY-INBOX (fix f94de22 validated, M2 re-run in flight); pre-claim note — scrutinise bluesky 'documented RED' as possible warm-domain routing machinery defect at claim
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 09:12:01 +00:00
0eca8b5089 status+inbox(canon): promote fix validated (custom-html-tiny+ghost promote); bluesky warm-routing red; full re-run in flight
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 09:11:07 +00:00
3393dba11e review(M2.2): file DEFECT-1 (untrustworthy PASS label) + DEFECT-2 (promote path failing broadly) as OPEN adversary findings; close only after re-verify of fix f94de22
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 08:55:31 +00:00
2126747e2e status(canon): M2.2 run-1 surfaced+fixed promote bug; validating faithful-install fix
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 08:51:49 +00:00
f94de22234 fix(canon): promote does a FAITHFUL warm install (clean tree + deps + install_steps)
All checks were successful
continuous-integration/drone/push Build is passing
M2 finding (Adversary-flagged): promote_canonical did a bare `abra app deploy` that lacked the
cold install's wiring, so recipes that passed the cold test still failed to promote:
- ghost: `abra app new` FATA 'locally unstaged changes' — the CCCI_SKIP_FETCH per-run tree was
  left dirty by the tier suite. Fix: force re-checkout the tag + `git clean -fd` before deploy.
- bluesky-pds: missing pds_plc_rotation_key (install_steps inserts it, #generate=false).
- custom-html-tiny: 404 (install_steps seeds index.html). Fix: run install_steps_hook in promote.
- OIDC recipes would miss their realm. Fix: provision DEPS in promote like the cold install.
promote_canonical now: clean tree → provision deps → deploy_app with install_steps_hook + overlay +
ready-probes, then snapshot. Also: sweep result label now derives from whether the canonical was
actually written (promote is non-fatal; rc==0 did not imply promoted) — fixes the misleading
'PASS (promoted)'.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 08:50:59 +00:00
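
The label fix in f94de22234 comes down to reporting from the record rather than from the exit code. A minimal sketch, assuming the sweep knows both the test exit code and whether a canonical was actually written: `sweep_label` and the 'PASS (promote failed)' label are hypothetical, while 'PASS (promoted)' and RED appear in the log above.

```python
def sweep_label(test_rc: int, canonical_written: bool) -> str:
    """Derive the sweep result label from what actually happened."""
    if test_rc != 0:
        return "RED"
    # Promote is non-fatal, so rc == 0 alone does not prove a promotion.
    if canonical_written:
        return "PASS (promoted)"
    return "PASS (promote failed)"


labels = [
    sweep_label(0, True),    # green run, canonical written
    sweep_label(0, False),   # green run, promote failed silently
    sweep_label(1, False),   # red run
]
```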
4cf1b32f4c chore(canon): consume BUILDER-INBOX (promote failing ~4/5 + misleading PASS label — diagnosing)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 08:41:28 +00:00
d933585e92 note(canon): pre-claim finding — sweep PASS-label vs actual promote failures (4/5), determinism risk; evidence captured for M2 verification
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 08:40:41 +00:00
ba28a8897a inbox(canon): heads-up — sweep logs PASS(promoted) but 4/5 promotes FAILED (only cryptpad wrote a canonical); label derives from rc not record; determinism M2.3 at risk
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 08:40:16 +00:00
0f2f57b5ca chore(canon): consume BUILDER-INBOX (discourse wedge heads-up; will time out → RED → sweep continues)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 08:17:27 +00:00
7ca77f95ca inbox(canon): heads-up — M2.2 sweep stuck on discourse ~51m (abra deploy hung, 0 containers, ~08:24Z timeout); canonical count 2
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 08:15:59 +00:00
38f9c8a30a note(canon): pre-claim — M2.1 deploy verified live read-only (/etc/cc-ci pulled to 3bdd5d1, weekly timer deployed, sweep runs non-hollow path); M2 not yet claimed
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 07:20:47 +00:00
7a08f05d59 chore(canon): consume ADVERSARY-INBOX (M1 PASS ack'd; Builder starting M2.2 long sweep)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 07:20:07 +00:00
b619e8168f inbox(canon): heads-up — M2.1 deployed; starting long M2.2 full sweep
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 07:19:20 +00:00
3bdd5d143b review(M1): PASS — tagged-gate + trigger + mirror-sync + all-21-enrolled + weekly timer cold-verified; live canonical records tag commit df2e273; 295 unit pass from fresh clone. No VETO
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 07:11:34 +00:00
8a52c16abb journal(canon): M2-prep recon — 20 recipes will seed, runtime/disk risks noted
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 07:08:50 +00:00
626badd333 claim(M1): canonical sweep machinery built + live-proven on custom-html
All checks were successful
continuous-integration/drone/push Build is passing
M1 (machinery works locally, each piece proven) — code HEAD d4cc9e4, unit suite 295 passed:
- M1.1 tagged-promote gate + promote-tested-version: live proof-A wrote a fresh canonical
  (commit df2e273 = the tag commit, correcting samever's main-HEAD 2b82eba); live proof-C
  green-untagged → 0 promotes, canonical byte-identical (tagged-gate blocks untagged).
- M1.2 sweep_decision (version-keyed trigger) + vendored faithful recipe-mirror-sync.sh
  (smoke-tested: faithful no-op main/tags push, closed merged-upstream PR #2, left PR #5);
  nightly_sweep rewritten (mirror_sync -> trigger -> run_on_tag). Live SKIP demo on custom-html.
- M1.3 all 21 used-recipes enrolled. M1.4 hollow-sweep fix (CCCI_REPO=/etc/cc-ci). M1.5 weekly timer.
- M1(A) reattach: live proof-B --quick reused the retained volume green; known-good unchanged.

Evidence + verify recipes in STATUS-canon.md; reasoning in JOURNAL-canon.md; DECISIONS appended.
Gate: M1 CLAIMED, awaiting Adversary.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 07:07:44 +00:00
69f59fdcc5 status(canon): M1 code complete + unit-tested; live M1(A) proofs in flight
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 06:49:53 +00:00
d4cc9e4530 fix(canon): promote the TESTED release version, not a re-derived latest tag
All checks were successful
continuous-integration/drone/push Build is passing
Closes the head_version-vs-latest_version divergence: should_promote gates on head_version
(code under test) but promote_canonical recorded latest_version(recipe_tags). In a manual
RECIPE=<r> run whose main checkout sits on a tag OLDER than the newest published tag, the gate
would pass on the older tag yet promote the newer (never-tested) one. promote_canonical now
takes the tested `version` (head_version, guaranteed a release tag by the tagged-gate) and
records exactly that. Sweep path unaffected (head==tag by construction).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 06:47:33 +00:00
a20890a363 feat(canon): M1.2 release-tag trigger + faithful mirror-sync in the weekly sweep (§2.C/§2.D)
All checks were successful
continuous-integration/drone/push Build is passing
- warm_reconcile.sweep_decision(latest_tag, canon_version): pure new-release-tag trigger
  keyed on version_key (NOT commit) — new tag>canon → run; ==/older → skip no-new-version
  (even with untagged main commits); no tag → skip never-released. Unit-tested.
- scripts/recipe-mirror-sync.sh: faithful mirror sync (adapted from open-recipe-pr.sh
  --reconcile-only) — explicit coopcloud `upstream` remote (robust to inconsistent clone
  remotes), syncs main+TAGS, closes merged-upstream PRs, leaves unrelated PRs, bot-token auth.
- nightly_sweep rewritten: per enrolled recipe → mirror_sync → fetch → sweep_decision →
  run_on_tag (checkout the release tag + CCCI_SKIP_FETCH=1 so head IS the tag → tagged-promote
  gate passes, REF empty → promote allowed). Skips logged; run-twice → skip-all determinism.
- smoke-tested recipe-mirror-sync.sh live on custom-html: faithful no-op main/tags push,
  closed merged-upstream PR #2, left pending PR #5.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 06:45:43 +00:00
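
The version-keyed trigger from a20890a363 can be sketched as a pure function. This is an illustration under stated assumptions, not the repository's code: the `version_key` shown here simply compares the numeric fields of a version string, the return shape `(action, reason)` is hypothetical, and so is the handling of a missing canonical. The skip reasons `no-new-version` and `never-released` are taken from the commit message.

```python
import re
from typing import Optional, Tuple


def version_key(version: str) -> tuple:
    # Hypothetical ordering: compare the numeric fields of a version such as
    # "1.13.0+1.31.1" left to right.
    return tuple(int(n) for n in re.findall(r"\d+", version))


def sweep_decision(latest_tag: Optional[str],
                   canon_version: Optional[str]) -> Tuple[str, str]:
    if latest_tag is None:
        return ("skip", "never-released")
    if canon_version is None:
        # Assumption: with no canonical yet, the first release tag seeds one.
        return ("run", "no-canonical")
    if version_key(latest_tag) > version_key(canon_version):
        return ("run", "new-release-tag")
    # Equal or older tag: skip, even if main has untagged commits.
    return ("skip", "no-new-version")


first = sweep_decision("1.13.0+1.31.1", "1.11.0+1.29.0")
second = sweep_decision("1.13.0+1.31.1", "1.13.0+1.31.1")
untagged = sweep_decision(None, "1.11.0+1.29.0")
```

Keying on the version rather than the commit gives the run-twice determinism the commit mentions: once the canonical records the newest tag, a second sweep skips every recipe.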
f089c30040 chore(canon): pre-claim code-read notes (M1.1/1.3/1.4/1.5 landed; M1.2 outstanding; probe list)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 06:42:08 +00:00
f8c0e53521 feat(canon): M1.4 hollow-sweep fix + M1.5 weekly timer
All checks were successful
continuous-integration/drone/push Build is passing
M1.4: run the sweep from the deployed checkout (CCCI_REPO=/etc/cc-ci, cd there, exec
$CCCI_REPO/runner/nightly_sweep.py) instead of a nix-store runner copy. The store copy
had no tests/, so enrolled_recipes() resolved TESTS_DIR to a missing dir and returned []
— the root cause of the hollow no-op sweep. /etc/cc-ci has runner/ AND tests/ and is the
same checkout run_recipe_ci already runs from.
M1.5: timer OnCalendar daily -> weekly (Sun 03:00 UTC), Persistent kept.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 06:37:39 +00:00
136100f610 feat(canon): M1.3 enroll all 21 used-recipes as data-warm canonicals (§2.B)
All checks were successful
continuous-integration/drone/push Build is passing
WARM_CANONICAL=True added to every recipe in cc-ci-plan/used-recipes.md (20 weekly +
uptime-kuma external). enrolled_recipes() now returns all 21. Test fixtures
(custom-html-*-bad, concurrency, regression) intentionally left unenrolled.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 06:35:30 +00:00
27e06289f8 feat(canon): M1.1 tagged-promote gate — canonical only advances to a published release tag
All checks were successful
continuous-integration/drone/push Build is passing
- should_promote_canonical gains a `tagged` requirement (canon §2.A): a green cold
  latest run promotes only when the tested head version is a published release tag;
  an untagged main commit never becomes a canonical.
- warm_reconcile.is_released_version(recipe, version): release-tag membership (exact or
  by version_key). Caller computes `tagged` so the gate stays pure.
- unit tests: untagged -> no promote; is_released_version cases.
- drive-by (pre-existing reds, unrelated to canon, now green): test_warm_reconcile
  traefik assertion was stale vs the phase-pxgate spec (probes /api/version, no
  health_domain); meta.py UPGRADE_BASE_VERSION KEYS help synced to the prevb doc text.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 06:34:09 +00:00
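
The tagged-promote gate from 27e06289f8 can be sketched as two small functions. The signatures are assumptions for illustration: the commit describes `is_released_version(recipe, version)`, whereas this sketch takes the tag list directly and checks exact membership only, leaving out the match by version_key.

```python
from typing import Sequence


def is_released_version(version: str, recipe_tags: Sequence[str]) -> bool:
    # Exact membership only; the commit also allows a match by version_key.
    return version in recipe_tags


def should_promote_canonical(green: bool, cold: bool, tagged: bool) -> bool:
    # Pure gate: the caller computes `tagged`, so no git or network access here.
    return green and cold and tagged


tags = ["1.11.0+1.29.0", "1.13.0+1.31.1"]

# A green cold run on a published release tag may promote.
tagged_run = should_promote_canonical(
    green=True, cold=True, tagged=is_released_version("1.13.0+1.31.1", tags))

# A green cold run on an untagged main commit never becomes a canonical.
untagged_run = should_promote_canonical(
    green=True, cold=True, tagged=is_released_version("2b82eba", tags))
```

Keeping the gate free of I/O is what makes the "untagged -> no promote" case cheap to unit-test.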
23c02c59b6 status(canon): bootstrap phase canon — state files, hollow-sweep root cause, M1/M2 backlog
All checks were successful
continuous-integration/drone/push Build is passing
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 06:28:35 +00:00
cfb341e244 chore(canon): Adversary online + cold baseline of starting state (1 enrolled, 1 canonical from samever, daily timer)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 06:19:45 +00:00
79dbc2dc8f status(samever): ## DONE — M1+M2 Adversary-verified PASS (no VETO)
All checks were successful
continuous-integration/drone/push Build is passing
Orchestrator-written marker: the Builder hit the Opus usage limit and could not
write its own DONE. Work is complete + Adversary-verified (M1 1310a95, M2
199f5b6, cleared for DONE). Unblocks auto-advance to canon.
2026-06-17 06:16:30 +00:00
199f5b6cb8 review(samever): M2 PASS — headline step-back reproduced from own clone; version-bump + discourse #4 unaffected; teeth hold; clean teardown. No VETO; cleared for DONE
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 05:04:42 +00:00
96c4ad9ef3 claim(M2): samever proven in real CI — step-back base<head, version-bump unaffected, discourse #4 + hedgedoc spot-check
All checks were successful
continuous-integration/drone/push Build is passing
5 real cc-ci runs (samever-deploy @ cc-ci main):
- Run B: nightly steady-state step-back, custom-html 1.11.0+1.29.0→1.13.0+1.31.1
  (base<head real delta, 5 tiers green).
- Run C: version-bump UNAFFECTED (last-green path).
- Run D: PR-form step-back (ref set).
- discourse #4: kind=ref main-tip unaffected (migration 0.8.1→1.0.0 green).
- hedgedoc spot-check: step-back 3.0.9→3.0.10 green.
WHAT/HOW/EXPECTED/WHERE in STATUS-samever.md; logs /root/samever-*.log,
artifacts /var/lib/cc-ci-runs/samever-*/ on cc-ci.
2026-06-17 04:58:48 +00:00
8e8985b96f journal(samever): M2 evidence — step-back (B), version-bump-unaffected (C), discourse kind=ref unaffected
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 04:47:53 +00:00
7902fb327d chore(samever): consume ADVERSARY-INBOX (M2 heads-up read)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 04:33:32 +00:00
aff7b14299 inbox(samever): heads-up — starting M2 e2e (custom-html two-run) on cc-ci
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 04:32:52 +00:00
398f559168 status(samever): M1 PASS recorded; M2 in progress (custom-html two-run on cc-ci)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 04:32:51 +00:00
1310a95ac2 review(samever): M1 PASS — resolver step-back cold-verified; teeth hold (base<head), version-bump path untouched, 13/13 + own probes
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 04:28:22 +00:00
61c7739285 journal(samever): M2 prep notes while parked at M1 gate
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 04:26:27 +00:00
c5a0d204c1 claim(M1): samever resolver step-back implemented + unit-tested (13 pass)
All checks were successful
continuous-integration/drone/push Build is passing
WHAT/HOW/EXPECTED/WHERE in STATUS-samever.md. Adversary: cold pytest
tests/unit/test_upgrade_base.py → 13 passed; canonical==head steps back to a
strictly-older base, canonical!=head unchanged, no-older→declared skip.
2026-06-17 04:25:16 +00:00
b29bb3f804 feat(samever): step back to older base when last-green canonical == head version
resolve_upgrade_base now reads the head's published version (abra.head_compose_version,
the coop-cloud.<stack>.version label) and, when the last-green warm-canonical version
equals it, steps back to the newest published version strictly older than head instead
of deploying a same-version no-op. warm_reconcile gains version_key + newest_older_version
(single coop-cloud ordering source; sort_versions refactored onto version_key, no behavior
change). Skip only when no older published predecessor exists. Step-back returns kind=version
so it inherits F1d-2 pinned-tag checkout. Extends tests/unit/test_upgrade_base.py (13 pass).
2026-06-17 04:24:14 +00:00
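
The step-back rule from b29bb3f804 can be sketched as follows. This is a simplified illustration: the function names match the commit, but the signatures, the numeric `version_key`, and the plain-string return value are assumptions (the commit notes that the real step-back returns kind=version).

```python
import re
from typing import Optional, Sequence


def version_key(version: str) -> tuple:
    # Hypothetical ordering over the numeric fields of a version string.
    return tuple(int(n) for n in re.findall(r"\d+", version))


def newest_older_version(published: Sequence[str], head: str) -> Optional[str]:
    # Newest published version strictly older than head, or None if none exists.
    older = [v for v in published if version_key(v) < version_key(head)]
    return max(older, key=version_key) if older else None


def resolve_upgrade_base(canonical: str, head: str,
                         published: Sequence[str]) -> Optional[str]:
    if version_key(canonical) != version_key(head):
        return canonical  # canonical != head: behaviour unchanged
    # canonical == head: step back instead of a same-version no-op upgrade.
    # None means no older predecessor exists, so the upgrade tier is skipped.
    return newest_older_version(published, head)


published = ["1.11.0+1.29.0", "1.13.0+1.31.1"]
stepped_back = resolve_upgrade_base("1.13.0+1.31.1", "1.13.0+1.31.1", published)
unchanged = resolve_upgrade_base("1.11.0+1.29.0", "1.13.0+1.31.1", published)
no_older = resolve_upgrade_base("1.11.0+1.29.0", "1.11.0+1.29.0", published)
```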
279d84d229 fix(STATUS-regall): bare ## DONE marker so watchdog detects phase complete
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 04:14:14 +00:00
f97ed0299a review(samever): Adversary orientation — samever phase started; awaiting M1 claim
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 04:11:09 +00:00
dc74b1efb9 docs(recipe-customization): make previous/ a documented last-resort — prefer not to use
All checks were successful
continuous-integration/drone/push Build is passing
The previous/ base-repair mechanism exists and can be used when updating tests
if a previous base won't deploy, but it is explicitly a last resort: reach for
it only after the dynamic base (last-green -> main-tip) fails to come up, since
each previous/ re-introduces the per-version patching treadmill the dynamic
base removed. Most recipes (incl. discourse) need none.
2026-06-17 03:36:31 +00:00
eff8b1a93f review(regall): M1 PASS + M2 PASS — full sweep 21/21 GREEN, no prevb regressions, no VETO
All checks were successful
continuous-integration/drone/push Build is passing
M1: All 21 recipes cold-verified from results.json. Classification table accurate.
Zero prevb regressions. A-regall-2 (plausible) = recipe bug in 3.0.1+v2.0.0, not prevb.
BPs 1-5 complete. No flake misclassifications found.

M2: Trivially satisfied — no prevb-caused regressions, no cc-ci code fixes needed.

Both M1+M2 PASS. regall phase DONE.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-17 03:04:38 +00:00
3403309136 status(regall): ## DONE — M1+M2 Adversary-verified PASS (no VETO); all 21 GREEN
All checks were successful
continuous-integration/drone/push Build is passing
21/21 recipes GREEN post-prevb. 0 prevb regressions. A-regall-2 closed
(plausible backup_restore=fail was recipe bug in 3.0.1+v2.0.0, NOT prevb;
run 758 / PR#3 / 3.1.0+v2.0.0 confirms L5 pass with fixed backup mechanism).
All batches 1-6 complete. M1+M2 both claimed 2026-06-17T04:45Z.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-17 03:03:06 +00:00
848e0c6b1e review(regall): A-regall-2 CLOSED — plausible L5 via PR#3 (run 758); recipe bug NOT prevb
All checks were successful
continuous-integration/drone/push Build is passing
Builder diagnosis (a3d115d) accepted:
- backupbot.backup.path in 3.0.1+v2.0.0 places dump in writable layer (not restic volume)
- PR#4 (trivial regall trigger at 3.0.1+v2.0.0) exposes the bug; PR#3 (3.1.0+v2.0.0) fixes it
- Baseline run 658 used PR#3 (d77adba4698b) — same passing ref as run 758

Cold-verified: run 758 (PR#3, d77adba4698b) → level=5, backup_restore=pass ✓
Plausible regall result = L5 GREEN. Sweep now 21/21 complete.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-17 03:01:55 +00:00
a3d115d6e3 diagnose(regall): A-regall-2 root cause — recipe bug in 3.0.1+v2.0.0, NOT prevb
All checks were successful
continuous-integration/drone/push Build is passing
backupbot.backup.path: "/postgres.dump.gz" places dump in container writable
layer (not a volume), so restic never captures it. Restore post-hook fails
with "No such file or directory". PR#3 (3.1.0+v2.0.0) fixes this with
backupbot.backup.volumes.db-data.path. Baseline run 658 tested PR#3 (working
mechanism), not 3.0.1+v2.0.0 (broken). Re-opened PR#3 + !testme triggered
(comment 14651) to demonstrate backup_restore=pass. BUILDER-INBOX consumed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-17 02:58:06 +00:00
3edd0713d2 review(regall): A-regall-2 CONFIRMED — plausible backup_restore=fail 2/2 (genuine regression)
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone Build is passing
Runs 750 and 754 both fail: ci_marker absent after restore.
No-op upgrade (3.0.1+v2.0.0→3.0.1+v2.0.0) via UPGRADE_BASE_VERSION path is prevb-specific.
Baseline run 658 had genuine git-ref upgrade and passed L5.

BUILDER-INBOX written. M1 blocked pending plausible fix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-17 02:34:04 +00:00
a7317a54fb review(regall): batches 5-6 verified; A-regall-2 filed for plausible backup_restore=fail
All checks were successful
continuous-integration/drone/push Build is passing
Batch 5 results:
- uptime-kuma (748): L5 all pass ✓
- lasuite-drive (749): L5 all pass ✓
- plausible (750): L2, backup_restore=FAIL — regression from baseline L5
  - ci_marker not found after restore; no-op upgrade (3.0.1+v2.0.0→3.0.1+v2.0.0)
  - Builder re-running as Drone 754

Batch 6 results:
- custom-html-tiny (752): L5, upgrade=pass, backup_restore=skip (expected) ✓
- bluesky-pds (753): L5, upgrade=skip (expected/EXPECTED_NA), backup_restore=pass ✓

A-regall-2: plausible backup_restore=fail — prevb regression or flake TBD.
Run 750 shows no-op upgrade (prevb UPGRADE_BASE_VERSION path) vs baseline run 658 genuine upgrade (git ref).
Same failure seen in m2r/m2rr-plausible during prevb development.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-17 02:32:26 +00:00
ec1dc5978d status(regall): batch 5 partial (lasuite-drive/uptime-kuma L5; plausible restore=fail LIKELY FLAKY, re-triggered); batch 6 IN FLIGHT
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 02:28:31 +00:00