drone (DEPS=[gitea], a COLD dep) deadlocked in promote: the cold test holds the gitea dep's
app-lock for the whole process lifetime, and promote's _provision_deps re-acquires the same lock
in the same process, so it blocks forever. By promote time the cold test and its deps are already
torn down (dep teardown runs in the run's finally, before promote), so the locks are stale. The new
lifecycle.release_app_locks() frees them at promote start; the serial sweep guarantees no
concurrent run relies on them. lasuite-* (warm keycloak dep) were unaffected (no cold deploy).
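
A minimal sketch of the failure mode and the fix, assuming the app-locks are flock-style file
locks held on open file handles (this message does not say how they are implemented).
`acquire_app_lock`, `_HELD` and `LOCK_DIR` are hypothetical names; only `release_app_locks`
is named by the change itself.

```python
import fcntl
import os
import tempfile

# Hypothetical registry: app name -> open file handles that hold its lock.
_HELD = {}
# Hypothetical lock directory; a temp dir keeps the sketch self-contained.
LOCK_DIR = tempfile.mkdtemp(prefix="app-locks-")


def acquire_app_lock(app, blocking=True):
    """Take an exclusive flock on the app's lock file; return True on success."""
    handle = open(os.path.join(LOCK_DIR, app + ".lock"), "w")
    flags = fcntl.LOCK_EX | (0 if blocking else fcntl.LOCK_NB)
    try:
        # A second open() of the same file is a separate open file description,
        # so it conflicts with the first lock even inside one process.
        fcntl.flock(handle, flags)
    except BlockingIOError:
        handle.close()
        return False
    _HELD.setdefault(app, []).append(handle)
    return True


def release_app_locks():
    """Release every lock this process still holds; return the app names freed."""
    released = sorted(_HELD)
    for app in released:
        for handle in _HELD.pop(app):
            fcntl.flock(handle, fcntl.LOCK_UN)
            handle.close()
    return released
```

With `blocking=True` the second acquire would hang, which is the deadlock described above;
calling `release_app_locks()` first lets the re-acquire succeed.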
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adversary-flagged: drone/gitea mirror-sync hit rc=128 ('couldn't find remote ref main') because
coopcloud/coop-cloud/{drone,gitea} use `master`, not `main`. The script hardcoded
`git fetch upstream main`, so the sync was skipped (non-fatal) and the mirror wasn't reconciled.
The trigger still used the correct upstream tags from the local abra-fetch clone, so the version
tested was right; only the mirror push was missed. The script now resolves the upstream HEAD
symref and fetches that branch, force-pushing it to the mirror's `main`. Consumes BUILDER-INBOX.
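
The branch resolution can be sketched as follows, assuming the script shells out to
`git ls-remote --symref <remote> HEAD` and that the mirror remote is called `origin`.
The function names and the `origin` remote are illustrative assumptions, not taken from
the script.

```python
import subprocess


def parse_head_branch(ls_remote_output):
    """Extract the branch HEAD points at from `git ls-remote --symref` output."""
    for line in ls_remote_output.splitlines():
        # The symref line looks like: "ref: refs/heads/master<TAB>HEAD"
        if line.startswith("ref: ") and line.endswith("\tHEAD"):
            ref = line[len("ref: "):].split("\t", 1)[0]
            prefix = "refs/heads/"
            if ref.startswith(prefix):
                return ref[len(prefix):]
    return None


def upstream_head_branch(upstream="upstream"):
    """Ask the remote which branch its HEAD points at (needs network access)."""
    result = subprocess.run(
        ["git", "ls-remote", "--symref", upstream, "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return parse_head_branch(result.stdout)


def sync_commands(branch, upstream="upstream", mirror="origin"):
    """Commands that fetch the resolved branch and force-push it to the mirror's main."""
    return [
        ["git", "fetch", upstream, branch],
        ["git", "push", "--force", mirror, "FETCH_HEAD:refs/heads/main"],
    ]
```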
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
M2 finding (Adversary-flagged): promote_canonical did a bare `abra app deploy` that lacked the
cold install's wiring, so recipes that passed the cold test still failed to promote:
- ghost: `abra app new` FATA 'locally unstaged changes' — the CCCI_SKIP_FETCH per-run tree was
left dirty by the tier suite. Fix: force re-checkout the tag + `git clean -fd` before deploy.
- bluesky-pds: missing pds_plc_rotation_key (install_steps inserts it, #generate=false).
- custom-html-tiny: 404 (install_steps seeds index.html). Fix: run install_steps_hook in promote.
- OIDC recipes would miss their realm. Fix: provision DEPS in promote like the cold install.
promote_canonical now runs: clean tree → provision deps → deploy_app with install_steps_hook +
overlay + ready-probes, then snapshot. Also: the sweep result label now derives from whether the
canonical was actually written. Promote is non-fatal, so rc==0 did not imply a promotion; this
fixes the misleading 'PASS (promoted)'.
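
A rough sketch of two of the pieces above: the tree clean-up before deploy and the label
derivation. The function names and every label other than 'PASS (promoted)' are
hypothetical; the `git checkout --force` spelling of "force re-checkout" is an assumption.

```python
def clean_tree_commands(tag):
    """Commands that put the per-run recipe tree back on a clean checkout of `tag`."""
    return [
        ["git", "checkout", "--force", tag],  # discard local modifications
        ["git", "clean", "-fd"],              # remove untracked files and directories
    ]


def sweep_label(cold_passed, canonical_written):
    """Derive the label from what actually happened, not from the process rc."""
    if not cold_passed:
        return "FAIL"
    if canonical_written:
        return "PASS (promoted)"
    return "PASS (not promoted)"
```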
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Closes the head_version-vs-latest_version divergence: should_promote gates on head_version
(code under test) but promote_canonical recorded latest_version(recipe_tags). In a manual
RECIPE=<r> run whose main checkout sits on a tag OLDER than the newest published tag, the gate
would pass on the older tag yet promote the newer (never-tested) one. promote_canonical now
takes the tested `version` (head_version, guaranteed a release tag by the tagged-gate) and
records exactly that. Sweep path unaffected (head==tag by construction).
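
The divergence can be illustrated with a toy example. The two helper functions and the tag
values are made up for illustration, and `recipe_tags` is assumed to be sorted oldest to
newest.

```python
def recorded_version_before(head_version, recipe_tags):
    """Old behavior: record the newest published tag, regardless of what was tested."""
    return recipe_tags[-1]


def recorded_version_after(head_version, recipe_tags):
    """New behavior: record exactly the version that was tested."""
    return head_version


# A manual run whose checkout sits on an older tag than the newest published one.
tags = ["1.0.0+v1", "1.1.0+v1"]
tested = "1.0.0+v1"
old_record = recorded_version_before(tested, tags)  # the never-tested newer tag
new_record = recorded_version_after(tested, tags)   # the tested tag
```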
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
M1.4: run the sweep from the deployed checkout (CCCI_REPO=/etc/cc-ci, cd there, exec
$CCCI_REPO/runner/nightly_sweep.py) instead of a nix-store runner copy. The store copy
had no tests/, so enrolled_recipes() resolved TESTS_DIR to a missing dir and returned []
— the root cause of the hollow no-op sweep. /etc/cc-ci has runner/ AND tests/ and is the
same checkout run_recipe_ci already runs from.
M1.5: timer OnCalendar daily -> weekly (Sun 03:00 UTC), Persistent kept.
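
The M1.4 root cause can be sketched like this, assuming enrollment is read from a
per-recipe file under tests/. The file name `recipe_meta.py` and the regex-based check
are assumptions made for the sketch; only `enrolled_recipes`, `tests/` and
`WARM_CANONICAL` appear in this log.

```python
import re
from pathlib import Path

# Matches a line such as "WARM_CANONICAL = True" (hypothetical file format).
_ENROLLED = re.compile(r"^\s*WARM_CANONICAL\s*=\s*True\b", re.MULTILINE)


def enrolled_recipes(repo_root):
    """Return the recipes under <repo_root>/tests that opt in to warm canonicals."""
    tests_dir = Path(repo_root) / "tests"
    if not tests_dir.is_dir():
        # A runner copy without tests/ silently yields an empty sweep.
        return []
    names = []
    for meta in sorted(tests_dir.glob("*/recipe_meta.py")):
        if _ENROLLED.search(meta.read_text()):
            names.append(meta.parent.name)
    return names
```

Run against a checkout that has no tests/ directory, the function returns an empty list
rather than raising, which is why the sweep looked successful while doing nothing.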
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
WARM_CANONICAL=True added to every recipe in cc-ci-plan/used-recipes.md (20 weekly +
uptime-kuma external). enrolled_recipes() now returns all 21. Test fixtures
(custom-html-*-bad, concurrency, regression) intentionally left unenrolled.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- should_promote_canonical gains a `tagged` requirement (canon §2.A): a green cold
latest run promotes only when the tested head version is a published release tag;
an untagged main commit never becomes a canonical.
- warm_reconcile.is_released_version(recipe, version): release-tag membership (exact or
by version_key). Caller computes `tagged` so the gate stays pure.
- unit tests: untagged -> no promote; is_released_version cases.
- drive-by (pre-existing reds, unrelated to canon, now green): test_warm_reconcile
traefik assertion was stale vs the phase-pxgate spec (probes /api/version, no
health_domain); meta.py UPGRADE_BASE_VERSION KEYS help synced to the prevb doc text.
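
A minimal sketch of the gate shape described above. The parameter names are hypothetical,
and `is_released` takes the published tags directly instead of a recipe name so that the
example stays self-contained; the by-version_key match is omitted.

```python
def is_released(version, published_tags):
    """Exact release-tag membership."""
    return version in set(published_tags)


def should_promote_canonical(green, cold, latest, tagged):
    """Pure gate: the caller computes `tagged`, so no I/O happens here."""
    return bool(green and cold and latest and tagged)


published = ["3.0.1+v2.0.0", "3.1.0+v2.0.0"]
on_tag = should_promote_canonical(True, True, True, is_released("3.1.0+v2.0.0", published))
on_main = should_promote_canonical(True, True, True, is_released("d77adba", published))
```

Keeping the tag lookup outside the gate means the unit tests can exercise every
combination of inputs without touching a recipe checkout.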
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Orchestrator-written marker: the Builder hit the opus usage limit and could not
write its own DONE. Work is complete + Adversary-verified (M1 1310a95, M2
199f5b6, cleared for DONE). Unblocks auto-advance to canon.
resolve_upgrade_base now reads the head's published version (abra.head_compose_version,
the coop-cloud.<stack>.version label). When the last-green warm-canonical version equals it,
the base steps back to the newest published version strictly older than head instead of
deploying a same-version no-op. It skips only when no older published predecessor exists.
The step-back returns kind=version, so it inherits the F1d-2 pinned-tag checkout.
warm_reconcile gains version_key + newest_older_version (single coop-cloud ordering source;
sort_versions refactored onto version_key, no behavior change).
Extends tests/unit/test_upgrade_base.py (13 pass).
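
The step-back rule can be sketched as below. This is not the real version_key: it assumes
versions look like `<recipe semver>+<upstream version>` and orders them by the integers
found in each part. `pick_upgrade_base` and its action strings are hypothetical.

```python
import re


def version_key(version):
    """Sort key: integers of the recipe part first, then those of the upstream part."""
    recipe_part, _, upstream_part = version.partition("+")

    def nums(text):
        return tuple(int(n) for n in re.findall(r"\d+", text))

    return (nums(recipe_part), nums(upstream_part))


def newest_older_version(published, head):
    """Newest published version strictly older than head, or None if there is none."""
    older = [v for v in published if version_key(v) < version_key(head)]
    return max(older, key=version_key) if older else None


def pick_upgrade_base(last_green, head, published):
    """Return (action, version); the action names are made up for this sketch."""
    if last_green != head:
        return ("use-last-green", last_green)
    older = newest_older_version(published, head)
    if older is None:
        return ("skip", None)  # no older published predecessor exists
    return ("step-back", older)
```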
The previous/ base-repair mechanism exists and can be used when updating tests
if a previous base won't deploy, but it is explicitly a last resort: reach for
it only after the dynamic base (last-green -> main-tip) fails to come up, since
each previous/ re-introduces the per-version patching treadmill the dynamic
base removed. Most recipes (incl. discourse) need none.
21/21 recipes GREEN post-prevb. 0 prevb regressions. A-regall-2 closed
(plausible backup_restore=fail was a recipe bug in 3.0.1+v2.0.0, NOT prevb;
run 758 / PR#3 / 3.1.0+v2.0.0 confirms an L5 pass with the fixed backup mechanism).
All batches 1-6 complete. M1+M2 both claimed 2026-06-17T04:45Z.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Builder diagnosis (a3d115d) accepted:
- backupbot.backup.path in 3.0.1+v2.0.0 places the dump in the writable layer (not the restic volume)
- PR#4 (trivial regall trigger at 3.0.1+v2.0.0) exposes the bug; PR#3 (3.1.0+v2.0.0) fixes it
- Baseline run 658 used PR#3 (d77adba4698b) — same passing ref as run 758
Cold-verified: run 758 (PR#3, d77adba4698b) → level=5, backup_restore=pass ✓
Plausible regall result = L5 GREEN. Sweep now 21/21 complete.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>