capacity=2 went live with three stale capacity=1-era assumptions that corrupted
concurrent runs (immich 229/230 '/pg_backup.sh: No such file'):
- ~/.abra/recipes/<recipe> is ONE shared working tree that fetch_recipe rm-rf's/
reclones and the upgrade tier git-checkouts mid-run. Same-recipe runs now
serialise on an exclusive flock (/run/lock/cc-ci-recipe-<recipe>.lock), taken
in main() BEFORE fetch_recipe and held for the whole run; the kernel releases
it on any process death, so there is no stale-lock failure mode. Different
recipes still run in parallel.
- CCCI_JANITOR_MAX_AGE=0 made a starting build reap ANY in-flight run app. Every
run now registers its app domain + pid in /run/cc-ci-active/<domain> before
app creation; the janitor checks the owner: alive (pid is a live run_recipe_ci
process) -> never reaped; dead -> reaped immediately; unknown (pre-registry or
post-reboot) -> age fallback (default 2h). The MAX_AGE=0 env override is gone
from .drone.yml.
- .drone.yml: concurrency.limit 1 -> 2 to match DRONE_RUNNER_CAPACITY=2; the
'safe because capacity=1' comments now describe the flock+registry model.
lint: PASS, unit tests: 138 passed.
bridge/bridge.py: parse_trigger(body) → (is_trigger, quick); accepts exactly
'!testme' (cold, default) and '!testme --quick' (opt-in fast lane), rejects
'!testmexyz'/'!testme foo'/etc. Threaded through both poll + webhook paths and
process_testme → trigger_build adds the CCCI_QUICK=1 Drone param (auto-exposed
to run_recipe_ci). PR comment labels a quick run lower-confidence. .drone.yml
echoes quick=. +3 unit tests (incl. the !testmexyz negative). 64 unit pass.
WC7: default !testme stays full cold; --quick opt-in, never gates merge.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
flake.nix/flake.lock STAY at root so the build ref #cc-ci is unchanged; only flake's internal
configuration.nix path updated. Root-relative refs inside moved modules re-based ../X -> ../../X
(secrets/bridge/dashboard); configuration.nix's ../../modules imports unchanged (both dirs under nix/).
Living docs (README, architecture/install/secrets/enroll) + .drone.yml comment updated to nix/...;
append-only history logs left as-is. DECISIONS.md records RL5 + the deferred-coordinated RL6.
Verified on cc-ci: nixos-rebuild build 'path:#cc-ci' -> toplevel 8i3jcad9 (BYTE-IDENTICAL to the
pre-move build — store derivations are content-addressed on file contents, module .nix not in the
runtime closure); scripts/lint.sh -> lint: PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The exec runner sets HOME to a per-build workspace, leaving ~/.abra empty
(FATA directory is empty: .../home/drone/.abra/servers). Force HOME=/root in the
step so abra and the harness's ~/.abra/recipes resolve to the real config, as the
manual runs did. Safe at capacity=1 (no concurrent build shares /root/.abra).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Splits .drone.yml into a push-triggered self-test pipeline and a custom-triggered
recipe-ci pipeline. The bridge fires event=custom builds with RECIPE/REF/PR/SRC
params; recipe-ci runs the shared harness (install/upgrade/backup + recipe-local)
with STAGES set and CCCI_JANITOR_MAX_AGE=0 (safe at capacity=1), concurrency limit 1.
Connects the verified !testme trigger to actual recipe CI (D2/D10 path).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>