cc-ci — Co-op Cloud recipe CI server
Comment !testme on a PR in an enrolled Co-op Cloud recipe repo and cc-ci deploys the recipe
at that commit onto a real single-node Docker Swarm, runs install / upgrade / backup-restore tests
(Python + Playwright) end-to-end, and reports a live, tail-able run with pass/fail back to the PR.
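As a rough illustration of the trigger step, a comment listener might decide whether a PR comment requests a run along these lines. This is a minimal sketch, not the code in bridge/: the function name `is_testme_comment` and the rule that `!testme` must appear as a standalone token are assumptions made for the example.

```python
import re

# Assumption: "!testme" counts only as a standalone, whitespace-delimited token,
# so text such as "!testmeplease" or "foo!testme" does not trigger a run.
TRIGGER = re.compile(r"(?<!\S)!testme(?!\S)")


def is_testme_comment(body: str) -> bool:
    """Return True if the comment body contains the !testme trigger token."""
    return bool(TRIGGER.search(body))
```

A real listener would also need to authenticate the webhook and resolve the PR's head commit before scheduling a deploy; those steps are omitted here.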
This repo declares the entire server as a NixOS flake and holds the test harness, the per-recipe test trees, and the docs to enroll a recipe or rebuild the box from scratch.
Status: under active autonomous construction. See
machine-docs/STATUS.md for the live phase and plan.md-driven milestones in machine-docs/BACKLOG.md. Definition of Done is D1–D10 (see the build plan).
Layout
flake.nix           NixOS entry point + devshells (stays at root; build ref #cc-ci)
nix/hosts/cc-ci/    the cc-ci machine config
nix/modules/        drone, comment-bridge, swarm, dashboard, secrets (Nix modules)
secrets/            sops-encrypted infra secrets (cc-ci-secrets submodule)
bridge/             !testme webhook listener source
runner/             run_recipe_ci.py + shared pytest harness
dashboard/          results overview generator
tests/<recipe>/     per-recipe install/upgrade/backup tests + playwright/
docs/               install, enroll-recipe, secrets, architecture, runbook, baseline
All .nix code lives under nix/; flake.nix/flake.lock stay at the repo root so the build
reference (nixos-rebuild switch --flake '…#cc-ci') is unchanged.
Docs
docs/install.md: rebuild the server from scratch (D8)
docs/testing.md: test architecture: generic lifecycle suite + layered recipe overlays (override/extend, discovery precedence, custom install-steps hook)
docs/enroll-recipe.md: add a recipe under CI (D5)
docs/secrets.md: secret model + rotation (D6)
docs/architecture.md, docs/runbook.md: design + debugging failed runs
docs/baseline.md: bootstrap snapshot / rollback reference
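The "layered recipe overlays" idea can be pictured with a small sketch. The actual discovery rules are described in docs/testing.md; everything below is an assumption for illustration only: the `_generic` directory name, the `test_<stage>.py` file naming, and the rule that a recipe's own file replaces the generic one for the same stage.

```python
from pathlib import Path

# Hypothetical stage names, taken from the install / upgrade / backup tests
# mentioned in the layout above.
STAGES = ("install", "upgrade", "backup")


def resolve_tests(tests_root: Path, recipe: str, generic_dir: str = "_generic") -> dict[str, Path]:
    """Pick one test file per stage, preferring the recipe overlay over the generic suite."""
    resolved: dict[str, Path] = {}
    for stage in STAGES:
        name = f"test_{stage}.py"
        overlay = tests_root / recipe / name
        generic = tests_root / generic_dir / name
        if overlay.is_file():
            resolved[stage] = overlay  # recipe-specific override wins
        elif generic.is_file():
            resolved[stage] = generic  # fall back to the shared lifecycle test
    return resolved
```

Under these assumptions, a recipe that ships only its own upgrade test would still run the generic install and backup tests.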
Linting & formatting
The codebase is kept formatted + lint-clean by a single entrypoint, run from the pinned lint
devshell so local and CI use identical tool versions:
nix develop .#lint --command bash scripts/lint.sh # check-only (what CI runs)
nix develop .#lint --command bash scripts/lint.sh --fix # auto-format + apply fixes
Covers Nix (nixpkgs-fmt · statix · deadnix), Python (ruff lint+format), Shell
(shellcheck · shfmt), and YAML (yamllint). Config lives in ruff.toml / .yamllint.yaml;
tool/strictness choices are in machine-docs/DECISIONS.md. CI enforces it: the lint step in the
.drone.yml push pipeline runs the same check-only command and fails the build on any unclean file, so
run the --fix variant before pushing.
Loop state (autonomous build)
The multi-agent loop state lives under machine-docs/: STATUS.md (phase/blockers),
BACKLOG.md (work + adversary findings), REVIEW.md (independent verification), JOURNAL.md
(build log), DECISIONS.md (architecture choices) — plus the phase-namespaced *-1b.md / *-1c.md
variants. See the build plan for the two-loop Builder/Adversary protocol.