autonomic-bot 4b38b66fa5 fix(2): lasuite-drive Q3.2a — gate upgrade redeploy on collabora-ready + plumb DEPLOY_TIMEOUT
Q3.2a run 1: Part A (install-time OIDC) GREEN — deploy-count=1, install/backup/restore/custom +
OIDC test all PASS. BUT upgrade tier FAILED: the in-place `abra app deploy --chaos` redeploy landed
on a STILL-BOOTING collabora (coolwsd ~2min boot: 1300+ l10n files + RSA keygen) and SIGTERMed it
mid-init ("Shutdown requested while starting up", forced exit 70) → abra aborted the deploy. The
install wait_healthy returns on container 1/1 while coolwsd is still loading. Fixes (plan §C
readiness-gating, no test weakened):

- tests/lasuite-drive/ops.py::pre_upgrade — wait for collabora WOPI discovery (/hosting/discovery
  on collabora-<domain>) → 200 BEFORE the chaos redeploy, so it replaces a ready collabora cleanly.
- runner/harness/lifecycle.chaos_redeploy + generic.perform_upgrade + run_recipe_ci._perform_op —
  plumb the recipe DEPLOY_TIMEOUT to the upgrade chaos redeploy (was abra.deploy's 900s default,
  while the .env internal TIMEOUT is 1500s → Python could SIGKILL abra mid-wait on the slow
  collabora/onlyoffice reconverge). Mirrors the install deploy_app timeout plumbing.

Also (operator naming change 2026-05-29): renamed `--extra-tests` -> `--extra` in DEFERRED.md +
BACKLOG-2.md Build-backlog section. 3 refs remain in BACKLOG-2 Adversary-findings section
(241/248/292, closed findings) — left for the Adversary (single-writer); orchestrator updated
IDEAS.md/plan-sso-dep-testing.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 10:37:55 +01:00

cc-ci — Co-op Cloud recipe CI server

Comment !testme on a PR in an enrolled Co-op Cloud recipe repo and cc-ci deploys the recipe at that commit onto a real single-node Docker Swarm, runs install / upgrade / backup-restore tests (Python + Playwright) end-to-end, and reports a live, tail-able run with pass/fail back to the PR.

This repo declares the entire server as a NixOS flake and holds the test harness, the per-recipe test trees, and the docs to enroll a recipe or rebuild the box from scratch.

Status: under active autonomous construction. See machine-docs/STATUS.md for the live phase and plan.md-driven milestones in machine-docs/BACKLOG.md. Definition of Done is D1D10 (see the build plan).

Layout

flake.nix              NixOS entry point + devshells (stays at root; build ref #cc-ci)
nix/hosts/cc-ci/       the cc-ci machine config
nix/modules/           drone, comment-bridge, swarm, dashboard, secrets (Nix modules)
secrets/               sops-encrypted infra secrets (cc-ci-secrets submodule)
bridge/                !testme webhook listener source
runner/                run_recipe_ci.py + shared pytest harness
dashboard/             results overview generator
tests/<recipe>/        per-recipe install/upgrade/backup tests + playwright/
docs/                  install, enroll-recipe, secrets, architecture, runbook, baseline

All .nix code lives under nix/; flake.nix/flake.lock stay at the repo root so the build reference (nixos-rebuild switch --flake '…#cc-ci') is unchanged.

Docs

  • docs/install.md — rebuild the server from scratch (D8)
  • docs/testing.md — test architecture: generic lifecycle suite + layered recipe overlays (override/extend, discovery precedence, custom install-steps hook)
  • docs/enroll-recipe.md — add a recipe under CI (D5)
  • docs/secrets.md — secret model + rotation (D6)
  • docs/architecture.md, docs/runbook.md — design + debugging failed runs
  • docs/baseline.md — bootstrap snapshot / rollback reference

Linting & formatting

The codebase is kept formatted + lint-clean by a single entrypoint, run from the pinned lint devshell so local and CI use identical tool versions:

nix develop .#lint --command bash scripts/lint.sh         # check-only (what CI runs)
nix develop .#lint --command bash scripts/lint.sh --fix   # auto-format + apply fixes

Covers Nix (nixpkgs-fmt · statix · deadnix), Python (ruff lint+format), Shell (shellcheck · shfmt), and YAML (yamllint). Config lives in ruff.toml / .yamllint.yaml; tool/strictness choices are in machine-docs/DECISIONS.md. CI enforces it: the lint step in the .drone.yml push pipeline runs the same command and fails the build on any unclean file, so keep commits clean (--fix before pushing).

Loop state (autonomous build)

The multi-agent loop state lives under machine-docs/: STATUS.md (phase/blockers), BACKLOG.md (work + adversary findings), REVIEW.md (independent verification), JOURNAL.md (build log), DECISIONS.md (architecture choices) — plus the phase-namespaced *-1b.md / *-1c.md variants. See the build plan for the two-loop Builder/Adversary protocol.

Description
Co-op Cloud recipe CI server (autonomous build)
Readme 19 MiB
Languages
Python 91.5%
Nix 5.9%
Shell 2.6%