Files
cc-ci-orchestrator/cc-ci-plan/plan-mirror-enroll-all-recipes.md
autonomic-bot ad2ade842c plan(mirror): remove the operator deploy gate — loops deploy+verify autonomously
The gate existed because a wrong-target nixos-rebuild #cc-ci once dropped
the cc-ci server into emergency mode. That footgun is fixed (be4f451 maps
#cc-ci -> the Hetzner host config), and deploying cc-ci is the loops'
normal operation, so Phase 4 now runs autonomously with verify + rollback
as the safety net.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 00:38:59 +00:00

5.9 KiB

Plan — mirror + enroll ALL recipes (then resume per-recipe debugging)

Status: ACTIVE — loops implementing (phase mirror). Live-host deploy is autonomous (gate removed 2026-06-02). Owner: Builder + Adversary loops. Created: 2026-06-02. Author: Claude Sonnet 4.6 orchestrator session.

Goal & rationale

Get every recipe mirrored in recipe-maintainers/<recipe> AND enrolled in the !testme bridge, so all of them are CI-triggerable, before resuming debugging of individual recipes (matrix-synapse re-run failure, ghost backup PR, etc.). Operator directive: "make sure all recipes are mirrored and enrolled before we continue debugging particular recipes."

Target end-state: 19 recipes — the 18 with tests/ coverage today, plus hedgedoc (operator chose "add a test suite" for it) — each mirrored, enrolled in POLL_REPOS, and test-covered.

Current state (surveyed 2026-06-02)

Canonical set = recipes with a tests/<recipe>/ dir = 18: bluesky-pds, cryptpad, custom-html, custom-html-tiny, discourse, ghost, immich, keycloak, lasuite-docs, lasuite-drive, lasuite-meet, mailu, matrix-synapse, mattermost-lts, mumble, n8n, plausible, uptime-kuma. (+ hedgedoc, enrolled but no tests — see Phase 2.)

Dimension State
Enrolled in bridge POLL_REPOS (9) custom-html, custom-html-tiny, keycloak, cryptpad, matrix-synapse, lasuite-docs, lasuite-meet, n8n, uptime-kuma (+ hedgedoc, + cc-ci)
NOT enrolled (9) bluesky-pds, discourse, ghost, immich, lasuite-drive, mailu, mattermost-lts, mumble, plausible
Mirror missing (3) lasuite-drive, mailu, mumble (all real recipes — verified)
Enrolled but untested hedgedoc (mirror+enrollment exist, no tests/hedgedoc/)

Where things live:

  • Bridge enrollment: recipe-maintainers/cc-cinix/modules/bridge.nix, the POLL_REPOS= CSV (~line 43).
  • Tests: recipe-maintainers/cc-citests/<recipe>/ (template: recipe_meta.py, functional/test_*.py, PARITY.md).
  • Mirror create + main-sync logic: recipe-upgrade/open-recipe-pr.sh (create at lines 53-70, force-sync at 75-77).
  • Live deploy target: nixos-rebuild switch --flake .#cc-ci on the cc-ci host (now safe — be4f451 mapped #cc-ci → the Hetzner host config).

Phases

Phase 0 — pre-flight (no writes)

  • Confirm each of lasuite-drive, mailu, mumble resolves via abra recipe fetch <recipe> on the cc-ci host (upstream exists). All three have tests/ so they were exercised in phase 2; expected to pass.
  • Snapshot current POLL_REPOS and the live bridge unit state for rollback reference.

Phase 1 — create the 3 missing mirrors

For each of lasuite-drive, mailu, mumble: create recipe-maintainers/<recipe> (Gitea API) and force-sync its main to true upstream main. Reuse the create+sync path in open-recipe-pr.sh (run on the cc-ci host with bot creds), or --reconcile-only after the repo exists. No PRs opened.

Phase 2 — author the hedgedoc test suite

hedgedoc is enrolled+mirrored but has no tests/hedgedoc/. Author one mirroring a simple recipe (template = tests/uptime-kuma/): recipe_meta.py, functional/test_*.py (health-check + a content/branding probe at minimum), PARITY.md. Open a cc-ci PR for the new suite; verify it green via !testme before relying on it. (This is the larger sub-task; can be delegated to a Builder session.)

Phase 3 — enroll the 9 unenrolled recipes

Edit nix/modules/bridge.nix POLL_REPOS to add: bluesky-pds, discourse, ghost, immich, lasuite-drive, mailu, mattermost-lts, mumble, plausible. Confirm each has a tests/<recipe>/ (all 9 do). Commit to the cc-ci product repo. Final POLL_REPOS = cc-ci + all 19 recipes.

Phase 4 — deploy to the live cc-ci host

cd /root/cc-ci && nixos-rebuild switch --flake .#cc-ci on the cc-ci host to restart the bridge with the new poll set. The loops deploy this themselves — it's their normal operation, and #cc-ci → the correct Hetzner host config since be4f451, so the prior wrong-target footgun (the emergency-mode incident) is gone. Procedure: sync /root/cc-ci to the committed head first, rebuild, then verify the rebuild succeeded (ssh cc-ci reachable, bridge active, poll set = all 19 recipes). Roll back (nixos-rebuild switch --rollback) and record a finding if anything regresses. Note: /root/cc-ci is operator-synced — if for some reason the host repo can't be synced to head, claim + flag it rather than deploy a stale tree.

Phase 5 — verify !testme triggerability

For 2-3 newly-enrolled recipes, post !testme on an open PR (or a scratch PR) and confirm a Drone build starts and reports back. Spot-check the bridge poll log shows all 19 repos.

Phase 6 — resume per-recipe debugging (was blocked on the above)

Only after Phases 1-5: pick up the deferred per-recipe work — matrix-synapse upgrade re-run failure, ghost backup PRs (#1 reopened, #2 upgrade), discourse bitnamilegacy re-pin, immich/mattermost/plausible backup fixes, etc. (See DEFERRED.md + the build-audit summary.)

Risks & rollback

  • Live-host rebuild (Phase 4): the highest-impact step, but the wrong-target footgun is fixed (#cc-ci → Hetzner config, be4f451) and deploying cc-ci is the loops' normal operation, so it runs autonomously. Safety net: verify after rebuild and nixos-rebuild switch --rollback on any regression.
  • Bridge poll widening: more repos polled = more API calls; negligible at 19 repos. A bad recipe enrollment can't break others (per-recipe runs are isolated).
  • hedgedoc tests (Phase 2): authoring risk only; gated by its own !testme-green PR before trust.

Open items / decisions

  • hedgedoc: author tests (operator-chosen). Scope it as its own PR.
  • bluesky-pds #1 open PR looks like a recipe-create-pr smoke-test artifact — close separately (flagged to @notplants).
  • Host self-service rebuild path for cc-ci is still a gap (Phase 4 depends on an operator-synced /root/cc-ci); worth a durable fix later.