# Plan: mirror + enroll ALL recipes (then resume per-recipe debugging)

**Status:** ACTIVE. Loops implementing (phase `mirror`). Live-host deploy is autonomous (gate removed 2026-06-02).
**Owner:** Builder + Adversary loops.
**Created:** 2026-06-02. **Author:** Claude Sonnet 4.6 orchestrator session.

## Goal & rationale

Get **every** recipe mirrored in `recipe-maintainers/<recipe>` AND enrolled in the `!testme` bridge,
so all of them are CI-triggerable, **before** resuming debugging of individual recipes (matrix-synapse
re-run failure, ghost backup PR, etc.). Operator directive: "make sure all recipes are mirrored and
enrolled before we continue debugging particular recipes."

Target end-state: **19 recipes**, i.e. the 18 with `tests/` coverage today **plus hedgedoc** (for which the
operator chose "add a test suite"), each mirrored, enrolled in `POLL_REPOS`, and test-covered.

## Current state (surveyed 2026-06-02)

Canonical set = recipes with a `tests/<recipe>/` dir = **18**:
`bluesky-pds, cryptpad, custom-html, custom-html-tiny, discourse, ghost, immich, keycloak,
lasuite-docs, lasuite-drive, lasuite-meet, mailu, matrix-synapse, mattermost-lts, mumble, n8n,
plausible, uptime-kuma`. (+ hedgedoc, enrolled but no tests; see Phase 2.)

| Dimension | State |
|---|---|
| **Enrolled** in bridge `POLL_REPOS` (9 of the canonical 18) | custom-html, custom-html-tiny, keycloak, cryptpad, matrix-synapse, lasuite-docs, lasuite-meet, n8n, uptime-kuma (+ hedgedoc, + cc-ci) |
| **NOT enrolled** (9) | bluesky-pds, discourse, ghost, immich, lasuite-drive, mailu, mattermost-lts, mumble, plausible |
| **Mirror missing** (3) | lasuite-drive, mailu, mumble (all verified as real recipes) |
| **Enrolled but untested** | hedgedoc (mirror + enrollment exist, no `tests/hedgedoc/`) |

Where things live:

- Bridge enrollment: `recipe-maintainers/cc-ci` → `nix/modules/bridge.nix`, the `POLL_REPOS=` CSV (~line 43).
- Tests: `recipe-maintainers/cc-ci` → `tests/<recipe>/` (template: `recipe_meta.py`, `functional/test_*.py`, `PARITY.md`).
- Mirror create + main-sync logic: `recipe-upgrade/open-recipe-pr.sh` (create at lines 53-70, force-sync at 75-77).
- Live deploy target: `nixos-rebuild switch --flake .#cc-ci` on the cc-ci host (now safe: `be4f451` mapped `#cc-ci` → the Hetzner host config).

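
The counts in the survey table can be cross-checked with plain set arithmetic. The sketch below only restates the recipe names listed in this section; the variable names are illustrative and not part of the cc-ci repo.

```python
# Recipe names copied from the survey in this section.
CANONICAL = {
    "bluesky-pds", "cryptpad", "custom-html", "custom-html-tiny", "discourse",
    "ghost", "immich", "keycloak", "lasuite-docs", "lasuite-drive",
    "lasuite-meet", "mailu", "matrix-synapse", "mattermost-lts", "mumble",
    "n8n", "plausible", "uptime-kuma",
}
ENROLLED = {
    "custom-html", "custom-html-tiny", "keycloak", "cryptpad", "matrix-synapse",
    "lasuite-docs", "lasuite-meet", "n8n", "uptime-kuma",
}
MIRROR_MISSING = {"lasuite-drive", "mailu", "mumble"}

NOT_ENROLLED = CANONICAL - ENROLLED   # recipes Phase 3 has to add
TARGET = CANONICAL | {"hedgedoc"}     # end-state recipe set

# Prints: 18 9 9 19
print(len(CANONICAL), len(ENROLLED), len(NOT_ENROLLED), len(TARGET))
# Prints: [] (every recipe with a missing mirror is also unenrolled)
print(sorted(MIRROR_MISSING - NOT_ENROLLED))
```

The second line of output confirms that Phase 1 (mirrors) has to finish for those three recipes before Phase 3 enrolls them.
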
## Phases

### Phase 0: pre-flight (no writes)

- Confirm each of `lasuite-drive, mailu, mumble` resolves via `abra recipe fetch <recipe>` on the cc-ci
  host (i.e. the upstream exists). All three have `tests/`, so they were already exercised in the earlier
  phase 2 (not this plan's Phase 2); the check is expected to pass.
- Snapshot the current `POLL_REPOS` and the live bridge unit state for rollback reference.

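
For the snapshot step, a minimal sketch of pulling the current poll set out of `bridge.nix` could look like this. It assumes the value is a plain comma-separated list directly after `POLL_REPOS=`; the function name `parse_poll_repos` and the sample excerpt are hypothetical, and the real line in `bridge.nix` may be shaped differently.

```python
import re


def parse_poll_repos(nix_text: str) -> list[str]:
    """Return the entries of the first POLL_REPOS= CSV found in nix_text."""
    # Assumption: the CSV ends at whitespace, a double quote, or a semicolon.
    match = re.search(r'POLL_REPOS=([^\s";]+)', nix_text)
    if match is None:
        raise ValueError("no POLL_REPOS= assignment found")
    return [entry for entry in match.group(1).split(",") if entry]


# Hypothetical excerpt, for illustration only.
SAMPLE = 'Environment = [ "POLL_REPOS=cc-ci,custom-html,keycloak" ];'
print(parse_poll_repos(SAMPLE))  # ['cc-ci', 'custom-html', 'keycloak']
```

Saving the parsed list next to the commit hash of the snapshot gives a concrete reference to compare against after Phase 4.
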
### Phase 1: create the 3 missing mirrors

For each of `lasuite-drive, mailu, mumble`: create `recipe-maintainers/<recipe>` (Gitea API) and
force-sync its `main` to the true upstream `main`. Reuse the create+sync path in `open-recipe-pr.sh`
(run on the cc-ci host with bot creds), or use `--reconcile-only` once the repo exists. **No PRs opened.**

### Phase 2: author the hedgedoc test suite

hedgedoc is enrolled and mirrored but has no `tests/hedgedoc/`. Author a suite modeled on a simple recipe
(template = `tests/uptime-kuma/`): `recipe_meta.py`, `functional/test_*.py` (a health check plus a
content/branding probe at minimum), and `PARITY.md`. Open a cc-ci PR for the new suite and verify it is green
via `!testme` before relying on it. (This is the larger sub-task; it can be delegated to a Builder session.)

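
As a rough illustration of the two minimum probes, here is a sketch with the HTTP fetch injected so the checks can be exercised without a running instance. Everything in it is an assumption: the helper names, probing the root path `/`, and the marker string `"HedgeDoc"` are placeholders, and the real suite should follow whatever conventions `tests/uptime-kuma/` actually uses.

```python
from typing import Callable, Tuple
from urllib.error import HTTPError
from urllib.request import urlopen

# A fetcher takes a URL and returns (HTTP status code, response body).
Fetch = Callable[[str], Tuple[int, str]]


def http_fetch(url: str) -> Tuple[int, str]:
    """Real fetcher for use against a deployed instance."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except HTTPError as err:
        return err.code, ""


def check_health(fetch: Fetch, base_url: str) -> bool:
    """Health check: the landing page answers with HTTP 200."""
    status, _ = fetch(base_url + "/")
    return status == 200


def check_branding(fetch: Fetch, base_url: str, marker: str = "HedgeDoc") -> bool:
    """Content probe: the landing page mentions the expected product name."""
    status, body = fetch(base_url + "/")
    return status == 200 and marker in body


def fake_ok(url: str) -> Tuple[int, str]:
    """Stand-in fetcher that simulates a healthy instance."""
    return 200, "<title>HedgeDoc</title>"


print(check_health(fake_ok, "https://example.invalid"))    # True
print(check_branding(fake_ok, "https://example.invalid"))  # True
```

Passing `http_fetch` instead of `fake_ok` turns the same checks into live probes.
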
### Phase 3: enroll the 9 unenrolled recipes

Edit `nix/modules/bridge.nix` `POLL_REPOS` to add: `bluesky-pds, discourse, ghost, immich,
lasuite-drive, mailu, mattermost-lts, mumble, plausible`. Confirm each has a `tests/<recipe>/` (all 9
do). Commit to the cc-ci product repo. Final `POLL_REPOS` = cc-ci + all 19 recipes.

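
The enrollment edit amounts to an order-preserving, duplicate-free merge. The sketch below assumes entries are bare repo names and that the current list is ordered as shown; both are guesses about `bridge.nix`, and `add_repos` is a hypothetical helper.

```python
def add_repos(current: list[str], additions: list[str]) -> list[str]:
    """Append additions to current, skipping names that are already present."""
    merged = list(current)
    for repo in additions:
        if repo not in merged:
            merged.append(repo)
    return merged


# Assumed current poll set: cc-ci, the 9 enrolled recipes, and hedgedoc.
CURRENT = [
    "cc-ci", "custom-html", "custom-html-tiny", "keycloak", "cryptpad",
    "matrix-synapse", "lasuite-docs", "lasuite-meet", "n8n", "uptime-kuma",
    "hedgedoc",
]
# The 9 recipes named in Phase 3.
ADDITIONS = [
    "bluesky-pds", "discourse", "ghost", "immich", "lasuite-drive",
    "mailu", "mattermost-lts", "mumble", "plausible",
]

final = add_repos(CURRENT, ADDITIONS)
print(len(final))  # 20 = cc-ci + 19 recipes
print("POLL_REPOS=" + ",".join(final))
```

Because the merge skips existing names, re-running it on its own output leaves the list unchanged.
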
### Phase 4: deploy to the live cc-ci host

Run `cd /root/cc-ci && nixos-rebuild switch --flake .#cc-ci` on the cc-ci host to restart the bridge with
the new poll set. **The loops deploy this themselves.** It is their normal operation, and `#cc-ci` has
resolved to the correct Hetzner host config since `be4f451`, so the prior wrong-target footgun (the
emergency-mode incident) is gone. Procedure: sync `/root/cc-ci` to the committed head first, rebuild, then
verify that the rebuild succeeded (`ssh cc-ci` reachable, bridge active, poll set = all 19 recipes).
**Roll back** (`nixos-rebuild switch --rollback`) and record a finding if anything regresses. Note:
`/root/cc-ci` is operator-synced. If for some reason the host repo can't be synced to head, claim + flag it
rather than deploy a stale tree.

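
The rebuild, verify, roll back sequence can be sketched as a small driver with the command runner and the verification step injected. The two `nixos-rebuild` invocations are the ones named in this plan; the function names, the return labels, and the choice not to auto-roll-back after a failed rebuild (on the assumption that a failed build may not have activated a new generation) are illustrative, not the loops' actual implementation.

```python
import subprocess
from typing import Callable, List

REBUILD = ["nixos-rebuild", "switch", "--flake", ".#cc-ci"]
ROLLBACK = ["nixos-rebuild", "switch", "--rollback"]

Run = Callable[[List[str]], int]  # runs a command, returns its exit code


def shell_run(cmd: List[str]) -> int:
    """Real runner: execute cmd from the host checkout."""
    return subprocess.run(cmd, cwd="/root/cc-ci").returncode


def deploy(run: Run, verify: Callable[[], bool]) -> str:
    """Rebuild, then verify; roll back if verification fails."""
    if run(REBUILD) != 0:
        # Assumption: nothing new was activated, so rolling back here could
        # step behind the last good generation. Record a finding instead.
        return "rebuild-failed"
    if verify():
        return "deployed"
    run(ROLLBACK)
    return "rolled-back"


# Dry run with a fake runner and a verification step that fails.
calls: List[List[str]] = []


def fake_run(cmd: List[str]) -> int:
    calls.append(cmd)
    return 0


print(deploy(fake_run, verify=lambda: False))  # rolled-back
print(calls == [REBUILD, ROLLBACK])            # True
```

In real use, `verify` would bundle the three checks from the procedure: SSH reachability, bridge unit active, and poll set equal to all 19 recipes.
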
### Phase 5: verify `!testme` triggerability

For 2-3 newly enrolled recipes, post `!testme` on an open PR (or a scratch PR) and confirm that a Drone
build starts and reports back. Spot-check that the bridge poll log shows all 19 repos.

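
The poll-log spot check can be automated with a whole-name match, which matters because `custom-html` is a prefix of `custom-html-tiny`. The log format is not specified in this plan, so the sample lines and the helper name `missing_from_log` below are hypothetical.

```python
import re


def missing_from_log(log_text: str, repos: list[str]) -> list[str]:
    """Return the repos whose full name never appears in log_text."""
    missing = []
    for repo in repos:
        # Reject matches that are part of a longer hyphenated name.
        pattern = r"(?<![\w-])" + re.escape(repo) + r"(?![\w-])"
        if re.search(pattern, log_text) is None:
            missing.append(repo)
    return missing


# Hypothetical log excerpt, for illustration only.
LOG = (
    "polling recipe-maintainers/custom-html-tiny\n"
    "polling recipe-maintainers/ghost\n"
)
print(missing_from_log(LOG, ["custom-html", "custom-html-tiny", "ghost"]))
# ['custom-html']
```

An empty result for the full 19-recipe list is the pass condition for this phase.
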
### Phase 6: resume per-recipe debugging (was blocked on the above)

Only after Phases 1-5: pick up the deferred per-recipe work, including the matrix-synapse upgrade re-run
failure, ghost backup PRs (#1 reopened, #2 upgrade), discourse bitnamilegacy re-pin, and
immich/mattermost/plausible backup fixes. (See `DEFERRED.md` + the build-audit summary.)

## Risks & rollback

- **Live-host rebuild (Phase 4):** the highest-impact step, but the wrong-target footgun is fixed
  (`#cc-ci` → Hetzner config, `be4f451`) and deploying cc-ci is the loops' normal operation, so it runs
  autonomously. Safety net: verify after rebuild and `nixos-rebuild switch --rollback` on any regression.
- **Bridge poll widening:** more repos polled = more API calls; negligible at 19 repos. A bad recipe
  enrollment can't break others (per-recipe runs are isolated).
- **hedgedoc tests (Phase 2):** authoring risk only; gated by its own `!testme`-green PR before trust.

## Open items / decisions

- hedgedoc: **author tests** (operator-chosen). Scope it as its own PR.
- `bluesky-pds #1` open PR looks like a `recipe-create-pr` smoke-test artifact; close it separately
  (flagged to @notplants).
- Host self-service rebuild path for cc-ci is still a gap (Phase 4 depends on an operator-synced
  `/root/cc-ci`); worth a durable fix later.