# Enrolling a recipe under cc-ci (D5)

Adding a recipe is a small, repeatable, **no-harness-surgery** operation:

## 1. Make the recipe available on the mirror

Recipes under test live on the private mirror `git.autonomic.zone/recipe-maintainers/<recipe>`,
synced from upstream `git.coopcloud.tech`. If the recipe is not yet mirrored, mirror it (abra fetch,
then push to the org); see the recipe mirror+PR flow (plan §4.1). A recipe may ship its own `tests/`
dir in its repo; those tests are discovered and, once the recipe is approved, run against the live
app (D4, see section 3).

## 2. Add the per-recipe test tree in this repo

```
tests/<recipe>/
├── recipe_meta.py     # optional per-recipe harness config (see below)
├── install_steps.sh   # optional custom install-steps hook (pre-deploy setup)
├── ops.py             # optional pre-op seed hooks (pre_install/pre_upgrade/pre_backup/pre_restore)
├── test_install.py    # optional install overlay (runs ADDITIVELY alongside generic)
├── test_upgrade.py    # optional upgrade overlay (runs ADDITIVELY alongside generic)
├── test_backup.py     # optional backup overlay (runs ADDITIVELY alongside generic)
└── test_restore.py    # optional restore overlay (runs ADDITIVELY alongside generic)
```

**A recipe is testable with ZERO config:** with no overlay files, the **generic lifecycle suite**
runs (install/upgrade/backup/restore) against a single shared deployment. See `docs/testing.md` for
the full model (deploy-once, additive generic+overlay, the chaos PR-head upgrade, the HC2 repo-local
allowlist, the install-steps hook). The per-recipe dir holds only the bits where the recipe needs
*more* than the generic suite.

To add recipe-specific coverage, drop a `tests/<recipe>/test_<op>.py` **overlay**: it runs
**ALONGSIDE** the generic suite for that op (HC3 additive, Phase 1e), so the generic floor is never
silently dropped. Overlays are **assertion-only** against the shared live deployment (via the
`live_app` fixture); they never perform the op or deploy/teardown, because the orchestrator owns
those. If an overlay needs to SEED pre-op state (data-continuity markers, the backup→restore
divergence), put `pre_<op>(domain, meta)` callables in `tests/<recipe>/ops.py`; the orchestrator
runs them BEFORE the op. The easiest start is to copy an existing recipe: `tests/custom-html/`
(simple; volume marker), `tests/keycloak/` (admin API), or `tests/matrix-synapse/` (`db`-service
psql marker). **Do not edit the shared `tests/conftest.py` / `runner/harness/` to add a recipe**;
set per-recipe knobs in `recipe_meta.py` instead:

```python
HEALTH_PATH = "/realms/master"   # path that returns a healthy status (default "/")
HEALTH_OK = (200,)               # acceptable status codes (default 200/301/302)
DEPLOY_TIMEOUT = 600             # seconds for services to converge (default 600)
HTTP_TIMEOUT = 600               # seconds for the app to answer (default 300)
BACKUP_CAPABLE = True            # override backup-capability auto-detect (default: scan compose)
EXTRA_ENV = {"KEY": "value"}     # or EXTRA_ENV(domain) -> dict; extra .env keys set at deploy
SKIP_GENERIC = ["upgrade"]       # per-recipe opt-out from the generic floor for the listed ops
                                 # ("all"/"*" = every op); rarely needed, generic is the floor
```

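
As a rough illustration of how two of these knobs might be interpreted, the sketch below resolves `EXTRA_ENV` (plain dict or callable taking the domain) and `SKIP_GENERIC` (a list of ops, or `"all"`/`"*"`) from a loaded `recipe_meta` module. The helper names `resolve_extra_env` and `generic_skipped`, the `APP_URL` key, and the example domain are hypothetical; the real harness logic lives in `runner/harness/` and may differ.

```python
from types import SimpleNamespace

# Per the SKIP_GENERIC comment: "all" or "*" opts out of every op.
ALL_MARKERS = {"all", "*"}


def resolve_extra_env(meta, domain):
    """Return the extra .env keys for this run (hypothetical helper).

    EXTRA_ENV may be a plain dict or a callable that takes the run's domain.
    A missing EXTRA_ENV is treated as "no extra keys".
    """
    extra = getattr(meta, "EXTRA_ENV", {})
    if callable(extra):
        extra = extra(domain)
    return dict(extra)


def generic_skipped(meta, op):
    """True if recipe_meta opts `op` out of the generic floor (hypothetical helper)."""
    skip = set(getattr(meta, "SKIP_GENERIC", []))
    return op in skip or bool(skip & ALL_MARKERS)


# Stand-in for an imported tests/<recipe>/recipe_meta.py module.
meta = SimpleNamespace(
    EXTRA_ENV=lambda domain: {"APP_URL": f"https://{domain}"},  # hypothetical key
    SKIP_GENERIC=["upgrade"],
)

print(resolve_extra_env(meta, "demo.example.test"))
print(generic_skipped(meta, "upgrade"), generic_skipped(meta, "install"))
```

Treating a missing attribute as "no override" mirrors the zero-config promise above: a recipe with no `recipe_meta.py` entries still gets the full generic floor.
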
Useful `harness.lifecycle` helpers for overlays: `http_get`, `http_fetch`, `http_body`, and
`exec_in_app` (use this for data markers in a volume or DB; it is hardened with returncode checks and
retry). The lifecycle ops themselves are orchestrator-owned, so you never call them from an overlay.
The harness forces `LETS_ENCRYPT_ENV=""` (no ACME), uses a unique short domain per run, and
guarantees teardown.

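
The `pre_<op>(domain, meta)` hook convention for `ops.py` can be pictured with the minimal sketch below: before each op, look up an optional `pre_<op>` callable and call it if present. The dispatcher name `run_pre_hook` and the example domain are hypothetical, and the stand-in `ops` object only records calls; a real hook would typically seed a marker through `exec_in_app`, whose signature is not shown in this document.

```python
from types import SimpleNamespace


def run_pre_hook(ops, op, domain, meta):
    """Call ops.pre_<op>(domain, meta) if it is defined.

    Returns True if a hook ran, False if the recipe defines none for this op.
    """
    hook = getattr(ops, f"pre_{op}", None)
    if not callable(hook):
        return False
    hook(domain, meta)
    return True


# Stand-in for tests/<recipe>/ops.py: it records calls instead of touching an app.
calls = []
ops = SimpleNamespace(
    pre_backup=lambda domain, meta: calls.append(("pre_backup", domain)),
)

for op in ("install", "backup"):
    ran = run_pre_hook(ops, op, "demo.example.test", meta=None)
    print(op, "hook ran" if ran else "no hook")
```

Because a missing hook is simply skipped, a recipe only needs to define the `pre_<op>` functions for the ops where its overlay expects seeded state.
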
## 3. Recipe-local tests (D4): default-deny (HC2)

If the recipe's own repo contains `tests/test_*.py` / `install_steps.sh` / `ops.py`, the runner
snapshots them right after fetch, but per Phase 1e HC2 it executes them **only** for recipes on the
cc-ci approval allowlist `tests/repo-local-approved.txt` (empty by default, hence default-deny).
PR-author code runs on the CI host with `/run/secrets/*` present, so adding a recipe to the
allowlist is a deliberate cc-ci-maintainer act (done in a cc-ci PR, after reviewing that recipe's
repo-local tests). Without approval, only the cc-ci overlays in this repo plus the generic floor
run. Approved recipe-local files receive the env vars `CCCI_BASE_URL` (e.g.
`https://<app>.ci.commoninternet.net/`) and `CCCI_APP_DOMAIN`.

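
The default-deny rule can be sketched as a simple membership check. This is only an illustration: the file format of `tests/repo-local-approved.txt` is not specified in this document, so the sketch assumes one recipe name per line with `#` comments, and the function names are hypothetical.

```python
def parse_allowlist(text):
    """Return the set of approved recipe names.

    Assumed format (not confirmed by the docs): one recipe per line,
    blank lines ignored, '#' starts a comment.
    """
    approved = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            approved.add(name)
    return approved


def may_run_repo_local(recipe, allowlist_text):
    """Default-deny: only explicitly listed recipes may run repo-local tests."""
    return recipe in parse_allowlist(allowlist_text)


# An empty allowlist approves nothing.
print(may_run_repo_local("custom-html", ""))
# After a maintainer adds the recipe in a reviewed cc-ci PR, it is approved.
print(may_run_repo_local("custom-html", "# reviewed in a cc-ci PR\ncustom-html\n"))
```

The check is an exact-name match on purpose: with secrets present on the CI host, a pattern or prefix match would make it too easy to approve more than was reviewed.
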
## 4. Add the repo to the bridge poll list

The primary trigger is **polling**: add the repo's full name to the comment-bridge `POLL_REPOS`
comma-separated list (`nix/modules/bridge.nix`) and run `nixos-rebuild switch`. The bridge then
polls that repo's open PRs every 30s and fires a run on each new `!testme` comment from an
authorized org member. This needs only **read + comment** access: no webhook, no repo-admin.

A `!testme` comment on a PR runs install/upgrade/backup plus any approved recipe-local tests, and
reports the result back to the PR.

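
The trigger decision can be sketched as a small filter over polled comments. This is a hedged illustration, not the bridge's actual code: the function name `should_fire` is hypothetical, the comment dicts follow the shape of Gitea's issue-comment API objects (`id`, `body`, `user.login`), and matching `!testme` as the first word of the comment is an assumption. The `seen_ids` set reflects the comment-id dedup that this document describes for the bridge.

```python
def should_fire(comment, seen_ids, authorized_logins):
    """Decide whether a polled PR comment should start a run.

    Every comment id is remembered, so a comment is evaluated at most once
    even if it is delivered again (e.g. by a later poll).
    """
    comment_id = comment["id"]
    if comment_id in seen_ids:
        return False
    seen_ids.add(comment_id)

    words = comment["body"].split()
    is_trigger = bool(words) and words[0] == "!testme"  # assumed matching rule
    return is_trigger and comment["user"]["login"] in authorized_logins


seen = set()
authorized = {"alice"}  # hypothetical org member
polled = [
    {"id": 1, "body": "!testme", "user": {"login": "alice"}},
    {"id": 1, "body": "!testme", "user": {"login": "alice"}},    # same comment again
    {"id": 2, "body": "!testme", "user": {"login": "mallory"}},  # not authorized
    {"id": 3, "body": "looks good", "user": {"login": "alice"}},
]
print([should_fire(c, seen, authorized) for c in polled])
```
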
### Optional: lower-latency webhook (admin-registered)

Polling already satisfies D1 (<60s). For lower latency an **admin** may *optionally* register a
Gitea `issue_comment` webhook (the bot does **not** self-register one, since that needs repo-admin):

- URL `https://ci.commoninternet.net/hook`, content-type `application/json`, event `Issue Comment`,
  secret = the shared webhook HMAC (`secrets/secrets.yaml` → `webhook_hmac`).
- The Gitea instance must allow the host (admin: add `ci.commoninternet.net` to the
  `[webhook] ALLOWED_HOST_LIST`).

The webhook and poller are deduped by comment id, so a comment seen by both fires only once.

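
For context on what the shared webhook secret is used for: Gitea signs each delivery with an HMAC-SHA256 of the raw request body and sends the hex digest in the `X-Gitea-Signature` header. A receiver can check it roughly as sketched below. The function name `signature_ok` and the sample secret and payload are hypothetical, and this is not necessarily how the `/hook` endpoint is implemented.

```python
import hashlib
import hmac


def signature_ok(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check a Gitea-style webhook signature.

    `body` must be the raw request bytes, exactly as received, and
    `signature_hex` the value of the X-Gitea-Signature header.
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


secret = b"example-shared-secret"        # hypothetical; the real one is webhook_hmac
body = b'{"action": "created"}'          # hypothetical payload
good = hmac.new(secret, body, hashlib.sha256).hexdigest()

print(signature_ok(secret, body, good))
print(signature_ok(secret, body + b" ", good))
```

Verifying against the raw bytes rather than re-serialized JSON matters, because any change in whitespace or key order changes the digest.
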
## Run locally

```sh
RECIPE=<recipe> PR=<n> REF=<sha-or-branch> SRC=recipe-maintainers/<recipe> \
  STAGES=install,upgrade,backup cc-ci-run runner/run_recipe_ci.py
```