# Enrolling a recipe under cc-ci (D5)

Adding a recipe is a small, repeatable, **no-harness-surgery** operation:

## 1. Make the recipe available on the mirror

Recipes under test live on the private mirror `git.autonomic.zone/recipe-maintainers/<recipe>`,
synced from upstream `git.coopcloud.tech`. If not yet mirrored, mirror it (abra fetch + push to the
org) — see the recipe mirror+PR flow (plan §4.1). A recipe may ship its own `tests/` dir in its repo;
those are discovered and run against the live app (D4 — see below).

## 2. Add the per-recipe test tree in this repo

```
tests/<recipe>/
├── recipe_meta.py      # optional per-recipe harness config (see below)
├── install_steps.sh    # optional custom install-steps hook (pre-deploy setup + deps env wiring)
├── compose.ccci.yml    # optional CI-only compose overlay (harness-copied, auto-chaos base deploy)
├── ops.py              # optional pre_<op>(ctx) seed hooks (install/upgrade/backup/restore)
├── test_install.py     # optional install overlay (runs ADDITIVELY alongside generic)
├── test_upgrade.py     # optional upgrade overlay (runs ADDITIVELY alongside generic)
├── test_backup.py      # optional backup overlay (runs ADDITIVELY alongside generic)
├── test_restore.py     # optional restore overlay (runs ADDITIVELY alongside generic)
├── PARITY.md           # Phase 2 P2: mapping table (recipe-maintainer tests → cc-ci tests)
└── custom/             # custom tier: parity ports + recipe-specific tests + browser flows
    ├── test_health_check.py   # parity port of recipe-info/<recipe>/tests/health_check.py
    ├── test_<behavior>.py     # ≥2 NEW recipe-specific tests
    ├── test_<flow>.py         # browser/UI flows where relevant
    └── …
```

**A recipe is testable with ZERO config:** with no overlay files, the **generic lifecycle suite**
runs (install/upgrade/backup/restore) against a single shared deployment — see `docs/testing.md` for
the full model (deploy-once, additive generic+overlay, the chaos PR-head upgrade, the HC2 repo-local
allowlist, the install-steps hook). The per-recipe dir only holds the bits where the recipe needs
*more* than the generic.

To add recipe-specific coverage, drop a `tests/<recipe>/test_<op>.py` **overlay** — it runs
**ALONGSIDE** the generic for that op (HC3 additive, Phase 1e); the generic floor is never silently
dropped. Overlays are **assertion-only** against the shared live deployment (the `live_app` fixture;
they never perform the op or deploy/teardown — the orchestrator owns those). If the overlay needs to
SEED pre-op state (data-continuity markers, the backup→restore divergence), put `pre_<op>(ctx)`
callables in `tests/<recipe>/ops.py` — the orchestrator runs them BEFORE the op (`ctx` is the
uniform `HookCtx` every hook receives — `docs/recipe-customization.md` §4.1). Copy an
existing recipe (`tests/custom-html/` simple/volume marker; `tests/keycloak/` admin-API;
`tests/matrix-synapse/` `db`-service psql marker). **Do not edit the shared `tests/conftest.py` /
`runner/harness/` to add a recipe** — set per-recipe knobs in `recipe_meta.py` (the COMPLETE key
reference is the generated table in `docs/recipe-customization.md` §4; unknown ALL-CAPS keys are
hard errors, recipe-private constants are underscore-prefixed `_FOO`):

```python
HEALTH_PATH = "/realms/master"  # path that returns a healthy status (default "/")
HEALTH_OK = (200,)              # acceptable status codes (default 200/301/302)
DEPLOY_TIMEOUT = 600            # seconds for services to converge (default 600)
HTTP_TIMEOUT = 600              # seconds for the app to answer (default 300)
BACKUP_CAPABLE = True           # override backup-capability auto-detect (default: scan compose)
EXTRA_ENV = {"KEY": "value"}    # or EXTRA_ENV(ctx) -> dict; extra .env keys set at deploy
```

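The "unknown ALL-CAPS keys are hard errors, `_FOO` is private" rule can be pictured with a small validator. This is only a sketch: `KNOWN_KEYS` and `validate_meta` are hypothetical names, and the key set below is limited to keys this document mentions, not the complete generated table in `docs/recipe-customization.md`.

```python
# ASSUMPTION: illustrative subset of keys named in this document; the real
# reference is the generated table in docs/recipe-customization.md §4.
KNOWN_KEYS = {
    "HEALTH_PATH", "HEALTH_OK", "DEPLOY_TIMEOUT", "HTTP_TIMEOUT",
    "BACKUP_CAPABLE", "EXTRA_ENV", "UPGRADE_EXTRA_ENV", "READY_PROBE", "DEPS",
}


def validate_meta(namespace):
    """Raise ValueError if a module namespace holds an unknown ALL-CAPS key.

    Names starting with an underscore are treated as recipe-private and
    skipped; lowercase or mixed-case names (helpers, imports) are ignored.
    """
    unknown = sorted(
        name
        for name in namespace
        if not name.startswith("_") and name.isupper() and name not in KNOWN_KEYS
    )
    if unknown:
        raise ValueError(f"unknown recipe_meta keys: {', '.join(unknown)}")


# A typo such as HELTH_PATH fails loudly instead of being silently ignored.
validate_meta({"HEALTH_PATH": "/", "_MARKER": "x", "helper": len})
```

Failing on unknown keys means a misspelled knob cannot quietly fall back to its default.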
Useful `harness.lifecycle` helpers for overlays: `http_get`, `http_fetch`, `http_body`,
`exec_in_app` (use this for data markers — volume/DB, hardened with returncode+retry); the lifecycle
ops themselves are orchestrator-owned (you never call them from an overlay). The harness forces
`LETS_ENCRYPT_ENV=""` (no ACME), a unique short domain per run, and guarantees teardown.

### 2.1 Phase-2 contract: parity port + recipe-specific functional tests + Playwright

Beyond the lifecycle overlays, each recipe carries (plan §4.1):

- **`PARITY.md`** — a mapping table from every
  `references/recipe-maintainer/recipe-info/<recipe>/tests/*.py` to a comparable cc-ci test under
  `tests/<recipe>/custom/`, asserting the *same thing* (not a renamed file). A deliberate non-port
  is documented in `DECISIONS.md` with a technical reason — never a silent omission.
- **`custom/`** — parity-port tests + **≥2 NEW recipe-specific tests** that exercise the app's
  characteristic behavior (per plan §4.3 — e.g. "create-an-object + read-it-back, and one more
  that touches a distinctive feature"). Browser/UI flows live in the same folder too. Each
  parity-port file carries a `SOURCE = "recipe-info/<recipe>/tests/<file>"` comment near the top
  so audit is in-file.

The orchestrator's **custom** tier discovers `test_*.py` in canonical `tests/<recipe>/custom/`
(plus deprecated `functional/` / `playwright/` aliases during migration; discovery warns when it
uses them) and runs each as its own pytest against the same
`live_app` shared deployment. Lifecycle-named files (`test_install.py`/etc.) are **excluded**
from the custom tier even inside those subdirs (safety net against double-running).

### 2.2 Recipe-test dependencies — DEPS = [...] (Phase 2 Q2.3)

If your recipe needs other recipes deployed alongside it (an SSO provider, a database), declare
them in `recipe_meta.py`:

```python
DEPS = ["keycloak"]  # one entry per dep recipe name (cc-ci tests/<dep>/ must exist + work)
```

The orchestrator (plan §4.2; install-time provisioning is the ONLY mode):
1. Reads `DEPS` and provisions every dep **BEFORE the single deploy** of the recipe under test —
   each dep at a per-run domain `<dep[:4]>-<6hex>.ci.commoninternet.net` (the 6hex is hashed from
   `parent_recipe + pr + ref + dep_recipe` so two recipes' deps of the same kind do not collide on
   a single node), waited healthy using the dep's own `recipe_meta.py`.
2. Persists the full per-dep identity + SSO creds dict to `$CCCI_DEPS_FILE` (jq-readable JSON,
   `{"<dep>": {"domain": ..., "realm": ..., "client_secret": ..., ...}}`).
3. Deploys the recipe under test — its `install_steps.sh` reads `$CCCI_DEPS_FILE` and wires
   OIDC env into that ONE deploy (no post-deploy redeploy). A dep-provisioning failure does NOT
   block the run: the recipe deploys alone, generic tiers run, and `requires_deps` tests skip
   with a counted reason (F2-11).
4. Tears down the deps LAST in `finally` (reverse declaration order, with `verify=True` — leaked
   deps fail the run loudly per §9 teardown sacred / F2-5 fix).

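The per-run dep domain scheme from step 1 can be sketched as below. The document only states which fields feed the hash; the hash function (SHA-256 here), the NUL separator, and the name `dep_domain` are assumptions made for illustration.

```python
import hashlib

CI_ZONE = "ci.commoninternet.net"


def dep_domain(parent_recipe, pr, ref, dep_recipe):
    """Build `<dep[:4]>-<6hex>.<zone>` for one dep of one run.

    ASSUMPTION: SHA-256 over NUL-joined fields; the harness may hash
    differently. Only the input fields and the output shape come from
    the text above.
    """
    material = "\0".join([parent_recipe, str(pr), ref, dep_recipe]).encode("utf-8")
    suffix = hashlib.sha256(material).hexdigest()[:6]
    return f"{dep_recipe[:4]}-{suffix}.{CI_ZONE}"


# Same inputs always give the same domain; a different parent recipe gives
# a different suffix, so two recipes' keycloak deps can share a node.
print(dep_domain("lasuite-docs", 12, "deadbeef", "keycloak"))
```

Deriving the suffix from the run identity rather than from randomness keeps the domain reproducible for a given recipe, PR, and ref.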
Tests access deps via the **`deps` pytest fixture** (`tests/conftest.py`) — entries expose
`.domain` plus the full creds dict (attribute or dict-style):

```python
import pytest


@pytest.mark.requires_deps
def test_my_recipe_uses_keycloak(live_app, deps):
    assert "keycloak" in deps, f"keycloak dep not deployed; {deps}"
    kc_domain = deps["keycloak"].domain
    ...
```

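To make the "attribute or dict-style" access concrete, here is a minimal sketch of loading the `$CCCI_DEPS_FILE` JSON into entries that support both styles. `DepEntry` and `load_deps` are hypothetical names; the real fixture in `tests/conftest.py` may be implemented differently.

```python
import json


class DepEntry(dict):
    """Hypothetical dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as .keys() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


def load_deps(path):
    """Read the per-dep JSON ({"<dep>": {"domain": ..., ...}}) from `path`."""
    with open(path, encoding="utf-8") as fh:
        return {name: DepEntry(fields) for name, fields in json.load(fh).items()}
```

With this shape, `deps["keycloak"].domain` and `deps["keycloak"]["domain"]` return the same value, and a missing field raises `AttributeError` rather than returning `None`.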
Deploy-count guard: with deps the expected count is `1 + len(DEPS)` (the parent + one per dep).
The orchestrator computes this and fails the run on mismatch.

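As a worked instance of the guard arithmetic, the check might look like the sketch below. `expected_deploys` and `check_deploy_count` are hypothetical names, and how the orchestrator actually counts deploys is not shown in this document.

```python
def expected_deploys(deps):
    """One deploy for the recipe under test plus one per declared dep."""
    return 1 + len(deps)


def check_deploy_count(observed, deps):
    """Raise if the observed deploy count differs from the expected one."""
    expected = expected_deploys(deps)
    if observed != expected:
        raise RuntimeError(
            f"deploy-count mismatch: expected {expected}, observed {observed}"
        )


# DEPS = ["keycloak"] -> the parent plus one dep = 2 deploys.
check_deploy_count(2, ["keycloak"])
```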
### 2.3 SSO setup — harness.sso (Phase 2 Q2.3)

For OIDC-dependent recipes, the shared `runner/harness/sso.py` provides:

```python
from harness import sso

creds = sso.setup_keycloak_realm(
    kc_domain,  # = deps["keycloak"].domain
    realm="my-realm",
    client_id="my-client",
    redirect_uris=[f"https://{live_app}/*"],
    web_origins=[f"https://{live_app}"],
)
# creds = {"realm", "client_id", "client_secret", "user", "password", "token_url", …}

sso.assert_discovery_endpoint(creds)    # GET /.well-known/openid-configuration
token = sso.oidc_password_grant(creds)  # exercises the OIDC password grant; returns JWT
```

`setup_keycloak_realm` is **idempotent** (409 → reset to known values) and uses **class-B
run-scoped secrets** (the generated `client_secret` + test-user password are destroyed when the
dep keycloak is torn down at run end, plan §4.4-B). **Note (F2-7):** the setup primitive is
keycloak-specific; when authentik comes online a parallel `setup_authentik_realm` will need to
land in `harness.sso`. The flow primitives (`oidc_password_grant`, `assert_discovery_endpoint`)
ARE provider-pluggable.

### 2.4 Non-HTTP, multi-service, and host-dependent recipes (Phase 2 Q4)

Not every recipe is a single HTTP app. `recipe_meta.py` + a few harness mechanisms cover the harder
shapes (proven on mumble, mailu, and the SSO-dependent suite):

- **`EXTRA_ENV`** — a dict **or** a `callable(ctx) -> dict`. The callable form derives values from
  the per-run domain (`ctx.domain` — e.g. `MAIL_DOMAIN`/`HOSTNAMES` for mailu, `SANDBOX_DOMAIN` for
  cryptpad). Applied at every deploy (`abra.env_set`), so a recipe enrolls with NO shared-harness change.
- **`READY_PROBE(ctx) -> [...]`** — readiness signals beyond replica-convergence + the app's
  `HEALTH_PATH`. Two probe shapes:
  - HTTP: `{"host": "...", "path": "/...", "ok": (200,)}` (e.g. lasuite-drive collabora WOPI discovery).
  - **TCP**: `{"tcp_host": "127.0.0.1", "tcp_port": 64738, "stable": 3}` — polls a socket connect N
    consecutive times. Use for non-HTTP services whose `HEALTH_PATH` reflects a sidecar, not the real
    service (mumble: the mumble-web sidecar serves HTTP 200 while the voice server on 64738 is still
    rebinding after an upgrade redeploy — the TCP probe gates the backup tier until the voice server is
    actually up). Runs after install AND after the upgrade chaos redeploy.
- **`compose.ccci.yml`** (first-class at `tests/<recipe>/compose.ccci.yml`) — a CI-only compose
  overlay the harness itself copies into the recipe checkout before the base deploy, automatically
  using `--chaos` for that deploy (the untracked file would otherwise trip abra's pinned-deploy
  clean-tree check). Reference it from `EXTRA_ENV`'s `COMPOSE_FILE`. Minimal, justified fallback
  only (e.g. ghost's 15m `start_period` grace). `abra.recipe_checkout` force-checks-out (`-f`) so
  the upgrade tier's re-checkout to PR-head overwrites such overlays cleanly.
- **`install_steps.sh`** (auto-discovered at `tests/<recipe>/install_steps.sh`) — runs after
  `abra app new` + EXTRA_ENV + secret-generate, BEFORE the single deploy, with `CCCI_APP_DOMAIN` /
  `CCCI_APP_ENV` / `CCCI_RECIPE` (and `CCCI_DEPS_FILE` when the recipe declares DEPS — deps are
  always provisioned before the deploy). Use it to wire dep-derived env/secrets, seed config, etc.

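The TCP probe shape in the list above ("polls a socket connect N consecutive times") can be sketched with the standard library. `tcp_stable`, its `timeout` and `interval` parameters, and the reset-on-failure behavior are assumptions for illustration; the harness's own probe loop is not shown in this document.

```python
import socket
import time


def tcp_stable(host, port, stable=3, timeout=60.0, interval=0.5):
    """Return True once `stable` consecutive TCP connects succeed.

    ASSUMPTION: a failed connect resets the streak to zero, and the whole
    probe gives up (returns False) after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    streak = 0
    while time.monotonic() < deadline:
        try:
            # Open and immediately close a connection; we only care that
            # something is accepting on the port.
            with socket.create_connection((host, port), timeout=2.0):
                streak += 1
        except OSError:
            streak = 0
        if streak >= stable:
            return True
        time.sleep(interval)
    return False


# Mirrors the probe dict from the text; keys are unpacked by hand here.
probe = {"tcp_host": "127.0.0.1", "tcp_port": 64738, "stable": 3}
# ready = tcp_stable(probe["tcp_host"], probe["tcp_port"], stable=probe["stable"])
```

Requiring several consecutive successes, rather than one, guards against a port that answers once while the service is still restarting.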
**Non-HTTP protocol tests (mumble).** Reach a TCP service published `mode: host` (via a host-ports
overlay) at `127.0.0.1:<port>` — cc-ci runs tests on-host (cc-ci-run). mumble ships a stdlib protocol
client (`tests/mumble/custom/_mumble_proto.py`) doing the real TLS handshake → ServerSync; the
recipe-specific tests assert channel presence and config round-trips (a deploy-set
`WELCOME_TEXT`/`USERS` value surfaces over the protocol — version-independent, non-vacuous).

**In-container functional tests (mailu).** When network access to a service is constrained (mailu uses
`TLS_FLAVOR=notls` because certdumper needs traefik ACME which cc-ci does not run → dovecot refuses
plaintext auth over the network), exercise the app via
`lifecycle.exec_in_app(domain, [...], service="<svc>")` against the relevant container: e.g.
`flask mailu user ...` (admin) to create a mailbox, then a local `sendmail` inject (smtp) →
`doveadm search` (imap) to prove real postfix→rspamd→dovecot delivery. This hits the same stack the
network path would, without the env constraint.

**P4 when the recipe ships no backup (`backupbot`) labels.** `generic.backup_capable` auto-detects the
`backupbot.backup` label; recipes without it (mailu, drone) cleanly SKIP the backup/restore tiers —
P4 is genuinely N/A (nothing to back up), not a cut corner. Document it in `PARITY.md` + a `DEFERRED.md`
entry (the durable fix is a backupbot recipe-PR, like immich), and seek Adversary §7.1 sign-off.

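A rough picture of label-based backup detection, assuming a plain text scan of the recipe's compose files. The function name mirrors `generic.backup_capable` from the text, but the scanning strategy, the `compose*.yml` glob, and the `override` parameter (standing in for `BACKUP_CAPABLE`) are assumptions, not the harness's actual logic.

```python
import re
from pathlib import Path

# Matches `backupbot.backup=...` or `backupbot.backup: ...`, but not
# sibling labels such as `backupbot.backup.pre-hook`.
_BACKUP_LABEL = re.compile(r"backupbot\.backup\s*[=:]")


def backup_capable(recipe_dir, override=None):
    """Return True if any compose file in `recipe_dir` sets the label.

    `override` models an explicit BACKUP_CAPABLE value and wins when set.
    """
    if override is not None:
        return bool(override)
    for compose in sorted(Path(recipe_dir).glob("compose*.yml")):
        for line in compose.read_text(encoding="utf-8").splitlines():
            if line.lstrip().startswith("#"):
                continue  # ignore commented-out labels
            if _BACKUP_LABEL.search(line):
                return True
    return False
```

A text scan avoids a YAML dependency, at the cost of not evaluating the label's value; a recipe that sets the label to a false value would need the explicit override.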
## 3. Recipe-local tests (D4) — default-deny (HC2)

If the recipe's own repo contains `tests/test_*.py` / `install_steps.sh` / `ops.py`, the runner
snapshots them right after fetch — but per Phase 1e HC2 it executes them **only** for recipes on the
cc-ci approval allowlist `tests/repo-local-approved.txt` (default empty ⇒ default-deny). PR-author
code runs on the CI host with `/run/secrets/*` present, so adding a recipe to the allowlist is a
deliberate cc-ci-maintainer act (in a cc-ci PR, after reviewing that recipe's repo-local tests).
Without approval, only the cc-ci overlays in this repo + the generic floor run. Approved recipe-local
files receive env `CCCI_BASE_URL` (e.g. `https://<app>.ci.commoninternet.net/`) and `CCCI_APP_DOMAIN`.

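The default-deny behavior can be sketched as a simple allowlist lookup. The file format assumed here (one recipe name per line, `#` comments, blank lines ignored) and the function names are illustrative; the document only states that the file exists and is empty by default.

```python
from pathlib import Path


def approved_recipes(path):
    """Parse the allowlist; a missing or empty file approves nothing."""
    allowlist = Path(path)
    if not allowlist.exists():
        return set()
    names = set()
    for line in allowlist.read_text(encoding="utf-8").splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if entry:
            names.add(entry)
    return names


def may_run_repo_local(recipe, path="tests/repo-local-approved.txt"):
    """Default-deny: recipe-local code runs only for explicitly listed recipes."""
    return recipe in approved_recipes(path)
```

Treating a missing file the same as an empty one keeps the failure mode safe: nothing from a PR author executes unless a maintainer has listed the recipe.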
## 4. Add the repo to the bridge poll list

The trigger is **polling** (primary): add the repo's full name to the comment-bridge `POLL_REPOS`
CSV (`nix/modules/bridge.nix`) and run `nixos-rebuild switch`. The bridge then polls that repo's open
PRs every 30s and fires a run on a new `!testme` comment from an authorized org member. This needs
only **read + comment** access — no webhook, no repo-admin.

`!testme` on a PR runs install/upgrade/backup/restore, the custom tier, and any approved
recipe-local tests (§3), and reports back to the PR.

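The trigger rule ("a new `!testme` comment from an authorized org member") might be sketched as below. The comment shape (a dict with `id`, `body`, `user`), the exact-match comparison on the trimmed body, and the name `should_trigger` are all assumptions; the bridge's real matching rules are not described here.

```python
def should_trigger(comment, authorized_users, seen_ids):
    """Decide whether one PR comment should fire a CI run.

    ASSUMPTIONS: `comment` is a dict with "id", "body" and "user" keys, and
    the body must equal "!testme" after trimming whitespace.
    """
    if comment["id"] in seen_ids:
        return False  # already handled on an earlier poll
    if comment["body"].strip() != "!testme":
        return False
    if comment["user"] not in authorized_users:
        return False
    seen_ids.add(comment["id"])  # remember it so later polls skip it
    return True


seen = set()
comment = {"id": 101, "body": "!testme", "user": "alice"}
first = should_trigger(comment, {"alice"}, seen)   # fires the run
second = should_trigger(comment, {"alice"}, seen)  # same comment, next poll
```

Tracking handled comment ids is what turns a 30s poll of the same open PRs into "fire once per new comment".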
### Optional: lower-latency webhook (admin-registered)

Polling already satisfies D1 (<60s). For lower latency an **admin** may *optionally* register a
Gitea `issue_comment` webhook (the bot does **not** self-register one — that needs repo-admin):

- URL `https://ci.commoninternet.net/hook`, content-type `application/json`, event `Issue Comment`,
  secret = the shared webhook HMAC (`secrets/secrets.yaml` → `webhook_hmac`).
- The Gitea instance must allow the host (admin: add `ci.commoninternet.net` to the
  `[webhook] ALLOWED_HOST_LIST`).

The webhook and poller are deduped by comment id, so a comment seen by both fires only once.

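For the shared-secret check, a receiver could verify the payload roughly as follows. This sketch assumes the signature arrives as a hex HMAC-SHA256 of the raw request body (the form Gitea sends in its `X-Gitea-Signature` header); `signature_ok` is a hypothetical name and the bridge's actual handler is not shown in this document.

```python
import hashlib
import hmac


def signature_ok(secret, body, header_sig):
    """Compare the received hex signature with one computed over `body`.

    `secret` and `body` are bytes; `header_sig` is the header value as text.
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    received = header_sig.strip().lower()
    # Constant-time comparison; bytes avoid errors on non-ASCII input.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The comparison must run over the exact bytes received, before any JSON parsing or re-serialization, or a valid signature will fail to match.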
## Run locally

```sh
RECIPE=<recipe> PR=<n> REF=<sha-or-branch> SRC=recipe-maintainers/<recipe> \
  STAGES=install,upgrade,backup,restore,custom cc-ci-run runner/run_recipe_ci.py
```

## Worked example — lasuite-docs (OIDC-dependent, Phase 2)

```
tests/lasuite-docs/
├── recipe_meta.py      # HEALTH_PATH="/", DEPLOY_TIMEOUT=900, EXTRA_ENV(ctx) for cold-pull,
│                       # DEPS=["keycloak"] ← Phase 2 dep declaration
├── install_steps.sh    # wires OIDC env from $CCCI_DEPS_FILE into the single deploy
├── ops.py              # pre_<op>(ctx) seed hooks (volume marker for backup/restore data-integrity)
├── test_install.py     # lifecycle install overlay (Playwright frontend SPA load)
├── test_upgrade.py     # lifecycle upgrade overlay (marker survives chaos redeploy)
├── test_backup.py      # lifecycle backup overlay (marker captured)
├── test_restore.py     # lifecycle restore overlay (marker restored to pre-mutation)
├── PARITY.md           # parity-port mapping (P2)
└── custom/
    ├── test_health_check.py        # parity port (SOURCE comment cites recipe-info file)
    ├── test_auth_required.py       # specific: /api/v1.0/users/me/ → 401 without auth
    └── test_oidc_with_keycloak.py  # specific: full OIDC flow against the dep keycloak (uses
                                    # harness.sso primitives + the `deps` fixture)
```

`!testme` on a lasuite-docs PR drives the orchestrator to:
1. Provision the per-run keycloak dep (`keyc-<6hex>.ci.commoninternet.net`), wait healthy, write
   creds to `$CCCI_DEPS_FILE` — BEFORE the recipe deploy.
2. Deploy lasuite-docs (`lasu-<6hex>.ci.commoninternet.net`); `install_steps.sh` wires the OIDC
   env into that one deploy.
3. Run install / upgrade / backup / restore + the 3 custom tests against the shared
   deployment (custom tier).
4. Tear down lasuite-docs, then the keycloak dep (LAST), both with `verify=True`.
5. Print the run summary; non-zero exit code on any failure (DG4.1 deploy-count mismatch, tier
   FAIL, dep teardown leak — all surfaced).

### Other shapes (concrete references)

- **TCP / voice recipe — `tests/mumble/`**: `recipe_meta.py` (EXTRA_ENV sets
  `COMPOSE_FILE=compose.yml:compose.mumbleweb.yml` for the base; `UPGRADE_EXTRA_ENV` adds the
  native `compose.host-ports.yml` at PR-head so 64738 is host-published on latest; private
  `_WELCOME_TEXT_MARKER`/`_MAX_USERS` constants; `READY_PROBE(ctx)` TCP 64738 — phase-aware via
  the live COMPOSE_FILE), `custom/_mumble_proto.py` + the protocol/config-round-trip
  tests, `ops.py`/`test_backup.py`/`test_restore.py` (sqlite P4). See §2.4.
- **Multi-service, dep-less, in-container functional — `tests/mailu/`**: `recipe_meta.py`
  (`EXTRA_ENV(ctx)` with `TLS_FLAVOR=notls` + `MAIL_DOMAIN`/`HOSTNAMES`/`TRAEFIK_STACK_NAME`),
  `custom/_mailu.py` (flask-CLI helpers), `test_mailbox.py` (create→config-export read-back),
  `test_mail_flow.py` (in-container sendmail→doveadm delivery). No backupbot → P4 N/A (PARITY.md +
  DEFERRED.md). See §2.4.