# Enrolling a recipe under cc-ci (D5)

Adding a recipe is a small, repeatable, **no-harness-surgery** operation:

## 1. Make the recipe available on the mirror

Recipes under test live on the private mirror `git.autonomic.zone/recipe-maintainers/<recipe>`,
synced from upstream `git.coopcloud.tech`. If not yet mirrored, mirror it (abra fetch + push to the
org) — see the recipe mirror+PR flow (plan §4.1). A recipe may ship its own `tests/` dir in its repo;
those are discovered and run against the live app (D4 — see below).

## 2. Add the per-recipe test tree in this repo

```
tests/<recipe>/
├── recipe_meta.py            # optional per-recipe harness config (see below)
├── install_steps.sh          # optional custom install-steps hook (pre-deploy setup)
├── ops.py                    # optional pre-op seed hooks (pre_install/pre_upgrade/pre_backup/pre_restore)
├── test_install.py           # optional install overlay (runs ADDITIVELY alongside generic)
├── test_upgrade.py           # optional upgrade overlay (runs ADDITIVELY alongside generic)
├── test_backup.py            # optional backup overlay (runs ADDITIVELY alongside generic)
├── test_restore.py           # optional restore overlay (runs ADDITIVELY alongside generic)
├── PARITY.md                 # Phase 2 P2: mapping table (recipe-maintainer tests → cc-ci tests)
├── functional/               # Phase 2 P3: parity ports + ≥2 NEW recipe-specific tests
│   ├── test_health_check.py  # parity port of recipe-info/<recipe>/tests/health_check.py
│   ├── test_<behavior>.py    # ≥2 NEW recipe-specific functional tests
│   └── …
└── playwright/               # Phase 2 P6: browser flows where the app's core UX is a UI
    └── test_<flow>.py
```

**A recipe is testable with ZERO config:** with no overlay files, the **generic lifecycle suite**
runs (install/upgrade/backup/restore) against a single shared deployment — see `docs/testing.md` for
the full model (deploy-once, additive generic+overlay, the chaos PR-head upgrade, the HC2 repo-local
allowlist, the install-steps hook). The per-recipe dir only holds the bits where the recipe needs
*more* than the generic.

To add recipe-specific coverage, drop a `tests/<recipe>/test_<op>.py` **overlay** — it runs
**ALONGSIDE** the generic for that op (HC3 additive, Phase 1e); the generic floor is never silently
dropped. Overlays are **assertion-only** against the shared live deployment (the `live_app` fixture;
they never perform the op or deploy/teardown — the orchestrator owns those). If the overlay needs to
SEED pre-op state (data-continuity markers, the backup→restore divergence), put
`pre_<op>(domain, meta)` callables in `tests/<recipe>/ops.py` — the orchestrator runs them BEFORE
the op. Copy an existing recipe (`tests/custom-html/` simple/volume marker; `tests/keycloak/`
admin-API; `tests/matrix-synapse/` `db`-service psql marker). **Do not edit the shared
`tests/conftest.py` / `runner/harness/` to add a recipe** — set per-recipe knobs in `recipe_meta.py`:

```python
HEALTH_PATH = "/realms/master"   # path that returns a healthy status (default "/")
HEALTH_OK = (200,)               # acceptable status codes (default 200/301/302)
DEPLOY_TIMEOUT = 600             # seconds for services to converge (default 600)
HTTP_TIMEOUT = 600               # seconds for the app to answer (default 300)
BACKUP_CAPABLE = True            # override backup-capability auto-detect (default: scan compose)
EXTRA_ENV = {"KEY": "value"}     # or EXTRA_ENV(domain) -> dict; extra .env keys set at deploy
SKIP_GENERIC = ["upgrade"]       # per-recipe opt-out from the generic floor for the listed ops
                                 # ("all"/"*" = every op); rarely needed — generic is the floor
```

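
The defaults noted in the comments above could be applied roughly as follows. This is a minimal sketch, not the harness's actual loader: `DEFAULTS`, `LIFECYCLE_OPS` and `resolve_meta` are hypothetical names, the example domain is made up, and `BACKUP_CAPABLE` is left out because its default is auto-detection rather than a fixed value.

```python
from typing import Any, Dict

# Defaults as documented in the knob comments; names here are hypothetical.
DEFAULTS: Dict[str, Any] = {
    "HEALTH_PATH": "/",
    "HEALTH_OK": (200, 301, 302),
    "DEPLOY_TIMEOUT": 600,
    "HTTP_TIMEOUT": 300,
    "EXTRA_ENV": {},
    "SKIP_GENERIC": [],
}

LIFECYCLE_OPS = ("install", "upgrade", "backup", "restore")


def resolve_meta(meta: Dict[str, Any], domain: str) -> Dict[str, Any]:
    """Merge recipe knobs over the defaults and normalise the flexible ones."""
    resolved = {**DEFAULTS, **meta}

    # EXTRA_ENV may be a plain dict or a callable taking the per-run domain.
    extra = resolved["EXTRA_ENV"]
    resolved["EXTRA_ENV"] = dict(extra(domain) if callable(extra) else extra)

    # "all" or "*" opts out of the generic floor for every lifecycle op.
    skip = list(resolved["SKIP_GENERIC"])
    if "all" in skip or "*" in skip:
        skip = list(LIFECYCLE_OPS)
    resolved["SKIP_GENERIC"] = skip
    return resolved


# Example knobs, as a recipe_meta.py might define them.
meta = {
    "HEALTH_PATH": "/realms/master",
    "EXTRA_ENV": lambda domain: {"MAIL_DOMAIN": domain},
    "SKIP_GENERIC": ["*"],
}
resolved = resolve_meta(meta, "demo-abc123.ci.commoninternet.net")
```

Unset knobs fall back to the documented defaults, while the callable `EXTRA_ENV` is evaluated once with the per-run domain.
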
Useful `harness.lifecycle` helpers for overlays: `http_get`, `http_fetch`, `http_body`,
`exec_in_app` (use this for data markers — volume/DB, hardened with returncode+retry); the lifecycle
ops themselves are orchestrator-owned (you never call them from an overlay). The harness forces
`LETS_ENCRYPT_ENV=""` (no ACME), a unique short domain per run, and guarantees teardown.

### 2.1 Phase-2 contract: parity port + recipe-specific functional tests + Playwright

Beyond the lifecycle overlays, each recipe carries (plan §4.1):

- **`PARITY.md`** — a mapping table from every
  `references/recipe-maintainer/recipe-info/<recipe>/tests/*.py` to a comparable cc-ci test under
  `tests/<recipe>/functional/`, asserting the *same thing* (not a renamed file). A deliberate
  non-port is documented in `DECISIONS.md` with a technical reason — never a silent omission.
- **`functional/`** — parity-port tests + **≥2 NEW recipe-specific functional tests** that
  exercise the app's characteristic behavior (per plan §4.3 — e.g. "create-an-object +
  read-it-back, and one more that touches a distinctive feature"). Each parity-port file carries
  a `SOURCE = "recipe-info/<recipe>/tests/<file>"` comment near the top so audit is in-file.
- **`playwright/`** — browser flows where the recipe's core UX is a UI (P6).

The orchestrator's **custom** tier discovers `test_*.py` in `tests/<recipe>/{functional,playwright}/`
(recursive, via `runner/harness/discovery.custom_tests`) and runs each as its own pytest against
the same `live_app` shared deployment. Lifecycle-named files (`test_install.py`/etc.) are
**excluded** from the custom tier — they live at the top level and run as lifecycle overlays.

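
The discovery rule can be pictured with a short sketch. This is a hypothetical reimplementation for illustration, not the code in `runner/harness/discovery`: `find_custom_tests`, `LIFECYCLE_FILES` and `CUSTOM_DIRS` are assumed names, and filtering lifecycle-named files by basename is an assumption about how the exclusion works.

```python
import tempfile
from pathlib import Path
from typing import List

# Lifecycle overlay names are reserved for the top-level lifecycle tier.
LIFECYCLE_FILES = {"test_install.py", "test_upgrade.py", "test_backup.py", "test_restore.py"}
CUSTOM_DIRS = ("functional", "playwright")


def find_custom_tests(recipe_dir: Path) -> List[Path]:
    """Return test_*.py files under functional/ and playwright/, recursively."""
    found: List[Path] = []
    for sub in CUSTOM_DIRS:
        root = recipe_dir / sub
        if not root.is_dir():
            continue
        for path in sorted(root.rglob("test_*.py")):
            if path.name not in LIFECYCLE_FILES:
                found.append(path)
    return found


# Demo on a throwaway tree: helpers and top-level overlays are not picked up.
with tempfile.TemporaryDirectory() as tmp:
    recipe = Path(tmp) / "tests" / "demo"
    for rel in ("test_install.py", "functional/test_health_check.py",
                "functional/_helpers.py", "playwright/flows/test_login.py"):
        target = recipe / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    names = [p.relative_to(recipe).as_posix() for p in find_custom_tests(recipe)]
```

Helper modules such as `_helpers.py` do not match `test_*.py`, so they can sit next to the tests without being collected as a test file of their own.
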
### 2.2 Recipe-test dependencies — DEPS = [...] (Phase 2 Q2.3)

If your recipe needs other recipes deployed alongside it (an SSO provider, a database), declare
them in `recipe_meta.py`:

```python
DEPS = ["keycloak"]   # one entry per dep recipe name (cc-ci tests/<dep>/ must exist + work)
```

The orchestrator (plan §4.2):

1. Reads `DEPS` BEFORE deploying the recipe under test.
2. Deploys each dep at a per-run domain `<dep[:4]>-<6hex>.ci.commoninternet.net` (the 6hex is
   hashed from `parent_recipe + pr + ref + dep_recipe` so two recipes' deps of the same kind do
   not collide on a single node).
3. Waits for each dep to become healthy using its own `recipe_meta.py`
   (HEALTH_PATH/HEALTH_OK/timeouts).
4. Persists `[{"recipe": "<dep>", "domain": "<dep-domain>"}, ...]` to `$CCCI_DEPS_FILE`.
5. Deploys + tests the recipe under test as usual.
6. Tears down the deps LAST in `finally` (reverse declaration order, with `verify=True` — leaked
   deps fail the run loudly per §9 teardown sacred / F2-5 fix).

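
The per-run dep domain from step 2 can be sketched as below. The hash function and the way the four inputs are joined are not specified in this document, so SHA-256 over a NUL-joined key is an assumption, as are the names `dep_domain` and `CI_ZONE` and the example PR number and ref.

```python
import hashlib

CI_ZONE = "ci.commoninternet.net"


def dep_domain(parent_recipe: str, pr: str, ref: str, dep_recipe: str) -> str:
    """Build `<dep[:4]>-<6hex>.<zone>` from the run identity."""
    # Assumption: SHA-256 over a NUL-joined key; the orchestrator's real hash
    # and key layout may differ.
    key = "\0".join((parent_recipe, pr, ref, dep_recipe)).encode("utf-8")
    suffix = hashlib.sha256(key).hexdigest()[:6]
    return f"{dep_recipe[:4]}-{suffix}.{CI_ZONE}"


# Hypothetical run identity: lasuite-docs PR 12 at ref "abc123" with a keycloak dep.
first = dep_domain("lasuite-docs", "12", "abc123", "keycloak")
again = dep_domain("lasuite-docs", "12", "abc123", "keycloak")
```

Because the parent recipe is part of the key, a keycloak dep provisioned for a different parent would almost certainly get a different 6-hex suffix, which is what keeps same-kind deps apart on one node.
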
Tests access dep domains via the **`deps_apps` pytest fixture** (`tests/conftest.py`):

```python
def test_my_recipe_uses_keycloak(live_app, deps_apps):
    assert "keycloak" in deps_apps, f"keycloak dep not deployed; {deps_apps}"
    kc_domain = deps_apps["keycloak"]
    …
```

Deploy-count guard: with deps the expected count is `1 + len(DEPS)` (the parent + one per dep).
The orchestrator computes this and fails the run on mismatch.

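
As a rough illustration of how the persisted deps list and the deploy-count guard fit together, the sketch below turns the JSON written to `$CCCI_DEPS_FILE` into the name-to-domain mapping that `deps_apps` exposes, and computes the expected deploy count. `load_deps` and `expected_deploys` are hypothetical helpers, not the fixture's actual code, and the domain in the demo is made up.

```python
import json
import os
import tempfile
from typing import Dict, List


def load_deps(path: str) -> Dict[str, str]:
    """Map dep recipe name -> dep domain from the persisted JSON list."""
    with open(path, encoding="utf-8") as fh:
        entries = json.load(fh)
    return {entry["recipe"]: entry["domain"] for entry in entries}


def expected_deploys(deps: List[str]) -> int:
    """The parent recipe plus one deployment per declared dep."""
    return 1 + len(deps)


# Demo with a throwaway file standing in for $CCCI_DEPS_FILE.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as fh:
    json.dump([{"recipe": "keycloak", "domain": "keyc-1a2b3c.ci.commoninternet.net"}], fh)
deps_apps = load_deps(fh.name)
os.unlink(fh.name)
```

With `DEPS = ["keycloak"]` the guard expects two deployments; a recipe without deps expects exactly one.
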
### 2.3 SSO setup — harness.sso (Phase 2 Q2.3)

For OIDC-dependent recipes, the shared `runner/harness/sso.py` provides:

```python
from harness import sso

creds = sso.setup_keycloak_realm(
    kc_domain,                           # = deps_apps["keycloak"]
    realm="my-realm",
    client_id="my-client",
    redirect_uris=[f"https://{live_app}/*"],
    web_origins=[f"https://{live_app}"],
)
# creds = {"realm", "client_id", "client_secret", "user", "password", "token_url", …}

sso.assert_discovery_endpoint(creds)     # GET /.well-known/openid-configuration
token = sso.oidc_password_grant(creds)   # exercises the OIDC password grant; returns JWT
```

`setup_keycloak_realm` is **idempotent** (409 → reset to known values) and uses **class-B
run-scoped secrets** (the generated `client_secret` + test-user password are destroyed when the
dep keycloak is torn down at run end, plan §4.4-B). **Note (F2-7):** the setup primitive is
keycloak-specific; when authentik comes online a parallel `setup_authentik_realm` will need to
land in `harness.sso`. The flow primitives (`oidc_password_grant`, `assert_discovery_endpoint`)
ARE provider-pluggable.

### 2.4 Non-HTTP, multi-service, and host-dependent recipes (Phase 2 Q4)

Not every recipe is a single HTTP app. `recipe_meta.py` + a few harness mechanisms cover the harder
shapes (proven on mumble, mailu, and the SSO-dependent suite):

- **`EXTRA_ENV`** — a dict **or** a `callable(domain) -> dict`. The callable form derives values from
  the per-run domain (e.g. `MAIL_DOMAIN`/`HOSTNAMES` for mailu, `SANDBOX_DOMAIN` for cryptpad). Applied
  at every deploy (`abra.env_set`), so a recipe enrolls with NO shared-harness change.
- **`READY_PROBE(domain) -> [...]`** — readiness signals beyond replica-convergence + the app's
  `HEALTH_PATH`. Two probe shapes:
  - HTTP: `{"host": "...", "path": "/...", "ok": (200,)}` (e.g. lasuite-drive collabora WOPI discovery).
  - **TCP**: `{"tcp_host": "127.0.0.1", "tcp_port": 64738, "stable": 3}` — polls a socket connect N
    consecutive times. Use for non-HTTP services whose `HEALTH_PATH` reflects a sidecar, not the real
    service (mumble: the mumble-web sidecar serves HTTP 200 while the voice server on 64738 is still
    rebinding after an upgrade redeploy — the TCP probe gates the backup tier until the voice server is
    actually up). Runs after install AND after the upgrade chaos redeploy.
- **`CHAOS_BASE_DEPLOY = True`** — make the pinned base deploy use `--chaos` (skips abra's clean-tree +
  lint gates, still deploys the explicitly-checked-out pinned version, NOT latest). Needed when an
  `install_steps.sh` adds an UNTRACKED file to the recipe checkout (e.g. mumble copies a
  `compose.host-ports.yml` into versions that predate it) — abra's pinned-deploy clean-tree check would
  otherwise FATA. `abra.recipe_checkout` force-checks-out (`-f`) so the upgrade tier's re-checkout to
  PR-head overwrites such overlays cleanly.
- **`install_steps.sh`** (auto-discovered at `tests/<recipe>/install_steps.sh`) — runs after
  `abra app new` + EXTRA_ENV + secret-generate, BEFORE the single deploy, with `CCCI_APP_DOMAIN` /
  `CCCI_APP_ENV` / `CCCI_RECIPE` (and `CCCI_DEPS_FILE` when DEPS are provisioned at install). Use it to
  drop a cc-ci-owned compose overlay into the checkout, wire dep-derived env/secrets, etc.

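
The TCP probe shape described in the list above (N consecutive successful connects) can be sketched with the standard library alone. This is an illustration of the technique, not the harness's probe code: `tcp_stable`, its `timeout` and `interval` parameters, and the local demo listener are all assumptions.

```python
import socket
import time


def tcp_stable(host: str, port: int, stable: int = 3,
               timeout: float = 30.0, interval: float = 0.2) -> bool:
    """Return True once `stable` consecutive connects succeed before `timeout`."""
    deadline = time.monotonic() + timeout
    streak = 0
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2.0):
                streak += 1
        except OSError:
            streak = 0  # any failure resets the run of successes
        if streak >= stable:
            return True
        time.sleep(interval)
    return False


# Demo against a throwaway local listener instead of a real voice server.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
listener.listen(8)
port = listener.getsockname()[1]
ready = tcp_stable("127.0.0.1", port, stable=3, timeout=5.0)
listener.close()
closed = tcp_stable("127.0.0.1", port, stable=3, timeout=1.0)
```

Resetting the streak on every failure is what distinguishes "stable" from "answered once": a service that is still rebinding may accept one connection and refuse the next.
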
**Non-HTTP protocol tests (mumble).** Reach a TCP service published `mode: host` (via a host-ports
overlay) at `127.0.0.1:<port>` — cc-ci runs tests on-host (cc-ci-run). mumble ships a stdlib protocol
client (`tests/mumble/functional/_mumble_proto.py`) doing the real TLS handshake → ServerSync; the
recipe-specific tests assert channel presence and config round-trips (a deploy-set `WELCOME_TEXT`/
`USERS` value surfaces over the protocol — version-independent, non-vacuous).

**In-container functional tests (mailu).** When network access to a service is constrained (mailu uses
`TLS_FLAVOR=notls` because certdumper needs traefik ACME which cc-ci does not run → dovecot refuses
plaintext auth over the network), exercise the app via
`lifecycle.exec_in_app(domain, [...], service="<svc>")` against the relevant container: e.g.
`flask mailu user ...` (admin) to create a mailbox, then a local `sendmail` inject (smtp) →
`doveadm search` (imap) to prove real postfix→rspamd→dovecot delivery. This hits the same stack the
network path would, without the env constraint.

**P4 when the recipe ships no backup (`backupbot`) labels.** `generic.backup_capable` auto-detects the
`backupbot.backup` label; recipes without it (mailu, drone) cleanly SKIP the backup/restore tiers —
P4 is genuinely N/A (nothing to back up), not a cut corner. Document it in `PARITY.md` + a `DEFERRED.md`
entry (the durable fix is a backupbot recipe-PR, like immich), and seek Adversary §7.1 sign-off.

## 3. Recipe-local tests (D4) — default-deny (HC2)

If the recipe's own repo contains `tests/test_*.py` / `install_steps.sh` / `ops.py`, the runner
snapshots them right after fetch — but per Phase 1e HC2 it executes them **only** for recipes on the
cc-ci approval allowlist `tests/repo-local-approved.txt` (default empty ⇒ default-deny). PR-author
code runs on the CI host with `/run/secrets/*` present, so adding a recipe to the allowlist is a
deliberate cc-ci-maintainer act (in a cc-ci PR, after reviewing that recipe's repo-local tests).
Without approval, only the cc-ci overlays in this repo + the generic floor run. Approved recipe-local
files receive env `CCCI_BASE_URL` (e.g. `https://<app>.ci.commoninternet.net/`) and `CCCI_APP_DOMAIN`.

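
The default-deny behaviour can be illustrated with a small sketch. The on-disk format of `tests/repo-local-approved.txt` is not described here, so one recipe name per line with `#` comments is an assumption, and `load_allowlist` / `may_run_repo_local` are hypothetical names rather than the runner's functions.

```python
import tempfile
from pathlib import Path
from typing import Set


def load_allowlist(path: Path) -> Set[str]:
    """Read approved recipe names; a missing or empty file approves nothing."""
    if not path.is_file():
        return set()
    approved = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        name = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if name:
            approved.add(name)
    return approved


def may_run_repo_local(recipe: str, allowlist: Set[str]) -> bool:
    """Default-deny: only explicitly listed recipes run repo-local code."""
    return recipe in allowlist


# Demo on a throwaway file standing in for the real allowlist.
with tempfile.TemporaryDirectory() as tmp:
    allow_file = Path(tmp) / "repo-local-approved.txt"
    missing = load_allowlist(allow_file)   # file absent -> nothing approved
    allow_file.write_text("# reviewed entries\ncustom-html\n", encoding="utf-8")
    approved = load_allowlist(allow_file)
```

The important property is that every failure mode (no file, empty file, unknown recipe) resolves to "do not execute", so PR-author code never runs by accident.
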
## 4. Add the repo to the bridge poll list

The trigger is **polling** (primary): add the repo's full name to the comment-bridge `POLL_REPOS`
csv (`nix/modules/bridge.nix`) and `nixos-rebuild switch`. The bridge then polls that repo's open PRs
every 30s and fires a run on a new `!testme` comment from an authorized org member. This needs only
**read + comment** access — no webhook, no repo-admin.

`!testme` on a PR runs install/upgrade/backup/restore + any approved recipe-local tests (§3), and
reports back to the PR.

### Optional: lower-latency webhook (admin-registered)

Polling already satisfies D1 (<60s). For lower latency an **admin** may *optionally* register a
Gitea `issue_comment` webhook (the bot does **not** self-register one — that needs repo-admin):

- URL `https://ci.commoninternet.net/hook`, content-type `application/json`, event `Issue Comment`,
  secret = the shared webhook HMAC (`secrets/secrets.yaml` → `webhook_hmac`).
- The Gitea instance must allow the host (admin: add `ci.commoninternet.net` to the
  `[webhook] ALLOWED_HOST_LIST`).

The webhook and poller are deduped by comment id, so a comment seen by both fires only once.

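
A receiver for such a webhook typically does two things before firing a run: check the HMAC and drop comments it has already seen. The sketch below assumes Gitea's `X-Gitea-Signature` header (a hex HMAC-SHA256 of the raw request body keyed with the webhook secret); `signature_ok`, `CommentDedupe` and the example secret and payload are hypothetical and not taken from the bridge's code.

```python
import hashlib
import hmac
from typing import Set


def signature_ok(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Compare the received hex signature with our own HMAC-SHA256 of the body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))


class CommentDedupe:
    """Remember comment ids so the webhook and the poller fire a run only once."""

    def __init__(self) -> None:
        self._seen: Set[int] = set()

    def first_sighting(self, comment_id: int) -> bool:
        if comment_id in self._seen:
            return False
        self._seen.add(comment_id)
        return True


# Demo with a made-up secret and payload.
secret = b"example-secret"
body = b'{"comment": {"id": 42, "body": "!testme"}}'
good_sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
dedupe = CommentDedupe()
seen_by_webhook = dedupe.first_sighting(42)
seen_by_poller = dedupe.first_sighting(42)
```

In a real bridge the seen-id set would need to survive restarts; an in-memory set is used here only to keep the example short.
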
## Run locally

```sh
RECIPE=<recipe> PR=<n> REF=<sha-or-branch> SRC=recipe-maintainers/<recipe> \
  STAGES=install,upgrade,backup,restore,custom cc-ci-run runner/run_recipe_ci.py
```

## Worked example — lasuite-docs (OIDC-dependent, Phase 2)

```
tests/lasuite-docs/
├── recipe_meta.py            # HEALTH_PATH="/", DEPLOY_TIMEOUT=900, EXTRA_ENV(domain) for cold-pull,
│                             # DEPS=["keycloak"]   ← Phase 2 dep declaration
├── ops.py                    # pre_<op> seed hooks (volume marker for backup/restore data-integrity)
├── test_install.py           # lifecycle install overlay (Playwright frontend SPA load)
├── test_upgrade.py           # lifecycle upgrade overlay (marker survives chaos redeploy)
├── test_backup.py            # lifecycle backup overlay (marker captured)
├── test_restore.py           # lifecycle restore overlay (marker restored to pre-mutation)
├── PARITY.md                 # parity-port mapping (P2)
└── functional/
    ├── test_health_check.py        # parity port (SOURCE comment cites recipe-info file)
    ├── test_auth_required.py       # specific: /api/v1.0/users/me/ → 401 without auth
    └── test_oidc_with_keycloak.py  # specific: full OIDC flow against the dep keycloak (uses
                                    # harness.sso primitives + deps_apps["keycloak"])
```

`!testme` on a lasuite-docs PR drives the orchestrator to:

1. Deploy the per-run keycloak dep (`keyc-<6hex>.ci.commoninternet.net`) and wait healthy.
2. Deploy lasuite-docs (`lasu-<6hex>.ci.commoninternet.net`).
3. Run install / upgrade / backup / restore + the 3 functional tests against the shared
   deployment (custom tier).
4. Teardown lasuite-docs, then the keycloak dep (LAST), both with verify=True.
5. Print the run summary; non-zero exit code on any failure (DG4.1 deploy-count mismatch, tier
   FAIL, dep teardown leak — all surfaced).

### Other shapes (concrete references)

- **TCP / voice recipe — `tests/mumble/`**: `recipe_meta.py` (EXTRA_ENV sets
  `COMPOSE_FILE=compose.yml:compose.mumbleweb.yml:compose.host-ports.yml`, `WELCOME_TEXT`/`USERS`
  markers, `CHAOS_BASE_DEPLOY=True`, `READY_PROBE` TCP 64738), `install_steps.sh` (provides the
  host-ports overlay to older versions), `functional/_mumble_proto.py` + the protocol/config-round-trip
  tests, `ops.py`/`test_backup.py`/`test_restore.py` (sqlite P4). See §2.4.
- **Multi-service, dep-less, in-container functional — `tests/mailu/`**: `recipe_meta.py`
  (`EXTRA_ENV(domain)` with `TLS_FLAVOR=notls` + `MAIL_DOMAIN`/`HOSTNAMES`/`TRAEFIK_STACK_NAME`),
  `functional/_mailu.py` (flask-CLI helpers), `test_mailbox.py` (create→config-export read-back),
  `test_mail_flow.py` (in-container sendmail→doveadm delivery). No backupbot → P4 N/A (PARITY.md +
  DEFERRED.md). See §2.4.