cc-ci/docs/enroll-recipe.md
autonomic-bot da558ca946
docs: P6 — rewrite customization docs to the restructured end state (rcust)
recipe-customization.md: review spec -> reference. Single registry-backed loader + validation
rules + HookCtx convention (§4); generated key table kept byte-identical (sync test); §5 end-state
shape (op_state/deps fixtures, ctx ops.py, placement rule, first-class compose.ccci.yml, no
setup_custom_tests.sh); §7 manifest block + dev-only CCCI_SKIP_GENERIC*; §8 rewritten as
restructure outcomes (R1/R2/R3/R5/R6/R7/R8 resolved + how, R4 mitigated by manifest, R9
rejected-by-decision); §9 index updated to the new symbols.

testing.md: install-time deps isolation replaces the setup_custom_tests step in the invariant
(generic still never depends on custom — failure isolation via requires_deps/F2-11); ops.py
example to pre_<op>(ctx); placement rule; generic opt-out now documented LOCAL-DEV-ONLY env with
CI !! warning (declarative SKIP_GENERIC gone); partial key list points at the generated table.

enroll-recipe.md: tree + worked examples updated (lasuite-docs install-time OIDC wiring +
install_steps.sh; mumble post-F2-14c shape — UPGRADE_EXTRA_ENV native overlay, private _
constants, no CHAOS_BASE_DEPLOY); deps fixture (entry.domain) replaces deps_apps; ctx hook
signatures; compose.ccci.yml first-class bullet; key list points at the generated table.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 19:07:41 +00:00

Enrolling a recipe under cc-ci (D5)

Adding a recipe is a small, repeatable, no-harness-surgery operation:

1. Make the recipe available on the mirror

Recipes under test live on the private mirror git.autonomic.zone/recipe-maintainers/<recipe>, synced from upstream git.coopcloud.tech. If not yet mirrored, mirror it (abra fetch + push to the org) — see the recipe mirror+PR flow (plan §4.1). A recipe may ship its own tests/ dir in its repo; those are discovered and run against the live app (D4 — see below).

2. Add the per-recipe test tree in this repo

tests/<recipe>/
├── recipe_meta.py      # optional per-recipe harness config (see below)
├── install_steps.sh    # optional custom install-steps hook (pre-deploy setup + deps env wiring)
├── compose.ccci.yml    # optional CI-only compose overlay (harness-copied, auto-chaos base deploy)
├── ops.py              # optional pre_<op>(ctx) seed hooks (install/upgrade/backup/restore)
├── test_install.py     # optional install overlay  (runs ADDITIVELY alongside generic)
├── test_upgrade.py     # optional upgrade overlay   (runs ADDITIVELY alongside generic)
├── test_backup.py      # optional backup overlay    (runs ADDITIVELY alongside generic)
├── test_restore.py     # optional restore overlay   (runs ADDITIVELY alongside generic)
├── PARITY.md           # Phase 2 P2: mapping table (recipe-maintainer tests → cc-ci tests)
├── functional/         # Phase 2 P3: parity ports + ≥2 NEW recipe-specific tests
│   ├── test_health_check.py    # parity port of recipe-info/<recipe>/tests/health_check.py
│   ├── test_<behavior>.py      # ≥2 NEW recipe-specific functional tests
│   └── …
└── playwright/         # Phase 2 P6: browser flows where the app's core UX is a UI
    └── test_<flow>.py

A recipe is testable with ZERO config: with no overlay files, the generic lifecycle suite runs (install/upgrade/backup/restore) against a single shared deployment — see docs/testing.md for the full model (deploy-once, additive generic+overlay, the chaos PR-head upgrade, the HC2 repo-local allowlist, the install-steps hook). The per-recipe dir only holds the bits where the recipe needs more than the generic.

To add recipe-specific coverage, drop a tests/<recipe>/test_<op>.py overlay. It runs ALONGSIDE the generic for that op (HC3 additive, Phase 1e); the generic floor is never silently dropped. Overlays are assertion-only against the shared live deployment (the live_app fixture): they never perform the op or deploy/teardown, because the orchestrator owns those. If the overlay needs to SEED pre-op state (data-continuity markers, the backup→restore divergence), put pre_<op>(ctx) callables in tests/<recipe>/ops.py; the orchestrator runs them BEFORE the op (ctx is the uniform HookCtx every hook receives, see docs/recipe-customization.md §4.1).

The easiest start is to copy an existing recipe: tests/custom-html/ (simple/volume marker), tests/keycloak/ (admin-API), or tests/matrix-synapse/ (db-service psql marker). Do not edit the shared tests/conftest.py or runner/harness/ to add a recipe; set per-recipe knobs in recipe_meta.py instead. The COMPLETE key reference is the generated table in docs/recipe-customization.md §4; unknown ALL-CAPS keys are hard errors, and recipe-private constants are underscore-prefixed (_FOO). For example:

HEALTH_PATH = "/realms/master"   # path that returns a healthy status (default "/")
HEALTH_OK = (200,)               # acceptable status codes (default 200/301/302)
DEPLOY_TIMEOUT = 600             # seconds for services to converge (default 600)
HTTP_TIMEOUT = 600               # seconds for the app to answer (default 300)
BACKUP_CAPABLE = True            # override backup-capability auto-detect (default: scan compose)
EXTRA_ENV = {"KEY": "value"}     # or EXTRA_ENV(ctx) -> dict; extra .env keys set at deploy
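The hard-error rule for unknown ALL-CAPS keys can be pictured with a small sketch. This is not the harness loader: validate_meta and the KNOWN_KEYS subset below are hypothetical names chosen for illustration, and the authoritative key list remains the generated table in docs/recipe-customization.md §4.

```python
import re

# Hypothetical subset of known keys (assumption); the real list is the
# generated table in docs/recipe-customization.md §4.
KNOWN_KEYS = {
    "HEALTH_PATH", "HEALTH_OK", "DEPLOY_TIMEOUT",
    "HTTP_TIMEOUT", "BACKUP_CAPABLE", "EXTRA_ENV",
}

# ALL-CAPS means: starts with an uppercase letter, then uppercase/digits/underscores.
# A leading underscore (_FOO) therefore never matches and stays recipe-private.
_ALL_CAPS = re.compile(r"^[A-Z][A-Z0-9_]*$")


def validate_meta(namespace: dict, known=KNOWN_KEYS) -> dict:
    """Return the recognised keys; raise on unknown ALL-CAPS names."""
    unknown = sorted(
        name for name in namespace
        if _ALL_CAPS.match(name) and name not in known
    )
    if unknown:
        raise ValueError(f"unknown recipe_meta keys: {', '.join(unknown)}")
    return {key: value for key, value in namespace.items() if key in known}


# A private constant and a lowercase helper are ignored, not rejected.
meta = validate_meta({"HEALTH_PATH": "/realms/master", "_MARKER": "x", "helper": 1})
print(meta)
```

Under these assumptions a typo such as HEALT_PATH fails loudly instead of silently falling back to the default.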

Useful harness.lifecycle helpers for overlays: http_get, http_fetch, http_body, exec_in_app (use this for data markers — volume/DB, hardened with returncode+retry); the lifecycle ops themselves are orchestrator-owned (you never call them from an overlay). The harness forces LETS_ENCRYPT_ENV="" (no ACME), a unique short domain per run, and guarantees teardown.

2.1 Phase-2 contract: parity port + recipe-specific functional tests + Playwright

Beyond the lifecycle overlays, each recipe carries (plan §4.1):

  • PARITY.md: a mapping table from every references/recipe-maintainer/recipe-info/<recipe>/tests/*.py to a comparable cc-ci test under tests/<recipe>/functional/, asserting the same thing (not a renamed file). A deliberate non-port is documented in DECISIONS.md with a technical reason, never left as a silent omission.
  • functional/: parity-port tests + ≥2 NEW recipe-specific functional tests that exercise the app's characteristic behavior (per plan §4.3, e.g. "create-an-object + read-it-back, and one more that touches a distinctive feature"). Each parity-port file carries a SOURCE = "recipe-info/<recipe>/tests/<file>" comment near the top so audit is in-file.
  • playwright/: browser flows where the recipe's core UX is a UI (P6).

The orchestrator's custom tier discovers test_*.py in tests/<recipe>/{functional,playwright}/ ONLY (the placement rule, via runner/harness/discovery.custom_tests — a top-level test_*.py is a lifecycle overlay and nothing else) and runs each as its own pytest against the same live_app shared deployment. Lifecycle-named files (test_install.py/etc.) are excluded from the custom tier even inside those subdirs (safety net against double-running).
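The placement rule above can be sketched as a small discovery function. This is an illustration only, not the code in runner/harness/discovery: the name discover_custom_tests, the non-recursive glob, and the sorted ordering are assumptions.

```python
import tempfile
from pathlib import Path

CUSTOM_SUBDIRS = ("functional", "playwright")
LIFECYCLE_NAMES = {
    "test_install.py", "test_upgrade.py", "test_backup.py", "test_restore.py",
}


def discover_custom_tests(recipe_dir: Path) -> list[Path]:
    """Collect test_*.py from the custom-tier subdirs only (hypothetical helper)."""
    found = []
    for sub in CUSTOM_SUBDIRS:
        subdir = recipe_dir / sub
        if not subdir.is_dir():
            continue  # a recipe may ship neither, one, or both subdirs
        for path in sorted(subdir.glob("test_*.py")):
            # Safety net: lifecycle-named files never run in the custom tier.
            if path.name not in LIFECYCLE_NAMES:
                found.append(path)
    return found


with tempfile.TemporaryDirectory() as tmp:
    recipe_dir = Path(tmp) / "my-recipe"
    (recipe_dir / "functional").mkdir(parents=True)
    for rel in (
        "test_install.py",                 # top level: lifecycle overlay, not custom
        "functional/test_health_check.py", # custom tier
        "functional/test_install.py",      # lifecycle name: excluded
        "functional/_helpers.py",          # not test_*.py: ignored
    ):
        (recipe_dir / rel).touch()
    names = [p.name for p in discover_custom_tests(recipe_dir)]

print(names)
```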

2.2 Recipe-test dependencies — DEPS = [...] (Phase 2 Q2.3)

If your recipe needs other recipes deployed alongside it (an SSO provider, a database), declare them in recipe_meta.py:

DEPS = ["keycloak"]  # one entry per dep recipe name (cc-ci tests/<dep>/ must exist + work)

The orchestrator (plan §4.2; install-time provisioning is the ONLY mode):

  1. Reads DEPS and provisions every dep BEFORE the single deploy of the recipe under test — each dep at a per-run domain <dep[:4]>-<6hex>.ci.commoninternet.net (the 6hex is hashed from parent_recipe + pr + ref + dep_recipe so two recipes' deps of the same kind do not collide on a single node), waited healthy using the dep's own recipe_meta.py.
  2. Persists the full per-dep identity + SSO creds dict to $CCCI_DEPS_FILE (jq-readable JSON, {"<dep>": {"domain": ..., "realm": ..., "client_secret": ..., ...}}).
  3. Deploys the recipe under test — its install_steps.sh reads $CCCI_DEPS_FILE and wires OIDC env into that ONE deploy (no post-deploy redeploy). A dep-provisioning failure does NOT block the run: the recipe deploys alone, generic tiers run, and requires_deps tests skip with a counted reason (F2-11).
  4. Tears down the deps LAST, in a finally block (reverse declaration order, with verify=True; leaked deps fail the run loudly per §9 teardown sacred / F2-5 fix).
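The per-run dep domain from step 1 can be sketched as follows. The shape <dep[:4]>-<6hex> and the four hash inputs come from the text; the choice of SHA-256, the "|" separator, and the function name dep_domain are assumptions made for illustration.

```python
import hashlib


def dep_domain(parent_recipe: str, pr: int, ref: str, dep_recipe: str,
               base: str = "ci.commoninternet.net") -> str:
    """Derive a deterministic per-run domain for a dep (hypothetical helper)."""
    # Assumption: the inputs are joined and hashed with SHA-256; the harness
    # may use a different hash or encoding.
    seed = "|".join([parent_recipe, str(pr), ref, dep_recipe])
    suffix = hashlib.sha256(seed.encode("utf-8")).hexdigest()[:6]
    return f"{dep_recipe[:4]}-{suffix}.{base}"


a = dep_domain("lasuite-docs", 12, "abc123", "keycloak")
b = dep_domain("lasuite-drive", 12, "abc123", "keycloak")
print(a)
print(b)
```

Because the parent recipe is part of the hash input, two recipes that both declare keycloak get different domains on the same node, while a rerun with identical inputs gets the same one.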

Tests access deps via the deps pytest fixture (tests/conftest.py) — entries expose .domain plus the full creds dict (attribute or dict-style):

import pytest


@pytest.mark.requires_deps
def test_my_recipe_uses_keycloak(live_app, deps):
    assert "keycloak" in deps, f"keycloak dep not deployed; {deps}"
    # kc_domain is the per-run keycloak host; recipe-specific assertions follow
    kc_domain = deps["keycloak"].domain

Deploy-count guard: with deps the expected count is 1 + len(DEPS) (the parent + one per dep). The orchestrator computes this and fails the run on mismatch.
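A minimal sketch of the two ideas above: reading the deps JSON into entries that support both attribute and dict-style access, and computing the expected deploy count. DepEntry, load_deps, and expected_deploys are hypothetical names; the real fixture lives in tests/conftest.py and may be built differently.

```python
import json
import tempfile
from pathlib import Path


class DepEntry(dict):
    """Dict whose keys are also readable as attributes (illustrative only)."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods win.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


def load_deps(path) -> dict:
    """Parse a {"<dep>": {...}} JSON file into DepEntry objects."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    return {name: DepEntry(fields) for name, fields in raw.items()}


def expected_deploys(declared_deps) -> int:
    """The parent deploy plus one deploy per declared dep."""
    return 1 + len(declared_deps)


with tempfile.TemporaryDirectory() as tmp:
    deps_file = Path(tmp) / "deps.json"
    # Placeholder values, not real run output.
    deps_file.write_text(json.dumps(
        {"keycloak": {"domain": "keyc-0a1b2c.ci.commoninternet.net", "realm": "my-realm"}}
    ), encoding="utf-8")
    deps = load_deps(deps_file)

print(deps["keycloak"].domain, deps["keycloak"]["realm"])
print(expected_deploys(["keycloak"]))
```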

2.3 SSO setup — harness.sso (Phase 2 Q2.3)

For OIDC-dependent recipes, the shared runner/harness/sso.py provides:

from harness import sso

creds = sso.setup_keycloak_realm(
    kc_domain,                   # = deps["keycloak"].domain
    realm="my-realm",
    client_id="my-client",
    redirect_uris=[f"https://{live_app}/*"],
    web_origins=[f"https://{live_app}"],
)
# creds = {"realm", "client_id", "client_secret", "user", "password", "token_url", …}

sso.assert_discovery_endpoint(creds)         # GET /.well-known/openid-configuration
token = sso.oidc_password_grant(creds)       # exercises the OIDC password grant; returns JWT

setup_keycloak_realm is idempotent (409 → reset to known values) and uses class-B run-scoped secrets (the generated client_secret + test-user password are destroyed when the dep keycloak is torn down at run end, plan §4.4-B). Note (F2-7): the setup primitive is keycloak-specific; when authentik comes online a parallel setup_authentik_realm will need to land in harness.sso. The flow primitives (oidc_password_grant, assert_discovery_endpoint) ARE provider-pluggable.
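The "409 → reset to known values" behaviour is a create-or-reset pattern. The sketch below shows that pattern against an in-memory stand-in; FakeAdmin, Conflict, and ensure_client are hypothetical and do not reflect the Keycloak admin API or the actual harness.sso code.

```python
class Conflict(Exception):
    """Stand-in for an HTTP 409 response (assumption)."""


class FakeAdmin:
    """In-memory stand-in for an admin API, used only to show the pattern."""

    def __init__(self):
        self.clients = {}

    def create_client(self, client_id, config):
        if client_id in self.clients:
            raise Conflict(client_id)
        self.clients[client_id] = dict(config)

    def update_client(self, client_id, config):
        self.clients[client_id] = dict(config)


def ensure_client(admin, client_id, config):
    """Create the client, or reset it to the known config if it already exists."""
    try:
        admin.create_client(client_id, config)
        return "created"
    except Conflict:
        admin.update_client(client_id, config)
        return "reset"


admin = FakeAdmin()
first = ensure_client(admin, "my-client", {"redirect_uris": ["https://app.example/*"]})
admin.clients["my-client"]["redirect_uris"] = ["https://drifted.example/*"]  # simulate drift
second = ensure_client(admin, "my-client", {"redirect_uris": ["https://app.example/*"]})
print(first, second, admin.clients["my-client"])
```

Calling the setup twice therefore ends in the same state as calling it once, which is what lets a rerun reuse a realm left over from an earlier step.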

2.4 Non-HTTP, multi-service, and host-dependent recipes (Phase 2 Q4)

Not every recipe is a single HTTP app. recipe_meta.py + a few harness mechanisms cover the harder shapes (proven on mumble, mailu, and the SSO-dependent suite):

  • EXTRA_ENV — a dict or a callable(ctx) -> dict. The callable form derives values from the per-run domain (ctx.domain — e.g. MAIL_DOMAIN/HOSTNAMES for mailu, SANDBOX_DOMAIN for cryptpad). Applied at every deploy (abra.env_set), so a recipe enrolls with NO shared-harness change.
  • READY_PROBE(ctx) -> [...] — readiness signals beyond replica-convergence + the app's HEALTH_PATH. Two probe shapes:
    • HTTP: {"host": "...", "path": "/...", "ok": (200,)} (e.g. lasuite-drive collabora WOPI discovery).
    • TCP: {"tcp_host": "127.0.0.1", "tcp_port": 64738, "stable": 3} — polls a socket connect N consecutive times. Use for non-HTTP services whose HEALTH_PATH reflects a sidecar, not the real service (mumble: the mumble-web sidecar serves HTTP 200 while the voice server on 64738 is still rebinding after an upgrade redeploy — the TCP probe gates the backup tier until the voice server is actually up). Runs after install AND after the upgrade chaos redeploy.
  • compose.ccci.yml (first-class at tests/<recipe>/compose.ccci.yml) — a CI-only compose overlay the harness itself copies into the recipe checkout before the base deploy, automatically using --chaos for that deploy (the untracked file would otherwise trip abra's pinned-deploy clean-tree check). Reference it from EXTRA_ENV's COMPOSE_FILE. Minimal, justified fallback only (e.g. ghost's 15m start_period grace). abra.recipe_checkout force-checks-out (-f) so the upgrade tier's re-checkout to PR-head overwrites such overlays cleanly.
  • install_steps.sh (auto-discovered at tests/<recipe>/install_steps.sh) — runs after abra app new + EXTRA_ENV + secret-generate, BEFORE the single deploy, with CCCI_APP_DOMAIN / CCCI_APP_ENV / CCCI_RECIPE (and CCCI_DEPS_FILE when the recipe declares DEPS — deps are always provisioned before the deploy). Use it to wire dep-derived env/secrets, seed config, etc.
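The TCP probe shape in the list above ("stable": 3) can be read as "N consecutive successful connects". A minimal sketch under that reading follows; tcp_stable and its interval/timeout parameters are hypothetical, and the harness's own probe may treat failures or timing differently.

```python
import socket
import time


def tcp_stable(host, port, stable=3, interval=0.2, timeout=5.0, connect_timeout=1.0):
    """Return True once `stable` consecutive connects succeed before `timeout`."""
    deadline = time.monotonic() + timeout
    streak = 0
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=connect_timeout):
                streak += 1
        except OSError:
            streak = 0  # any failed connect restarts the count
        if streak >= stable:
            return True
        time.sleep(interval)
    return False


# Demo against a throwaway local listener instead of a real service.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(8)
port = listener.getsockname()[1]

up = tcp_stable("127.0.0.1", port)
listener.close()
down = tcp_stable("127.0.0.1", port, timeout=0.5, interval=0.05)
print(up, down)
```

Requiring a streak rather than a single success is what guards against a service that briefly accepts a connection while it is still rebinding.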

Non-HTTP protocol tests (mumble). Reach a TCP service that is published with mode: host (via a host-ports overlay) at 127.0.0.1:<port>; cc-ci runs tests on-host (cc-ci-run). mumble ships a stdlib protocol client (tests/mumble/functional/_mumble_proto.py) that performs the real TLS handshake → ServerSync. The recipe-specific tests assert channel presence and config round-trips: a deploy-set WELCOME_TEXT/USERS value surfaces over the protocol, which makes the check version-independent and non-vacuous.

In-container functional tests (mailu). When network access to a service is constrained (mailu uses TLS_FLAVOR=notls because certdumper needs traefik ACME which cc-ci does not run → dovecot refuses plaintext auth over the network), exercise the app via lifecycle.exec_in_app(domain, [...], service="<svc>") against the relevant container: e.g. flask mailu user ... (admin) to create a mailbox, then a local sendmail inject (smtp) → doveadm search (imap) to prove real postfix→rspamd→dovecot delivery. This hits the same stack the network path would, without the env constraint.

P4 when the recipe ships no backup (backupbot) labels. generic.backup_capable auto-detects the backupbot.backup label; recipes without it (mailu, drone) cleanly SKIP the backup/restore tiers — P4 is genuinely N/A (nothing to back up), not a cut corner. Document it in PARITY.md + a DEFERRED.md entry (the durable fix is a backupbot recipe-PR, like immich), and seek Adversary §7.1 sign-off.

3. Recipe-local tests (D4) — default-deny (HC2)

If the recipe's own repo contains tests/test_*.py / install_steps.sh / ops.py, the runner snapshots them right after fetch — but per Phase 1e HC2 it executes them only for recipes on the cc-ci approval allowlist tests/repo-local-approved.txt (default empty ⇒ default-deny). PR-author code runs on the CI host with /run/secrets/* present, so adding a recipe to the allowlist is a deliberate cc-ci-maintainer act (in a cc-ci PR, after reviewing that recipe's repo-local tests). Without approval, only the cc-ci overlays in this repo + the generic floor run. Approved recipe-local files receive env CCCI_BASE_URL (e.g. https://<app>.ci.commoninternet.net/) and CCCI_APP_DOMAIN.
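The default-deny behaviour can be sketched as an allowlist lookup in which a missing or empty file approves nothing. The file format assumed here (one recipe name per line, "#" comments) and the names load_allowlist and may_run_repo_local are illustrative assumptions, not the runner's actual code.

```python
import tempfile
from pathlib import Path


def load_allowlist(path: Path) -> set[str]:
    """Read approved recipe names; a missing or empty file yields an empty set."""
    if not path.is_file():
        return set()
    approved = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        name = line.split("#", 1)[0].strip()  # assumed comment syntax
        if name:
            approved.add(name)
    return approved


def may_run_repo_local(recipe: str, approved: set[str]) -> bool:
    """Default-deny: only explicitly listed recipes run repo-local code."""
    return recipe in approved


with tempfile.TemporaryDirectory() as tmp:
    allowlist = Path(tmp) / "repo-local-approved.txt"
    before = load_allowlist(allowlist)  # file absent: nothing approved
    allowlist.write_text("# reviewed in a cc-ci PR\nkeycloak\n\n", encoding="utf-8")
    after = load_allowlist(allowlist)

print(may_run_repo_local("keycloak", before), may_run_repo_local("keycloak", after))
```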

4. Add the repo to the bridge poll list

The trigger is polling (primary): add the repo's full name to the comment-bridge POLL_REPOS csv (nix/modules/bridge.nix) and nixos-rebuild switch. The bridge then polls that repo's open PRs every 30s and fires a run on a new !testme comment from an authorized org member. This needs only read + comment access — no webhook, no repo-admin.

!testme on a PR runs install/upgrade/backup/restore + any approved recipe-local tests, and reports back to the PR.

Optional: lower-latency webhook (admin-registered)

Polling already satisfies D1 (<60s). For lower latency an admin may optionally register a Gitea issue_comment webhook (the bot does not self-register one — that needs repo-admin):

  • URL https://ci.commoninternet.net/hook, content-type application/json, event Issue Comment, secret = the shared webhook HMAC (secrets/secrets.yaml → webhook_hmac).
  • The Gitea instance must allow the host (admin: add ci.commoninternet.net to the [webhook] ALLOWED_HOST_LIST).

The webhook and poller are deduped by comment id, so a comment seen by both fires only once.
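A minimal sketch of the two mechanisms above: checking a webhook body against the shared HMAC secret, and deduping by comment id so that the poller and the webhook fire only once between them. It assumes the signature is a hex HMAC-SHA256 of the raw body, which is how Gitea signs webhook deliveries; signature_ok and CommentDeduper are hypothetical names, and the bridge's own implementation is not shown in this document.

```python
import hashlib
import hmac


def signature_ok(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Compare the received hex signature with HMAC-SHA256(secret, body)."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)  # constant-time compare


class CommentDeduper:
    """Remember comment ids already acted on (in memory, for illustration)."""

    def __init__(self):
        self._seen = set()

    def first_sighting(self, comment_id: int) -> bool:
        if comment_id in self._seen:
            return False
        self._seen.add(comment_id)
        return True


secret = b"example-secret"  # placeholder, not a real secret
body = b'{"comment": {"id": 101, "body": "!testme"}}'
signature = hmac.new(secret, body, hashlib.sha256).hexdigest()

dedupe = CommentDeduper()
via_webhook = signature_ok(secret, body, signature) and dedupe.first_sighting(101)
via_poller = dedupe.first_sighting(101)  # same comment seen again by the poller
print(via_webhook, via_poller)
```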

Run locally

RECIPE=<recipe> PR=<n> REF=<sha-or-branch> SRC=recipe-maintainers/<recipe> \
  STAGES=install,upgrade,backup,restore,custom cc-ci-run runner/run_recipe_ci.py

Worked example — lasuite-docs (OIDC-dependent, Phase 2)

tests/lasuite-docs/
├── recipe_meta.py            # HEALTH_PATH="/", DEPLOY_TIMEOUT=900, EXTRA_ENV(ctx) for cold-pull,
│                             # DEPS=["keycloak"]  ← Phase 2 dep declaration
├── install_steps.sh          # wires OIDC env from $CCCI_DEPS_FILE into the single deploy
├── ops.py                    # pre_<op>(ctx) seed hooks (volume marker for backup/restore data-integrity)
├── test_install.py           # lifecycle install overlay (Playwright frontend SPA load)
├── test_upgrade.py           # lifecycle upgrade overlay (marker survives chaos redeploy)
├── test_backup.py            # lifecycle backup overlay (marker captured)
├── test_restore.py           # lifecycle restore overlay (marker restored to pre-mutation)
├── PARITY.md                 # parity-port mapping (P2)
└── functional/
    ├── test_health_check.py        # parity port (SOURCE comment cites recipe-info file)
    ├── test_auth_required.py       # specific: /api/v1.0/users/me/ → 401 without auth
    └── test_oidc_with_keycloak.py  # specific: full OIDC flow against the dep keycloak (uses
                                    # harness.sso primitives + the `deps` fixture)

!testme on a lasuite-docs PR drives the orchestrator to:

  1. Provision the per-run keycloak dep (keyc-<6hex>.ci.commoninternet.net), wait healthy, write creds to $CCCI_DEPS_FILE — BEFORE the recipe deploy.
  2. Deploy lasuite-docs (lasu-<6hex>.ci.commoninternet.net); install_steps.sh wires the OIDC env into that one deploy.
  3. Run install / upgrade / backup / restore + the 3 functional tests against the shared deployment (custom tier).
  4. Teardown lasuite-docs, then the keycloak dep (LAST), both with verify=True.
  5. Print the run summary; non-zero exit code on any failure (DG4.1 deploy-count mismatch, tier FAIL, dep teardown leak — all surfaced).

Other shapes (concrete references)

  • TCP / voice recipe — tests/mumble/: recipe_meta.py (EXTRA_ENV sets COMPOSE_FILE=compose.yml:compose.mumbleweb.yml for the base; UPGRADE_EXTRA_ENV adds the native compose.host-ports.yml at PR-head so 64738 is host-published on latest; private _WELCOME_TEXT_MARKER/_MAX_USERS constants; READY_PROBE(ctx) TCP 64738 — phase-aware via the live COMPOSE_FILE), functional/_mumble_proto.py + the protocol/config-round-trip tests, ops.py/test_backup.py/test_restore.py (sqlite P4). See §2.4.
  • Multi-service, dep-less, in-container functional — tests/mailu/: recipe_meta.py (EXTRA_ENV(ctx) with TLS_FLAVOR=notls + MAIL_DOMAIN/HOSTNAMES/TRAEFIK_STACK_NAME), functional/_mailu.py (flask-CLI helpers), test_mailbox.py (create→config-export read-back), test_mail_flow.py (in-container sendmail→doveadm delivery). No backupbot → P4 N/A (PARITY.md + DEFERRED.md). See §2.4.