diff --git a/docs/recipe-customization.md b/docs/recipe-customization.md
new file mode 100644
index 0000000..f2ac579
--- /dev/null
+++ b/docs/recipe-customization.md
@@ -0,0 +1,353 @@
+# Recipe customization — review spec
+
+Status: REVIEW SPEC — describes the customization surface as it exists today (main), written so
+the structure can be reviewed and potentially restructured. §8 lists known limitations and
+restructuring candidates; everything before it is purely descriptive.
+
+Companion docs: `docs/testing.md` (test architecture / tier semantics), `docs/enroll-recipe.md`
+(step-by-step enrollment). This doc is the **complete reference** for the two questions those docs
+answer only partially:
+
+1. How are custom tests written for a particular recipe?
+2. What are ALL the per-recipe CI settings, where do they live, and who reads them?
+
+---
+
+## 1. The three customization surfaces
+
+A recipe customizes its CI through **three distinct mechanisms** (worth noticing for the
+restructure review — they are three different config languages):
+
+| Surface | Form | Examples |
+|---|---|---|
+| **Declarative settings** | Python assignments in `tests//recipe_meta.py` | `DEPLOY_TIMEOUT = 1500`, `UPGRADE_BASE_VERSION = "2.3.1+..."` |
+| **Code hooks** | Callables in `recipe_meta.py`, `ops.py` functions, shell hooks | `def READY_PROBE(domain): ...`, `pre_upgrade()`, `install_steps.sh` |
+| **File presence** | A file existing at a discovered path changes behavior | `test_upgrade.py` overlay, `functional/test_*.py`, `compose.ccci.yml` |
+
+There is additionally a fourth, operator-facing surface: **environment variables**
+(`CCCI_SKIP_GENERIC*`) that override declarative settings at run time (§4.4).
+
+## 2. Zero-config baseline
+
+A recipe with **no `tests//` directory at all** still gets the full generic floor:
+
+- deploy base version → INSTALL (generic `assert_serving`: HTTP on `/`, expect 200/301/302)
+- chaos-upgrade to PR head → UPGRADE (generic `assert_upgraded`: version label matches head, converged, serving)
+- BACKUP (generic `assert_backup_artifact`) — iff the recipe's compose files carry
+  `backupbot.backup` labels (auto-detected), else N/A
+- RESTORE (generic `assert_restore_healthy`)
+- CUSTOM tier: empty (no custom tests discovered)
+- teardown
+
+Defaults: `HEALTH_PATH="/"`, `HEALTH_OK=(200,301,302)`, `DEPLOY_TIMEOUT=600`, `HTTP_TIMEOUT=300`.
+Everything in this doc is opt-in deviation from that floor. The cardinal invariant
+(docs/testing.md §1): the generic floor is **always on** and never depends on custom code;
+custom is **additive** by default.
+
+## 3. The per-recipe tree — every file that can exist
+
+Two locations, with precedence and a security gate between them:
+
+- **cc-ci-owned**: `tests//` in this repo (trusted, maintainer-reviewed)
+- **repo-local**: the recipe repo's own `tests/` dir (PR-author-controlled → **default-deny**,
+  consulted only when the recipe is listed in `tests/repo-local-approved.txt` — gate HC2,
+  centralized in `runner/harness/discovery.py`)
+
+```
+tests//                     # cc-ci side (repo-local mirrors the same shape)
+├── recipe_meta.py          # ALL declarative settings + meta callables (§4)
+├── test_.py                # lifecycle overlay assertions, op ∈ install|upgrade|backup|restore (§5.1)
+├── ops.py                  # pre_(domain, meta) seed hooks (§5.2)
+├── test_*.py               # custom-tier tests (top-level, cross-cutting)(§5.3)
+├── functional/test_*.py    # custom tier: parity ports + recipe-specific (§5.3)
+├── playwright/test_*.py    # custom tier: UI flows (§5.3)
+├── install_steps.sh        # pre-deploy shell hook (§5.4)
+├── setup_custom_tests.sh   # deps/OIDC credential wiring hook (§5.5)
+├── compose.ccci.yml        # CI-only compose overlay (via install_steps) (§5.6)
+└── PARITY.md               # enrollment contract doc (human-read only)
+```
+
+Precedence (machine-docs/DECISIONS.md, implemented in `discovery.py`):
+
+- lifecycle overlay `test_.py`: repo-local **wins** over cc-ci (same-name collision); the
+  generic floor still runs additively alongside.
+- custom tier `test_*.py`: **ALL** run, from both locations (no collision concept).
+- `install_steps.sh`: repo-local > cc-ci, or none.
+- `ops.py` pre-op hook: cc-ci wins; repo-local consulted only if approved.
+- `recipe_meta.py`: cc-ci only — repo-local recipes cannot set CI settings (by design; the
+  settings surface stays maintainer-controlled).
+
+## 4. `recipe_meta.py` — complete settings reference
+
+The single settings file. Plain Python, `exec()`d by the harness (trusted, in-repo). A key is "set"
+by a top-level assignment or `def`. Unknown names are ignored silently (a recipe may keep private
+constants here, e.g. mumble's `WELCOME_TEXT_MARKER` — but see §8 R6: typos in real key names are
+also silently ignored).
+
+**Loader column legend** — this is the structural finding for the review (§8 R1).
+There is no single loader; six independent code paths each `exec()` the file and pick out their
+own keys:
+
+| # | Loader | Keys it sees |
+|---|---|---|
+| L1 | `runner/run_recipe_ci.py:_load_meta` (orchestrator) | 4 base + explicit 8-key allowlist |
+| L2 | `tests/conftest.py:_recipe_meta` (pytest `meta` fixture) | 4 base keys ONLY |
+| L3 | `runner/harness/lifecycle.py:_recipe_extra_env` | `EXTRA_ENV` only |
+| L4 | `runner/harness/lifecycle.py:_recipe_meta_flag` | boolean flags by name (`CHAOS_BASE_DEPLOY`) |
+| L5 | `runner/harness/deps.py:declared_deps` | `DEPS` only |
+| L6 | `runner/harness/canonical.py:is_canonical_enrolled` | `WARM_CANONICAL` only |
+
+### 4.1 HTTP / health / timing (base 4 — seen by L1 AND L2)
+
+| Key | Type / default | Meaning | Used by |
+|---|---|---|---|
+| `HEALTH_PATH` | str, `"/"` | Path probed for serving/health checks | deploy wait (`lifecycle.py`), generic `assert_serving` |
+| `HEALTH_OK` | tuple, `(200, 301, 302)` | Acceptable HTTP status codes for health | same |
+| `DEPLOY_TIMEOUT` | int s, `600` | Max wait for swarm convergence per deploy | `lifecycle.py`, generic ops |
+| `HTTP_TIMEOUT` | int s, `300` | Max wait for HTTP health after converged | same |
+
+Example: immich sets `DEPLOY_TIMEOUT = 1500`, `HTTP_TIMEOUT = 600` (ML containers are slow).
+
+### 4.2 Upgrade tier (loader L1)
+
+| Key | Type / default | Meaning |
+|---|---|---|
+| `UPGRADE_BASE_VERSION` | str (exact published tag), default `None` | **The "base pin"** — overrides the harness default base for the upgrade tier. Default base = `recipe_versions[-2]` (the previous published version); pin when that is not the PR's true predecessor (e.g. the PR is the first release on a new major, or the previous tag is known-broken). Must be an exact published tag — typos fail the base deploy. Consumed at `run_recipe_ci.py` (`prev = meta.get("UPGRADE_BASE_VERSION") or lifecycle.previous_version(recipe)`). Users: discourse, plausible. |
+| `UPGRADE_EXTRA_ENV` | dict **or** callable `(domain) -> dict`, default `None` | Extra `.env` keys applied **after** the PR-head checkout, **before** the chaos redeploy (F2-14c) — for env vars that exist only at head (a new required setting introduced by the PR). Consumed in `generic.py:256`. User: mumble. |
+
+### 4.3 Every-deploy shaping (loaders L3/L4 — NOT in the L1 allowlist)
+
+| Key | Type / default | Meaning |
+|---|---|---|
+| `EXTRA_ENV` | dict **or** callable `(domain) -> dict`, default `{}` | Extra `.env` keys applied at **every** deploy (base install AND upgrade old-app). Callable form derives values from the per-run domain (e.g. cryptpad's `SANDBOX_DOMAIN`). Loaded by `lifecycle.py:_recipe_extra_env` (its own `exec()`). Users: cryptpad, discourse, ghost, matrix-synapse, mattermost-lts, mumble, plausible. |
+| `CHAOS_BASE_DEPLOY` | bool, default `False` | Base deploy uses `--chaos` so it survives untracked files in the recipe checkout (required when `install_steps.sh` copies in a `compose.ccci.yml` overlay — §5.6; implicit coupling, see §8 R7). Loaded by `lifecycle.py:_recipe_meta_flag`. Users: discourse, ghost. |
+
+### 4.4 Skips and intentional N/A (loader L1)
+
+| Key | Type / default | Meaning |
+|---|---|---|
+| `SKIP_GENERIC` | list of op names or `"all"`/`"*"`, default `[]` | Suppress the generic floor for the listed ops (overlay becomes override instead of additive). Two env equivalents at run time: `CCCI_SKIP_GENERIC=1` (all ops), `CCCI_SKIP_GENERIC_=1` (one op). Currently set by **no enrolled recipe** (env form is the one used, ad hoc). |
+| `EXPECTED_NA` | dict `{rung: reason}`, default `None` | Declares an N/A rung **intentional** (e.g. `{"backup": "stateless, nothing to back up"}`). Undeclared N/A is reported as an *unintentional coverage gap*. Both cap the achievable level — declaring does not un-cap, it only changes the report wording (`results.py`). User: custom-html-tiny. |
+| `BACKUP_CAPABLE` | bool, default auto-detect | Overrides the backup-tier capability detection (scan of recipe compose files for `backupbot.backup` labels, `generic.py:34`). `False` forces N/A; `True` forces the tier on. Users: custom-html-bkp-bad/rst-bad (harness self-test recipes). |
+
+### 4.5 Readiness & data-verification hooks (loader L1, callable values)
+
+| Key | Type / default | Meaning |
+|---|---|---|
+| `READY_PROBE` | callable `(domain) -> [probe, ...]`, default `None` | Extra readiness probes run after install AND after upgrade, before that tier's assertions. Probe dicts: HTTP `{host, path, ok}` or TCP `{tcp_host, tcp_port, stable}` (`stable`: must stay connectable across 3 checks — for UDP-adjacent voice ports etc.). Consumed at `lifecycle.py:516`. Users: lasuite-drive, mumble (TCP voice port). |
+| `BACKUP_VERIFY` | callable `(domain) -> bool`, default `None` | Post-backup data-capture check, retried — guards the truncated-dump race (backup snapshot taken before the seeded marker row hit disk). Return `False` → retry the backup, then fail. Users: discourse, ghost. |
+
+### 4.6 Dependencies / SSO (loaders L5 + L1)
+
+| Key | Type / default | Meaning |
+|---|---|---|
+| `DEPS` | list of recipe names, default `[]` | Dep recipes deployed alongside (e.g. `["keycloak"]`). Dep domain is `-<6hex>`, hashed from (parent, pr, ref, dep) — collision-free per run. Creds land in `$CCCI_DEPS_FILE` (JSON); tests use the `deps_apps` fixture; teardown deps LAST. Deploy-count guard becomes `1 + len(DEPS)`. Loaded by `deps.py:declared_deps`. Users: lasuite-docs/-drive/-meet. |
+| `OIDC_AT_INSTALL` | bool, default `False` | Provision deps **before** the single base deploy so `install_steps.sh` can wire OIDC env into that one deploy (reads `$CCCI_DEPS_FILE`). Default (legacy) is post-deploy provisioning + a `setup_custom_tests.sh` redeploy. Consumed at `run_recipe_ci.py:514`. Users: lasuite-drive, lasuite-meet. |
+
+### 4.7 Warm-canonical enrollment (loader L6)
+
+| Key | Type / default | Meaning |
+|---|---|---|
+| `WARM_CANONICAL` | bool, default `False` | Enrolls the recipe in the warm/canonical app system (`docs/warm.md`): green COLD runs on LATEST advance the canonical snapshot; the nightly sweep iterates enrolled recipes. Loaded by `canonical.py:is_canonical_enrolled`. User: custom-html. |
+
+### 4.8 Cosmetic (BROKEN — see §8 R2)
+
+| Key | Type / default | Meaning |
+|---|---|---|
+| `SCREENSHOT` | callable `(page, domain, meta) -> None` | Drives Playwright to a safe post-login view for the results-card screenshot (default: landing page). **Currently unreachable from the CI path**: `screenshot.py:41` reads it from the meta dict the orchestrator passes (`run_recipe_ci.py:1056`), but the L1 allowlist never loads `SCREENSHOT`, so the hook is always `None`. No recipe sets it (consistent with it never having worked). |
+
+## 5. Writing custom tests & hooks
+
+### 5.1 Lifecycle overlay assertions — `test_.py`
+
+One pytest file per lifecycle op (`install` / `upgrade` / `backup` / `restore`). The
+**orchestrator performs the op exactly once**; the overlay only *asserts* on the resulting state
+(HC3 op/assertion split — overlays never deploy, never restore, never mutate). The generic floor
+test runs additively against the same state.
+
+Conventions (see `tests/immich/test_backup.py` etc.):
+- use the `live_app` fixture (asserts `CCCI_APP_DOMAIN` is set, yields the domain)
+- use the `meta` fixture for HEALTH_*/timeouts (note: only the 4 base keys — §8 R3)
+- read op context from `$CCCI_OP_STATE_FILE` (JSON written by the orchestrator after the op:
+  versions, artifact paths)
+- execute in-container checks via `harness.lifecycle.exec_in_app(domain, service, cmd)`
+
+### 5.2 Pre-op seed hooks — `ops.py`
+
+`def pre_(domain, meta)` callables, imported and called by the orchestrator **before**
+performing the op.
+This is where data gets seeded so the post-op overlay can assert on it:
+
+```python
+# tests/immich/ops.py (pattern)
+def pre_upgrade(domain, meta): _psql(domain, "INSERT ... 'upgrade-survives'")
+def pre_backup(domain, meta):  _psql(domain, "INSERT ... 'original'")
+def pre_restore(domain, meta): _psql(domain, "DROP TABLE ci_marker")  # damage, restore must undo
+```
+
+Seed → op → assert is the whole pattern: `pre_backup` writes a marker, the orchestrator backs up,
+`pre_restore` destroys it, the orchestrator restores, `test_restore.py` asserts the marker is back.
+
+### 5.3 Custom tier — `functional/`, `playwright/`, top-level `test_*.py`
+
+All non-lifecycle `test_*.py` (discovery: `discovery.py:custom_tests`, recursive over the
+top-level dir + `functional/` + `playwright/`; files named `test_.py` excluded). Run in the
+CUSTOM tier, after restore, against the post-upgrade (PR-head) app. ALL discovered files run —
+cc-ci's and (if HC2-approved) repo-local's, additively.
+
+Enrollment contract (`docs/enroll-recipe.md`): ≥2 NEW functional tests beyond ports of existing
+upstream checks; ported tests carry `SOURCE:` comments. Playwright tests get the shared
+browser/harness helpers (`harness.browser`); SSO recipes get `harness.sso`
+(`setup_keycloak_realm` — idempotent, `oidc_password_grant` — provider-pluggable).
+
+Tests gate on deps via `CCCI_DEPS_READY` (skip-with-reason when `0`; the skip is counted and
+fails the run if deps were declared but unprovisionable — `run_recipe_ci.py:816`).
+
+### 5.4 Pre-deploy shell hook — `install_steps.sh`
+
+Runs after `abra app new` + `EXTRA_ENV` application + secret generation, **before** the base
+deploy. For setup that must precede the first deploy: writing extra config files into the recipe
+checkout, copying in a `compose.ccci.yml` overlay (§5.6), editing `.env` beyond simple key=val.
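
The `EXTRA_ENV` application step that runs ahead of this hook accepts either a dict or a callable taking the per-run domain (§4.3). A minimal sketch of that resolution follows, under stated assumptions: `resolve_extra_env` and `apply_env` are hypothetical helper names, not the harness's actual `_recipe_extra_env`, and the `sandbox.<domain>` value shape is invented purely for illustration.

```python
from typing import Callable, Dict, Optional, Union

# EXTRA_ENV may be a plain dict or a callable (domain) -> dict, per §4.3.
EnvSpec = Optional[Union[Dict[str, str], Callable[[str], Dict[str, str]]]]


def resolve_extra_env(spec: EnvSpec, domain: str) -> Dict[str, str]:
    """Turn an EXTRA_ENV-style value into a flat {key: value} dict (hypothetical helper)."""
    if spec is None:
        return {}
    values = spec(domain) if callable(spec) else spec
    return {str(k): str(v) for k, v in values.items()}


def apply_env(env_text: str, extra: Dict[str, str]) -> str:
    """Set KEY=value lines in .env text: replace live keys in place, append new ones."""
    remaining = dict(extra)
    out = []
    for line in env_text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)  # comments and untouched keys pass through unchanged
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Assumed value shape, for illustration only.
    extra = resolve_extra_env(lambda d: {"SANDBOX_DOMAIN": f"sandbox.{d}"}, "app.example.test")
    print(apply_env("DOMAIN=app.example.test\n", extra), end="")
```

Anything that does not fit this simple key=value model is what `install_steps.sh` is for.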
+
+Env contract: `CCCI_APP_DOMAIN`, `CCCI_RECIPE`, `CCCI_APP_ENV` (path to the app's `.env`), and —
+when `OIDC_AT_INSTALL` deps exist — `CCCI_DEPS_FILE`. Must locate the recipe checkout
+ABRA_DIR-aware: `RECIPE_DIR="${ABRA_DIR:-${HOME}/.abra}/recipes/${CCCI_RECIPE}"` (per-run
+`ABRA_DIR` since the concurrency restructure — a hardcoded `~/.abra` writes to the wrong tree).
+
+Graceful-generic rule: a recipe needing a hook but not shipping one simply fails the generic
+install — a correct reported outcome, not a harness error.
+
+### 5.5 Deps credential wiring — `setup_custom_tests.sh`
+
+For legacy (post-deploy) deps provisioning: runs after deps are up, reads `$CCCI_DEPS_FILE`
+(jq-readable JSON of dep creds/URLs), wires OIDC config via `abra app config set` + secrets, and
+redeploys. With `OIDC_AT_INSTALL = True` this hook is unnecessary (wiring happens in
+`install_steps.sh` before the only deploy) — preferred for new enrollments (one deploy, no
+deploy-count exception).
+
+### 5.6 CI-only compose overlay — `compose.ccci.yml`
+
+Not auto-discovered: `install_steps.sh` copies it into the recipe checkout, and the recipe must
+set `CHAOS_BASE_DEPLOY = True` so the base deploy (`--chaos`) tolerates the untracked file.
+Policy: minimal, justified fallback only (ghost's is a 15m `start_period` grace — a literal,
+because abra validates `start_period` before env substitution). The overlay is cc-ci-owned even
+though it rides in the recipe checkout.
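
The overlay-plus-flag coupling in §5.6 (listed as R7 in §8) could be caught before the base deploy rather than surfacing as an abra untracked-files error. The sketch below is one possible shape for such a pre-flight check; `overlay_flag_problem` is a hypothetical name and the harness does not currently ship this function.

```python
from pathlib import Path
from typing import Mapping, Optional


def overlay_flag_problem(recipe_tests_dir: Path, meta: Mapping[str, object]) -> Optional[str]:
    """Return a pointed message if an overlay is shipped without CHAOS_BASE_DEPLOY.

    Hypothetical pre-flight check: `recipe_tests_dir` is the recipe's cc-ci test
    directory and `meta` is the already-loaded recipe_meta namespace.
    """
    overlay = recipe_tests_dir / "compose.ccci.yml"
    hook = recipe_tests_dir / "install_steps.sh"
    if overlay.is_file() and hook.is_file() and not meta.get("CHAOS_BASE_DEPLOY", False):
        return (
            f"{overlay.name} is present next to {hook.name}, but recipe_meta.py does not set "
            "CHAOS_BASE_DEPLOY = True; the base deploy would reject the untracked overlay"
        )
    return None  # no overlay, or the flag is set: nothing to report
```

Whether the harness should fail on this message or auto-enable chaos for the base deploy is the open design question R7 raises; the sketch only detects the condition.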
+
+### 5.7 Environment contract summary (what custom code can read)
+
+| Var | Set for | Meaning |
+|---|---|---|
+| `CCCI_APP_DOMAIN` | all tests + hooks | the app's per-run domain |
+| `CCCI_BASE_URL` | approved repo-local code | `https://` |
+| `CCCI_RECIPE`, `CCCI_APP_ENV` | `install_steps.sh` | recipe name, app `.env` path |
+| `CCCI_OP_STATE_FILE` | overlay tests | JSON op context (versions, artifacts) |
+| `CCCI_DEPS_FILE` | deps hooks + tests | JSON dep creds dict |
+| `CCCI_DEPS_READY` / `CCCI_DEPS_NOT_READY_REASON` | custom tier | gate SSO tests, skip-with-reason |
+
+## 6. Run-model context (what the settings plug into)
+
+One deploy chain per run (full detail: `docs/testing.md` §2):
+
+```
+deploy BASE (UPGRADE_BASE_VERSION or recipe_versions[-2]; EXTRA_ENV; install_steps.sh;
+             CHAOS_BASE_DEPLOY?; OIDC_AT_INSTALL deps first?)
+  → INSTALL tier (READY_PROBE; generic + overlay asserts)
+  → pre_upgrade → chaos-deploy PR HEAD (UPGRADE_EXTRA_ENV)
+  → UPGRADE tier (READY_PROBE; version-label == head_ref)
+  → pre_backup → backup (BACKUP_CAPABLE; BACKUP_VERIFY)
+  → BACKUP tier
+  → pre_restore → restore
+  → RESTORE tier
+  → CUSTOM tier (functional/ + playwright/; deps via CCCI_DEPS_*)
+  → teardown (deps LAST)
+```
+
+Deploy-count guard (DG4.1): exactly `1 + len(DEPS)` deploys per run (chaos redeploys don't
+count); the per-run counter file is keyed by run since the concurrency restructure.
+
+## 7. Local iteration
+
+```
+RECIPE= PR= REF= SRC=recipe-maintainers/ \
+  STAGES=install,upgrade,backup,restore,custom \
+  cc-ci-run runner/run_recipe_ci.py
+```
+
+(`docs/enroll-recipe.md` §5 for the full loop, including dep teardown caveats.)
+
+## 8. Known limitations & restructuring candidates
+
+The review section. Ordered by how much they'd shape a restructure.
+
+**R1 — Six divergent meta loaders (the core drift hazard).** §4's L1–L6: every loader re-`exec()`s
+`recipe_meta.py` and cherry-picks its own keys.
+Adding a key means knowing *which* loader to touch
+(or that you must extend the L1 allowlist — `SCREENSHOT` proves people don't, R2). Two conventions
+coexist: L1's explicit allowlist vs L3–L6's ad-hoc `ns.get(...)` which silently bypasses it.
+*Candidate:* one `harness.meta.load(recipe) -> RecipeMeta` with a declarative key registry
+(name, type, default, validator, consumer) as the single source of truth; L1–L6 become lookups
+into the one loaded object; the registry also generates §4 of this doc (kills doc drift, R5).
+
+**R2 — `SCREENSHOT` is a dead knob.** Fully implemented consumer (`screenshot.py`), documented
+hook contract, never reachable: the orchestrator's allowlist omits it, so the dict passed at
+`run_recipe_ci.py:1056` can never contain it. Direct evidence of R1. *Candidate:* fix trivially by
+adding to the allowlist — or delete the hook path if post-login screenshots aren't wanted; decide
+during the restructure.
+
+**R3 — The pytest `meta` fixture sees 4 keys.** `tests/conftest.py:_recipe_meta` loads only
+HEALTH_*/timeouts. An overlay test wanting e.g. `EXPECTED_NA` or a recipe constant must re-exec
+the file itself. Probably intended minimalism, but it's a third key-set to keep in sync.
+*Folds into R1.*
+
+**R4 — Settings split across three config languages** (§1): recipe_meta keys, file-presence
+(`install_steps.sh` existing changes deploy behavior), and run-time env (`CCCI_SKIP_GENERIC*`).
+A reviewer asking "what does this recipe customize?" must check all three. *Candidate:* keep the
+three surfaces (they serve different actors) but make the run header log a single resolved
+"customization manifest" per run: every non-default key + every discovered hook file + every
+CCCI_* override, in one block.
+
+**R5 — Reference-doc drift already happened.** `docs/testing.md` documents 6 meta keys,
+`docs/enroll-recipe.md` shows others by example; neither is complete (18 keys exist). This doc is
+now complete but handwritten — it will drift too.
+*Candidate:* generate the key table from the R1 registry (test asserts doc ⊆ registry).
+
+**R6 — No schema validation / silent typos.** Unknown top-level names in `recipe_meta.py` are
+ignored, which is load-bearing (recipes keep private constants there: mumble's
+`WELCOME_TEXT_MARKER`, `MAX_USERS`). Consequence: misspelling `READY_PROBE` as `READINESS_PROBE`
+silently disables the probe — the run goes green with less coverage, the worst failure mode for a
+CI harness. *Candidate:* with the R1 registry, warn (not fail) on ALL-CAPS top-level names that
+are not registered and not referenced by the recipe's own tests; or namespace private constants
+(`_WELCOME_TEXT_MARKER`).
+
+**R7 — `compose.ccci.yml` ⇄ `CHAOS_BASE_DEPLOY` implicit coupling.** The overlay only works if
+the recipe *also* sets the flag; forgetting it fails the base deploy with an abra
+untracked-files error far from the cause. *Candidate:* if `install_steps.sh` exists alongside a
+`compose.ccci.yml`, the harness could auto-enable chaos for the base deploy (or at least assert
+the flag and fail with a pointed message).
+
+**R8 — `SKIP_GENERIC` (meta form) has zero users.** Only the env-var form is used, ad hoc. Either
+the meta key earns its place (first real user) or it's surface to delete in the restructure.
+
+**R9 — `recipe_meta.py` is code, not config.** Five keys take callables (`EXTRA_ENV`,
+`UPGRADE_EXTRA_ENV`, `READY_PROBE`, `BACKUP_VERIFY`, `SCREENSHOT`), so the file must stay an
+`exec()`d Python module — it can't be validated as data, serialized into results, or diffed
+declaratively. This is a real expressiveness need (cryptpad derives `SANDBOX_DOMAIN` from the
+per-run domain), not an accident. *Candidate if restructuring:* split data keys (TOML-able,
+schema-validated) from a `hooks.py` (callables only) — but weigh against the cost of two files
+per recipe; the R1 registry gets most of the value without the split.
+
+## 9. File / symbol index
+
+| Concern | Where |
+|---|---|
+| Orchestrator meta loader (L1, allowlist) | `runner/run_recipe_ci.py:250` `_load_meta` |
+| Pytest meta fixture (L2) | `tests/conftest.py` `_recipe_meta` |
+| `EXTRA_ENV` loader (L3) | `runner/harness/lifecycle.py:114` `_recipe_extra_env` |
+| Boolean-flag loader (L4) | `runner/harness/lifecycle.py:132` `_recipe_meta_flag` |
+| `DEPS` loader (L5) | `runner/harness/deps.py:37` `declared_deps` |
+| `WARM_CANONICAL` loader (L6) | `runner/harness/canonical.py:36` `is_canonical_enrolled` |
+| Overlay/custom/hook discovery + HC2 gate | `runner/harness/discovery.py` |
+| HC2 allowlist | `tests/repo-local-approved.txt` |
+| Generic assertions + `BACKUP_CAPABLE` detect | `runner/harness/generic.py` |
+| `READY_PROBE` / `CHAOS_BASE_DEPLOY` consumption | `runner/harness/lifecycle.py:516` / `:283` |
+| `EXPECTED_NA` reporting | `runner/harness/results.py` |
+| Dead `SCREENSHOT` consumer | `runner/harness/screenshot.py:36`, called `run_recipe_ci.py:1056` |
+| Skip-generic logic (meta + env) | `runner/run_recipe_ci.py:285` |
+| Worked examples | `tests/ghost/` (overlay+chaos), `tests/mumble/` (TCP probe, UPGRADE_EXTRA_ENV), `tests/lasuite-drive/` (DEPS+OIDC_AT_INSTALL), `tests/immich/` (ops.py seed pattern) |
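
As a closing illustration of the R1 and R6 candidates in §8, a single registry-backed loader might look roughly like the sketch below. It is not the proposed `harness.meta.load` itself: `KeySpec`, `REGISTRY`, and `load_meta` are hypothetical names, the registry lists only four of the documented keys (with the defaults given in §4), and the loader takes the file's source text rather than a recipe name so the example stays self-contained.

```python
import copy
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class KeySpec:
    default: Any
    kinds: Tuple[type, ...]        # accepted plain-data types
    allow_callable: bool = False   # True for keys that may be a hook


# Partial, illustrative registry; defaults taken from §4 of this doc.
REGISTRY: Dict[str, KeySpec] = {
    "HEALTH_PATH": KeySpec("/", (str,)),
    "DEPLOY_TIMEOUT": KeySpec(600, (int,)),
    "EXTRA_ENV": KeySpec({}, (dict,), allow_callable=True),
    "READY_PROBE": KeySpec(None, (), allow_callable=True),
}


def load_meta(source: str) -> Tuple[Dict[str, Any], List[str]]:
    """Exec recipe_meta source once; return (resolved keys, warnings)."""
    namespace: Dict[str, Any] = {}
    exec(source, namespace)  # trusted, in-repo file, as in the current loaders
    meta = {name: copy.copy(spec.default) for name, spec in REGISTRY.items()}
    warnings: List[str] = []
    for name, value in namespace.items():
        if name.startswith("_") or not name.isupper():
            continue  # dunders, _private constants, helper functions
        spec = REGISTRY.get(name)
        if spec is None:
            # R6: warn rather than fail, since private constants are legitimate.
            warnings.append(f"unregistered key {name!r}: typo or private constant?")
            continue
        if not (isinstance(value, spec.kinds) or (spec.allow_callable and callable(value))):
            raise TypeError(f"{name} has unsupported type {type(value).__name__}")
        meta[name] = value
    return meta, warnings
```

With this shape, the `READINESS_PROBE` misspelling from R6 yields a warning while `READY_PROBE` stays at its `None` default, instead of the probe being dropped silently.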