# The cc-ci test architecture: generic suite + layered recipe overlays (Phase 1d)

Every recipe gets a **generic lifecycle test suite for free**. Recipe-specific tests *layer on top* of the generic default rather than being the only thing that runs. So `!testme` is meaningful on **any** recipe immediately (zero config), and adding recipe-specific coverage is a thin overlay.

## The model: tiers against one shared deployment

A run is a sequence of **tiers**. The orchestrator (`runner/run_recipe_ci.py`) deploys the app **once** and runs each tier against that single live deployment, then tears it down **once** in a `finally`. Lifecycle ops mutate the deployment **in place**: there is **no redeploy per tier** (asserted every run: `deploy-count = 1`).

```
deploy ONCE  (base version: the previous published version when an upgrade tier will run
              and one exists, so upgrade is a real previous→target; else the target /
              current PR head)
  → INSTALL   assertions (app already deployed: assert it really serves)
  → UPGRADE   abra app upgrade in place → target; assert reconverge + serving + the deployment MOVED
  → BACKUP    abra app backup create; assert a snapshot artifact   (backup-capable recipes only)
  → RESTORE   abra app restore; assert healthy + serving           (backup-capable recipes only)
  → CUSTOM    any non-lifecycle test_*.py                          (only if defined)
teardown ONCE (in finally)
```

Each tier is its own `pytest` invocation, so the run reports **per-operation** pass / fail / skip (`install / upgrade / backup / restore / custom`). The shared live domain is passed to each tier in `CCCI_APP_DOMAIN` and exposed by the `live_app` fixture; **tiers are assertion-only and never deploy or tear down** (that is the orchestrator's job).
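The control flow of the tier model can be sketched in a few lines of Python. This is a minimal illustration, not the contents of `runner/run_recipe_ci.py`: the function name `run_tiers` and the injected callables `deploy`, `run_tier`, and `teardown` are hypothetical stand-ins, and the sketch assumes that later tiers still run after an earlier tier fails, which the description above does not state.

```python
from typing import Callable, Dict

TIERS = ("install", "upgrade", "backup", "restore", "custom")


def run_tiers(
    deploy: Callable[[], str],             # hypothetical: deploys the app, returns its domain
    run_tier: Callable[[str, str], bool],  # hypothetical: runs one tier, True on pass
    teardown: Callable[[str], None],       # hypothetical: removes the deployment
    backup_capable: bool = True,
    has_custom: bool = False,
) -> Dict[str, str]:
    """Deploy once, run every tier against that deployment, tear down once."""
    results: Dict[str, str] = {}
    deploy_count = 0
    domain = deploy()
    deploy_count += 1
    try:
        for tier in TIERS:
            # Backup/restore are an N/A skip for recipes that are not backup-capable.
            if tier in ("backup", "restore") and not backup_capable:
                results[tier] = "skip"
                continue
            # The custom tier only runs when non-lifecycle tests are defined.
            if tier == "custom" and not has_custom:
                results[tier] = "skip"
                continue
            # Tiers are assertion-only: they receive the live domain and never deploy.
            results[tier] = "pass" if run_tier(tier, domain) else "fail"
    finally:
        teardown(domain)  # runs even if a tier raises
    assert deploy_count == 1, "a tier must never trigger a redeploy"
    return results
```

Passing the deployment in as a domain string mirrors how each tier receives `CCCI_APP_DOMAIN`. Keeping the teardown in `finally` is what guarantees a single cleanup regardless of tier outcomes.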
## The generic default (recipe-agnostic)

Lives in the shared harness (`runner/harness/generic.py` + `tests/_generic/test_.py`), so there is no per-recipe copy-paste:

- **install** (`generic.assert_serving`): services converged (the app's *own* replicas are N/N) **and** a real HTTP(S) response in `HEALTH_OK` (which excludes 404, so a Traefik unmatched-router fallback fails) **and** the body isn't Traefik's default 404 page. The check is a bounded poll (no bare `sleep`), so a state-mutating op has time to settle while a persistent failure still fails within the timeout. A CA-verified TLS handshake is also run as an **infra cert sanity check** (catches a lapsed/mis-rotated wildcard); it does **not** distinguish app-vs-fallback (Traefik serves the wildcard zone-wide); the converged + non-404 check does that.
- **upgrade** (`generic.do_upgrade`): `abra app upgrade` in place to the target, then assert serving **and that the deployment actually moved** (the `coop-cloud..version` label and/or image changed). The move-assertion makes a vacuous no-op upgrade impossible to pass.
- **backup** (`generic.do_backup`): `abra app backup create`; assert a snapshot artifact was produced (the `snapshot_id` in the create output). Honest limit: the generic verifies the *mechanism*, not app-specific data integrity (that's an overlay, below).
- **restore** (`generic.do_restore`): `abra app restore`; assert the app is healthy + serving afterwards.

**Backup-capability** is auto-detected: a recipe is backup-capable iff a `compose*.yml` carries a truthy `backupbot.backup` label (override with `BACKUP_CAPABLE` in `recipe_meta.py`). For non-backup-capable recipes the backup/restore tiers are a clean **N/A skip**, not a failure.

## Recipe overlays: override or extend (the generic is always the default)

Convention: a recipe-specific tier is a file named exactly `test_install.py` / `test_upgrade.py` / `test_backup.py` / `test_restore.py`.
**If present it OVERRIDES the generic for that op; if absent, the generic runs** (the invariant). Discovery looks in two overlay locations before falling back to the generic, with this precedence:

```
repo-local  /tests/test_.py          (upstream-authoritative; wins same-name collisions)
  > cc-ci   tests//test_.py          (CI-curated overlay)
  > generic tests/_generic/test_.py  (the floor; always present)
```

- **Override**: a present `test_.py` replaces the generic assertions for that op.
- **Extend by composition**: an overlay may `from harness import generic` and call `generic.assert_serving(...)` / `generic.do_upgrade(...)` / `do_backup` / `do_restore`, then add its own recipe-specific assertions. (This is how every overlay reuses the generic op + serving check and layers data-continuity on top; no separate "extend" mechanism is needed.)
- **Custom (non-lifecycle) `test_*.py`**: any other `test_*.py` (e.g. `test_sso.py`) is **opt-in and additive**: it has no generic equivalent and runs only when present, discovered from **both** locations. Lifecycle names are excluded from the custom set.

Overlays are **assertion-only** and run against the shared deployment via the `live_app` fixture (so deploy-count stays 1). A data-continuity overlay reads/writes the app's *volume/DB* (via `lifecycle.exec_in_app`, robust to the serving layer), e.g.:

- `test_upgrade.py`: seed a marker → `generic.do_upgrade(...)` → assert the marker survived.
- `test_backup.py`: seed "original" → `generic.do_backup(...)` → mutate to "mutated".
- `test_restore.py`: `generic.do_restore(...)` → assert the marker is back to "original" (the backup tier's mutation persists on the shared deployment until the restore tier runs).

See `tests/custom-html/` (volume marker) and `tests/keycloak/`, `tests/matrix-synapse/`, `tests/lasuite-docs/` (admin-API / `db`-service markers) for worked examples.

## Custom install-steps hook (and the graceful-generic rule)

Some recipes need setup the generic flow won't do (pre-seed content, set an env/secret, run a one-off command).
Provide a shell hook: `tests//install_steps.sh` (cc-ci) or repo-local `tests/install_steps.sh` (repo-local wins). The orchestrator runs it during the install tier **after `abra app new` + env defaults, before `abra app deploy`**, with env:

- `CCCI_APP_DOMAIN`: the run's app domain
- `CCCI_RECIPE`: the recipe name
- `CCCI_APP_ENV`: path to the app's `.env` (for `abra`-side edits)

**Graceful-generic rule:** a recipe with **no** hook still attempts the generic install. A recipe that genuinely needs a step will **fail the generic install, and that's the correct, reported outcome** (per-op `install: fail`); the fix is to add the step, not to special-case the harness.

Worked example: `tests/custom-html-tiny/install_steps.sh` seeds an `index.html` into the static server's content volume. Without the hook the generic install fails with a 404; with it, the install passes.

## How to add a recipe overlay (zero → some coverage)

1. The recipe is already testable with **zero config**: enrol it (poll list + mirror) and the generic suite runs (`docs/enroll-recipe.md`).
2. To add recipe-specific coverage, drop a `tests//test_.py` overlay (copy an existing one, e.g. `tests/keycloak/test_upgrade.py`). Reuse the generic op via `generic.do_(...)` and add your assertions. Read/write app state through `lifecycle.exec_in_app` (volume/DB), not HTTP, for data checks. Set per-recipe knobs (health path, timeouts) in `recipe_meta.py`.
3. If the recipe needs setup before it can serve, add `tests//install_steps.sh`.
4. Never weaken or skip an assertion to make a run pass; a red tier is information.
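Step 2 can be sketched as a data-continuity check for the upgrade tier. This is only an illustration under stated assumptions: the real signatures of `generic.do_upgrade` and `lifecycle.exec_in_app` are not shown in this document, so both are passed in as plain callables with assumed shapes (`do_upgrade(domain)` and `exec_in_app(domain, argv)` returning the command's stdout). The helper name and `MARKER_PATH` are hypothetical.

```python
from typing import Callable, List

# Hypothetical marker location; choose a path on a volume the recipe persists.
MARKER_PATH = "/data/ccci_marker"


def check_marker_survives_upgrade(
    domain: str,
    do_upgrade: Callable[[str], None],             # assumed shape of generic.do_upgrade
    exec_in_app: Callable[[str, List[str]], str],  # assumed shape of lifecycle.exec_in_app
    marker: str = "ccci-original",
) -> None:
    """Seed a marker, run the generic upgrade, assert the marker is still there."""
    # Seed through the container (volume/DB), not over HTTP.
    # "$1" and "$2" are filled from the arguments after the script's $0 ("sh").
    exec_in_app(domain, ["sh", "-c", 'printf %s "$1" > "$2"', "sh", marker, MARKER_PATH])

    # Reuse the generic op so its own assertions (serving, moved) still apply.
    do_upgrade(domain)

    # Recipe-specific assertion layered on top of the generic ones.
    found = exec_in_app(domain, ["cat", MARKER_PATH]).strip()
    assert found == marker, f"marker lost across upgrade: {found!r} != {marker!r}"
```

In an actual `test_upgrade.py`, a test function would take the `live_app` fixture for the domain and hand the harness functions to a helper like this one. Injecting them as callables here just keeps the sketch runnable without the harness.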
Per-recipe config (`tests//recipe_meta.py`, all optional):

```python
HEALTH_PATH = "/realms/master"   # path that returns a healthy status (default "/")
HEALTH_OK = (200,)               # acceptable status codes (default 200/301/302)
DEPLOY_TIMEOUT = 600             # seconds for services to converge (default 600)
HTTP_TIMEOUT = 600               # seconds for the app to answer (default 300)
BACKUP_CAPABLE = True            # override backup-capability auto-detection (default: scan compose)
EXTRA_ENV = {"KEY": "value"}     # or EXTRA_ENV(domain) -> dict; extra .env keys set at deploy
```

The harness self-tests for discovery/precedence live in `tests/unit/` (run: `cc-ci-run -m pytest tests/unit`); they are never picked up as overlays/custom tests.
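The discovery and precedence rules those self-tests cover can be illustrated with a short sketch. It is not the harness's actual code: the function names are hypothetical, the directories are passed in rather than derived from the repo layout, and it assumes that a lifecycle file is named `test_` plus the op name in every location. For custom tests the document only says both locations are searched; letting the first location win a same-name collision is an assumption carried over from the lifecycle rule.

```python
from pathlib import Path
from typing import Dict, List, Sequence

LIFECYCLE_OPS = ("install", "upgrade", "backup", "restore")


def resolve_lifecycle_test(op: str, overlay_dirs: Sequence[Path], generic_dir: Path) -> Path:
    """Return the test file for a lifecycle op: first overlay hit, else the generic."""
    if op not in LIFECYCLE_OPS:
        raise ValueError(f"not a lifecycle op: {op}")
    name = f"test_{op}.py"
    # overlay_dirs is ordered by precedence: repo-local first, then the cc-ci overlay.
    for directory in overlay_dirs:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    # The generic is the floor and is expected to always exist.
    return generic_dir / name


def discover_custom_tests(overlay_dirs: Sequence[Path]) -> List[Path]:
    """Collect non-lifecycle test_*.py files from every overlay location."""
    lifecycle_names = {f"test_{op}.py" for op in LIFECYCLE_OPS}
    found: Dict[str, Path] = {}
    for directory in overlay_dirs:
        if not directory.is_dir():
            continue
        for path in sorted(directory.glob("test_*.py")):
            if path.name in lifecycle_names:
                continue  # lifecycle names are never part of the custom set
            found.setdefault(path.name, path)  # assumption: earlier location wins
    return [found[name] for name in sorted(found)]
```

Because the generic is only the final fallback, an overlay never has to opt out of it explicitly; its presence alone is the override.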