a) compose.ccci.yml is FIRST-CLASS: the harness auto-copies tests/<recipe>/compose.ccci.yml into the run's recipe checkout (ABRA_DIR-aware, lifecycle.provide_ccci_overlay) and auto-chaoses the pinned base deploy on its presence (kills the R7 implicit coupling). ghost/discourse install_steps.sh (copy-only boilerplate) deleted; CHAOS_BASE_DEPLOY removed from both metas + the registry.

b) install-time deps wiring is the ONLY mode: deps with DEPS provision BEFORE the single deploy; legacy post-deploy provisioning + the setup_custom_tests.sh invocation machinery deleted. lasuite-docs migrated to install_steps.sh OIDC wiring (same env names/values as the old hook — only the timing moved); lasuite-drive's remaining post-deploy MinIO bucket one-shot moved to ops.py pre_install; both setup_custom_tests.sh files deleted; OIDC_AT_INSTALL removed from drive/meet metas + the registry.

c) SKIP_GENERIC meta key deleted (zero users). Env form CCCI_SKIP_GENERIC* stays as the documented dev-only escape hatch; when active in a drone CI run the orchestrator prints a loud !! warning (manifest embedding lands in P5).

d) conftest cleanup: dead pre-deploy-once fixtures deployed/deployed_app deleted (zero users), app_domain + _short + _wait_healthy dropped (only users were the deleted fixtures); deps_apps+deps_creds consolidated into ONE deps fixture (entries expose .domain etc. as attributes; dict access intact); the 6 lasuite test files renamed deps_creds->deps (fixture name only — assertions and flows byte-identical). requires_deps marker + F2-11 skip-report plumbing unchanged.

Registry is now exactly the 14 final keys; docs §4 table regenerated. Stale setup_custom_tests/OIDC_AT_INSTALL prose in docstrings/comments/assert MESSAGES updated (no assert logic or expected value touched).

Verified on cc-ci: cc-ci-run -m pytest tests/unit -q -> 175 passed; scripts/lint.sh -> PASS.

# Recipe customization — review spec

Status: REVIEW SPEC — describes the customization surface as it exists today (main), written so
the structure can be reviewed and potentially restructured. §8 lists known limitations and
restructuring candidates; everything before it is purely descriptive.

Companion docs: `docs/testing.md` (test architecture / tier semantics), `docs/enroll-recipe.md`
(step-by-step enrollment). This doc is the **complete reference** for the two questions those docs
answer only partially:

1. How are custom tests written for a particular recipe?
2. What are ALL the per-recipe CI settings, where do they live, and who reads them?

---

## 1. The three customization surfaces

A recipe customizes its CI through **three distinct mechanisms** (worth noticing for the
restructure review — they are three different config languages):

| Surface | Form | Examples |
|---|---|---|
| **Declarative settings** | Python assignments in `tests/<recipe>/recipe_meta.py` | `DEPLOY_TIMEOUT = 1500`, `UPGRADE_BASE_VERSION = "2.3.1+..."` |
| **Code hooks** | Callables in `recipe_meta.py`, `ops.py` functions, shell hooks | `def READY_PROBE(domain): ...`, `pre_upgrade()`, `install_steps.sh` |
| **File presence** | A file existing at a discovered path changes behavior | `test_upgrade.py` overlay, `functional/test_*.py`, `compose.ccci.yml` |

There is additionally a fourth, operator-facing surface: **environment variables**
(`CCCI_SKIP_GENERIC*`) that override declarative settings at run time (§4.4).

## 2. Zero-config baseline

A recipe with **no `tests/<recipe>/` directory at all** still gets the full generic floor:

- deploy base version → INSTALL (generic `assert_serving`: HTTP on `/`, expect 200/301/302)
- chaos-upgrade to PR head → UPGRADE (generic `assert_upgraded`: version label matches head, converged, serving)
- BACKUP (generic `assert_backup_artifact`) — iff the recipe's compose files carry
  `backupbot.backup` labels (auto-detected), else N/A
- RESTORE (generic `assert_restore_healthy`)
- CUSTOM tier: empty (no custom tests discovered)
- teardown

Defaults: `HEALTH_PATH="/"`, `HEALTH_OK=(200,301,302)`, `DEPLOY_TIMEOUT=600`, `HTTP_TIMEOUT=300`.
Everything in this doc is opt-in deviation from that floor. The cardinal invariant
(docs/testing.md §1): the generic floor is **always on** and never depends on custom code;
custom is **additive** by default.

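As a rough illustration of how the four defaults combine with a recipe's overrides, the sketch below layers a recipe's settings over the floor and applies the status-code rule of the generic serving check. `DEFAULTS`, `effective_settings` and `is_serving` are hypothetical names chosen for this example; they are not harness API.

```python
# Hypothetical sketch: these names are illustrative, not harness API.
DEFAULTS = {
    "HEALTH_PATH": "/",
    "HEALTH_OK": (200, 301, 302),
    "DEPLOY_TIMEOUT": 600,
    "HTTP_TIMEOUT": 300,
}


def effective_settings(overrides=None):
    """Return the floor defaults with any recipe overrides layered on top."""
    merged = dict(DEFAULTS)
    merged.update(overrides or {})
    return merged


def is_serving(status, settings):
    """True when an HTTP status counts as healthy under the given settings."""
    return status in settings["HEALTH_OK"]


# A recipe with no tests/<recipe>/ directory: pure defaults.
zero_config = effective_settings()
# A recipe that only raises the two timeouts (values taken from the immich example in §4.1).
slow_recipe = effective_settings({"DEPLOY_TIMEOUT": 1500, "HTTP_TIMEOUT": 600})
```

Only the overridden keys change; every other key keeps its floor value, which is the "opt-in deviation" rule stated above.
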
## 3. The per-recipe tree — every file that can exist

Two locations, with precedence and a security gate between them:

- **cc-ci-owned**: `tests/<recipe>/` in this repo (trusted, maintainer-reviewed)
- **repo-local**: the recipe repo's own `tests/` dir (PR-author-controlled → **default-deny**,
  consulted only when the recipe is listed in `tests/repo-local-approved.txt` — gate HC2,
  centralized in `runner/harness/discovery.py`)

```
tests/<recipe>/                 # cc-ci side (repo-local mirrors the same shape)
├── recipe_meta.py              # ALL declarative settings + meta callables (§4)
├── test_<op>.py                # lifecycle overlay assertions, op ∈ install|upgrade|backup|restore (§5.1)
├── ops.py                      # pre_<op>(domain, meta) seed hooks (§5.2)
├── test_*.py                   # custom-tier tests (top-level, cross-cutting) (§5.3)
├── functional/test_*.py        # custom tier: parity ports + recipe-specific (§5.3)
├── playwright/test_*.py        # custom tier: UI flows (§5.3)
├── install_steps.sh            # pre-deploy shell hook (§5.4)
├── setup_custom_tests.sh       # deps/OIDC credential wiring hook (§5.5)
├── compose.ccci.yml            # CI-only compose overlay (via install_steps) (§5.6)
└── PARITY.md                   # enrollment contract doc (human-read only)
```

Precedence (machine-docs/DECISIONS.md, implemented in `discovery.py`):

- lifecycle overlay `test_<op>.py`: repo-local **wins** over cc-ci (same-name collision); the
  generic floor still runs additively alongside.
- custom tier `test_*.py`: **ALL** run, from both locations (no collision concept).
- `install_steps.sh`: repo-local > cc-ci, or none.
- `ops.py` pre-op hook: cc-ci wins; repo-local consulted only if approved.
- `recipe_meta.py`: cc-ci only — repo-local recipes cannot set CI settings (by design; the
  settings surface stays maintainer-controlled).

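The lifecycle-overlay rule (repo-local wins on a same-name collision, but only behind the HC2 gate) can be sketched as below. This is a minimal illustration under stated assumptions: `resolve_overlay` and its parameters are hypothetical, and the real logic lives in `runner/harness/discovery.py`, which this sketch does not reproduce.

```python
import tempfile
from pathlib import Path


def resolve_overlay(op, cc_ci_dir, repo_local_dir, approved):
    """Pick the overlay file for `op` (hypothetical helper, not the discovery.py API).

    An HC2-approved repo-local test_<op>.py wins; otherwise cc-ci's copy is used;
    None means only the generic floor runs for this op.
    """
    name = f"test_{op}.py"
    if approved and (repo_local_dir / name).is_file():
        return repo_local_dir / name
    if (cc_ci_dir / name).is_file():
        return cc_ci_dir / name
    return None


# Demo on throwaway directories: both sides ship a test_upgrade.py overlay.
with tempfile.TemporaryDirectory() as tmp:
    cc_ci = Path(tmp) / "cc-ci"
    repo_local = Path(tmp) / "repo-local"
    for directory in (cc_ci, repo_local):
        directory.mkdir()
        (directory / "test_upgrade.py").write_text("# overlay\n")
    when_unapproved = resolve_overlay("upgrade", cc_ci, repo_local, approved=False).parent.name
    when_approved = resolve_overlay("upgrade", cc_ci, repo_local, approved=True).parent.name
    no_overlay = resolve_overlay("backup", cc_ci, repo_local, approved=True)
```

The default-deny gate is the first condition: an unapproved recipe never has its repo-local directory consulted, whatever files it contains.
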
## 4. `recipe_meta.py` — complete settings reference

The single settings file. Plain Python, `exec()`d by the harness (trusted, in-repo). A key is "set"
by a top-level assignment or `def`. Unknown names are ignored silently (a recipe may keep private
constants here, e.g. mumble's `WELCOME_TEXT_MARKER` — but see §8 R6: typos in real key names are
also silently ignored).

**Loader column legend** — this is the structural finding for the review (§8 R1). There is no
single loader; six independent code paths each `exec()` the file and pick out their own keys:

| # | Loader | Keys it sees |
|---|---|---|
| L1 | `runner/run_recipe_ci.py:_load_meta` (orchestrator) | 4 base + explicit 8-key allowlist |
| L2 | `tests/conftest.py:_recipe_meta` (pytest `meta` fixture) | 4 base keys ONLY |
| L3 | `runner/harness/lifecycle.py:_recipe_extra_env` | `EXTRA_ENV` only |
| L4 | `runner/harness/lifecycle.py:_recipe_meta_flag` | boolean flags by name (`CHAOS_BASE_DEPLOY`) |
| L5 | `runner/harness/deps.py:declared_deps` | `DEPS` only |
| L6 | `runner/harness/canonical.py:is_canonical_enrolled` | `WARM_CANONICAL` only |

> **Restructure status (rcust P1):** the six loaders above are HISTORY — they have been replaced by
> the single registry-backed loader `runner/harness/meta.py::load(recipe) -> RecipeMeta` (the only
> `exec()` of `recipe_meta.py`). Unknown ALL-CAPS keys / type mismatches are now hard errors;
> underscore-prefixed names are recipe-private. The authoritative key reference is the generated
> table below; the per-loader subsections §4.1–§4.8 are retained for context until the P6 doc
> rewrite.

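The validation rules in the restructure note (one `exec()`, unknown ALL-CAPS keys and type mismatches are hard errors, underscore-prefixed names are private) can be sketched as follows. This is not the code in `runner/harness/meta.py`: `KEYS` here is a hypothetical two-key excerpt, and `MetaError` and `load_source` are names invented for the example.

```python
import copy

# Hypothetical two-key excerpt of a registry: name -> (expected type, default).
KEYS = {
    "DEPLOY_TIMEOUT": (int, 600),
    "DEPS": (list, []),
}


class MetaError(ValueError):
    """Raised for unknown or mistyped settings (illustrative name)."""


def load_source(source):
    """exec() recipe_meta source once and validate it against KEYS."""
    namespace = {}
    exec(source, namespace)
    # Start from registry defaults; copy so mutable defaults are never shared.
    meta = {name: copy.copy(default) for name, (_, default) in KEYS.items()}
    for name, value in namespace.items():
        if name.startswith("_") or not name.isupper():
            continue  # recipe-private names, helpers, __builtins__
        if name not in KEYS:
            raise MetaError(f"unknown key {name!r}")
        expected = KEYS[name][0]
        if not isinstance(value, expected):
            raise MetaError(f"{name} must be {expected.__name__}")
        meta[name] = value
    return meta


loaded = load_source("DEPLOY_TIMEOUT = 1500\n_WELCOME_TEXT_MARKER = 'hi'\n")
```

Under these rules a misspelled key such as `READINESS_PROBE` raises instead of silently disabling the probe, which is the failure mode §8 R6 describes.
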
<!-- META-TABLE-START -->

_This table is GENERATED from the `runner/harness/meta.py` KEYS registry by `scripts/gen-meta-docs.py` — do not edit by hand (a unit test pins the sync)._

| Key | Type | Default | Meaning |
|---|---|---|---|
| `HEALTH_PATH` | `str` | `'/'` | Path probed for serving/health checks (deploy wait + generic `assert_serving`). |
| `HEALTH_OK` | `tuple[int]` | `(200, 301, 302)` | Acceptable HTTP status codes for health. |
| `DEPLOY_TIMEOUT` | `int` | `600` | Max seconds to wait for swarm convergence per deploy. |
| `HTTP_TIMEOUT` | `int` | `300` | Max seconds to wait for HTTP health after convergence. |
| `BACKUP_CAPABLE` | `bool` | `None` | Override the backup-tier capability auto-detect (compose `backupbot.backup` labels). `False` forces N/A; `True` forces the tier on; unset = auto-detect. |
| `EXPECTED_NA` | `dict` | `None` | Declare an N/A rung intentional: `{rung: reason}`. The cap stands either way; only the report wording changes. |
| `READY_PROBE` | `hook` | `None` | Callable returning extra readiness probes, run after install AND after upgrade: HTTP `{host, path, ok}` or TCP `{tcp_host, tcp_port, stable}`. |
| `UPGRADE_BASE_VERSION` | `str` | `None` | Exact published tag overriding the upgrade tier's base (default: `recipe_versions[-2]`). |
| `BACKUP_VERIFY` | `hook` | `None` | Callable `-> bool` post-backup data-capture check; `False` re-runs the backup (truncated-dump race guard), retried up to 3 attempts. |
| `UPGRADE_EXTRA_ENV` | `dict_or_hook` | `None` | Extra `.env` keys applied after the PR-head checkout, before the chaos redeploy (env that exists only at head). |
| `EXTRA_ENV` | `dict_or_hook` | `{}` | Extra `.env` keys applied at EVERY deploy (base install AND upgrade old-app). Callable form derives values from the per-run domain. |
| `DEPS` | `list[str]` | `[]` | Dep recipes deployed/provisioned alongside (e.g. `["keycloak"]`); creds land in `$CCCI_DEPS_FILE`. |
| `WARM_CANONICAL` | `bool` | `False` | Enroll the recipe in the warm/canonical app system (docs/warm.md): green cold runs on LATEST advance the canonical snapshot. |
| `SCREENSHOT` | `hook` | `None` | Callable driving Playwright to a safe, credential-free post-login view for the results-card screenshot (default: landing page). |

<!-- META-TABLE-END -->

### 4.1 HTTP / health / timing (base 4 — seen by L1 AND L2)

| Key | Type / default | Meaning | Used by |
|---|---|---|---|
| `HEALTH_PATH` | str, `"/"` | Path probed for serving/health checks | deploy wait (`lifecycle.py`), generic `assert_serving` |
| `HEALTH_OK` | tuple, `(200, 301, 302)` | Acceptable HTTP status codes for health | same |
| `DEPLOY_TIMEOUT` | int s, `600` | Max wait for swarm convergence per deploy | `lifecycle.py`, generic ops |
| `HTTP_TIMEOUT` | int s, `300` | Max wait for HTTP health after converged | same |

Example: immich sets `DEPLOY_TIMEOUT = 1500`, `HTTP_TIMEOUT = 600` (ML containers are slow).

### 4.2 Upgrade tier (loader L1)

| Key | Type / default | Meaning |
|---|---|---|
| `UPGRADE_BASE_VERSION` | str (exact published tag), default `None` | **The "base pin"** — overrides the harness default base for the upgrade tier. Default base = `recipe_versions[-2]` (the previous published version); pin when that is not the PR's true predecessor (e.g. the PR is the first release on a new major, or the previous tag is known-broken). Must be an exact published tag — typos fail the base deploy. Consumed at `run_recipe_ci.py` (`prev = meta.get("UPGRADE_BASE_VERSION") or lifecycle.previous_version(recipe)`). Users: discourse, plausible. |
| `UPGRADE_EXTRA_ENV` | dict **or** callable `(domain) -> dict`, default `None` | Extra `.env` keys applied **after** the PR-head checkout, **before** the chaos redeploy (F2-14c) — for env vars that exist only at head (a new required setting introduced by the PR). Consumed in `generic.py:256`. User: mumble. |

### 4.3 Every-deploy shaping (loaders L3/L4 — NOT in the L1 allowlist)

| Key | Type / default | Meaning |
|---|---|---|
| `EXTRA_ENV` | dict **or** callable `(domain) -> dict`, default `{}` | Extra `.env` keys applied at **every** deploy (base install AND upgrade old-app). Callable form derives values from the per-run domain (e.g. cryptpad's `SANDBOX_DOMAIN`). Loaded by `lifecycle.py:_recipe_extra_env` (its own `exec()`). Users: cryptpad, discourse, ghost, matrix-synapse, mattermost-lts, mumble, plausible. |
| `CHAOS_BASE_DEPLOY` | bool, default `False` | Base deploy uses `--chaos` so it survives untracked files in the recipe checkout (required when `install_steps.sh` copies in a `compose.ccci.yml` overlay — §5.6; implicit coupling, see §8 R7). Loaded by `lifecycle.py:_recipe_meta_flag`. Users: discourse, ghost. |

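Both `EXTRA_ENV` and `UPGRADE_EXTRA_ENV` accept either a dict or a callable `(domain) -> dict`. The sketch below shows one way such a value could be resolved into plain `.env` pairs. `resolve_env` is a hypothetical helper, the string coercion is an assumption of this example, and the `sandbox.<domain>` value is invented for illustration (the doc only says cryptpad derives `SANDBOX_DOMAIN` from the per-run domain, not how).

```python
def resolve_env(value, domain):
    """Resolve a dict-or-callable setting into a plain dict of .env keys.

    Hypothetical helper; values are coerced to str as an assumption of this sketch.
    """
    if value is None:
        return {}
    resolved = value(domain) if callable(value) else value
    return {str(key): str(val) for key, val in resolved.items()}


# Callable form, as a recipe_meta.py might define it (derivation is illustrative).
def EXTRA_ENV(domain):
    return {"SANDBOX_DOMAIN": f"sandbox.{domain}"}


from_callable = resolve_env(EXTRA_ENV, "app.example.test")
from_dict = resolve_env({"LOG_LEVEL": "debug", "WORKERS": 2}, "app.example.test")
from_unset = resolve_env(None, "app.example.test")
```

The callable form exists because the per-run domain is not known when `recipe_meta.py` is written, so a static dict cannot express domain-derived values.
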
### 4.4 Skips and intentional N/A (loader L1)

| Key | Type / default | Meaning |
|---|---|---|
| `SKIP_GENERIC` | list of op names or `"all"`/`"*"`, default `[]` | Suppress the generic floor for the listed ops (overlay becomes override instead of additive). Two env equivalents at run time: `CCCI_SKIP_GENERIC=1` (all ops), `CCCI_SKIP_GENERIC_<OP>=1` (one op). Currently set by **no enrolled recipe** (env form is the one used, ad hoc). |
| `EXPECTED_NA` | dict `{rung: reason}`, default `None` | Declares an N/A rung **intentional** (e.g. `{"backup": "stateless, nothing to back up"}`). Undeclared N/A is reported as an *unintentional coverage gap*. Both cap the achievable level — declaring does not un-cap, it only changes the report wording (`results.py`). User: custom-html-tiny. |
| `BACKUP_CAPABLE` | bool, default auto-detect | Overrides the backup-tier capability detection (scan of recipe compose files for `backupbot.backup` labels, `generic.py:34`). `False` forces N/A; `True` forces the tier on. Users: custom-html-bkp-bad/rst-bad (harness self-test recipes). |

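The two environment forms of the skip switch can be read as a single predicate. The sketch below is a minimal illustration: `generic_skipped` is a hypothetical name, and upper-casing the op to build the `CCCI_SKIP_GENERIC_<OP>` variable name is an assumption of this example rather than confirmed orchestrator behavior.

```python
def generic_skipped(op, env):
    """True when the generic floor for `op` is suppressed via the env escape hatch.

    Hypothetical helper; assumes <OP> in the variable name is the upper-cased op.
    """
    if env.get("CCCI_SKIP_GENERIC") == "1":
        return True  # all ops
    return env.get(f"CCCI_SKIP_GENERIC_{op.upper()}") == "1"  # one op


skip_all = generic_skipped("backup", {"CCCI_SKIP_GENERIC": "1"})
skip_one = generic_skipped("backup", {"CCCI_SKIP_GENERIC_BACKUP": "1"})
skip_other = generic_skipped("restore", {"CCCI_SKIP_GENERIC_BACKUP": "1"})
skip_none = generic_skipped("backup", {})
```
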
### 4.5 Readiness & data-verification hooks (loader L1, callable values)

| Key | Type / default | Meaning |
|---|---|---|
| `READY_PROBE` | callable `(domain) -> [probe, ...]`, default `None` | Extra readiness probes run after install AND after upgrade, before that tier's assertions. Probe dicts: HTTP `{host, path, ok}` or TCP `{tcp_host, tcp_port, stable}` (`stable`: must stay connectable across 3 checks — for UDP-adjacent voice ports etc.). Consumed at `lifecycle.py:516`. Users: lasuite-drive, mumble (TCP voice port). |
| `BACKUP_VERIFY` | callable `(domain) -> bool`, default `None` | Post-backup data-capture check, retried — guards the truncated-dump race (backup snapshot taken before the seeded marker row hit disk). Return `False` → retry the backup, then fail. Users: discourse, ghost. |

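The `BACKUP_VERIFY` contract (a `False` result re-runs the backup, up to 3 attempts, then the run fails) amounts to a bounded retry loop. The sketch below assumes a hypothetical `backup_with_verify` wrapper and stand-in callables; it illustrates the control flow only and does not reproduce the orchestrator's code.

```python
def backup_with_verify(do_backup, verify, domain, attempts=3):
    """Run do_backup, repeating while verify(domain) is False; fail after `attempts`.

    Hypothetical wrapper: returns the attempt number that verified.
    """
    for attempt in range(1, attempts + 1):
        do_backup()
        if verify is None or verify(domain):
            return attempt
    raise RuntimeError(f"backup did not capture the seeded data after {attempts} attempts")


# Stand-ins: the first backup races the seeded marker row, the second captures it.
backups_taken = []
verdicts = iter([False, True])


def fake_backup():
    backups_taken.append("snapshot")


def BACKUP_VERIFY(domain):  # shape of the recipe_meta.py hook: (domain) -> bool
    return next(verdicts)


verified_on = backup_with_verify(fake_backup, BACKUP_VERIFY, "app.example.test")
```
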
### 4.6 Dependencies / SSO (loaders L5 + L1)

| Key | Type / default | Meaning |
|---|---|---|
| `DEPS` | list of recipe names, default `[]` | Dep recipes deployed alongside (e.g. `["keycloak"]`). Dep domain is `<dep[:4]>-<6hex>`, hashed from (parent, pr, ref, dep) — collision-free per run. Creds land in `$CCCI_DEPS_FILE` (JSON); tests use the `deps_apps` fixture; teardown deps LAST. Deploy-count guard becomes `1 + len(DEPS)`. Loaded by `deps.py:declared_deps`. Users: lasuite-docs/-drive/-meet. |
| `OIDC_AT_INSTALL` | bool, default `False` | Provision deps **before** the single base deploy so `install_steps.sh` can wire OIDC env into that one deploy (reads `$CCCI_DEPS_FILE`). Default (legacy) is post-deploy provisioning + a `setup_custom_tests.sh` redeploy. Consumed at `run_recipe_ci.py:514`. Users: lasuite-drive, lasuite-meet. |

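The dep domain shape `<dep[:4]>-<6hex>` can be illustrated with a deterministic hash over the four inputs. The hash function (SHA-256), the NUL separator and the name `dep_domain_label` are all assumptions of this sketch; the source only states the output shape and the inputs, not how `deps.py` computes the digest.

```python
import hashlib


def dep_domain_label(parent, pr, ref, dep):
    """Build a `<dep[:4]>-<6hex>` label from (parent, pr, ref, dep).

    Hypothetical helper: SHA-256 and the NUL separator are assumptions.
    """
    key = "\0".join([parent, str(pr), ref, dep])
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{dep[:4]}-{digest[:6]}"


label = dep_domain_label("lasuite-docs", 42, "abc1234", "keycloak")
label_again = dep_domain_label("lasuite-docs", 42, "abc1234", "keycloak")
```

Because the label is a pure function of its inputs, the same run always resolves the same dep domain, while a different PR or ref hashes to a different suffix.
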
### 4.7 Warm-canonical enrollment (loader L6)

| Key | Type / default | Meaning |
|---|---|---|
| `WARM_CANONICAL` | bool, default `False` | Enrolls the recipe in the warm/canonical app system (`docs/warm.md`): green COLD runs on LATEST advance the canonical snapshot; the nightly sweep iterates enrolled recipes. Loaded by `canonical.py:is_canonical_enrolled`. User: custom-html. |

### 4.8 Cosmetic (BROKEN — see §8 R2)

| Key | Type / default | Meaning |
|---|---|---|
| `SCREENSHOT` | callable `(page, domain, meta) -> None` | Drives Playwright to a safe post-login view for the results-card screenshot (default: landing page). **Currently unreachable from the CI path**: `screenshot.py:41` reads it from the meta dict the orchestrator passes (`run_recipe_ci.py:1056`), but the L1 allowlist never loads `SCREENSHOT`, so the hook is always `None`. No recipe sets it (consistent with it never having worked). |

## 5. Writing custom tests & hooks

### 5.1 Lifecycle overlay assertions — `test_<op>.py`

One pytest file per lifecycle op (`install` / `upgrade` / `backup` / `restore`). The
**orchestrator performs the op exactly once**; the overlay only *asserts* on the resulting state
(HC3 op/assertion split — overlays never deploy, never restore, never mutate). The generic floor
test runs additively against the same state.

Conventions (see `tests/immich/test_backup.py` etc.):

- use the `live_app` fixture (asserts `CCCI_APP_DOMAIN` is set, yields the domain)
- use the `meta` fixture for HEALTH_*/timeouts (note: only the 4 base keys — §8 R3)
- read op context from `$CCCI_OP_STATE_FILE` (JSON written by the orchestrator after the op:
  versions, artifact paths)
- execute in-container checks via `harness.lifecycle.exec_in_app(domain, service, cmd)`

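Reading the op context from `$CCCI_OP_STATE_FILE` is a plain JSON load. The sketch below shows the pattern with a stand-in state file; `read_op_state` is a hypothetical helper, and the field names `from_version` / `to_version` are invented for the demo (the source only says the file carries versions and artifact paths).

```python
import json
import os
import tempfile


def read_op_state(env=os.environ):
    """Load the JSON op context the orchestrator wrote for this op (hypothetical helper)."""
    with open(env["CCCI_OP_STATE_FILE"], encoding="utf-8") as handle:
        return json.load(handle)


# Demo with a stand-in state file; the field names are assumptions, not the real schema.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "op_state.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"from_version": "1.0.0", "to_version": "1.1.0"}, handle)
    state = read_op_state({"CCCI_OP_STATE_FILE": path})
```

An overlay test would only assert on values read this way; per the HC3 split it never performs the op itself.
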
### 5.2 Pre-op seed hooks — `ops.py`

`def pre_<op>(domain, meta)` callables, imported and called by the orchestrator **before**
performing the op. This is where data gets seeded so the post-op overlay can assert on it:

```python
# tests/immich/ops.py (pattern)
def pre_upgrade(domain, meta): _psql(domain, "INSERT ... 'upgrade-survives'")
def pre_backup(domain, meta):  _psql(domain, "INSERT ... 'original'")
def pre_restore(domain, meta): _psql(domain, "DROP TABLE ci_marker")  # damage, restore must undo
```

Seed → op → assert is the whole pattern: `pre_backup` writes a marker, the orchestrator backs up,
`pre_restore` destroys it, the orchestrator restores, `test_restore.py` asserts the marker is back.

### 5.3 Custom tier — `functional/`, `playwright/`, top-level `test_*.py`

All non-lifecycle `test_*.py` (discovery: `discovery.py:custom_tests`, recursive over the
top-level dir + `functional/` + `playwright/`; files named `test_<op>.py` excluded). Run in the
CUSTOM tier, after restore, against the post-upgrade (PR-head) app. ALL discovered files run —
cc-ci's and (if HC2-approved) repo-local's, additively.

Enrollment contract (`docs/enroll-recipe.md`): ≥2 NEW functional tests beyond ports of existing
upstream checks; ported tests carry `SOURCE:` comments. Playwright tests get the shared
browser/harness helpers (`harness.browser`); SSO recipes get `harness.sso`
(`setup_keycloak_realm` — idempotent, `oidc_password_grant` — provider-pluggable).

Tests gate on deps via `CCCI_DEPS_READY` (skip-with-reason when `0`; the skip is counted and
fails the run if deps were declared but unprovisionable — `run_recipe_ci.py:816`).

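The deps gate reduces to a small decision: when `CCCI_DEPS_READY` is `0`, skip and report why. The sketch below keeps that decision in a pure function so it can be checked without pytest; `deps_skip_reason` and the fallback message are hypothetical, and treating an unset variable as "ready" is an assumption of this example.

```python
def deps_skip_reason(env):
    """Return None when deps are ready, else the reason a deps-dependent test should skip.

    Hypothetical helper; an unset CCCI_DEPS_READY is treated as ready (assumption).
    """
    if env.get("CCCI_DEPS_READY") != "0":
        return None
    return env.get("CCCI_DEPS_NOT_READY_REASON") or "deps not ready"


ready = deps_skip_reason({"CCCI_DEPS_READY": "1"})
not_ready = deps_skip_reason(
    {"CCCI_DEPS_READY": "0", "CCCI_DEPS_NOT_READY_REASON": "keycloak failed to converge"}
)
not_ready_no_reason = deps_skip_reason({"CCCI_DEPS_READY": "0"})
```

Inside a test, a non-`None` result would typically be passed to `pytest.skip(reason)`, which is what makes the skip visible to the run-level accounting described above.
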
### 5.4 Pre-deploy shell hook — `install_steps.sh`

Runs after `abra app new` + `EXTRA_ENV` application + secret generation, **before** the base
deploy. For setup that must precede the first deploy: writing extra config files into the recipe
checkout, copying in a `compose.ccci.yml` overlay (§5.6), editing `.env` beyond simple key=val.

Env contract: `CCCI_APP_DOMAIN`, `CCCI_RECIPE`, `CCCI_APP_ENV` (path to the app's `.env`), and —
when `OIDC_AT_INSTALL` deps exist — `CCCI_DEPS_FILE`. Must locate the recipe checkout
ABRA_DIR-aware: `RECIPE_DIR="${ABRA_DIR:-${HOME}/.abra}/recipes/${CCCI_RECIPE}"` (per-run
`ABRA_DIR` since the concurrency restructure — a hardcoded `~/.abra` writes to the wrong tree).

Graceful-generic rule: a recipe needing a hook but not shipping one simply fails the generic
install — a correct reported outcome, not a harness error.

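For Python-side code that needs the same path, the shell expansion `${ABRA_DIR:-${HOME}/.abra}/recipes/${CCCI_RECIPE}` can be mirrored as below. `recipe_checkout` is a hypothetical helper written for this example; like `:-` in the shell, it falls back to `~/.abra` when `ABRA_DIR` is unset or empty.

```python
import os
from pathlib import Path


def recipe_checkout(env):
    """Mirror RECIPE_DIR="${ABRA_DIR:-${HOME}/.abra}/recipes/${CCCI_RECIPE}" (hypothetical helper)."""
    # `or` covers both unset and empty ABRA_DIR, matching the shell's `:-` operator.
    abra_dir = env.get("ABRA_DIR") or os.path.join(env["HOME"], ".abra")
    return Path(abra_dir) / "recipes" / env["CCCI_RECIPE"]


per_run = recipe_checkout(
    {"ABRA_DIR": "/run/ci-7/abra", "HOME": "/home/ci", "CCCI_RECIPE": "ghost"}
)
fallback = recipe_checkout({"HOME": "/home/ci", "CCCI_RECIPE": "ghost"})
```

The two results differ, which is the point of the warning above: a hook that hardcodes `~/.abra` would write into the fallback tree while the run deploys from the per-run one.
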
### 5.5 Deps credential wiring — `setup_custom_tests.sh`

For legacy (post-deploy) deps provisioning: runs after deps are up, reads `$CCCI_DEPS_FILE`
(jq-readable JSON of dep creds/URLs), wires OIDC config via `abra app config set` + secrets, and
redeploys. With `OIDC_AT_INSTALL = True` this hook is unnecessary (wiring happens in
`install_steps.sh` before the only deploy) — preferred for new enrollments (one deploy, no
deploy-count exception).

### 5.6 CI-only compose overlay — `compose.ccci.yml`

Not auto-discovered: `install_steps.sh` copies it into the recipe checkout, and the recipe must
set `CHAOS_BASE_DEPLOY = True` so the base deploy (`--chaos`) tolerates the untracked file.
Policy: minimal, justified fallback only (ghost's is a 15m `start_period` grace — a literal,
because abra validates `start_period` before env substitution). The overlay is cc-ci-owned even
though it rides in the recipe checkout.

### 5.7 Environment contract summary (what custom code can read)

| Var | Set for | Meaning |
|---|---|---|
| `CCCI_APP_DOMAIN` | all tests + hooks | the app's per-run domain |
| `CCCI_BASE_URL` | approved repo-local code | `https://<domain>` |
| `CCCI_RECIPE`, `CCCI_APP_ENV` | `install_steps.sh` | recipe name, app `.env` path |
| `CCCI_OP_STATE_FILE` | overlay tests | JSON op context (versions, artifacts) |
| `CCCI_DEPS_FILE` | deps hooks + tests | JSON dep creds dict |
| `CCCI_DEPS_READY` / `CCCI_DEPS_NOT_READY_REASON` | custom tier | gate SSO tests, skip-with-reason |

## 6. Run-model context (what the settings plug into)

One deploy chain per run (full detail: `docs/testing.md` §2):

```
deploy BASE (UPGRADE_BASE_VERSION or recipe_versions[-2]; EXTRA_ENV; install_steps.sh;
             CHAOS_BASE_DEPLOY?; OIDC_AT_INSTALL deps first?)
  → INSTALL tier   (READY_PROBE; generic + overlay asserts)
  → pre_upgrade → chaos-deploy PR HEAD (UPGRADE_EXTRA_ENV)
  → UPGRADE tier   (READY_PROBE; version-label == head_ref)
  → pre_backup → backup (BACKUP_CAPABLE; BACKUP_VERIFY)
  → BACKUP tier
  → pre_restore → restore
  → RESTORE tier
  → CUSTOM tier    (functional/ + playwright/; deps via CCCI_DEPS_*)
  → teardown (deps LAST)
```

Deploy-count guard (DG4.1): exactly `1 + len(DEPS)` deploys per run (chaos redeploys don't
count); the per-run counter file is keyed by run since the concurrency restructure.

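The deploy-count rule is simple arithmetic: one deploy for the app plus one per declared dep. A minimal sketch, assuming a hypothetical `check_deploy_count` helper that receives the number of non-chaos deploys already counted for the run:

```python
def check_deploy_count(observed, deps):
    """Enforce exactly 1 + len(deps) deploys (hypothetical helper).

    `observed` must already exclude chaos redeploys, which the guard does not count.
    """
    expected = 1 + len(deps)
    if observed != expected:
        raise AssertionError(f"deploy-count guard: expected {expected}, saw {observed}")
    return expected


no_deps = check_deploy_count(1, [])
with_keycloak = check_deploy_count(2, ["keycloak"])
```
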
## 7. Local iteration

```
RECIPE=<recipe> PR=<n> REF=<sha> SRC=recipe-maintainers/<recipe> \
  STAGES=install,upgrade,backup,restore,custom \
  cc-ci-run runner/run_recipe_ci.py
```

(`docs/enroll-recipe.md` §5 for the full loop, including dep teardown caveats.)

## 8. Known limitations & restructuring candidates

The review section. Ordered by how much they'd shape a restructure.

**R1 — Six divergent meta loaders (the core drift hazard).** §4's L1–L6: every loader re-`exec()`s
`recipe_meta.py` and cherry-picks its own keys. Adding a key means knowing *which* loader to touch
(or that you must extend the L1 allowlist — `SCREENSHOT` proves people don't, R2). Two conventions
coexist: L1's explicit allowlist vs L3–L6's ad-hoc `ns.get(...)` which silently bypasses it.
*Candidate:* one `harness.meta.load(recipe) -> RecipeMeta` with a declarative key registry
(name, type, default, validator, consumer) as the single source of truth; L1–L6 become lookups
into the one loaded object; the registry also generates §4 of this doc (kills doc drift, R5).

**R2 — `SCREENSHOT` is a dead knob.** Fully implemented consumer (`screenshot.py`), documented
hook contract, never reachable: the orchestrator's allowlist omits it, so the dict passed at
`run_recipe_ci.py:1056` can never contain it. Direct evidence of R1. *Candidate:* fix trivially by
adding to the allowlist — or delete the hook path if post-login screenshots aren't wanted; decide
during the restructure.

**R3 — The pytest `meta` fixture sees 4 keys.** `tests/conftest.py:_recipe_meta` loads only
HEALTH_*/timeouts. An overlay test wanting e.g. `EXPECTED_NA` or a recipe constant must re-exec
the file itself. Probably intended minimalism, but it's a third key-set to keep in sync.
*Folds into R1.*

**R4 — Settings split across three config languages** (§1): recipe_meta keys, file-presence
(`install_steps.sh` existing changes deploy behavior), and run-time env (`CCCI_SKIP_GENERIC*`).
A reviewer asking "what does this recipe customize?" must check all three. *Candidate:* keep the
three surfaces (they serve different actors) but make the run header log a single resolved
"customization manifest" per run: every non-default key + every discovered hook file + every
CCCI_* override, in one block.

**R5 — Reference-doc drift already happened.** `docs/testing.md` documents 6 meta keys,
`docs/enroll-recipe.md` shows others by example; neither is complete (18 keys exist). This doc is
now complete but handwritten — it will drift too. *Candidate:* generate the key table from the R1
registry (test asserts doc ⊆ registry).

**R6 — No schema validation / silent typos.** Unknown top-level names in `recipe_meta.py` are
ignored, which is load-bearing (recipes keep private constants there: mumble's
`WELCOME_TEXT_MARKER`, `MAX_USERS`). Consequence: misspelling `READY_PROBE` as `READINESS_PROBE`
silently disables the probe — the run goes green with less coverage, the worst failure mode for a
CI harness. *Candidate:* with the R1 registry, warn (not fail) on ALL-CAPS top-level names that
are not registered and not referenced by the recipe's own tests; or namespace private constants
(`_WELCOME_TEXT_MARKER`).

**R7 — `compose.ccci.yml` ⇄ `CHAOS_BASE_DEPLOY` implicit coupling.** The overlay only works if
the recipe *also* sets the flag; forgetting it fails the base deploy with an abra
untracked-files error far from the cause. *Candidate:* if `install_steps.sh` exists alongside a
`compose.ccci.yml`, the harness could auto-enable chaos for the base deploy (or at least assert
the flag and fail with a pointed message).

**R8 — `SKIP_GENERIC` (meta form) has zero users.** Only the env-var form is used, ad hoc. Either
the meta key earns its place (first real user) or it's surface to delete in the restructure.

**R9 — `recipe_meta.py` is code, not config.** Five keys take callables (`EXTRA_ENV`,
`UPGRADE_EXTRA_ENV`, `READY_PROBE`, `BACKUP_VERIFY`, `SCREENSHOT`), so the file must stay an
`exec()`d Python module — it can't be validated as data, serialized into results, or diffed
declaratively. This is a real expressiveness need (cryptpad derives `SANDBOX_DOMAIN` from the
per-run domain), not an accident. *Candidate if restructuring:* split data keys (TOML-able,
schema-validated) from a `hooks.py` (callables only) — but weigh against the cost of two files
per recipe; the R1 registry gets most of the value without the split.

## 9. File / symbol index

| Concern | Where |
|---|---|
| Orchestrator meta loader (L1, allowlist) | `runner/run_recipe_ci.py:250` `_load_meta` |
| Pytest meta fixture (L2) | `tests/conftest.py` `_recipe_meta` |
| `EXTRA_ENV` loader (L3) | `runner/harness/lifecycle.py:114` `_recipe_extra_env` |
| Boolean-flag loader (L4) | `runner/harness/lifecycle.py:132` `_recipe_meta_flag` |
| `DEPS` loader (L5) | `runner/harness/deps.py:37` `declared_deps` |
| `WARM_CANONICAL` loader (L6) | `runner/harness/canonical.py:36` `is_canonical_enrolled` |
| Overlay/custom/hook discovery + HC2 gate | `runner/harness/discovery.py` |
| HC2 allowlist | `tests/repo-local-approved.txt` |
| Generic assertions + `BACKUP_CAPABLE` detect | `runner/harness/generic.py` |
| `READY_PROBE` / `CHAOS_BASE_DEPLOY` consumption | `runner/harness/lifecycle.py:516` / `:283` |
| `EXPECTED_NA` reporting | `runner/harness/results.py` |
| Dead `SCREENSHOT` consumer | `runner/harness/screenshot.py:36`, called `run_recipe_ci.py:1056` |
| Skip-generic logic (meta + env) | `runner/run_recipe_ci.py:285` |
| Worked examples | `tests/ghost/` (overlay+chaos), `tests/mumble/` (TCP probe, UPGRADE_EXTRA_ENV), `tests/lasuite-drive/` (DEPS+OIDC_AT_INSTALL), `tests/immich/` (ops.py seed pattern) |