Recipe-customization restructure — full implementation plan

Operator-approved direction (chat 2026-06-10). Reference spec of the CURRENT system: cc-ci docs/recipe-customization.md (commit 76a4b6b). Read it FIRST; its §8 R1–R9 are the defects this plan fixes. Goals: one coherent customization system, one place/way to configure a recipe, easy custom tests, fix broken knobs, delete legacy paths, zero behavior regression for currently-working recipes.

Decisions locked (operator + orchestrator)

| Question | Decision |
| --- | --- |
| Loaders | ONE loader runner/harness/meta.py::load(recipe) -> RecipeMeta backed by a declarative KEY registry. All six current loaders (spec §4 L1–L6) migrate to it; exec() of recipe_meta.py happens in exactly one function. |
| Validation strictness | Unknown ALL-CAPS top-level name or type mismatch = hard error (run fails fast at load; unit test catches at PR time). Underscore-prefixed names (_FOO) are recipe-private and exempt. |
| SCREENSHOT | KEEP and FIX (it is currently dead; spec §8 R2). Reachable once the registry replaces the allowlist. |
| CHAOS_BASE_DEPLOY | DELETE. tests/<recipe>/compose.ccci.yml becomes first-class: harness copies it into the checkout + auto-uses --chaos for the base deploy. Both users (ghost, discourse) use the flag only for this. |
| OIDC_AT_INSTALL | DELETE. Install-time deps provisioning becomes the ONLY mode when DEPS is set; legacy post-deploy provisioning + setup_custom_tests.sh redeploy path removed. Migrate lasuite-docs. |
| SKIP_GENERIC (meta key) | DELETE (zero users). Env form CCCI_SKIP_GENERIC* stays as a documented LOCAL-DEV-ONLY escape hatch, loudly surfaced in the run manifest. |
| Hook signatures | All recipe callables take a single ctx (HookCtx): EXTRA_ENV(ctx), UPGRADE_EXTRA_ENV(ctx), READY_PROBE(ctx), BACKUP_VERIFY(ctx), SCREENSHOT(page, ctx), pre_<op>(ctx). All users are in-repo; migrate them, no compat shim. |
| Fixtures | Single tests/conftest.py. Final surface: recipe, meta (full RecipeMeta), live_app, op_state (NEW), deps (NEW; consolidates deps_apps+deps_creds). DELETE dead pre-deploy-once fixtures deployed/deployed_app (zero users) and app_domain if nothing else uses it (builder: grep). |
| Custom-test placement | test_<op>.py top-level = lifecycle overlay; ALL custom tests under functional/ or playwright/. Discovery of top-level non-lifecycle test_*.py in recipe dirs is REMOVED (zero users today). |
| Docs | Key reference table is GENERATED from the registry; a unit test asserts docs ⊆ registry sync. |
| recipe_meta stays Python | NO data/hooks file split (spec §8 R9 considered, rejected: registry validation gets the value at lower cost). One file per recipe remains the single config place. |
| Landing | one branch restructure/recipe-custom, one commit per phase, merged to main only after Adversary M1 PASS + lint green; real-CI regression sweep (M2) after merge. |

Target shape (end state)

tests/<recipe>/
├── recipe_meta.py        # THE config: registry-validated keys + ctx-hooks (+ _private consts)
├── test_<op>.py          # lifecycle overlay asserts (op_state/live_app/meta fixtures)
├── ops.py                # pre_<op>(ctx) seed hooks
├── functional/  playwright/   # ALL custom tests
├── install_steps.sh      # pre-deploy shell hook (the only shell hook)
├── compose.ccci.yml      # first-class CI overlay (auto-copied, auto-chaos)
└── PARITY.md

One loader, one hook convention, one fixture file, one shell hook, one generated reference.


P1 — harness/meta.py: single loader + key registry

  1. New runner/harness/meta.py:
    • KEYS: registry of every key — name, type ("int"|"str"|"tuple[int]"|"bool"|"dict_or_hook"| "hook"|"list[str]"|"dict"), default, doc (one line each), optional validate(value). Final key set (14): HEALTH_PATH, HEALTH_OK, DEPLOY_TIMEOUT, HTTP_TIMEOUT, BACKUP_CAPABLE, EXPECTED_NA, READY_PROBE, UPGRADE_BASE_VERSION, BACKUP_VERIFY, UPGRADE_EXTRA_ENV, EXTRA_ENV, DEPS, WARM_CANONICAL, SCREENSHOT. (CHAOS_BASE_DEPLOY / OIDC_AT_INSTALL / SKIP_GENERIC are deleted in P2 — during P1 keep them registered with deprecated=True so P1 lands green before P2 removes them.)
    • load(recipe) -> RecipeMeta (frozen dataclass, attribute access): the ONLY exec() of tests/<recipe>/recipe_meta.py. Missing file → all defaults. Validation per the locked decision: unknown non-underscore ALL-CAPS name → MetaError listing the unknown name and nearest registered key; type mismatch → MetaError. Callables accepted only for hook-typed keys.
  2. Migrate ALL readers (spec §4 L1–L6 + §9 index has exact locations):
    • run_recipe_ci.py::_load_meta → meta.load(); orchestrator loads ONCE and passes the object down (functions grow a meta param; stop re-exec'ing per call).
    • tests/conftest.py::_recipe_meta → meta.load() (fixture now returns full RecipeMeta).
    • lifecycle.py::_recipe_extra_env, lifecycle.py::_recipe_meta_flag → deleted; callers take meta.
    • deps.py::declared_deps → meta.DEPS; canonical.py::is_canonical_enrolled → meta.WARM_CANONICAL.
    • screenshot.py consumer unchanged — but now the dict/object it receives actually contains SCREENSHOT. This is the R2 fix; prove it with a unit test (meta with SCREENSHOT hook → _load_screenshot_hook returns it through the real orchestrator load path).
  3. Mumble private constants: WELCOME_TEXT_MARKER, MAX_USERS → _WELCOME_TEXT_MARKER, _MAX_USERS (fix their importers; builder: grep how mumble tests consume them).
  4. New unit tests tests/unit/test_meta.py:
    • every tests/*/recipe_meta.py in the repo loads clean through the registry (the typo gate);
    • unknown-key and wrong-type files (tmp fixtures) raise MetaError;
    • defaults match spec §2 baseline; underscore exemption; callable-on-data-key rejected.
  5. Doc generation: scripts/gen-meta-docs.py renders the registry to a markdown table between <!-- META-TABLE-START/END --> markers in docs/recipe-customization.md §4; unit test asserts the committed table == rendered output (drift fails CI).
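
A minimal sketch of the P1 validation rules, under stated assumptions: only three keys are registered, the defaults are invented, the result is a plain dict rather than the frozen RecipeMeta dataclass, and the loader takes source text instead of a recipe name so it can run standalone. The names Key, load_source and _check are hypothetical.

```python
import difflib
from dataclasses import dataclass
from typing import Any, Callable, Optional


class MetaError(Exception):
    """Raised when a recipe_meta.py fails registry validation."""


@dataclass(frozen=True)
class Key:
    type: str                       # "int", "str", "bool", "dict", "hook", "dict_or_hook"
    default: Any
    doc: str
    validate: Optional[Callable[[Any], None]] = None


# Three keys for illustration; defaults here are invented, not the spec baseline.
KEYS = {
    "DEPLOY_TIMEOUT": Key("int", 600, "seconds to wait for the deploy"),
    "EXTRA_ENV": Key("dict_or_hook", None, "extra env for the base deploy"),
    "SCREENSHOT": Key("hook", None, "custom screenshot hook"),
}

_DATA_TYPES = {"int": int, "str": str, "bool": bool, "dict": dict}


def _check(name: str, value: Any) -> None:
    kind = KEYS[name].type
    if callable(value):
        if kind not in ("hook", "dict_or_hook"):
            raise MetaError(f"{name}: callable not allowed for a {kind} key")
        return
    if kind == "hook":
        raise MetaError(f"{name}: expected a callable")
    expected = dict if kind == "dict_or_hook" else _DATA_TYPES[kind]
    # bool is a subclass of int, so reject it explicitly for int keys.
    if not isinstance(value, expected) or (kind == "int" and isinstance(value, bool)):
        raise MetaError(f"{name}: expected {kind}, got {type(value).__name__}")


def load_source(source: str) -> dict:
    """Exec the recipe_meta source once; return validated values over defaults."""
    namespace: dict = {}
    exec(source, namespace)         # the single exec() in this sketch
    resolved = {name: key.default for name, key in KEYS.items()}
    for name, value in namespace.items():
        if name.startswith("_") or not name.isupper():
            continue                # private names and non-constants are exempt
        if name not in KEYS:
            near = difflib.get_close_matches(name, KEYS, n=1)
            hint = f" (did you mean {near[0]}?)" if near else ""
            raise MetaError(f"unknown key {name}{hint}")
        _check(name, value)
        if KEYS[name].validate:
            KEYS[name].validate(value)
        resolved[name] = value
    return resolved
```

With this shape, a typo such as DEPLOY_TIMEOUTT fails at load with a pointer to DEPLOY_TIMEOUT, which is the "typo gate" the all-recipes unit test relies on.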

P2 — Delete legacy keys & paths

a. compose.ccci.yml first-class. In the deploy path (after install_steps.sh, before base deploy): if tests/<recipe>/compose.ccci.yml exists, copy it into the recipe checkout (ABRA_DIR-aware) and use --chaos for the base deploy. Remove the copy boilerplate from ghost/discourse install_steps.sh (delete the hook file entirely if copying was all it did; builder: read both) and CHAOS_BASE_DEPLOY = True from both metas. Delete _recipe_meta_flag remnants. Keep the policy note (overlays = minimal justified fallback) in docs.

b. Install-time deps only. Make the OIDC_AT_INSTALL=True code path the unconditional behavior for recipes with DEPS; delete the legacy post-deploy provisioning branch, the setup_custom_tests.sh invocation machinery, and the deploy-count exception for the legacy redeploy. Migrate lasuite-docs to install-time wiring (mirror what lasuite-drive/meet do in install_steps.sh reading $CCCI_DEPS_FILE). NOTE: lasuite-drive ALSO ships setup_custom_tests.sh despite OIDC_AT_INSTALL=True; builder must read both scripts and determine what they still do (realm/user provisioning is harness-side via harness.sso; env wiring belongs in install_steps.sh). Whatever remains necessary moves into install_steps.sh; then delete both setup_custom_tests.sh files and the key.

c. SKIP_GENERIC meta key deleted; env CCCI_SKIP_GENERIC* documented dev-only; if set in a CI (drone) run, print a loud !! warning + record in the P5 manifest.

d. Conftest cleanup: delete deployed, deployed_app (dead, zero users; verified), and app_domain if unused after their removal. Consolidate deps_apps + deps_creds into one deps fixture (entries expose .domain plus full creds; dict-style access fine). Migrate the 6 lasuite test files that use the old pair. Keep requires_deps marker + skip-report plumbing unchanged (F2-11 gate; do NOT weaken).
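
The overlay step in (a) could be sketched roughly as follows. The helper name stage_ci_overlay and the directory layout in the demo are assumptions; resolving the real checkout path (ABRA_DIR-aware) is left to the caller.

```python
import shutil
import tempfile
from pathlib import Path

OVERLAY_NAME = "compose.ccci.yml"


def stage_ci_overlay(recipe_tests_dir: Path, checkout_dir: Path) -> list:
    """Copy the CI overlay into the checkout and return extra base-deploy flags.

    Returns ["--chaos"] when an overlay was staged, otherwise [].
    """
    overlay = recipe_tests_dir / OVERLAY_NAME
    if not overlay.is_file():
        return []
    shutil.copy2(overlay, checkout_dir / OVERLAY_NAME)
    return ["--chaos"]


# Demo on throwaway directories: one recipe with an overlay, one without.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    with_overlay = root / "tests" / "ghost"
    without_overlay = root / "tests" / "plain"
    checkout = root / "checkout"
    for d in (with_overlay, without_overlay, checkout):
        d.mkdir(parents=True)
    (with_overlay / OVERLAY_NAME).write_text("services: {}\n")

    flags_with = stage_ci_overlay(with_overlay, checkout)
    copied = (checkout / OVERLAY_NAME).is_file()
    flags_without = stage_ci_overlay(without_overlay, checkout)
```

Returning the flags from the same function that performs the copy keeps "overlay present" and "chaos deploy" from drifting apart, which is the coupling CHAOS_BASE_DEPLOY used to express by hand.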

P3 — Uniform ctx hook convention

  1. harness/meta.py::HookCtx (frozen dataclass): .domain, .base_url, .meta (RecipeMeta), .deps (creds dict or None), .op (current lifecycle op or None). One constructor helper in the orchestrator; harness builds it at each hook call site.
  2. Convert call sites: EXTRA_ENV(ctx), UPGRADE_EXTRA_ENV(ctx), READY_PROBE(ctx), BACKUP_VERIFY(ctx), SCREENSHOT(page, ctx), ops.py pre_<op>(ctx). Dict-valued EXTRA_ENV/UPGRADE_EXTRA_ENV (non-callable) still allowed — only the callable signature changes.
  3. Migrate every in-repo user (spec §4 lists them per key; ops.py exists in ghost, discourse, immich, lasuite-*, mumble, others — builder: glob). Mechanical change; assertions and seeded values must remain byte-identical.
  4. Unit tests: hook invocation passes a ctx with correct fields; a legacy-signature callable (lambda domain: ...) raises a CLEAR MetaError naming the migration (no silent TypeError mid-run).
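
One possible shape for HookCtx and a guarded call site. The legacy-signature check here keys off the parameter being literally named ctx, which is an assumption of this sketch and not something the plan specifies; call_hook is likewise a hypothetical helper.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Optional


class MetaError(Exception):
    """Raised for invalid recipe metadata, including legacy hook signatures."""


@dataclass(frozen=True)
class HookCtx:
    domain: str
    base_url: str
    meta: Any = None                 # RecipeMeta in the real harness
    deps: Optional[dict] = None      # creds dict or None
    op: Optional[str] = None         # current lifecycle op or None


def call_hook(name: str, hook: Callable, ctx: HookCtx, *leading: Any) -> Any:
    """Invoke hook(*leading, ctx); fail clearly if the signature is not ctx-style."""
    params = list(inspect.signature(hook).parameters)
    if len(params) != len(leading) + 1 or params[-1] != "ctx":
        want = ", ".join(["<arg>"] * len(leading) + ["ctx"])
        raise MetaError(
            f"{name}: legacy hook signature ({', '.join(params)}); "
            f"migrate to {name}({want}) taking a HookCtx"
        )
    return hook(*leading, ctx)


ctx = HookCtx(domain="app.example.test", base_url="https://app.example.test", op="upgrade")
env = call_hook("EXTRA_ENV", lambda ctx: {"DOMAIN": ctx.domain}, ctx)
shot = call_hook("SCREENSHOT", lambda page, ctx: (page, ctx.op), ctx, "PAGE")
```

Checking the signature before the call turns what would be a TypeError deep in a run into a MetaError that names the hook and the expected form.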

P4 — Custom-test ergonomics

  1. discovery.py::custom_tests: drop the top-level test_*.py glob for recipe dirs (keep functional/ + playwright/); lifecycle-name exclusion logic stays for safety. (Zero current users of top-level custom tests — verified; the change is doc'd as a placement RULE.)
  2. New fixtures in tests/conftest.py:
    • op_state — parse $CCCI_OP_STATE_FILE (skip with clear reason if unset/absent);
    • deps — from P2d. Migrate overlay tests that hand-parse CCCI_OP_STATE_FILE / env (builder: grep CCCI_OP_STATE_FILE|os.environ under tests/*/test_*.py + functional/) to fixtures. Tests should not read os.environ directly except via fixtures after this phase (grep-clean, excluding conftest itself).
  3. harness import surface: ensure from harness import lifecycle, sso, browser is the documented toolbox; no new module needed — docs job (P6).
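
The parsing half of the op_state fixture might look like the sketch below. It assumes the state file is JSON, which this plan does not state, and read_op_state is a hypothetical helper name.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional, Tuple


def read_op_state(
    environ: Mapping[str, str] = os.environ,
) -> Tuple[Optional[dict], Optional[str]]:
    """Return (state, None) on success, or (None, skip_reason) when unavailable."""
    path = environ.get("CCCI_OP_STATE_FILE")
    if not path:
        return None, "CCCI_OP_STATE_FILE is not set"
    state_file = Path(path)
    if not state_file.is_file():
        return None, f"CCCI_OP_STATE_FILE points at a missing file: {path}"
    # Assumption: the harness writes the op state as a JSON object.
    return json.loads(state_file.read_text()), None


# Demo: one readable file, one unset variable, one dangling path.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "op_state.json"
    good.write_text(json.dumps({"op": "backup", "marker": "abc123"}))
    state_ok, reason_ok = read_op_state({"CCCI_OP_STATE_FILE": str(good)})
    state_unset, reason_unset = read_op_state({})
    state_gone, reason_gone = read_op_state({"CCCI_OP_STATE_FILE": str(Path(tmp) / "nope.json")})
```

A fixture in tests/conftest.py would then call pytest.skip(reason) when a reason comes back and return the dict otherwise, so overlay tests never touch os.environ themselves.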

P5 — Customization manifest

At run start (after meta load + discovery), print ONE block and embed the same dict in results.json under "customization":

===== customization manifest: <recipe> =====
meta (non-default): DEPLOY_TIMEOUT=1500 HTTP_TIMEOUT=600 DEPS=['keycloak'] ...
hooks: ops.py[pre_upgrade,pre_backup,pre_restore](cc-ci) install_steps.sh(cc-ci) compose.ccci.yml(cc-ci)
overlays: test_backup.py(cc-ci) test_restore.py(repo-local)
custom tests: functional/=5 playwright/=2 (cc-ci) functional/=1 (repo-local)
env overrides: CCCI_SKIP_GENERIC_BACKUP=1   !! dev-only override active in CI

Pure presentation + one results.json key — MUST NOT influence any verdict. Unit test: manifest for a synthetic recipe dir is complete + deterministic; results schema test updated.
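
A minimal sketch of the manifest as a deterministic dict plus a renderer, covering only the meta and env-override rows and data-valued keys (hooks, overlays and custom-test counts are omitted). The function names and the sample defaults are assumptions.

```python
import json


def build_manifest(recipe: str, meta: dict, defaults: dict, env: dict) -> dict:
    """Collect non-default meta values and dev-only env overrides, sorted by key."""
    non_default = {k: meta[k] for k in sorted(meta) if meta[k] != defaults.get(k)}
    overrides = {k: env[k] for k in sorted(env) if k.startswith("CCCI_SKIP_GENERIC")}
    return {"recipe": recipe, "meta_non_default": non_default, "env_overrides": overrides}


def render_manifest(manifest: dict, in_ci: bool) -> str:
    """Render the printed block; purely presentational."""
    lines = [f"===== customization manifest: {manifest['recipe']} ====="]
    meta = " ".join(f"{k}={v!r}" for k, v in manifest["meta_non_default"].items())
    lines.append(f"meta (non-default): {meta}")
    env = " ".join(f"{k}={v}" for k, v in manifest["env_overrides"].items())
    if env:
        warn = "   !! dev-only override active in CI" if in_ci else ""
        lines.append(f"env overrides: {env}{warn}")
    return "\n".join(lines)


manifest = build_manifest(
    "demo",
    meta={"HTTP_TIMEOUT": 600, "DEPLOY_TIMEOUT": 1500, "HEALTH_PATH": "/"},
    defaults={"HEALTH_PATH": "/", "DEPLOY_TIMEOUT": 600, "HTTP_TIMEOUT": 60},
    env={"CCCI_SKIP_GENERIC_BACKUP": "1", "HOME": "/root"},
)
rendered = render_manifest(manifest, in_ci=True)
results_fragment = json.dumps({"customization": manifest}, sort_keys=True)
```

Sorting keys at build time is what makes the block reproducible across runs; nothing in either function feeds back into a verdict.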

P6 — Docs

  • docs/recipe-customization.md: §4 table generated (P1.5); §8 rewritten — R1/R2/R3/R6/R7/R8 resolved (say how), R4 mitigated by manifest, R9 rejected-by-decision; §3/§5 updated to the end-state shape (no setup_custom_tests.sh, placement rule, ctx hooks, fixtures incl. op_state/deps).
  • docs/testing.md + docs/enroll-recipe.md: remove their partial key lists (point at the generated table), update hook signatures, fixture names, lasuite-docs worked example, local-run instructions.
  • Keep all three docs' scope split: concepts (testing.md) / how-to (enroll-recipe.md) / reference+structure (recipe-customization.md).
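
The generated §4 table and its drift check (P1.5) could be sketched along these lines. The exact marker strings, the column set and the tuple layout of the registry entries are assumptions of this sketch.

```python
# Assumed marker spelling; the plan only names them as META-TABLE-START/END.
START = "<!-- META-TABLE-START -->"
END = "<!-- META-TABLE-END -->"


def render_table(keys: dict) -> str:
    """Render {name: (type, default, doc)} as a markdown table, sorted by name."""
    rows = ["| Key | Type | Default | Description |", "| --- | --- | --- | --- |"]
    for name in sorted(keys):
        type_, default, doc = keys[name]
        rows.append(f"| {name} | {type_} | {default!r} | {doc} |")
    return "\n".join(rows)


def splice(doc: str, table: str) -> str:
    """Replace whatever sits between the markers with the rendered table."""
    head, rest = doc.split(START, 1)
    _, tail = rest.split(END, 1)
    return f"{head}{START}\n{table}\n{END}{tail}"


def is_in_sync(doc: str, keys: dict) -> bool:
    """Drift check: the committed doc must equal its own regenerated form."""
    return splice(doc, render_table(keys)) == doc


sample_keys = {
    "HTTP_TIMEOUT": ("int", 60, "per-request timeout in seconds"),
    "DEPS": ("list[str]", [], "dependency apps to provision"),
}
stale_doc = f"intro\n{START}\nold table\n{END}\noutro\n"
fresh_doc = splice(stale_doc, render_table(sample_keys))
```

The unit test then reduces to asserting is_in_sync on the committed file, and the generator script is just splice plus a write.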

Test suite additions (run inside pytest tests/unit -q — these are cheap/pure)

  • tests/unit/test_meta.py (P1.4) — registry, validation, all-recipes-load-clean, R2 proof.
  • ctx-hook tests (P3.4).
  • discovery placement tests (P4.1) — top-level custom no longer discovered; functional/playwright are.
  • manifest tests (P5).
  • doc-sync test (P1.5).
  • KEEP every existing unit test green; where a unit test pins old behavior being deleted (allowlist loaders, setup_custom_tests, deployed fixtures, top-level discovery), update it to pin the NEW behavior — never delete a test without a replacement that covers the successor path.
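
For the discovery placement test, a sketch of the post-P4.1 rule on a synthetic recipe dir. It assumes a non-recursive test_*.py glob inside functional/ and playwright/; the real discovery.py::custom_tests may differ in signature and depth.

```python
import tempfile
from pathlib import Path

CUSTOM_DIRS = ("functional", "playwright")


def custom_tests(recipe_dir: Path) -> list:
    """Discover custom tests only under functional/ and playwright/."""
    found = []
    for sub in CUSTOM_DIRS:
        sub_dir = recipe_dir / sub
        if sub_dir.is_dir():
            found.extend(sorted(sub_dir.glob("test_*.py")))
    return found


# Synthetic recipe dir: a lifecycle overlay, a stray top-level test, two custom tests.
with tempfile.TemporaryDirectory() as tmp:
    recipe_dir = Path(tmp) / "demo"
    (recipe_dir / "functional").mkdir(parents=True)
    (recipe_dir / "playwright").mkdir()
    for rel in (
        "test_backup.py",             # lifecycle overlay, never a custom test
        "test_extra.py",              # top-level custom test, no longer discovered
        "functional/test_login.py",
        "playwright/test_ui.py",
    ):
        (recipe_dir / rel).write_text("def test_ok():\n    assert True\n")

    discovered = [p.relative_to(recipe_dir).as_posix() for p in custom_tests(recipe_dir)]
```

The placement test pins both halves of the rule: the two subdirectory files are found, and neither top-level file is.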

Roles, gates & Definition of Done (loop protocol — plan.md §6.1 applies)

Builder/Adversary loops, phase-namespaced state files: STATUS-rcust.md, REVIEW-rcust.md, BACKLOG-rcust.md, JOURNAL-rcust.md.

Builder (fable): implement P1→P6 on branch restructure/recipe-custom in YOUR clone; one commit per phase; before each commit ALL green: pytest tests/unit -q, scripts/lint.sh (tests/concurrency NOT required — untouched by this plan; run it once before M1 to prove that). Push the branch (NOT main — merge gated below). Claim gates via claim(rcust): ... commits + STATUS-rcust.md.

Adversary (opus): cold-verify from your own clone. For M1: check out the branch, run pytest tests/unit -q + pytest tests/concurrency -q + lint yourself, then adversarial review of the FULL diff. Hunt specifically:

  • coverage loss — the cardinal risk of this restructure. For EVERY migrated recipe, diff the resolved customization (old loaders' effective values vs new meta.load()) — write a throwaway script that computes both for all 21 recipe dirs and diffs; any delta is a finding.
  • assertion weakening in tests/<recipe>/ diffs: migrations must be mechanical (signatures, fixture names, underscore renames); any changed assert/expected-value = VETO.
  • deleted-code fallout: grep for dangling refs to _recipe_meta, _load_meta, _recipe_extra_env, _recipe_meta_flag, declared_deps, is_canonical_enrolled, OIDC_AT_INSTALL, CHAOS_BASE_DEPLOY, SKIP_GENERIC, setup_custom_tests, deps_apps, deps_creds, deployed_app.
  • validation gaps: craft a recipe_meta with a typo'd key / wrong type / callable-on-data-key — must MetaError, not silently pass.
  • R2 actually fixed end-to-end (orchestrator path delivers SCREENSHOT to screenshot.py).
  • HC2 gate integrity: repo-local default-deny unchanged; requires_deps skip-report (F2-11) unchanged; generic floor semantics unchanged.

Findings to REVIEW-rcust.md; Adversary owns VETO.
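
The throwaway diff for the coverage-loss check could take roughly this form, given two already-computed mappings of recipe to resolved values. Comparing hooks by function name is an assumption of this sketch (a same-named hook with a changed body would not show up), and the sample data is invented.

```python
_MISSING = "<missing>"


def _comparable(value):
    """Hooks compare by name, data values compare as-is."""
    if callable(value):
        return f"<hook {getattr(value, '__name__', 'callable')}>"
    return value


def diff_resolved(old: dict, new: dict) -> list:
    """Return (key, old_value, new_value) for every key whose resolved value differs."""
    deltas = []
    for key in sorted(set(old) | set(new)):
        a = _comparable(old.get(key, _MISSING))
        b = _comparable(new.get(key, _MISSING))
        if a != b:
            deltas.append((key, a, b))
    return deltas


def sweep(old_by_recipe: dict, new_by_recipe: dict) -> dict:
    """Diff every recipe; only recipes with at least one delta appear in the result."""
    findings = {}
    for recipe in sorted(set(old_by_recipe) | set(new_by_recipe)):
        deltas = diff_resolved(old_by_recipe.get(recipe, {}), new_by_recipe.get(recipe, {}))
        if deltas:
            findings[recipe] = deltas
    return findings


def extra_env(ctx):
    return {}


old = {"a": {"DEPLOY_TIMEOUT": 900, "EXTRA_ENV": extra_env}, "b": {"HTTP_TIMEOUT": 600}}
new = {"a": {"DEPLOY_TIMEOUT": 900, "EXTRA_ENV": extra_env}, "b": {"HTTP_TIMEOUT": 60}}
findings = sweep(old, new)
```

One delta is expected by design: recipes that define SCREENSHOT should show it as missing on the old side, since that key is dead today (R2). Anything else in the output is a finding.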

Gates:

  • M1: implementation verified. Branch complete (P1–P6 + tests), unit+concurrency+lint green on the Adversary's cold clone, resolved-customization diff clean for all 21 recipes, adversarial diff review PASS in REVIEW-rcust.md.
  • M2: merged + REAL-CI REGRESSION SWEEP (operator-required). After M1 PASS only, Builder merges to main (merge commit, never force) and confirms the push build green. Then EVERY enrolled recipe is re-run through the real harness and must match its pre-change baseline:
    1. Baseline matrix first (build it BEFORE merging, during M1): for each of the 21 recipe dirs record expected outcome from the most recent known-good evidence (dashboard/results history; the bad-canary recipes custom-html-bkp-bad/rst-bad are EXPECTED to fail at their designed tier — record the tier). Commit the matrix to STATUS-rcust.md.
    2. Canary suite on the merged main: cc-ci-run python -m pytest tests/regression/ -m canary -v — all seven canaries (green canaries pass, RED canaries still caught at the intended tier; this is the false-green detector).
    3. Per-recipe full runs on the merged main, every enrolled recipe: RECIPE=<r> PR=<n> REF=<sha> SRC=recipe-maintainers/<r> STAGES=install,upgrade,backup,restore,custom cc-ci-run runner/run_recipe_ci.py using each mirror's current default-branch head as REF (or the open PR head where one exists: immich#2, plausible#3; runs may also be exercised via !testme through drone for at least TWO recipes so the drone→harness path is covered too). Max 2–3 concurrent (respect runner capacity); teardown verified after each (zero leaked apps; janitor sweep clean).
    4. Verdict + evidence: per-recipe result level == baseline matrix. ALSO spot-grep run logs to prove customizations actually executed (no silent loss): mumble READY_PROBE tcp lines, cryptpad SANDBOX_DOMAIN in env, ghost/discourse BACKUP_VERIFY lines + overlay copy + chaos base deploy, lasuite-* deps provisioning + OIDC tests ran (skip-count 0), immich ops.py seeds, manifest block present in every log, screenshot.png present where capture succeeded. Evidence (run ids, log excerpts) in STATUS-rcust.md; Adversary independently re-checks a sample of ≥5 recipes' logs + ALL mismatches. Any regression vs baseline → fix-forward on main only for trivial breakage with Adversary approval; otherwise revert the merge (rollback below) and return to M1.

## DONE in STATUS-rcust.md only when M1 and M2 both show a fresh Adversary PASS in REVIEW-rcust.md.

Guardrails (builder + adversary MUST honor)

  • This plan DOES touch tests/<recipe>/ — but ONLY mechanically (signatures, fixture/key renames, underscore prefixes, install_steps consolidation). NEVER change an assertion, expected value, seeded marker, or test semantics. NEVER weaken recipe-test gates, the generic floor, HC2, or the F2-11 skip-report.
  • cc-ci main is touched ONLY by the M2 merge after M1 PASS (and gated fix-forwards in M2). Never force-push. NEVER merge or push recipe-mirror repos — !testme comments only.
  • No secrets in commits; reference .testenv / /run/secrets locations only.
  • Teardown all dev/test deploys on every exit path; M2 sweep must leave zero apps behind.
  • Do NOT touch concurrency machinery (lifetime.py, app locks, janitor, ABRA_DIR) beyond pass-through meta params.
  • Match repo commit style (feat(harness)/fix(...)/test(...)/docs:).
  • Stuck >2 cycles on the same defect → write it to BACKLOG-rcust.md + JOURNAL-rcust.md and ping the orchestrator via INBOX rather than thrashing.

Rollback

Single revert of the merge commit restores the six-loader world; recipe_meta.py edits ride in the same merge so the revert is self-consistent. No persistent state involved (config is all in-repo); re-running the M2 sweep after a revert re-validates the old world.