feat(harness): P3 — uniform ctx hook convention (rcust)

harness.meta.HookCtx (frozen): .domain, .base_url, .meta (RecipeMeta), .deps (provisioned dep creds from $CCCI_DEPS_FILE or None), .op (current lifecycle op or None); built via meta.hook_ctx() at each hook call site.

All recipe callables now take ctx: EXTRA_ENV(ctx), UPGRADE_EXTRA_ENV(ctx), READY_PROBE(ctx), BACKUP_VERIFY(ctx), SCREENSHOT(page, ctx), ops.py pre_<op>(ctx). Dict-valued EXTRA_ENV/UPGRADE_EXTRA_ENV unchanged (only the callable signature moved).

Call sites converted: deploy_app env shaping, perform_upgrade, wait_ready_probes (gains op=), _perform_op BACKUP_VERIFY, screenshot.capture, _run_pre_hook.

Legacy signatures fail FAST with a clear migration message: the registry carries hook_params per hook key, enforced at meta.load() (MetaError names the old vs new signature); ops.py pre-op hooks get the same check at the orchestrator call site (meta.check_hook_signature), so there is no silent TypeError mid-run.

Migrated every in-repo user mechanically (17 ops.py files; cryptpad/lasuite-*/mailu EXTRA_ENV; mumble+lasuite-drive READY_PROBE; ghost/discourse BACKUP_VERIFY). Seeded values, probes and assertions are byte-identical (domain -> ctx.domain; keycloak pre_restore's meta arg -> ctx.meta).

Unit tests: hook_ctx field contract, ctx.deps from the run deps file, legacy-signature MetaError (READY_PROBE/EXTRA_ENV/SCREENSHOT + pre-op checker), ctx signatures accepted.

Docs table regenerated (signature docs in key docs).

Verified on cc-ci: cc-ci-run -m pytest tests/unit -q -> 180 passed; scripts/lint.sh -> PASS.
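For readers without the harness source at hand, the convention described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code in this commit: the field names come from the message, but the field types, the `MetaError` stand-in, the parameter list of `check_hook_signature`, and the example domain are all assumptions.

```python
import inspect
from dataclasses import dataclass, FrozenInstanceError
from typing import Any, Callable, Optional


class MetaError(Exception):
    """Stand-in for the harness's MetaError (assumed to be a plain Exception)."""


@dataclass(frozen=True)
class HookCtx:
    """Hypothetical shape of the frozen context passed to every hook."""
    domain: str
    base_url: str
    meta: Any = None              # RecipeMeta in the real harness
    deps: Optional[dict] = None   # provisioned dep creds, or None
    op: Optional[str] = None      # current lifecycle op, or None


def check_hook_signature(name: str, fn: Callable, expected: tuple) -> None:
    """Fail fast if the hook's parameter names differ from `expected`.

    The (name, fn, expected) parameter list is an assumption for this sketch.
    """
    actual = tuple(inspect.signature(fn).parameters)
    if actual != expected:
        raise MetaError(
            f"{name}({', '.join(actual)}) uses a legacy signature; "
            f"migrate to {name}({', '.join(expected)})"
        )


def EXTRA_ENV(ctx):
    # New-style hook: reads the domain from the context object.
    return {"MINIO_DOMAIN": f"minio-{ctx.domain}"}


def legacy_ready_probe(domain):
    # Old-style hook: takes the bare domain and should be rejected.
    return []


ctx = HookCtx(domain="app.example.test", base_url="https://app.example.test", op="install")
check_hook_signature("EXTRA_ENV", EXTRA_ENV, ("ctx",))  # accepted silently
print(EXTRA_ENV(ctx))

try:
    check_hook_signature("READY_PROBE", legacy_ready_probe, ("ctx",))
except MetaError as exc:
    print(exc)
```

Checking parameter names at load time, rather than letting the call fail, is what turns a mid-run TypeError into an up-front migration message; freezing the dataclass keeps a hook from mutating state that later hooks would see.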
2026-06-10 17:10:26 +00:00
parent 8cd72fd78d
commit fd02d9f4b8
34 changed files with 330 additions and 171 deletions
--- a/tests/lasuite-drive/ops.py
+++ b/tests/lasuite-drive/ops.py
@@ -13,14 +13,14 @@ sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner")
 from harness import lifecycle  # noqa: E402


-def pre_install(domain, meta):
+def pre_install(ctx):
    """Post-deploy seed for the custom tier (the former setup_custom_tests.sh, moved here in rcust
    P2b — install_steps.sh runs PRE-deploy and cannot touch the live stack). The deploy alone does
    NOT create the MinIO bucket: `minio-createbuckets` is a `replicas:0` one-shot (restart_policy:
    none) that must be triggered. The MinIO storage test asserts the bucket exists, so trigger it
    here and poll. `--detach` is REQUIRED: the job creates the bucket then EXITS 0, so it never
    holds a steady 1/1 replica — a blocking scale would wait forever."""
-    stack = domain.replace(".", "_")
+    stack = ctx.domain.replace(".", "_")
    print("  pre_install: creating MinIO bucket via the minio-createbuckets one-shot", flush=True)
    subprocess.run(
        ["docker", "service", "scale", "--detach", f"{stack}_minio-createbuckets=1"],
@@ -91,21 +91,21 @@ def _seed(domain, value):
    assert _psql(domain, "SELECT v FROM ci_marker;") == value


-def pre_upgrade(domain, meta):
+def pre_upgrade(ctx):
    # Gate the chaos redeploy on a fully-ready collabora (else it kills a still-booting coolwsd and
    # abra aborts the upgrade deploy — Q3.2a run 1). Then seed the data-integrity marker.
-    _wait_collabora_ready(domain)
-    _seed(domain, "upgrade-survives")
+    _wait_collabora_ready(ctx.domain)
+    _seed(ctx.domain, "upgrade-survives")


-def pre_backup(domain, meta):
-    _seed(domain, "original")
+def pre_backup(ctx):
+    _seed(ctx.domain, "original")


-def pre_restore(domain, meta):
+def pre_restore(ctx):
    # drop the marker table (diverge from the backup) so a successful restore is observable
-    _psql(domain, "DROP TABLE ci_marker;")
-    assert _psql(domain, "SELECT to_regclass('public.ci_marker');") in (
+    _psql(ctx.domain, "DROP TABLE ci_marker;")
+    assert _psql(ctx.domain, "SELECT to_regclass('public.ci_marker');") in (
        "",
        "NULL",
    ), "drop did not take"
--- a/tests/lasuite-drive/recipe_meta.py
+++ b/tests/lasuite-drive/recipe_meta.py
@@ -31,18 +31,18 @@ DEPS = ["keycloak"]
 # pre_install (the former setup_custom_tests.sh, deleted in P2b).


-def READY_PROBE(domain):
+def READY_PROBE(ctx):
    """Readiness signals beyond replica-convergence + the app HEALTH_PATH (Q3.2/F2-12). collabora's
    coolwsd reports its container 1/1 'running' while still doing jail/config init, and its WOPI
    discovery endpoint 404s until ready — so the harness waits for `/hosting/discovery` → 200 on the
    collabora sibling host after the install deploy AND after the upgrade chaos redeploy. This is what
    makes the heavy prev→PR-head crossover reliably green (the new collabora 25.04.9.x finishes init
    within swarm's healthcheck retries; abra's own converge monitor was too impatient — F2-12)."""
-    label, _, rest = domain.partition(".")
-    return [{"host": f"collabora-{domain}", "path": "/hosting/discovery", "ok": (200,)}]
+    label, _, rest = ctx.domain.partition(".")
+    return [{"host": f"collabora-{ctx.domain}", "path": "/hosting/discovery", "ok": (200,)}]


-def EXTRA_ENV(domain):
+def EXTRA_ENV(ctx):
    # Two of lasuite-drive's services route on DOMAIN-DERIVED **nested** subdomains —
    # `MINIO_DOMAIN="minio.${DOMAIN}"` and `COLLABORA_DOMAIN="collabora.${DOMAIN}"`. The cc-ci
    # wildcard TLS cert is `*.ci.commoninternet.net` (single label only), so a 2-label name like
@@ -52,8 +52,8 @@ def EXTRA_ENV(domain):
    # no cert/gateway change. See DECISIONS.md "Phase 2 — nested DOMAIN-derived subdomains".
    # `AWS_S3_DOMAIN_REPLACE` derives from MINIO_DOMAIN in-compose, so setting MINIO_DOMAIN is enough.
    return {
-        "MINIO_DOMAIN": f"minio-{domain}",
-        "COLLABORA_DOMAIN": f"collabora-{domain}",
+        "MINIO_DOMAIN": f"minio-{ctx.domain}",
+        "COLLABORA_DOMAIN": f"collabora-{ctx.domain}",
        # abra's internal per-deploy convergence timeout (recipe TIMEOUT env, default 300s) is too
        # short for this 12-service stack on a cold image cache (impress frontend/backend, minio,
        # postgres, redis, collabora ~1GB, onlyoffice ~2GB). Bump so abra waits long enough for