feat(2): refactor — SSO-dep plan refinement (deps AFTER generic + setup_custom_tests + failure isolation)

Per operator-2026-05-28 SSO-dep plan (plan-sso-dep-testing.md). Substantial orchestrator restructuring.

NEW LIFECYCLE ORDER:
  1. Recipe deploy ALONE (no deps).
  2. install / upgrade / backup / restore: recipe-only generic tiers.
  3. setup_custom_tests step (NEW):
     a. Deploy each declared dep + provision realm/client/test-user via harness.sso.
     b. Write $CCCI_DEPS_FILE in dict shape {dep_recipe: {domain, realm, client_id, client_secret, admin_user, admin_password, discovery_url, token_url, ...}}.
     c. Run tests/<recipe>/setup_custom_tests.sh hook (jq-readable; wires OIDC env via abra secret insert + .env edits + in-place 'abra app deploy --force --chaos').
  4. CUSTOM tier with deps-ready flag; @pytest.mark.requires_deps tests skip with 'deps-not-ready: <reason>' when setup_custom_tests fails. NON-deps custom tests still run normally: FAILURE ISOLATION (a DoD item per plan).
  5. Teardown: recipe first, deps in reverse declaration order.

Harness changes:
  - runner/run_recipe_ci.py: deps deploy moves from BEFORE recipe deploy to AFTER restore tier. Adds _enrich_deps_with_sso() + _run_setup_custom_tests_hook(). DG4.1 generalised to 'one abra app new per app' (recipe + each dep); in-place redeploys (--force) don't count.
  - runner/harness/deps.py: write_run_state + load_run_state accept dict OR list shape; deps_as_dict() coerces either to a recipe→entry map.
  - runner/harness/sso.py: admin_password_inside() public re-export.
  - tests/conftest.py: deps_creds fixture (full creds dict); deps_apps fixture flattens to recipe→domain string. pytest_collection_modifyitems hook skips @pytest.mark.requires_deps tests when CCCI_DEPS_READY=0. pytest_configure registers the marker.

Recipe content:
  - tests/lasuite-docs/setup_custom_tests.sh: NEW hook reads $CCCI_DEPS_FILE via jq; inserts oidc_rpcs secret at BUMPED version (v1→v2) since abra app new -S generates v1 first and Swarm forbids overwriting; updates SECRET_OIDC_RPCS_VERSION in .env; writes 9 OIDC env vars (REALM/DISCOVERY/AUTH/TOKEN/USERINFO/LOGOUT/JWKS/CLIENT_ID/SCOPES); ensures trailing newline on .env so writes don't concatenate (caught a 'TIMEOUT=900OIDC_REALM=...' bug); triggers in-place 'abra app deploy --force --chaos --no-input'.
  - tests/lasuite-docs/functional/test_oidc_with_keycloak.py: refactored to consume deps_creds fixture (no longer calls setup_keycloak_realm itself; the orchestrator does it in setup_custom_tests). Marked @pytest.mark.requires_deps.

Cold-verifiable on cc-ci (log /root/ccci-refactor-lasuite-r5.log):
  RECIPE=lasuite-docs STAGES=install,custom cc-ci-run runner/run_recipe_ci.py
  install: PASS, custom: 3 PASS incl. test_oidc_password_grant_against_dep_keycloak.
  deploy-count = 2 (expect 2), so DG4.1 generalised holds.

Smoke regression: RECIPE=custom-html STAGES=install,custom → 5 PASS, deploy-count=1.

Closes DEFERRED.md #5 (lasuite-docs OIDC parity ports via this plan).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
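
The trailing-newline fix mentioned for tests/lasuite-docs/setup_custom_tests.sh can be illustrated with a minimal Python sketch. The real hook is a shell script that is not shown here, so the helper name `append_env_vars`, the file name `app.env`, and the sample values are assumptions made only for illustration. The point is that a dotenv-style file whose last line lacks a newline must be terminated before new KEY=VALUE lines are appended, otherwise the first new variable is glued onto the previous one (the 'TIMEOUT=900OIDC_REALM=...' symptom).

```python
import os
import tempfile


def append_env_vars(env_path: str, new_vars: dict[str, str]) -> None:
    """Append KEY=VALUE lines to a dotenv-style file (hypothetical helper).

    If the existing content does not end with a newline, write one first so
    the first appended variable starts on its own line.
    """
    with open(env_path, "rb") as f:
        existing = f.read()
    with open(env_path, "a", encoding="utf-8") as f:
        if existing and not existing.endswith(b"\n"):
            f.write("\n")
        for key, value in new_vars.items():
            f.write(f"{key}={value}\n")


# Demo: the last line of the file has no trailing newline.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, "app.env")  # illustrative file name
    with open(env_path, "w", encoding="utf-8") as f:
        f.write("TIMEOUT=900")
    append_env_vars(env_path, {"OIDC_REALM": "lasuite-docs"})  # sample value
    with open(env_path, encoding="utf-8") as f:
        demo_lines = f.read().splitlines()

print(demo_lines)
```

Without the newline guard the file would contain the single line `TIMEOUT=900OIDC_REALM=lasuite-docs`, which a dotenv parser reads as one variable with a corrupted value.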
2026-05-28 19:11:42 +01:00
parent 5832da4fd1
commit 41ede13042
7 changed files with 386 additions and 104 deletions
--- a/runner/harness/deps.py
+++ b/runner/harness/deps.py
@@ -60,9 +60,17 @@ def dep_domain(parent_recipe: str, pr: str, ref: str | None, dep_recipe: str) ->
    return naming.app_domain(dep_recipe, pr, synthetic_ref)


-def write_run_state(deps_state: list[dict]) -> None:
-    """Write the deps state file ($CCCI_DEPS_FILE) so dependent tests can find their dep apps via
-    the `deps_apps` fixture. No-op if the env var isn't set."""
+def write_run_state(deps_state) -> None:
+    """Write the deps state file ($CCCI_DEPS_FILE). Two shapes supported (canonical=keyed dict):
+
+    1. **Legacy list-of-entries:** `[{"recipe": "<dep>", "domain": "<d>"}, ...]` (Q2.3 original).
+       Still accepted by `load_run_state` for backwards compat — `deps_apps` fixture flattens.
+    2. **NEW per-spec dict (operator-2026-05-28 SSO-dep plan §3.2):**
+       `{"<dep_recipe>": {"recipe": "<dep>", "domain": "<d>", "realm": "...",
+       "client_id": "...", "client_secret": "...", "admin_user": "...", "admin_password": "..."}}`.
+       The `setup_custom_tests.sh` per-recipe hook reads this via `jq` to wire OIDC env.
+
+    No-op if `$CCCI_DEPS_FILE` isn't set."""
    path = os.environ.get("CCCI_DEPS_FILE")
    if not path:
        return
@@ -143,8 +151,9 @@ def teardown_deps(state: list[dict]) -> None:
        raise lifecycle.TeardownError("dep teardown failures: " + " ; ".join(errors))


-def load_run_state() -> list[dict]:
-    """Read the current run's deps state (used by the `deps_apps` fixture). Returns [] if unset."""
+def load_run_state():
+    """Read the current run's deps state. Returns the JSON content (list OR dict — both shapes
+    supported, see write_run_state). Returns [] if file is empty/unset."""
    path = os.environ.get("CCCI_DEPS_FILE")
    if not path or not os.path.exists(path):
        return []
@@ -153,3 +162,15 @@ def load_run_state() -> list[dict]:
            return json.load(f) or []
    except (OSError, ValueError):
        return []
+
+
+def deps_as_dict(state) -> dict[str, dict]:
+    """Coerce either shape (legacy list or new dict) into a recipe→entry dict for the deps_apps
+    fixture + dependent-tests consumption."""
+    if isinstance(state, dict):
+        return state
+    out: dict[str, dict] = {}
+    for entry in state or []:
+        if isinstance(entry, dict) and entry.get("recipe"):
+            out[entry["recipe"]] = entry
+    return out
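
As a quick illustration of the two accepted shapes, the sketch below copies `deps_as_dict` from the hunk above so that it runs on its own, then feeds it a legacy list and a keyed dict. The domains and the realm value are made-up sample data, and the final recipe→domain flattening only approximates what the commit message says the `deps_apps` fixture does; the fixture itself is not shown in this file.

```python
def deps_as_dict(state) -> dict[str, dict]:
    # Copied from the diff above so the example is self-contained.
    if isinstance(state, dict):
        return state
    out: dict[str, dict] = {}
    for entry in state or []:
        if isinstance(entry, dict) and entry.get("recipe"):
            out[entry["recipe"]] = entry
    return out


# Legacy list shape; the second entry has no "recipe" key and is dropped.
legacy = [
    {"recipe": "keycloak", "domain": "kc.example.test"},
    {"domain": "orphan.example.test"},
]
# New keyed dict shape (sample values only).
keyed = {
    "keycloak": {"recipe": "keycloak", "domain": "kc.example.test", "realm": "lasuite-docs"},
}

from_legacy = deps_as_dict(legacy)
from_keyed = deps_as_dict(keyed)
from_none = deps_as_dict(None)

# Assumed flattening in the spirit of the deps_apps fixture: recipe -> domain.
domains = {recipe: entry["domain"] for recipe, entry in from_keyed.items()}
print(from_legacy, domains, from_none)
```

Note that a dict input is returned as the same object rather than a copy, so callers that mutate the result also mutate the loaded state.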
--- a/runner/harness/sso.py
+++ b/runner/harness/sso.py
@@ -264,6 +264,12 @@ def assert_discovery_endpoint(creds: dict) -> dict:
 # ---------------------------------------------------------------------------


+def admin_password_inside(provider_domain: str) -> str:
+    """Read the abra-generated admin_password from inside the provider container.
+    Public re-export of the previously-private _kc_admin_password for the orchestrator wiring."""
+    return _kc_admin_password(provider_domain)
+
+
 def write_sso_creds(creds: dict) -> None:
    """Persist creds to $CCCI_SSO_CREDS_FILE for the dependent recipe's tests to read. The file is
    in /tmp (the runner's per-process tempdir) and deleted at run end alongside the deps file."""
--- a/runner/run_recipe_ci.py
+++ b/runner/run_recipe_ci.py
@@ -279,6 +279,80 @@ def run_lifecycle_tier(
    return "pass" if rc_all == 0 else "fail"


+def _enrich_deps_with_sso(parent_recipe: str, parent_domain: str, deps_list) -> dict[str, dict]:
+    """For each dep, set up a fresh realm/client + test user via the harness's provider-specific
+    setup function, then return a recipe→entry dict carrying domain + admin + realm/client/user
+    info — the shape the `setup_custom_tests.sh` hook (and dependent tests) read.
+
+    Provider routing: today only `keycloak` is supported. authentik will need a parallel
+    `setup_authentik_realm` when an authentik-dep recipe enrolls (DEFERRED.md #9).
+    """
+    from harness import sso  # local import — sso may not be needed for dep-less runs
+
+    out: dict[str, dict] = {}
+    for entry in deps_list or []:
+        dep_recipe = entry.get("recipe")
+        dep_domain = entry.get("domain")
+        if not dep_recipe or not dep_domain:
+            continue
+        if dep_recipe != "keycloak":
+            # Provider not yet supported — record bare entry; setup_custom_tests.sh / tests will
+            # raise if they need realm/client info they don't see.
+            out[dep_recipe] = entry
+            continue
+        # The realm/client name uses the parent recipe name so collisions across parents are
+        # impossible on a shared keycloak (and the values are predictable for debugging).
+        realm = parent_recipe
+        client_id = parent_recipe
+        creds = sso.setup_keycloak_realm(
+            dep_domain,
+            realm=realm,
+            client_id=client_id,
+            redirect_uris=[f"https://{parent_domain}/*"],
+            web_origins=[f"https://{parent_domain}"],
+        )
+        out[dep_recipe] = {
+            "recipe": dep_recipe,
+            "domain": dep_domain,
+            "realm": creds["realm"],
+            "client_id": creds["client_id"],
+            "client_secret": creds["client_secret"],
+            "user": creds["user"],
+            "password": creds["password"],
+            "email": creds["email"],
+            "discovery_url": creds["discovery_url"],
+            "token_url": creds["token_url"],
+            "auth_url": creds["auth_url"],
+            "userinfo_url": creds["userinfo_url"],
+            "admin_user": "admin",
+            "admin_password": sso.admin_password_inside(dep_domain),
+        }
+    return out
+
+
+def _run_setup_custom_tests_hook(recipe: str, domain: str, deps_file: str) -> None:
+    """Run `tests/<recipe>/setup_custom_tests.sh` if present (operator-2026-05-28 SSO-dep plan
+    §3.2). The hook reads `$CCCI_DEPS_FILE`, wires OIDC env via secret insert + `.env` edits,
+    and triggers an in-place `abra app deploy --force --chaos`. Failure here propagates
+    to mark deps-not-ready (caught in main())."""
+    path = os.path.join(ROOT, "tests", recipe, "setup_custom_tests.sh")
+    if not os.path.isfile(path):
+        # No hook = recipe doesn't need post-deps wiring; deps are deployed + creds available
+        # via deps_apps fixture as-is.
+        print(f"  setup_custom_tests: no hook at {os.path.relpath(path, ROOT)} (deps creds ready in $CCCI_DEPS_FILE)", flush=True)
+        return
+    print(f"  setup_custom_tests hook: {os.path.relpath(path, ROOT)}", flush=True)
+    rc = subprocess.run(
+        ["bash", path],
+        check=False,
+        env=dict(os.environ, CCCI_APP_DOMAIN=domain, CCCI_RECIPE=recipe, CCCI_DEPS_FILE=deps_file),
+    )
+    if rc.returncode != 0:
+        raise RuntimeError(
+            f"setup_custom_tests.sh exited {rc.returncode} (deps env not wired into parent)"
+        )
+
+
 def run_custom(recipe: str, repo_local: str | None, domain: str) -> str:
    """Run all discovered non-lifecycle custom test_*.py (both locations, additive). Returns
    'skip' if none defined, else 'pass'/'fail'."""
@@ -344,59 +418,47 @@ def main() -> int:
    os.environ["CCCI_OP_STATE_FILE"] = statefile
    op_state: dict = {}

-    # Run-scoped dep state (Phase 2 Q2.3): if this recipe declares DEPS in recipe_meta, the
-    # orchestrator deploys each dep BEFORE the recipe under test, persists their per-run identity
-    # here for dependent tests to read via the `deps_apps` fixture, and tears them down LAST in
-    # finally (reverse order). Empty list when no deps declared.
+    # Run-scoped dep state (Phase 2 Q2.3, refined per operator-2026-05-28 SSO-dep plan §1):
+    # deps now deploy AFTER generic tiers (between RESTORE and CUSTOM) so a failed dep deploy
+    # cannot break the generic-tier signal. The `setup_custom_tests` step deploys each dep + runs
+    # `tests/<recipe>/setup_custom_tests.sh` to wire OIDC env via in-place redeploy.
+    # `$CCCI_DEPS_FILE` is written with the full creds dict the hook script needs (jq-readable).
    depsfile = os.path.join(tempfile.gettempdir(), f"ccci-deps-{domain}.json")
    with open(depsfile, "w") as f:
-        json.dump([], f)
+        json.dump({}, f)
    os.environ["CCCI_DEPS_FILE"] = depsfile
    declared = deps_mod.declared_deps(recipe)
    if declared:
-        print(f"\n===== DEPS: {declared} =====", flush=True)
-    deps_state: list[dict] = []
+        print(f"\n===== DEPS declared (deploy AFTER generic tiers): {declared} =====", flush=True)
+    deps_state: dict[str, dict] = {}  # new shape: recipe→entry dict (sso-dep plan §1)
+    deps_ready = True
+    deps_not_ready_reason: str = ""

    results: dict[str, str] = {}
    lifecycle.janitor()
-    dep_deploy_failed = False
    dep_teardown_error: str | None = None
    try:
-        # ---- deps deploy FIRST (sequentially), if declared (Q2.3) ----
-        if declared:
-            try:
-                # Build a per-dep meta map for readiness waits (timeouts/health-path/codes)
-                dep_metas = {d: _load_meta(d) for d in declared}
-                deps_state = deps_mod.deploy_deps(
-                    recipe, os.environ.get("PR", "0"), ref, declared, meta_for=dep_metas
-                )
-            except Exception as e:  # noqa: BLE001 — failed dep deploy is a recipe install failure
-                print(f"!! dep deploy failed: {_scrub(str(e))}", flush=True)
-                dep_deploy_failed = True
-        # ---- deploy ONCE + wait ready (the single deployment all tiers share) ----
-        if dep_deploy_failed:
+        # ---- deploy RECIPE FIRST, alone (no deps yet — generic tiers run recipe-only) ----
+        try:
+            lifecycle.deploy_app(
+                recipe,
+                domain,
+                version=base,
+                secrets=True,
+                install_steps_hook=hook,
+                deploy_timeout=int(meta.get("DEPLOY_TIMEOUT", 900)),
+            )
+            lifecycle.wait_healthy(
+                domain,
+                ok_codes=tuple(meta["HEALTH_OK"]),
+                path=meta["HEALTH_PATH"],
+                deploy_timeout=meta["DEPLOY_TIMEOUT"],
+                http_timeout=meta["HTTP_TIMEOUT"],
+            )
+            deploy_ok = True
+        except Exception as e:  # noqa: BLE001 — a failed deploy is a reported INSTALL failure
+            print(f"!! deploy/readiness failed: {e}", flush=True)
            deploy_ok = False
-        else:
-            try:
-                lifecycle.deploy_app(
-                    recipe,
-                    domain,
-                    version=base,
-                    secrets=True,
-                    install_steps_hook=hook,
-                    deploy_timeout=int(meta.get("DEPLOY_TIMEOUT", 900)),
-                )
-                lifecycle.wait_healthy(
-                    domain,
-                    ok_codes=tuple(meta["HEALTH_OK"]),
-                    path=meta["HEALTH_PATH"],
-                    deploy_timeout=meta["DEPLOY_TIMEOUT"],
-                    http_timeout=meta["HTTP_TIMEOUT"],
-                )
-                deploy_ok = True
-            except Exception as e:  # noqa: BLE001 — a failed deploy is a reported INSTALL failure, not a crash
-                print(f"!! deploy/readiness failed: {e}", flush=True)
-                deploy_ok = False

        # ---- INSTALL tier (always; additive generic + overlay, no op) ----
        if "install" in stages:
@@ -433,8 +495,38 @@ def main() -> int:
                    if backup_cap
                    else "skip"
                )
+            # ---- setup_custom_tests step (NEW, operator-2026-05-28 SSO-dep plan §3.2) ----
+            # Deploy each declared dep + wire OIDC env into the parent app via the per-recipe
+            # setup_custom_tests.sh hook + in-place redeploy. Failure here marks deps-not-ready
+            # but does NOT abort the run — @pytest.mark.requires_deps tests skip with reason;
+            # non-deps custom tests still run normally.
+            if declared:
+                print("\n===== setup_custom_tests: deps + OIDC wiring =====", flush=True)
+                try:
+                    dep_metas = {d: _load_meta(d) for d in declared}
+                    deps_list = deps_mod.deploy_deps(
+                        recipe, os.environ.get("PR", "0"), ref, declared, meta_for=dep_metas
+                    )
+                    # Enrich each dep entry with SSO creds (realm/client/secret) by setting up a
+                    # keycloak realm per dep. The dict form is what setup_custom_tests.sh reads.
+                    deps_state = _enrich_deps_with_sso(recipe, domain, deps_list)
+                    deps_mod.write_run_state(deps_state)
+                    # Run the per-recipe post-deps hook (jq-driven OIDC wiring + in-place redeploy)
+                    _run_setup_custom_tests_hook(recipe, domain, depsfile)
+                except Exception as e:  # noqa: BLE001 — setup failure is ISOLATED to dep-marked tests
+                    deps_ready = False
+                    deps_not_ready_reason = _scrub(str(e))[:300]
+                    print(
+                        f"!! setup_custom_tests failed (deps-not-ready): {deps_not_ready_reason}",
+                        flush=True,
+                    )
+
            # ---- CUSTOM tier ----
            if "custom" in stages:
+                # Pass deps-ready state via env; conftest.py skips @pytest.mark.requires_deps
+                # tests when CCCI_DEPS_READY=0.
+                os.environ["CCCI_DEPS_READY"] = "1" if deps_ready else "0"
+                os.environ["CCCI_DEPS_NOT_READY_REASON"] = deps_not_ready_reason
                results["custom"] = run_custom(recipe, repo_local, domain)
        else:
            # install failed → the shared deployment is dead; remaining tiers cannot run on it.
@@ -451,7 +543,13 @@ def main() -> int:
        if deps_state:
            print("\n===== DEPS teardown =====", flush=True)
            try:
-                deps_mod.teardown_deps(deps_state)
+                # teardown_deps accepts a list of entries; flatten the dict-shape state in
+                # declaration order (no reversal here; §1's reverse order is up to teardown_deps).
+                if isinstance(deps_state, dict):
+                    list_for_teardown = [deps_state[d] for d in declared if d in deps_state]
+                else:
+                    list_for_teardown = deps_state
+                deps_mod.teardown_deps(list_for_teardown)
            except lifecycle.TeardownError as e:
                dep_teardown_error = str(e)
                print(f"!! {dep_teardown_error}", flush=True)
@@ -466,13 +564,22 @@ def main() -> int:
        os.remove(depsfile)

    # ---- per-op summary (DG6 feed) ----
-    # Phase 2 Q2.3: deps each `deploy_app` once, so the expected count = 1 (recipe under test) +
-    # len(deps). DG4.1 still holds — no extra deploys per recipe — just accommodates declared deps.
-    expected_deploy_count = 1 + len(deps_state)
+    # SSO-dep plan §1: DG4.1 generalised — one `abra app new` per app in the run (recipe + each
+    # dep). In-place reconfigure-and-redeploy (the setup_custom_tests step's
+    # `abra app deploy --force --chaos`) is NOT a fresh `app_new` and does NOT increment the
+    # count. So expected = 1 + (number of deps that actually got deployed).
+    deps_deployed_count = len(deps_state) if isinstance(deps_state, dict) else len(deps_state or [])
+    expected_deploy_count = 1 + deps_deployed_count
    print("\n===== RUN SUMMARY =====", flush=True)
    print(f"deploy-count = {deploy_count} (expect {expected_deploy_count})")
    if deps_state:
-        print(f"  deps deployed: {[d['recipe'] for d in deps_state]}")
+        deps_list_for_summary = (
+            list(deps_state.keys()) if isinstance(deps_state, dict)
+            else [d.get("recipe", "?") for d in deps_state]
+        )
+        print(f"  deps deployed: {deps_list_for_summary}")
+        if not deps_ready:
+            print(f"  deps-not-ready: {deps_not_ready_reason}")
    order = [s for s in ALL_STAGES if s in results]
    for op in order:
        print(f"  {op:8s}: {results[op]}")
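
The teardown flattening and the generalised deploy-count rule in the hunks above can be checked in isolation with a small sketch. `flatten_for_teardown` and `expected_deploy_count` are hypothetical helper names (the orchestrator does both inline), and `other-dep` plus the domains are invented sample data. The sketch assumes nothing about `teardown_deps` itself, whose body is not part of this diff.

```python
def flatten_for_teardown(deps_state, declared):
    """Turn the recipe->entry dict into a list ordered by the declaration list
    (hypothetical helper mirroring the inline logic in main())."""
    if isinstance(deps_state, dict):
        return [deps_state[d] for d in declared if d in deps_state]
    return list(deps_state or [])


def expected_deploy_count(deps_state) -> int:
    """One fresh deploy for the recipe under test plus one per recorded dep.
    In-place redeploys (--force) are not counted."""
    return 1 + len(deps_state or [])


declared = ["keycloak", "other-dep"]  # "other-dep" is a made-up second dep
deps_state = {
    # Inserted in the opposite order to show that `declared` drives ordering.
    "other-dep": {"recipe": "other-dep", "domain": "other.example.test"},
    "keycloak": {"recipe": "keycloak", "domain": "kc.example.test"},
}

teardown_list = flatten_for_teardown(deps_state, declared)
teardown_order = [entry["recipe"] for entry in teardown_list]
count_with_two_deps = expected_deploy_count(deps_state)
count_without_deps = expected_deploy_count({})
print(teardown_order, count_with_two_deps, count_without_deps)
```

With one keycloak dep the same rule gives 2, matching the lasuite-docs run reported in the commit message, and a dep-less recipe such as custom-html gives 1.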