feat(2): refactor — SSO-dep plan refinement (deps AFTER generic + setup_custom_tests + failure isolation)

Per operator-2026-05-28 SSO-dep plan (plan-sso-dep-testing.md). Substantial orchestrator
restructuring:

NEW LIFECYCLE ORDER:
  1. Recipe deploy ALONE (no deps).
  2. install / upgrade / backup / restore — recipe-only generic tiers.
  3. setup_custom_tests step (NEW):
     a. Deploy each declared dep + provision realm/client/test-user via harness.sso.
     b. Write $CCCI_DEPS_FILE in dict shape {dep_recipe: {domain, realm, client_id, client_secret,
        admin_user, admin_password, discovery_url, token_url, ...}}.
     c. Run tests/<recipe>/setup_custom_tests.sh hook (jq-readable; wires OIDC env via abra
        secret insert + .env edits + in-place 'abra app deploy --force --chaos').
  4. CUSTOM tier with deps-ready flag; @pytest.mark.requires_deps tests skip with
     'deps-not-ready: <reason>' when setup_custom_tests fails. NON-deps custom tests still run
     normally — FAILURE ISOLATION (a DoD item per plan).
  5. Teardown: recipe first, deps in reverse declaration order.
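
A minimal Python sketch of the failure isolation in steps 3 and 4, under stated assumptions: `run_custom_stage`, `setup`, and `custom_tier` are hypothetical names, not the orchestrator's own functions. Only the two `CCCI_DEPS_*` variables and the 300-character cap on the reason come from this change.

```python
from typing import Callable


def run_custom_stage(
    setup: Callable[[], None],
    custom_tier: Callable[[dict[str, str]], str],
) -> str:
    """Run setup_custom_tests, then the CUSTOM tier, without letting setup abort the run."""
    deps_ready, reason = True, ""
    try:
        setup()  # stands in for: deploy deps, write the deps file, run the hook
    except Exception as exc:  # isolate the failure; only dep-marked tests are affected
        deps_ready, reason = False, str(exc)[:300]
    env = {
        "CCCI_DEPS_READY": "1" if deps_ready else "0",
        "CCCI_DEPS_NOT_READY_REASON": reason,
    }
    return custom_tier(env)  # the CUSTOM tier runs either way
```

The point of the shape is that a setup failure only changes the flag handed to the CUSTOM tier; it never short-circuits the tier itself.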

Harness changes:
- runner/run_recipe_ci.py: deps deploy moves from BEFORE recipe deploy to AFTER restore tier.
  Adds _enrich_deps_with_sso() + _run_setup_custom_tests_hook(). DG4.1 generalised to
  'one abra app new per app' (recipe + each dep); in-place redeploys (--force) don't count.
- runner/harness/deps.py: write_run_state + load_run_state accept dict OR list shape;
  deps_as_dict() coerces either to a recipe→entry map.
- runner/harness/sso.py: admin_password_inside() public re-export.
- tests/conftest.py: deps_creds fixture (full creds dict); deps_apps fixture flattens to
  recipe→domain string. pytest_collection_modifyitems hook skips
  @pytest.mark.requires_deps tests when CCCI_DEPS_READY=0.
  pytest_configure registers the marker.
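
The conftest.py behaviour described above could look roughly like the following sketch. It is not the committed file: `deps_skip_reason` is a hypothetical helper, and the `unknown` fallback and the marker description text are assumptions. The pytest hooks and the marker/skip calls are standard pytest API.

```python
import os
from typing import Mapping, Optional

import pytest


def deps_skip_reason(environ: Mapping[str, str]) -> Optional[str]:
    """Return a skip reason when deps are flagged not ready, else None."""
    if environ.get("CCCI_DEPS_READY") != "0":
        return None  # unset or "1": dep-marked tests run
    detail = environ.get("CCCI_DEPS_NOT_READY_REASON") or "unknown"  # fallback is an assumption
    return f"deps-not-ready: {detail}"


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line(
        "markers", "requires_deps: test needs the declared deps deployed and wired"
    )


def pytest_collection_modifyitems(config, items):
    reason = deps_skip_reason(os.environ)
    if reason is None:
        return
    skip = pytest.mark.skip(reason=reason)
    for item in items:
        # Only dep-marked tests are skipped; everything else is left untouched.
        if item.get_closest_marker("requires_deps"):
            item.add_marker(skip)
```

Keeping the decision in a small pure function makes the skip rule checkable without running a pytest session.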

Recipe content:
- tests/lasuite-docs/setup_custom_tests.sh: NEW hook reads $CCCI_DEPS_FILE via jq;
  inserts oidc_rpcs secret at BUMPED version (v1→v2) since abra app new -S generates v1 first
  and Swarm forbids overwriting; updates SECRET_OIDC_RPCS_VERSION in .env; writes 9 OIDC env
  vars (REALM/DISCOVERY/AUTH/TOKEN/USERINFO/LOGOUT/JWKS/CLIENT_ID/SCOPES); ensures trailing
  newline on .env so writes don't concatenate (caught a 'TIMEOUT=900OIDC_REALM=...' bug);
  triggers in-place 'abra app deploy --force --chaos --no-input'.
- tests/lasuite-docs/functional/test_oidc_with_keycloak.py: refactored to consume deps_creds
  fixture (no longer calls setup_keycloak_realm itself — the orchestrator does it in
  setup_custom_tests). Marked @pytest.mark.requires_deps.
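
The trailing-newline guard is done in shell inside the hook. The same idea in Python, as a sketch only (`append_env_vars` is a hypothetical name and the variable values are illustrative):

```python
from pathlib import Path


def append_env_vars(env_path: Path, pairs: dict[str, str]) -> None:
    """Append KEY=VALUE lines, first terminating a last line that lacks a newline."""
    existing = env_path.read_text() if env_path.exists() else ""
    # Without this guard, 'TIMEOUT=900' + 'OIDC_REALM=...' would fuse into one line.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    lines = "".join(f"{key}={value}\n" for key, value in pairs.items())
    with env_path.open("a") as f:
        f.write(prefix + lines)
```

Every appended line ends with its own newline, so repeated calls stay safe as well.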

Cold-verifiable on cc-ci (log /root/ccci-refactor-lasuite-r5.log):
  RECIPE=lasuite-docs STAGES=install,custom cc-ci-run runner/run_recipe_ci.py
  install: PASS, custom: 3 PASS incl. test_oidc_password_grant_against_dep_keycloak.
  deploy-count = 2 (expect 2) — DG4.1 generalised holds.
  Smoke regression: RECIPE=custom-html STAGES=install,custom → 5 PASS, deploy-count=1.

Closes DEFERRED.md #5 (lasuite-docs OIDC parity ports via this plan).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 19:11:42 +01:00
parent 5832da4fd1
commit 41ede13042
7 changed files with 386 additions and 104 deletions

@ -60,9 +60,17 @@ def dep_domain(parent_recipe: str, pr: str, ref: str | None, dep_recipe: str) ->
return naming.app_domain(dep_recipe, pr, synthetic_ref)
def write_run_state(deps_state: list[dict]) -> None:
"""Write the deps state file ($CCCI_DEPS_FILE) so dependent tests can find their dep apps via
the `deps_apps` fixture. No-op if the env var isn't set."""
def write_run_state(deps_state) -> None:
"""Write the deps state file ($CCCI_DEPS_FILE). Two shapes supported (canonical=keyed dict):
1. **Legacy list-of-entries:** `[{"recipe": "<dep>", "domain": "<d>"}, ...]` (Q2.3 original).
Still accepted by `load_run_state` for backwards compat — `deps_apps` fixture flattens.
2. **NEW per-spec dict (operator-2026-05-28 SSO-dep plan §3.2):**
`{"<dep_recipe>": {"recipe": "<dep>", "domain": "<d>", "realm": "...",
"client_id": "...", "client_secret": "...", "admin_user": "...", "admin_password": "..."}}`.
The `setup_custom_tests.sh` per-recipe hook reads this via `jq` to wire OIDC env.
No-op if `$CCCI_DEPS_FILE` isn't set."""
path = os.environ.get("CCCI_DEPS_FILE")
if not path:
return
@ -143,8 +151,9 @@ def teardown_deps(state: list[dict]) -> None:
raise lifecycle.TeardownError("dep teardown failures: " + " ; ".join(errors))
def load_run_state() -> list[dict]:
"""Read the current run's deps state (used by the `deps_apps` fixture). Returns [] if unset."""
def load_run_state():
"""Read the current run's deps state. Returns the JSON content (list OR dict — both shapes
supported, see write_run_state). Returns [] if the env var is unset or the file is missing, empty, or unparseable."""
path = os.environ.get("CCCI_DEPS_FILE")
if not path or not os.path.exists(path):
return []
@ -153,3 +162,15 @@ def load_run_state() -> list[dict]:
return json.load(f) or []
except (OSError, ValueError):
return []
def deps_as_dict(state) -> dict[str, dict]:
"""Coerce either shape (legacy list or new dict) into a recipe→entry dict for the deps_apps
fixture + dependent-tests consumption."""
if isinstance(state, dict):
return state
out: dict[str, dict] = {}
for entry in state or []:
if isinstance(entry, dict) and entry.get("recipe"):
out[entry["recipe"]] = entry
return out
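
A short usage sketch for the coercion above. `deps_as_dict` is reproduced in condensed form so the block is self-contained; `flatten_to_domains` is a hypothetical helper illustrating the recipe→domain flattening the `deps_apps` fixture is described as doing, and the domain and realm values are made up.

```python
def deps_as_dict(state) -> dict[str, dict]:
    """Condensed copy of the function in this hunk: either shape -> recipe->entry dict."""
    if isinstance(state, dict):
        return state
    return {
        entry["recipe"]: entry
        for entry in state or []
        if isinstance(entry, dict) and entry.get("recipe")
    }


def flatten_to_domains(state) -> dict[str, str]:
    """Hypothetical helper: recipe -> domain string, for either shape."""
    return {recipe: entry.get("domain", "") for recipe, entry in deps_as_dict(state).items()}


legacy = [{"recipe": "keycloak", "domain": "kc.example.test"}]
keyed = {"keycloak": {"recipe": "keycloak", "domain": "kc.example.test", "realm": "lasuite-docs"}}

# Both shapes flatten to the same recipe->domain map.
print(flatten_to_domains(legacy))  # {'keycloak': 'kc.example.test'}
print(flatten_to_domains(keyed))   # {'keycloak': 'kc.example.test'}
```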

@ -264,6 +264,12 @@ def assert_discovery_endpoint(creds: dict) -> dict:
# ---------------------------------------------------------------------------
def admin_password_inside(provider_domain: str) -> str:
"""Read the abra-generated admin_password from inside the provider container.
Public re-export of the previously-private _kc_admin_password for the orchestrator wiring."""
return _kc_admin_password(provider_domain)
def write_sso_creds(creds: dict) -> None:
"""Persist creds to $CCCI_SSO_CREDS_FILE for the dependent recipe's tests to read. The file is
in /tmp (the runner's per-process tempdir) and deleted at run end alongside the deps file."""

@ -279,6 +279,80 @@ def run_lifecycle_tier(
return "pass" if rc_all == 0 else "fail"
def _enrich_deps_with_sso(parent_recipe: str, parent_domain: str, deps_list) -> dict[str, dict]:
"""For each dep, set up a fresh realm/client + test user via the harness's provider-specific
setup function, then return a recipe→entry dict carrying domain + admin + realm/client/user
info — the shape the `setup_custom_tests.sh` hook (and dependent tests) read.
Provider routing: today only `keycloak` is supported. authentik will need a parallel
`setup_authentik_realm` when an authentik-dep recipe enrolls (DEFERRED.md #9).
"""
from harness import sso # local import — sso may not be needed for dep-less runs
out: dict[str, dict] = {}
for entry in deps_list or []:
dep_recipe = entry.get("recipe")
dep_domain = entry.get("domain")
if not dep_recipe or not dep_domain:
continue
if dep_recipe != "keycloak":
# Provider not yet supported — record bare entry; setup_custom_tests.sh / tests will
# raise if they need realm/client info they don't see.
out[dep_recipe] = entry
continue
# The realm/client name uses the parent recipe name so collisions across parents are
# impossible on a shared keycloak (and the values are predictable for debugging).
realm = parent_recipe
client_id = parent_recipe
creds = sso.setup_keycloak_realm(
dep_domain,
realm=realm,
client_id=client_id,
redirect_uris=[f"https://{parent_domain}/*"],
web_origins=[f"https://{parent_domain}"],
)
out[dep_recipe] = {
"recipe": dep_recipe,
"domain": dep_domain,
"realm": creds["realm"],
"client_id": creds["client_id"],
"client_secret": creds["client_secret"],
"user": creds["user"],
"password": creds["password"],
"email": creds["email"],
"discovery_url": creds["discovery_url"],
"token_url": creds["token_url"],
"auth_url": creds["auth_url"],
"userinfo_url": creds["userinfo_url"],
"admin_user": "admin",
"admin_password": sso.admin_password_inside(dep_domain),
}
return out
def _run_setup_custom_tests_hook(recipe: str, domain: str, deps_file: str) -> None:
"""Run `tests/<recipe>/setup_custom_tests.sh` if present (operator-2026-05-28 SSO-dep plan
§3.2). The hook reads `$CCCI_DEPS_FILE`, sets OIDC env via `.env` edits + an abra secret
insert, and triggers an in-place `abra app deploy --force --chaos`. Failure here propagates
to mark deps-not-ready (caught in main())."""
path = os.path.join(ROOT, "tests", recipe, "setup_custom_tests.sh")
if not os.path.isfile(path):
# No hook = recipe doesn't need post-deps wiring; deps are deployed + creds available
# via deps_apps fixture as-is.
print(f" setup_custom_tests: no hook at {os.path.relpath(path, ROOT)} (deps creds ready in $CCCI_DEPS_FILE)", flush=True)
return
print(f" setup_custom_tests hook: {os.path.relpath(path, ROOT)}", flush=True)
rc = subprocess.run(
["bash", path],
check=False,
env=dict(os.environ, CCCI_APP_DOMAIN=domain, CCCI_RECIPE=recipe, CCCI_DEPS_FILE=deps_file),
)
if rc.returncode != 0:
raise RuntimeError(
f"setup_custom_tests.sh exited {rc.returncode} (deps env not wired into parent)"
)
def run_custom(recipe: str, repo_local: str | None, domain: str) -> str:
"""Run all discovered non-lifecycle custom test_*.py (both locations, additive). Returns
'skip' if none defined, else 'pass'/'fail'."""
@ -344,59 +418,47 @@ def main() -> int:
os.environ["CCCI_OP_STATE_FILE"] = statefile
op_state: dict = {}
# Run-scoped dep state (Phase 2 Q2.3): if this recipe declares DEPS in recipe_meta, the
# orchestrator deploys each dep BEFORE the recipe under test, persists their per-run identity
# here for dependent tests to read via the `deps_apps` fixture, and tears them down LAST in
# finally (reverse order). Empty list when no deps declared.
# Run-scoped dep state (Phase 2 Q2.3, refined per operator-2026-05-28 SSO-dep plan §1):
# deps now deploy AFTER generic tiers (between RESTORE and CUSTOM) so a failed dep deploy
# cannot break the generic-tier signal. The `setup_custom_tests` step deploys each dep + runs
# `tests/<recipe>/setup_custom_tests.sh` to wire OIDC env via in-place redeploy.
# `$CCCI_DEPS_FILE` is written with the full creds dict the hook script needs (jq-readable).
depsfile = os.path.join(tempfile.gettempdir(), f"ccci-deps-{domain}.json")
with open(depsfile, "w") as f:
json.dump([], f)
json.dump({}, f)
os.environ["CCCI_DEPS_FILE"] = depsfile
declared = deps_mod.declared_deps(recipe)
if declared:
print(f"\n===== DEPS: {declared} =====", flush=True)
deps_state: list[dict] = []
print(f"\n===== DEPS declared (deploy AFTER generic tiers): {declared} =====", flush=True)
deps_state: dict[str, dict] = {} # new shape: recipe→entry dict (sso-dep plan §1)
deps_ready = True
deps_not_ready_reason: str = ""
results: dict[str, str] = {}
lifecycle.janitor()
dep_deploy_failed = False
dep_teardown_error: str | None = None
try:
# ---- deps deploy FIRST (sequentially), if declared (Q2.3) ----
if declared:
try:
# Build a per-dep meta map for readiness waits (timeouts/health-path/codes)
dep_metas = {d: _load_meta(d) for d in declared}
deps_state = deps_mod.deploy_deps(
recipe, os.environ.get("PR", "0"), ref, declared, meta_for=dep_metas
)
except Exception as e: # noqa: BLE001 — failed dep deploy is a recipe install failure
print(f"!! dep deploy failed: {_scrub(str(e))}", flush=True)
dep_deploy_failed = True
# ---- deploy ONCE + wait ready (the single deployment all tiers share) ----
if dep_deploy_failed:
# ---- deploy RECIPE FIRST, alone (no deps yet — generic tiers run recipe-only) ----
try:
lifecycle.deploy_app(
recipe,
domain,
version=base,
secrets=True,
install_steps_hook=hook,
deploy_timeout=int(meta.get("DEPLOY_TIMEOUT", 900)),
)
lifecycle.wait_healthy(
domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
deploy_ok = True
except Exception as e: # noqa: BLE001 — a failed deploy is a reported INSTALL failure
print(f"!! deploy/readiness failed: {e}", flush=True)
deploy_ok = False
else:
try:
lifecycle.deploy_app(
recipe,
domain,
version=base,
secrets=True,
install_steps_hook=hook,
deploy_timeout=int(meta.get("DEPLOY_TIMEOUT", 900)),
)
lifecycle.wait_healthy(
domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
deploy_ok = True
except Exception as e: # noqa: BLE001 — a failed deploy is a reported INSTALL failure, not a crash
print(f"!! deploy/readiness failed: {e}", flush=True)
deploy_ok = False
# ---- INSTALL tier (always; additive generic + overlay, no op) ----
if "install" in stages:
@ -433,8 +495,38 @@ def main() -> int:
if backup_cap
else "skip"
)
# ---- setup_custom_tests step (NEW, operator-2026-05-28 SSO-dep plan §3.2) ----
# Deploy each declared dep + wire OIDC env into the parent app via the per-recipe
# setup_custom_tests.sh hook + in-place redeploy. Failure here marks deps-not-ready
# but does NOT abort the run — @pytest.mark.requires_deps tests skip with reason;
# non-deps custom tests still run normally.
if declared:
print("\n===== setup_custom_tests: deps + OIDC wiring =====", flush=True)
try:
dep_metas = {d: _load_meta(d) for d in declared}
deps_list = deps_mod.deploy_deps(
recipe, os.environ.get("PR", "0"), ref, declared, meta_for=dep_metas
)
# Enrich each dep entry with SSO creds (realm/client/secret) by setting up a
# keycloak realm per dep. The dict form is what setup_custom_tests.sh reads.
deps_state = _enrich_deps_with_sso(recipe, domain, deps_list)
deps_mod.write_run_state(deps_state)
# Run the per-recipe post-deps hook (jq-driven OIDC wiring + in-place redeploy)
_run_setup_custom_tests_hook(recipe, domain, depsfile)
except Exception as e: # noqa: BLE001 — setup failure is ISOLATED to dep-marked tests
deps_ready = False
deps_not_ready_reason = _scrub(str(e))[:300]
print(
f"!! setup_custom_tests failed (deps-not-ready): {deps_not_ready_reason}",
flush=True,
)
# ---- CUSTOM tier ----
if "custom" in stages:
# Pass deps-ready state via env; conftest.py skips @pytest.mark.requires_deps
# tests when CCCI_DEPS_READY=0.
os.environ["CCCI_DEPS_READY"] = "1" if deps_ready else "0"
os.environ["CCCI_DEPS_NOT_READY_REASON"] = deps_not_ready_reason
results["custom"] = run_custom(recipe, repo_local, domain)
else:
# install failed → the shared deployment is dead; remaining tiers cannot run on it.
@ -451,7 +543,13 @@ def main() -> int:
if deps_state:
print("\n===== DEPS teardown =====", flush=True)
try:
deps_mod.teardown_deps(deps_state)
# teardown_deps accepts a list of entries; flatten the dict-shape state back into
# declaration order here. The reverse-order sequencing in §1's contract is left to
# teardown_deps, as it was when it received the list from deploy_deps directly.
if isinstance(deps_state, dict):
list_for_teardown = [deps_state[d] for d in declared if d in deps_state]
else:
list_for_teardown = deps_state
deps_mod.teardown_deps(list_for_teardown)
except lifecycle.TeardownError as e:
dep_teardown_error = str(e)
print(f"!! {dep_teardown_error}", flush=True)
@ -466,13 +564,22 @@ def main() -> int:
os.remove(depsfile)
# ---- per-op summary (DG6 feed) ----
# Phase 2 Q2.3: deps each `deploy_app` once, so the expected count = 1 (recipe under test) +
# len(deps). DG4.1 still holds — no extra deploys per recipe — just accommodates declared deps.
expected_deploy_count = 1 + len(deps_state)
# SSO-dep plan §1: DG4.1 generalised — one `abra app new` per app in the run (recipe + each
# dep). In-place reconfigure-and-redeploy (the setup_custom_tests step's
# `abra app deploy --force --chaos`) is NOT a fresh `app_new` and does NOT increment the
# count. So expected = 1 + (number of deps that actually got deployed).
deps_deployed_count = len(deps_state) if isinstance(deps_state, dict) else len(deps_state or [])
expected_deploy_count = 1 + deps_deployed_count
print("\n===== RUN SUMMARY =====", flush=True)
print(f"deploy-count = {deploy_count} (expect {expected_deploy_count})")
if deps_state:
print(f" deps deployed: {[d['recipe'] for d in deps_state]}")
deps_list_for_summary = (
list(deps_state.keys()) if isinstance(deps_state, dict)
else [d.get("recipe", "?") for d in deps_state]
)
print(f" deps deployed: {deps_list_for_summary}")
if not deps_ready:
print(f" deps-not-ready: {deps_not_ready_reason}")
order = [s for s in ALL_STAGES if s in results]
for op in order:
print(f" {op:8s}: {results[op]}")