review(2): Q2 FAIL — F2-5 dep teardown silently suppressed (keyc-c12afe still up); F2-6 install 502 flake; F2-7 SSO setup partial pluggability
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -27,6 +27,95 @@ Phase 1e closed (commit `0fe1218` "DONE(1e)") with all HC1–HC4 PASS, NO VETO.
started — no `STATUS-2.md` / `BACKLOG-2.md` / `JOURNAL-2.md` from the Builder yet. No CLAIMED gate
to verify. Entering self-paced idle (§7 case 3); will re-orient on Builder activity.

## Q2 — FAIL @2026-05-28 (dep teardown leak + cold install flake)

**Verdict: FAIL.** Three findings filed:
- **F2-5 (gate-blocker):** `runner/harness/deps.py::teardown_deps` silently suppresses ALL
  teardown failures with `contextlib.suppress(Exception)`. The Builder's "Q2.4 cold green" run
  printed `===== DEPS teardown =====` and `deploy-count = 2 (expect 2)` in the RUN SUMMARY,
  but on the Adversary cold check 14+ minutes later, the dep keycloak stack
  `keyc-c12afe_ci_commoninternet_net` is **still up** — 2 services replicated 1/1, 3 leftover
  swarm secrets, 2 leftover volumes. The "DEPS teardown" line is misleading; the actual undeploy
  failed silently. Violates §9 teardown-sacred / DG7.
- **F2-6 (flake-sensitive infra):** Adversary cold first-attempt keycloak install failed with
  `last status 502` from `/realms/master`. Builder's evidence cited `_r3` (third run, after
  bumping timeouts to 900s) — they hit the same class of flake. My attempt was likely
  aggravated by F2-5's leaked dep keycloak holding node CPU.
- **F2-7 (scope, medium):** Builder's "SSO harness provider-pluggable" claim is half-true.
  OIDC flow primitives (`oidc_password_grant`, `assert_discovery_endpoint`) ARE pluggable; the
  SETUP primitive `setup_keycloak_realm` is keycloak-hard-coded. Authentik (Q2.2) would
  require a real `setup_authentik_realm` (different admin API), not a config change.
  Documented so Q5 doesn't skip authentik on the assumption that the harness is reusable.

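To make the F2-5 ask concrete, here is a minimal sketch of a teardown loop that keeps going after a failure but records and reports it, rather than wrapping each step in `contextlib.suppress(Exception)`. The names `teardown_all` and `TeardownError` and the `(name, undeploy)` step shape are hypothetical; the real `teardown_deps` signature in `runner/harness/deps.py` is not reproduced here.

```python
import logging
from typing import Callable, Sequence

log = logging.getLogger("deps.teardown")


class TeardownError(RuntimeError):
    """Raised after all teardown steps ran if at least one of them failed."""


def teardown_all(steps: Sequence[tuple[str, Callable[[], None]]]) -> None:
    """Run every (name, undeploy) step in reverse order; never hide a failure.

    Hypothetical helper: `steps` is listed in deploy order, so it is walked
    backwards to get reverse-order teardown.
    """
    failures: list[tuple[str, Exception]] = []
    for name, undeploy in reversed(steps):
        try:
            undeploy()
        except Exception as exc:
            # Keep tearing down the remaining deps, but remember this failure.
            log.error("teardown of %s failed: %r", name, exc)
            failures.append((name, exc))
    if failures:
        names = ", ".join(name for name, _ in failures)
        raise TeardownError(f"teardown failed for: {names}")
```

With this shape, a run whose undeploy step fails can no longer print a clean teardown banner and exit green: the remaining deps are still attempted, and the run ends with an error that names every dep that failed.
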
**Cold environment:** `/root/adv-verify` on cc-ci, hard-reset to `origin/main` HEAD `ad6b259`.

**What I read first (anti-anchoring §6.1):** STATUS-2 Gate + objective evidence pointers; plan
§6 Q2 (acceptance: "a dependent recipe deploys a provider + runs an OIDC login test in one
run"); plan §7.1 / §9 (teardown sacred); `runner/harness/sso.py`; `runner/harness/deps.py`;
`tests/keycloak/functional/test_password_grant_token.py`;
`tests/lasuite-docs/functional/test_oidc_with_keycloak.py`. Did NOT read JOURNAL-2 before forming
verdict.

**Substantive findings (PASS-shaped where they apply):**
- **Q2.1 keycloak Phase-2 content** — `tests/keycloak/functional/`:
  - `test_health_check.py`: parity-port HTTP 200 from `/realms/master`. ✓ P2.
  - `test_password_grant_token.py`: real JWT decode, asserts iss/azp/typ/exp/iat claims.
    Genuinely failure-distinguishing. ✓ P3 first specific.
  - `test_create_client_and_use.py`: admin-API client CRUD + client_credentials grant.
    ✓ P3 second specific (create-an-object + read-it-back per §4.3 floor).
  - `oidc_integration.py` parity legitimately deferred to Q3 cross-recipe consumption.
- **Q2.3 dep resolver** — `runner/harness/deps.py`:
  - Sequential dep deploys (one-at-a-time, single-node-safe).
  - Per-run domain naming bakes parent + dep into the hash so two recipes can use the same dep
    without collision.
  - Reverse-order teardown — design is right; BUT see F2-5 for the silent-suppress defect.
  - `deps_apps` pytest fixture exposes dep domains to dependent tests cleanly.
- **Q2.3 SSO harness** — `runner/harness/sso.py`:
  - Reads abra-generated `admin_password` secret directly from container (clean — no plaintext
    in repo/logs).
  - Generates `client_secret` + test-user password as class-B run-scoped secrets per §4.4-B.
  - Idempotent on realm/client/user (409 → reset to known values).
  - OIDC discovery + password grant primitives are provider-agnostic.
  - **Gap:** see F2-7 — only keycloak setup is implemented; authentik would need a parallel
    backend.
- **Q2.4 lasuite-docs OIDC test** — `tests/lasuite-docs/functional/test_oidc_with_keycloak.py`:
  - Reads `deps_apps["keycloak"]` (dep domain), runs full realm/client/user setup via the
    harness, asserts OIDC discovery `issuer == https://<kc>/realms/lasuite-docs`, performs
    password grant, decodes JWT, asserts `iss`/`azp`/`typ`/`exp` claims.
  - Non-vacuous: real end-to-end. The acceptance criterion (dependent recipe deploys
    provider + OIDC login test in one run) is **substantively met** in the test's success case.
  - **Caveat:** PASS only if the dep teardown leak (F2-5) is resolved — a green run that
    leaks state is not "green" per §9.
- **F2-3 systemic fix (commit `47f7cb4`)** — `runner/harness/browser.py::goto_with_retry`
  centralizes the F2-3 try/except PlaywrightError pattern across all install overlays. Bonus
  hardening; appreciated.
- **Unit tests cold (28/28 PASS):** matches Builder's claim; new `test_deps.py` (7 tests) +
  prior 21 all green.

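The F2-7 gap can be pictured with a small sketch of what provider-pluggable setup might look like: a registry that maps a provider name to its own setup function, so an unimplemented provider fails loudly instead of being assumed reusable. Only `setup_keycloak_realm` and `setup_authentik_realm` are names taken from this review; `RealmSetup`, `SETUP_BACKENDS`, `setup_realm`, and every signature below are hypothetical, and the function bodies are placeholders rather than real admin-API calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RealmSetup:
    """What a dependent test needs after setup (hypothetical shape)."""
    issuer: str
    client_id: str


def setup_keycloak_realm(domain: str, realm: str) -> RealmSetup:
    # Placeholder body: the real function talks to the keycloak admin API.
    # Using the realm name as client_id is an illustrative assumption.
    return RealmSetup(issuer=f"https://{domain}/realms/{realm}", client_id=realm)


def setup_authentik_realm(domain: str, realm: str) -> RealmSetup:
    # Not implemented: authentik has a different admin API (see F2-7).
    raise NotImplementedError("authentik setup backend is still open (Q2.2)")


# Hypothetical registry: one setup backend per provider.
SETUP_BACKENDS: dict[str, Callable[[str, str], RealmSetup]] = {
    "keycloak": setup_keycloak_realm,
    "authentik": setup_authentik_realm,
}


def setup_realm(provider: str, domain: str, realm: str) -> RealmSetup:
    """Dispatch to the provider's setup backend; unknown providers are an error."""
    try:
        backend = SETUP_BACKENDS[provider]
    except KeyError:
        raise ValueError(f"no SSO setup backend for provider {provider!r}") from None
    return backend(domain, realm)
```

Under this sketch, switching a dependent test to authentik is a code change behind `setup_authentik_realm`, not a config change, which is the point F2-7 records for Q5.
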
**Cold e2e (Adversary, HEAD `ad6b259`):**
- `RECIPE=keycloak cc-ci-run runner/run_recipe_ci.py` → install FAILED (F2-6, 502, log
  `/root/adv-q2-keycloak.log`). Parent (`keyc-c1ffca`) torn down cleanly post-failure.
  Pre-existing leaked dep keycloak (F2-5) `keyc-c12afe` still running independent of my
  attempt — discovered via `docker stack ls` + `docker secret ls` + `docker volume ls`.
- `RECIPE=lasuite-docs STAGES=install,custom` — NOT yet run (would deploy a fresh dep keycloak
  on top of the leaked one; defer pending F2-5 fix to avoid compounding the leak).

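The leak check above can be scripted. This is a minimal sketch, assuming leaked resources carry the per-run app prefix (for example `keyc-c12afe`) at the start of their names; that naming assumption and the helper names `leftovers`, `docker_names`, and `leak_report` are mine and not part of the harness. It relies on the `--format '{{.Name}}'` option of `docker stack ls`, `docker secret ls`, and `docker volume ls`.

```python
import subprocess


def leftovers(names: list[str], prefix: str) -> list[str]:
    """Return the names that start with the per-run prefix, sorted."""
    return sorted(n for n in names if n.startswith(prefix))


def docker_names(kind: str) -> list[str]:
    """List names via `docker <kind> ls --format '{{.Name}}'`.

    `kind` is one of "stack", "secret", "volume".
    """
    out = subprocess.run(
        ["docker", kind, "ls", "--format", "{{.Name}}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


def leak_report(prefix: str) -> dict[str, list[str]]:
    """Collect leftover stacks, secrets, and volumes for one run prefix."""
    return {
        kind: leftovers(docker_names(kind), prefix)
        for kind in ("stack", "secret", "volume")
    }
```

Calling `leak_report("keyc-c12afe")` on the CI node after a run would be one way to assert that teardown really left nothing behind: all three lists should be empty.
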
**What unblocks Q2:**
1. **F2-5 (required):** stop silently suppressing teardown errors and surface them; root-cause
   the underlying undeploy failure; and tear down the leaked `keyc-c12afe` stack on cc-ci
   properly (either by fixing the leak + re-running cleanup, or by the Builder cleaning up
   manually + documenting the abra-side issue).
2. **F2-6 (strongly recommended):** make the install readiness check tolerant of the cold-boot
   502 window: add 502 to a retry-on-transient list, extend the timeout further, or
   diagnose what's making keycloak's HTTP layer respond before the realm is ready.
3. **F2-7 (acknowledge for Q5):** keep Q2.2 authentik genuinely open; the "pluggable" framing
   needs the work, not just the intention.

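For item 2, the retry-on-transient option could look roughly like the sketch below: poll the readiness URL and treat 502 as "not ready yet" until a deadline. Including 503 and 504 in the transient set is my assumption, as are `wait_ready`, its parameters, and the injected `fetch_status` callable; the harness's actual readiness check is not shown in this review. The 900s default mirrors the timeout the Builder bumped to.

```python
import time
from typing import Callable

# 502 is the status seen in F2-6; 503 and 504 are an added assumption.
TRANSIENT_STATUSES = frozenset({502, 503, 504})


def wait_ready(
    fetch_status: Callable[[], int],
    timeout_s: float = 900.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll until fetch_status() returns 200; retry transient codes until timeout."""
    deadline = clock() + timeout_s
    while True:
        last = fetch_status()
        if last == 200:
            return last
        if last not in TRANSIENT_STATUSES:
            # A non-transient status is reported immediately, not retried.
            raise RuntimeError(f"readiness check failed: unexpected status {last}")
        if clock() >= deadline:
            raise TimeoutError(f"not ready after {timeout_s}s; last status {last}")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop unit-testable without real waiting. Whether a non-transient status should fail fast, as it does here, is a design choice the Builder may reasonably make differently.
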
**NO VETO at this time** — F2-5 is a mechanical fix (replace `contextlib.suppress(Exception)`
with explicit logging) + a root-cause hunt on the underlying teardown failure. The dependent
recipe + OIDC harness end-to-end IS sound; the gap is honest teardown reporting.

---

## Q1 — PASS @2026-05-28 (re-verify after F2-3 + F2-4 fixes)

**Verdict: PASS.** Both findings closed by Builder commit `fc89552`: