review(2): Q2 PASS — F2-5 fix verified (verify=True teardown, leak gone); F2-6 collateral resolved; F2-7 stands as Q2.2/Q5 tracking

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-28 09:51:22 +01:00
parent 8021f19309
commit 116f7a9aa0
2 changed files with 71 additions and 3 deletions

View File

@ -97,7 +97,20 @@ Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase2-recipe-tests.md`
## Adversary findings
- [ ] **F2-5 [adversary] — Q2 dep teardown leak (gate-blocker)**
- [x] **F2-5 [adversary] — CLOSED @2026-05-28** by Builder commit `c6e94af`. `runner/harness/
deps.py::teardown_deps` now uses `lifecycle.teardown_app(verify=True)` so residuals raise
`TeardownError`; per-dep errors logged loudly (`!! dep <r> @ <d> teardown failed: ...`),
collected, and re-raised as a combined `TeardownError` after attempting all deps;
orchestrator's `finally` catches + reports in RUN SUMMARY + sets non-zero exit.
Adversary cold re-verify on `/root/adv-verify` @ HEAD `874bfbb`:
`RECIPE=lasuite-docs STAGES=install,custom cc-ci-run runner/run_recipe_ci.py` →
install + custom PASS, deploy-count=2 (parent + dep), `DEPS teardown` succeeded clean,
`docker stack ls | grep -iE "keyc|lasuite"` post-run → **empty** (no leftover stack/volume/
secret). The fix correctly enforces §9 teardown sacred. Original FAIL detail retained
below for audit.
**Original FAIL context:** `runner/harness/deps.py::teardown_deps` wrapped
`lifecycle.teardown_app(domain, verify=False)`
`runner/harness/deps.py::teardown_deps` wraps `lifecycle.teardown_app(domain, verify=False)`
in `contextlib.suppress(Exception)`, silently swallowing all teardown failures. The
`===== DEPS teardown =====` print fires even when the underlying undeploy raises. On cold
@ -132,7 +145,15 @@ Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase2-recipe-tests.md`
teardown silently failed; the runtime state on cc-ci right now demonstrates this.
- Filed by Adversary @2026-05-28.
- [ ] **F2-6 [adversary] — keycloak install cold flake** — Adversary cold first-attempt from
- [x] **F2-6 [adversary] — CLOSED @2026-05-28** collateral resolution from F2-5 fix. After
F2-5's silent-suppress was removed and the leaked `keyc-c12afe` stack cleared, cold
retest from `/root/adv-verify` @ HEAD `874bfbb`: `RECIPE=keycloak STAGES=install,custom
cc-ci-run runner/run_recipe_ci.py` → install + custom PASS on the first attempt;
deploy-count=1; teardown clean. Confirms the original 502 flake was aggravated by the
F2-5 leak holding node CPU (~82%) during readiness convergence. No standalone keycloak
flake remains. Original FAIL context retained below.
**Original FAIL context:** Adversary cold first-attempt from
`/root/adv-verify` @ HEAD `ad6b259`: `RECIPE=keycloak cc-ci-run runner/run_recipe_ci.py` →
install FAILED with `deploy/readiness failed: keyc-c1ffca.ci.commoninternet.net: not
healthy over HTTPS /realms/master (last status 502)`. Parent recipe (keyc-c1ffca) was

View File

@ -27,7 +27,54 @@ Phase 1e closed (commit `0fe1218` "DONE(1e)") with all HC1HC4 PASS, NO VETO.
started — no `STATUS-2.md` / `BACKLOG-2.md` / `JOURNAL-2.md` from the Builder yet. No CLAIMED gate
to verify. Entering self-paced idle (§7 case 3); will re-orient on Builder activity.
## Q2 — FAIL @2026-05-28 (dep teardown leak + cold install flake)
## Q2 — PASS @2026-05-28 (re-verify after F2-5 fix + F2-6 collateral resolution)
**Verdict: PASS.** Builder commit `c6e94af` ("F2-5 — dep teardown verify=True, errors propagate
to run-fail") closes F2-5; F2-6 collaterally resolved.
**Cold environment:** `/root/adv-verify` on cc-ci, hard-reset to `origin/main` HEAD `874bfbb`.
**Re-verify (Adversary, cold):**
- **lasuite-docs (Q2.4 acceptance) + keycloak dep** —
`RECIPE=lasuite-docs STAGES=install,custom cc-ci-run runner/run_recipe_ci.py`:
- install: generic `test_serving` PASS + cc-ci `test_serving_and_editor` PASS.
- custom: 3 PASS — `test_auth_required` + `test_lasuite_docs_returns_200` +
`test_oidc_password_grant_against_dep_keycloak`. The OIDC roundtrip exercises the full SSO
contract (realm/client/user setup → discovery → password grant → JWT iss/azp/typ/exp claims).
- deploy-count = **2** (expect 2: parent + 1 dep — DG4.1 honored for the new dep-aware count).
- `DEPS teardown` succeeded clean (no `!!` failure logs).
- **Post-run state:** `docker stack ls | grep -iE "keyc|lasuite"` → empty; volumes → empty;
secrets → empty. **No leak.** §9 teardown sacred enforced.
- **keycloak standalone** — `RECIPE=keycloak STAGES=install,custom`: install + custom PASS on
the first attempt; deploy-count=1; teardown clean. Confirms F2-6 was aggravated by F2-5's
resource leak (the leaked stack was at ~82% CPU during my earlier attempt); with the leak
gone, keycloak installs convergence in time.
- **Unit tests (28/28 PASS):** confirmed in earlier cold run; unchanged by this fix.
**F2-5 fix is correct:** `lifecycle.teardown_app(verify=True)` raises `TeardownError` on
residual containers/volumes/secrets; `teardown_deps` collects per-dep failures and re-raises a
combined error; orchestrator catches in `finally`, reports in RUN SUMMARY, exits non-zero. The
"DEPS teardown" line is now meaningful — if it prints without `!!` markers, the cleanup
actually succeeded.
**F2-7 (Q2.2 authentik / partial pluggability):** STANDS as open scope item — not a Q2 PASS
blocker (Q2.4 acceptance is met by keycloak alone; the harness's OIDC-flow primitives ARE
provider-agnostic). Authentik enrollment + a `setup_authentik_realm` backend remains required
work; tracked for Q5 catch-up so the "pluggable" framing is actually proven by a second
provider.
**Substantive PASS evidence reaffirmed from prior FAIL writeup:** Q2.1 keycloak content (parity
+ JWT password-grant + admin-API client CRUD), Q2.3 dep resolver (sequential deploys, reverse
teardown, per-run domain naming, deps_apps fixture), Q2.3 SSO harness (OIDC flow primitives
provider-agnostic, idempotent realm/client/user setup, secrets handled correctly), Q2.4
acceptance (dependent recipe + dep + full OIDC test in one run).
**No standing VETO.** Builder may advance to Q3 (already in flight per commit `874bfbb`
Q3.1 partial). F2-7 remains an open observation for Q2.2/Q5.
---
## Q2 — FAIL @2026-05-28 (dep teardown leak + cold install flake) — SUPERSEDED by PASS above
**Verdict: FAIL.** Three findings filed:
- **F2-5 (gate-blocker):** `runner/harness/deps.py::teardown_deps` silently suppresses ALL