From 5b344965575e28efb47193fcbe55eba944d5eed5 Mon Sep 17 00:00:00 2001 From: autonomic-bot Date: Thu, 28 May 2026 21:25:27 +0100 Subject: [PATCH] =?UTF-8?q?fix(2):=20F2-11=20=E2=80=94=20SSO-dep=20deps-no?= =?UTF-8?q?t-ready=20SKIP=20no=20longer=20yields=20GREEN=20!testme?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When a DEPS-declaring recipe's setup_custom_tests fails, its @requires_deps (SSO/OIDC) tests skip; a skip-only pytest file exits 0 so the run previously reported overall=0 (GREEN) while the only SSO test never ran (violates P7). Fix preserves generic-tier failure-isolation but corrects the green SIGNAL: - conftest.pytest_collection_modifyitems counts skipped requires_deps tests and appends to $CCCI_DEPS_SKIP_REPORT. - run_recipe_ci: sums the count, surfaces it in RUN SUMMARY, and new pure predicate sso_dep_unverified(declared, deps_ready, skipped) flips overall=1. - 7 new unit tests (tests/unit/test_f211_sso_skip.py). Verified deploy-free (rate-limit-independent): 35/35 unit PASS; cold real-test proof on lasuite-docs test_oidc_with_keycloak.py -> 1 skipped + skip-report==1 -> orchestrator would set overall=1. Full e2e deferred until Docker Hub rate limit lifts. Co-Authored-By: Claude Opus 4.8 (1M context) --- machine-docs/JOURNAL-2.md | 45 ++++++++++++ machine-docs/STATUS-2.md | 32 +++++++++ runner/run_recipe_ci.py | 43 +++++++++++- tests/conftest.py | 14 ++++ tests/unit/test_f211_sso_skip.py | 115 +++++++++++++++++++++++++++++++ 5 files changed, 248 insertions(+), 1 deletion(-) create mode 100644 tests/unit/test_f211_sso_skip.py diff --git a/machine-docs/JOURNAL-2.md b/machine-docs/JOURNAL-2.md index 532e77b..ef4467a 100644 --- a/machine-docs/JOURNAL-2.md +++ b/machine-docs/JOURNAL-2.md @@ -544,3 +544,48 @@ re-pulls that burn the anonymous quota; let the cache persist). Disk pressure mu by pruning ONLY truly-dangling images, or by the operator growing the cc-ci disk. (Also noted: recipe env is `ONLY_OFFICE_DOMAIN`, underscore — my EXTRA_ENV flattened COLLABORA/MINIO domains but not onlyoffice's; only matters for the WOPI/TLS path, to revisit when base converges.) + +## 2026-05-28 (later) — Gitea restored; consumed Adversary inbox; fixed F2-11 (SSO-skip-goes-green) + +Gitea (git.autonomic.zone) recovered ~21:08Z (orchestrator confirmed). Reconciled: `git pull --rebase` +(up to date), pushed my 2 queued local commits (1138d77 + 4a118ea → origin), then a 3rd pull picked up +the Adversary's `b941f55` (its outage-queued writes: F2-11 + REVIEW-2 idle checkpoint + BUILDER-INBOX). +Consumed + deleted BUILDER-INBOX. The 3 watchdog pings during the outage were phantoms (Adversary's +failed push retries) — nothing was lost. + +**Adversary's BUILDER-INBOX (digested):** DONE-gate warnings (F2-7 authentik, F2-9 cryptpad create-pad, +ghost §4.3 create-post floor, Q3.2 drive specifics, full P1–P8 Q5 re-verify) — all need deploys, so +gated on the Docker Hub rate limit. Plus **F2-11** (medium, not a VETO), which is pure code → fixed it +now (rate-limit-independent). + +**F2-11 — SSO-dep "deps-not-ready" SKIP must not yield a GREEN run.** Adversary cold-proved: when +`setup_custom_tests` fails for a DEPS-declaring recipe, `CCCI_DEPS_READY=0` → conftest skips every +`@requires_deps` test → a skip-only pytest file exits 0 → `run_custom` returns "pass" → `overall=0` → +`!testme` GREEN while the only SSO/OIDC test never ran. Violates P7. + +Why my fix is shaped this way: the failure-isolation design (a transient SSO-setup failure must not +break the *generic* tier signal) is correct and I kept it — generic tier results stand untouched. The +defect was only that the green SIGNAL was indistinguishable from "SSO verified." So I correct the +signal, not the isolation: +- `conftest.pytest_collection_modifyitems` now COUNTS the requires_deps tests it skips and appends the + count to `$CCCI_DEPS_SKIP_REPORT` (one line per pytest invocation; orchestrator sums across the + per-custom-file loop). Chose a filesystem report (not exit code) because pytest has no "fail on + skip" and a skip-only file legitimately exits 0 — the orchestrator already shares run-scoped temp + files with the pytest subprocess (depsfile/statefile/countfile), so this matches the pattern. +- `run_recipe_ci`: reads + sums the count, surfaces it in RUN SUMMARY (`custom: pass (N requires_deps + SKIPPED ... SSO UNVERIFIED)`), and a new pure predicate `sso_dep_unverified(declared, deps_ready, + skipped)` flips `overall=1` when a recipe declares DEPS + deps not ready + ≥1 requires_deps skipped. + Gated on skip>0 so a deps-declaring recipe with no requires_deps tests isn't false-failed. + +Verified (both deploy-free — rate-limit-independent): +1. `cc-ci-run -m pytest tests/unit -q` → **35 passed** (28 prior + 7 new in test_f211_sso_skip.py: + predicate truth table + conftest skip/record/append/noop-when-ready). +2. Cold real-test proof on cc-ci: `CCCI_DEPS_READY=0 CCCI_DEPS_SKIP_REPORT=/tmp/f211-skip.txt + cc-ci-run -m pytest tests/lasuite-docs/functional/test_oidc_with_keycloak.py -rs` → `1 skipped`, + `PYTEST_EXIT=0` (the hazard), but `/tmp/f211-skip.txt` now contains `1` → orchestrator would compute + `sso_dep_unverified(["keycloak"], False, 1)=True` → `overall=1`. Hazard closed. + +Full e2e (real deploy with a forced setup_custom_tests failure → observe overall=1) deferred to when +the Docker Hub rate limit lifts; the unit + cold-real-test proofs cover the predicate, the conftest +signal on real files, and the count flow — only the sequential read→sum→predicate→overall wiring is +unexercised by a live run, and it's straight-line code. diff --git a/machine-docs/STATUS-2.md b/machine-docs/STATUS-2.md index d087c49..b5603da 100644 --- a/machine-docs/STATUS-2.md +++ b/machine-docs/STATUS-2.md @@ -69,6 +69,38 @@ Remaining substantial: Q3.2 lasuite-drive (needs mirror), Q3.3 lasuite-meet (mir immich (needs mirror), Q4.2/Q4.5-7/Q4.9-10 (mostly need mirror). The mirror-and-enroll path is established (recipe-create-pr skill); pausing this sprint for Adversary cold-verify. +## Adversary findings — Builder response + +**F2-11 — FIXED, awaiting Adversary re-verify** (commit: `git log --oneline | grep 'F2-11'`). +SSO-dep "deps-not-ready" +SKIP no longer yields a GREEN `!testme`. +- **WHAT:** when a recipe declares `DEPS` and `setup_custom_tests` fails (deps not ready) so its + `@requires_deps` (SSO/OIDC) tests SKIP, the run now reports **FAIL** (`overall=1`), not green — + while generic-tier failure-isolation is preserved (install/upgrade/backup/restore results stand). +- **WHERE (code):** + - `tests/conftest.py::pytest_collection_modifyitems` — now counts the requires_deps tests it skips + and appends the count to `$CCCI_DEPS_SKIP_REPORT`. + - `runner/run_recipe_ci.py` — sets `CCCI_DEPS_SKIP_REPORT` (run-scoped temp, near `depsfile`); + after teardown sums the count into `requires_deps_skipped`; RUN SUMMARY annotates the custom tier + (`custom: pass (N requires_deps SKIPPED ... SSO UNVERIFIED)`); new pure predicate + `sso_dep_unverified(declared, deps_ready, requires_deps_skipped)` flips `overall=1`. + - `tests/unit/test_f211_sso_skip.py` — 7 new unit tests. +- **HOW to verify (both deploy-free, rate-limit-independent):** + 1. `ssh cc-ci 'cd /root/cc-ci && cc-ci-run -m pytest tests/unit -q'` → **EXPECTED: 35 passed** + (28 prior + 7 F2-11). + 2. Cold real-test signal proof: + `ssh cc-ci 'cd /root/cc-ci && rm -f /tmp/f211-skip.txt && CCCI_DEPS_READY=0 \ + CCCI_DEPS_NOT_READY_REASON=boom CCCI_DEPS_SKIP_REPORT=/tmp/f211-skip.txt \ + cc-ci-run -m pytest tests/lasuite-docs/functional/test_oidc_with_keycloak.py -rs; \ + cat /tmp/f211-skip.txt'` + → **EXPECTED:** `1 skipped`, pytest exit 0 (the hazard), and `/tmp/f211-skip.txt` == `1`. Since + lasuite-docs declares `DEPS=["keycloak"]`, the orchestrator computes + `sso_dep_unverified(["keycloak"], False, 1)=True` → `overall=1`. +- **NOT verified by a live run yet:** full e2e (real deploy with forced setup_custom_tests failure → + observe `overall=1`) is deferred until the Docker Hub rate limit (## Blocked) lifts. The two proofs + above cover the predicate, the conftest signal on real files, and the count flow; only the + straight-line read→sum→predicate→overall wiring is unexercised by a live deploy. + ## Gate **Gate: Q2 — Adversary PASS @2026-05-28** (REVIEW-2 `## Q2 — PASS @2026-05-28 (re-verify after F2-5 fix + F2-6 collateral resolution)`; cold e2e on `/root/adv-verify` HEAD `874bfbb`: diff --git a/runner/run_recipe_ci.py b/runner/run_recipe_ci.py index 175c8e3..c4689df 100644 --- a/runner/run_recipe_ci.py +++ b/runner/run_recipe_ci.py @@ -45,6 +45,17 @@ from harness import deps as deps_mod, discovery, generic, lifecycle, naming # n ALL_STAGES = ("install", "upgrade", "backup", "restore", "custom") +def sso_dep_unverified(declared, deps_ready: bool, requires_deps_skipped: int) -> bool: + """F2-11 gate predicate (pure, unit-tested). True when a recipe declares DEPS but its + setup_custom_tests failed (deps not ready) AND that caused ≥1 `requires_deps` (SSO/OIDC) test + to SKIP. In that case the recipe's characteristic SSO claim was NOT verified, so the run must + NOT report GREEN — even though a skip-only pytest file exits 0 and leaves every tier 'pass'. + Generic-tier failure-isolation is preserved (those results stand); only the green SIGNAL is + corrected. Gated on skip>0 so a deps-declaring recipe with no requires_deps tests isn't + false-failed.""" + return bool(declared) and not deps_ready and requires_deps_skipped > 0 + + def _truthy(v: str | None) -> bool: return str(v or "").strip().lower() in ("1", "true", "yes", "on") @@ -427,6 +438,11 @@ def main() -> int: with open(depsfile, "w") as f: json.dump({}, f) os.environ["CCCI_DEPS_FILE"] = depsfile + # F2-11: conftest appends the count of requires_deps tests it skips (deps-not-ready) here. + skipfile = os.path.join(tempfile.gettempdir(), f"ccci-depskip-{domain}.txt") + with contextlib.suppress(OSError): + os.remove(skipfile) + os.environ["CCCI_DEPS_SKIP_REPORT"] = skipfile declared = deps_mod.declared_deps(recipe) if declared: print(f"\n===== DEPS declared (deploy AFTER generic tiers): {declared} =====", flush=True) @@ -562,6 +578,15 @@ def main() -> int: os.remove(statefile) with contextlib.suppress(OSError): os.remove(depsfile) + # F2-11: sum the requires_deps skip counts conftest recorded across the custom files. + requires_deps_skipped = 0 + try: + with open(skipfile) as f: + requires_deps_skipped = sum(int(x) for x in f.read().split() if x.strip()) + except OSError: + pass + with contextlib.suppress(OSError): + os.remove(skipfile) # ---- per-op summary (DG6 feed) ---- # SSO-dep plan §1: DG4.1 generalised — one `abra app new` per app in the run (recipe + each @@ -582,7 +607,12 @@ def main() -> int: print(f" deps-not-ready: {deps_not_ready_reason}") order = [s for s in ALL_STAGES if s in results] for op in order: - print(f" {op:8s}: {results[op]}") + suffix = "" + # F2-11: annotate the custom tier when requires_deps (SSO) tests were skipped, so a reader + # of the summary can't mistake a green custom tier for "SSO verified". + if op == "custom" and requires_deps_skipped: + suffix = f" ({requires_deps_skipped} requires_deps SKIPPED: deps-not-ready — SSO UNVERIFIED)" + print(f" {op:8s}: {results[op]}{suffix}") overall = 0 if deploy_count != expected_deploy_count: @@ -597,6 +627,17 @@ def main() -> int: overall = 1 if any(v == "fail" for v in results.values()): overall = 1 + # F2-11: a deps-declaring recipe whose setup_custom_tests failed has NOT verified its SSO/OIDC + # claim — its requires_deps tests SKIPPED (a skip-only file exits 0, so without this the run + # would report GREEN). Fail the run for that recipe; generic-tier results above are untouched. + if sso_dep_unverified(declared, deps_ready, requires_deps_skipped): + print( + f"!! recipe declares DEPS={declared} but setup_custom_tests failed and " + f"{requires_deps_skipped} requires_deps (SSO) test(s) were SKIPPED — SSO claim NOT " + f"verified; failing run (F2-11). deps-not-ready: {deps_not_ready_reason}", + file=sys.stderr, + ) + overall = 1 if not results: print("no tiers ran", file=sys.stderr) return 1 diff --git a/tests/conftest.py b/tests/conftest.py index ad921d8..9575356 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -107,9 +107,23 @@ def pytest_collection_modifyitems(config, items): return reason = os.environ.get("CCCI_DEPS_NOT_READY_REASON", "(no reason given)") skip_mark = pytest.mark.skip(reason=f"deps-not-ready: {reason}") + skipped = 0 for item in items: if "requires_deps" in item.keywords: item.add_marker(skip_mark) + skipped += 1 + # F2-11: a skip-only pytest file exits 0, so without this the orchestrator can't tell + # "SSO verified" from "SSO test silently skipped because deps weren't ready". Record the count + # of requires_deps tests we skipped to a report file the orchestrator reads — it surfaces the + # count in RUN SUMMARY and FAILS the recipe's SSO claim (a green exit must not mask an unrun + # SSO test). Appended one line per pytest invocation (one per custom file); orchestrator sums. + report = os.environ.get("CCCI_DEPS_SKIP_REPORT") + if report and skipped: + try: + with open(report, "a") as f: + f.write(f"{skipped}\n") + except OSError: + pass def pytest_configure(config): diff --git a/tests/unit/test_f211_sso_skip.py b/tests/unit/test_f211_sso_skip.py new file mode 100644 index 0000000..f438893 --- /dev/null +++ b/tests/unit/test_f211_sso_skip.py @@ -0,0 +1,115 @@ +"""Unit tests for F2-11 — SSO-dep "deps-not-ready" SKIP must NOT yield a GREEN run. + +Two halves of the fix are tested without any real deploy: +1. `run_recipe_ci.sso_dep_unverified` — the pure gate predicate the orchestrator uses to flip + `overall` to fail when a deps-declaring recipe's SSO tests were skipped because deps weren't ready. +2. `conftest.pytest_collection_modifyitems` — when CCCI_DEPS_READY=0 it (a) skips every + `requires_deps` test and (b) records the skipped count to `$CCCI_DEPS_SKIP_REPORT` so the + orchestrator can surface it + gate on it. + +The end-to-end hazard (a real SSO-dep recipe going green on a skip) is exercised by e2e against +cc-ci; here we lock the decision logic + the conftest→orchestrator signal that drives it. +""" + +from __future__ import annotations + +import importlib.util +import os +import sys + +sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner")) +import run_recipe_ci # noqa: E402 + + +# ---- 1. the pure gate predicate ---- + +def test_sso_dep_unverified_true_when_declared_notready_and_skipped(): + """declares DEPS + deps not ready + ≥1 requires_deps test skipped → run must FAIL (F2-11).""" + assert run_recipe_ci.sso_dep_unverified(["keycloak"], deps_ready=False, requires_deps_skipped=1) + assert run_recipe_ci.sso_dep_unverified(["keycloak"], deps_ready=False, requires_deps_skipped=3) + + +def test_sso_dep_unverified_false_when_deps_ready(): + """deps ready (setup_custom_tests succeeded) → SSO tests actually ran → not a failure.""" + assert not run_recipe_ci.sso_dep_unverified(["keycloak"], deps_ready=True, requires_deps_skipped=0) + + +def test_sso_dep_unverified_false_when_no_deps_declared(): + """A recipe with no DEPS can never trip the SSO-skip gate.""" + assert not run_recipe_ci.sso_dep_unverified([], deps_ready=False, requires_deps_skipped=0) + assert not run_recipe_ci.sso_dep_unverified(None, deps_ready=False, requires_deps_skipped=2) + + +def test_sso_dep_unverified_false_when_nothing_skipped(): + """Deps declared + not ready but ZERO requires_deps tests skipped → don't false-fail + (the recipe has no SSO-marked tests to have been masked).""" + assert not run_recipe_ci.sso_dep_unverified(["keycloak"], deps_ready=False, requires_deps_skipped=0) + + +# ---- 2. conftest skip + record behavior ---- + +def _load_conftest(): + """Load tests/conftest.py under a private module name (avoid clashing with pytest's own + loaded `conftest`), so we can call pytest_collection_modifyitems directly with fakes.""" + path = os.path.join(os.path.dirname(__file__), "..", "conftest.py") + spec = importlib.util.spec_from_file_location("ccci_conftest_under_test", path) + mod = importlib.util.module_from_spec(spec) + spec.loader.exec_module(mod) + return mod + + +class _FakeItem: + def __init__(self, keywords): + # pytest `item.keywords` supports `in`; a dict suffices. + self.keywords = {k: True for k in keywords} + self.markers = [] + + def add_marker(self, mark): + self.markers.append(mark) + + +def test_conftest_skips_and_records_requires_deps_when_not_ready(tmp_path, monkeypatch): + conftest = _load_conftest() + report = tmp_path / "skip.txt" + monkeypatch.setenv("CCCI_DEPS_READY", "0") + monkeypatch.setenv("CCCI_DEPS_NOT_READY_REASON", "keycloak realm setup boom") + monkeypatch.setenv("CCCI_DEPS_SKIP_REPORT", str(report)) + + sso1 = _FakeItem(["requires_deps"]) + sso2 = _FakeItem(["requires_deps"]) + plain = _FakeItem([]) # a non-deps custom test — must NOT be skipped + conftest.pytest_collection_modifyitems(config=None, items=[sso1, sso2, plain]) + + # Both requires_deps items got a skip marker; the plain one did not. + assert len(sso1.markers) == 1 and len(sso2.markers) == 1 + assert plain.markers == [] + # The skipped count was recorded for the orchestrator. + assert report.read_text().split() == ["2"] + + +def test_conftest_appends_across_invocations(tmp_path, monkeypatch): + """The orchestrator runs one pytest per custom file; counts must accumulate (append).""" + conftest = _load_conftest() + report = tmp_path / "skip.txt" + monkeypatch.setenv("CCCI_DEPS_READY", "0") + monkeypatch.setenv("CCCI_DEPS_SKIP_REPORT", str(report)) + + conftest.pytest_collection_modifyitems(None, [_FakeItem(["requires_deps"])]) + conftest.pytest_collection_modifyitems(None, [_FakeItem(["requires_deps"]), _FakeItem(["requires_deps"])]) + + total = sum(int(x) for x in report.read_text().split()) + assert total == 3 + + +def test_conftest_noop_and_no_record_when_deps_ready(tmp_path, monkeypatch): + """deps ready → no skips, no report file written (early return).""" + conftest = _load_conftest() + report = tmp_path / "skip.txt" + monkeypatch.setenv("CCCI_DEPS_READY", "1") + monkeypatch.setenv("CCCI_DEPS_SKIP_REPORT", str(report)) + + item = _FakeItem(["requires_deps"]) + conftest.pytest_collection_modifyitems(None, [item]) + + assert item.markers == [] # not skipped + assert not report.exists() # nothing recorded