fix(2): F2-11 — SSO-dep deps-not-ready SKIP no longer yields GREEN !testme

When a DEPS-declaring recipe's setup_custom_tests fails, its @requires_deps (SSO/OIDC) tests skip; a skip-only pytest file exits 0 so the run previously reported overall=0 (GREEN) while the only SSO test never ran (violates P7). Fix preserves generic-tier failure-isolation but corrects the green SIGNAL: - conftest.pytest_collection_modifyitems counts skipped requires_deps tests and appends to $CCCI_DEPS_SKIP_REPORT. - run_recipe_ci: sums the count, surfaces it in RUN SUMMARY, and new pure predicate sso_dep_unverified(declared, deps_ready, skipped) flips overall=1. - 7 new unit tests (tests/unit/test_f211_sso_skip.py). Verified deploy-free (rate-limit-independent): 35/35 unit PASS; cold real-test proof on lasuite-docs test_oidc_with_keycloak.py -> 1 skipped + skip-report==1 -> orchestrator would set overall=1. Full e2e deferred until Docker Hub rate limit lifts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-28 21:25:27 +01:00
parent 10d2a13031
commit 5b34496557
5 changed files with 248 additions and 1 deletions
--- a/machine-docs/JOURNAL-2.md
+++ b/machine-docs/JOURNAL-2.md
@ -544,3 +544,48 @@ re-pulls that burn the anonymous quota; let the cache persist). Disk pressure mu
 by pruning ONLY truly-dangling images, or by the operator growing the cc-ci disk.
 (Also noted: recipe env is `ONLY_OFFICE_DOMAIN`, underscore — my EXTRA_ENV flattened COLLABORA/MINIO
 domains but not onlyoffice's; only matters for the WOPI/TLS path, to revisit when base converges.)
+
+## 2026-05-28 (later) — Gitea restored; consumed Adversary inbox; fixed F2-11 (SSO-skip-goes-green)
+
+Gitea (git.autonomic.zone) recovered ~21:08Z (orchestrator confirmed). Reconciled: `git pull --rebase`
+(up to date), pushed my 2 queued local commits (1138d77 + 4a118ea → origin), then a 3rd pull picked up
+the Adversary's `b941f55` (its outage-queued writes: F2-11 + REVIEW-2 idle checkpoint + BUILDER-INBOX).
+Consumed + deleted BUILDER-INBOX. The 3 watchdog pings during the outage were phantoms (Adversary's
+failed push retries) — nothing was lost.
+
+**Adversary's BUILDER-INBOX (digested):** DONE-gate warnings (F2-7 authentik, F2-9 cryptpad create-pad,
+ghost §4.3 create-post floor, Q3.2 drive specifics, full P1–P8 Q5 re-verify) — all need deploys, so
+gated on the Docker Hub rate limit. Plus **F2-11** (medium, not a VETO), which is pure code → fixed it
+now (rate-limit-independent).
+
+**F2-11 — SSO-dep "deps-not-ready" SKIP must not yield a GREEN run.** Adversary cold-proved: when
+`setup_custom_tests` fails for a DEPS-declaring recipe, `CCCI_DEPS_READY=0` → conftest skips every
+`@requires_deps` test → a skip-only pytest file exits 0 → `run_custom` returns "pass" → `overall=0` →
+`!testme` GREEN while the only SSO/OIDC test never ran. Violates P7.
+
+Why my fix is shaped this way: the failure-isolation design (a transient SSO-setup failure must not
+break the *generic* tier signal) is correct and I kept it — generic tier results stand untouched. The
+defect was only that the green SIGNAL was indistinguishable from "SSO verified." So I correct the
+signal, not the isolation:
+- `conftest.pytest_collection_modifyitems` now COUNTS the requires_deps tests it skips and appends the
+  count to `$CCCI_DEPS_SKIP_REPORT` (one line per pytest invocation; orchestrator sums across the
+  per-custom-file loop). Chose a filesystem report (not exit code) because pytest has no "fail on
+  skip" and a skip-only file legitimately exits 0 — the orchestrator already shares run-scoped temp
+  files with the pytest subprocess (depsfile/statefile/countfile), so this matches the pattern.
+- `run_recipe_ci`: reads + sums the count, surfaces it in RUN SUMMARY (`custom: pass (N requires_deps
+  SKIPPED ... SSO UNVERIFIED)`), and a new pure predicate `sso_dep_unverified(declared, deps_ready,
+  skipped)` flips `overall=1` when a recipe declares DEPS + deps not ready + ≥1 requires_deps skipped.
+  Gated on skip>0 so a deps-declaring recipe with no requires_deps tests isn't false-failed.
+
+Verified (both deploy-free — rate-limit-independent):
+1. `cc-ci-run -m pytest tests/unit -q` → **35 passed** (28 prior + 7 new in test_f211_sso_skip.py:
+   predicate truth table + conftest skip/record/append/noop-when-ready).
+2. Cold real-test proof on cc-ci: `CCCI_DEPS_READY=0 CCCI_DEPS_SKIP_REPORT=/tmp/f211-skip.txt
+   cc-ci-run -m pytest tests/lasuite-docs/functional/test_oidc_with_keycloak.py -rs` → `1 skipped`,
+   `PYTEST_EXIT=0` (the hazard), but `/tmp/f211-skip.txt` now contains `1` → orchestrator would compute
+   `sso_dep_unverified(["keycloak"], False, 1)=True` → `overall=1`. Hazard closed.
+
+Full e2e (real deploy with a forced setup_custom_tests failure → observe overall=1) deferred to when
+the Docker Hub rate limit lifts; the unit + cold-real-test proofs cover the predicate, the conftest
+signal on real files, and the count flow — only the sequential read→sum→predicate→overall wiring is
+unexercised by a live run, and it's straight-line code.
--- a/machine-docs/STATUS-2.md
+++ b/machine-docs/STATUS-2.md
@ -69,6 +69,38 @@ Remaining substantial: Q3.2 lasuite-drive (needs mirror), Q3.3 lasuite-meet (mir
 immich (needs mirror), Q4.2/Q4.5-7/Q4.9-10 (mostly need mirror). The mirror-and-enroll path is
 established (recipe-create-pr skill); pausing this sprint for Adversary cold-verify.

+## Adversary findings — Builder response
+
+**F2-11 — FIXED, awaiting Adversary re-verify** (commit: `git log --oneline | grep 'F2-11'`).
+SSO-dep "deps-not-ready"
+SKIP no longer yields a GREEN `!testme`.
+- **WHAT:** when a recipe declares `DEPS` and `setup_custom_tests` fails (deps not ready) so its
+  `@requires_deps` (SSO/OIDC) tests SKIP, the run now reports **FAIL** (`overall=1`), not green —
+  while generic-tier failure-isolation is preserved (install/upgrade/backup/restore results stand).
+- **WHERE (code):**
+  - `tests/conftest.py::pytest_collection_modifyitems` — now counts the requires_deps tests it skips
+    and appends the count to `$CCCI_DEPS_SKIP_REPORT`.
+  - `runner/run_recipe_ci.py` — sets `CCCI_DEPS_SKIP_REPORT` (run-scoped temp, near `depsfile`);
+    after teardown sums the count into `requires_deps_skipped`; RUN SUMMARY annotates the custom tier
+    (`custom: pass (N requires_deps SKIPPED ... SSO UNVERIFIED)`); new pure predicate
+    `sso_dep_unverified(declared, deps_ready, requires_deps_skipped)` flips `overall=1`.
+  - `tests/unit/test_f211_sso_skip.py` — 7 new unit tests.
+- **HOW to verify (both deploy-free, rate-limit-independent):**
+  1. `ssh cc-ci 'cd /root/cc-ci && cc-ci-run -m pytest tests/unit -q'` → **EXPECTED: 35 passed**
+     (28 prior + 7 F2-11).
+  2. Cold real-test signal proof:
+     `ssh cc-ci 'cd /root/cc-ci && rm -f /tmp/f211-skip.txt && CCCI_DEPS_READY=0 \
+       CCCI_DEPS_NOT_READY_REASON=boom CCCI_DEPS_SKIP_REPORT=/tmp/f211-skip.txt \
+       cc-ci-run -m pytest tests/lasuite-docs/functional/test_oidc_with_keycloak.py -rs; \
+       cat /tmp/f211-skip.txt'`
+     → **EXPECTED:** `1 skipped`, pytest exit 0 (the hazard), and `/tmp/f211-skip.txt` == `1`. Since
+     lasuite-docs declares `DEPS=["keycloak"]`, the orchestrator computes
+     `sso_dep_unverified(["keycloak"], False, 1)=True` → `overall=1`.
+- **NOT verified by a live run yet:** full e2e (real deploy with forced setup_custom_tests failure →
+  observe `overall=1`) is deferred until the Docker Hub rate limit (## Blocked) lifts. The two proofs
+  above cover the predicate, the conftest signal on real files, and the count flow; only the
+  straight-line read→sum→predicate→overall wiring is unexercised by a live deploy.
+
 ## Gate
 **Gate: Q2 — Adversary PASS @2026-05-28** (REVIEW-2 `## Q2 — PASS @2026-05-28 (re-verify after
 F2-5 fix + F2-6 collateral resolution)`; cold e2e on `/root/adv-verify` HEAD `874bfbb`:
--- a/runner/run_recipe_ci.py
+++ b/runner/run_recipe_ci.py
@ -45,6 +45,17 @@ from harness import deps as deps_mod, discovery, generic, lifecycle, naming  # n
 ALL_STAGES = ("install", "upgrade", "backup", "restore", "custom")


+def sso_dep_unverified(declared, deps_ready: bool, requires_deps_skipped: int) -> bool:
+    """F2-11 gate predicate (pure, unit-tested). True when a recipe declares DEPS but its
+    setup_custom_tests failed (deps not ready) AND that caused ≥1 `requires_deps` (SSO/OIDC) test
+    to SKIP. In that case the recipe's characteristic SSO claim was NOT verified, so the run must
+    NOT report GREEN — even though a skip-only pytest file exits 0 and leaves every tier 'pass'.
+    Generic-tier failure-isolation is preserved (those results stand); only the green SIGNAL is
+    corrected. Gated on skip>0 so a deps-declaring recipe with no requires_deps tests isn't
+    false-failed."""
+    return bool(declared) and not deps_ready and requires_deps_skipped > 0
+
+
 def _truthy(v: str | None) -> bool:
    return str(v or "").strip().lower() in ("1", "true", "yes", "on")

@ -427,6 +438,11 @@ def main() -> int:
    with open(depsfile, "w") as f:
        json.dump({}, f)
    os.environ["CCCI_DEPS_FILE"] = depsfile
+    # F2-11: conftest appends the count of requires_deps tests it skips (deps-not-ready) here.
+    skipfile = os.path.join(tempfile.gettempdir(), f"ccci-depskip-{domain}.txt")
+    with contextlib.suppress(OSError):
+        os.remove(skipfile)
+    os.environ["CCCI_DEPS_SKIP_REPORT"] = skipfile
    declared = deps_mod.declared_deps(recipe)
    if declared:
        print(f"\n===== DEPS declared (deploy AFTER generic tiers): {declared} =====", flush=True)
@ -562,6 +578,15 @@ def main() -> int:
        os.remove(statefile)
    with contextlib.suppress(OSError):
        os.remove(depsfile)
+    # F2-11: sum the requires_deps skip counts conftest recorded across the custom files.
+    requires_deps_skipped = 0
+    try:
+        with open(skipfile) as f:
+            requires_deps_skipped = sum(int(x) for x in f.read().split() if x.strip())
+    except OSError:
+        pass
+    with contextlib.suppress(OSError):
+        os.remove(skipfile)

    # ---- per-op summary (DG6 feed) ----
    # SSO-dep plan §1: DG4.1 generalised — one `abra app new` per app in the run (recipe + each
@ -582,7 +607,12 @@ def main() -> int:
            print(f"  deps-not-ready: {deps_not_ready_reason}")
    order = [s for s in ALL_STAGES if s in results]
    for op in order:
-        print(f"  {op:8s}: {results[op]}")
+        suffix = ""
+        # F2-11: annotate the custom tier when requires_deps (SSO) tests were skipped, so a reader
+        # of the summary can't mistake a green custom tier for "SSO verified".
+        if op == "custom" and requires_deps_skipped:
+            suffix = f"  ({requires_deps_skipped} requires_deps SKIPPED: deps-not-ready — SSO UNVERIFIED)"
+        print(f"  {op:8s}: {results[op]}{suffix}")

    overall = 0
    if deploy_count != expected_deploy_count:
@ -597,6 +627,17 @@ def main() -> int:
        overall = 1
    if any(v == "fail" for v in results.values()):
        overall = 1
+    # F2-11: a deps-declaring recipe whose setup_custom_tests failed has NOT verified its SSO/OIDC
+    # claim — its requires_deps tests SKIPPED (a skip-only file exits 0, so without this the run
+    # would report GREEN). Fail the run for that recipe; generic-tier results above are untouched.
+    if sso_dep_unverified(declared, deps_ready, requires_deps_skipped):
+        print(
+            f"!! recipe declares DEPS={declared} but setup_custom_tests failed and "
+            f"{requires_deps_skipped} requires_deps (SSO) test(s) were SKIPPED — SSO claim NOT "
+            f"verified; failing run (F2-11). deps-not-ready: {deps_not_ready_reason}",
+            file=sys.stderr,
+        )
+        overall = 1
    if not results:
        print("no tiers ran", file=sys.stderr)
        return 1
--- a/tests/conftest.py
+++ b/tests/conftest.py
@ -107,9 +107,23 @@ def pytest_collection_modifyitems(config, items):
        return
    reason = os.environ.get("CCCI_DEPS_NOT_READY_REASON", "(no reason given)")
    skip_mark = pytest.mark.skip(reason=f"deps-not-ready: {reason}")
+    skipped = 0
    for item in items:
        if "requires_deps" in item.keywords:
            item.add_marker(skip_mark)
+            skipped += 1
+    # F2-11: a skip-only pytest file exits 0, so without this the orchestrator can't tell
+    # "SSO verified" from "SSO test silently skipped because deps weren't ready". Record the count
+    # of requires_deps tests we skipped to a report file the orchestrator reads — it surfaces the
+    # count in RUN SUMMARY and FAILS the recipe's SSO claim (a green exit must not mask an unrun
+    # SSO test). Appended one line per pytest invocation (one per custom file); orchestrator sums.
+    report = os.environ.get("CCCI_DEPS_SKIP_REPORT")
+    if report and skipped:
+        try:
+            with open(report, "a") as f:
+                f.write(f"{skipped}\n")
+        except OSError:
+            pass


 def pytest_configure(config):
--- a/tests/unit/test_f211_sso_skip.py
+++ b/tests/unit/test_f211_sso_skip.py
@ -0,0 +1,115 @@
+"""Unit tests for F2-11 — SSO-dep "deps-not-ready" SKIP must NOT yield a GREEN run.
+
+Two halves of the fix are tested without any real deploy:
+1. `run_recipe_ci.sso_dep_unverified` — the pure gate predicate the orchestrator uses to flip
+   `overall` to fail when a deps-declaring recipe's SSO tests were skipped because deps weren't ready.
+2. `conftest.pytest_collection_modifyitems` — when CCCI_DEPS_READY=0 it (a) skips every
+   `requires_deps` test and (b) records the skipped count to `$CCCI_DEPS_SKIP_REPORT` so the
+   orchestrator can surface it + gate on it.
+
+The end-to-end hazard (a real SSO-dep recipe going green on a skip) is exercised by e2e against
+cc-ci; here we lock the decision logic + the conftest→orchestrator signal that drives it.
+"""
+
+from __future__ import annotations
+
+import importlib.util
+import os
+import sys
+
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
+import run_recipe_ci  # noqa: E402
+
+
+# ---- 1. the pure gate predicate ----
+
+def test_sso_dep_unverified_true_when_declared_notready_and_skipped():
+    """declares DEPS + deps not ready + ≥1 requires_deps test skipped → run must FAIL (F2-11)."""
+    assert run_recipe_ci.sso_dep_unverified(["keycloak"], deps_ready=False, requires_deps_skipped=1)
+    assert run_recipe_ci.sso_dep_unverified(["keycloak"], deps_ready=False, requires_deps_skipped=3)
+
+
+def test_sso_dep_unverified_false_when_deps_ready():
+    """deps ready (setup_custom_tests succeeded) → SSO tests actually ran → not a failure."""
+    assert not run_recipe_ci.sso_dep_unverified(["keycloak"], deps_ready=True, requires_deps_skipped=0)
+
+
+def test_sso_dep_unverified_false_when_no_deps_declared():
+    """A recipe with no DEPS can never trip the SSO-skip gate."""
+    assert not run_recipe_ci.sso_dep_unverified([], deps_ready=False, requires_deps_skipped=0)
+    assert not run_recipe_ci.sso_dep_unverified(None, deps_ready=False, requires_deps_skipped=2)
+
+
+def test_sso_dep_unverified_false_when_nothing_skipped():
+    """Deps declared + not ready but ZERO requires_deps tests skipped → don't false-fail
+    (the recipe has no SSO-marked tests to have been masked)."""
+    assert not run_recipe_ci.sso_dep_unverified(["keycloak"], deps_ready=False, requires_deps_skipped=0)
+
+
+# ---- 2. conftest skip + record behavior ----
+
+def _load_conftest():
+    """Load tests/conftest.py under a private module name (avoid clashing with pytest's own
+    loaded `conftest`), so we can call pytest_collection_modifyitems directly with fakes."""
+    path = os.path.join(os.path.dirname(__file__), "..", "conftest.py")
+    spec = importlib.util.spec_from_file_location("ccci_conftest_under_test", path)
+    mod = importlib.util.module_from_spec(spec)
+    spec.loader.exec_module(mod)
+    return mod
+
+
+class _FakeItem:
+    def __init__(self, keywords):
+        # pytest `item.keywords` supports `in`; a dict suffices.
+        self.keywords = {k: True for k in keywords}
+        self.markers = []
+
+    def add_marker(self, mark):
+        self.markers.append(mark)
+
+
+def test_conftest_skips_and_records_requires_deps_when_not_ready(tmp_path, monkeypatch):
+    conftest = _load_conftest()
+    report = tmp_path / "skip.txt"
+    monkeypatch.setenv("CCCI_DEPS_READY", "0")
+    monkeypatch.setenv("CCCI_DEPS_NOT_READY_REASON", "keycloak realm setup boom")
+    monkeypatch.setenv("CCCI_DEPS_SKIP_REPORT", str(report))
+
+    sso1 = _FakeItem(["requires_deps"])
+    sso2 = _FakeItem(["requires_deps"])
+    plain = _FakeItem([])  # a non-deps custom test — must NOT be skipped
+    conftest.pytest_collection_modifyitems(config=None, items=[sso1, sso2, plain])
+
+    # Both requires_deps items got a skip marker; the plain one did not.
+    assert len(sso1.markers) == 1 and len(sso2.markers) == 1
+    assert plain.markers == []
+    # The skipped count was recorded for the orchestrator.
+    assert report.read_text().split() == ["2"]
+
+
+def test_conftest_appends_across_invocations(tmp_path, monkeypatch):
+    """The orchestrator runs one pytest per custom file; counts must accumulate (append)."""
+    conftest = _load_conftest()
+    report = tmp_path / "skip.txt"
+    monkeypatch.setenv("CCCI_DEPS_READY", "0")
+    monkeypatch.setenv("CCCI_DEPS_SKIP_REPORT", str(report))
+
+    conftest.pytest_collection_modifyitems(None, [_FakeItem(["requires_deps"])])
+    conftest.pytest_collection_modifyitems(None, [_FakeItem(["requires_deps"]), _FakeItem(["requires_deps"])])
+
+    total = sum(int(x) for x in report.read_text().split())
+    assert total == 3
+
+
+def test_conftest_noop_and_no_record_when_deps_ready(tmp_path, monkeypatch):
+    """deps ready → no skips, no report file written (early return)."""
+    conftest = _load_conftest()
+    report = tmp_path / "skip.txt"
+    monkeypatch.setenv("CCCI_DEPS_READY", "1")
+    monkeypatch.setenv("CCCI_DEPS_SKIP_REPORT", str(report))
+
+    item = _FakeItem(["requires_deps"])
+    conftest.pytest_collection_modifyitems(None, [item])
+
+    assert item.markers == []  # not skipped
+    assert not report.exists()  # nothing recorded