fix(2): F2-11 — SSO-dep deps-not-ready SKIP no longer yields GREEN !testme

When a DEPS-declaring recipe's setup_custom_tests fails, its @requires_deps (SSO/OIDC)
tests skip; a skip-only pytest file exits 0 so the run previously reported overall=0
(GREEN) while the only SSO test never ran (violates P7). Fix preserves generic-tier
failure-isolation but corrects the green SIGNAL:
- conftest.pytest_collection_modifyitems counts skipped requires_deps tests and appends
  to $CCCI_DEPS_SKIP_REPORT.
- run_recipe_ci: sums the count, surfaces it in RUN SUMMARY, and new pure predicate
  sso_dep_unverified(declared, deps_ready, skipped) flips overall=1.
- 7 new unit tests (tests/unit/test_f211_sso_skip.py).

Verified deploy-free (rate-limit-independent): 35/35 unit PASS; cold real-test proof on
lasuite-docs test_oidc_with_keycloak.py -> 1 skipped + skip-report==1 -> orchestrator
would set overall=1. Full e2e deferred until Docker Hub rate limit lifts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-28 21:25:27 +01:00
parent 10d2a13031
commit 5b34496557
5 changed files with 248 additions and 1 deletions

View File

@ -544,3 +544,48 @@ re-pulls that burn the anonymous quota; let the cache persist). Disk pressure mu
by pruning ONLY truly-dangling images, or by the operator growing the cc-ci disk.
(Also noted: recipe env is `ONLY_OFFICE_DOMAIN`, underscore — my EXTRA_ENV flattened COLLABORA/MINIO
domains but not onlyoffice's; only matters for the WOPI/TLS path, to revisit when base converges.)
## 2026-05-28 (later) — Gitea restored; consumed Adversary inbox; fixed F2-11 (SSO-skip-goes-green)
Gitea (git.autonomic.zone) recovered ~21:08Z (orchestrator confirmed). Reconciled: `git pull --rebase`
(up to date), pushed my 2 queued local commits (1138d77 + 4a118ea → origin), then a 3rd pull picked up
the Adversary's `b941f55` (its outage-queued writes: F2-11 + REVIEW-2 idle checkpoint + BUILDER-INBOX).
Consumed + deleted BUILDER-INBOX. The 3 watchdog pings during the outage were phantoms (Adversary's
failed push retries) — nothing was lost.
**Adversary's BUILDER-INBOX (digested):** DONE-gate warnings (F2-7 authentik, F2-9 cryptpad create-pad,
ghost §4.3 create-post floor, Q3.2 drive specifics, full P1P8 Q5 re-verify) — all need deploys, so
gated on the Docker Hub rate limit. Plus **F2-11** (medium, not a VETO), which is pure code → fixed it
now (rate-limit-independent).
**F2-11 — SSO-dep "deps-not-ready" SKIP must not yield a GREEN run.** Adversary cold-proved: when
`setup_custom_tests` fails for a DEPS-declaring recipe, `CCCI_DEPS_READY=0` → conftest skips every
`@requires_deps` test → a skip-only pytest file exits 0 → `run_custom` returns "pass" → `overall=0` →
`!testme` GREEN while the only SSO/OIDC test never ran. Violates P7.
Why my fix is shaped this way: the failure-isolation design (a transient SSO-setup failure must not
break the *generic* tier signal) is correct and I kept it — generic tier results stand untouched. The
defect was only that the green SIGNAL was indistinguishable from "SSO verified." So I correct the
signal, not the isolation:
- `conftest.pytest_collection_modifyitems` now COUNTS the requires_deps tests it skips and appends the
count to `$CCCI_DEPS_SKIP_REPORT` (one line per pytest invocation; orchestrator sums across the
per-custom-file loop). Chose a filesystem report (not exit code) because pytest has no "fail on
skip" and a skip-only file legitimately exits 0 — the orchestrator already shares run-scoped temp
files with the pytest subprocess (depsfile/statefile/countfile), so this matches the pattern.
- `run_recipe_ci`: reads + sums the count, surfaces it in RUN SUMMARY (`custom: pass (N requires_deps
SKIPPED ... SSO UNVERIFIED)`), and a new pure predicate `sso_dep_unverified(declared, deps_ready,
skipped)` flips `overall=1` when a recipe declares DEPS + deps not ready + ≥1 requires_deps skipped.
Gated on skip>0 so a deps-declaring recipe with no requires_deps tests isn't false-failed.
Verified (both deploy-free — rate-limit-independent):
1. `cc-ci-run -m pytest tests/unit -q` → **35 passed** (28 prior + 7 new in test_f211_sso_skip.py:
predicate truth table + conftest skip/record/append/noop-when-ready).
2. Cold real-test proof on cc-ci: `CCCI_DEPS_READY=0 CCCI_DEPS_SKIP_REPORT=/tmp/f211-skip.txt
cc-ci-run -m pytest tests/lasuite-docs/functional/test_oidc_with_keycloak.py -rs` → `1 skipped`,
`PYTEST_EXIT=0` (the hazard), but `/tmp/f211-skip.txt` now contains `1` → orchestrator would compute
`sso_dep_unverified(["keycloak"], False, 1)=True` → `overall=1`. Hazard closed.
Full e2e (real deploy with a forced setup_custom_tests failure → observe overall=1) deferred to when
the Docker Hub rate limit lifts; the unit + cold-real-test proofs cover the predicate, the conftest
signal on real files, and the count flow — only the sequential read→sum→predicate→overall wiring is
unexercised by a live run, and it's straight-line code.

View File

@ -69,6 +69,38 @@ Remaining substantial: Q3.2 lasuite-drive (needs mirror), Q3.3 lasuite-meet (mir
immich (needs mirror), Q4.2/Q4.5-7/Q4.9-10 (mostly need mirror). The mirror-and-enroll path is
established (recipe-create-pr skill); pausing this sprint for Adversary cold-verify.
## Adversary findings — Builder response
**F2-11 — FIXED, awaiting Adversary re-verify** (commit: `git log --oneline | grep 'F2-11'`).
SSO-dep "deps-not-ready"
SKIP no longer yields a GREEN `!testme`.
- **WHAT:** when a recipe declares `DEPS` and `setup_custom_tests` fails (deps not ready) so its
`@requires_deps` (SSO/OIDC) tests SKIP, the run now reports **FAIL** (`overall=1`), not green —
while generic-tier failure-isolation is preserved (install/upgrade/backup/restore results stand).
- **WHERE (code):**
- `tests/conftest.py::pytest_collection_modifyitems` — now counts the requires_deps tests it skips
and appends the count to `$CCCI_DEPS_SKIP_REPORT`.
- `runner/run_recipe_ci.py` — sets `CCCI_DEPS_SKIP_REPORT` (run-scoped temp, near `depsfile`);
after teardown sums the count into `requires_deps_skipped`; RUN SUMMARY annotates the custom tier
(`custom: pass (N requires_deps SKIPPED ... SSO UNVERIFIED)`); new pure predicate
`sso_dep_unverified(declared, deps_ready, requires_deps_skipped)` flips `overall=1`.
- `tests/unit/test_f211_sso_skip.py` — 7 new unit tests.
- **HOW to verify (both deploy-free, rate-limit-independent):**
1. `ssh cc-ci 'cd /root/cc-ci && cc-ci-run -m pytest tests/unit -q'`**EXPECTED: 35 passed**
(28 prior + 7 F2-11).
2. Cold real-test signal proof:
`ssh cc-ci 'cd /root/cc-ci && rm -f /tmp/f211-skip.txt && CCCI_DEPS_READY=0 \
CCCI_DEPS_NOT_READY_REASON=boom CCCI_DEPS_SKIP_REPORT=/tmp/f211-skip.txt \
cc-ci-run -m pytest tests/lasuite-docs/functional/test_oidc_with_keycloak.py -rs; \
cat /tmp/f211-skip.txt'`
**EXPECTED:** `1 skipped`, pytest exit 0 (the hazard), and `/tmp/f211-skip.txt` == `1`. Since
lasuite-docs declares `DEPS=["keycloak"]`, the orchestrator computes
`sso_dep_unverified(["keycloak"], False, 1)=True``overall=1`.
- **NOT verified by a live run yet:** full e2e (real deploy with forced setup_custom_tests failure →
observe `overall=1`) is deferred until the Docker Hub rate limit (## Blocked) lifts. The two proofs
above cover the predicate, the conftest signal on real files, and the count flow; only the
straight-line read→sum→predicate→overall wiring is unexercised by a live deploy.
## Gate
**Gate: Q2 — Adversary PASS @2026-05-28** (REVIEW-2 `## Q2 — PASS @2026-05-28 (re-verify after
F2-5 fix + F2-6 collateral resolution)`; cold e2e on `/root/adv-verify` HEAD `874bfbb`:

View File

@ -45,6 +45,17 @@ from harness import deps as deps_mod, discovery, generic, lifecycle, naming # n
ALL_STAGES = ("install", "upgrade", "backup", "restore", "custom")
def sso_dep_unverified(declared, deps_ready: bool, requires_deps_skipped: int) -> bool:
"""F2-11 gate predicate (pure, unit-tested). True when a recipe declares DEPS but its
setup_custom_tests failed (deps not ready) AND that caused ≥1 `requires_deps` (SSO/OIDC) test
to SKIP. In that case the recipe's characteristic SSO claim was NOT verified, so the run must
NOT report GREEN — even though a skip-only pytest file exits 0 and leaves every tier 'pass'.
Generic-tier failure-isolation is preserved (those results stand); only the green SIGNAL is
corrected. Gated on skip>0 so a deps-declaring recipe with no requires_deps tests isn't
false-failed."""
return bool(declared) and not deps_ready and requires_deps_skipped > 0
def _truthy(v: str | None) -> bool:
return str(v or "").strip().lower() in ("1", "true", "yes", "on")
@ -427,6 +438,11 @@ def main() -> int:
with open(depsfile, "w") as f:
json.dump({}, f)
os.environ["CCCI_DEPS_FILE"] = depsfile
# F2-11: conftest appends the count of requires_deps tests it skips (deps-not-ready) here.
skipfile = os.path.join(tempfile.gettempdir(), f"ccci-depskip-{domain}.txt")
with contextlib.suppress(OSError):
os.remove(skipfile)
os.environ["CCCI_DEPS_SKIP_REPORT"] = skipfile
declared = deps_mod.declared_deps(recipe)
if declared:
print(f"\n===== DEPS declared (deploy AFTER generic tiers): {declared} =====", flush=True)
@ -562,6 +578,15 @@ def main() -> int:
os.remove(statefile)
with contextlib.suppress(OSError):
os.remove(depsfile)
# F2-11: sum the requires_deps skip counts conftest recorded across the custom files.
requires_deps_skipped = 0
try:
with open(skipfile) as f:
requires_deps_skipped = sum(int(x) for x in f.read().split() if x.strip())
except OSError:
pass
with contextlib.suppress(OSError):
os.remove(skipfile)
# ---- per-op summary (DG6 feed) ----
# SSO-dep plan §1: DG4.1 generalised — one `abra app new` per app in the run (recipe + each
@ -582,7 +607,12 @@ def main() -> int:
print(f" deps-not-ready: {deps_not_ready_reason}")
order = [s for s in ALL_STAGES if s in results]
for op in order:
print(f" {op:8s}: {results[op]}")
suffix = ""
# F2-11: annotate the custom tier when requires_deps (SSO) tests were skipped, so a reader
# of the summary can't mistake a green custom tier for "SSO verified".
if op == "custom" and requires_deps_skipped:
suffix = f" ({requires_deps_skipped} requires_deps SKIPPED: deps-not-ready — SSO UNVERIFIED)"
print(f" {op:8s}: {results[op]}{suffix}")
overall = 0
if deploy_count != expected_deploy_count:
@ -597,6 +627,17 @@ def main() -> int:
overall = 1
if any(v == "fail" for v in results.values()):
overall = 1
# F2-11: a deps-declaring recipe whose setup_custom_tests failed has NOT verified its SSO/OIDC
# claim — its requires_deps tests SKIPPED (a skip-only file exits 0, so without this the run
# would report GREEN). Fail the run for that recipe; generic-tier results above are untouched.
if sso_dep_unverified(declared, deps_ready, requires_deps_skipped):
print(
f"!! recipe declares DEPS={declared} but setup_custom_tests failed and "
f"{requires_deps_skipped} requires_deps (SSO) test(s) were SKIPPED — SSO claim NOT "
f"verified; failing run (F2-11). deps-not-ready: {deps_not_ready_reason}",
file=sys.stderr,
)
overall = 1
if not results:
print("no tiers ran", file=sys.stderr)
return 1

View File

@ -107,9 +107,23 @@ def pytest_collection_modifyitems(config, items):
return
reason = os.environ.get("CCCI_DEPS_NOT_READY_REASON", "(no reason given)")
skip_mark = pytest.mark.skip(reason=f"deps-not-ready: {reason}")
skipped = 0
for item in items:
if "requires_deps" in item.keywords:
item.add_marker(skip_mark)
skipped += 1
# F2-11: a skip-only pytest file exits 0, so without this the orchestrator can't tell
# "SSO verified" from "SSO test silently skipped because deps weren't ready". Record the count
# of requires_deps tests we skipped to a report file the orchestrator reads — it surfaces the
# count in RUN SUMMARY and FAILS the recipe's SSO claim (a green exit must not mask an unrun
# SSO test). Appended one line per pytest invocation (one per custom file); orchestrator sums.
report = os.environ.get("CCCI_DEPS_SKIP_REPORT")
if report and skipped:
try:
with open(report, "a") as f:
f.write(f"{skipped}\n")
except OSError:
pass
def pytest_configure(config):

View File

@ -0,0 +1,115 @@
"""Unit tests for F2-11 — SSO-dep "deps-not-ready" SKIP must NOT yield a GREEN run.
Two halves of the fix are tested without any real deploy:
1. `run_recipe_ci.sso_dep_unverified` — the pure gate predicate the orchestrator uses to flip
`overall` to fail when a deps-declaring recipe's SSO tests were skipped because deps weren't ready.
2. `conftest.pytest_collection_modifyitems` — when CCCI_DEPS_READY=0 it (a) skips every
`requires_deps` test and (b) records the skipped count to `$CCCI_DEPS_SKIP_REPORT` so the
orchestrator can surface it + gate on it.
The end-to-end hazard (a real SSO-dep recipe going green on a skip) is exercised by e2e against
cc-ci; here we lock the decision logic + the conftest→orchestrator signal that drives it.
"""
from __future__ import annotations
import importlib.util
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
import run_recipe_ci # noqa: E402
# ---- 1. the pure gate predicate ----
def test_sso_dep_unverified_true_when_declared_notready_and_skipped():
"""declares DEPS + deps not ready + ≥1 requires_deps test skipped → run must FAIL (F2-11)."""
assert run_recipe_ci.sso_dep_unverified(["keycloak"], deps_ready=False, requires_deps_skipped=1)
assert run_recipe_ci.sso_dep_unverified(["keycloak"], deps_ready=False, requires_deps_skipped=3)
def test_sso_dep_unverified_false_when_deps_ready():
"""deps ready (setup_custom_tests succeeded) → SSO tests actually ran → not a failure."""
assert not run_recipe_ci.sso_dep_unverified(["keycloak"], deps_ready=True, requires_deps_skipped=0)
def test_sso_dep_unverified_false_when_no_deps_declared():
"""A recipe with no DEPS can never trip the SSO-skip gate."""
assert not run_recipe_ci.sso_dep_unverified([], deps_ready=False, requires_deps_skipped=0)
assert not run_recipe_ci.sso_dep_unverified(None, deps_ready=False, requires_deps_skipped=2)
def test_sso_dep_unverified_false_when_nothing_skipped():
"""Deps declared + not ready but ZERO requires_deps tests skipped → don't false-fail
(the recipe has no SSO-marked tests to have been masked)."""
assert not run_recipe_ci.sso_dep_unverified(["keycloak"], deps_ready=False, requires_deps_skipped=0)
# ---- 2. conftest skip + record behavior ----
def _load_conftest():
"""Load tests/conftest.py under a private module name (avoid clashing with pytest's own
loaded `conftest`), so we can call pytest_collection_modifyitems directly with fakes."""
path = os.path.join(os.path.dirname(__file__), "..", "conftest.py")
spec = importlib.util.spec_from_file_location("ccci_conftest_under_test", path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
return mod
class _FakeItem:
def __init__(self, keywords):
# pytest `item.keywords` supports `in`; a dict suffices.
self.keywords = {k: True for k in keywords}
self.markers = []
def add_marker(self, mark):
self.markers.append(mark)
def test_conftest_skips_and_records_requires_deps_when_not_ready(tmp_path, monkeypatch):
conftest = _load_conftest()
report = tmp_path / "skip.txt"
monkeypatch.setenv("CCCI_DEPS_READY", "0")
monkeypatch.setenv("CCCI_DEPS_NOT_READY_REASON", "keycloak realm setup boom")
monkeypatch.setenv("CCCI_DEPS_SKIP_REPORT", str(report))
sso1 = _FakeItem(["requires_deps"])
sso2 = _FakeItem(["requires_deps"])
plain = _FakeItem([]) # a non-deps custom test — must NOT be skipped
conftest.pytest_collection_modifyitems(config=None, items=[sso1, sso2, plain])
# Both requires_deps items got a skip marker; the plain one did not.
assert len(sso1.markers) == 1 and len(sso2.markers) == 1
assert plain.markers == []
# The skipped count was recorded for the orchestrator.
assert report.read_text().split() == ["2"]
def test_conftest_appends_across_invocations(tmp_path, monkeypatch):
"""The orchestrator runs one pytest per custom file; counts must accumulate (append)."""
conftest = _load_conftest()
report = tmp_path / "skip.txt"
monkeypatch.setenv("CCCI_DEPS_READY", "0")
monkeypatch.setenv("CCCI_DEPS_SKIP_REPORT", str(report))
conftest.pytest_collection_modifyitems(None, [_FakeItem(["requires_deps"])])
conftest.pytest_collection_modifyitems(None, [_FakeItem(["requires_deps"]), _FakeItem(["requires_deps"])])
total = sum(int(x) for x in report.read_text().split())
assert total == 3
def test_conftest_noop_and_no_record_when_deps_ready(tmp_path, monkeypatch):
"""deps ready → no skips, no report file written (early return)."""
conftest = _load_conftest()
report = tmp_path / "skip.txt"
monkeypatch.setenv("CCCI_DEPS_READY", "1")
monkeypatch.setenv("CCCI_DEPS_SKIP_REPORT", str(report))
item = _FakeItem(["requires_deps"])
conftest.pytest_collection_modifyitems(None, [item])
assert item.markers == [] # not skipped
assert not report.exists() # nothing recorded