diff --git a/cc-ci-plan/plan-phase2-recipe-tests.md b/cc-ci-plan/plan-phase2-recipe-tests.md
index f1add92..9eaf188 100644
--- a/cc-ci-plan/plan-phase2-recipe-tests.md
+++ b/cc-ci-plan/plan-phase2-recipe-tests.md
@@ -64,8 +64,12 @@ each within 24h** (logged in `REVIEW.md`):
   (respecting `MAX_TESTS`/node budget) and SSO setup runs automatically (§4.2).
 - [ ] **P6 — Browser flows where they matter (D3).** Recipes whose core function is a UI flow have a
   Playwright test of that flow (login, create-an-object, etc.), not just API checks.
-- [ ] **P7 — No weakened tests.** Every assertion is real; nothing is `skip`/`xfail`'d to go green.
-  Genuinely-untestable aspects are documented findings (`DECISIONS.md`), not silent skips.
+- [ ] **P7 — No weakened tests, no corners cut (§7.1).** Every assertion is real and checks app
+  state; nothing is `skip`/`xfail`'d, mocked, or reduced to a health-only stand-in to go green.
+  The bar: anything meaningful is testable with effort (OIDC/SSO, federation, media, WOPI, WebRTC
+  connectivity, data survival all included). Any "untestable" claim is the rare exception — a true
+  environment-level blocker only, with the maximal subset still implemented and **Adversary
+  sign-off** (§8); "needs a browser / SSO / another app" is not a valid excuse.
 - [ ] **P8 — Docs.** `docs/enroll-recipe.md` updated with the per-recipe test contract (§4.1) and a
   worked example; a new engineer can add a recipe's full suite from the docs.
 
@@ -208,6 +212,26 @@ Same as `plan.md` §6/§6.1/§7/§9. Phase-2-specific emphases:
   for health-only stand-ins, `skip`/`xfail`, or assertions that don't actually check app state.
 - **Real data-integrity for backups** (P4) — "service is up after restore" is *not* sufficient; the
   seeded data must be proven to survive.
+
+### 7.1 Adversary mandate (Phase 2) — no skipped tests, no corners cut
+The default assumption is that **everything meaningful about an app is testable with enough effort —
+the job is to write a *good* test, not to declare it impossible.** OIDC/SSO login, token issuance and
+JWT validation, federation endpoints, media upload/download, WOPI discovery, WebRTC ICE/connectivity,
+backup data survival — these are all testable end-to-end and **must** be tested, not stubbed. The
+Adversary actively enforces this and **reads the test bodies, not just pass/fail**:
+- **Reject** any test that is `skip`/`xfail`/commented-out, mocked, mutated to a `health_check`
+  stand-in, asserts nothing material (e.g. only `status==200`), or is hard-coded to pass.
+- **Reject "we couldn't test X"** unless it is a genuine *environment-level* limitation (e.g. the
+  test host cannot receive inbound UDP, so the full lasuite-meet *media relay* path can't complete) —
+  and even then demand the **maximal testable subset** (e.g. signaling, token issuance, ICE candidate
+  gathering) plus a `DECISIONS.md` justification with the specific technical blocker. "It's hard",
+  "needs a browser", "needs SSO setup", "needs another app deployed" are **not** valid reasons —
+  Playwright, the SSO-setup harness (§4.2), and the dependency resolver exist precisely to remove
+  those excuses.
+- **Verify parity for real (P2):** for each `PARITY.md` row, confirm the cc-ci test checks the *same
+  thing* the recipe-maintainer original did — not a hollow rename.
+- **Re-run cold and inspect:** the Adversary re-runs a sampled recipe's suite from a clean state and
+  reads the diffs/assertions; a green run with empty assertions is a FAIL and a `[adversary]` finding.
 - **Respect Phase-1 resource caps** — deps multiply live apps per run; keep within `MAX_TESTS`/node
   budget; sequence heavy recipes; teardown (incl. deps) is guaranteed.
 - **Tests are recipe-versioned** — they run against the PR's recipe version; don't hardcode values
@@ -225,5 +249,8 @@ Same as `plan.md` §6/§6.1/§7/§9. Phase-2-specific emphases:
   store; SSO test users/clients are class-B, generated per run, destroyed at teardown).
 - How many recipe-specific tests beyond the **≥2** floor per recipe (scale with the app's surface;
   don't gold-plate trivial recipes).
-- Recipes that genuinely can't be CI'd in this environment (e.g. ones needing inbound UDP/TURN like
-  lasuite-meet's media path) → document the tested subset + why (mirror Phase-1 D10's honesty rule).
+- A test deemed "impossible" is the **rare exception**, not a convenient out (§7.1). It is only
+  acceptable for a true environment-level blocker (e.g. no inbound UDP for lasuite-meet's *media
+  relay*), requires the **maximal testable subset** still implemented, a specific technical reason in
+  `DECISIONS.md`, **and Adversary sign-off**. SSO/OIDC, browser flows, multi-app dependencies, and
+  data-integrity are explicitly **not** exceptions — they are testable and required.