Phase-2 plan: harden Adversary mandate — no skipped tests / corners cut
Add §7.1 Adversary mandate: the default assumption is that everything meaningful is testable (OIDC/SSO, federation, media, WOPI, WebRTC connectivity, backup data survival); the job is to write a good test, not to declare impossibility. The Adversary reads test bodies, rejects skip/xfail/mock/health-only/empty-assertion tests and bogus parity renames, and re-runs cold. "Untestable" is a rare exception that needs a true environment blocker, the maximal testable subset, and Adversary sign-off; "needs browser/SSO/another app" is not valid. Tighten P7 and §8 to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -64,8 +64,12 @@ each within 24h** (logged in `REVIEW.md`):
(respecting `MAX_TESTS`/node budget) and SSO setup runs automatically (§4.2).
- [ ] **P6 — Browser flows where they matter (D3).** Recipes whose core function is a UI flow have a
Playwright test of that flow (login, create-an-object, etc.), not just API checks.
- [ ] **P7 — No weakened tests.** Every assertion is real; nothing is `skip`/`xfail`'d to go green.
Genuinely-untestable aspects are documented findings (`DECISIONS.md`), not silent skips.
- [ ] **P7 — No weakened tests, no corners cut (§7.1).** Every assertion is real and checks app
state; nothing is `skip`/`xfail`'d, mocked, or reduced to a health-only stand-in to go green.
The bar: anything meaningful is testable with effort (OIDC/SSO, federation, media, WOPI, WebRTC
connectivity, data survival all included). Any "untestable" claim is the rare exception — a true
environment-level blocker only, with the maximal subset still implemented and **Adversary
sign-off** (§8); "needs a browser / SSO / another app" is not a valid excuse.
- [ ] **P8 — Docs.** `docs/enroll-recipe.md` updated with the per-recipe test contract (§4.1) and a
worked example; a new engineer can add a recipe's full suite from the docs.
@@ -208,6 +212,26 @@ Same as `plan.md` §6/§6.1/§7/§9. Phase-2-specific emphases:
for health-only stand-ins, `skip`/`xfail`, or assertions that don't actually check app state.
- **Real data-integrity for backups** (P4) — "service is up after restore" is *not* sufficient; the
seeded data must be proven to survive.
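
To illustrate the difference between "service is up after restore" and proven data survival, here is a minimal sketch. It uses an in-memory SQLite database purely as a stand-in for a recipe's real datastore and backup mechanism, which this plan does not specify; the helper names (`seed_marker`, `marker_survived`, `backup_restore_roundtrip`) are hypothetical. The idea is to seed a unique marker before the backup and assert that the exact marker exists after the restore.

```python
import sqlite3
import uuid


def seed_marker(conn):
    """Insert a unique row and return its value for lookup after restore."""
    marker = f"ci-seed-{uuid.uuid4().hex}"
    conn.execute("CREATE TABLE IF NOT EXISTS notes (body TEXT NOT NULL)")
    conn.execute("INSERT INTO notes (body) VALUES (?)", (marker,))
    conn.commit()
    return marker


def marker_survived(conn, marker):
    """True only if the exact seeded row is present in this database."""
    try:
        row = conn.execute(
            "SELECT COUNT(*) FROM notes WHERE body = ?", (marker,)
        ).fetchone()
    except sqlite3.OperationalError:
        # The table itself is missing, so the data certainly did not survive.
        return False
    return row[0] == 1


def backup_restore_roundtrip():
    """Seed, back up, drop the live instance, restore, then verify the data."""
    live = sqlite3.connect(":memory:")
    marker = seed_marker(live)

    snapshot = sqlite3.connect(":memory:")
    live.backup(snapshot)        # take the backup
    live.close()                 # simulate losing the live instance

    restored = sqlite3.connect(":memory:")
    snapshot.backup(restored)    # restore into a fresh instance

    # A fresh, empty instance also "comes up", but it holds no seeded data.
    empty = sqlite3.connect(":memory:")
    return marker_survived(restored, marker), marker_survived(empty, marker)
```

Both the restored and the empty instance would pass a health-only check, since each accepts connections. Only the marker lookup distinguishes a real restore from an empty service, which is the property P4 asks the test to prove.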
### 7.1 Adversary mandate (Phase 2) — no skipped tests, no corners cut
The default assumption is that **everything meaningful about an app is testable with enough effort —
the job is to write a *good* test, not to declare it impossible.** OIDC/SSO login, token issuance and
JWT validation, federation endpoints, media upload/download, WOPI discovery, WebRTC ICE/connectivity,
backup data survival — these are all testable end-to-end and **must** be tested, not stubbed. The
Adversary actively enforces this and **reads the test bodies, not just pass/fail**:
- **Reject** any test that is `skip`/`xfail`/commented-out, mocked, mutated to a `health_check`
stand-in, asserts nothing material (e.g. only `status==200`), or is hard-coded to pass.
- **Reject "we couldn't test X"** unless it is a genuine *environment-level* limitation (e.g. the
test host cannot receive inbound UDP, so the full lasuite-meet *media relay* path can't complete) —
and even then demand the **maximal testable subset** (e.g. signaling, token issuance, ICE candidate
gathering) plus a `DECISIONS.md` justification with the specific technical blocker. "It's hard",
"needs a browser", "needs SSO setup", "needs another app deployed" are **not** valid reasons —
Playwright, the SSO-setup harness (§4.2), and the dependency resolver exist precisely to remove
those excuses.
- **Verify parity for real (P2):** for each `PARITY.md` row, confirm the cc-ci test checks the *same
thing* the recipe-maintainer original did — not a hollow rename.
- **Re-run cold and inspect:** the Adversary re-runs a sampled recipe's suite from a clean state and
reads the diffs/assertions; a green run with empty assertions is a FAIL and an `[adversary]` finding.
- **Respect Phase-1 resource caps** — deps multiply live apps per run; keep within `MAX_TESTS`/node
budget; sequence heavy recipes; teardown (incl. deps) is guaranteed.
- **Tests are recipe-versioned** — they run against the PR's recipe version; don't hardcode values
@@ -225,5 +249,8 @@ Same as `plan.md` §6/§6.1/§7/§9. Phase-2-specific emphases:
store; SSO test users/clients are class-B, generated per run, destroyed at teardown).
- How many recipe-specific tests beyond the **≥2** floor per recipe (scale with the app's surface;
don't gold-plate trivial recipes).
- Recipes that genuinely can't be CI'd in this environment (e.g. ones needing inbound UDP/TURN like
lasuite-meet's media path) → document the tested subset + why (mirror Phase-1 D10's honesty rule).
- A test deemed "impossible" is the **rare exception**, not a convenient out (§7.1). It is only
acceptable for a true environment-level blocker (e.g. no inbound UDP for lasuite-meet's *media
relay*), requires the **maximal testable subset** still implemented, a specific technical reason in
`DECISIONS.md`, **and Adversary sign-off**. SSO/OIDC, browser flows, multi-app dependencies, and
data-integrity are explicitly **not** exceptions — they are testable and required.
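
The acceptance conditions for an "impossible" test can be written down as a small checklist. The sketch below is one possible encoding and not an existing tool; the `UntestableClaim` record, its field names, and the `blocker_kind` vocabulary are all hypothetical. It returns the objections an Adversary would raise against a claim, and an empty list means the claim meets the stated conditions.

```python
from dataclasses import dataclass

# Excuses the plan explicitly rules out (labels are illustrative assumptions).
REJECTED_KINDS = {"too_hard", "needs_browser", "needs_sso_setup", "needs_other_app"}


@dataclass
class UntestableClaim:
    recipe: str
    aspect: str
    blocker_kind: str            # only "environment" can be accepted
    blocker_detail: str          # the specific technical reason for DECISIONS.md
    tested_subset: tuple = ()    # what is still tested despite the blocker
    adversary_signoff: bool = False


def review_claim(claim):
    """Return a list of objections; an empty list means the claim can stand."""
    objections = []
    if claim.blocker_kind in REJECTED_KINDS:
        objections.append(f"{claim.blocker_kind} is not a valid reason")
    elif claim.blocker_kind != "environment":
        objections.append("blocker must be environment-level")
    if not claim.blocker_detail.strip():
        objections.append("missing specific technical blocker")
    if not claim.tested_subset:
        objections.append("maximal testable subset not implemented")
    if not claim.adversary_signoff:
        objections.append("missing Adversary sign-off")
    return objections
```

For instance, the lasuite-meet media relay case would be recorded with `blocker_kind="environment"`, a detail such as the test host not receiving inbound UDP, and a tested subset of signaling, token issuance, and ICE candidate gathering. Whether that subset is truly maximal remains a judgment for the Adversary, not something this check can decide.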