diff --git a/cc-ci-plan/plan-server-regression-canaries.md b/cc-ci-plan/plan-server-regression-canaries.md
index 5ff2cdc..7cb6098 100644
--- a/cc-ci-plan/plan-server-regression-canaries.md
+++ b/cc-ci-plan/plan-server-regression-canaries.md
@@ -26,14 +26,36 @@ The suite proves the server can do **both** halves of its job — and the second
 |---|---|---|---|
 | **Simple (good)** | `custom-html-tiny` | Minimal, fast, few deps — quick signal | GREEN |
 | **Significant (good)** | `lasuite-docs` | Multi-service: backend + Postgres + Collabora WOPI + keycloak OIDC — exercises real breadth | GREEN |
-| **Known-BAD (false-green guard)** | a seeded fixture (see below) | App comes up healthy but a semantic tier assertion is violated | **RED** |
+| **Known-BAD: custom-assertion** | a seeded fixture (see below) | App comes up healthy but a functional/custom assertion is violated | **RED** |
+| **Known-BAD: per-tier ×4** | `custom-html-tiny` broken at one tier each (see below) | install / upgrade / backup / restore each fail in turn | **RED** at the intended tier |
 
-**Known-bad fixture:** reuse/recreate the phase-5 seeded case — `custom-html` branch `v5-stale-docroot`
-(serves `.txt` as `application/octet-stream` while the app is externally healthy), which already
-produced a RED build (#75) with only the content-type custom assertion failing. The regression test
-asserts the harness returns **RED** for this fixture. (If that branch is gone, recreate the pattern:
-an app that is up + passes lifecycle tiers but fails one functional assertion.) Pin the fixture by
-commit SHA so it's stable.
+**Known-bad fixture (custom-assertion):** reuse/recreate the phase-5 seeded case — `custom-html` branch
+`v5-stale-docroot` (serves `.txt` as `application/octet-stream` while the app is externally healthy),
+which already produced a RED build (#75) with only the content-type custom assertion failing. The
+regression test asserts the harness returns **RED** for this fixture. (If that branch is gone, recreate
+the pattern: an app that is up + passes lifecycle tiers but fails one functional assertion.) Pin by SHA.
+
+### Per-tier RED canaries — prove the server catches failure at EVERY tier (fast)
+
+The single fixture above only proves the server catches a *custom-assertion* failure. Add **one RED
+canary per lifecycle tier** so we prove the server reports RED at each of install / upgrade / backup /
+restore — false-green is the scariest regression, and it can hide at any tier (e.g. restore silently
+restoring nothing, the ghost/mattermost class of bug). Use the **simplest recipe — `custom-html-tiny`**
+(static content, deploys in seconds) so all four run **fast**; each is a fixture broken at exactly one
+tier, pinned by commit SHA.
+
+| RED canary | How it's broken (custom-html-tiny fixture) | Expected harness result |
+|---|---|---|
+| **install** | image tag that never becomes healthy / a healthcheck that can't pass | **install tier RED** |
+| **upgrade** | installs clean; the upgrade target breaks the container so post-upgrade health fails | install PASS, **upgrade tier RED** |
+| **backup** | install+upgrade clean; backup misconfigured (backupbot label/target wrong → backup errors or yields no artifact) | **backup tier RED** |
+| **restore** | backup succeeds; restore is a no-op (hook does nothing) so the pre-seeded marker is ABSENT after restore | **restore tier RED** (the scariest false-green) |
+
+Each pytest asserts **precisely**: overall verdict RED, the failing tier is the *intended* one, AND the
+tiers *before* it PASSED (e.g. upgrade-RED requires install to have passed) — so it proves "catches a
+failure **at this tier**", not merely "fails somewhere". These four form the **fast subset** of the
+suite; consider a sub-marker (`@pytest.mark.canary_fast`) so they can optionally run as a quicker
+pre-check while the slow good canaries (esp. lasuite-docs) stay on the milestone cadence below.
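
Reviewer sketch (not part of the patch): the three-part assertion described above could look roughly like the helper below. `HarnessRun`, `tier_status`, and `assert_red_at_tier` are hypothetical names chosen for illustration, and the `"PASS"`/`"FAIL"` strings are assumptions; the real harness result shape is not shown in this diff.

```python
from dataclasses import dataclass, field

# Lifecycle tiers in execution order, as listed in the plan.
TIERS = ("install", "upgrade", "backup", "restore")


@dataclass
class HarnessRun:
    """Hypothetical harness result; the real structure may differ."""
    verdict: str                                      # "GREEN" or "RED"
    tier_status: dict = field(default_factory=dict)   # tier -> "PASS" / "FAIL"


def assert_red_at_tier(run: HarnessRun, intended: str) -> None:
    """Check RED overall, FAIL at the intended tier, PASS at every earlier tier."""
    # 1. Overall verdict must be RED (a GREEN here is the false-green case).
    assert run.verdict == "RED", f"expected RED, got {run.verdict}"
    # 2. Every tier before the intended one must have passed.
    for tier in TIERS[:TIERS.index(intended)]:
        assert run.tier_status.get(tier) == "PASS", f"{tier} did not pass"
    # 3. The intended tier itself must be the one that failed.
    assert run.tier_status.get(intended) == "FAIL", f"{intended} did not fail"


# Usage: an upgrade canary requires install PASS and upgrade FAIL.
upgrade_run = HarnessRun("RED", {"install": "PASS", "upgrade": "FAIL"})
assert_red_at_tier(upgrade_run, "upgrade")
```

Tiers after the intended one are deliberately left unchecked here, since the plan does not say whether the harness skips or still runs them.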
 
 ## What "works as expected" means per tier (real assertions, not exit codes)
 
@@ -78,10 +100,14 @@ lasuite-docs is minutes, needs the live server/abra/Swarm). Run them **deliberat
 2. Run GREEN on both good canaries (`custom-html-tiny`, `lasuite-docs`) with the per-tier semantic
    assertions actually executing (Adversary confirms the assertions FAIL if you tamper with an outcome —
    i.e. the assertions have teeth, they're not vacuous).
-3. The known-bad canary makes the suite assert **RED** — and the Adversary confirms that if the server
-   *wrongly* returned green for it, the regression test would FAIL (false-green is caught).
-4. A short `tests/regression/README.md`: how to run it, what each canary guards, how to add a canary.
-5. NOT merged — recipe/test PR opened for operator review (loops never merge).
+3. The custom-assertion known-bad canary makes the suite assert **RED** — and the Adversary confirms
+   that if the server *wrongly* returned green for it, the regression test would FAIL (false-green caught).
+4. **The four per-tier RED canaries** (install/upgrade/backup/restore, on `custom-html-tiny`) each make
+   the suite assert RED **at the intended tier**, with the prior tiers asserted PASS. Adversary confirms
+   each has teeth: if the server wrongly green-lit that tier, the corresponding test would FAIL. They run
+   fast.
+5. A short `tests/regression/README.md`: how to run it, what each canary guards, how to add a canary.
+6. NOT merged — recipe/test PR opened for operator review (loops never merge).
 
 ## Risks / notes
 - **Slow + resource-heavy:** full lifecycle on lasuite-docs is minutes and needs the live server/abra/