Three canaries (@pytest.mark.canary) drive the real cold CI lifecycle:
- good-simple: custom-html-tiny @ main (435df8fc) — fast signal, expects GREEN
- good-significant: lasuite-docs @ main (290a8ad7) — multi-service, expects GREEN
- bad-false-green: custom-html @ v5-stale-docroot (71e7326a) — expects RED
Semantic teeth: beyond exit-code, each test asserts that specific named tests
ran in results.json stages (test_serving, test_serving_and_frontend, test_content_type).
If an assertion is removed, the named test disappears → regression test fails.
Includes conftest (run_recipe_ci helper + stage_has_{passing,failing}_test),
README (cadence policy, how to run, how to add), and phase state files.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Regression canaries: E2E self-tests for the cc-ci server

A standing pytest suite that drives the **real** cc-ci lifecycle harness against pinned canary
recipes and verifies both halves of the server's job:

1. **Good canaries**: healthy apps are reported GREEN (install + upgrade + backup/restore pass).
2. **Bad canary**: broken apps are caught RED; a false-green makes the regression test itself fail.

These tests run the full cold lifecycle on the live cc-ci server. They are **slow** (minutes per
canary) and **opt-in**: the `canary` marker keeps them out of the per-commit fast path.

---

## How to run

Run on the cc-ci server (abra + Docker + Swarm required):

```bash
ssh cc-ci
cd /root/cc-ci   # or wherever the repo is checked out
cc-ci-run python -m pytest tests/regression/ -m canary -v
```

Or a single canary:

```bash
cc-ci-run python -m pytest tests/regression/ -m canary -k good-simple -v
```

From the orchestrator:

```bash
ssh cc-ci "cd /root/cc-ci && cc-ci-run python -m pytest tests/regression/ -m canary -v"
```

---

## Canaries

| ID | Recipe | Purpose | Expected verdict |
|----|--------|---------|-----------------|
| `good-simple` | `custom-html-tiny` | Minimal static server, fast signal | GREEN |
| `good-significant` | `lasuite-docs` | Multi-service (backend + Postgres + Collabora + OIDC) | GREEN |
| `bad-false-green` | `custom-html` @ `v5-stale-docroot` | App is UP but serves the wrong Content-Type; catches false-green | RED |

### Why the bad canary exists

The scariest regression is a **false-green**: the server reports PASS while the app is broken.
We already saw a fabricated full-PASS during the build. The `bad-false-green` canary pins a
known-broken fixture (`v5-stale-docroot`: nginx serves `.txt` as `application/octet-stream`). The
harness's `test_content_type_html_and_txt` catches this and returns RED (build #75 was RED for
exactly this fixture).

The regression test asserts `rc != 0`. If the harness ever wrongly returns green for this fixture,
that assert fires, so the false-green is caught before any merge.

---

## What each canary verifies

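
The baseline every canary checks is the harness exit code: good canaries expect GREEN, and the bad canary asserts `rc != 0`. A minimal sketch of that check follows. It assumes (the harness is not quoted here, so treat this as a guess) that an exit code of 0 means GREEN and any other value means RED. `check_verdict` is a hypothetical name; `expected_green` mirrors the key used in `CANARIES`.

```python
def check_verdict(rc, expected_green):
    """Raise AssertionError when the harness verdict contradicts the expectation.

    Assumption: rc == 0 is GREEN, any non-zero rc is RED.
    """
    if expected_green and rc != 0:
        raise AssertionError(f"good canary reported RED (rc={rc})")
    if not expected_green and rc == 0:
        # A green result for the known-broken fixture is exactly the false-green case.
        raise AssertionError("bad canary reported GREEN: false-green regression")


# Usage: both calls return silently because verdict and expectation agree.
check_verdict(0, expected_green=True)
check_verdict(1, expected_green=False)
```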
### Per-tier semantic assertions (the "teeth")

The tests assert MORE than the harness exit code: they check that **specific named assertions**
ran and got the expected result. This guards against a different failure mode: a tier that
nominally "passes" because the assertion was silently removed or made vacuous.

| Stage | Test name | What it proves |
|-------|-----------|---------------|
| install | `test_serving` | Generic HTTP readiness check actually ran |
| install | `test_serving_and_frontend` | Lasuite-docs frontend (SPA shell) actually loaded |
| custom | `test_content_type` | Content-type assertion actually ran (bad canary only) |

If a tier assertion is removed: the named test disappears from `results.json` → the semantic
check fires → the regression suite catches the removal.

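
The named-test lookup could be sketched roughly as below. The helper names `stage_has_passing_test` and `stage_has_failing_test` come from the commit description of the conftest; their bodies here are guesses, and the `results.json` layout (`stages` → stage name → `tests` with `name` and `outcome` fields) is purely an assumption for illustration. Exact name matching is also an assumption.

```python
import json

# Hypothetical results.json layout; the real schema may differ.
SAMPLE_RESULTS = json.loads("""
{
  "stages": {
    "install": {"tests": [{"name": "test_serving", "outcome": "passed"}]},
    "custom":  {"tests": [{"name": "test_content_type", "outcome": "failed"}]}
  }
}
""")


def _stage_has_test(results, stage, test_name, outcome):
    """Return True if `stage` recorded `test_name` with the given outcome."""
    tests = results.get("stages", {}).get(stage, {}).get("tests", [])
    return any(
        t.get("name") == test_name and t.get("outcome") == outcome for t in tests
    )


def stage_has_passing_test(results, stage, test_name):
    return _stage_has_test(results, stage, test_name, "passed")


def stage_has_failing_test(results, stage, test_name):
    return _stage_has_test(results, stage, test_name, "failed")
```

Under these assumptions, deleting `test_serving` from the harness makes `stage_has_passing_test(results, "install", "test_serving")` return `False`, so the canary fails even when the exit code still looks green.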
### Additional structural assertions (good canaries)

- `install` tier: "pass" (not fail, not skip)
- No tier is "fail" (skips acceptable for recipes without backup/custom tests)
- `flags.clean_teardown = True` (no leftover containers/volumes/secrets)
- `flags.no_secret_leak = True` (no secret value in the results artifact)

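
The structural rules listed here might be checked along the following lines. This is only a sketch: the `flags` names and the "pass"/"fail"/"skip" values are taken from the list, but the top-level `tiers` key, the tier names other than `install`, and the function name `structural_problems` are assumptions.

```python
def structural_problems(results):
    """Collect violations of the good-canary structural rules (empty list = OK)."""
    problems = []
    tiers = results.get("tiers", {})  # assumed key: tier name -> "pass"/"fail"/"skip"
    flags = results.get("flags", {})

    # The install tier must have actually passed; a skip is not good enough.
    if tiers.get("install") != "pass":
        problems.append(f"install tier is {tiers.get('install')!r}, expected 'pass'")

    # No tier may fail; skipped tiers are tolerated.
    failed = sorted(name for name, verdict in tiers.items() if verdict == "fail")
    if failed:
        problems.append("failing tiers: " + ", ".join(failed))

    # Both hygiene flags must be explicitly True.
    for flag in ("clean_teardown", "no_secret_leak"):
        if flags.get(flag) is not True:
            problems.append(f"flags.{flag} is not True")
    return problems


# Usage with a made-up result: a skipped backup tier is accepted.
good = {
    "tiers": {"install": "pass", "upgrade": "pass", "backup": "skip"},
    "flags": {"clean_teardown": True, "no_secret_leak": True},
}
assert structural_problems(good) == []
```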
---

## Cadence policy

**Do NOT run on every commit or PR.** These are slow and resource-heavy. Run them:

- Before a **release** of the cc-ci server (after a batch of server changes).
- As a **polishing pass** or pre-merge check for significant server refactors.
- On-demand when you suspect a regression: `pytest -m canary`.

They are NOT wired to the per-commit Drone pipeline. If adding a `!testme`-style trigger for the
cc-ci repo, gate it behind a deliberate label (e.g. `run-canaries`), not an automatic run on
every push.

---

## How to add a canary

1. Identify a recipe that is already deployable and has pinned version tags.
2. Decide the expected verdict (GREEN or RED) and which tier assertions have teeth.
3. Add an entry to `CANARIES` in `test_canaries.py`:

```python
{
    "id": "good-myrecipe",
    "recipe": "my-recipe",
    "src": "recipe-maintainers/my-recipe",
    "ref": "<pinned-sha>",  # pin to a specific commit for stability
    "expected_green": True,
    "stage_pass_checks": [
        ("install", "test_serving"),  # verify this named test ran and passed
    ],
    "stage_fail_checks": [],
}
```

4. Run the canary once to confirm it passes:
   `cc-ci-run python -m pytest tests/regression/ -m canary -k good-myrecipe -v`

5. Update the pin comment with the date and the recipe version it was pinned at.

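
Before running a new canary, it may help to sanity-check the entry. The sketch below is not part of the suite: `canary_entry_errors` is a hypothetical helper, and the set of required keys is inferred from the example entry alone. The SHA pattern reflects the pinning rule for good canaries; the text does not say whether the bad canary is referenced by SHA or by its fixture branch name, so that check may need loosening.

```python
import re

# Keys taken from the example entry; the real suite may require more or fewer.
REQUIRED_KEYS = {
    "id", "recipe", "src", "ref", "expected_green",
    "stage_pass_checks", "stage_fail_checks",
}
SHA_RE = re.compile(r"[0-9a-f]{7,40}")  # abbreviated or full hex commit SHA


def canary_entry_errors(entry):
    """Return a list of human-readable problems with a canary entry."""
    errors = []
    missing = sorted(REQUIRED_KEYS - set(entry))
    if missing:
        errors.append("missing keys: " + ", ".join(missing))
    ref = str(entry.get("ref", ""))
    if not SHA_RE.fullmatch(ref):
        errors.append(f"ref {ref!r} does not look like a pinned commit SHA")
    for key in ("stage_pass_checks", "stage_fail_checks"):
        for check in entry.get(key, []):
            if not (isinstance(check, tuple) and len(check) == 2):
                errors.append(f"{key} entry {check!r} is not a (stage, test_name) pair")
    return errors


# Usage: the template entry is flagged until its placeholder ref is replaced.
template = {
    "id": "good-myrecipe",
    "recipe": "my-recipe",
    "src": "recipe-maintainers/my-recipe",
    "ref": "<pinned-sha>",
    "expected_green": True,
    "stage_pass_checks": [("install", "test_serving")],
    "stage_fail_checks": [],
}
print(canary_entry_errors(template))
```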
---

## Pin maintenance

Canary refs are pinned to specific SHAs for stability. When a recipe publishes a new release:

1. Update the `"ref"` SHA in the canary definition (use the new main-branch HEAD).
2. Update the pin comment with the new date/version.
3. Re-run the canary to confirm GREEN before committing the pin update.

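
For step 1, one possible way to look up the new main-branch HEAD is `git ls-remote`, which prints one `<sha><TAB><ref>` line per matching ref. The sketch below assumes the branch is named `main`; the function names and the sample SHA are hypothetical, and the repository URL is left to the caller.

```python
import subprocess


def parse_ls_remote(output, ref="refs/heads/main"):
    """Return the SHA that `git ls-remote` reported for `ref`, or None."""
    for line in output.splitlines():
        sha, _, name = line.partition("\t")
        if name.strip() == ref:
            return sha.strip()
    return None


def remote_head(repo_url, ref="refs/heads/main"):
    """Ask the remote for the current SHA of `ref` (needs git and network access)."""
    out = subprocess.run(
        ["git", "ls-remote", repo_url, ref],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_ls_remote(out, ref)


# Usage with made-up ls-remote output; the SHA here is a placeholder, not a real pin.
sample = "0123456789abcdef0123456789abcdef01234567\trefs/heads/main\n"
new_ref = parse_ls_remote(sample)
```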
The bad canary (`v5-stale-docroot`) is a stable fixture branch; update its pin only if the branch
is deleted. If it is deleted, recreate the pattern: an app that is up and passes the lifecycle
tiers but fails one functional assertion.