
# Regression canaries — E2E self-tests for the cc-ci server

A standing pytest suite that drives the **real** cc-ci lifecycle harness against pinned canary
recipes and verifies both halves of the server's job:

1. **Good canaries** — healthy apps are reported GREEN (install + upgrade + backup/restore pass).
2. **Bad canary** — broken apps are caught RED; a false-green makes the regression test itself fail.

These tests run the full cold lifecycle on the live cc-ci server. They are **slow** (minutes per
canary) and **opt-in** — kept out of the per-commit fast path by the `canary` marker.

---

## How to run

Run on the cc-ci server (abra + Docker + Swarm required):

```bash
ssh cc-ci
cd /root/cc-ci            # or wherever the repo is checked out
cc-ci-run python -m pytest tests/regression/ -m canary -v
```

Or a single canary:

```bash
cc-ci-run python -m pytest tests/regression/ -m canary -k good-simple -v
```

From the orchestrator:

```bash
ssh cc-ci "cd /root/cc-ci && cc-ci-run python -m pytest tests/regression/ -m canary -v"
```

---

## Canaries

| ID | Recipe | Purpose | Expected verdict |
|----|--------|---------|-----------------|
| `good-simple` | `custom-html-tiny` | Minimal static server — fast signal | GREEN |
| `good-significant` | `lasuite-docs` | Multi-service (backend + Postgres + Collabora + OIDC) | GREEN |
| `bad-false-green` | `custom-html` @ `v5-stale-docroot` | App is UP but serves wrong Content-Type — catches false-green | RED |

### Why the bad canary exists

The scariest regression is a **false-green**: the server reports PASS while the app is broken.
We already saw a fabricated full-PASS during the build. The `bad-false-green` canary pins a
known-broken fixture (`v5-stale-docroot`: nginx serves `.txt` as `application/octet-stream`). The
harness's `test_content_type_html_and_txt` catches this and returns RED (build #75 was RED for
exactly this fixture).

The regression test asserts `rc != 0`, where `rc` is the harness exit code. If the harness ever
wrongly returns green for this fixture, that assert fires and the canary run fails, so the
false-green is caught the next time the canaries run.
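
The verdict check can be sketched as below. This is an illustration only, not the code in `test_canaries.py`: `check_verdict` is a hypothetical helper, and it assumes an exit code of 0 means GREEN.

```python
def check_verdict(canary_id: str, rc: int, expected_green: bool) -> None:
    """Raise AssertionError when the harness verdict contradicts the expectation.

    Hypothetical helper: `rc` is the harness exit code (0 is assumed to mean GREEN).
    """
    if expected_green and rc != 0:
        raise AssertionError(f"{canary_id}: expected GREEN, harness exited {rc}")
    if not expected_green and rc == 0:
        # The failure mode the bad canary exists for.
        raise AssertionError(
            f"{canary_id}: false-green, harness exited 0 on a known-broken fixture"
        )


# A RED result for the bad canary is the healthy outcome, so this does not raise.
check_verdict("bad-false-green", rc=1, expected_green=False)
```

The same helper covers both directions: a good canary that goes RED and a bad canary that goes GREEN are both reported as failures of the regression suite.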

---

## What each canary verifies

### Per-tier semantic assertions (the "teeth")

The tests assert MORE than the harness exit code: they check that **specific named assertions**
ran and got the expected result. This guards against a different failure mode — a tier that
nominally "passes" because the assertion was silently removed or made vacuous.

| Stage | Test name | What it proves |
|-------|-----------|---------------|
| install | `test_serving` | Generic HTTP readiness check actually ran |
| install | `test_serving_and_frontend` | Lasuite-docs frontend (SPA shell) actually loaded |
| custom | `test_content_type` | Content-type assertion actually ran (bad canary only) |

If a tier assertion is removed: the named test disappears from `results.json` → the semantic
check fires → the regression suite catches the removal.
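
A minimal sketch of such a semantic check follows. The layout of `results.json` is an assumption here (a `tests` list of `stage`/`name`/`outcome` records), and `named_test_problems` is a hypothetical helper, so adapt both to the real artifact.

```python
def named_test_problems(results: dict, pass_checks, fail_checks) -> list[str]:
    """Return one message per named test that is missing or has the wrong outcome.

    Assumed (hypothetical) layout: results["tests"] is a list of
    {"stage": ..., "name": ..., "outcome": "pass" | "fail" | "skip"} records.
    """
    outcomes = {(t["stage"], t["name"]): t["outcome"] for t in results.get("tests", [])}
    problems = []
    for expected, checks in (("pass", pass_checks), ("fail", fail_checks)):
        for stage, name in checks:
            actual = outcomes.get((stage, name))
            if actual is None:
                # A test that never ran is treated as a failure, not as a pass.
                problems.append(f"{stage}/{name}: did not run (removed or renamed?)")
            elif actual != expected:
                problems.append(f"{stage}/{name}: expected {expected}, got {actual}")
    return problems


# Another install test still passes, but test_serving has disappeared.
results = {"tests": [{"stage": "install", "name": "test_other", "outcome": "pass"}]}
print(named_test_problems(results, [("install", "test_serving")], []))
```

Treating "not found" as a problem is the point of the design: a missing assertion must never be indistinguishable from a passing one.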

### Additional structural assertions (good canaries)

- `install` tier: "pass" (not fail, not skip)
- No tier is "fail" (skips acceptable for recipes without backup/custom tests)
- `flags.clean_teardown = True` (no leftover containers/volumes/secrets)
- `flags.no_secret_leak = True` (no secret value in the results artifact)
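
The four conditions above can be expressed as one function. This is a sketch under an assumed artifact layout (`tiers` mapping tier name to outcome, `flags` holding the two booleans); `structural_problems` is a hypothetical name.

```python
def structural_problems(results: dict) -> list[str]:
    """Check the tier and flag conditions listed above for a good canary.

    Assumed (hypothetical) layout: results["tiers"] maps tier name to
    "pass" | "fail" | "skip", and results["flags"] holds the two booleans.
    """
    tiers = results.get("tiers", {})
    flags = results.get("flags", {})
    problems = []
    if tiers.get("install") != "pass":
        problems.append(f"install tier is {tiers.get('install')!r}, expected 'pass'")
    for name, outcome in tiers.items():
        # install is already reported above; skips are acceptable elsewhere.
        if name != "install" and outcome == "fail":
            problems.append(f"tier {name} failed")
    for flag in ("clean_teardown", "no_secret_leak"):
        # `is not True` so that a missing flag counts as a problem too.
        if flags.get(flag) is not True:
            problems.append(f"flags.{flag} is not True")
    return problems


# A recipe without backup tests: the skip is tolerated, so no problems are reported.
healthy = {
    "tiers": {"install": "pass", "upgrade": "pass", "backup": "skip"},
    "flags": {"clean_teardown": True, "no_secret_leak": True},
}
print(structural_problems(healthy))
```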

---

## Cadence policy

**Do NOT run on every commit or PR.** These are slow and resource-heavy. Run them:

- Before a **release** of the cc-ci server (after a batch of server changes).
- As a **polishing pass** or pre-merge check for significant server refactors.
- On-demand when you suspect a regression: `pytest -m canary`.

They are NOT wired to the per-commit Drone pipeline. If adding a `!testme`-style trigger for the
cc-ci repo, gate it behind a deliberate label (e.g. `run-canaries`) — not an automatic run on
every push.

---

## How to add a canary

1. Identify a recipe that is already deployable and has pinned version tags.
2. Decide the expected verdict (GREEN or RED) and which tier assertions have teeth.
3. Add an entry to `CANARIES` in `test_canaries.py`:

```python
{
    "id": "good-myrecipe",
    "recipe": "my-recipe",
    "src": "recipe-maintainers/my-recipe",
    "ref": "<pinned-sha>",           # pin to a specific commit for stability
    "expected_green": True,
    "stage_pass_checks": [
        ("install", "test_serving"),  # verify this named test ran and passed
    ],
    "stage_fail_checks": [],
}
```

4. Run the canary once to confirm it passes:
   `cc-ci-run python -m pytest tests/regression/ -m canary -k good-myrecipe -v`

5. Update the pin comment with the date and the recipe version it was pinned at.
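
Before spending minutes on a live run, a new entry can be sanity-checked offline. The sketch below is not part of the suite: `canary_entry_problems` is a hypothetical helper, the key names are the ones shown in the example entry, and the rules that ids are unique and that every entry needs at least one named check are assumptions.

```python
REQUIRED_KEYS = {
    "id", "recipe", "src", "ref", "expected_green",
    "stage_pass_checks", "stage_fail_checks",
}


def canary_entry_problems(canaries: list[dict]) -> list[str]:
    """Return human-readable problems found in a list of canary definitions."""
    problems = []
    seen_ids = set()
    for index, entry in enumerate(canaries):
        label = entry.get("id", f"entry #{index}")
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"{label}: missing keys {sorted(missing)}")
            continue
        if label in seen_ids:
            # Assumption: ids must be unique so `-k <id>` selects one canary.
            problems.append(f"{label}: duplicate id")
        seen_ids.add(label)
        if not entry["stage_pass_checks"] and not entry["stage_fail_checks"]:
            # Without a named check, only the exit code is verified.
            problems.append(f"{label}: no named checks, so the entry has no teeth")
    return problems


entry = {
    "id": "good-myrecipe",
    "recipe": "my-recipe",
    "src": "recipe-maintainers/my-recipe",
    "ref": "<pinned-sha>",
    "expected_green": True,
    "stage_pass_checks": [("install", "test_serving")],
    "stage_fail_checks": [],
}
print(canary_entry_problems([entry]))
```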

---

## Pin maintenance

Canary refs are pinned to specific SHAs for stability. When a recipe publishes a new release:

1. Update the `"ref"` SHA in the canary definition (use the new main-branch HEAD).
2. Update the pin comment with the new date/version.
3. Re-run the canary to confirm GREEN before committing the pin update.

The bad canary (`v5-stale-docroot`) is a stable fixture branch, so leave its pin alone unless the
branch is deleted. If it is deleted, recreate the pattern: an app that is up and passes the
lifecycle tiers but fails one functional assertion.
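
One way to keep pins honest is a quick check that every `"ref"` is either a full commit SHA or a known fixture branch. This is a sketch, assuming full 40-character lowercase SHA-1 hashes; `unpinned_refs` and its `fixture_branches` parameter are hypothetical names.

```python
import re

# Assumption: pins are full 40-character lowercase SHA-1 hashes.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def unpinned_refs(
    canaries: list[dict],
    fixture_branches: frozenset = frozenset({"v5-stale-docroot"}),
) -> list[str]:
    """Return ids of canaries whose ref is neither a full SHA nor a fixture branch."""
    return [
        c["id"]
        for c in canaries
        if not FULL_SHA.fullmatch(c["ref"]) and c["ref"] not in fixture_branches
    ]


canaries = [
    {"id": "good-simple", "ref": "0123456789abcdef0123456789abcdef01234567"},
    {"id": "good-drifting", "ref": "main"},  # a moving branch, not a pin
    {"id": "bad-false-green", "ref": "v5-stale-docroot"},
]
print(unpinned_refs(canaries))
```

Only the entry that follows a moving branch is reported; the fixture branch is allowed explicitly because it is meant to stay frozen.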