feat(regression): add tests/regression/ E2E canary suite

Three canaries (@pytest.mark.canary) drive the real cold CI lifecycle:
- good-simple: custom-html-tiny @ main (435df8fc) — fast signal, expects GREEN
- good-significant: lasuite-docs @ main (290a8ad7) — multi-service, expects GREEN
- bad-false-green: custom-html @ v5-stale-docroot (71e7326a) — expects RED

Semantic teeth: beyond exit-code, each test asserts that specific named tests
ran in results.json stages (test_serving, test_serving_and_frontend, test_content_type).
If an assertion is removed, the named test disappears → regression test fails.

Includes conftest (run_recipe_ci helper + stage_has_{passing,failing}_test),
README (cadence policy, how to run, how to add), and phase state files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
autonomic-bot
2026-06-02 01:25:55 +00:00
parent 91a7088f56
commit fd3db37c49
6 changed files with 566 additions and 1 deletions


@@ -2,7 +2,12 @@
## Build backlog
*(Builder-owned section — read-only for Adversary)*
- [x] Create `tests/regression/` suite (conftest + test_canaries + README)
- [ ] Run `good-simple` canary (custom-html-tiny main) → confirm GREEN + test_serving passes
- [ ] Run `bad-false-green` canary (custom-html v5-stale-docroot) → confirm RED + test_content_type fails
- [ ] Run `good-significant` canary (lasuite-docs main) → confirm GREEN + test_serving_and_frontend passes
- [ ] Open PR for operator review (DoD item 5: NOT merged)
- [ ] Claim gate once all canary runs are GREEN/RED as expected + PR is open
## Adversary findings


@@ -0,0 +1,56 @@
# JOURNAL — server regression canaries phase (Builder)
**Phase:** server regression canaries
**Started:** 2026-06-02
---
## Step 0 — phase kickoff and design (2026-06-02)
**Context:** The mirror phase (plan-mirror-enroll-all-recipes.md) reached DONE at 2026-06-02T01:16Z.
The Adversary initialized the regression phase files in machine-docs/ at commit f202c5a.
**Decision: run regression tests ON cc-ci, not from the orchestrator**
The regression tests call `run_recipe_ci.py` which uses abra/docker/swarm — these only exist on
cc-ci. The test process runs under `cc-ci-run python -m pytest`, which sets up the right PATH
(abra, python3, playwright, etc.). The test then invokes `run_recipe_ci.py` as a subprocess using
`sys.executable` (inherits the same python3 from cc-ci-run).
The README.md documents the `ssh cc-ci "cc-ci-run python -m pytest tests/regression/ -m canary"`
invocation pattern.
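A minimal sketch of how a helper like `run_recipe_ci()` might shell out to the harness with `sys.executable`, so the child process reuses the interpreter that `cc-ci-run` selected. The function name `run_harness`, the argument layout, and the location of `results.json` inside the artifact directory are assumptions for illustration, not the confirmed interface of `run_recipe_ci.py`.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def run_harness(script, args, artifact_dir=None):
    """Run a harness script under the current interpreter.

    Returns (rc, results_dict, artifact_dir). `script` and `args` are
    hypothetical: the real run_recipe_ci.py command line is not shown here.
    """
    artifact_dir = Path(artifact_dir or tempfile.mkdtemp(prefix="canary-"))
    proc = subprocess.run(
        # sys.executable keeps the subprocess on the same python3 as the test run
        [sys.executable, str(script), *args, str(artifact_dir)],
        capture_output=True,
        text=True,
    )
    results_path = artifact_dir / "results.json"  # assumed location
    # A missing results file yields an empty dict so callers can still assert on rc
    results = json.loads(results_path.read_text()) if results_path.exists() else {}
    return proc.returncode, results, artifact_dir
```

Returning the exit code alongside the parsed results lets a test assert on both layers independently, even when the harness dies before writing any results.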
**Canary selection:**
| ID | Recipe | SHA | Rationale |
|----|--------|-----|-----------|
| good-simple | custom-html-tiny | 435df8fc (main) | Fast, few deps, quick signal |
| good-significant | lasuite-docs | 290a8ad7 (main) | Multi-service, exercises real breadth |
| bad-false-green | custom-html | 71e7326a (v5-stale-docroot) | Already produced RED build #75; pinned fixture |
SHAs confirmed from Gitea API on 2026-06-02.
**Semantic checks ("teeth") design:**
The regression tests assert BOTH exit code AND named tests in results.json stages. This guards
against two failure modes:
1. Harness returns wrong exit code (false-green / false-red) → rc assertion catches it
2. A specific assertion is silently removed or made vacuous → named test disappears from stages → semantic check catches it
For custom-html-tiny: `test_serving` (generic install) must appear passing
For lasuite-docs: `test_serving_and_frontend` (install overlay) must appear passing
For bad canary: `test_content_type` (custom functional) must appear failing
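The two-layer check can be sketched as follows. The `results.json` shape assumed here (a `stages` mapping from stage name to an object with a `tests` list of `{name, outcome}` records) and the argument order of the helpers are hypothetical; the real schema and signatures are not shown in this journal.

```python
def _stage_tests(results, stage):
    """Return the test records for one stage (assumed results.json layout)."""
    return results.get("stages", {}).get(stage, {}).get("tests", [])


def stage_has_passing_test(results, stage, name):
    """True only if `name` ran in `stage` and passed."""
    return any(
        t.get("name") == name and t.get("outcome") == "passed"
        for t in _stage_tests(results, stage)
    )


def stage_has_failing_test(results, stage, name):
    """True only if `name` ran in `stage` and failed."""
    return any(
        t.get("name") == name and t.get("outcome") == "failed"
        for t in _stage_tests(results, stage)
    )
```

Both helpers return False when the named test is absent, so a removed test shows up as a regression failure rather than being silently skipped.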
**File layout:**
- `tests/regression/conftest.py` — run_recipe_ci(), stage_has_passing_test(), stage_has_failing_test()
- `tests/regression/test_canaries.py` — parametrized @pytest.mark.canary test
- `tests/regression/README.md` — cadence policy + how to run + how to add
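Because the suite is selected with `-m canary`, the marker would normally be registered so pytest does not warn about an unknown mark. One way to do that from `conftest.py` is the `pytest_configure` hook. Whether this suite registers the marker there or in an ini file is not stated here, so treat this as a sketch; the description string is illustrative.

```python
def pytest_configure(config):
    """Register the `canary` marker so `-m canary` selection is warning-free."""
    config.addinivalue_line(
        "markers",
        "canary: cold end-to-end regression canary (slow; runs the real CI lifecycle)",
    )
```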
**Next step:** commit + push, then run good-simple and bad-false-green canaries to get real output.
lasuite-docs is slow (10-20 min), so it will run last.
---
## Step 1 — initial canary runs (in progress, 2026-06-02)
Suite committed; canary outputs will be recorded here as they complete.


@@ -0,0 +1,85 @@
# STATUS — server regression canaries phase
**Phase:** server regression canaries (codified E2E self-tests)
**SSOT:** `/srv/cc-ci/cc-ci-plan/plan-server-regression-canaries.md`
**Builder loop started:** 2026-06-02
**Repo:** git.autonomic.zone/recipe-maintainers/cc-ci
---
## Current state
**Gate: D-initial CLAIMED — test suite written; awaiting first canary run**
The `tests/regression/` suite is committed. Before claiming the final gate (all DoD items
verified), the canaries need to actually run on the live server and return the expected verdicts.
Currently running the good-simple (custom-html-tiny) canary to confirm GREEN.
---
## What was built
`tests/regression/` committed in the cc-ci repo:
- `conftest.py`: `run_recipe_ci()` helper that invokes the real harness as a subprocess and returns `(rc, results_dict, artifact_dir)`; `stage_has_passing_test()` / `stage_has_failing_test()` helpers for semantic checks
- `test_canaries.py` — parametrized `@pytest.mark.canary` test with three canaries (see below)
- `README.md` — cadence policy, how to run, how to add a canary
---
## Canaries defined
| ID | Recipe | SHA pinned | Expected |
|----|--------|-----------|----------|
| `good-simple` | `custom-html-tiny` | `435df8fc` (main 2026-06-02) | GREEN |
| `good-significant` | `lasuite-docs` | `290a8ad7` (main 2026-06-02) | GREEN |
| `bad-false-green` | `custom-html` | `71e7326a` (v5-stale-docroot) | RED |
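The table above maps naturally onto a small data structure that a parametrized test could iterate over. The `Canary` dataclass, its field names, and `verdict_matches` are hypothetical, chosen for illustration; only the IDs, recipes, short SHAs, and expected verdicts come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Canary:
    id: str
    recipe: str
    sha: str            # short SHA pinned in the table above
    expect_green: bool  # True for GREEN canaries, False for the RED one


CANARIES = [
    Canary("good-simple", "custom-html-tiny", "435df8fc", True),
    Canary("good-significant", "lasuite-docs", "290a8ad7", True),
    Canary("bad-false-green", "custom-html", "71e7326a", False),
]


def verdict_matches(canary, rc):
    """GREEN canaries must exit 0; the RED canary must exit non-zero."""
    return (rc == 0) if canary.expect_green else (rc != 0)
```

In a pytest suite, a list like `CANARIES` could feed `pytest.mark.parametrize` with `ids=[c.id for c in CANARIES]`, which is one way to make selection such as `-k good-simple` work.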
---
## Semantic assertions (teeth)
Good canaries:
- `rc == 0` (harness exit)
- install tier: "pass"
- No tier is "fail"
- `flags.clean_teardown == True`
- `flags.no_secret_leak == True`
- Named test `test_serving` present + passing in install stage (custom-html-tiny)
- Named test `test_serving_and_frontend` present + passing in install stage (lasuite-docs)
Bad canary:
- `rc != 0` (PRIMARY: a false-green is caught here)
- Named test `test_content_type` present + FAILING in custom stage (proves the guard has not been made vacuous)
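The good-canary checklist can be read as a single function that collects every violated condition, which gives a more useful failure message than stopping at the first one. The `tiers` and `flags` keys mirror the names used in the list above, but their exact location and value types in `results.json` are assumptions, and `good_canary_violations` is a hypothetical name.

```python
def good_canary_violations(rc, results):
    """Return human-readable violations; an empty list means GREEN as expected."""
    tiers = results.get("tiers", {})  # assumed shape: {"install": "pass", ...}
    flags = results.get("flags", {})  # assumed shape: {"clean_teardown": True, ...}
    problems = []
    if rc != 0:
        problems.append(f"harness exit code {rc}, expected 0")
    if tiers.get("install") != "pass":
        problems.append(f"install tier is {tiers.get('install')!r}, expected 'pass'")
    failed = sorted(name for name, verdict in tiers.items() if verdict == "fail")
    if failed:
        problems.append(f"failing tiers: {failed}")
    for flag in ("clean_teardown", "no_secret_leak"):
        # Require a literal True so a missing or truthy-but-wrong value still fails
        if flags.get(flag) is not True:
            problems.append(f"flag {flag} is not True")
    return problems
```

A test could then end with `problems = good_canary_violations(rc, results)` followed by `assert not problems, problems`, so that every broken condition appears in one failure report.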
---
## How to verify (Adversary commands)
From the repo root on the cc-ci server (requires the repo checked out at `/root/cc-ci` or similar):
```bash
# Good simple (fast ~2-5 min):
cc-ci-run python -m pytest tests/regression/ -m canary -k good-simple -v
# Bad canary (fast ~2-5 min, same recipe lifecycle):
cc-ci-run python -m pytest tests/regression/ -m canary -k bad-false-green -v
# Full suite (slow — lasuite-docs is 10-20 min):
cc-ci-run python -m pytest tests/regression/ -m canary -v
```
Expected outcomes:
- `good-simple`: test PASSES (harness returns GREEN, test_serving passes)
- `bad-false-green`: test PASSES (harness returns RED, test_content_type fails in custom stage)
- `good-significant`: test PASSES (harness returns GREEN, test_serving_and_frontend passes)
To verify the teeth, tamper with one outcome and confirm the regression test fails:
- Good canary: remove `test_serving` so it no longer runs → `stage_has_passing_test` returns False → the regression test fails
- Bad canary: temporarily change the regression assertion to `rc == 0` → the regression test fails because the harness returns non-zero (the rc check has teeth)
---
## In-flight
- Canary `good-simple` running on cc-ci (started ~now)
- Status will be updated once run completes and we have actual output to paste