Three canaries (@pytest.mark.canary) drive the real cold CI lifecycle:
- good-simple: custom-html-tiny @ main (435df8fc) — fast signal, expects GREEN
- good-significant: lasuite-docs @ main (290a8ad7) — multi-service, expects GREEN
- bad-false-green: custom-html @ v5-stale-docroot (71e7326a) — expects RED
Semantic teeth: beyond the exit code, each test asserts that specific named tests
appear in the results.json stages (test_serving, test_serving_and_frontend, test_content_type).
If an assertion is removed, its named test disappears and the regression test fails.
Includes conftest (run_recipe_ci helper + stage_has_{passing,failing}_test),
README (cadence policy, how to run, how to add), and phase state files.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
57 lines
2.4 KiB
Markdown
# JOURNAL — server regression canaries phase (Builder)

**Phase:** server regression canaries
**Started:** 2026-06-02

---

## Step 0 — phase kickoff and design (2026-06-02)

**Context:** Mirror phase (plan-mirror-enroll-all-recipes.md) completed DONE at 2026-06-02T01:16Z.
Adversary initialized regression phase files in machine-docs/ at commit f202c5a.

**Decision: run regression tests ON cc-ci, not from the orchestrator**

The regression tests call `run_recipe_ci.py`, which uses abra/docker/swarm; these exist only on
cc-ci. The test process runs under `cc-ci-run python -m pytest`, which sets up the right PATH
(abra, python3, playwright, etc.). The test then invokes `run_recipe_ci.py` as a subprocess using
`sys.executable`, so it inherits the same python3 from cc-ci-run.

The README.md documents the `ssh cc-ci "cc-ci-run python -m pytest tests/regression/ -m canary"`
invocation pattern.

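A minimal sketch of the subprocess pattern described above, assuming the harness takes the recipe and ref on its command line. The script path and the `--recipe` / `--ref` flags are hypothetical placeholders; the real interface of `run_recipe_ci.py` is not shown in this journal.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical location of the harness script; the journal does not give its path.
RUN_RECIPE_CI = Path("run_recipe_ci.py")


def build_command(recipe: str, ref: str) -> list[str]:
    """Build the argv for one harness run.

    The --recipe / --ref flags are illustrative assumptions, not the
    script's documented interface.
    """
    # sys.executable is the interpreter running pytest, i.e. the python3 that
    # cc-ci-run put on PATH, so the child process uses the same interpreter.
    return [sys.executable, str(RUN_RECIPE_CI), "--recipe", recipe, "--ref", ref]


def run_harness(recipe: str, ref: str, timeout_s: int = 1800) -> int:
    """Run the harness as a subprocess and return its exit code."""
    # check=False: a non-zero exit is an expected outcome for the bad canary,
    # so it is returned to the caller instead of raising.
    completed = subprocess.run(build_command(recipe, ref), timeout=timeout_s, check=False)
    return completed.returncode
```

Returning the raw exit code rather than raising keeps the RED canary testable: the caller asserts on the code instead of catching an exception.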
**Canary selection:**

| ID | Recipe | SHA (ref) | Rationale |
|----|--------|-----------|-----------|
| good-simple | custom-html-tiny | 435df8fc (main) | Fast, few deps, quick signal |
| good-significant | lasuite-docs | 290a8ad7 (main) | Multi-service, exercises real breadth |
| bad-false-green | custom-html | 71e7326a (v5-stale-docroot) | Already produced RED build #75; pinned fixture |

SHAs confirmed from Gitea API on 2026-06-02.

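The selection above could be pinned as data so that the test parameters and the journal stay in sync. This is only a sketch: the `Canary` class, its field names, and `canary_ids()` are illustrative and are not taken from `test_canaries.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Canary:
    """One pinned canary: which recipe, at which commit, with which expected verdict."""

    canary_id: str
    recipe: str
    sha: str
    ref: str
    expect_green: bool


# Values copied from the canary selection table in this journal.
CANARIES = (
    Canary("good-simple", "custom-html-tiny", "435df8fc", "main", True),
    Canary("good-significant", "lasuite-docs", "290a8ad7", "main", True),
    Canary("bad-false-green", "custom-html", "71e7326a", "v5-stale-docroot", False),
)


def canary_ids() -> list[str]:
    """Return readable IDs, e.g. for the ids= argument of pytest.mark.parametrize."""
    return [c.canary_id for c in CANARIES]
```

Freezing the dataclass makes accidental mutation of a pinned SHA raise an error instead of silently changing what the canary tests.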
**Semantic checks ("teeth") design:**

The regression tests assert BOTH the exit code AND the presence of named tests in the results.json stages. This guards
against two failure modes:
1. The harness returns the wrong exit code (false-green / false-red) → the rc assertion catches it
2. A specific assertion is silently removed or made vacuous → the named test disappears from the stages → the semantic check catches it

- For custom-html-tiny: `test_serving` (generic install) must appear passing
- For lasuite-docs: `test_serving_and_frontend` (install overlay) must appear passing
- For the bad canary: `test_content_type` (custom functional) must appear failing

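The two-layer check can be sketched as follows. The results.json schema used here (a `stages` mapping of stage name to a list of `{"name", "outcome"}` records) is an assumption made for illustration, and the helper signatures may differ from the real ones in `conftest.py`; `check_canary` is a hypothetical name.

```python
from typing import Any

# Assumed (hypothetical) shape of results.json:
# {"stages": {"install": [{"name": "test_serving", "outcome": "passed"}, ...], ...}}
Results = dict[str, Any]


def _stage_has_test(results: Results, name: str, outcome: str) -> bool:
    """True if any stage contains a test with this name and outcome."""
    for tests in results.get("stages", {}).values():
        for test in tests:
            if test.get("name") == name and test.get("outcome") == outcome:
                return True
    return False


def stage_has_passing_test(results: Results, name: str) -> bool:
    return _stage_has_test(results, name, "passed")


def stage_has_failing_test(results: Results, name: str) -> bool:
    return _stage_has_test(results, name, "failed")


def check_canary(rc: int, results: Results, expect_green: bool, named_test: str) -> bool:
    """Apply both teeth: the exit code and the named test must agree with the expectation."""
    if expect_green:
        return rc == 0 and stage_has_passing_test(results, named_test)
    return rc != 0 and stage_has_failing_test(results, named_test)
```

With this shape, a GREEN run whose named test has vanished from the stages fails the check even though the exit code is 0, which is the second failure mode listed above.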
**File layout:**
- `tests/regression/conftest.py`: run_recipe_ci(), stage_has_passing_test(), stage_has_failing_test()
- `tests/regression/test_canaries.py`: parametrized @pytest.mark.canary test
- `tests/regression/README.md`: cadence policy, how to run, how to add

**Next step:** commit and push, then run the good-simple and bad-false-green canaries to get real output.
lasuite-docs is slow (10-20 min), so it will run last.

---

## Step 1 — initial canary runs (in progress, 2026-06-02)

Committed the suite; canary outputs will be recorded here as they complete.