# REVIEW — server regression canaries phase (Adversary ledger)

**Phase:** server regression canaries (codified E2E self-tests)
**SSOT:** `/srv/cc-ci/cc-ci-plan/plan-server-regression-canaries.md`
**Adversary loop started:** 2026-06-02T01:15Z
**Repo:** git.autonomic.zone/recipe-maintainers/cc-ci
**Adversary clone:** /srv/cc-ci/cc-ci-adv

---

## D-gate verdicts

### D-initial update @2026-06-02T01:46Z — A-reg-1 CLOSED; A-reg-2 still open

**A-reg-1 RESOLVED.** Cold-verify after fix:
```
ssh cc-ci && cd /root/builder-clone && git pull --rebase
cc-ci-run -m pytest tests/regression/ --collect-only
```
Output: `collected 3 items` — `test_canary[good-simple]`, `test_canary[good-significant]`, `test_canary[bad-false-green]`. No errors.

**Canary artifacts cold-verified from cc-ci artifact dirs:**

`good-simple (custom-html-tiny)` — `/var/lib/cc-ci-runs/regression-good-simple-1/results.json`:
- `results: install=pass, upgrade=pass, backup=skip, restore=skip, custom=skip` ✓
- `flags: clean_teardown=true, no_secret_leak=true` ✓
- `install/test_serving`: PASS ✓ (stage_has_passing_test confirms teeth present)

`bad-false-green (custom-html v5-stale-docroot)` — `/var/lib/cc-ci-runs/regression-bad-canary-1/results.json`:
- `results: install=pass, upgrade=pass, backup=pass, restore=pass, custom=FAIL` ✓
- `flags: clean_teardown=true, no_secret_leak=true` ✓
- `custom/test_content_type_html_and_txt`: FAIL with `Content-Type='application/octet-stream'` ✓
- `rc` would be non-zero (any(v=="fail")) ✓ → regression test `assert rc != 0` PASSES

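The exit-code rule cited in the last bullet, `any(v=="fail")`, can be sketched as below. This is a minimal illustration, assuming `results` is a flat mapping of tier name to status string; the helper name `exit_code_from_results` and the case normalisation are hypothetical and not taken from the repo.

```python
def exit_code_from_results(results: dict[str, str]) -> int:
    """Return 1 if any tier failed, else 0 (hypothetical helper)."""
    # Normalise case so "FAIL" and "fail" are treated alike (an assumption).
    return 1 if any(v.lower() == "fail" for v in results.values()) else 0


# Tier results as recorded above for the two cold-verified canaries.
bad_false_green = {
    "install": "pass", "upgrade": "pass", "backup": "pass",
    "restore": "pass", "custom": "FAIL",
}
good_simple = {
    "install": "pass", "upgrade": "pass", "backup": "skip",
    "restore": "skip", "custom": "skip",
}

rc_bad = exit_code_from_results(bad_false_green)
rc_good = exit_code_from_results(good_simple)
```

Under these assumptions the bad canary yields a non-zero `rc`, so `assert rc != 0` holds, while `skip` tiers on the good canary do not count as failures.
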
`good-significant (lasuite-docs)` — upgrade FAILED in Builder's run:
- `results: install=PASS, upgrade=FAIL` — `test_upgrade_reconverges` → convergence race
- This is the known WOPI/upgrade convergence risk from the plan (§ Risks). Builder is re-running.
- OBSERVATION (non-blocking now): if consistently flaky, add bounded retries to readiness probe per
  plan policy ("bounded retries on readiness only, never on correctness assertion"). Will watch.

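The plan policy quoted in the observation ("bounded retries on readiness only, never on correctness assertion") could look roughly like this. It is a sketch under assumptions: `wait_until_ready`, its parameters, and the simulated probe are hypothetical names, not code from the suite.

```python
import time
from typing import Callable


def wait_until_ready(probe: Callable[[], bool], attempts: int = 5,
                     delay_s: float = 0.0) -> bool:
    """Poll a readiness probe a bounded number of times (hypothetical helper)."""
    for attempt in range(attempts):
        if probe():
            return True
        # Sleep between polls, but not after the final attempt.
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False


# Simulated probe: not ready on the first two polls, ready on the third.
_polls = iter([False, False, True])
ready = wait_until_ready(lambda: next(_polls), attempts=5)

# The correctness assertion runs exactly once, with no retry wrapped around it.
assert ready, "service never became ready within the retry budget"
```

Only the readiness wait is retried; a failing correctness assertion after the service is ready still fails the run immediately. A real probe would use a non-zero `delay_s`.
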
**A-reg-2 partially addressed** — 4 per-tier RED canary tests added to suite, 7 tests collect.
But bad-backup and bad-restore FIXTURES are broken (see A-reg-3). A-reg-2 cannot close until
all 4 canaries actually produce the expected results.

---

### D-initial-2 update @2026-06-02T02:00Z — A-reg-3 filed; bad-backup/bad-restore fixtures broken

4 per-tier RED canary tests now in suite (7 tests collect via cold --collect-only). SHAs verified:
- `4ae8866100563204` (custom-html-tiny, bad image) ✓ — bad-install + bad-upgrade fixture
- `e1e3c5fc5e2bd414` (custom-html, bad-backup) — SHA exists BUT compose.yml is empty (A-reg-3)
- `5a481cc1f6b2a462` (custom-html, bad-restore) — SHA exists BUT compose.yml is empty (A-reg-3)

**Cold-verified canary run results:**

- bad-install (regression-bad-install-v2): `install=fail, upgrade=na` ✓ — install tier fails as intended
- bad-upgrade (regression-bad-upgrade-v2): `install=pass, upgrade=fail, custom=skip` ✓ — upgrade tier fails as intended
- bad-backup (regression-bad-backup-1): `install=pass, upgrade=fail, backup=skip` ✗ — WRONG TIER

Root cause A-reg-3: `regression-bad-backup` branch has empty compose.yml (whole file deleted, not
just backup path changed). Empty compose → chaos upgrade deploy fails → upgrade=fail, backup never
runs. Same issue for `regression-bad-restore` (same empty compose.yml diff).

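A cheap pre-flight check on a fixture checkout would surface an empty compose.yml before a full canary run. The sketch below is hypothetical and not part of cc-ci: `compose_has_content` is an invented helper, and the sample compose content is illustrative only.

```python
from pathlib import Path
import tempfile


def compose_has_content(compose_path: Path) -> bool:
    """True if the file exists and has at least one non-blank, non-comment line."""
    if not compose_path.is_file():
        return False
    for line in compose_path.read_text().splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            return True
    return False


# Demonstrated on throwaway files rather than the real fixture branches.
with tempfile.TemporaryDirectory() as tmp:
    empty = Path(tmp) / "empty-compose.yml"
    empty.write_text("")
    filled = Path(tmp) / "compose.yml"
    filled.write_text("services:\n  app:\n    image: nginx\n")
    empty_ok = compose_has_content(empty)
    filled_ok = compose_has_content(filled)
```

This only detects a gutted file; it does not confirm that the remaining diff is limited to the targeted backup or restore change.
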
**`_assert_red_at_tier` for bad-backup would FAIL** with `expected 'backup'='fail', got 'skip'` —
proving the fixture is broken, not the test.

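The check described above can be sketched as follows. This is not the repo's `_assert_red_at_tier`: the function name, the tier order, and the choice to check the target tier before the prior tiers are assumptions, made so that the message matches the one quoted.

```python
TIER_ORDER = ["install", "upgrade", "backup", "restore"]


def assert_red_at_tier_sketch(results: dict[str, str], tier: str) -> None:
    """Require `tier` to fail and every tier before it to pass."""
    got = results.get(tier)
    assert got == "fail", f"expected {tier!r}='fail', got {got!r}"
    for prior in TIER_ORDER[:TIER_ORDER.index(tier)]:
        got = results.get(prior)
        assert got == "pass", f"expected {prior!r}='pass', got {got!r}"


# Results recorded above for regression-bad-backup-1 (broken fixture).
broken_backup = {"install": "pass", "upgrade": "fail", "backup": "skip"}

try:
    assert_red_at_tier_sketch(broken_backup, "backup")
    message = ""
except AssertionError as exc:
    message = str(exc)
```

With the recorded bad-upgrade results (`install=pass, upgrade=fail`) the same sketch passes for the `upgrade` tier, which is the behaviour the per-tier canaries rely on.
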
**What still needs fixing before final gate:**
1. A-reg-3: Recreate bad-backup and bad-restore fixtures with correct compose.yml (only targeted change)
2. Run bad-backup and bad-restore to confirm correct tier failures
3. Re-run good-significant (lasuite-docs) to confirm upgrade race is transient
4. Open PR

---

### D-initial: FAIL @2026-06-02T01:38Z — suite won't collect (A-reg-1); plan gap (A-reg-2)

Builder claimed: test suite written, initial gate; canaries in-flight.

**Cold verification result: FAIL — two blocking issues.**

**A-reg-1 (CRITICAL): Relative import fails, 0 tests collected.**
```
ssh cc-ci && cd /root/builder-clone
cc-ci-run -m pytest tests/regression/ --collect-only
```
Output (cold, fresh shell):
```
collected 0 items / 1 error
ImportError: attempted relative import with no known parent package
tests/regression/test_canaries.py:18: from .conftest import run_recipe_ci, ...
!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!
```
Root cause: `tests/regression/__init__.py` and `tests/__init__.py` missing. Fix: add them or
use absolute imports (as other test files in this repo do).

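The first fix option (adding the missing package markers) can be sketched as below. `ensure_package_markers` is a hypothetical helper, and the demonstration runs against a throwaway directory rather than the Builder clone; whether the markers alone are sufficient depends on how pytest is configured in the repo.

```python
from pathlib import Path
import tempfile


def ensure_package_markers(repo_root: Path, package_dir: str) -> list[Path]:
    """Create any missing __init__.py on the path from repo_root to package_dir."""
    created = []
    current = repo_root
    for part in Path(package_dir).parts:
        current = current / part
        marker = current / "__init__.py"
        if not marker.exists():
            marker.touch()
            created.append(marker)
    return created


# Demonstrated on a temporary tree that mimics tests/regression/.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "tests" / "regression").mkdir(parents=True)
    created = [p.relative_to(root).as_posix()
               for p in ensure_package_markers(root, "tests/regression")]
```

With both markers present, `tests/regression/test_canaries.py` has a parent package, so a relative import such as `from .conftest import ...` can resolve during collection.
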
**A-reg-2 (HIGH): Plan updated (commit 7bdeb74) — 4 per-tier RED canaries now mandatory (DoD#4).**
Updated plan requires RED canaries for install/upgrade/backup/restore tiers on custom-html-tiny,
each asserting RED at the intended tier with prior tiers PASS. Current suite: 3 canaries only
(2 good + 1 bad-custom-assertion). All four are MISSING. Cannot claim DONE without them.

**Other code quality observations (not blocking):**
- Canary SHAs all verified present on Gitea ✓
  - custom-html-tiny: `435df8fc98ef7598` ✓ (main 2026-06-02 merge commit)
  - lasuite-docs: `290a8ad72d06232f` ✓ (v0.3.3+v5.1.0 merge)
  - custom-html v5-stale-docroot: `71e7326a99bbb690` ✓ (confirmed RED via build #81)
- `CCCI_RUN_ID` and `CCCI_RUNS_DIR` correctly picked up by `results.py` ✓
- `_assert_red` / `_assert_green` logic sound ✓
- README cadence policy complete ✓

**Verdict: FAIL. Standing issues: A-reg-1 (critical), A-reg-2 (high). Builder must fix both
before re-claiming this gate.**

---

## Adversary findings

*(See BACKLOG-regression.md § Adversary findings: A-reg-1, A-reg-2)*

---

## Break-it probes log

*(Break-it probes will be recorded here as they are run)*

---

## Pre-orientation findings @01:17Z

**Known-bad fixture confirmed present and working:**
- Branch: `recipe-maintainers/custom-html:v5-stale-docroot` (SHA `71e7326a99bb`)
- Build #81 (run 3h ago): confirmed RED — `custom` stage FAIL; specifically:
  - `test_content_type_html_and_txt`: FAIL — `ccci-e0d6e804.txt Content-Type='application/octet-stream'`, expected `text/plain`
  - All other tiers (install/upgrade/backup/restore): PASS
  - `clean_teardown=true`, `no_secret_leak=true`
- **Implication for regression suite DoD#3**: the known-bad canary correctly produces RED;
  the regression test must assert this outcome AND must be shown to fail if the server returns
  green for it (false-green detection).

**Good canaries:**
- `custom-html-tiny`: build #45 GREEN (SHA `4bd8416a209f`, 21h ago) — simple, fast
- `lasuite-docs`: multi-service stack with DEPS=["keycloak"], DEPLOY_TIMEOUT=900s — test exists at tests/lasuite-docs/

**Infrastructure state:**
- Bridge (`ccci-bridge_app`): running, polling 20 repos every 30s ✓
- Drone exec runner: running ✓
- Dashboard: serving at ci.commoninternet.net ✓
- Builder hasn't started regression phase: no STATUS-regression.md yet

**Notes:**
- Mirror phase (plan-mirror-enroll-all-recipes.md) completed DONE at 2026-06-02T01:16Z.
- This phase starts fresh: no STATUS-regression.md or tests/regression/ yet.
- Watching for Builder to create STATUS-regression.md and begin work.