Cold-verified from cc-ci artifact dirs + PR branch collect:
- DoD#1: 7 tests collect from regression-canaries branch ✓
- DoD#2: good-simple (install/upgrade=pass, test_serving) ✓; good-significant run-2 (all tiers pass, test_serving_and_frontend) ✓
- DoD#3: bad-false-green RED, rc!=0 false-green guard has teeth ✓
- DoD#4: all 4 per-tier RED canaries at correct tiers (install/upgrade/backup/restore) ✓
- DoD#5: README cadence+canaries+add-instructions ✓
- DoD#6: PR#5 state=open, merged=False ✓
Inbox consumed; no vetoes; phase DONE pending operator PR review.
REVIEW — server regression canaries phase (Adversary ledger)
Phase: server regression canaries (codified E2E self-tests)
SSOT: /srv/cc-ci/cc-ci-plan/plan-server-regression-canaries.md
Adversary loop started: 2026-06-02T01:15Z
Repo: git.autonomic.zone/recipe-maintainers/cc-ci
Adversary clone: /srv/cc-ci/cc-ci-adv
D-gate verdicts
D-final: PASS @2026-06-02T03:36Z — all 7 canaries cold-verified; PR#5 open; all DoD items met
Cold verification result: PASS
All DoD items independently verified (cold shell, Adversary clone, no cached state):
DoD#1 — tests/regression/ committed:
- cc-ci-run -m pytest tests/regression/ --collect-only -q on cc-ci from PR branch: 7 tests collected ✓
- Files present on regression-canaries branch: conftest.py, test_canaries.py, README.md, plus tests/custom-html-bkp-bad/ and tests/custom-html-rst-bad/ ✓
DoD#2 — both good canaries GREEN with semantic assertion teeth:
- good-simple (regression-good-simple-1, SHA 435df8fc): install=pass, upgrade=pass, test_serving PASS in install stage ✓
  - Teeth: if test_serving removed → stage_has_passing_test("install","test_serving") → False → assert fires ✓
- good-significant (regression-good-significant-2, SHA 290a8ad7): install=pass, upgrade=pass, backup=pass, restore=pass, custom=pass, clean_teardown=true, no_secret_leak=true ✓
  - test_serving_and_frontend PASS in install stage ✓
  - Teeth: if test_serving_and_frontend removed → stage_has_passing_test("install","test_serving_and_frontend") → False → assert fires ✓
  - Run 1 had upgrade=fail (convergence race, transient); run 2 fully GREEN. Known plan risk; no action needed unless persistent.
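The teeth checks above depend on a per-stage lookup such as stage_has_passing_test. The following is a minimal sketch of how such a helper might behave, not the implementation in conftest.py. It assumes a hypothetical results shape (a nested mapping of stage name to test name to outcome) and takes that mapping as an explicit first argument so the example is self-contained; the real helper is called with two arguments and its results.json schema is not shown in this ledger.

```python
from typing import Mapping

# Hypothetical shape, assumed for illustration only:
#   {"install": {"test_serving": "pass"}, "upgrade": {...}, ...}
StageTests = Mapping[str, Mapping[str, str]]


def stage_has_passing_test(stages: StageTests, stage: str, test: str) -> bool:
    """Return True only if `test` ran in `stage` and reported "pass"."""
    return stages.get(stage, {}).get(test) == "pass"


# A run where the semantic test is present and passing.
with_test = {"install": {"test_serving": "pass"}}
# A run where the test was removed: the stage may still be green overall.
without_test = {"install": {}}

present = stage_has_passing_test(with_test, "install", "test_serving")
removed = stage_has_passing_test(without_test, "install", "test_serving")
print(present, removed)
```

The point of the design is that a missing test yields False rather than a vacuous pass, so deleting test_serving turns the canary RED instead of leaving it silently GREEN.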
DoD#3 — bad-false-green catches false-green:
- bad-false-green (regression-bad-canary-1, SHA 71e7326a): custom=fail, test_content_type_html_and_txt: FAIL (Content-Type='application/octet-stream') ✓
  - Teeth: if harness returns rc=0 → assert rc != 0 fires → false-green caught ✓
DoD#4 — 4 per-tier RED canaries (cold-verified from artifacts):
- bad-install (regression-bad-install-v2, SHA 4ae8866): install=fail, upgrade=na ✓; failing_tier=install, passing_before=[] ✓
- bad-upgrade (regression-bad-upgrade-v2, SHA 4ae8866): install=pass, upgrade=fail ✓; prior tier PASS verified ✓
- bad-backup (regression-bad-backup-5, SHA b6fe99de, recipe custom-html-bkp-bad): install=pass, backup=fail ✓; test_backup_captures_state FAIL ✓
- bad-restore (regression-bad-restore-3, SHA 9a73a184, recipe custom-html-rst-bad): install=pass, backup=pass, restore=fail ✓; test_restore_returns_state FAIL ✓
- All 4: if harness wrongly returned rc=0 → assert rc != 0 fires ✓; if wrong tier failed → tier check assertion fires ✓
DoD#5 — README.md:
- tests/regression/README.md present on regression-canaries branch ✓
- Contains: cadence policy ("Do NOT run on every commit"), canary table, per-tier teeth explanation, how to add a canary ✓
DoD#6 — NOT merged, PR opened for operator review:
- PR#5: https://git.autonomic.zone/recipe-maintainers/cc-ci/pulls/5 (state=open, merged=False) ✓
- Branch: regression-canaries → main. 10 files, 704 insertions ✓
- PR body says "Do not merge — loops never merge" ✓
Observations (non-blocking, not DoD blockers):
- good-significant run 1's upgrade=fail was a convergence race; transient (run 2 passed without retry). No test weakening, no retry added — consistent with plan policy.
- Semantic stage_pass_checks only explicitly guard install tier for good-significant. Upgrade/backup/restore tooth coverage is via _assert_green's "no tier failed" check. Limitation noted; acceptable per plan DoD requirements.
- A-reg-2 comment in test_canaries.py says "test_backup_artifact fails" for bad-backup; actual behavior is test_backup_artifact passes and test_backup_captures_state fails. Misleading comment, non-blocking.
Verdict: D-final PASS. All 7 canaries verified. All 6 DoD items met. Phase is complete pending operator review of PR#5. No vetoes.
D-initial update @2026-06-02T01:46Z — A-reg-1 CLOSED; A-reg-2 still open
A-reg-1 RESOLVED. Cold-verify after fix:
ssh cc-ci && cd /root/builder-clone && git pull --rebase
cc-ci-run -m pytest tests/regression/ --collect-only
Output: collected 3 items — test_canary[good-simple], test_canary[good-significant], test_canary[bad-false-green]. No errors.
Canary artifacts cold-verified from cc-ci artifact dirs:
good-simple (custom-html-tiny) — /var/lib/cc-ci-runs/regression-good-simple-1/results.json:
- results: install=pass, upgrade=pass, backup=skip, restore=skip, custom=skip ✓
- flags: clean_teardown=true, no_secret_leak=true ✓
- install/test_serving: PASS ✓ (stage_has_passing_test confirms teeth present)
bad-false-green (custom-html v5-stale-docroot) — /var/lib/cc-ci-runs/regression-bad-canary-1/results.json:
- results: install=pass, upgrade=pass, backup=pass, restore=pass, custom=FAIL ✓
- flags: clean_teardown=true, no_secret_leak=true ✓
- custom/test_content_type_html_and_txt: FAIL with Content-Type='application/octet-stream' ✓
- rc would be non-zero (any(v=="fail")) ✓ → regression test assert rc != 0 PASSES
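The rc reasoning above (any(v=="fail")) can be sketched as follows. This is an illustration under assumptions, not the harness code: exit_code is a hypothetical name, and tier outcomes are assumed to be lowercase strings keyed by tier.

```python
from typing import Mapping


def exit_code(tier_results: Mapping[str, str]) -> int:
    """Hypothetical helper: non-zero as soon as any tier reports "fail"."""
    return 1 if any(v == "fail" for v in tier_results.values()) else 0


# Tier outcomes as recorded above for the two canaries.
bad_false_green = {
    "install": "pass", "upgrade": "pass", "backup": "pass",
    "restore": "pass", "custom": "fail",
}
good_simple = {
    "install": "pass", "upgrade": "pass", "backup": "skip",
    "restore": "skip", "custom": "skip",
}

rc_bad = exit_code(bad_false_green)
rc_good = exit_code(good_simple)
print(rc_bad, rc_good)
```

Under this rule a single failing custom test is enough to make the run non-zero, which is what the regression test's assert rc != 0 relies on; skipped tiers do not affect the result.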
good-significant (lasuite-docs) — upgrade FAILED in Builder's run:
- results: install=PASS, upgrade=FAIL; test_upgrade_reconverges → convergence race
- This is the known WOPI/upgrade convergence risk from the plan (§ Risks). Builder is re-running.
- OBSERVATION (non-blocking now): if consistently flaky, add bounded retries to readiness probe per plan policy ("bounded retries on readiness only, never on correctness assertion"). Will watch.
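The plan policy quoted above ("bounded retries on readiness only, never on correctness assertion") could look roughly like this. It is a sketch under assumptions: wait_until_ready and the probe are hypothetical names, and nothing here is taken from the actual harness.

```python
import time
from typing import Callable


def wait_until_ready(probe: Callable[[], bool], attempts: int = 5,
                     delay_s: float = 2.0) -> bool:
    """Poll a readiness probe a bounded number of times; never loop forever."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)  # back off only between attempts
    return False


# Simulated convergence race: the stack becomes ready on the third poll.
calls = {"n": 0}


def flaky_probe() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


ready = wait_until_ready(flaky_probe, attempts=5, delay_s=0.0)

# The correctness assertion runs exactly once, after readiness, with no retry.
assert ready, "stack never became ready within the retry budget"
```

Keeping the retry loop around the probe only means a transient convergence delay is absorbed, while a genuinely wrong result still fails on the first and only correctness check.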
A-reg-2 partially addressed — 4 per-tier RED canary tests added to suite, 7 tests collect. But bad-backup and bad-restore FIXTURES are broken (see A-reg-3). A-reg-2 cannot close until all 4 canaries actually produce the expected results.
D-initial-2 update @2026-06-02T02:00Z — A-reg-3 filed; bad-backup/bad-restore fixtures broken
4 per-tier RED canary tests now in suite (7 tests collect via cold --collect-only). SHAs verified:
- 4ae8866100563204 (custom-html-tiny, bad image) ✓: bad-install + bad-upgrade fixture
- e1e3c5fc5e2bd414 (custom-html, bad-backup): SHA exists BUT compose.yml is empty (A-reg-3)
- 5a481cc1f6b2a462 (custom-html, bad-restore): SHA exists BUT compose.yml is empty (A-reg-3)
Cold-verified canary run results:
bad-install (regression-bad-install-v2): install=fail, upgrade=na ✓ — install tier fails as intended
bad-upgrade (regression-bad-upgrade-v2): install=pass, upgrade=fail, custom=skip ✓ — upgrade tier fails as intended
bad-backup (regression-bad-backup-1): install=pass, upgrade=fail, backup=skip ✗ — WRONG TIER
Root cause A-reg-3: regression-bad-backup branch has empty compose.yml (whole file deleted, not
just backup path changed). Empty compose → chaos upgrade deploy fails → upgrade=fail, backup never
runs. Same issue for regression-bad-restore (same empty compose.yml diff).
_assert_red_at_tier for bad-backup would FAIL with expected 'backup'='fail', got 'skip' —
proving the fixture is broken, not the test.
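A per-tier RED check of the kind described above can be sketched as below. This is a hedged illustration, not the _assert_red_at_tier in test_canaries.py: the function name, signature, and the passing_before argument used here are assumptions, chosen so that the broken-fixture case reproduces the message quoted above.

```python
from typing import Mapping, Sequence


def assert_red_at_tier(rc: int, tier_results: Mapping[str, str],
                       failing_tier: str,
                       passing_before: Sequence[str] = ()) -> None:
    """Hypothetical check: RED must occur at the intended tier, not earlier."""
    assert rc != 0, "harness returned rc=0 for a known-bad canary"
    for tier in passing_before:
        got = tier_results.get(tier)
        assert got == "pass", f"expected {tier!r}='pass', got {got!r}"
    got = tier_results.get(failing_tier)
    assert got == "fail", f"expected {failing_tier!r}='fail', got {got!r}"


# Broken fixture (regression-bad-backup-1): upgrade fails, backup never runs.
broken = {"install": "pass", "upgrade": "fail", "backup": "skip"}
try:
    assert_red_at_tier(1, broken, "backup", passing_before=["install"])
    message = None
except AssertionError as exc:
    message = str(exc)

print(message)
```

A non-zero rc alone would have accepted the broken fixture, since the run was RED for the wrong reason; checking the specific tier is what exposes it.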
What still needs fixing before final gate:
- A-reg-3 CLOSED: fixtures fixed and cold-verified ✓
- A-reg-2 CLOSED: all 4 per-tier RED canaries present and verified ✓
- good-significant: still needs successful re-run (upgrade flakiness unresolved)
- Open PR (DoD#6): not yet opened
Comprehensive canary verification @2026-06-02T02:20Z
6 of 7 canaries cold-verified from cc-ci artifact dirs (fresh SSH shell, no cached state); good-significant is pending a re-run:
GREEN canaries:
- good-simple (regression-good-simple-1, SHA 435df8fc): install=pass, upgrade=pass, backup/restore/custom=skip, clean_teardown=true, no_secret_leak=true, test_serving: pass ✓
- good-significant (regression-good-significant-1, SHA 290a8ad7): PENDING, upgrade FAIL (convergence race). Needs re-run to confirm transient.
Custom-assertion RED canary:
- bad-false-green (regression-bad-canary-1, SHA 71e7326a): install/upgrade/backup/restore=pass, custom=fail, test_content_type_html_and_txt: FAIL (Content-Type='application/octet-stream') ✓
Per-tier RED canaries (all cold-verified from artifact dirs):
- bad-install (regression-bad-install-v2, SHA 4ae8866): install=fail, upgrade=na ✓; failing_tier=install, no prior tier checked
- bad-upgrade (regression-bad-upgrade-v2, SHA 4ae8866): install=pass, upgrade=fail ✓; install=pass before failing
- bad-backup (regression-bad-backup-5, SHA b6fe99de, recipe custom-html-bkp-bad): install=pass, backup=fail ✓; test_backup_captures_state FAIL
- bad-restore (regression-bad-restore-3, SHA 9a73a184, recipe custom-html-rst-bad): install=pass, backup=pass, restore=fail ✓; test_restore_returns_state FAIL
Teeth verification:
- good-simple: if test_serving removed → stage_has_passing_test("install","test_serving") returns False → regression test FAILS ✓
- bad-false-green: if harness returns rc=0 → assert rc!=0 FAILS → false-green caught ✓
- bad-install: if harness returns rc=0 for bad image → assert rc!=0 FAILS ✓
- bad-upgrade: if upgrade wrongly passes → tier_results["upgrade"]="pass"≠"fail" → assert FAILS ✓
- bad-backup: if backup wrongly passes → rc=0 → assert rc!=0 FAILS ✓
- bad-restore: if restore wrongly passes → tier_results["restore"]!="fail" → assert FAILS ✓; if backup wrongly fails → tier_results["backup"]!="pass" → assert FAILS ✓
DoD status:
- DoD#1 (tests/regression/ committed): ✓
- DoD#2 (good canaries GREEN with semantic assertions): good-simple ✓; good-significant PENDING re-run
- DoD#3 (bad-false-green catches false-green): ✓ verified
- DoD#4 (4 per-tier RED canaries): ✓ all 4 verified
- DoD#5 (README.md): ✓ present with cadence, canaries, how to add
- DoD#6 (PR open for operator review): NOT YET
Remaining blockers before final PASS:
- good-significant must pass (or flakiness addressed with bounded retries on readiness)
- PR must be opened (DoD#6)
D-initial: FAIL @2026-06-02T01:38Z — suite won't collect (A-reg-1); plan gap (A-reg-2)
Builder claimed: test suite written, initial gate; canaries in-flight.
Cold verification result: FAIL — two blocking issues.
A-reg-1 (CRITICAL): Relative import fails, 0 tests collected.
ssh cc-ci && cd /root/builder-clone
cc-ci-run -m pytest tests/regression/ --collect-only
Output (cold, fresh shell):
collected 0 items / 1 error
ImportError: attempted relative import with no known parent package
tests/regression/test_canaries.py:18: from .conftest import run_recipe_ci, ...
!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!
Root cause: tests/regression/__init__.py and tests/__init__.py missing. Fix: add them or
use absolute imports (as other test files in this repo do).
A-reg-2 (HIGH): Plan updated (commit 7bdeb74) — 4 per-tier RED canaries now mandatory (DoD#4). Updated plan requires RED canaries for install/upgrade/backup/restore tiers on custom-html-tiny, each asserting RED at the intended tier with prior tiers PASS. Current suite: 3 canaries only (2 good + 1 bad-custom-assertion). All four are MISSING. Cannot claim DONE without them.
Other code quality observations (not blocking):
- Canary SHAs all verified present on Gitea ✓
  - custom-html-tiny: 435df8fc98ef7598 ✓ (main 2026-06-02 merge commit)
  - lasuite-docs: 290a8ad72d06232f ✓ (v0.3.3+v5.1.0 merge)
  - custom-html v5-stale-docroot: 71e7326a99bbb690 ✓ (confirmed RED via build #81)
- CCCI_RUN_ID and CCCI_RUNS_DIR correctly picked up by results.py ✓
- _assert_red/_assert_green logic sound ✓
- README cadence policy complete ✓
Verdict: FAIL. Standing issues: A-reg-1 (critical), A-reg-2 (high). Builder must fix both before re-claiming this gate.
Adversary findings
(See BACKLOG-regression.md § Adversary findings: A-reg-1, A-reg-2)
Break-it probes log
(Break-it probes will be recorded here as they are run)
Pre-orientation findings @01:17Z
Known-bad fixture confirmed present and working:
- Branch: recipe-maintainers/custom-html:v5-stale-docroot (SHA 71e7326a99bb)
- Build #81 (run 3h ago): confirmed RED; custom stage FAIL, specifically test_content_type_html_and_txt: FAIL (ccci-e0d6e804.txt Content-Type='application/octet-stream', expected text/plain)
- All other tiers (install/upgrade/backup/restore): PASS
- clean_teardown=true, no_secret_leak=true
- Implication for regression suite DoD#3: the known-bad canary correctly produces RED; the regression test must assert this outcome AND must be shown to fail if the server returns green for it (false-green detection).
Good canaries:
- custom-html-tiny: build #45 GREEN (SHA 4bd8416a209f, 21h ago); simple, fast
- lasuite-docs: multi-service stack with DEPS=["keycloak"], DEPLOY_TIMEOUT=900s; test exists at tests/lasuite-docs/
Infrastructure state:
- Bridge (ccci-bridge_app): running, polling 20 repos every 30s ✓
- Drone exec runner: running ✓
- Dashboard: serving at ci.commoninternet.net ✓
- Builder hasn't started regression phase: no STATUS-regression.md yet
Notes:
- Mirror phase (plan-mirror-enroll-all-recipes.md) completed DONE at 2026-06-02T01:16Z.
- This phase starts fresh: no STATUS-regression.md or tests/regression/ yet.
- Watching for Builder to create STATUS-regression.md and begin work.