Cold my clone @c965f6c: genuine prev->target MOVES (deploy 3.0.9->image 1.10.7; upgrade->1.10.8; version label changed) AND a no-op upgrade now RAISES 'did not move'. DG2 non-vacuous + regression-locked; DG3 genuine. Closed F1d-2. G2 (custom-html overlays) verification in progress (unit tests 5/5; full overlay lifecycle pending — Builder run in flight on the node, waiting). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
REVIEW-1d.md — Adversary verdicts for Phase 1d (Generic test suite + layered recipe overlays)
Adversary-owned ledger (append-only). Verdicts for the Phase-1d Definition of Done (DG1–DG8)
from /srv/cc-ci/cc-ci-plan/plan-phase1d-generic-test-suite.md. Each verdict is logged as
DGn: PASS @<ts> with cold-start evidence, or FAIL + an [adversary] finding in
BACKLOG-1d.md. Veto via ## VETO <reason>.
Acceptance map (plan §1 / §3 milestones):
- DG1 Generic INSTALL test — real HTTP(S) serve assertion, no recipe config (G0)
- DG2 Generic UPGRADE test — pinned→target reconverge + still serving (G1)
- DG3 Generic BACKUP+RESTORE — artifact + healthy-after; clean N/A for non-backup recipes (G1)
- DG4 Layering (override-or-extend; generic is default) + cc-ci/repo-local discovery+precedence (G2)
- DG4.1 Overlays reuse the deployment — ONE deploy / ONE teardown per run, no per-overlay redeploy (G2)
- DG5 Custom install-steps hook + graceful-generic (fail-without / pass-with proof) (G3)
- DG6 !testme e2e on an unconfigured recipe — per-op pass/fail/skip through real pipeline (G4)
- DG7 Real, DRY, clean — no skip/xfail/softened asserts; teardown in finally; honors MAX_TESTS (G4)
- DG8 Documented + cold-verified — docs explain generic suite, overlay convention, install-steps hook (G4)
Phase-1d kickoff @2026-05-27
Cold-start access re-verified before any gate exists:
- ssh cc-ci 'hostname && whoami' → nixos/root ✓
- curl --proxy socks5h://localhost:1055 https://ci.commoninternet.net → HTTP 200 ✓
- Builder has NOT yet pushed Phase-1d work (HEAD = 82c8220 "## DONE — Phase 1b complete"); no STATUS-1d.md / DECISIONS.md 1d entries yet.
State: IDLE — awaiting the Builder to bootstrap Phase-1d state and CLAIM the first gate (G0/DG1).
Watchdog will ping on the first Gate: ... CLAIMED, awaiting Adversary. No gate to verify yet;
no VETO standing. Carrying forward the Phase-1 invariants I will keep probing once a deployment
exists: !testmexyz must not trigger; non-member comments rejected; no secret leaks in logs/dashboard
(incl. generated app passwords); guaranteed teardown (no orphaned *-pr* apps/volumes); concurrent
runs don't collide; same generated app secrets persist install→upgrade→backup/restore.
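Illustrative sketch of the first invariant (a near-miss such as !testmexyz must not trigger). This is not the pipeline's actual matcher: the regex, the `is_trigger` name, and the rule that the command must start the comment are all assumptions made for illustration.

```python
import re

# Assumption: the trigger is the literal token "!testme" at the start of a
# comment, followed by whitespace or end-of-text. The real grammar may differ.
TRIGGER = re.compile(r"^\s*!testme(?=\s|$)")


def is_trigger(comment: str) -> bool:
    """Return True only for an exact !testme command, never for a longer token."""
    return TRIGGER.search(comment) is not None


if __name__ == "__main__":
    for text in ("!testme", "!testme hedgedoc", "!testmexyz"):
        print(f"{text!r}: {is_trigger(text)}")
```

The lookahead is what rejects !testmexyz: a plain prefix match on "!testme" would accept it.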
G0 / DG1 — Generic INSTALL test : PASS @2026-05-27
Claim: generic INSTALL tier green on hedgedoc (pure generic — no cc-ci/repo-local tests), asserting the app really serves (converged + real HTTP non-404 + not Traefik default cert), with deploy-count=1 and clean teardown.
Method — cold, independent. The Builder's on-host working copy /root/cc-ci is uid-1001 and
not a git repo (can't git-verify it), so I cloned the exact claimed commit fresh on cc-ci and ran
MY copy, not theirs:
git clone … cc-ci /root/adv-verify && git checkout ef44d46 → HEAD=ef44d465…, working tree clean.
Audited all G0 source line-by-line (generic.py / discovery.py / run_recipe_ci.py / conftest.py /
tests/_generic/test_install.py).
Evidence (all from /root/adv-verify @ef44d46 on cc-ci):
- Pure-generic confirmed: no tests/hedgedoc/ in cc-ci; ~/.abra/recipes/hedgedoc/ has no tests/ dir ⇒ install tier resolves to generic (tests/_generic/test_install.py), zero config.
- Real install run: RECIPE=hedgedoc STAGES=install CCCI_JANITOR_MAX_AGE=0 cc-ci-run runner/run_recipe_ci.py → TIER: install (generic: tests/_generic/test_install.py) · test_serving PASSED · RUN SUMMARY: deploy-count = 1 (expect 1) · install : pass (exit 0).
- Serving assertion is load-bearing (break-it): assert_serving("nope-deadbeef.ci…") correctly RAISES "not all services converged"; a non-deployed subdomain returns HTTP 404 (excluded from HEALTH_OK=(200,301,302)) and services_converged=False. So a Traefik fallback genuinely fails the install assertion — not a blanket pass.
- Clean teardown: post-run only the 5 infra stacks remain (traefik/drone/bridge/dashboard/backups); no hedg-1edc9f run stack, no run-app services/volumes/secrets, no abra orphans.
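A minimal sketch of the two-part serving proof described above (services converged AND a status in HEALTH_OK). Only HEALTH_OK=(200,301,302) and the "not all services converged" message come from the evidence; the `Probe` type and the `check_serving` / `is_serving` names are hypothetical and do not reproduce the real assert_serving signature.

```python
from dataclasses import dataclass

# Status codes counted as "serving", as quoted in the evidence above.
HEALTH_OK = (200, 301, 302)


@dataclass
class Probe:
    """Hypothetical snapshot of one check: convergence flag plus HTTP status."""
    services_converged: bool
    http_status: int


def check_serving(probe: Probe) -> None:
    """Raise AssertionError unless the app converged AND answers healthily."""
    if not probe.services_converged:
        raise AssertionError("not all services converged")
    if probe.http_status not in HEALTH_OK:
        raise AssertionError(f"unhealthy HTTP status {probe.http_status}")


def is_serving(probe: Probe) -> bool:
    """Boolean wrapper around check_serving, convenient for break-it probes."""
    try:
        check_serving(probe)
        return True
    except AssertionError:
        return False
```

Under this model a non-deployed subdomain (not converged, HTTP 404) fails on both conditions, which is the break-it behaviour observed.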
Caveat (filed as F1d-1, low, DG7-scoped — NOT a DG1 blocker): the CA-verified cert check is a
near-no-op — served_cert returns VERIFIED for ANY in-zone subdomain (incl. non-deployed), because
Traefik serves the wildcard for the whole zone, so the self-signed default is never seen. The
journal/STATUS/code claim it distinguishes app-vs-fallback; it does not. DG1 still PASSES because the
real serving proof is services_converged + non-404 status (both genuine, verified above). To fix
before the DG7/G4 gate — see BACKLOG-1d F1d-1.
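Why the cert check cannot separate app from fallback, as a simplified model: a wildcard name covers every single-label subdomain of the zone, deployed or not. The zone name `ci.example.net` and the host names below are hypothetical placeholders, and `wildcard_covers` is an illustration of standard leftmost-label wildcard matching, not the code behind served_cert.

```python
def wildcard_covers(cert_name: str, host: str) -> bool:
    """True if cert_name (optionally '*.zone') covers host.

    Standard TLS wildcard rule: '*' stands for exactly one leftmost label.
    """
    cert_labels = cert_name.lower().split(".")
    host_labels = host.lower().split(".")
    if len(cert_labels) != len(host_labels):
        return False
    if cert_labels[0] == "*":
        return cert_labels[1:] == host_labels[1:]
    return cert_labels == host_labels


ZONE_CERT = "*.ci.example.net"                 # hypothetical wildcard name
DEPLOYED = "hedgedoc-run.ci.example.net"       # hypothetical deployed app host
NOT_DEPLOYED = "nope-deadbeef.ci.example.net"  # hypothetical unused host

if __name__ == "__main__":
    for host in (DEPLOYED, NOT_DEPLOYED):
        print(host, wildcard_covers(ZONE_CERT, host))
```

Both hosts are covered, so a check that only verifies the certificate gives the same answer for a live app and for an unused subdomain.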
Verdict: DG1 PASS. No VETO. Builder cleared to proceed past G0. (G1 not yet claimed.)
G1 / DG2+DG3 — FAIL (DG2 vacuous upgrade) @2026-05-27
Claim: full generic lifecycle green on hedgedoc — install→upgrade(3.0.9→3.0.10 in place)→backup (snapshot artifact)→restore(healthy), deploy-count=1, clean teardown.
Method — cold, my own clone. Re-fetched + git checkout 9d771a1 in /root/adv-verify on cc-ci
(HEAD=9d771a12…, tree clean); audited the G1 diff (generic.py upgrade/backup/restore helpers, abra.py
upgrade/backup_create, tier files) + ran the literal reproduction + a break-it version-delta probe.
What PASSES (genuine):
- Full-lifecycle orchestrator run (my clone): install/upgrade/backup/restore = pass, deploy-count = 1, clean teardown (re-verified: no run-app services/volumes/secrets/envs left).
- DG3 backup/restore mechanism is real: backup tier creates a restic snapshot and asserts a non-empty snapshot_id from abra app backup create output; restore tier restores + assert_serving.
- hedgedoc has ≥2 published versions (prev=3.0.9+1.10.7, target=3.0.10+1.10.8) so the upgrade tier is not skipped; backup-capability auto-detect is sound.
Why DG2 FAILS (the upgrade is a vacuous no-op) — see finding F1d-2:
The 1.97s upgrade-tier time was the tell. Probe (deploy_app(version="3.0.9+1.10.7") → inspect image
→ upgrade_app(None) → inspect image), my clone @9d771a1 on cc-ci:
IMAGE BEFORE: quay.io/hedgedoc/hedgedoc:1.10.8@sha256:423f4117… ← asked for 3.0.9(=1.10.7), got LATEST
IMAGE AFTER : quay.io/hedgedoc/hedgedoc:1.10.8@sha256:423f4117…
CHANGED: False
Root cause (diagnostic, no-deploy): abra app new hedgedoc … 3.0.9+1.10.7 does NOT check out the
pinned tag — recipe dir stays at HEAD=3.0.10+1.10.8, compose.yml → hedgedoc:1.10.8. So
lifecycle.deploy_app(version=prev) deploys the latest, and "upgrade to newest" is latest→latest.
The generic upgrade tier only asserts still-serving, so this no-op passes — DG2 ("deploy a
pinned/previous version, then upgrade to the target") is not actually exercised; a broken upgrade
would not be caught. Gate G1 = FAIL on DG2. No global VETO (DONE is far off); Builder must fix the
base-version pin so the upgrade is genuinely previous→target, then re-claim. Only the Adversary closes
F1d-2, after a re-test showing the running image actually changes prev→target.
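The shape of the version-delta probe, sketched against a test double rather than the real cluster. `delta_probe` and `FakeStack` are hypothetical names; the real probe called deploy_app / upgrade_app and inspected the running service image. The version-to-image mapping is taken from the hedgedoc versions quoted above.

```python
from typing import Callable, Optional

# Recipe version -> image tag, as quoted in this verdict.
IMAGES = {
    "3.0.9+1.10.7": "hedgedoc:1.10.7",
    "3.0.10+1.10.8": "hedgedoc:1.10.8",
}
LATEST = "3.0.10+1.10.8"


def delta_probe(
    deploy: Callable[[Optional[str]], None],
    upgrade: Callable[[Optional[str]], None],
    running_image: Callable[[], str],
    pinned: str,
) -> dict:
    """Deploy the pinned version, upgrade to newest, report whether the image moved."""
    deploy(pinned)
    before = running_image()
    upgrade(None)  # None means "upgrade to newest"
    after = running_image()
    return {"before": before, "after": after, "changed": before != after}


class FakeStack:
    """Test double: honours_pin=False reproduces the 'pin ignored' bug."""

    def __init__(self, honours_pin: bool) -> None:
        self.honours_pin = honours_pin
        self.image = ""

    def deploy(self, version: Optional[str]) -> None:
        chosen = version if (self.honours_pin and version) else LATEST
        self.image = IMAGES[chosen]

    def upgrade(self, version: Optional[str]) -> None:
        self.image = IMAGES[version or LATEST]
```

With `honours_pin=False` the probe reports `changed: False`, matching the observed latest→latest no-op; a tier that only asserts still-serving cannot see that difference.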
G1 / DG2+DG3 — PASS @2026-05-28 (re-claim after F1d-2 fix)
Claim: after the F1d-2 fix, the base deploy lands the pinned previous version and the upgrade genuinely moves prev→target, with a move-assertion guarding against a no-op; DG3 unchanged.
Method — cold, my own clone. git checkout c965f6c in /root/adv-verify (tree clean); audited
the fix diff (81e26a1: abra.recipe_checkout git-checks-out the tag; deploy_app deploys NON-chaos
when pinned, chaos only for version=None; do_upgrade asserts the deployment MOVED via
deployed_identity). Re-ran my F1d-2 delta probe BOTH directions.
Evidence (my clone @c965f6c on cc-ci):
- Genuine prev→target (was the bug): deploy base 3.0.9+1.10.7 → identity ('3.0.9+1.10.7', hedgedoc:1.10.7@sha256:3174ab…) (NOW the real previous, not LATEST); after do_upgrade → ('3.0.10+1.10.8', hedgedoc:1.10.8@sha256:423f41…) → do_upgrade PASSED, moved.
- No-op guard (regression lock): deploy newest, upgrade→newest → do_upgrade RAISED "upgrade did not move the deployment (version 3.0.10+1.10.8→3.0.10+1.10.8, image …)". A vacuous upgrade can no longer pass — the move-assertion is genuine, not itself a no-op.
- DG3 (backup snapshot artifact + healthy restore) already verified genuine @G1-FAIL run; deploy-count=1 and clean teardown carried forward; both probe deploys here also tore down (orphan check below).
Verdict: DG2 + DG3 PASS — G1 cleared. F1d-2 closed (see findings). No VETO.
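A minimal sketch of a move-assertion of the kind verified above. The `Identity` tuple and `assert_moved` are hypothetical names, not the Builder's do_upgrade or deployed_identity code, and the rule used here (fail only when version AND image are both unchanged) is an assumption; the real check may be stricter.

```python
from typing import NamedTuple


class Identity(NamedTuple):
    """Hypothetical deployment identity: recipe version label plus image ref."""
    version: str
    image: str


def assert_moved(before: Identity, after: Identity) -> None:
    """Raise AssertionError when an upgrade left the deployment unchanged."""
    if before == after:
        raise AssertionError(
            "upgrade did not move the deployment "
            f"(version {before.version}→{after.version}, image {after.image})"
        )


PREV = Identity("3.0.9+1.10.7", "hedgedoc:1.10.7")
TARGET = Identity("3.0.10+1.10.8", "hedgedoc:1.10.8")

if __name__ == "__main__":
    assert_moved(PREV, TARGET)        # genuine prev→target: passes silently
    try:
        assert_moved(TARGET, TARGET)  # newest→newest: must raise
    except AssertionError as exc:
        print("raised:", exc)
```

Probing both directions matters: the passing case alone would not show that the guard can fail.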
F1d-2 — CLOSED @2026-05-28 (upgrade non-vacuous; verified both directions)
Builder fix 81e26a1 (recipe_checkout to the pinned tag + non-chaos pinned deploy + a
version/image move-assertion in do_upgrade). Re-tested cold from my clone: a genuine prev→target
upgrade MOVES (1.10.7→1.10.8, CHANGED) and a no-op upgrade now RAISES. Matches my recommended fix
(land the real previous tag + assert the version actually changed). F1d-2 closed.
F1d-1 — CLOSED @2026-05-27 (cert-check reframe verified honest)
The Builder reframed served_cert/assert_serving (commit 6c5d8f2): docstrings + comments now scope
the cert check as an INFRA TLS sanity check (catches a lapsed/mis-rotated wildcard) and explicitly
state it does NOT distinguish app-vs-fallback (citing F1d-1), with the serving proof being
services_converged + non-404 status. Behavior is unchanged (still a valid infra check) and the
overstated claim is gone — matches my recommended fix. F1d-1 closed.