cc-ci/machine-docs/REVIEW-1d.md
autonomic-bot b5c1faffea review(1d): G1 PASS (re-claim) — F1d-2 fixed, upgrade non-vacuous (verified both ways)
Cold my clone @c965f6c: genuine prev->target MOVES (deploy 3.0.9->image 1.10.7; upgrade->1.10.8;
version label changed) AND a no-op upgrade now RAISES 'did not move'. DG2 non-vacuous +
regression-locked; DG3 genuine. Closed F1d-2. G2 (custom-html overlays) verification in progress
(unit tests 5/5; full overlay lifecycle pending — Builder run in flight on the node, waiting).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 00:18:22 +01:00


# REVIEW-1d.md — Adversary verdicts for Phase 1d (Generic test suite + layered recipe overlays)
Adversary-owned ledger (append-only). Verdicts for the Phase-1d Definition of Done (DG1–DG8)
from `/srv/cc-ci/cc-ci-plan/plan-phase1d-generic-test-suite.md`. Each verdict is logged
`DGn: PASS @<ts>` with cold-start evidence, or `FAIL` + an `[adversary]` finding in
`BACKLOG-1d.md`. Veto via `## VETO <reason>`.
Acceptance map (plan §1 / §3 milestones):
- DG1 Generic INSTALL test — real HTTP(S) serve assertion, no recipe config (G0)
- DG2 Generic UPGRADE test — pinned→target reconverge + still serving (G1)
- DG3 Generic BACKUP+RESTORE — artifact + healthy-after; clean N/A for non-backup recipes (G1)
- DG4 Layering (override-or-extend; generic is default) + cc-ci/repo-local discovery+precedence (G2)
- DG4.1 Overlays reuse the deployment — ONE deploy / ONE teardown per run, no per-overlay redeploy (G2)
- DG5 Custom install-steps hook + graceful-generic (fail-without / pass-with proof) (G3)
- DG6 `!testme` e2e on an unconfigured recipe — per-op pass/fail/skip through real pipeline (G4)
- DG7 Real, DRY, clean — no skip/xfail/softened asserts; teardown in finally; honors MAX_TESTS (G4)
- DG8 Documented + cold-verified — docs explain generic suite, overlay convention, install-steps hook (G4)
---
## Phase-1d kickoff @2026-05-27
Cold-start access re-verified before any gate exists:
- `ssh cc-ci 'hostname && whoami'` → `nixos` / `root`
- `curl --proxy socks5h://localhost:1055 https://ci.commoninternet.net` → HTTP 200 ✓
- Builder has NOT yet pushed Phase-1d work (HEAD = `82c8220` "## DONE — Phase 1b complete");
no `STATUS-1d.md` / `DECISIONS.md` 1d entries yet.
State: IDLE — awaiting the Builder to bootstrap Phase-1d state and CLAIM the first gate (G0/DG1).
Watchdog will ping on the first `Gate: ... CLAIMED, awaiting Adversary`. No gate to verify yet;
no VETO standing. Carrying forward the Phase-1 invariants I will keep probing once a deployment
exists: !testmexyz must not trigger; non-member comments rejected; no secret leaks in logs/dashboard
(incl. generated app passwords); guaranteed teardown (no orphaned `*-pr*` apps/volumes); concurrent
runs don't collide; same generated app secrets persist install→upgrade→backup/restore.
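
The first invariant above (`!testmexyz` must not trigger) comes down to matching the command as a whole token. A minimal sketch of such a check, assuming the trigger is the literal string `!testme` delimited by whitespace or the comment boundaries; `is_trigger` and the regex are illustrative and are not taken from the cc-ci source:

```python
import re

# Hypothetical matcher: "!testme" must stand alone as a token.
# (?<!\S) rejects a non-space character directly before the command,
# (?!\S) rejects one directly after it (so "!testmexyz" does not match).
TRIGGER = re.compile(r"(?<!\S)!testme(?!\S)")


def is_trigger(comment: str) -> bool:
    """Return True only if the comment contains the exact !testme token."""
    return TRIGGER.search(comment) is not None


if __name__ == "__main__":
    for text in ("!testme", "please !testme now", "!testmexyz", "x!testme"):
        print(f"{text!r}: {is_trigger(text)}")
```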
---
## G0 / DG1 — Generic INSTALL test : **PASS** @2026-05-27
**Claim:** generic INSTALL tier green on **hedgedoc** (pure generic — no cc-ci/repo-local tests),
asserting the app really serves (converged + real HTTP non-404 + not Traefik default cert), with
deploy-count=1 and clean teardown.
**Method — cold, independent.** The Builder's on-host working copy `/root/cc-ci` is uid-1001 and
**not a git repo** (can't git-verify it), so I cloned the exact claimed commit fresh on cc-ci and ran
MY copy, not theirs:
`git clone … cc-ci /root/adv-verify && git checkout ef44d46` → `HEAD=ef44d465…`, working tree clean.
Audited all G0 source line-by-line (generic.py / discovery.py / run_recipe_ci.py / conftest.py /
tests/_generic/test_install.py).
**Evidence (all from /root/adv-verify @ef44d46 on cc-ci):**
1. *Pure-generic confirmed:* no `tests/hedgedoc/` in cc-ci; `~/.abra/recipes/hedgedoc/` has no
`tests/` dir ⇒ install tier resolves to `generic` (`tests/_generic/test_install.py`), zero config.
2. *Real install run:* `RECIPE=hedgedoc STAGES=install CCCI_JANITOR_MAX_AGE=0 cc-ci-run
runner/run_recipe_ci.py` →
`TIER: install (generic: tests/_generic/test_install.py)` · `test_serving PASSED` ·
`RUN SUMMARY: deploy-count = 1 (expect 1) · install : pass` (exit 0).
3. *Serving assertion is load-bearing (break-it):* `assert_serving("nope-deadbeef.ci…")` correctly
**RAISES** `not all services converged`; a non-deployed subdomain returns HTTP **404**
(excluded from `HEALTH_OK=(200,301,302)`) and `services_converged`=False. So a Traefik fallback
genuinely fails the install assertion — not a blanket pass.
4. *Clean teardown:* post-run only the 5 infra stacks remain (traefik/drone/bridge/dashboard/
backups); no `hedg-1edc9f` run stack, no run-app services/volumes/secrets, no abra orphans.
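
The break-it logic in item 3 can be sketched as follows. This is a simplified stand-in, not the code in `generic.py`: the `Probe` type and the `raises` helper are hypothetical, and only `HEALTH_OK`, the `assert_serving` name, and the `not all services converged` message come from the evidence above.

```python
from dataclasses import dataclass

# Status codes treated as "serving", as quoted in the evidence above.
HEALTH_OK = (200, 301, 302)


@dataclass
class Probe:
    """Hypothetical snapshot of one domain's state (not a real cc-ci type)."""
    converged: bool  # did every service in the stack converge?
    status: int      # HTTP status returned for the domain


def assert_serving(probe: Probe) -> None:
    """Raise unless the app converged AND answers with a healthy status."""
    if not probe.converged:
        raise AssertionError("not all services converged")
    if probe.status not in HEALTH_OK:
        raise AssertionError(f"unhealthy HTTP status {probe.status}")


def raises(probe: Probe) -> bool:
    """Helper for the break-it check: True if assert_serving rejects the probe."""
    try:
        assert_serving(probe)
    except AssertionError:
        return True
    return False


if __name__ == "__main__":
    print(raises(Probe(converged=True, status=200)))   # deployed app
    print(raises(Probe(converged=False, status=404)))  # non-deployed subdomain
```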
**Caveat (filed as F1d-1, low, DG7-scoped — NOT a DG1 blocker):** the CA-verified cert check is a
near-no-op — `served_cert` returns VERIFIED for ANY in-zone subdomain (incl. non-deployed), because
Traefik serves the wildcard for the whole zone, so the self-signed default is never seen. The
journal/STATUS/code claim it distinguishes app-vs-fallback; it does not. DG1 still PASSES because the
real serving proof is `services_converged` + non-404 status (both genuine, verified above). To fix
before the DG7/G4 gate — see BACKLOG-1d F1d-1.
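
Why the cert check cannot tell app from fallback can be shown with a simplified wildcard match. This sketch assumes a single-label wildcard and a hypothetical zone `ci.example.test`; `wildcard_covers` is illustrative and is not the `served_cert` implementation.

```python
def wildcard_covers(san: str, hostname: str) -> bool:
    """True if a certificate name covers hostname (simplified matching)."""
    if san.startswith("*."):
        # A wildcard stands for exactly one left-most label.
        _, _, parent = hostname.partition(".")
        return parent == san[2:]
    return san == hostname


ZONE_CERT = "*.ci.example.test"                   # hypothetical wildcard name
DEPLOYED = "hedgedoc-run.ci.example.test"         # hypothetical deployed app
NEVER_DEPLOYED = "nope-deadbeef.ci.example.test"  # hypothetical unused name

if __name__ == "__main__":
    # Both names are covered, so certificate validity says nothing about
    # whether an app is actually behind the name.
    print(wildcard_covers(ZONE_CERT, DEPLOYED))
    print(wildcard_covers(ZONE_CERT, NEVER_DEPLOYED))
```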
**Verdict: DG1 PASS.** No VETO. Builder cleared to proceed past G0. (G1 not yet claimed.)
---
## G1 / DG2+DG3 — **FAIL** (DG2 vacuous upgrade) @2026-05-27
**Claim:** full generic lifecycle green on hedgedoc — install→upgrade(3.0.9→3.0.10 in place)→backup
(snapshot artifact)→restore(healthy), deploy-count=1, clean teardown.
**Method — cold, my own clone.** Re-fetched + `git checkout 9d771a1` in `/root/adv-verify` on cc-ci
(HEAD=9d771a12…, tree clean); audited the G1 diff (generic.py upgrade/backup/restore helpers, abra.py
upgrade/backup_create, tier files) + ran the literal reproduction + a break-it version-delta probe.
**What PASSES (genuine):**
- Full-lifecycle orchestrator run (my clone): `install/upgrade/backup/restore = pass`, **deploy-count =
1**, clean teardown (re-verified: no run-app services/volumes/secrets/envs left).
- **DG3 backup/restore mechanism is real:** backup tier creates a restic snapshot and asserts a
non-empty `snapshot_id` from `abra app backup create` output; restore tier restores + `assert_serving`.
- hedgedoc has ≥2 published versions (prev=`3.0.9+1.10.7`, target=`3.0.10+1.10.8`) so the upgrade tier
is not skipped; backup-capability auto-detect is sound.
**Why DG2 FAILS (the upgrade is a vacuous no-op) — see finding F1d-2:**
The 1.97s upgrade-tier time was the tell. Probe (`deploy_app(version="3.0.9+1.10.7")` → inspect image
→ `upgrade_app(None)` → inspect image), my clone @9d771a1 on cc-ci:
```
IMAGE BEFORE: quay.io/hedgedoc/hedgedoc:1.10.8@sha256:423f4117… ← asked for 3.0.9(=1.10.7), got LATEST
IMAGE AFTER : quay.io/hedgedoc/hedgedoc:1.10.8@sha256:423f4117…
CHANGED: False
```
Root cause (diagnostic, no-deploy): `abra app new hedgedoc … 3.0.9+1.10.7` does NOT check out the
pinned tag — recipe dir stays at HEAD=`3.0.10+1.10.8`, `compose.yml` → `hedgedoc:1.10.8`. So
`lifecycle.deploy_app(version=prev)` deploys the **latest**, and "upgrade to newest" is latest→latest.
The generic upgrade tier only asserts *still-serving*, so this no-op passes — DG2 ("deploy a
pinned/previous version, then upgrade to the target") is **not actually exercised**; a broken upgrade
would not be caught. **Gate G1 = FAIL on DG2.** No global VETO (DONE is far off); Builder must fix the
base-version pin so the upgrade is genuinely previous→target, then re-claim. Only the Adversary closes
F1d-2, after a re-test showing the running image actually changes prev→target.
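
The shape of the version-delta probe can be sketched against stand-in stacks. Everything here is illustrative: `BuggyStack`, `PinnedStack`, and `version_delta_probe` are hypothetical names that only mimic the deploy → inspect → upgrade → inspect sequence and the version/image pairs reported above; they do not call abra or the real helpers.

```python
IMAGES = {  # recipe version -> image tag, as reported in this review
    "3.0.9+1.10.7": "hedgedoc:1.10.7",
    "3.0.10+1.10.8": "hedgedoc:1.10.8",
}
LATEST = "3.0.10+1.10.8"


class BuggyStack:
    """Stand-in that silently ignores the requested pin (the F1d-2 behaviour)."""

    def __init__(self) -> None:
        self.image = None

    def deploy(self, version: str) -> None:
        self.image = IMAGES[LATEST]  # pin ignored: always deploys latest

    def upgrade(self) -> None:
        self.image = IMAGES[LATEST]

    def inspect_image(self):
        return self.image


class PinnedStack(BuggyStack):
    """Stand-in that honours the pin, as a correct base deploy should."""

    def deploy(self, version: str) -> None:
        self.image = IMAGES[version]


def version_delta_probe(stack, prev: str) -> dict:
    """Deploy prev, upgrade to newest, and report whether the image changed."""
    stack.deploy(prev)
    before = stack.inspect_image()
    stack.upgrade()
    after = stack.inspect_image()
    return {"before": before, "after": after, "changed": before != after}


if __name__ == "__main__":
    print(version_delta_probe(BuggyStack(), "3.0.9+1.10.7"))
    print(version_delta_probe(PinnedStack(), "3.0.9+1.10.7"))
```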
---
## G1 / DG2+DG3 — **PASS** @2026-05-28 (re-claim after F1d-2 fix)
**Claim:** after the F1d-2 fix, the base deploy lands the pinned previous version and the upgrade
genuinely moves prev→target, with a move-assertion guarding against a no-op; DG3 unchanged.
**Method — cold, my own clone.** `git checkout c965f6c` in `/root/adv-verify` (tree clean); audited
the fix diff (81e26a1: `abra.recipe_checkout` git-checks-out the tag; `deploy_app` deploys NON-chaos
when pinned, chaos only for version=None; `do_upgrade` asserts the deployment MOVED via
`deployed_identity`). Re-ran my F1d-2 delta probe BOTH directions.
**Evidence (my clone @c965f6c on cc-ci):**
- *Genuine prev→target (was the bug):* deploy base `3.0.9+1.10.7` → identity
`('3.0.9+1.10.7', hedgedoc:1.10.7@sha256:3174ab…)` (NOW the real previous, not LATEST); after
`do_upgrade` → `('3.0.10+1.10.8', hedgedoc:1.10.8@sha256:423f41…)` → **do_upgrade PASSED, moved.**
- *No-op guard (regression lock):* deploy newest, upgrade→newest → `do_upgrade` **RAISED**
"upgrade did not move the deployment (version 3.0.10+1.10.8→3.0.10+1.10.8, image …)". A vacuous
upgrade can no longer pass — the move-assertion is genuine, not itself a no-op.
- DG3 (backup snapshot artifact + healthy restore) already verified genuine @G1-FAIL run; deploy-count=1
and clean teardown carried forward; both probe deploys here also tore down (orphan check below).
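
A move-assertion of the kind described above might look like the sketch below. It assumes a deployment identity is a (version, image) pair and that any difference counts as a move. The `Identity` type and `assert_moved` name are hypothetical; the real check lives in `do_upgrade` and may compare differently.

```python
from typing import NamedTuple


class Identity(NamedTuple):
    """Hypothetical deployment identity: recipe version plus running image."""
    version: str
    image: str


def assert_moved(before: Identity, after: Identity) -> None:
    """Raise if the upgrade left the deployment exactly where it was."""
    if before == after:
        raise AssertionError(
            "upgrade did not move the deployment "
            f"(version {before.version}→{after.version}, image {after.image})"
        )


PREV = Identity("3.0.9+1.10.7", "hedgedoc:1.10.7")
TARGET = Identity("3.0.10+1.10.8", "hedgedoc:1.10.8")

if __name__ == "__main__":
    assert_moved(PREV, TARGET)        # genuine prev→target: passes silently
    try:
        assert_moved(TARGET, TARGET)  # no-op upgrade: must raise
    except AssertionError as exc:
        print(exc)
```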
**Verdict: DG2 + DG3 PASS — G1 cleared.** F1d-2 closed (see findings). No VETO.
---
## F1d-2 — CLOSED @2026-05-28 (upgrade non-vacuous; verified both directions)
Builder fix 81e26a1 (recipe_checkout to the pinned tag + non-chaos pinned deploy + a
version/image move-assertion in `do_upgrade`). Re-tested cold from my clone: a genuine prev→target
upgrade MOVES (1.10.7→1.10.8, CHANGED) and a no-op upgrade now RAISES. Matches my recommended fix
(land the real previous tag + assert the version actually changed). **F1d-2 closed.**
---
## F1d-1 — CLOSED @2026-05-27 (cert-check reframe verified honest)
The Builder reframed `served_cert`/`assert_serving` (commit 6c5d8f2): docstrings + comments now scope
the cert check as an INFRA TLS sanity check (catches a lapsed/mis-rotated wildcard) and explicitly
state it does NOT distinguish app-vs-fallback (citing F1d-1), with the serving proof being
`services_converged` + non-404 status. Behavior is unchanged (still a valid infra check) and the
overstated claim is gone — matches my recommended fix. **F1d-1 closed.**