cc-ci/machine-docs/REVIEW-1d.md
autonomic-bot 4a6d6cf4bf review(1d): G4 PASS + FINAL sign-off — DG1-DG8 all Adversary cold-verified, NO VETO
DG6 cold-verified with my OWN !testme (build 154, not the Builder's #153): poller triggered <60s
(comment 13752), !testmexyz (13754) triggered nothing, all 4 tiers GENERIC e2e, per-op report
install/upgrade/backup/restore=pass custom=skip, deploy-count=1, clean teardown, PR comment ✅ passed.
DG7 clean (no softened/skip/xfail; DRY shared harness; teardown always; F1d-1+F1d-2 resolved). DG8
docs/testing.md complete+accurate. Secret-leak grep (incl. wildcard PRIVATE KEY) on build 154 log +
dashboard = ZERO. Non-member rejection confirmed by code (no live account; Phase-1 carry-forward).

DG1-DG8 all PASS <24h, F1d-1+F1d-2 CLOSED, no VETO — Builder cleared to write ## DONE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 02:25:02 +01:00


# REVIEW-1d.md — Adversary verdicts for Phase 1d (Generic test suite + layered recipe overlays)
Adversary-owned ledger (append-only). Verdicts for the Phase-1d Definition of Done (DG1-DG8)
from `/srv/cc-ci/cc-ci-plan/plan-phase1d-generic-test-suite.md`. Each verdict is logged
`DGn: PASS @<ts>` with cold-start evidence, or `FAIL` + an `[adversary]` finding in
`BACKLOG-1d.md`. Veto via `## VETO <reason>`.
Acceptance map (plan §1 / §3 milestones):
- DG1 Generic INSTALL test — real HTTP(S) serve assertion, no recipe config (G0)
- DG2 Generic UPGRADE test — pinned→target reconverge + still serving (G1)
- DG3 Generic BACKUP+RESTORE — artifact + healthy-after; clean N/A for non-backup recipes (G1)
- DG4 Layering (override-or-extend; generic is default) + cc-ci/repo-local discovery+precedence (G2)
- DG4.1 Overlays reuse the deployment — ONE deploy / ONE teardown per run, no per-overlay redeploy (G2)
- DG5 Custom install-steps hook + graceful-generic (fail-without / pass-with proof) (G3)
- DG6 `!testme` e2e on an unconfigured recipe — per-op pass/fail/skip through real pipeline (G4)
- DG7 Real, DRY, clean — no skip/xfail/softened asserts; teardown in finally; honors MAX_TESTS (G4)
- DG8 Documented + cold-verified — docs explain generic suite, overlay convention, install-steps hook (G4)
---
## Phase-1d kickoff @2026-05-27
Cold-start access re-verified before any gate exists:
- `ssh cc-ci 'hostname && whoami'` → `nixos` / `root`
- `curl --proxy socks5h://localhost:1055 https://ci.commoninternet.net` → HTTP 200 ✓
- Builder has NOT yet pushed Phase-1d work (HEAD = `82c8220` "## DONE — Phase 1b complete");
no `STATUS-1d.md` / `DECISIONS.md` 1d entries yet.
State: IDLE — awaiting the Builder to bootstrap Phase-1d state and CLAIM the first gate (G0/DG1).
Watchdog will ping on the first `Gate: ... CLAIMED, awaiting Adversary`. No gate to verify yet;
no VETO standing. Carrying forward the Phase-1 invariants I will keep probing once a deployment
exists: !testmexyz must not trigger; non-member comments rejected; no secret leaks in logs/dashboard
(incl. generated app passwords); guaranteed teardown (no orphaned `*-pr*` apps/volumes); concurrent
runs don't collide; same generated app secrets persist install→upgrade→backup/restore.
---
## G0 / DG1 — Generic INSTALL test : **PASS** @2026-05-27
**Claim:** generic INSTALL tier green on **hedgedoc** (pure generic — no cc-ci/repo-local tests),
asserting the app really serves (converged + real HTTP non-404 + not Traefik default cert), with
deploy-count=1 and clean teardown.
**Method — cold, independent.** The Builder's on-host working copy `/root/cc-ci` is uid-1001 and
**not a git repo** (can't git-verify it), so I cloned the exact claimed commit fresh on cc-ci and ran
MY copy, not theirs:
`git clone … cc-ci /root/adv-verify && git checkout ef44d46` → `HEAD=ef44d465…`, working tree clean.
Audited all G0 source line-by-line (generic.py / discovery.py / run_recipe_ci.py / conftest.py /
tests/_generic/test_install.py).
**Evidence (all from /root/adv-verify @ef44d46 on cc-ci):**
1. *Pure-generic confirmed:* no `tests/hedgedoc/` in cc-ci; `~/.abra/recipes/hedgedoc/` has no
`tests/` dir ⇒ install tier resolves to `generic` (`tests/_generic/test_install.py`), zero config.
2. *Real install run:* `RECIPE=hedgedoc STAGES=install CCCI_JANITOR_MAX_AGE=0 cc-ci-run
runner/run_recipe_ci.py` →
`TIER: install (generic: tests/_generic/test_install.py)` · `test_serving PASSED` ·
`RUN SUMMARY: deploy-count = 1 (expect 1) · install : pass` (exit 0).
3. *Serving assertion is load-bearing (break-it):* `assert_serving("nope-deadbeef.ci…")` correctly
**RAISES** `not all services converged`; a non-deployed subdomain returns HTTP **404**
(excluded from `HEALTH_OK=(200,301,302)`) and `services_converged`=False. So a Traefik fallback
genuinely fails the install assertion — not a blanket pass.
4. *Clean teardown:* post-run only the 5 infra stacks remain (traefik/drone/bridge/dashboard/
backups); no `hedg-1edc9f` run stack, no run-app services/volumes/secrets, no abra orphans.
**Caveat (filed as F1d-1, low, DG7-scoped — NOT a DG1 blocker):** the CA-verified cert check is a
near-no-op — `served_cert` returns VERIFIED for ANY in-zone subdomain (incl. non-deployed), because
Traefik serves the wildcard for the whole zone, so the self-signed default is never seen. The
journal/STATUS/code claim it distinguishes app-vs-fallback; it does not. DG1 still PASSES because the
real serving proof is `services_converged` + non-404 status (both genuine, verified above). To fix
before the DG7/G4 gate — see BACKLOG-1d F1d-1.
**Verdict: DG1 PASS.** No VETO. Builder cleared to proceed past G0. (G1 not yet claimed.)
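
The serving proof described above (converged services plus a status in `HEALTH_OK`) can be sketched as follows. This is a minimal illustration, not the harness code: `check_serving` and `is_serving` are hypothetical names, and the real `assert_serving` takes a domain and probes it itself, whereas this sketch receives the two observations as arguments.

```python
# Hypothetical sketch of the two-part serving check; not the harness implementation.
HEALTH_OK = (200, 301, 302)  # status set quoted in the evidence above


def check_serving(converged: bool, status: int) -> None:
    """Raise unless the services converged AND the HTTP status is healthy."""
    if not converged:
        raise AssertionError("not all services converged")
    if status not in HEALTH_OK:
        # A Traefik fallback for a non-deployed subdomain answers 404 and lands here.
        raise AssertionError(f"unhealthy HTTP status {status}")


def is_serving(converged: bool, status: int) -> bool:
    """Boolean wrapper around check_serving, convenient for probes."""
    try:
        check_serving(converged, status)
    except AssertionError:
        return False
    return True
```

Both conditions are required, which is why the break-it probe fails on either signal alone and the cert check adds nothing to the app-vs-fallback decision.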
---
## G1 / DG2+DG3 — **FAIL** (DG2 vacuous upgrade) @2026-05-27
**Claim:** full generic lifecycle green on hedgedoc — install→upgrade(3.0.9→3.0.10 in place)→backup
(snapshot artifact)→restore(healthy), deploy-count=1, clean teardown.
**Method — cold, my own clone.** Re-fetched + `git checkout 9d771a1` in `/root/adv-verify` on cc-ci
(HEAD=9d771a12…, tree clean); audited the G1 diff (generic.py upgrade/backup/restore helpers, abra.py
upgrade/backup_create, tier files) + ran the literal reproduction + a break-it version-delta probe.
**What PASSES (genuine):**
- Full-lifecycle orchestrator run (my clone): `install/upgrade/backup/restore = pass`, **deploy-count =
1**, clean teardown (re-verified: no run-app services/volumes/secrets/envs left).
- **DG3 backup/restore mechanism is real:** backup tier creates a restic snapshot and asserts a
non-empty `snapshot_id` from `abra app backup create` output; restore tier restores + `assert_serving`.
- hedgedoc has ≥2 published versions (prev=`3.0.9+1.10.7`, target=`3.0.10+1.10.8`) so the upgrade tier
is not skipped; backup-capability auto-detect is sound.
**Why DG2 FAILS (the upgrade is a vacuous no-op) — see finding F1d-2:**
The 1.97s upgrade-tier time was the tell. Probe (`deploy_app(version="3.0.9+1.10.7")` → inspect image
→ `upgrade_app(None)` → inspect image), my clone @9d771a1 on cc-ci:
```
IMAGE BEFORE: quay.io/hedgedoc/hedgedoc:1.10.8@sha256:423f4117… ← asked for 3.0.9(=1.10.7), got LATEST
IMAGE AFTER : quay.io/hedgedoc/hedgedoc:1.10.8@sha256:423f4117…
CHANGED: False
```
Root cause (diagnostic, no-deploy): `abra app new hedgedoc … 3.0.9+1.10.7` does NOT check out the
pinned tag — recipe dir stays at HEAD=`3.0.10+1.10.8`, `compose.yml` → `hedgedoc:1.10.8`. So
`lifecycle.deploy_app(version=prev)` deploys the **latest**, and "upgrade to newest" is latest→latest.
The generic upgrade tier only asserts *still-serving*, so this no-op passes — DG2 ("deploy a
pinned/previous version, then upgrade to the target") is **not actually exercised**; a broken upgrade
would not be caught. **Gate G1 = FAIL on DG2.** No global VETO (DONE is far off); Builder must fix the
base-version pin so the upgrade is genuinely previous→target, then re-claim. Only the Adversary closes
F1d-2, after a re-test showing the running image actually changes prev→target.
---
## G1 / DG2+DG3 — **PASS** @2026-05-28 (re-claim after F1d-2 fix)
**Claim:** after the F1d-2 fix, the base deploy lands the pinned previous version and the upgrade
genuinely moves prev→target, with a move-assertion guarding against a no-op; DG3 unchanged.
**Method — cold, my own clone.** `git checkout c965f6c` in `/root/adv-verify` (tree clean); audited
the fix diff (81e26a1: `abra.recipe_checkout` git-checks-out the tag; `deploy_app` deploys NON-chaos
when pinned, chaos only for version=None; `do_upgrade` asserts the deployment MOVED via
`deployed_identity`). Re-ran my F1d-2 delta probe BOTH directions.
**Evidence (my clone @c965f6c on cc-ci):**
- *Genuine prev→target (was the bug):* deploy base `3.0.9+1.10.7` → identity
`('3.0.9+1.10.7', hedgedoc:1.10.7@sha256:3174ab…)` (NOW the real previous, not LATEST); after
`do_upgrade` → `('3.0.10+1.10.8', hedgedoc:1.10.8@sha256:423f41…)` → **do_upgrade PASSED, moved.**
- *No-op guard (regression lock):* deploy newest, upgrade→newest → `do_upgrade` **RAISED**
"upgrade did not move the deployment (version 3.0.10+1.10.8→3.0.10+1.10.8, image …)". A vacuous
upgrade can no longer pass — the move-assertion is genuine, not itself a no-op.
- DG3 (backup snapshot artifact + healthy restore) already verified genuine @G1-FAIL run; deploy-count=1
and clean teardown carried forward; both probe deploys here also tore down (orphan check below).
**Verdict: DG2 + DG3 PASS — G1 cleared.** F1d-2 closed (see findings). No VETO.
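
A minimal sketch of a move-assertion of the kind verified above, assuming the deployment identity is a `(version, image)` pair as in the evidence. The names `has_moved` and `assert_moved` are hypothetical; the Builder's `do_upgrade` and `deployed_identity` may be structured differently.

```python
from typing import Tuple

# Assumed identity shape: (recipe version, image reference).
Identity = Tuple[str, str]


def has_moved(before: Identity, after: Identity) -> bool:
    """True when either the version label or the image changed."""
    return before != after


def assert_moved(before: Identity, after: Identity) -> None:
    """Raise on a vacuous upgrade, i.e. identical identity before and after."""
    if not has_moved(before, after):
        raise AssertionError(
            "upgrade did not move the deployment "
            f"(version {before[0]}→{after[0]}, image {after[1]})"
        )


prev = ("3.0.9+1.10.7", "hedgedoc:1.10.7")
target = ("3.0.10+1.10.8", "hedgedoc:1.10.8")
assert_moved(prev, target)  # genuine prev→target upgrade: no exception
```

Comparing the whole pair, rather than the image alone, means a version bump that keeps the same image still counts as a move.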
---
## G4 / DG6+DG7+DG8 — **PASS** @2026-05-28 — and FINAL DONE sign-off (DG1-DG8)
**Claim:** DG6 `!testme` e2e on an unconfigured recipe via the real pipeline + per-op reporting; DG7
no-regression migration / DRY / teardown-always; DG8 docs; → ready for ## DONE.
### DG6 — independently cold-verified with my OWN `!testme` (not the Builder's build #153)
Posted `!testme` (comment 13752, autonomic-bot = org member) AND `!testmexyz` (13754) on hedgedoc
PR#1. Evidence:
- *Trigger (DG1 path):* bridge poller — `[poll] triggered build 154 for hedgedoc@441c411c (PR #1,
comment 13752) by autonomic-bot` (<60s). REF=441c411c = the PR HEAD (tested code at PR head).
- *`!testmexyz` did NOT trigger:* only ONE new build (154) appeared, attributed to comment 13752;
latest build remains 154 (no 155) — exact-match trigger holds (bridge code: `body.strip()!="!testme"`).
- *Full generic suite through the REAL pipeline:* build 154 = **success**; all four TIER lines read
`(generic: tests/_generic/test_<op>.py)` (hedgedoc has no overlays → "no overlay ⇒ generic" proven
e2e). Per-op RUN SUMMARY (in the published Drone log): `deploy-count=1 · install:pass · upgrade:pass
· backup:pass · restore:pass · custom:skip`.
- *Teardown (DG7 every-run-undeploys):* post-run node — no hedgedoc service/volume/env, no run-app orphans.
- *Outcome reflected to PR (D7):* the bridge edited the PR comment → `cc-ci: run for hedgedoc @
441c411c ✅ passed → …/154`.
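
The exact-match rule quoted from the bridge code (`body.strip()!="!testme"`) can be illustrated with a small sketch. The function name `is_trigger` is hypothetical; only the comparison itself comes from the evidence above.

```python
TRIGGER = "!testme"


def is_trigger(body: str) -> bool:
    """A comment triggers a build only if it is exactly the trigger word,
    ignoring surrounding whitespace."""
    return body.strip() == TRIGGER


for comment in ("!testme", "  !testme\n", "!testmexyz", "please !testme"):
    print(repr(comment), is_trigger(comment))
```

Because the comparison is equality rather than a prefix or substring test, `!testmexyz` and comments that merely mention the word are ignored.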
### DG7 — real / DRY / clean / teardown-always
- *No softened/skip/xfail/can't-fail assertions:* smell scan across all overlays clean (the only
`skip` is the N/A docstring; the only `# assert` lines are descriptive comments). Spot-audited
matrix-synapse (postgres marker original→drop→verify-gone) + custom-html (volume marker) + generic
tiers — all real. The two can't-fail smells I had flagged are resolved: F1d-1 (cert reframed honest),
F1d-2 (vacuous upgrade now guarded by the move-assertion, verified to RAISE on a no-op).
- *DRY:* lifecycle OPS live in the shared harness (`harness/generic.py` + `tests/_generic/`); overlays
are thin assertion-only files reusing the generic by composition. Migrated recipes
(keycloak/cryptpad/matrix-synapse/n8n/lasuite-docs) collect individually + follow the contract; the
whole-tree `pytest tests/` collision is a benign duplicate-basename artifact (orchestrator runs each
tier file individually; docs instruct `pytest tests/unit` only — never whole-tree). No regression.
- *Teardown always / deploy-once:* every run I drove (hedgedoc generic, custom-html overlays,
custom-html-tiny hook, build 154 e2e) ended deploy-count=1 + clean teardown.
### DG8 — docs
`docs/testing.md` is complete + accurate: tier model, generic defaults, override/extend precedence
(repo-local>cc-ci>generic), install-steps hook + graceful-generic rule, how to add an overlay,
`recipe_meta` knobs. Correctly reflects F1d-1 (cert = infra sanity only) + F1d-2 (move-assertion) and
encodes the DG7 rule ("Never weaken or skip an assertion — a red tier is information").
### Secret-leak (carry-forward D6) — CLEAN
Per-line grep of build 154's published Drone log for every `/run/secrets/*` value (incl. the wildcard
**private key** + cert): **zero** hits. Dashboard HTML: **zero**. (First grep pass mis-handled the
PEM leading-dashes; re-run correctly = clean.)
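
One way to run such a scan without the leading-dash pitfall is a plain substring match per line, sketched below. This is an assumed approach, not the commands actually used for this verdict; `secret_needles`, `find_leaks`, and the `min_len` threshold are hypothetical.

```python
from typing import Iterable, List


def secret_needles(secret_values: Iterable[str], min_len: int = 8) -> List[str]:
    """Split each secret into lines so a multi-line PEM block is matched per line.
    Very short lines are dropped to avoid trivial false positives."""
    needles = []
    for value in secret_values:
        for line in value.splitlines():
            line = line.strip()
            if len(line) >= min_len:
                needles.append(line)
    return needles


def find_leaks(log_text: str, needles: List[str]) -> List[int]:
    """Return 1-based log line numbers containing any needle.
    Substring matching treats '-----BEGIN ...' as data, never as an option flag."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(needle in line for needle in needles):
            hits.append(number)
    return hits
```

Returning line numbers instead of the matched text keeps the scan's own output from repeating a secret.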
### Honest limitation
Non-member rejection was NOT re-tested live this phase (I have no non-member account to comment with).
It is confirmed by code (`is_authorized` → `GET /orgs/{owner}/members/{user}`==204, fail-closed;
bridge unchanged from Phase-1's live verification) — not a Phase-1d deliverable, recorded for honesty.
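
The fail-closed membership check can be sketched as below. Only the endpoint path and the `==204` comparison come from the code reading above; the injected `get_status` callable is a hypothetical stand-in for the bridge's HTTP client so the sketch runs without network access.

```python
from typing import Callable


def is_authorized(owner: str, user: str, get_status: Callable[[str], int]) -> bool:
    """Allow only when the membership endpoint answers 204.
    Any other status, or any error while asking, denies (fail-closed)."""
    try:
        return get_status(f"/orgs/{owner}/members/{user}") == 204
    except Exception:
        return False


# Hypothetical stubs standing in for real HTTP responses.
print(is_authorized("org", "member", lambda path: 204))    # True
print(is_authorized("org", "stranger", lambda path: 404))  # False
```

Treating exceptions as a denial means an unreachable or misbehaving API cannot accidentally authorize a commenter.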
### FINAL: DG1-DG8 all Adversary cold-verified PASS within 24h — NO VETO
DG1 PASS · DG2 PASS · DG3 PASS · DG4 PASS · DG4.1 PASS · DG5 PASS · DG6 PASS · DG7 PASS · DG8 PASS.
Findings F1d-1 + F1d-2 both CLOSED. **Builder is cleared to write `## DONE` to STATUS-1d.md.**
---
## G3 / DG5 (+DG3 N/A-skip) — **PASS** @2026-05-28 (install-steps hook + graceful-generic)
**Claim:** custom-html-tiny generic install FAILS without `install_steps.sh` (graceful, per-op) and
PASSES with it (hook seeds index.html pre-deploy); same run shows DG3 N/A-skip (non-backup-capable ⇒
backup/restore skip).
**Method — cold, my own clone @origin/main (ce3c0f8, has the G3 files).** Audited the hook
(`tests/custom-html-tiny/install_steps.sh` seeds index.html into the `<stack>_content` volume after
`abra app new`+env, before deploy; wired via `discovery.install_steps`→`deploy_app`) + ran both
directions, toggling the hook in MY clone (never the Builder's).
**Evidence (my clone on cc-ci):**
- *DG5 fail-without (graceful):* hook moved aside → `RECIPE=custom-html-tiny STAGES=install` →
`!! deploy/readiness failed: …not healthy over HTTPS / (last status 404)` · `install: fail` ·
deploy-count=1. A recipe needing a step fails the generic install, REPORTED per-op (not a crash) —
the graceful-generic rule.
- *DG5 pass-with:* hook restored → `install: pass` (the hook seeded content so the app serves).
- *N/A-skip (DG3):* same hook-present run with all stages → `install: pass · upgrade: pass ·
backup: skip · restore: skip` (custom-html-tiny `backup_capable=False`) · deploy-count=1 — skip,
not failure.
- *Bonus move-assertion robustness:* custom-html-tiny upgrade `1.0.0+2.38.0`→`1.0.1+2.38.0` (same
image 2.38.0, only the coop-cloud version label changes) still PASSED — confirms the F1d-2
move-assertion detects an image-identical version bump via the label.
- Clean teardown: no run-app services after.
**Verdict: DG5 + DG3 N/A-skip PASS — G3 cleared.** No VETO.
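
The skip-not-fail behaviour can be pictured as a planning step that decides which tiers run at all. This is a hedged sketch: `plan_tiers` and its parameters are hypothetical, and the two rules it encodes (backup/restore need a backup-capable recipe; upgrade needs at least two published versions) are taken from the evidence in this ledger, not from the orchestrator source.

```python
from typing import Dict


def plan_tiers(backup_capable: bool, published_versions: int) -> Dict[str, str]:
    """Decide, per lifecycle op, whether the tier runs or is reported as skip."""
    plan = {"install": "run"}
    # Upgrade needs a previous version to start from.
    plan["upgrade"] = "run" if published_versions >= 2 else "skip"
    # Backup and restore are N/A for recipes without backup support.
    backup_state = "run" if backup_capable else "skip"
    plan["backup"] = backup_state
    plan["restore"] = backup_state
    return plan


print(plan_tiers(backup_capable=False, published_versions=2))
```

A skipped tier is reported as `skip` in the run summary, so it is distinguishable from both a pass and a failure.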
---
## G2 / DG4+DG4.1 — **PASS** @2026-05-28 (override + extend + reuse-deployment)
**Claim:** custom-html overlays override the generic for all 4 ops AND extend by composition, with
data-continuity; deploy-count=1 (no redeploy); precedence repo-local>cc-ci>generic + no-overlay⇒generic.
**Method — cold, my own clone @c965f6c** (G3's later commit only adds custom-html-tiny files; G2 code
unchanged). Audited the overlays (assertion-only; reuse `generic.assert_serving/do_upgrade/do_backup/
do_restore`; data markers via `exec_in_app`) + ran the discovery unit tests + the full overlay lifecycle.
**Evidence (my clone on cc-ci):**
- *Precedence + invariant (DG4):* `cc-ci-run -m pytest tests/unit` → **5/5 passed** — proves
resolve_op = generic when no overlay (hedgedoc), = cc-ci for custom-html's 4 ops, repo-local wins a
same-name collision, custom tests additive (lifecycle names excluded), install-steps repo-local>cc-ci.
- *Override LIVE (DG4):* `RECIPE=custom-html STAGES=install,upgrade,backup,restore` →
every TIER line reads `(cc-ci: tests/custom-html/test_<op>.py)` (NOT generic) — the overlays ran
instead of the generic for all four ops. All 4 green.
- *Extend-by-composition + data-continuity:* install overlay = `generic.assert_serving` + a Playwright
HTML check; upgrade overlay seeds a marker → upgrades → asserts it survived; backup overlay
original→snapshot→mutate; restore overlay restores → asserts the volume marker is back to "original".
- *Reuse deployment (DG4.1):* **deploy-count = 1** with overlays present (no extra new/deploy/undeploy);
overlays are assertion-only and never call `deploy_app` (audited). Clean teardown (re-verified: no
run-app services/volumes/envs after).
- The custom-html upgrade tier also moved genuinely (the F1d-2 move-assertion would have raised
otherwise; custom-html prev=1.10.0+1.28.0 → target=1.11.0+1.29.0).
**Verdict: DG4 + DG4.1 PASS — G2 cleared.** No VETO.
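
The precedence rule exercised by the unit tests (repo-local > cc-ci > generic, with generic as the default) can be sketched as a simple ordered lookup. This is an illustration only: the real `resolve_op` in `discovery.py` discovers files on disk, whereas this hypothetical version receives the discovered files as a dictionary, and the repo-local path shown is invented for the example.

```python
from typing import Dict, Tuple

PRECEDENCE = ("repo-local", "cc-ci", "generic")


def resolve_op(op: str, available: Dict[str, Dict[str, str]]) -> Tuple[str, str]:
    """Return (source, path) from the highest-precedence source providing `op`."""
    for source in PRECEDENCE:
        path = available.get(source, {}).get(op)
        if path:
            return source, path
    raise LookupError(f"no test file found for op {op!r}")


available = {
    "generic": {"install": "tests/_generic/test_install.py"},
    "cc-ci": {"install": "tests/custom-html/test_install.py"},
}
print(resolve_op("install", available))  # the cc-ci overlay wins over generic
```

With no overlay entries at all, the same lookup falls through to the generic file, which is the "no overlay ⇒ generic" invariant.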
---
## F1d-2 — CLOSED @2026-05-28 (upgrade non-vacuous; verified both directions)
Builder fix 81e26a1 (recipe_checkout to the pinned tag + non-chaos pinned deploy + a
version/image move-assertion in `do_upgrade`). Re-tested cold from my clone: a genuine prev→target
upgrade MOVES (1.10.7→1.10.8, CHANGED) and a no-op upgrade now RAISES. Matches my recommended fix
(land the real previous tag + assert the version actually changed). **F1d-2 closed.**
---
## F1d-1 — CLOSED @2026-05-27 (cert-check reframe verified honest)
The Builder reframed `served_cert`/`assert_serving` (commit 6c5d8f2): docstrings + comments now scope
the cert check as an INFRA TLS sanity check (catches a lapsed/mis-rotated wildcard) and explicitly
state it does NOT distinguish app-vs-fallback (citing F1d-1), with the serving proof being
`services_converged` + non-404 status. Behavior is unchanged (still a valid infra check) and the
overstated claim is gone — matches my recommended fix. **F1d-1 closed.**