REVIEW-shot.md — Adversary verdicts, phase shot (recipe screenshot audit & repair)
Owner: Adversary loop. Append-only verdict log. Gates: M1 (audit+diagnosis), M2 (all working).
SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-shot-screenshots.md.
No gate CLAIMED yet (phase just opened; Builder has not bootstrapped STATUS-shot.md). Doing independent cold ground-truth prep below so M1/M2 cold-verify is fast and un-anchored.
Independent cold pre-audit (Adversary, @2026-06-11T01:20Z)
Method: ssh cc-ci, scanned /var/lib/cc-ci-runs/*/results.json for recipe + screenshot field +
on-disk screenshot.png size; scp'd suspect PNGs locally and looked at them (Read tool).
This is MY ground truth, formed before any Builder claim — to compare against the Builder's matrix.
PNG sizes from latest representative runs (m2r-* sweep + numbered drone runs):
| recipe | PNG bytes | my visual read | class |
|---|---|---|---|
| immich | 4801 | pure blank white frame | BLANK |
| n8n | 4801 | blank near-white frame | BLANK |
| lasuite-meet | 4801 | (size-identical to immich/n8n 4801B — blank tell) | BLANK (to confirm visually) |
| cryptpad | 4802 | blank light-grey frame | BLANK |
| keycloak | 8764 | spinner + "Loading the Administration Console" — paint-race loading state, NOT a real login form | BLANK/LOADING (not the "genuine sparse login" §2 guessed) |
| lasuite-docs | 6022 | bare spinner on white | BLANK/LOADING |
| lasuite-drive | ~5.9K | (size sibling of lasuite-docs — likely same spinner) | BLANK (to confirm) |
| plausible | null / NO PNG | every run null (122→357 incl. 357); run dir has no screenshot.png; capture stdout not in run dir (goes to Drone build log) — root cause still to trace | NULL |
| ghost | 444183 | (reference healthy, §2) | OK (visual-confirm at M2) |
| mattermost-lts | 242139 | reference healthy | OK |
| hedgedoc | 131967 | reference healthy | OK |
| discourse | 66-67K | reference healthy | OK |
| custom-html | 35707 | reference healthy | OK |
| mailu | 33800 | reference healthy | OK |
| matrix-synapse | 33296 | reference healthy | OK |
| uptime-kuma | 30858 | reference healthy | OK |
| custom-html-tiny | 12950 | reference healthy | OK |
| mumble | 7913 | voice server — web-UI N/A candidate (confirm) | N/A? |
Confirmed defect classes match the orchestrator pre-audit (§2): SPA paint-race (domcontentloaded fires before JS paints) → immich/n8n/cryptpad fully blank, keycloak/lasuite-docs/-drive caught at loading spinner; plausible never captures (null on every run). The 4801B byte-identical size is a reliable blank-frame fingerprint.
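The size scan described under Method can be sketched roughly as below. This is a minimal illustration, not the command actually run: the `recipe` key in results.json, the 10,000-byte threshold, and the `triage_runs` name are assumptions. Size is only a triage hint for which PNGs to open first, and every class in the table still comes from a visual read.

```python
import json
from pathlib import Path

# Assumption: anything at or below this size deserves a visual look first.
# The 4801-4802 B blank frames and the 6-9 KB spinner frames all fall under it.
SUSPECT_MAX_BYTES = 10_000


def triage_runs(runs_root):
    """Return (run_name, recipe, png_bytes_or_None, label) for each run dir."""
    rows = []
    for results in sorted(Path(runs_root).glob("*/results.json")):
        try:
            data = json.loads(results.read_text())
        except (OSError, ValueError):
            continue  # unreadable results.json: skip rather than guess
        # "recipe" is an assumed key name for this sketch.
        recipe = data.get("recipe") if isinstance(data, dict) else None
        png = results.parent / "screenshot.png"
        if not png.is_file():
            rows.append((results.parent.name, recipe, None, "NULL"))
            continue
        size = png.stat().st_size
        label = "SUSPECT" if size <= SUSPECT_MAX_BYTES else "LIKELY-OK"
        rows.append((results.parent.name, recipe, size, label))
    return rows
```

A call such as `triage_runs("/var/lib/cc-ci-runs")` would list the runs to pull by scp; a LIKELY-OK label is not acceptance on its own.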
Open items I must still resolve when verifying:
- plausible NULL root cause — need the Drone build log for a plausible run (capture stdout: "capture failed" vs "produced no file" vs step never reached). Run dir alone doesn't have it.
- lasuite-meet / lasuite-drive / mumble — visual confirm.
- Authoritative enrolled-recipe set: every `tests/<recipe>/recipe_meta.py` minus fixtures (_generic, regression, concurrency, custom-html-bkp-bad, custom-html-rst-bad).
No verdict yet. Awaiting claim(shot): M1.
M1: PASS @2026-06-11T01:38Z (audit + diagnosis complete)
Claim: claim(shot): M1 commit e005897; matrix+diagnoses at 8978fa6. STATUS-shot.md "M1 claim".
Verified COLD from my own clone + ssh cc-ci, without reading JOURNAL-shot.md (anti-anchoring).
The Builder's matrix agreed with my independent pre-audit (commit 4f3a747, formed BEFORE reading that matrix) on every BLANK/LOADING/NULL read I had pre-formed, so no anchoring.
Enrolled set — complete, no omissions. ls tests/*/recipe_meta.py = 21. Minus the two harness
canaries custom-html-bkp-bad, custom-html-rst-bad (plan §2 explicitly excludes both) = 19.
The 19 matrix rows are exactly that set (diffed by hand) and exactly the plan §2 expected set.
_generic/regression/concurrency/unit have no recipe_meta.py → correctly absent. ✓
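The enrolled-set check above was diffed by hand; the same rule could be sketched as follows. The function names are hypothetical, and the only facts taken from this log are the rule itself (a `recipe_meta.py` must exist) and the two excluded canaries.

```python
from pathlib import Path

# The two harness canaries that plan §2 excludes from the enrolled set.
CANARIES = {"custom-html-bkp-bad", "custom-html-rst-bad"}


def enrolled_recipes(tests_root):
    """Directories under tests/ that carry a recipe_meta.py, minus canaries."""
    names = {p.parent.name for p in Path(tests_root).glob("*/recipe_meta.py")}
    return sorted(names - CANARIES)


def diff_against_matrix(enrolled, matrix_rows):
    """Return (missing_from_matrix, unexpected_in_matrix), both sorted."""
    enrolled, matrix_rows = set(enrolled), set(matrix_rows)
    return sorted(enrolled - matrix_rows), sorted(matrix_rows - enrolled)
```

Two empty lists from `diff_against_matrix` correspond to the "no omissions" result recorded here; directories without a `recipe_meta.py` never enter the set, so they need no explicit exclusion.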
Every non-OK row has evidence-backed root cause (independently re-derived):
- plausible NULL: ran the Builder's drone-log command myself. Build 357 step log shows `capture failed … page.goto(https://plau-…/) never returned a status in (200,301,302,303,401,403) after 15 attempts (45s); last status=500.` `/` 500s by design (DISABLE_AUTH) → default landing capture can never succeed; needs a SCREENSHOT hook to a rendering path. Confirmed. ✓
- bluesky-pds NULL: capture is `if deploy_ok:`-gated, OUTSIDE the deploy try/except (runner/run_recipe_ci.py:1024, read it). install=fail level=0 → capture correctly skipped. Not a screenshot defect; upstream image breakage already in DEFERRED.md (rcust). ✓
- BLANK/LOADING: screenshot.py:84-93 navigates with `wait_until="domcontentloaded"` then screenshots immediately, no paint wait; accept_statuses excludes 500 (plausible mechanism). Read the code. ✓
- mumble NOT N/A: tests/mumble/recipe_meta.py header: deploys `compose.mumbleweb.yml`, a mumble-web HTTP client routed through Traefik, HEALTH_PATH "/". A real web surface IS served → correctly the HARDER (non-N/A) call. ✓
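The plausible failure mode can be illustrated with a small polling sketch. It is not the code in screenshot.py: the accepted statuses and the 15-attempt count come from the build 357 log line, while the 3 s spacing (inferred from 15 attempts in 45s), the function name, and the injectable `fetch_status` and `sleep` callables are assumptions made so the logic runs without a browser.

```python
import time

# Accepted statuses as quoted in the build 357 step log.
ACCEPT_STATUSES = (200, 301, 302, 303, 401, 403)


def wait_for_accepted_status(fetch_status, attempts=15, interval_s=3.0, sleep=time.sleep):
    """Poll fetch_status() until it returns an accepted status.

    Returns (ok, last_status, attempts_used).
    """
    last = None
    for n in range(1, attempts + 1):
        try:
            last = fetch_status()
        except Exception:
            last = None  # a navigation error counts as a failed attempt
        if last in ACCEPT_STATUSES:
            return True, last, n
        if n < attempts:
            sleep(interval_s)  # assumed spacing; no sleep after the final attempt
    return False, last, attempts
```

An endpoint that always answers 500 exhausts every attempt and returns `(False, 500, 15)`, which is why the default landing path cannot succeed for plausible and a hook to a rendering path is needed.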
Independent visual spot-checks (Read tool): 11 artifacts, matrix matched reality on every one. Includes plausible ✓ and multiple 4801B cases ✓ (M1 minimum was ≥5 incl. those; exceeded).
- immich 4801B = pure white.
- n8n 4801B = blank.
- cryptpad 4802B = blank grey.
- lasuite-meet 4801B = pure white.
- keycloak 8764B = "Loading the Administration Console" spinner (NOT a real login; the §2 "might be a genuine login" guess was wrong, Builder classed it LOADING correctly).
- lasuite-docs 6022B = bare spinner.
- mumble 7913B = spinner ring on grey.
- mattermost-lts 242139B = blue brand splash + logo, NO login form (correctly LOADING despite large size; size alone is NOT a sufficient signal, good catch).
- n8n run 197 30256B = real "Set up owner account" form, empty fields, credential-free (flaky-pass + secret-safe, confirmed).
- custom-html 35707B = genuine "Welcome to nginx!" (honest fresh-install view for a bare static host: OK).
- plausible = NULL via drone log.
N/A arguments — agreed:
- bluesky-pds → justified N/A (deploy-gated: can't screenshot what can't deploy; upstream breakage is pre-existing/DEFERRED, not a screenshot defect). Agreed, contingent on the upstream image still being broken at M2 — if it becomes deployable, it re-enters as a real recipe.
- mumble → NOT N/A. Agreed (real mumble-web surface, evidence above).
No omissions, no fabricated visual reads, diagnoses are causal not symptomatic. M1 PASS.
Watch-list for M2 (so the Builder has it early — NOT blocking M1):
- Harness default-wait fix must stay within NAV_DEADLINE_S=45 / step worst-case ≤~60s and must NEVER affect a verdict on screenshot failure (R7) — I will test the failure path has teeth but no verdict impact, and compare pre/post run durations.
- plausible SCREENSHOT hook must land on a credential-free rendering path (not /login showing a generated secret; not a 500 page).
- mattermost-lts proof: a bigger PNG is NOT acceptance — I will visually confirm the real login, not a brand splash.
- Secret-safety: every final PNG must show no generated credentials (install wizards, secrets pages). n8n's "Set up owner account" with EMPTY fields is the safe shape; a pre-filled one is not.
- M2 requires ≥2 proof runs via the drone `!testme` path + me Reading every final PNG.
Did not read JOURNAL-shot.md before this verdict. No finding filed (audit is accurate). No VETO.
M2: PASS @2026-06-11T07:17:53Z — all screenshots working (cold-verified from scratch)
Verified independently from a cold start (my own clone, my own scp/Read/re-runs; did NOT read
JOURNAL before this verdict). Claim commit 196156e. Every M2 DoD item checked:
1. Every final PNG Read (18/18) — real, representative, credential-free. Pulled each PNG by scp, Read it with the image tool, byte-size matched the claim on all 18:
- Fixed-class (10): immich 234351B "Welcome to Immich" onboarding; plausible 64132B real registration form (EMPTY fields); keycloak 215587B real "Sign in to your account" (EMPTY) — was the 8764B "Loading Admin Console" spinner at M1, settle fix resolved it; cryptpad 57310B real landing + doc-type picker; lasuite-meet 225686B real video-conf landing; lasuite-docs 284769B real Docs landing; lasuite-drive 132037B real "Fichiers" landing; n8n 26433B "Set up owner account" (ALL fields EMPTY — secret-safe, now deterministic); mattermost-lts 178367B real "Log in to your account" form (EMPTY) — NOT the byte-identical interstitial (hook v2 click-through works — my sharpest watch-item, resolved); mumble 7980B loader spinner (see §N/A).
- Healthy-class (8): ghost 444183B blog landing; hedgedoc 131967B landing; discourse 66121B forum + welcome topic; custom-html 35707B "Welcome to nginx!" (honest fresh-install); custom-html-tiny 12950B seeded content; mailu 33800B sign-in (EMPTY); matrix-synapse 33296B "It works!"; uptime-kuma 30858B "Create your admin account" (EMPTY). Every login/setup form has EMPTY fields — NO generated credential is shown anywhere. Secret-safety cardinal guardrail holds across all 18.
2. No verdict/level regression. All 10 proof runs status=pass at their baseline level (immich/plausible/keycloak/cryptpad/lasuite-*/n8n/mumble=4, mattermost-lts=2). screenshot field populated on every one. no_secret_leak=true on every proof run I sampled (370/371/keycloak/n8n/mattermost/mumble).
3. ≥2 genuine drone !testme proofs: confirmed end-to-end, NOT manual. ccci-bridge_app logs: `[poll] triggered build 370 for immich@107d7220 (PR #2, comment 14321) by autonomic-bot` and `...build 371 for plausible@13458fac (PR #3, comment 14322)...`, both `reflected outcome ...: success`. The bridge polled Gitea, found real !testme comments, triggered the builds, and reflected verdicts back: the full comment→build path. Drone params {RECIPE,PR,REF,SRC}, event=custom, trigger/sender=autonomic-bot; this matches the Phase-1c bridge-!testme fingerprint (REVIEW-1c:110).
4. Durations unaffected (no balloon). Drone same-recipe pre/post: immich 199s→198s, plausible 209s→166s (faster — capture no longer burns 45s failing on the 500). Screenshot step wait budget = 60000ms exactly (unit test_wait_budget_within_step_cap + my own cold probe). ≤~60s holds.
5. R7 (cosmetics never block) — intact. Call site run_recipe_ci.py ~1024-1037 is OUTSIDE the
deploy try/except AND double-wrapped in its own try/except (_scrub-bed log) — and git log proves
NO shot-phase commit touched run_recipe_ci.py (call site unchanged). capture() swallows everything →
None → placeholder. I cold-probed the new helpers independently: _settle swallows all exceptions,
_snap keeps the larger frame (A1 fix, 5/5), 60s budget: 9/9+5/5 pass. Screenshot unit suite 12/12 + card suite 10/10 ran GREEN cold on the real harness (cc-ci-run) from my scp'd clone.
6. Dashboard/card/badge render — live 200. GET dashboard / → 200; runs/370+371/screenshot.png → 200 image/png; badge/immich.svg + badge/plausible.svg → 200 image/svg+xml.
7. N/A set (19/19 enrolled, no omissions) — AGREED.
- bluesky-pds → N/A, re-confirmed at M2 (ab-bluesky-pds-oldmain: install=fail, level=0, screenshot=null → placeholder correct; upstream MODULE_NOT_FOUND still broken, DEFERRED).
- mumble → N/A-variant, AGREED. This reverses my M1 "NOT N/A" stance, on NEW evidence not available at M1. rankenstein/mumble-web:0.5 renders no usable UI for an anonymous browser: connect-dialog DOM genuinely absent (probe4 console: `#connect-dialog_input_address ... did not match any element`), perpetual loading-container spinner at 5/15/30/60/90s (probe2), corroborated by my own Read of the 7980B spinner PNG. The loader frame is the literal web-surface reality every visitor gets; mumble's actual function (voice) is fully protocol-tested; a fix needs a recipe/overlay change (out of scope, guardrail prefers upstream). Documented in DEFERRED with an upstream question. NOTE (not a defect, not a veto): the dashboard shows the honest loader frame rather than the "no screenshot" placeholder. That is acceptable as a documented, agreed limitation, NOT a healthy-app screenshot.
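The R7 properties checked in item 5 (settle swallows everything, the larger frame is kept, any failure yields None for the placeholder) can be sketched as below. This is an illustration under assumptions, not the harness code: `settle`, `larger_frame`, and `capture` are hypothetical stand-ins for the `_settle`, `_snap`, and `capture()` helpers named above, and the `page` argument is assumed to expose the Playwright sync `Page` methods `goto`, `wait_for_load_state`, and `screenshot`.

```python
def settle(page, budget_ms):
    """Best-effort paint wait that never raises."""
    try:
        page.wait_for_load_state("networkidle", timeout=budget_ms)
    except Exception:
        pass  # a page that never goes idle must not fail the step


def larger_frame(first, second):
    """Keep whichever PNG byte string is bigger; None or empty is ignored."""
    candidates = [frame for frame in (first, second) if frame]
    return max(candidates, key=len) if candidates else None


def capture(page, url, budget_ms=60_000):
    """Return PNG bytes, or None so the caller can show a placeholder."""
    try:
        page.goto(url, wait_until="domcontentloaded")
        early = page.screenshot()  # may still be a blank or spinner frame
        settle(page, budget_ms)
        late = page.screenshot()   # taken after the best-effort paint wait
        return larger_frame(early, late)
    except Exception:
        return None  # cosmetic step: never propagate into the verdict
```

Keeping the larger of two frames is only a heuristic in this sketch; as the mattermost-lts splash showed, a large PNG is not proof of a real page, so the visual Read remains the acceptance check.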
Finding A1 (blank-retry regression) was filed, fixed (7ad7d1f), and CLOSED after my cold re-test.
No open findings. No fabricated reads — every matrix/claim value matched what I independently
observed. M2 PASS. No VETO. With M1 PASS (ae10b55) + M2 PASS both fresh and A1 closed, the DoD
handshake (§6.1) is satisfied — the Builder may write ## DONE to STATUS-shot.md.
(Consulted no JOURNAL-shot.md before forming this verdict.)