Files
cc-ci/BACKLOG-shot.md

100 lines
8.1 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# BACKLOG-shot.md — phase `shot` (recipe screenshot audit & repair)
SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-shot-screenshots.md. Gates: M1 (audit+diagnosis), M2 (all OK / agreed N/A).
## Build backlog
### P1 — Audit matrix (status: complete, all 19 PNGs visually inspected 2026-06-11)
Enrolled set (19) = `tests/<r>/recipe_meta.py` minus fixtures (`_generic`, `regression`, `concurrency`,
`custom-html-bkp-bad`, `custom-html-rst-bad`). Evidence: `/var/lib/cc-ci-runs/<run>/` on cc-ci;
PNGs pulled to /tmp/shot-audit/ on the builder host and each one Read (visually).
| recipe | latest run w/ artifacts | screenshot field | PNG bytes | visual content (I looked) | class |
|---|---|---|---|---|---|
| bluesky-pds | ab-bluesky-pds-oldmain | null | — | no PNG; install=fail level=0 (upstream image breakage, rcust DEFERRED) → capture correctly skipped (`if deploy_ok`) | N-A-candidate (blocked upstream) |
| cryptpad | m2r-cryptpad | screenshot.png | 4802 | solid light-grey frame, nothing else | BLANK |
| custom-html | m2r-custom-html | screenshot.png | 35707 | "Welcome to nginx!" default page | OK? (diagnose: is this the recipe's true fresh-install content?) |
| custom-html-tiny | m2r-custom-html-tiny | screenshot.png | 12950 | seeded CI content ("cc-ci custom-html-tiny … DG5") | OK |
| discourse | m2p-discourse | screenshot.png | 66121 | real forum UI, welcome topic, Sign Up/Log In | OK |
| ghost | m2r-ghost | screenshot.png | 444183 | real blog landing ("Thoughts, stories and ideas") | OK |
| hedgedoc | m2r-hedgedoc | screenshot.png | 131967 | real landing (logo, Sign In, feature intro) | OK |
| immich | 356 | screenshot.png | 4801 | pure white frame | BLANK |
| keycloak | m2r-keycloak | screenshot.png | 8764 | spinner + "Loading the Administration Console" | LOADING |
| lasuite-docs | m2r-lasuite-docs | screenshot.png | 6022 | lone spinner on white | LOADING |
| lasuite-drive | m2p2-lasuite-drive | screenshot.png | 5895 | lone spinner on white | LOADING |
| lasuite-meet | m2r-lasuite-meet | screenshot.png | 4801 | pure white frame | BLANK |
| mailu | m2r-mailu | screenshot.png | 33800 | real sign-in page (empty fields) | OK |
| matrix-synapse | m2r-matrix-synapse | screenshot.png | 33296 | "It works! Synapse is running" landing | OK |
| mattermost-lts | m2b-mattermost-lts | screenshot.png | 242139 | brand splash/loading screen (logo on blue), NOT the login form | LOADING (borderline — brand-recognizable but a loading state) |
| mumble | m2r-mumble | screenshot.png | 7913 | spinner on grey — a web page IS served on the domain | LOADING (diagnose what serves it; N/A may NOT be justified) |
| n8n | m2r-n8n | screenshot.png | 4801 | off-white blank frame. Flaky: run 197 (30256 B) shows the real "Set up owner account" form (empty fields, credential-free) | BLANK (flaky) |
| plausible | 357 | null | — | no PNG on ANY run (122→357) | NULL |
| uptime-kuma | m2r-uptime-kuma | screenshot.png | 30858 | real "Create your admin account" setup form (empty fields) | OK |
PNG-size note: 4801/4802 B at 1280×800 is a byte-stable blank-frame fingerprint (3 different apps, same size).
### P2 — Root-cause diagnoses
- [x] **NULL — plausible** (evidence: Drone build 357 ci-step log, t=73s):
`screenshot: capture failed (non-fatal, verdict unaffected): page.goto(https://plau-b51425.ci.commoninternet.net/) never returned a status in (200, 301, 302, 303, 401, 403) after 15 attempts (45s); last status=500`.
Plausible's `/` 500s **by design** under `DISABLE_AUTH=true` (auth_controller; documented in
`tests/plausible/functional/test_health_check.py` docstring and recipe_meta — that's why HEALTH_PATH
is `/api/health`). Default landing-page capture can NEVER succeed → needs a per-recipe SCREENSHOT
hook to a path that actually renders (probe live: e.g. /login or /sites).
- [x] **NULL — bluesky-pds**: install fails (level=0) before the app is up → `if deploy_ok:` gate in
runner/run_recipe_ci.py:1024 correctly skips capture. Not a screenshot defect; upstream image
breakage already filed in machine-docs/DEFERRED.md (rcust). → documented N/A while upstream is broken.
- [x] **BLANK class — immich, lasuite-meet, n8n(flaky), cryptpad**: SPA paint race. capture() navigates
with `wait_until="domcontentloaded"` (runner/harness/screenshot.py:91) and screenshots immediately;
SPA shell HTML has loaded but JS hasn't painted → solid 4801-2 B frame. n8n flakiness = same race,
sometimes JS wins (run 197 captured the real form).
- [x] **LOADING class — keycloak, lasuite-docs, lasuite-drive, mumble, mattermost-lts(borderline)**:
same race, caught mid-paint (spinner/splash rendered, app JS still loading/connecting).
- [x] **mumble** web stack identified: recipe deploys a `web` service (mumble-web client) on the domain —
spinner is its connecting state; landing renders a connect dialog once JS settles. NOT an N/A.
- [x] **custom-html** nginx-welcome question: the recipe's fresh install genuinely serves the nginx
default page at `/` (no content seeded for this recipe's install; only custom-html-tiny seeds via
install_steps.sh). Screenshot is an honest representative view of a fresh install. → OK as-is.
### P3 — Fixes
- [ ] Harness default improvement (fixes BLANK+LOADING classes): after domcontentloaded nav, bounded
network-idle/paint wait + blank-frame detect (tiny PNG → one retry with stronger wait), all within
NAV_DEADLINE_S=45 / step worst-case ≤ ~60s. Unit tests in tests/unit/test_screenshot.py.
- [ ] plausible SCREENSHOT hook (tests/plausible/recipe_meta.py) to a rendering, credential-free path.
- [ ] Re-audit mattermost-lts / mumble / keycloak / lasuite-* after harness fix; per-recipe hooks only
where the default still can't work.
- [ ] bluesky-pds: document N/A in matrix (Adversary agreement at M1/M2).
### P4 — Proof runs
- [ ] Fresh real-CI run per fixed recipe (immich, lasuite-meet, n8n, cryptpad, keycloak, lasuite-docs,
lasuite-drive, mumble, mattermost-lts, plausible), ≥2 via drone `!testme`; visual check each PNG;
card + dashboard render. Healthy class: cite existing artifact + visual check (done in P1).
## Adversary findings
### [adversary] A1 — blank-retry can REGRESS a larger frame to a worse one (LOW, non-blocking) — CLOSED @2026-06-11T06:32Z
**CLOSED:** fixed in 7ad7d1f (retry snapped to a temp path; `os.replace` only if `retry >= first`,
else discard + cleanup in `finally`). Re-verified COLD with my own probe (not the Builder's test):
the exact filed case `[9999,4801]` now keeps **9999** (retry discarded, no temp leak); originals
intact (`[4801,30256]`→30256, `[4801,4802]`→4802, `[35707]`→1 shot, `[5000,5000]`→replace). 5/5 pass.
R7 contract preserved (retry-raise still propagates to capture's swallow → None; first frame on disk).
--- original finding (for the record) ---
**Where:** `runner/harness/screenshot.py` `_snap_with_blank_retry` (ce50f64).
**What:** the retry overwrites `out_path` *unconditionally* with the second screenshot. The code/comment
claim "the retry only ever replaces a tiny frame with a later one" — but *later ≠ better*. If the first
frame is e.g. 9999 B (a partial render, just under `BLANK_SIZE_BYTES=10000`) and the page regresses in the
extra 4 s settle (redirect, session-timeout splash, error overlay), the retry can yield a 4801 B blank that
**overwrites the better 9999 B frame**. The Builder's unit test only covers blank→blank (4801→4802); the
bigger→smaller regression is untested.
**Repro (cold, my independent probe, not the Builder's test file):** fake page returning sizes
`[9999, 4801]``_snap_with_blank_retry` keeps **4801** (the worse frame).
**Severity:** LOW. R7 holds (cosmetic only, never affects verdict); my M2 per-PNG visual check is the
backstop — any actually-blank final PNG will FAIL that recipe regardless. Filed for hardening, not a veto.
**Suggested guard (trivial, strictly safer):** keep the larger frame — only overwrite if
`getsize(retry) >= getsize(first)` (or snap retry to a temp path and pick `max`). Then extend the unit
test with a bigger→smaller case asserting the larger frame survives.
**Closes:** only I close this, after re-test. Non-blocking for an M2 claim, but I will re-check at M2.