From 8978fa6ae37a9250b5e3b0f55185d7ef5904b75c Mon Sep 17 00:00:00 2001
From: autonomic-bot
Date: Thu, 11 Jun 2026 01:26:23 +0000
Subject: [PATCH] =?UTF-8?q?status(shot):=20phase=20open=20=E2=80=94=20P1?=
 =?UTF-8?q?=20audit=20matrix=20complete=20(19/19=20recipes,=20every=20PNG?=
 =?UTF-8?q?=20visually=20inspected)=20+=20P2=20root=20causes=20(plausible?=
 =?UTF-8?q?=20/-500s-by-design=20via=20build-357=20log;=20blank/loading=20?=
 =?UTF-8?q?=3D=20domcontentloaded=20paint=20race;=20bluesky-pds=20deploy-g?=
 =?UTF-8?q?ated;=20mumble=20has=20real=20web=20UI;=20custom-html=20nginx-w?=
 =?UTF-8?q?elcome=20is=20honest=20fresh-install=20content)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 BACKLOG-shot.md | 78 +++++++++++++++++++++++++++++++++++++++++++++++++
 JOURNAL-shot.md | 40 +++++++++++++++++++++++++
 STATUS-shot.md  | 33 +++++++++++++++++++++
 3 files changed, 151 insertions(+)
 create mode 100644 BACKLOG-shot.md
 create mode 100644 JOURNAL-shot.md
 create mode 100644 STATUS-shot.md

diff --git a/BACKLOG-shot.md b/BACKLOG-shot.md
new file mode 100644
index 0000000..4302b8d
--- /dev/null
+++ b/BACKLOG-shot.md
@@ -0,0 +1,78 @@
+# BACKLOG-shot.md — phase `shot` (recipe screenshot audit & repair)
+
+SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-shot-screenshots.md. Gates: M1 (audit+diagnosis), M2 (all OK / agreed N/A).
+
+## Build backlog
+
+### P1 — Audit matrix (status: complete, all 19 recipes audited, every PNG visually inspected 2026-06-11)
+
+Enrolled set (19) = `tests//recipe_meta.py` minus fixtures (`_generic`, `regression`, `concurrency`,
+`custom-html-bkp-bad`, `custom-html-rst-bad`). Evidence: `/var/lib/cc-ci-runs//` on cc-ci;
+PNGs pulled to /tmp/shot-audit/ on the builder host and each one Read (visually).
+
+| recipe | latest run w/ artifacts | screenshot field | PNG bytes | visual content (I looked) | class |
+|---|---|---|---|---|---|
+| bluesky-pds | ab-bluesky-pds-oldmain | null | — | no PNG; install=fail level=0 (upstream image breakage, rcust DEFERRED) → capture correctly skipped (`if deploy_ok`) | N-A-candidate (blocked upstream) |
+| cryptpad | m2r-cryptpad | screenshot.png | 4802 | solid light-grey frame, nothing else | BLANK |
+| custom-html | m2r-custom-html | screenshot.png | 35707 | "Welcome to nginx!" default page | OK? (diagnose: is this the recipe's true fresh-install content?) |
+| custom-html-tiny | m2r-custom-html-tiny | screenshot.png | 12950 | seeded CI content ("cc-ci custom-html-tiny … DG5") | OK |
+| discourse | m2p-discourse | screenshot.png | 66121 | real forum UI, welcome topic, Sign Up/Log In | OK |
+| ghost | m2r-ghost | screenshot.png | 444183 | real blog landing ("Thoughts, stories and ideas") | OK |
+| hedgedoc | m2r-hedgedoc | screenshot.png | 131967 | real landing (logo, Sign In, feature intro) | OK |
+| immich | 356 | screenshot.png | 4801 | pure white frame | BLANK |
+| keycloak | m2r-keycloak | screenshot.png | 8764 | spinner + "Loading the Administration Console" | LOADING |
+| lasuite-docs | m2r-lasuite-docs | screenshot.png | 6022 | lone spinner on white | LOADING |
+| lasuite-drive | m2p2-lasuite-drive | screenshot.png | 5895 | lone spinner on white | LOADING |
+| lasuite-meet | m2r-lasuite-meet | screenshot.png | 4801 | pure white frame | BLANK |
+| mailu | m2r-mailu | screenshot.png | 33800 | real sign-in page (empty fields) | OK |
+| matrix-synapse | m2r-matrix-synapse | screenshot.png | 33296 | "It works! Synapse is running" landing | OK |
+| mattermost-lts | m2b-mattermost-lts | screenshot.png | 242139 | brand splash/loading screen (logo on blue), NOT the login form | LOADING (borderline — brand-recognizable but a loading state) |
+| mumble | m2r-mumble | screenshot.png | 7913 | spinner on grey — a web page IS served on the domain | LOADING (diagnose what serves it; N/A may NOT be justified) |
+| n8n | m2r-n8n | screenshot.png | 4801 | off-white blank frame. Flaky: run 197 (30256 B) shows the real "Set up owner account" form (empty fields, credential-free) | BLANK (flaky) |
+| plausible | 357 | null | — | no PNG on ANY run (122→357) | NULL |
+| uptime-kuma | m2r-uptime-kuma | screenshot.png | 30858 | real "Create your admin account" setup form (empty fields) | OK |
+
+PNG-size note: 4801/4802 B at 1280×800 is a byte-stable blank-frame fingerprint (3 different apps, same size).
+
+### P2 — Root-cause diagnoses
+
+- [x] **NULL — plausible** (evidence: Drone build 357 ci-step log, t=73s):
+  `screenshot: capture failed (non-fatal, verdict unaffected): page.goto(https://plau-b51425.ci.commoninternet.net/) never returned a status in (200, 301, 302, 303, 401, 403) after 15 attempts (45s); last status=500`.
+  Plausible's `/` 500s **by design** under `DISABLE_AUTH=true` (auth_controller; documented in
+  `tests/plausible/functional/test_health_check.py` docstring and recipe_meta — that's why HEALTH_PATH
+  is `/api/health`). Default landing-page capture can NEVER succeed → needs a per-recipe SCREENSHOT
+  hook to a path that actually renders (probe live: e.g. /login or /sites).
+- [x] **NULL — bluesky-pds**: install fails (level=0) before the app is up → `if deploy_ok:` gate in
+  runner/run_recipe_ci.py:1024 correctly skips capture. Not a screenshot defect; upstream image
+  breakage already filed in machine-docs/DEFERRED.md (rcust). → documented N/A while upstream is broken.
+- [x] **BLANK class — immich, lasuite-meet, n8n(flaky), cryptpad**: SPA paint race. capture() navigates
+  with `wait_until="domcontentloaded"` (runner/harness/screenshot.py:91) and screenshots immediately;
+  SPA shell HTML has loaded but JS hasn't painted → solid 4801-2 B frame. n8n flakiness = same race,
+  sometimes JS wins (run 197 captured the real form).
+- [x] **LOADING class — keycloak, lasuite-docs, lasuite-drive, mumble, mattermost-lts(borderline)**:
+  same race, caught mid-paint (spinner/splash rendered, app JS still loading/connecting).
+- [x] **mumble** web stack identified: recipe deploys a `web` service (mumble-web client) on the domain —
+  spinner is its connecting state; landing renders a connect dialog once JS settles. NOT an N/A.
+- [x] **custom-html** nginx-welcome question: the recipe's fresh install genuinely serves the nginx
+  default page at `/` (no content seeded for this recipe's install; only custom-html-tiny seeds via
+  install_steps.sh). Screenshot is an honest representative view of a fresh install. → OK as-is.
+
+### P3 — Fixes
+
+- [ ] Harness default improvement (fixes BLANK+LOADING classes): after domcontentloaded nav, bounded
+  network-idle/paint wait + blank-frame detect (tiny PNG → one retry with stronger wait), all within
+  NAV_DEADLINE_S=45 / step worst-case ≤ ~60s. Unit tests in tests/unit/test_screenshot.py.
+- [ ] plausible SCREENSHOT hook (tests/plausible/recipe_meta.py) to a rendering, credential-free path.
+- [ ] Re-audit mattermost-lts / mumble / keycloak / lasuite-* after harness fix; per-recipe hooks only
+  where the default still can't work.
+- [ ] bluesky-pds: document N/A in matrix (Adversary agreement at M1/M2).
+
+### P4 — Proof runs
+
+- [ ] Fresh real-CI run per fixed recipe (immich, lasuite-meet, n8n, cryptpad, keycloak, lasuite-docs,
+  lasuite-drive, mumble, mattermost-lts, plausible), ≥2 via drone `!testme`; visual check each PNG;
+  card + dashboard render. Healthy class: cite existing artifact + visual check (done in P1).
+
+## Adversary findings
+
+(Adversary-owned section.)
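
A minimal sketch of the harness default improvement listed under P3, assuming the page object exposes Playwright's sync methods (`goto`, `wait_for_load_state`, `wait_for_timeout`, `screenshot`), which the `wait_until="domcontentloaded"` reference suggests. The names `capture_settled`, `looks_blank` and `_settle`, both timeout values, and the 6000-byte cut-off are hypothetical placeholders, not taken from runner/harness/screenshot.py; they would need tuning to stay inside the step budget.

```python
# Hypothetical sketch only: names and thresholds are assumptions, not the
# real contents of runner/harness/screenshot.py.

BLANK_PNG_MAX_BYTES = 6000  # assumption: rough "tiny PNG" cut-off


def looks_blank(png: bytes, max_bytes: int = BLANK_PNG_MAX_BYTES) -> bool:
    """Size-only heuristic: a very small PNG is treated as a blank frame."""
    return len(png) < max_bytes


def _settle(page, timeout_ms: int) -> None:
    """Bounded network-idle wait; a timeout is swallowed so capture stays non-fatal."""
    try:
        page.wait_for_load_state("networkidle", timeout=timeout_ms)
    except Exception:
        # Broad catch keeps the sketch free of a Playwright import; the
        # screenshot is taken with whatever has painted so far.
        pass


def capture_settled(page, url: str, idle_ms: int = 10_000, retry_ms: int = 5_000) -> bytes:
    """Navigate, wait for the page to settle, and retry once on a blank frame."""
    page.goto(url, wait_until="domcontentloaded")
    _settle(page, idle_ms)
    png = page.screenshot()
    if looks_blank(png):
        # One retry after an extra fixed settle, then accept the result.
        page.wait_for_timeout(retry_ms)
        png = page.screenshot()
    return png
```

A size-only check would still pass the 242139 B mattermost-lts splash and the 8764 B keycloak spinner, so in this sketch the network-idle wait carries most of the weight and the blank detector only catches the 4801/4802 B case.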
diff --git a/JOURNAL-shot.md b/JOURNAL-shot.md
new file mode 100644
index 0000000..93fa2a1
--- /dev/null
+++ b/JOURNAL-shot.md
@@ -0,0 +1,40 @@
+# JOURNAL-shot.md — Builder journal, phase `shot`
+
+## 2026-06-11 ~01:17–01:35Z — phase open, P1+P2 in one sweep
+
+Read the phase plan + plan.md §6.1/§7/§9. Enumerated enrolled recipes (19). Pulled per-recipe
+latest-run data off cc-ci (`results.json` screenshot field + PNG size for all ~190 run dirs),
+scp'd 18 PNGs to /tmp/shot-audit/ and Read every one of them.
+
+Findings vs the orchestrator pre-audit: all four 4801-2B suspects are indeed blank frames
+(immich pure white, lasuite-meet white, n8n off-white, cryptpad grey). keycloak 8.7KB is a
+"Loading the Administration Console" spinner — NOT a sparse login page as §2 guessed.
+lasuite-docs/drive ~5.9KB are lone spinners. Two surprises: (1) mattermost-lts 242KB, classed
+healthy by size, is actually the brand splash/loading screen, not the login form — size
+heuristics lie in both directions; (2) mumble serves a real web page (mumble-web client per
+compose.mumbleweb.yml, deployed since Phase 2 for HTTP health) showing its connecting spinner —
+so mumble is fixable, not an N/A.
+
+plausible root cause: traced via Drone sqlite (no python3 on host; ran alpine+sqlite3 against
+the drone data volume). Build 357 log t=73s: capture failed, last status=500 after 45s. Cross-ref
+tests/plausible/functional/test_health_check.py: `/` 500s via auth_controller under
+DISABLE_AUTH=true — permanent, not an init race. So the default landing capture can never work;
+plausible needs a SCREENSHOT hook to a path that renders (will probe /login, /sites on a live
+deploy during P3).
+
+bluesky-pds: null because install fails at level 0 (upstream image breakage, already in
+DEFERRED.md from rcust) — capture gated on deploy_ok, correctly skipped. N/A while upstream broken.
+
+custom-html nginx-welcome: verified no install-time seeding exists for this recipe (custom-html-tiny
+has install_steps.sh; custom-html only seeds in pre_backup/pre_upgrade ops, after capture). The
+nginx default page IS the honest fresh-install view. Leaving OK; flagged in matrix for Adversary.
+
+Adversary opened REVIEW-shot.md with its own cold pre-audit (4f3a747) before my first push —
+good: my visual reads agree with theirs on every overlapping row.
+
+Design thinking for P3 (next iteration): default-path improvement = after goto(domcontentloaded),
+try a bounded `wait_for_load_state("networkidle")` (~10-15s cap) and/or wait for a non-trivial
+painted body, then screenshot; then a blank-detect (PNG < ~6KB or near-uniform) → one retry with
+a longer settle. Keep total ≤ ~60s worst case, all inside the existing capture() try/except so R7
+(cosmetics never block) is preserved. Unit tests: blank-detector pure function + retry logic with
+a fake page. Per-recipe hooks only for plausible (500 root) + whatever the re-audit still shows.
diff --git a/STATUS-shot.md b/STATUS-shot.md
new file mode 100644
index 0000000..dfaf559
--- /dev/null
+++ b/STATUS-shot.md
@@ -0,0 +1,33 @@
+# STATUS-shot.md — Builder status, phase `shot`
+
+SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-shot-screenshots.md
+
+## Current section
+
+P1 audit matrix COMPLETE (all 19 enrolled recipes, every PNG visually inspected).
+P2 diagnoses COMPLETE (see BACKLOG-shot.md P2 — each with evidence).
+Next: P3 fix design (harness default wait improvement + plausible hook + unit tests).
+Gate: none claimed yet — M1 claim coming after I re-verify the matrix is self-consistent.
+
+## Verification map (WHAT/HOW/EXPECTED/WHERE for the audit, ahead of the M1 claim)
+
+- Enrolled set (19): `ls tests/*/recipe_meta.py` minus fixtures `_generic, regression, concurrency,
+  custom-html-bkp-bad, custom-html-rst-bad` (those first three have no recipe_meta.py; the two
+  `-bad` ones do but are harness canaries).
+- Matrix: BACKLOG-shot.md "P1 — Audit matrix". Reproduce any row:
+  `ssh cc-ci 'grep -o "\"screenshot\": *[^,}]*" /var/lib/cc-ci-runs//results.json; stat -c%s /var/lib/cc-ci-runs//screenshot.png'`
+  then scp the PNG and Read it. Run ids are in the matrix "latest run" column.
+- plausible NULL evidence: Drone sqlite, build 357 ci step (step_id 947):
+  `ssh cc-ci 'docker run --rm -v drone_ci_commoninternet_net_data:/data alpine sh -c "apk add -q sqlite; sqlite3 /data/database.sqlite \"select log_data from logs where log_id=947\"" | grep -o "screenshot[^\"]*"'`
+  EXPECTED: `capture failed … last status=500` after 15 attempts/45s.
+- bluesky-pds NULL evidence: `grep '"install"' /var/lib/cc-ci-runs/m2rr-bluesky-pds/results.json`
+  → fail, level=0; capture is gated on deploy_ok (runner/run_recipe_ci.py:1024).
+- Default capture path under audit: runner/harness/screenshot.py:84-93 (domcontentloaded, no paint
+  wait) — the BLANK/LOADING mechanism; accept_statuses excludes 500 — the plausible mechanism.
+- mumble web UI exists: tests/mumble/recipe_meta.py header (compose.mumbleweb.yml, HEALTH_PATH "/").
+- custom-html fresh install serves nginx default: no install_steps.sh in tests/custom-html/ (only
+  pre_backup/pre_upgrade seeds in ops.py, which run AFTER the capture moment).
+
+## Blocked
+
+(nothing)
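
The "Reproduce any row" step in the verification map could also be scripted against a run directory that has been copied locally. The sketch below is hypothetical: `audit_row` is an illustrative name, and it assumes `results.json` carries a top-level `screenshot` key holding either null or a file name relative to the run directory. It only reports the screenshot field, the PNG size and a size-based hint; the class column in the matrix still comes from reading each PNG.

```python
import json
from pathlib import Path

# From the PNG-size note: 4801/4802 B at 1280x800 was a blank frame.
BLANK_FINGERPRINT_BYTES = (4801, 4802)


def audit_row(run_dir) -> dict:
    """Return the screenshot field, PNG size and a size-based hint for one run."""
    run_dir = Path(run_dir)
    results = json.loads((run_dir / "results.json").read_text())
    field = results.get("screenshot")  # assumption: top-level key, may be null
    png = (run_dir / field) if field else None
    size = png.stat().st_size if png is not None and png.is_file() else None
    if size is None:
        hint = "NULL"
    elif size in BLANK_FINGERPRINT_BYTES:
        hint = "BLANK?"
    else:
        hint = "inspect visually"
    return {"screenshot": field, "png_bytes": size, "hint": hint}
```

The hint is deliberately weak for anything above the fingerprint size, since the audit found loading screens at both 5895 B and 242139 B.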