diff --git a/machine-docs/REVIEW-3.md b/machine-docs/REVIEW-3.md
index 68098c1..0d3a03b 100644
--- a/machine-docs/REVIEW-3.md
+++ b/machine-docs/REVIEW-3.md
@@ -11,8 +11,8 @@ JOURNAL-3.md / BACKLOG-3.md `## Build backlog`. I own this file + BACKLOG-3.md `
   status/level badge + summary image, both linking to run/dashboard; re-run updates same comment.
 - [ ] **R3 — Summary card image.** Per-run PNG: recipe+version, level, per-stage/per-test ✔/✘ breakdown,
   embedded deployed-app screenshot; stable URL; in comment + dashboard.
-- [ ] **R4 — App screenshot.** Runner captures real screenshot of deployed app (Playwright, post-login
-  where needed) for the card.
+- [x] **R4 — App screenshot.** Runner captures real screenshot of deployed app (Playwright, post-login
+  where needed) for the card. **COLD-VERIFIED @U1 07:15Z.**
 - [ ] **R5 — Dashboard polish.** Overview at ci.commoninternet.net resembles ci-apps.yunohost.org: recipe
   grid w/ level badge, latest pass/fail, last version, app screenshot, history link.
 - [ ] **R6 — Badges.** Per-recipe level/status SVG badge endpoint embeddable in READMEs + dashboard.
@@ -23,7 +23,7 @@ JOURNAL-3.md / BACKLOG-3.md `## Build backlog`. I own this file + BACKLOG-3.md `
 ## Milestone gates (each ends with an Adversary gate) — U0..U5
 
 - [x] U0 — Results schema + level (results.json per-stage/per-test; level correct for L4-pass & L2-cap). **PASS @07:05Z.**
-- [ ] U1 — App screenshot (real, post-login, secret-safe).
+- [x] U1 — App screenshot (real, post-login, secret-safe). **PASS @07:15Z.**
 - [ ] U2 — Summary card + badge (HTML→PNG; level/✔✘/screenshot; SVG badge; stable URLs; pass+fail).
 - [ ] U3 — YunoHost-style PR comment (marker+badge+card, linked; updates on re-run; no secrets).
 - [ ] U4 — Dashboard polish (grid mirrors underlying results across several runs).
@@ -158,3 +158,68 @@ may proceed past U0.
   (rendered card/level/screenshot never greener than raw results.json + actual outcomes) at U2–U4.
 - Pre-existing repo-wide lint RED on origin/main (Builder-flagged) is not a Phase-3 DoD item and not
   introduced by U0 — noted, not a finding.
+
+### @2026-05-31T07:15Z — U1 GATE: **PASS** (App screenshot; R4)
+
+**Claim (STATUS-3, `claim(3 U1)` @d7e812e).** The harness captures a real Playwright screenshot of
+the deployed app while it is up (after deploy+readiness, before teardown), writes `screenshot.png` to
+the run artifact dir, is secret-safe by default (landing page, never a credentials page), and is
+best-effort so it never blocks/fails/hangs the run (R7); `results.json` `screenshot` is set to
+`"screenshot.png"` only when a file was produced.
+
+**Verification COLD + INDEPENDENT** (my clone tar'd to a fresh `/tmp/advverify` on cc-ci, run under
+the real `cc-ci-run`; JOURNAL-3 not read before this verdict).
+
+**1. Pure-helper unit tests.** `cc-ci-run -m pytest tests/unit/test_screenshot.py -q` → **3 passed**.
+(STATUS EXPECTED said "4 passed"; the file has exactly **3** test functions. Minor over-count in the
+claim doc — NOT a defect, recorded for honesty.)
+
+**2. Real positive capture — MY OWN live run.** `RECIPE=uptime-kuma STAGES=install,custom
+CCCI_RUN_ID=u1-adv cc-ci-run runner/run_recipe_ci.py` ran to completion (install pass, custom pass,
+exit clean). Artifacts: `/var/lib/cc-ci-runs/u1-adv/{screenshot.png,results.json,junit/}`.
+- I `scp`'d `screenshot.png` to the VM and **EYEBALLED it with the image viewer**: a valid PNG header,
+  **1280×800, 39 773 bytes**, showing uptime-kuma's live **"Create your admin account"** setup page —
+  empty Username / Password / Repeat-Password fields + a Create button. This is **real working app UI**
+  and displays **NO secret values** (a setup form asks the user to *choose* a password; it reveals
+  none). Secret-safe ✔.
+- `results.json`: `screenshot="screenshot.png"`, `level=1` (cap "L2 upgrade … N/A" — correct for an
+  install-only run), `flags={clean_teardown:true, no_secret_leak:true}`, `results={install:pass,
+  custom:pass}`. The screenshot field is set BECAUSE a file was produced. ✔
+
+**3. Clean teardown (live).** Post-run `docker service ls` shows only infra (backups / bridge /
+dashboard / drone / traefik×2) — **no orphan uptime-kuma stack**. ✔
+
+**4. Graceful degradation (R7) — the key cosmetics-never-block invariant.** I drove
+`screenshot.capture("adv-noexist-xyz.ci.commoninternet.net", "/tmp/advx.png")` against an
+unresolvable host: it printed `screenshot: capture failed (non-fatal, verdict unaffected):
+... ERR_NAME_NOT_RESOLVED`, **returned `None`, wrote no file, raised nothing**. A screenshot failure
+cannot fail/hang the run or flip the verdict. ✔
+
+**5. Wiring is R7-safe (code inspection, cold).** `run_recipe_ci.py:968-979` places the capture
+under `if deploy_ok:` AFTER `lifecycle.wait_healthy(...)` and BEFORE any tier mutates state and BEFORE
+the `finally` teardown — so the app is genuinely up and in its cleanest state when shot. It is
+**outside** the deploy `try/except`, so a screenshot issue can never flip `deploy_ok`. `capture()`
+itself wraps everything in `try/except Exception → return None` with a hard `NAV_DEADLINE_S=45`
+cap (can't hang). `screenshot_rel` is `basename(shot) if shot else None`, and the whole
+`build_results`/`write_results` block is itself R7-wrapped. Cosmetics provably cannot change `overall`.
+
+**6. Secret-safety by design.** Default capture is the app landing page (login/setup forms show
+*fields*, not secrets); `full_page=False` (viewport only, no scroll into a secrets panel); the harness
+**never auto-fills an install wizard**; a post-login view is only reachable via an opt-in recipe
+`SCREENSHOT` hook that owns the no-secret-page guarantee — **none used yet**, so no recipe currently
+risks a credential page.
+
+**Cardinal U1 invariant** (screenshot is a faithful live-app capture, never a credentials page, and
+its presence/absence never changes the verdict): **HELD**.
+
+**VERDICT: U1 PASS @2026-05-31T07:15Z.** **R4 (app screenshot) cold-verified.** No VETO. Builder may
+proceed to U2.
+
+**Carry-forward (NOT blocking U1):**
+- The plan's "post-login where the landing page requires it" path (the `SCREENSHOT` hook) is
+  *implemented* but *unexercised on any real recipe* — uptime-kuma's informative landing/setup page
+  doesn't need it. Fine for U1's accept criterion ("working UI, no secrets"); I'll re-scrutinise the
+  hook + secret-safety once a recipe whose landing page is blank/uninformative opts in, and over the
+  served card/dashboard images at U2–U5 (R7 leak authority is mine).
+- STATUS EXPECTED's "4 passed" vs actual 3 unit tests — doc-only over-count; flag to Builder via the
+  honest-reporting rule, no behavioural impact.
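
Reviewer-side illustration, not part of the diff above: the best-effort capture behaviour verified in items 4 and 5 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the harness's actual `screenshot.capture`. The names `capture_best_effort` and `playwright_shoot` are hypothetical, the sketch takes a full URL rather than a host, and the pluggable `shoot` backend exists only so the failure path can be exercised without a browser. Only `NAV_DEADLINE_S=45`, the 1280×800 viewport, `full_page=False`, the log message and the `basename(shot) if shot else None` rule come from the review.

```python
import contextlib
import os
from typing import Callable, Optional

NAV_DEADLINE_S = 45  # cap named in the review; here it only bounds page navigation


def playwright_shoot(url: str, out_path: str) -> None:
    """Hypothetical backend: viewport-only screenshot of `url` via Playwright."""
    # Imported lazily so a missing Playwright install surfaces as a caught,
    # non-fatal exception in capture_best_effort rather than at import time.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page(viewport={"width": 1280, "height": 800})
            page.goto(url, timeout=NAV_DEADLINE_S * 1000)  # Playwright timeouts are in ms
            page.screenshot(path=out_path, full_page=False)  # viewport only, no scrolling
        finally:
            browser.close()


def capture_best_effort(
    url: str,
    out_path: str,
    shoot: Callable[[str, str], None] = playwright_shoot,
) -> Optional[str]:
    """Return out_path if a screenshot file was produced, else None. Never raises."""
    try:
        shoot(url, out_path)
    except Exception as exc:  # cosmetics must never change the verdict
        print(f"screenshot: capture failed (non-fatal, verdict unaffected): {exc}")
        # Drop any partial file so "no file" really means "no screenshot".
        with contextlib.suppress(OSError):
            os.remove(out_path)
        return None
    return out_path if os.path.isfile(out_path) else None


def screenshot_rel(shot: Optional[str]) -> Optional[str]:
    """Value for the results.json `screenshot` field: set only when a file exists."""
    return os.path.basename(shot) if shot else None
```

In a runner, the call would sit after the readiness wait and outside the deploy `try/except`, with `screenshot_rel(capture_best_effort(url, path))` feeding the results record, so a failed capture yields `None` and leaves the pass/fail outcome untouched.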