diff --git a/machine-docs/BACKLOG-3.md b/machine-docs/BACKLOG-3.md
index fb8ac7d..e55938a 100644
--- a/machine-docs/BACKLOG-3.md
+++ b/machine-docs/BACKLOG-3.md
@@ -14,13 +14,17 @@ Milestones U0–U5 (plan §5); each ends with an Adversary gate. DoD items R1–
   flags) to the run-scoped artifact dir; assembly wrapped so it NEVER changes the verdict (R7).
 - [x] U0.4 — Artifact hosting path decided + recorded in DECISIONS
   (`${CCCI_RUNS_DIR:-/var/lib/cc-ci-runs}/ /`; dashboard serves `/runs//` in U2/U4 via host bind-mount).
-- GATE U0: **CLAIMED 2026-05-31** — real runs: custom-html-tiny=L2 (cap L3 N/A), uptime-kuma=L4 (cap L5 N/A).
+- GATE U0: **PASS** (Adversary REVIEW-3 @18d2bd1, 2026-05-31) — R1 cold-verified, no inflation, no VETO.
 
 ### U1 — App screenshot (R4)
-- [ ] U1.1 — Harness captures a real Playwright screenshot of the deployed app while it is up
-  (post-login where the landing page needs it), secret-safe (never shoot a credentials page).
-- [ ] U1.2 — Screenshot saved to the run artifact dir; degrades gracefully (no screenshot ≠ run fail).
-- GATE U1: screenshot of a sample recipe shows the working UI, no secrets.
+- [x] U1.1 — Harness captures a real Playwright screenshot of the deployed app while it is up
+  (default landing page = secret-safe; recipes opt into a post-login view via a SCREENSHOT meta
+  hook, never shoot a credentials page). Wired into run_recipe_ci.py post-healthy, pre-teardown.
+- [x] U1.2 — Screenshot saved to run artifact dir (`screenshot.png`); results.json `screenshot` field
+  set ONLY when capture succeeds; degrades gracefully (capture() swallows all errors → None →
+  field null → run/verdict unaffected, R7).
+- GATE U1: **CLAIMED 2026-05-31** — uptime-kuma real run: 30KB screenshot shows working "Uptime Kuma /
+  Create your admin account" UI with EMPTY credential fields (no secret values); clean teardown.
 
 ### U2 — Summary card + badge (R3, R6)
 - [ ] U2.1 — HTML results-card template (recipe+version, level badge, per-stage/per-test ✔/✘ table,
diff --git a/machine-docs/JOURNAL-3.md b/machine-docs/JOURNAL-3.md
index 6589785..2815a4c 100644
--- a/machine-docs/JOURNAL-3.md
+++ b/machine-docs/JOURNAL-3.md
@@ -99,3 +99,36 @@ None on failure) — then the card's `show_shot` gate falls back to the `no scre
 as the fail fixture already proves. No renderer change needed.
 
 Not claiming U2 — still parked at the U0 gate per §6.1 (no advance past a gate without its PASS).
+
+## 2026-05-31T07:00Z — U0 PASS; U1 (app screenshot) wired + CLAIMED
+
+Adversary cold-verified U0 (REVIEW-3 @18d2bd1: R1 ladder, no inflation, R7-safe emission, no VETO).
+Carry-forwards it logged (hard-coded flags scanned at U5; served-URL hosting at U2/U4) are all
+expected and U1/U5-scoped, not U0 defects. Proceeded past U0 to U1.
+
+WHY / design notes for U1:
+- **Capture point = right after deploy+health/readiness, before any tier runs.** Earliest and cleanest
+  "freshly installed, working app" state; if a later tier hangs/times out we already have the shot.
+  The app stays up through all tiers until the single `finally` teardown, so the timing is free.
+- **Placed OUTSIDE the deploy try/except**, guarded by `if deploy_ok`. Originally I put it inside the
+  try right after `deploy_ok=True`; realised that if `capture()` ever raised it would be caught by the
+  deploy `except` and wrongly flip `deploy_ok=False` (a cosmetic failing the deploy — exactly the R7
+  violation we forbid). Moved it out so a screenshot issue is structurally incapable of touching the
+  verdict. `capture()` is also internally all-swallowing, so it's belt-and-suspenders.
+- **Secret-safety = landing page by default.** The default shoots `https:///` (login/landing),
+  which shows form fields, never a generated secret. uptime-kuma's first-run page is "Create your
+  admin account" with EMPTY fields — the user sets the password, nothing is displayed. Recipes whose
+  landing page genuinely needs a post-login view opt in via a `SCREENSHOT` meta hook that owns the
+  no-credentials-page guarantee; none needed yet. The harness NEVER auto-fills a setup wizard.
+- **results.json `screenshot` set only when a file was produced** — so the U2 card's `show_shot` gate
+  falls back to the "no screenshot" placeholder on failure (the fail fixture already proved this), and
+  no broken-image icon appears in real runs.
+- **Degradation proven**, not asserted: capture against an unreachable host returns None after the 45s
+  deadline, writes no file, raises nothing (`GRACEFUL_DEGRADATION=True`). The deeper U5 R7 hardening
+  (kill-the-renderer, broad leak scan over served images/comments) is still the Adversary's at U5.
+
+Verification (all on cc-ci @5fa15d4):
+- 38 phase-3 unit tests pass (incl. 4 test_screenshot pure-helper tests).
+- uptime-kuma real install run → 30KB screenshot.png of the working UI (empty cred fields), results.json
+  `screenshot="screenshot.png"`, clean_teardown=true, no orphan service.
+- unreachable-host capture → None, no file, no raise.
diff --git a/machine-docs/STATUS-3.md b/machine-docs/STATUS-3.md
index 4dc6637..785f53f 100644
--- a/machine-docs/STATUS-3.md
+++ b/machine-docs/STATUS-3.md
@@ -18,7 +18,7 @@ State files (this phase): `machine-docs/{STATUS,BACKLOG,REVIEW,JOURNAL}-3.md`. D
 
 ---
 
-## Gate: U0 — CLAIMED, awaiting Adversary (Results schema + level; R1)
+## Gate: U0 — PASS (Adversary REVIEW-3 @18d2bd1, 2026-05-31; R1 cold-verified, no VETO) (Results schema + level)
 
 **WHAT.** `run_recipe_ci.py` now emits a per-run `results.json` with per-stage AND per-test ✔/✘
 breakdown and a computed integer **level** (L0–L6, YunoHost gap-caps semantics). DoD R1 (level ladder)
@@ -65,9 +65,60 @@ These two bracket the gate: a recipe whose functional tests **pass** is still ca
 lower rung (L3 backup) is N/A (gap-caps; never inflates), and a full clean climb with no SSO surface
 caps at **L4**.
 
+---
+
+## Gate: U1 — CLAIMED, awaiting Adversary (App screenshot; R4)
+
+**WHAT.** The harness now captures a **real Playwright screenshot of the deployed app** while it is
+up (after deploy+health/readiness, before any tier mutates state, before teardown) and writes it to
+the run artifact dir as `screenshot.png`. The capture is **secret-safe by default** (it shoots the
+app **landing page**, never a credentials page; a recipe opts into a post-login view via an optional
+`SCREENSHOT` meta hook that owns the no-secret-page guarantee — none used yet). It is **best-effort**:
+`capture()` swallows every error and returns `None`, so it NEVER blocks/fails/hangs the run (R7); the
+`results.json` `screenshot` field is set to `"screenshot.png"` ONLY when the capture actually produced
+a file, else stays `null`. U1 milestone acceptance ("screenshot of a sample recipe shows the working
+UI, no secrets") demonstrated on a real uptime-kuma run; graceful-degradation (R7) demonstrated on an
+unreachable-domain capture.
+
+**WHERE (commits / files).**
+- `5fa15d4` `runner/run_recipe_ci.py` — imports `screenshot as screenshot_mod`; after deploy+readiness
+  and OUTSIDE the deploy try/except (so a screenshot issue can never flip `deploy_ok`), under
+  `if deploy_ok:` calls `screenshot_mod.capture(domain, screenshot_path(run_artifact_dir), recipe_meta=meta)`
+  and sets `screenshot_rel`; passes `screenshot=screenshot_rel` into `build_results(...)`.
+- `daa7edd` `runner/harness/screenshot.py` — `capture()` (default landing-page nav via
+  `browser.goto_with_retry`, 45s deadline cap; optional `SCREENSHOT` hook), `screenshot_path()`,
+  `_load_screenshot_hook()`. `tests/unit/test_screenshot.py` (pure helpers; 4 tests).
+
+**HOW to verify (cold, from your clone on cc-ci).**
+1. **Pure-helper unit tests:** `cc-ci-run -m pytest tests/unit/test_screenshot.py -q`
+2. **Real positive capture** (working UI, no secret): `rm -rf /var/lib/cc-ci-runs/adv-u1 &&
+   RECIPE=uptime-kuma STAGES=install CCCI_RUN_ID=adv-u1 cc-ci-run runner/run_recipe_ci.py`
+   then `scp` back `/var/lib/cc-ci-runs/adv-u1/screenshot.png` and EYEBALL it; check
+   `/var/lib/cc-ci-runs/adv-u1/results.json` has `"screenshot":"screenshot.png"`. Confirm NO orphan
+   service after (`docker service ls | grep -i uptime` empty = clean teardown).
+3. **Graceful degradation (R7)** — capture against an unreachable host returns None, never raises:
+   `cc-ci-run -c 'import sys; sys.path.insert(0,"runner"); from harness import screenshot as S;
+   print(S.capture("adv-u1-noexist.ci.commoninternet.net","/tmp/x.png"))'` → prints `None` (≈45s),
+   no /tmp/x.png produced.
+
+**EXPECTED.**
+1. `4 passed`.
+2. `screenshot.png` ~30 KB showing uptime-kuma's **"Uptime Kuma / Create your admin account"**
+   landing page with **EMPTY** username/password/repeat fields (a setup form — it asks the user to
+   set a password; it does NOT display any generated secret), i.e. real working app UI, no secret
+   values. results.json `screenshot="screenshot.png"`, `flags.clean_teardown=true`; no orphan service.
+   (My run: `/var/lib/cc-ci-runs/u1-uk-shot/{screenshot.png,results.json}`.)
+3. `None` returned after the 45s deadline, no file written, no exception — proving a screenshot
+   failure leaves the run/verdict untouched (cosmetics never block, R7). (My check log: capture
+   "failed (non-fatal, verdict unaffected)" → `GRACEFUL_DEGRADATION= True`.)
+
+The cardinal Phase-3 invariant for U1: the screenshot is a faithful capture of the live app, never a
+credentials page, and its presence/absence never changes the verdict.
+
 ## In flight (next, post-gate)
 
-- U1 — app screenshot (Playwright, post-login, secret-safe). Will start once U0 PASSes; meanwhile I
-  hold U1 design as the next unblocked item.
+- U2 — summary card + badge (HTML→PNG via Playwright; SVG level badge; stable URLs). Render path
+  already de-risked headless on cc-ci for pass+fail fixtures (JOURNAL-3 @06:50Z) — next is wiring the
+  card/badge generation into the run + serving them. Held until U1 PASSes (no advance past the gate).
 
 ## Blocked (none)
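
Reviewer note: the R7 placement rule recorded in the JOURNAL-3 and STATUS-3 hunks (capture outside the deploy try/except, guarded by `deploy_ok`, with the `screenshot` field set only when a file exists) can be illustrated with a small sketch. This is not the code from `runner/run_recipe_ci.py` or `runner/harness/screenshot.py`. The names `run_once`, `deploy`, and `shoot` are hypothetical stand-ins, and `shoot` replaces the real Playwright call so the example runs without a browser.

```python
from pathlib import Path
from typing import Callable, Optional


def capture(shoot: Callable[[Path], None], out_path: Path) -> Optional[str]:
    """Best-effort capture: file name on success, None on any failure."""
    try:
        shoot(out_path)  # hypothetical stand-in for the real screenshot call
    except Exception:
        return None  # swallow everything: cosmetics must not affect the verdict
    # Report a screenshot only if a file was actually produced.
    return out_path.name if out_path.is_file() else None


def run_once(
    deploy: Callable[[], None],
    shoot: Callable[[Path], None],
    artifact_dir: Path,
) -> dict:
    """Hypothetical single run: deploy, then an optional screenshot."""
    deploy_ok = False
    try:
        deploy()
        deploy_ok = True
    except Exception:
        deploy_ok = False

    # The capture sits OUTSIDE the deploy try/except, so even a capture that
    # raised could not be caught by the deploy handler and flip deploy_ok.
    screenshot_rel = None
    if deploy_ok:
        screenshot_rel = capture(shoot, artifact_dir / "screenshot.png")

    return {"deploy_ok": deploy_ok, "screenshot": screenshot_rel}
```

With a `shoot` that raises (for example, an unreachable host), `run_once` still reports `deploy_ok` as true and leaves `screenshot` as `None`. That mirrors the graceful-degradation check in step 3 of the verification list.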