review(3 U4): PASS — dashboard grid + history cold-verified (R5, R3 full); never-greener vs results.json, honest #11 failure row (no results.json→failure/—), no secrets, 9 tests

2026-05-31 10:04:09 +00:00
parent 1be4492b90
commit 9ca39dc179
1 changed file with 59 additions and 3 deletions
--- a/machine-docs/REVIEW-3.md
+++ b/machine-docs/REVIEW-3.md
@@ -9,11 +9,11 @@ JOURNAL-3.md / BACKLOG-3.md `## Build backlog`. I own this file + BACKLOG-3.md `
      run; a missing lower rung caps the level (YunoHost semantics). **COLD-VERIFIED @U0 07:05Z.**
 - [x] **R2 — Image-forward PR comment.** `!testme` posts/updates a Gitea PR comment: marker (🌻) +
      status/level badge + summary image, both linking to run/dashboard; re-run updates same comment.
- [ ] **R3 — Summary card image.** Per-run PNG: recipe+version, level, per-stage/per-test ✔/✘
+- [x] **R3 — Summary card image.** Per-run PNG: recipe+version, level, per-stage/per-test ✔/✘
      breakdown, embedded deployed-app screenshot; stable URL; in comment + dashboard.
 - [x] **R4 — App screenshot.** Runner captures real screenshot of deployed app (Playwright, post-login
      where needed) for the card. **COLD-VERIFIED @U1 07:15Z.**
- [ ] **R5 — Dashboard polish.** Overview at ci.commoninternet.net resembles ci-apps.yunohost.org:
+- [x] **R5 — Dashboard polish.** Overview at ci.commoninternet.net resembles ci-apps.yunohost.org:
      recipe grid w/ level badge, latest pass/fail, last version, app screenshot, history link.
 - [ ] **R6 — Badges.** Per-recipe level/status SVG badge endpoint embeddable in READMEs + dashboard.
 - [ ] **R7 — Safe & robust.** No secrets in images/comments/badges/screenshots (reuse P1 §4.4
@@ -26,7 +26,7 @@ JOURNAL-3.md / BACKLOG-3.md `## Build backlog`. I own this file + BACKLOG-3.md `
 - [x] U1 — App screenshot (real, post-login, secret-safe). **PASS @07:15Z.**
 - [x] U2 — Summary card + badge (HTML→PNG; level/✔✘/screenshot; SVG badge; stable URLs; pass+fail). **PASS @07:48Z.**
 - [x] U3 — YunoHost-style PR comment (marker+badge+card, linked; updates on re-run; no secrets). **PASS @09:51Z.**
- [ ] U4 — Dashboard polish (grid mirrors underlying results across several runs).
+- [x] U4 — Dashboard polish (grid mirrors underlying results across several runs). **PASS @10:04Z.**
 - [ ] U5 — Badges + docs + hardening (leak scan clean; renderer-kill degrades to text; flip DONE).

 ## Adversary invariants to attack this phase (from §6 guardrails)
@@ -422,3 +422,59 @@ may proceed to U4.
  degrades to text, verdict unaffected" demonstration is **U5** hardening scope, not U3.
 - **Placeholder (⏳) not observed live** this run (build completed inside one 30s poll window); covered
  by unit test + Builder's #3→#4 demo. Not re-tested — acceptable.
+
+### @2026-05-31T10:04Z — U4 GATE: PASS (Dashboard polish; R5 + R3 "in dashboard") — COLD-VERIFIED
+Claim `fb8f382 claim(3 U4)`. Verified cold from my clone + the VM. Verdict formed WITHOUT reading
+JOURNAL-3 (anti-anchoring); inbox artifact-map consumed @1be4492.
+
+**1. Deployed == committed source.** `sha256(dashboard/dashboard.py)` first-12 in MY clone =
+`7b34ec8761df` == host `/etc/cc-ci/dashboard/dashboard.py` == swarm image tag
+`cc-ci-dashboard:7b34ec8761df` (`ccci-dashboard_app` 1/1). Live dashboard IS the claimed source. ✔
+
+**2. Unit tests (cold, cc-ci devshell):** `cc-ci-run -m pytest tests/unit/test_dashboard.py -q` →
+**9 passed**. ✔
+
+**3. Live grid (R5)** — `GET https://ci.commoninternet.net/` → 200, YunoHost-style grid, two recipe
+cards: **custom-html** (level 4, success, `db9a95024e9d`, cap "L5 integration N/A", ✔ teardown / ✔
+no-leak, screenshot thumb `/runs/7/screenshot.png` → `/runs/7/summary.png`, `history →`
+`/recipe/custom-html`) and **uptime-kuma** (level 4, success, `dfed87a39f8a`, `/runs/12/...`). Each has
+level badge + latest pass/fail + last version + app screenshot + history link — mirrors
+`ci-apps.yunohost.org` shape (plan R5). ✔
+
+**4. Live history** — `/recipe/custom-html` → 200, rows #7/#4/#3/#1 each success/L4/version + per-run
+`card` link to `/runs/<n>/summary.png`. `/recipe/uptime-kuma` → 200, **#12 success L4** + **#11 failure,
+level —, no card** — a real failed run shown HONESTLY. ✔
+
+**5. CARDINAL — no inflation, grid/history vs raw results.json (make-or-break).**
+- custom-html grid "level 4" == `/runs/7/results.json` `level=4`, all tiers pass (verified @U3). ✔
+- uptime-kuma grid "level 4" == `/runs/12/results.json` `recipe=uptime-kuma`, `version=dfed87a39f8a`,
+  `level=4`, results all-pass, flags both true. **Exact match.** ✔
+- **Honest failure (the key adversarial probe):** `/runs/11/results.json` → **HTTP 404 (genuinely
+  absent** — run #11 failed at `fetch_recipe` on a bogus ref, wrote no artifact). The dashboard shows
+  #11 as **`failure / level — / no card`** — derived faithfully from the artifact's ABSENCE, **not a
+  fabricated or inflated level, and no screenshot/card it never produced.** ✔
+- **Live-read proof (not hardcoded):** the grid surfaces custom-html **run #7** (my U3 re-`!testme`,
+  newer than #4) with a dynamic "12m ago" — it picks the latest Drone build + its results.json live,
+  so the displayed level cannot drift greener than the actual latest run. ✔
+
+**6. No secrets (R7).** Scan of the grid + both history pages → the only `secret` hits are the
+`title="no secret leak"` flag label (2×); zero real secret values. Embedded screenshot thumbnails are
+the U1-verified secret-safe **setup pages** — eyeballed `/runs/12/screenshot.png`: Uptime Kuma "Create
+your admin account" with **EMPTY** username/password fields (a form to SET a password — displays no
+generated credential). ✔
+
+**7. HEAD parity / A3-1 stays closed.** `HEAD /`, `HEAD /recipe/custom-html`, `HEAD /recipe/uptime-kuma`
+→ all **200** (shared `_route` w/ GET). ✔
+
+**VERDICT: U4 PASS @2026-05-31T10:04Z.** The overview grid + per-recipe history are a faithful,
+never-greener projection of each run's `results.json`; a failed/levelless run (#11) is shown honestly
+(failure pill, level —, no card); rendering is read-only over RO-bind-mounted artifacts and reads the
+latest build live; no secrets; deployed dashboard == committed source; 9 unit tests pass.
+**R5 satisfied. R3 now FULLY satisfied** (card embedded in both the PR comment (U3) and the dashboard
+(U4)). No VETO. Builder may proceed to U5 (per-recipe badge + docs + hardening + final leak scan).
+
+**Scope / carry-forward (NOT defects):**
+- **R6** (per-recipe latest-level badge endpoint embeddable in READMEs) — still **U5** scope; the
+  per-RUN `badge.svg` is U2-verified, but the per-RECIPE endpoint isn't present yet. R6 stays unticked.
+- **R7 full hardening** (render-kill degrades to text, broad leak scan over ALL published artifacts),
+  **R8 docs** — **U5** scope.
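
The "never-greener" invariant probed in step 5 of the U4 gate can be sketched as a small predicate. This is a minimal illustration, not the code in `dashboard/dashboard.py` or `tests/unit/test_dashboard.py`: the function name `never_greener` is hypothetical, and the only thing assumed about `results.json` is the integer `level` field quoted in the review. A missing artifact (the run #11 case, HTTP 404) is modelled as `None`.

```python
from typing import Optional


def never_greener(shown_level: Optional[int], results: Optional[dict]) -> bool:
    """Return True if the level a page displays does not exceed the artifact.

    shown_level: level rendered in the grid/history row, or None for "level —".
    results:     parsed results.json for that run, or None if the artifact is
                 absent (for example, the run failed before writing it).
    Hypothetical helper: only the `level` key is assumed to exist.
    """
    if results is None:
        # No artifact: the only honest display is "no level".
        return shown_level is None
    actual = results.get("level")
    if actual is None:
        # Artifact exists but carries no level: same rule as an absent one.
        return shown_level is None
    # Showing no level, or a level at or below the artifact's, is never greener.
    return shown_level is None or shown_level <= actual


if __name__ == "__main__":
    # Run #12 shape: grid shows 4, artifact says 4 (exact match).
    print(never_greener(4, {"level": 4}))
    # Run #11 shape: artifact absent, page shows "level —".
    print(never_greener(None, None))
    # What the probe is looking for: a level shown with no artifact behind it.
    print(never_greener(4, None))
```

The predicate is deliberately one-sided: it accepts a display that under-reports but rejects any display that claims a level the artifact does not back. The stricter "exact match" observed for runs #7 and #12 would be a plain equality check on top of this.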