# Phase 3 — Beautiful YunoHost-style results — STATUS
SSOT: `/srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md`. DoD = R1-R8. Milestones U0-U5.
State files (this phase): `machine-docs/{STATUS,BACKLOG,REVIEW,JOURNAL}-3.md`. DECISIONS.md shared.
**WHAT + HOW + EXPECTED + WHERE live here; WHY → JOURNAL-3.md.**
## Phase context
- Phase 2b is `## DONE` (Adversary-verified, no VETO). Phase 3 kicked off **manually by the operator**.
Note for honesty: Phase-2 `## DONE` not yet flipped (REVIEW-2 standing VETO on full Phase-2 DONE
authorization); cross-phase sequencing is an operator call. Adversary concurs it's not a P3 blocker
(REVIEW-3 @05:42Z).
- **Pre-existing repo-wide lint is RED on origin/main** (94 files `ruff format`-dirty + 36 `ruff check`
errors; confirmed on cc-ci CI devshell against clean `origin/main`, ruff 0.7.3). This predates Phase 3
and is NOT introduced by my work — my NEW Phase-3 files are fully `ruff`-clean, and I left
`run_recipe_ci.py` with fewer ruff errors than main (1 vs 4). Flagged for the operator; not a Phase-3
DoD item, and the U0 gate is verified by unit tests + real-run results.json, not repo-wide lint.
---
## Gate: U0 — PASS (Adversary REVIEW-3 @18d2bd1, 2026-05-31; R1 cold-verified, no VETO) (Results schema + level)
**WHAT.** `run_recipe_ci.py` now emits a per-run `results.json` with per-stage AND per-test ✔/✘
breakdown and a computed integer **level** (L0-L6, YunoHost gap-caps semantics). DoD R1 (level ladder)
satisfied; U0 milestone acceptance ("level correct for a recipe through L4 and one capped at L2")
demonstrated on two real end-to-end runs.
**WHERE (commits / files).**
- `9773e3f` `runner/harness/level.py` — pure `compute_level(rungs)->(level,cap_reason)` + helpers
`backup_restore_status`, `tier_to_rung`. `tests/unit/test_level.py` (15 tests).
- `52e5d21` `runner/harness/results.py` — JUnit-XML parse, `collect_stages`, `derive_rungs` (the
tier+deps/SSO→rung translation), `build_results`, `write_results`. `tests/unit/test_results.py`
(13 tests). `runner/run_recipe_ci.py` — tiers emit `--junitxml` + append `{tier,source,file,rc,junit}`
records; `main()` assembles+writes results.json wrapped so a failure NEVER changes the verdict (R7),
incl. a narrow self leak-scan of the serialised artifact.
- `757511e` `machine-docs/DECISIONS.md` (Phase-3 section) — the documented ladder + exact rung-mapping
contract `derive_rungs` implements + results.json schema + artifact-hosting decision.
**HOW to verify (cold, from your clone on cc-ci).**
1. **Unit tests** (deterministic; also fuzz-verifiable):
`cc-ci-run -m pytest tests/unit/test_level.py tests/unit/test_results.py -q`
2. **Real-run L2-cap** (stateless, not backup-capable, ≥2 versions):
`RECIPE=custom-html-tiny STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-cht cc-ci-run runner/run_recipe_ci.py`
then read `/var/lib/cc-ci-runs/adv-cht/results.json`.
3. **Real-run L4-pass** (backup-capable, 3 functional tests, no deps):
`RECIPE=uptime-kuma STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-uk cc-ci-run runner/run_recipe_ci.py`
then read `/var/lib/cc-ci-runs/adv-uk/results.json`.
(Compare the `level`/`rungs` against the `results` dict + DECISIONS contract — a level greener than
the tiers would be a FAIL. Verify clean teardown: no orphan `*-pr*`/recipe service after.)
**EXPECTED.**
1. `28 passed`.
2. custom-html-tiny: `level=2`, `level_cap_reason="L3 backup/restore (data integrity) N/A"`,
`rungs={install:pass, upgrade:pass, backup_restore:na, functional:na, integration:na, recipe_local:na}`,
`results={install:pass, upgrade:pass, backup:skip, restore:skip, custom:skip}`,
`flags={clean_teardown:true, no_secret_leak:true}`, stages=[install,upgrade] each w/ per-test rows.
(My run: `/var/lib/cc-ci-runs/u0-cht-L2/results.json`.)
3. uptime-kuma: `level=4`, `level_cap_reason="L5 integration (SSO/OIDC + cross-app) N/A"`,
`rungs={install:pass, upgrade:pass, backup_restore:pass, functional:pass, integration:na, recipe_local:na}`,
all five tiers pass, `flags.clean_teardown=true`, stages=[install,upgrade,backup,restore,custom]
with per-test rows (incl. 3 uptime-kuma functional tests, source `cc-ci`).
(My run: `/var/lib/cc-ci-runs/u0-uk-L4/results.json`.)
These two bracket the gate: a recipe whose functional tests **pass** is still capped at **L2** when a
lower rung (L3 backup) is N/A (gap-caps; never inflates), and a full clean climb with no SSO surface
caps at **L4**.
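The gap-caps rule described above can be sketched as a walk up the ladder that stops at the first rung that is not `pass`. This is a minimal illustration under stated assumptions, not the code in `runner/harness/level.py`: the function name `compute_level_sketch`, the L1/L4/L6 labels, and the wording for a failed rung are hypothetical; only the L3 and L5 cap reasons are copied from the expected results above.

```python
from typing import Dict, Optional, Tuple

# Ladder order L1..L6. The L3 and L5 labels come from the status file;
# the remaining labels are placeholders for illustration only.
LADDER = [
    ("install", "L1 install"),
    ("upgrade", "L2 upgrade"),
    ("backup_restore", "L3 backup/restore (data integrity)"),
    ("functional", "L4 functional"),
    ("integration", "L5 integration (SSO/OIDC + cross-app)"),
    ("recipe_local", "L6 recipe-local"),
]


def compute_level_sketch(rungs: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (level, cap_reason); the first non-pass rung caps the level."""
    level = 0
    for key, label in LADDER:
        status = rungs.get(key, "na")
        if status != "pass":
            # A gap (N/A) or a failure both stop the climb: higher rungs
            # are never counted, so the level cannot be inflated.
            suffix = "N/A" if status == "na" else "failed"
            return level, f"{label} {suffix}"
        level += 1
    return level, None


# Hypothetical input: functional passes, but the L3 rung is N/A.
gap = {"install": "pass", "upgrade": "pass", "backup_restore": "na",
       "functional": "pass", "integration": "na", "recipe_local": "na"}
print(compute_level_sketch(gap))  # (2, 'L3 backup/restore (data integrity) N/A')
```

Stopping at the first gap, rather than counting passing rungs, is what keeps a passing higher rung from lifting the level past a missing lower one.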
---
## Gate: U1 — PASS (Adversary REVIEW-3 @74a6993, 2026-05-31; R4 cold-verified, no VETO) (App screenshot)
**WHAT.** The harness now captures a **real Playwright screenshot of the deployed app** while it is
up (after deploy+health/readiness, before any tier mutates state, before teardown) and writes it to
the run artifact dir as `screenshot.png`. The capture is **secret-safe by default** (it shoots the
app **landing page**, never a credentials page; a recipe opts into a post-login view via an optional
`SCREENSHOT` meta hook that owns the no-secret-page guarantee — none used yet). It is **best-effort**:
`capture()` swallows every error and returns `None`, so it NEVER blocks/fails/hangs the run (R7); the
`results.json` `screenshot` field is set to `"screenshot.png"` ONLY when the capture actually produced
a file, else stays `null`. U1 milestone acceptance ("screenshot of a sample recipe shows the working
UI, no secrets") demonstrated on a real uptime-kuma run; graceful-degradation (R7) demonstrated on an
unreachable-domain capture.
**WHERE (commits / files).**
- `5fa15d4` `runner/run_recipe_ci.py` — imports `screenshot as screenshot_mod`; after deploy+readiness
and OUTSIDE the deploy try/except (so a screenshot issue can never flip `deploy_ok`), under
`if deploy_ok:` calls `screenshot_mod.capture(domain, screenshot_path(run_artifact_dir), recipe_meta=meta)`
and sets `screenshot_rel`; passes `screenshot=screenshot_rel` into `build_results(...)`.
- `daa7edd` `runner/harness/screenshot.py` — `capture()` (default landing-page nav via
`browser.goto_with_retry`, 45s deadline cap; optional `SCREENSHOT` hook), `screenshot_path()`,
`_load_screenshot_hook()`. `tests/unit/test_screenshot.py` (pure helpers; 3 tests).
**HOW to verify (cold, from your clone on cc-ci).**
1. **Pure-helper unit tests:** `cc-ci-run -m pytest tests/unit/test_screenshot.py -q`
2. **Real positive capture** (working UI, no secret): `rm -rf /var/lib/cc-ci-runs/adv-u1 &&
RECIPE=uptime-kuma STAGES=install CCCI_RUN_ID=adv-u1 cc-ci-run runner/run_recipe_ci.py`
then `scp` back `/var/lib/cc-ci-runs/adv-u1/screenshot.png` and EYEBALL it; check
`/var/lib/cc-ci-runs/adv-u1/results.json` has `"screenshot":"screenshot.png"`. Confirm NO orphan
service after (`docker service ls | grep -i uptime` empty = clean teardown).
3. **Graceful degradation (R7)** — capture against an unreachable host returns None, never raises:
`cc-ci-run -c 'import sys; sys.path.insert(0,"runner"); from harness import screenshot as S;
print(S.capture("adv-u1-noexist.ci.commoninternet.net","/tmp/x.png"))'` → prints `None` (≈45s),
no /tmp/x.png produced.
**EXPECTED.**
1. `3 passed` (test_screenshot.py has 3 pure-helper tests; corrected from an earlier "4" over-count
per the Adversary's honest-reporting flag, REVIEW-3 @74a6993 — doc-only, no behavioural impact).
2. `screenshot.png` ~30 KB showing uptime-kuma's **"Uptime Kuma / Create your admin account"**
landing page with **EMPTY** username/password/repeat fields (a setup form — it asks the user to
set a password; it does NOT display any generated secret), i.e. real working app UI, no secret
values. results.json `screenshot="screenshot.png"`, `flags.clean_teardown=true`; no orphan service.
(My run: `/var/lib/cc-ci-runs/u1-uk-shot/{screenshot.png,results.json}`.)
3. `None` returned after the 45s deadline, no file written, no exception — proving a screenshot
failure leaves the run/verdict untouched (cosmetics never block, R7). (My check log: capture
"failed (non-fatal, verdict unaffected)" → `GRACEFUL_DEGRADATION= True`.)
The cardinal Phase-3 invariant for U1: the screenshot is a faithful capture of the live app, never a
credentials page, and its presence/absence never changes the verdict.
---
## Gate: U2 — PASS (Adversary REVIEW-3 @324d84d, 2026-05-31; R3/R6 partial cold-verified, no VETO) (Summary card + badge)
**WHAT.** Each run now renders a **summary card PNG** (recipe+version, level badge, per-stage/per-test
✔/✘ table, embedded **real app screenshot**) and an **SVG level badge**, written into the run artifact
dir and **served at stable URLs** `https://ci.commoninternet.net/runs/<run_id>/{summary.png,badge.svg,
screenshot.png,results.json}`. The card REPORTS results.json verbatim — it computes nothing, so it can
never look greener than the tiers (cardinal invariant). U2 acceptance ("card + badge render correctly
for a pass run AND a fail run") demonstrated: a real PASS run served live; a deterministic FAIL render
shown honest (L0/red/✘/no-screenshot).
**WHERE (commits / files).**
- `afe5e51` `runner/run_recipe_ci.py` — after results.json is written, a separate best-effort block
renders `summary.html`→`summary.png` + `badge.svg` via `harness.card` (passes
`screenshot_rel=data["screenshot"]` so the real shot embeds iff present). R7-wrapped — any failure
is swallowed, never changes `overall`.
- `daa7edd`/`7217e0c`/`8179d3f` `runner/harness/card.py` — pure `render_card_html`, `render_badge_svg`/
`level_badge_svg` (deterministic string builders), `render_card_png` (best-effort Playwright). Inline
SVG sunflower (headless chromium has no colour-emoji font). `tests/unit/test_card.py` (8 tests).
- `fa56f6b` `dashboard/dashboard.py` + `nix/modules/dashboard.nix` — `/runs/<id>/<file>` route
(allow-list + `run_id` regex + realpath-inside-runs-dir traversal guard); `/var/lib/cc-ci-runs`
bind-mounted READ-ONLY into the dashboard swarm service; `CCCI_RUNS_DIR` env.
**HOW to verify (cold).** (See ADVERSARY-INBOX for the deploy gotcha — do NOT `nixos-rebuild switch`
the live host; `#cc-ci` targets the hetzner migration host. U2.3 was rolled via the dashboard module
reconcile only. DECISIONS.md Phase-3/U2 has the `diff-closures` evidence.)
1. **Unit tests:** `cc-ci-run -m pytest tests/unit/test_card.py -q` → `8 passed`.
2. **PASS card served live (real):**
`curl -s -o /tmp/c.png -w '%{http_code} %{content_type} %{size_download}\n'
https://ci.commoninternet.net/runs/u1-uk-shot/summary.png` → `200 image/png ~69313`. Eyeball
`/tmp/c.png`: uptime-kuma, **orange LEVEL 1**, "capped: L2 upgrade N/A", install/test_serving ✔
PASS rows, clean-teardown+no-secret-leak flags, and the **real uptime-kuma screenshot embedded**.
Also `…/screenshot.png` (200 ~30858), `…/badge.svg` (200 image/svg+xml), `…/results.json` (200).
3. **Traversal/whitelist guard:** `…/runs/u1-uk-shot/../../../etc/passwd`, `…/runs/u1-uk-shot/evil.sh`,
`…/runs/nonexist/results.json` → **404** with a **9-byte** body (the dashboard's own "not found",
NOT Traefik's 19-byte 404 — proves the request reached the app and the guard rejected it).
4. **FAIL render is honest (cardinal invariant):** feed the card a fail dict (cmd in ADVERSARY-INBOX
§3) → card shows **level 0**, `level_color(0)` (red), the **✘ FAIL** mark on the install row, and
the **"no screenshot"** placeholder — never greener than the data.
**EXPECTED.** (1) `8 passed`. (2) PASS card 200/image-png/~69KB, embeds the real screenshot, level/marks
match results.json (`u1-uk-shot`: level 1, install pass). (3) all three guarded paths 404 with a 9B
body. (4) fail render: `>0<` (level 0), red colour, ✘ present, "no screenshot" present — no inflation.
The cardinal U2 invariant: the rendered card/level/badge are a faithful, never-greener projection of
results.json + the actual test outcomes, served at a stable URL, generated best-effort so a render
failure never blocks the run.
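The three checks named for the `/runs/<id>/<file>` route (file allow-list, `run_id` regex, realpath-inside-runs-dir) can be sketched as a single resolver. This is an illustration under assumptions rather than the code in `dashboard/dashboard.py`: `resolve_artifact` and the exact `RUN_ID_RE` pattern are hypothetical; the four allowed file names are the ones listed for the stable URLs.

```python
import os
import re
import tempfile
from typing import Optional

ALLOWED_FILES = {"summary.png", "badge.svg", "screenshot.png", "results.json"}
# Assumed pattern: letters, digits, '-' and '_' only, so '.' and '/' never pass.
RUN_ID_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_-]*$")


def resolve_artifact(runs_dir: str, run_id: str, filename: str) -> Optional[str]:
    """Return the real path of an allowed artifact, or None (caller sends 404)."""
    if filename not in ALLOWED_FILES or not RUN_ID_RE.match(run_id):
        return None
    root = os.path.realpath(runs_dir)
    path = os.path.realpath(os.path.join(root, run_id, filename))
    # realpath resolves symlinks, so a link pointing outside the runs dir
    # ends up with a different common prefix and is rejected here.
    if os.path.commonpath([root, path]) != root:
        return None
    return path if os.path.isfile(path) else None


with tempfile.TemporaryDirectory() as tmp:
    run_dir = os.path.join(tmp, "runs", "u1-uk-shot")
    os.makedirs(run_dir)
    with open(os.path.join(run_dir, "results.json"), "w") as fh:
        fh.write("{}")
    runs = os.path.join(tmp, "runs")
    print(resolve_artifact(runs, "u1-uk-shot", "results.json") is not None)  # True
    print(resolve_artifact(runs, "u1-uk-shot", "evil.sh"))                   # None
    print(resolve_artifact(runs, "..", "results.json"))                      # None
```

The checks overlap on purpose: the allow-list and regex reject most hostile input cheaply, and the realpath comparison catches what string matching cannot, such as symlinks.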
## Gate: U3 — PASS (Adversary REVIEW-3 @778b577, 2026-05-31T09:51Z; R2 cold-verified, no VETO) (YunoHost-style PR comment)
(Adversary cold-reproduced update-in-place via its own `!testme` → build #7; comment 13792 never
stacked; card == results.json, no inflation; no secrets. R3 "in comment" verified; R3 ticks at U4.)
**WHAT.** On a `!testme` run the bridge now posts/updates ONE Gitea PR comment in the YunoHost shape:
on run start a 🌻 + ⏳ **placeholder** ("level pending", live-logs link); on completion it edits the
**SAME** comment in place to 🌻 + a **level badge** image + a **summary card** image, BOTH linked to
the full run, plus full-logs/dashboard links. A re-`!testme` refreshes that same comment (back to ⏳,
then to the new result) — never stacks a new one (R2 "one comment per PR, updated in place"). Falls
back to a compact text verdict if the rendered card isn't served (R7). DoD **R2** satisfied; U3
acceptance ("live on a scratch PR — comment shows badge + card + screenshot, updates on re-run, no
secrets") demonstrated on a real scratch PR. (This also lands R3's "embedded in the comment"
sub-requirement; R3 still needs "in dashboard" at U4.)
**WHERE (commits / files).**
- `9a47aa2` `bridge/bridge.py` — `COMMENT_MARKER` (hidden HTML comment `<!-- cc-ci:testme -->`),
`start_comment_body` (⏳ placeholder), `result_comment_body` (🌻 + badge + card, linked; text
fallback), `find_existing_comment` (marker → update-in-place), `artifact_available` (HEAD existence
check → image-vs-text), `watch_and_reflect` now edits to `result_comment_body`. Card/badge URLs are
`${DASH_URL}/runs/<DRONE_BUILD_NUMBER>/{summary.png,badge.svg}` (run_id == Drone build number, see
`runner/harness/results.py::run_id`).
- `9a47aa2` `dashboard/dashboard.py` — `do_HEAD` (shared `_route` with GET) so HEAD existence-checks +
strict image clients get 200, not 501 (closes Adversary A3-1, already re-verified @8807240).
- `9a47aa2` `tests/unit/test_bridge_trigger.py` — covers placeholder shape, image-forward result,
**text fallback when card missing**, marker-based find/update-in-place.
- **Deployed:** bridge swarm image `cc-ci-bridge:6377f9571f3b` == `sha256(bridge.py)` first-12 (content
tag, confirmed live); dashboard image live with `do_HEAD`.
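The content tag described above (the first 12 hex characters of the file's SHA-256) can be recomputed independently of `sha256sum`. A minimal sketch follows; the helper name `content_tag` is hypothetical, and how the deployment actually derives the tag is not shown in this file.

```python
import hashlib
import os
import tempfile


def content_tag(path: str, length: int = 12) -> str:
    """First `length` hex chars of the file's SHA-256 digest.

    Equivalent to: sha256sum <path> | cut -c1-12
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()[:length]


# Demo on an empty temporary file (SHA-256 of empty input is well known).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    pass
print(content_tag(tmp.name))  # e3b0c44298fc
os.unlink(tmp.name)
```

Comparing this value against the image tag shown by `docker service ls` is the same check as step 2 of the verification, just without shelling out.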
**HOW to verify (cold, from your clone / the VM).**
1. **Unit tests** (on cc-ci): `cc-ci-run -m pytest tests/unit/test_bridge_trigger.py tests/unit/test_card.py -q` → `15 passed`.
2. **Deployed bridge == source:** `ssh cc-ci 'sha256sum /etc/cc-ci/bridge/bridge.py | cut -c1-12'` →
`6377f9571f3b`; `ssh cc-ci 'docker service ls | grep ccci-bridge'` shows image tag `6377f9571f3b`.
3. **LIVE demo on scratch PR** `recipe-maintainers/custom-html` **PR #2** (recipe == repo name; the
bridge poller, 30s, fires on a NEW `!testme`). The bot comment carrying the marker is **id 13792**:
`curl -s -u "$GITEA_USERNAME:$GITEA_PASSWORD" https://git.autonomic.zone/api/v1/repos/recipe-maintainers/custom-html/issues/comments/13792`
→ body has `<!-- cc-ci:testme -->`, 🌻, `✅ passed`, `[![cc-ci result card](…/runs/4/summary.png)](…/4)`,
`[![level](…/runs/4/badge.svg)](…/4)`, full-logs+dashboard links. (You may post your own `!testme`
on PR #2 — the repo is active in Drone; it will refresh **the same** comment 13792.)
4. **Images render (served):** `for f in summary.png badge.svg screenshot.png results.json; do
curl -s -o /dev/null -w "$f %{http_code}\n" https://ci.commoninternet.net/runs/4/$f; done` → all 200.
5. **Updates in place / no stacking:** the marked-comment set on PR #2 stays exactly `[13792]` across
runs #3 (first `!testme`) and #4 (re-`!testme`); the comment cycled ⏳→result both times. (Filter
comments for `<!-- cc-ci:testme -->` — there is exactly one.)
6. **No secrets:** scan the comment body + `/var/lib/cc-ci-runs/{3,4}/{results.json,summary.html}` for
`password|secret|token|passwd|api_key|privkey|PRIVATE` → only the `no_secret_leak` flag-name matches;
the embedded app screenshot is custom-html's **"Welcome to nginx!"** page (no values).
7. **No inflation:** the card for run #4 shows `level 4` / `capped: L5 integration N/A`, all
install/upgrade/backup/restore/custom rows ✔ — matches `/runs/4/results.json` verbatim.
**EXPECTED.**
1. `15 passed`. 2. tag `6377f9571f3b` both places. 3. comment 13792 body exactly as above (run 4).
4. all four `/runs/4/` files 200 (`summary.png` ~178 KB, `badge.svg` 342 B, `screenshot.png` 35707 B).
5. exactly one marked comment (`13792`); no new comment stacked on re-run. 6. zero real secret hits.
7. card level 4, all rows ✔, == results.json (`recipe=custom-html`, `level=4`, all tiers pass,
`flags.clean_teardown=true,no_secret_leak=true`).
The cardinal U3 invariant: ONE comment per PR, refreshed in place; the embedded card/badge are a
faithful never-greener projection of the run; image-gen failure degrades to text and never blocks the
run or the verdict.
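The update-in-place behaviour above hinges on the hidden marker: find the one comment that carries it and edit that, otherwise create a new one. The sketch below shows the decision logic only, with no HTTP calls. The function names and signatures are hypothetical (they are not the ones in `bridge/bridge.py`), and the comment dicts are assumed to carry `id` and `body` keys.

```python
from typing import Iterable, Mapping, Optional, Tuple

COMMENT_MARKER = "<!-- cc-ci:testme -->"


def find_marked_comment(comments: Iterable[Mapping],
                        marker: str = COMMENT_MARKER) -> Optional[int]:
    """Return the id of the first comment whose body carries the marker."""
    for comment in comments:
        if marker in (comment.get("body") or ""):
            return comment["id"]
    return None


def result_body(run_url: str, card_url: str, badge_url: str,
                verdict: str, card_available: bool) -> str:
    """Image-forward body when the card is served, compact text otherwise."""
    lines = [COMMENT_MARKER, f"🌻 {verdict}"]
    if card_available:
        lines.append(f"[![level]({badge_url})]({run_url})")
        lines.append(f"[![cc-ci result card]({card_url})]({run_url})")
    lines.append(f"[full logs]({run_url})")
    return "\n".join(lines)


def plan_action(comments: Iterable[Mapping], body: str) -> Tuple[str, Optional[int], str]:
    """Decide between editing the marked comment and creating a new one."""
    existing = find_marked_comment(comments)
    if existing is not None:
        return ("edit", existing, body)
    return ("create", None, body)


thread = [{"id": 1, "body": "!testme"},
          {"id": 13792, "body": COMMENT_MARKER + "\n🌻 ⏳ level pending"}]
body = result_body("https://ci.example.test/4", "https://ci.example.test/runs/4/summary.png",
                   "https://ci.example.test/runs/4/badge.svg", "✅ passed", True)
print(plan_action(thread, body)[:2])  # ('edit', 13792)
```

Because every body the bot writes starts with the marker, the edited comment remains findable on the next `!testme`, which is what prevents stacking.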
## Gate: U4 — PASS (Adversary REVIEW-3 @9ca39dc, 2026-05-31T10:04Z; R5 + R3-full cold-verified, no VETO) (Dashboard polish)
(Grid + history cold-verified never-greener vs results.json; honest #11 failure row (404 results.json
→ failure/level —/no card); no secrets; deployed == source; 9 tests. R5 satisfied, R3 fully satisfied.)
**WHAT.** The overview at `https://ci.commoninternet.net/` is now a **YunoHost-CI-style grid**: one
card per enrolled recipe showing a **level badge** (coloured by level), latest **pass/fail** status,
last-tested **version**, an **app screenshot thumbnail** (the run's `screenshot.png`, clickable →
the full `summary.png` card), the clean-teardown/no-secret-leak flags, and a **history** link. A new
per-recipe **history page** `/recipe/<name>` lists every run of that recipe (newest first): run #,
status, level, version, when, and a per-run card link. Every field is read from the run's
**`results.json`** (level/version/screenshot/flags) so the grid mirrors the artifact and is
**never greener than the run** (cardinal guardrail). It re-renders live each request (30s cache +
auto-refresh), i.e. "regenerated on build completion". DoD **R5** satisfied; **R3** now also embedded
in the dashboard (was U3-verified in the comment) → R3 fully satisfied.
**WHERE (commits / files).**
- `e1d837e` `dashboard/dashboard.py` — `level_color`, `_results_for` (traversal-guarded results.json
reader), `_custom_recipe_builds` (cached, shared by overview+history), `_build_row` (Drone build +
results.json → display row), `latest_per_recipe` (augmented), `history_for`, `render_overview`
(grid), `render_history`, `/recipe/<name>` route. `tests/unit/test_dashboard.py` (9 tests).
- **Deployed:** `cc-ci-dashboard:7b34ec8761df` (== `sha256(dashboard.py)` first-12, confirmed live),
rolled via the dashboard **module reconcile** only (`nixos-rebuild build` non-activating →
`cc-ci-reconcile-dashboard` = `docker load` + `docker stack deploy`). NOT `nixos-rebuild switch`
(the `#cc-ci` config targets the migration host — DECISIONS Phase-3/U2; reconcile = zero host-config
impact, reversible).
**HOW to verify (cold, from your clone / the VM).**
1. **Unit tests** (on cc-ci): `cc-ci-run -m pytest tests/unit/test_dashboard.py -q` → `9 passed`.
2. **Deployed == source:** `ssh cc-ci 'sha256sum /etc/cc-ci/dashboard/dashboard.py | cut -c1-12'` →
`7b34ec8761df`; `docker service ls | grep ccci-dashboard` shows that tag.
3. **Live grid:** `curl -s https://ci.commoninternet.net/` (200) → two recipe cards: **custom-html**
(level 4, success, `db9a95024e9d`, thumbnail `/runs/7/screenshot.png` linking `/runs/7/summary.png`,
✔ teardown / ✔ no-leak, `history →` `/recipe/custom-html`) and **uptime-kuma** (level 4, success,
`dfed87a39f8a`, `/runs/12/...`).
4. **Live history:** `curl -s https://ci.commoninternet.net/recipe/custom-html` (200) → rows #7/#4/#3/#1
each L4/success/version + per-run `card` link to `/runs/<n>/summary.png`; `…/recipe/uptime-kuma` →
#12 (success L4) **and #11 (failure, level —, no card)** — a real failed run shown honestly (it
failed at `fetch_recipe` on a bogus ref, wrote no results.json → grid shows failure/level —).
5. **No inflation (cardinal):** each card's level/status/version == `/runs/<n>/results.json`
(`curl -s https://ci.commoninternet.net/runs/7/results.json` → custom-html level 4 all-pass;
`/runs/12/results.json` → uptime-kuma level 4 all-pass). A failed/absent run shows `level —` +
the failure pill + the "no screenshot" placeholder — never a level/screenshot it didn't earn.
6. **No secrets (R7):** scan the grid + both history pages → only the `title="no secret leak"` flag
label matches `secret`; embedded thumbnails are the U1-verified secret-safe landing pages.
7. **HEAD parity:** `curl -sI https://ci.commoninternet.net/` and `…/recipe/custom-html` → 200
(`do_HEAD` shares `_route` with GET; A3-1 stays closed).
**EXPECTED.** (1) `9 passed`. (2) tag `7b34ec8761df` both places. (3) grid 200 with the two cards as
described; (4) history 200 with the run rows + card links incl. the honest uptime-kuma failure row;
(5) card fields == results.json (custom-html L4, uptime-kuma L4); (6) zero real secret hits; (7) HEAD 200.
The cardinal U4 invariant: the grid + history are a faithful, never-greener projection of each run's
`results.json`; a failed/levelless run is shown as such (no inflated level, no screenshot it didn't
produce); rendering is read-only over the RO-bind-mounted artifacts.
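The never-greener rule for a grid row can be sketched as follows: start from an empty row and fill in level, version and screenshot only from a readable `results.json`. This is an assumption-laden illustration, not `_build_row` itself: the name `build_row_sketch`, the `number`/`status` build keys, and the `version` key are hypothetical spellings of the fields the text mentions.

```python
import json
from typing import Any, Dict, Optional


def build_row_sketch(build: Dict[str, Any],
                     results_text: Optional[str]) -> Dict[str, Any]:
    """Merge a CI build record with its results.json text, never inventing data."""
    row: Dict[str, Any] = {
        "run": build["number"],
        "status": build["status"],
        "level": None,        # rendered as "no level" when absent
        "version": None,
        "screenshot": None,   # rendered as the "no screenshot" placeholder
    }
    if results_text is None:
        return row
    try:
        data = json.loads(results_text)
    except ValueError:
        return row  # unreadable artifact: show the bare failure row
    if not isinstance(data, dict):
        return row
    level = data.get("level")
    if isinstance(level, int) and not isinstance(level, bool):
        row["level"] = level
    row["version"] = data.get("version")
    if data.get("screenshot"):
        row["screenshot"] = f"/runs/{build['number']}/{data['screenshot']}"
    return row


ok = build_row_sketch({"number": 7, "status": "success"},
                      '{"level": 4, "version": "db9a95024e9d", "screenshot": "screenshot.png"}')
failed = build_row_sketch({"number": 11, "status": "failure"}, None)
print(ok["level"], ok["screenshot"])          # 4 /runs/7/screenshot.png
print(failed["level"], failed["screenshot"])  # None None
```

Defaulting every artifact-derived field to empty means a missing or corrupt `results.json` can only make a row look worse, never better.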
## Gate: U5 — CLAIMED, awaiting Adversary (Badges + docs + hardening; R6, R7, R8 — FINAL gate)
**WHAT.** The last milestone: (a) **R6** — a per-recipe **latest-level badge** endpoint
`/badge/<recipe>.svg` (shields-style, coloured by level, embeddable in a recipe README; falls back to
a status badge for a recipe with no level yet); (b) **R8** — `docs/results-ux.md` now fully explains
the level ladder + tier→rung mapping, results.json schema, card/screenshot generation, the PR-comment
shape, and the badge endpoints + README embed snippet; (c) **R7 hardening** — render failure degrades
to text/omission and **never affects the verdict**, proven by a forced render-kill run; a broad secret
scan over every published artifact + all PR comments finds **zero** real secret values; plus a new
defense-in-depth try/except around the screenshot call site so a screenshot can never crash the run.
**WHERE (commits / files).**
- `91a69b8` `dashboard/dashboard.py` — `render_level_badge` + `_badge_svg`; `/badge/<recipe>.svg`
route prefers the latest-run level (from results.json), status fallback. Deployed
`cc-ci-dashboard:8acd8b9cc51c` (== `sha256(dashboard.py)`, confirmed live). `tests/unit/test_dashboard.py`
(+2 badge tests → 11 total).
- `91a69b8` `docs/results-ux.md` §1-5 complete (R8).
- `799cceb` `runner/run_recipe_ci.py` — defense-in-depth try/except around `screenshot_mod.capture`
call site (R7); a screenshot raise is now caught + logged non-fatal, verdict unaffected.
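A shields-style badge with a status fallback, as described for `/badge/<recipe>.svg`, can be produced by a deterministic string builder. The sketch below is not `render_level_badge`: the function name, geometry, grey fallback colour and label colour are assumptions, and only the level-4 colour `#a0b93f` and the badge texts are taken from this file.

```python
from html import escape
from typing import Optional

LEVEL_COLORS = {4: "#a0b93f"}  # only level 4's colour is stated in this file
FALLBACK_COLOR = "#9f9f9f"     # assumed neutral grey for unknown or unlisted levels


def badge_svg_sketch(recipe: str, level: Optional[int]) -> str:
    """Build a two-box SVG badge; fall back to a status badge with no level."""
    if level is None:
        label, message, color = "cc-ci", "unknown", FALLBACK_COLOR
    else:
        label, message = f"cc-ci: {recipe}", f"level {level}"
        color = LEVEL_COLORS.get(level, FALLBACK_COLOR)
    # Rough width estimate: about 7px per character plus padding.
    label_w = 7 * len(label) + 10
    msg_w = 7 * len(message) + 10
    text_attrs = 'y="14" fill="#fff" font-family="sans-serif" font-size="11"'
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{label_w + msg_w}" height="20">'
        f'<rect width="{label_w}" height="20" fill="#555"/>'
        f'<rect x="{label_w}" width="{msg_w}" height="20" fill="{color}"/>'
        f'<text x="5" {text_attrs}>{escape(label)}</text>'
        f'<text x="{label_w + 5}" {text_attrs}>{escape(message)}</text>'
        "</svg>"
    )


print('fill="#a0b93f"' in badge_svg_sketch("custom-html", 4))  # True
print("unknown" in badge_svg_sketch("keycloak", None))         # True
```

Escaping the recipe name matters because it arrives from the URL path and ends up inside markup.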
**HOW to verify (cold, from your clone / the VM).**
1. **R6 per-recipe level badge (live):**
`curl -s https://ci.commoninternet.net/badge/custom-html.svg` → SVG `cc-ci: custom-html | level 4`,
message-box `fill="#a0b93f"` (= `level_color(4)`); `…/badge/uptime-kuma.svg` → `level 4`;
`…/badge/keycloak.svg` (no runs) → 200, status-fallback `cc-ci | unknown`. README embed snippet in
`docs/results-ux.md` §5.
2. **R8 docs:** read `docs/results-ux.md` — §1 ladder + tier→rung mapping, §2 schema, §3 card+screenshot
+ stable URLs, §4 PR comment, §5 badges + embed snippet. No remaining TODOs.
3. **R7 render-kill degradation (verdict unaffected) — reproduce:** drive `run_recipe_ci.main()` with
the orchestrator-side cosmetic renderers forced to raise but the real (subprocess) test browser
intact — monkeypatch `run_recipe_ci.card_mod.render_card_html`/`render_card_png` and
`run_recipe_ci.screenshot_mod.capture` to raise, `RECIPE=custom-html STAGES=install`. Result
(`/var/lib/cc-ci-runs/u5-renderkill3` from my run): **EXIT 0**, install **pass** (test_serving +
test_serving_and_content PASSED — real browser unaffected), `results.json` written
(`level=1, install=pass, screenshot=null`), and **NO summary.png / NO screenshot.png** — both
cosmetic failures swallowed (`screenshot capture raised (non-fatal…)` + `summary card/badge render
failed (non-fatal)`). A renderer kill cannot change the verdict or block the run.
(Note: globally breaking the *browser path* instead — `/var/lib/cc-ci-runs/u5-renderkill2` — fails
the install tier, because custom-html's `test_serving_and_content` is a REAL browser test; that is a
real test failing correctly, NOT a cosmetics-vs-verdict datapoint. The clean isolation above breaks
ONLY the cosmetic renderers.)
4. **R7 broad leak scan:** over every published text artifact —
`for f in $(find /var/lib/cc-ci-runs -maxdepth 2 \( -name results.json -o -name summary.html -o -name badge.svg \)); do grep -EaoH 'password|passwd|secret|token|api_key|privkey|BEGIN [A-Z ]*PRIVATE KEY|AKIA[0-9A-Z]{16}|[0-9a-f]{40}' "$f"; done`
→ the ONLY matches are the `no_secret_leak` JSON field + the `✔ no secret leak` card label (a
flag name, not a value); **zero real secret values**. Same scan over all bot comments on
custom-html PR#2 → **0**. The embedded screenshots are the U1/U4-verified secret-safe setup/landing
pages (empty credential fields). (You are the R7 leak authority — this is my own pre-claim scan.)
5. **R7 comment text-fallback** (render fail → text, not a broken image): unit-covered
(`tests/unit/test_bridge_trigger.py::test_result_comment_text_fallback_when_card_missing`) + the
bridge checks `artifact_available` (HEAD) before embedding (U3-verified structurally).
6. **Unit tests** (cold): `cc-ci-run -m pytest tests/unit/test_dashboard.py tests/unit/test_card.py
tests/unit/test_bridge_trigger.py tests/unit/test_screenshot.py tests/unit/test_level.py
tests/unit/test_results.py -q` → all green (11+8+7+3+15+13).
**EXPECTED.** (1) badges render with level colour + status fallback; (2) docs complete, no TODOs;
(3) render-kill: exit 0, install pass, results.json intact, no card/screenshot; (4) leak scan: only the
flag name/label, zero real values, 0 in comments; (5) a missing card falls back to text (named unit
test green); (6) all unit tests green.
The cardinal U5 invariant: cosmetics (card, screenshot, badge, comment image) **never** block/fail a
run or change its verdict — they degrade to text/omission; and no published artifact leaks a secret.
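The broad leak scan in step 4 can also be expressed in Python, which makes the "only the flag name matches" judgement mechanical. In this sketch the pattern is the one from the `grep` command above; the `BENIGN` list and the `real_hits` helper are hypothetical additions that blank out the known flag name and card label before scanning.

```python
import re
from typing import List

# Same alternatives as the grep -E pattern in step 4 (case-sensitive).
LEAK_RE = re.compile(
    r"password|passwd|secret|token|api_key|privkey|"
    r"BEGIN [A-Z ]*PRIVATE KEY|AKIA[0-9A-Z]{16}|[0-9a-f]{40}"
)
# Known-harmless strings: the results.json flag name and the card label.
BENIGN = ("no_secret_leak", "no secret leak")


def real_hits(text: str) -> List[str]:
    """Return pattern matches that remain after removing the benign strings."""
    for harmless in BENIGN:
        text = text.replace(harmless, "")
    return LEAK_RE.findall(text)


print(real_hits('{"flags": {"clean_teardown": true, "no_secret_leak": true}}'))  # []
print(real_hits("aws key AKIAABCDEFGHIJKLMNOP"))  # ['AKIAABCDEFGHIJKLMNOP']
```

A scan like this only reports candidates; a human still has to decide whether a hit is a real value, which is why the Adversary remains the R7 leak authority.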
**When the Adversary's U5 PASS lands and REVIEW-3 shows all R1-R8 verified <24h with no VETO → I flip
STATUS-3 to `## DONE`.**
## In flight
(none — U5 (final gate) claimed; parked awaiting the Adversary. On U5 PASS + all R1-R8 verified → DONE.)
## Blocked
(none)
## Note — Drone repo reactivation (infra, recorded for the Adversary)
The Hetzner-migration Drone DB reset left `recipe-maintainers/cc-ci` **inactive** (bridge log `drone
trigger failed 404`); the bridge can't trigger builds when the repo is inactive. I reactivated it
(in-scope reconfig of my own CI, reversible): `POST /api/user/repos?async=false` then `POST
/api/repos/recipe-maintainers/cc-ci` → `active=true`, config_path `.drone.yml`, timeout 60. This is
why builds #1-#4 above exist (counter reset to 1 by the DB reset). Self-heal hardening filed as
BACKLOG-3 U3.3 (fold activation into the drone reconcile) — not a U3 DoD item.