cc-ci/machine-docs/STATUS-3.md
autonomic-bot e487b7febd
Some checks failed
continuous-integration/drone/push Build is failing
status(3): ## DONE — U5 PASS (Adversary @15b3057); all R1–R8 Adversary-verified, no VETO
Phase 3 complete. U5 gate PASS @2026-05-31T13:13Z:
- R6 per-recipe badge endpoint live (custom-html/uptime-kuma level 4, keycloak unknown fallback)
- R8 docs/results-ux.md §1-5 complete, no TODOs
- R7 render-kill: exit 0, install pass, results.json intact, no card/screenshot (u5-renderkill3)
- R7 broad leak scan: 0 real secret values in any artifact or PR comment
All R1–R8 verified <24h; STATUS-3 flipped to ## DONE.
2026-05-31 13:17:44 +00:00


Phase 3 — Beautiful YunoHost-style results — STATUS

SSOT: /srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md. DoD = R1–R8. Milestones U0–U5. State files (this phase): machine-docs/{STATUS,BACKLOG,REVIEW,JOURNAL}-3.md. DECISIONS.md shared.

WHAT + HOW + EXPECTED + WHERE live here; WHY → JOURNAL-3.md.

Phase context

  • Phase 2b is ## DONE (Adversary-verified, no VETO). Phase 3 kicked off manually by the operator. Note for honesty: Phase-2 ## DONE not yet flipped (REVIEW-2 standing VETO on full Phase-2 DONE authorization); cross-phase sequencing is an operator call. Adversary concurs it's not a P3 blocker (REVIEW-3 @05:42Z).
  • Pre-existing repo-wide lint is RED on origin/main (94 files ruff format-dirty + 36 ruff check errors; confirmed on cc-ci CI devshell against clean origin/main, ruff 0.7.3). This predates Phase 3 and is NOT introduced by my work — my NEW Phase-3 files are fully ruff-clean, and I left run_recipe_ci.py with fewer ruff errors than main (1 vs 4). Flagged for the operator; not a Phase-3 DoD item, and the U0 gate is verified by unit tests + real-run results.json, not repo-wide lint.

Gate: U0 — PASS (Adversary REVIEW-3 @18d2bd1, 2026-05-31; R1 cold-verified, no VETO) (Results schema + level)

WHAT. run_recipe_ci.py now emits a per-run results.json with per-stage AND per-test ✔/✘ breakdown and a computed integer level (L0–L6, YunoHost gap-caps semantics). DoD R1 (level ladder) satisfied; U0 milestone acceptance ("level correct for a recipe through L4 and one capped at L2") demonstrated on two real end-to-end runs.

WHERE (commits / files).

  • 9773e3f runner/harness/level.py — pure compute_level(rungs)->(level,cap_reason) + helpers backup_restore_status, tier_to_rung. tests/unit/test_level.py (15 tests).
  • 52e5d21 runner/harness/results.py — JUnit-XML parse, collect_stages, derive_rungs (the tier+deps/SSO→rung translation), build_results, write_results. tests/unit/test_results.py (13 tests). runner/run_recipe_ci.py — tiers emit --junitxml + append {tier,source,file,rc,junit} records; main() assembles+writes results.json wrapped so a failure NEVER changes the verdict (R7), incl. a narrow self leak-scan of the serialised artifact.
  • 757511e machine-docs/DECISIONS.md (Phase-3 section) — the documented ladder + exact rung-mapping contract derive_rungs implements + results.json schema + artifact-hosting decision.

HOW to verify (cold, from your clone on cc-ci).

  1. Unit tests (deterministic; also fuzz-verifiable): cc-ci-run -m pytest tests/unit/test_level.py tests/unit/test_results.py -q
  2. Real-run L2-cap (stateless, not backup-capable, ≥2 versions): RECIPE=custom-html-tiny STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-cht cc-ci-run runner/run_recipe_ci.py then read /var/lib/cc-ci-runs/adv-cht/results.json.
  3. Real-run L4-pass (backup-capable, 3 functional tests, no deps): RECIPE=uptime-kuma STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-uk cc-ci-run runner/run_recipe_ci.py then read /var/lib/cc-ci-runs/adv-uk/results.json. (Compare the level/rungs against the results dict + DECISIONS contract — a level greener than the tiers would be a FAIL. Verify clean teardown: no orphan *-pr*/recipe service after.)

EXPECTED.

  1. 28 passed.
  2. custom-html-tiny: level=2, level_cap_reason="L3 backup/restore (data integrity) N/A", rungs={install:pass, upgrade:pass, backup_restore:na, functional:na, integration:na, recipe_local:na}, results={install:pass, upgrade:pass, backup:skip, restore:skip, custom:skip}, flags={clean_teardown:true, no_secret_leak:true}, stages=[install,upgrade] each w/ per-test rows. (My run: /var/lib/cc-ci-runs/u0-cht-L2/results.json.)
  3. uptime-kuma: level=4, level_cap_reason="L5 integration (SSO/OIDC + cross-app) N/A", rungs={install:pass, upgrade:pass, backup_restore:pass, functional:pass, integration:na, recipe_local:na}, all five tiers pass, flags.clean_teardown=true, stages=[install,upgrade,backup,restore,custom] with per-test rows (incl. 3 uptime-kuma functional tests, source cc-ci). (My run: /var/lib/cc-ci-runs/u0-uk-L4/results.json.)

These two bracket the gate: a recipe whose functional tests pass is still capped at L2 when a lower rung (L3 backup) is N/A (gap-caps; never inflates), and a full clean climb with no SSO surface caps at L4.
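
A minimal sketch of the gap-caps rule, for readers who want to check the two EXPECTED results by hand. The name compute_level and its (level, cap_reason) return shape come from this file, and the rung keys follow results.json. The LADDER table, the labels for L1, L4 and L6, and the wording used for a failed rung are assumptions for illustration, not the contents of runner/harness/level.py.

```python
from typing import Optional

# Ladder in ascending order: (rung key in results.json, human label).
# The L2, L3 and L5 labels match the cap reasons quoted in this file;
# the L1, L4 and L6 labels are assumed wording.
LADDER = [
    ("install", "L1 install"),
    ("upgrade", "L2 upgrade"),
    ("backup_restore", "L3 backup/restore (data integrity)"),
    ("functional", "L4 functional"),
    ("integration", "L5 integration (SSO/OIDC + cross-app)"),
    ("recipe_local", "L6 recipe-local"),
]


def compute_level(rungs: dict[str, str]) -> tuple[int, Optional[str]]:
    """Return (level, cap_reason); the first non-passing rung caps the level."""
    level = 0
    for key, label in LADDER:
        status = rungs.get(key, "na")
        if status != "pass":
            # Assumed wording: "N/A" for a missing rung, upper-cased status otherwise.
            suffix = "N/A" if status == "na" else status.upper()
            return level, f"{label} {suffix}"
        level += 1
    return level, None  # full climb, nothing capped
```

Because the loop stops at the first rung that is not "pass", a passing higher rung can never lift the level above a gap below it.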


Gate: U1 — PASS (Adversary REVIEW-3 @74a6993, 2026-05-31; R4 cold-verified, no VETO) (App screenshot)

WHAT. The harness now captures a real Playwright screenshot of the deployed app while it is up (after deploy+health/readiness, before any tier mutates state, before teardown) and writes it to the run artifact dir as screenshot.png. The capture is secret-safe by default (it shoots the app landing page, never a credentials page; a recipe opts into a post-login view via an optional SCREENSHOT meta hook that owns the no-secret-page guarantee — none used yet). It is best-effort: capture() swallows every error and returns None, so it NEVER blocks/fails/hangs the run (R7); the results.json screenshot field is set to "screenshot.png" ONLY when the capture actually produced a file, else stays null. U1 milestone acceptance ("screenshot of a sample recipe shows the working UI, no secrets") demonstrated on a real uptime-kuma run; graceful-degradation (R7) demonstrated on an unreachable-domain capture.

WHERE (commits / files).

  • 5fa15d4 runner/run_recipe_ci.py — imports screenshot as screenshot_mod; after deploy+readiness and OUTSIDE the deploy try/except (so a screenshot issue can never flip deploy_ok), under if deploy_ok: calls screenshot_mod.capture(domain, screenshot_path(run_artifact_dir), recipe_meta=meta) and sets screenshot_rel; passes screenshot=screenshot_rel into build_results(...).
  • daa7edd runner/harness/screenshot.py — capture() (default landing-page nav via browser.goto_with_retry, 45s deadline cap; optional SCREENSHOT hook), screenshot_path(), _load_screenshot_hook(). tests/unit/test_screenshot.py (pure helpers; 3 tests).

HOW to verify (cold, from your clone on cc-ci).

  1. Pure-helper unit tests: cc-ci-run -m pytest tests/unit/test_screenshot.py -q
  2. Real positive capture (working UI, no secret): rm -rf /var/lib/cc-ci-runs/adv-u1 && RECIPE=uptime-kuma STAGES=install CCCI_RUN_ID=adv-u1 cc-ci-run runner/run_recipe_ci.py then scp back /var/lib/cc-ci-runs/adv-u1/screenshot.png and EYEBALL it; check /var/lib/cc-ci-runs/adv-u1/results.json has "screenshot":"screenshot.png". Confirm NO orphan service after (docker service ls | grep -i uptime empty = clean teardown).
  3. Graceful degradation (R7) — capture against an unreachable host returns None, never raises: cc-ci-run -c 'import sys; sys.path.insert(0,"runner"); from harness import screenshot as S; print(S.capture("adv-u1-noexist.ci.commoninternet.net","/tmp/x.png"))' → prints None (≈45s), no /tmp/x.png produced.

EXPECTED.

  1. 3 passed (test_screenshot.py has 3 pure-helper tests; corrected from an earlier "4" over-count per the Adversary's honest-reporting flag, REVIEW-3 @74a6993 — doc-only, no behavioural impact).
  2. screenshot.png ~30 KB showing uptime-kuma's "Uptime Kuma / Create your admin account" landing page with EMPTY username/password/repeat fields (a setup form — it asks the user to set a password; it does NOT display any generated secret), i.e. real working app UI, no secret values. results.json screenshot="screenshot.png", flags.clean_teardown=true; no orphan service. (My run: /var/lib/cc-ci-runs/u1-uk-shot/{screenshot.png,results.json}.)
  3. None returned after the 45s deadline, no file written, no exception — proving a screenshot failure leaves the run/verdict untouched (cosmetics never block, R7). (My check log: capture "failed (non-fatal, verdict unaffected)" → GRACEFUL_DEGRADATION= True.)

The cardinal Phase-3 invariant for U1: the screenshot is a faithful capture of the live app, never a credentials page, and its presence/absence never changes the verdict.
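
A rough sketch of the best-effort contract (swallow every error, return None, set the screenshot field only when a file exists). The real capture() drives Playwright with a 45s deadline; here the browser step is replaced by a caller-supplied function so the sketch stays dependency-free. capture_best_effort, shoot and screenshot_field are hypothetical names, not the harness API.

```python
import os
from typing import Callable, Optional


def capture_best_effort(
    shoot: Callable[[str, str], None], domain: str, out_path: str
) -> Optional[str]:
    """Run shoot(domain, out_path); return out_path only if a file was produced."""
    try:
        shoot(domain, out_path)
    except Exception as exc:
        # Cosmetics must never change the verdict: log and carry on.
        print(f"screenshot capture failed (non-fatal, verdict unaffected): {exc}")
        return None
    return out_path if os.path.isfile(out_path) else None


def screenshot_field(captured: Optional[str]) -> Optional[str]:
    """Value for results.json: the relative file name, or None (JSON null)."""
    return "screenshot.png" if captured else None
```

The existence check after the call matters as much as the except clause: a capture that returns without raising but writes nothing still yields null in results.json.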


Gate: U2 — PASS (Adversary REVIEW-3 @324d84d, 2026-05-31; R3/R6 partial cold-verified, no VETO) (Summary card + badge)

WHAT. Each run now renders a summary card PNG (recipe+version, level badge, per-stage/per-test ✔/✘ table, embedded real app screenshot) and an SVG level badge, written into the run artifact dir and served at stable URLs https://ci.commoninternet.net/runs/<run_id>/{summary.png,badge.svg,screenshot.png,results.json}. The card REPORTS results.json verbatim — it computes nothing, so it can never look greener than the tiers (cardinal invariant). U2 acceptance ("card + badge render correctly for a pass run AND a fail run") demonstrated: a real PASS run served live; a deterministic FAIL render shown honest (L0/red/✘/no-screenshot).

WHERE (commits / files).

  • afe5e51 runner/run_recipe_ci.py — after results.json is written, a separate best-effort block renders summary.html → summary.png + badge.svg via harness.card (passes screenshot_rel=data["screenshot"] so the real shot embeds iff present). R7-wrapped — any failure is swallowed, never changes overall.
  • daa7edd/7217e0c/8179d3f runner/harness/card.py — pure render_card_html, render_badge_svg/level_badge_svg (deterministic string builders), render_card_png (best-effort Playwright). Inline SVG sunflower (headless chromium has no colour-emoji font). tests/unit/test_card.py (8 tests).
  • fa56f6b dashboard/dashboard.py + nix/modules/dashboard.nix — /runs/<id>/<file> route (allow-list + run_id regex + realpath-inside-runs-dir traversal guard); /var/lib/cc-ci-runs bind-mounted READ-ONLY into the dashboard swarm service; CCCI_RUNS_DIR env.

HOW to verify (cold). (See ADVERSARY-INBOX for the deploy gotcha — do NOT nixos-rebuild switch the live host; #cc-ci targets the hetzner migration host. U2.3 was rolled via the dashboard module reconcile only. DECISIONS.md Phase-3/U2 has the diff-closures evidence.)

  1. Unit tests: cc-ci-run -m pytest tests/unit/test_card.py -q → 8 passed.
  2. PASS card served live (real): curl -s -o /tmp/c.png -w '%{http_code} %{content_type} %{size_download}\n' https://ci.commoninternet.net/runs/u1-uk-shot/summary.png → 200 image/png ~69313. Eyeball /tmp/c.png: uptime-kuma, orange LEVEL 1, "capped: L2 upgrade N/A", install/test_serving ✔ PASS rows, clean-teardown+no-secret-leak flags, and the real uptime-kuma screenshot embedded. Also …/screenshot.png (200 ~30858), …/badge.svg (200 image/svg+xml), …/results.json (200).
  3. Traversal/whitelist guard: …/runs/u1-uk-shot/../../../etc/passwd, …/runs/u1-uk-shot/evil.sh, …/runs/nonexist/results.json → 404 with a 9-byte body (the dashboard's own "not found", NOT Traefik's 19-byte 404 — proves the request reached the app and the guard rejected it).
  4. FAIL render is honest (cardinal invariant): feed the card a fail dict (cmd in ADVERSARY-INBOX §3) → card shows level 0, level_color(0) (red), the ✘ FAIL mark on the install row, and the "no screenshot" placeholder — never greener than the data.

EXPECTED. (1) 8 passed. (2) PASS card 200/image-png/~69KB, embeds the real screenshot, level/marks match results.json (u1-uk-shot: level 1, install pass). (3) all three guarded paths 404 with a 9B body. (4) fail render: >0< (level 0), red colour, ✘ present, "no screenshot" present — no inflation.

The cardinal U2 invariant: the rendered card/level/badge are a faithful, never-greener projection of results.json + the actual test outcomes, served at a stable URL, generated best-effort so a render failure never blocks the run.
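
A sketch of the three-part guard on the /runs/<id>/<file> route (allow-list, run_id regex, realpath inside the runs dir). The allow-list mirrors the four served files named above; RUN_ID_RE and the function name resolve_artifact are assumptions, and the real dashboard pattern may be stricter or looser.

```python
import os
import re
from typing import Optional

ALLOWED_FILES = {"summary.png", "badge.svg", "screenshot.png", "results.json"}
# Assumed pattern: must start alphanumeric, so "." and ".." can never match.
RUN_ID_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]{0,63}$")


def resolve_artifact(runs_dir: str, run_id: str, filename: str) -> Optional[str]:
    """Return the real path of an allowed artifact, or None (caller serves 404)."""
    if filename not in ALLOWED_FILES or not RUN_ID_RE.match(run_id):
        return None
    root = os.path.realpath(runs_dir)
    path = os.path.realpath(os.path.join(root, run_id, filename))
    # realpath resolves symlinks, so a link pointing outside the runs dir is rejected.
    if os.path.commonpath([root, path]) != root:
        return None
    return path if os.path.isfile(path) else None
```

Every rejection returns the same None, which fits the single application-level 404 described in step 3.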

Gate: U3 — PASS (Adversary REVIEW-3 @778b577, 2026-05-31T09:51Z; R2 cold-verified, no VETO) (YunoHost-style PR comment)

(Adversary cold-reproduced update-in-place via its own !testme → build #7; comment 13792 never stacked; card == results.json, no inflation; no secrets. R3 "in comment" verified; R3 ticks at U4.)

WHAT. On a !testme run the bridge now posts/updates ONE Gitea PR comment in the YunoHost shape: on run start a 🌻 + placeholder ("level pending", live-logs link); on completion it edits the SAME comment in place to 🌻 + a level badge image + a summary card image, BOTH linked to the full run, plus full-logs/dashboard links. A re-!testme refreshes that same comment (back to the placeholder, then to the new result) — never stacks a new one (R2 "one comment per PR, updated in place"). Falls back to a compact text verdict if the rendered card isn't served (R7). DoD R2 satisfied; U3 acceptance ("live on a scratch PR — comment shows badge + card + screenshot, updates on re-run, no secrets") demonstrated on a real scratch PR. (This also lands R3's "embedded in the comment" sub-requirement; R3 still needs "in dashboard" at U4.)

WHERE (commits / files).

  • 9a47aa2 bridge/bridge.py — COMMENT_MARKER (hidden HTML comment <!-- cc-ci:testme -->), start_comment_body (placeholder), result_comment_body (🌻 + badge + card, linked; text fallback), find_existing_comment (marker → update-in-place), artifact_available (HEAD existence check → image-vs-text), watch_and_reflect now edits to result_comment_body. Card/badge URLs are ${DASH_URL}/runs/<DRONE_BUILD_NUMBER>/{summary.png,badge.svg} (run_id == Drone build number, see runner/harness/results.py::run_id).
  • 9a47aa2 dashboard/dashboard.py — do_HEAD (shared _route with GET) so HEAD existence-checks + strict image clients get 200, not 501 (closes Adversary A3-1, already re-verified @8807240).
  • 9a47aa2 tests/unit/test_bridge_trigger.py — covers placeholder shape, image-forward result, text fallback when card missing, marker-based find/update-in-place.
  • Deployed: bridge swarm image cc-ci-bridge:6377f9571f3b == sha256(bridge.py) first-12 (content tag, confirmed live); dashboard image live with do_HEAD.

HOW to verify (cold, from your clone / the VM).

  1. Unit tests (on cc-ci): cc-ci-run -m pytest tests/unit/test_bridge_trigger.py tests/unit/test_card.py -q → 15 passed.
  2. Deployed bridge == source: ssh cc-ci 'sha256sum /etc/cc-ci/bridge/bridge.py | cut -c1-12' → 6377f9571f3b; ssh cc-ci 'docker service ls | grep ccci-bridge' shows image tag 6377f9571f3b.
  3. LIVE demo on scratch PR recipe-maintainers/custom-html PR #2 (recipe == repo name; the bridge poller, 30s, fires on a NEW !testme). The bot comment carrying the marker is id 13792: curl -s -u "$GITEA_USERNAME:$GITEA_PASSWORD" https://git.autonomic.zone/api/v1/repos/recipe-maintainers/custom-html/issues/comments/13792 → body has <!-- cc-ci:testme -->, 🌻, ✅ passed, [![cc-ci result card](…/runs/4/summary.png)](…/4), [![level](…/runs/4/badge.svg)](…/4), full-logs+dashboard links. (You may post your own !testme on PR #2 — the repo is active in Drone; it will refresh the same comment 13792.)
  4. Images render (served): for f in summary.png badge.svg screenshot.png results.json; do curl -s -o /dev/null -w "$f %{http_code}\n" https://ci.commoninternet.net/runs/4/$f; done → all 200.
  5. Updates in place / no stacking: the marked-comment set on PR #2 stays exactly [13792] across runs #3 (first !testme) and #4 (re-!testme); the comment cycled placeholder → result both times. (Filter comments for <!-- cc-ci:testme --> — there is exactly one.)
  6. No secrets: scan the comment body + /var/lib/cc-ci-runs/{3,4}/{results.json,summary.html} for password|secret|token|passwd|api_key|privkey|PRIVATE → only the no_secret_leak flag-name matches; the embedded app screenshot is custom-html's "Welcome to nginx!" page (no values).
  7. No inflation: the card for run #4 shows level 4 / capped: L5 integration N/A, all install/upgrade/backup/restore/custom rows ✔ — matches /runs/4/results.json verbatim.

EXPECTED.

  1. 15 passed.
  2. tag 6377f9571f3b both places.
  3. comment 13792 body exactly as above (run 4).
  4. all four /runs/4/ files 200 (summary.png ~178 KB, badge.svg 342 B, screenshot.png 35707 B).
  5. exactly one marked comment (13792); no new comment stacked on re-run.
  6. zero real secret hits.
  7. card level 4, all rows ✔, == results.json (recipe=custom-html, level=4, all tiers pass, flags.clean_teardown=true,no_secret_leak=true).

The cardinal U3 invariant: ONE comment per PR, refreshed in place; the embedded card/badge are a faithful never-greener projection of the run; image-gen failure degrades to text and never blocks the run or the verdict.
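
A sketch of the marker-based update-in-place decision and the image-vs-text fallback, kept as pure functions so it runs without Gitea. COMMENT_MARKER, find_existing_comment and result_comment_body are names from bridge.py, but the signatures, the body layout and the plan_upsert helper shown here are assumptions; the URLs in the usage are placeholders.

```python
from typing import Optional

COMMENT_MARKER = "<!-- cc-ci:testme -->"


def find_existing_comment(comments: list[dict], marker: str = COMMENT_MARKER) -> Optional[int]:
    """Return the id of the first comment carrying the hidden marker, if any."""
    for comment in comments:
        if marker in comment.get("body", ""):
            return comment["id"]
    return None


def result_comment_body(run_url: str, card_url: str, badge_url: str,
                        verdict: str, card_available: bool) -> str:
    """Image-forward body when the card is served, compact text otherwise."""
    lines = [COMMENT_MARKER, f"🌻 {verdict}"]
    if card_available:
        lines.append(f"[![level]({badge_url})]({run_url})")
        lines.append(f"[![cc-ci result card]({card_url})]({run_url})")
    else:
        lines.append(f"Result: {verdict} ([full run]({run_url}))")
    return "\n".join(lines)


def plan_upsert(comments: list[dict], body: str) -> tuple[str, Optional[int], str]:
    """Decide between creating a comment and editing the marked one in place."""
    existing = find_existing_comment(comments)
    if existing is None:
        return ("create", None, body)
    return ("update", existing, body)
```

Since the marker is embedded in every body the bot writes, the second and later runs always take the "update" branch, which is what keeps the marked-comment set at exactly one.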

Gate: U4 — PASS (Adversary REVIEW-3 @9ca39dc, 2026-05-31T10:04Z; R5 + R3-full cold-verified, no VETO) (Dashboard polish)

(Grid + history cold-verified never-greener vs results.json; honest #11 failure row (404 results.json → failure/level —/no card); no secrets; deployed == source; 9 tests. R5 satisfied, R3 fully satisfied.)

WHAT. The overview at https://ci.commoninternet.net/ is now a YunoHost-CI-style grid: one card per enrolled recipe showing a level badge (coloured by level), latest pass/fail status, last-tested version, an app screenshot thumbnail (the run's screenshot.png, clickable → the full summary.png card), the clean-teardown/no-secret-leak flags, and a history link. A new per-recipe history page /recipe/<name> lists every run of that recipe (newest first): run #, status, level, version, when, and a per-run card link. Every field is read from the run's results.json (level/version/screenshot/flags) so the grid mirrors the artifact and is never greener than the run (cardinal guardrail). It re-renders live each request (30s cache + auto-refresh), i.e. "regenerated on build completion". DoD R5 satisfied; R3 now also embedded in the dashboard (was U3-verified in the comment) → R3 fully satisfied.

WHERE (commits / files).

  • e1d837e dashboard/dashboard.py — level_color, _results_for (traversal-guarded results.json reader), _custom_recipe_builds (cached, shared by overview+history), _build_row (Drone build + results.json → display row), latest_per_recipe (augmented), history_for, render_overview (grid), render_history, /recipe/<name> route. tests/unit/test_dashboard.py (9 tests).
  • Deployed: cc-ci-dashboard:7b34ec8761df (== sha256(dashboard.py) first-12, confirmed live), rolled via the dashboard module reconcile only (nixos-rebuild build non-activating → cc-ci-reconcile-dashboard = docker load + docker stack deploy). NOT nixos-rebuild switch (the #cc-ci config targets the migration host — DECISIONS Phase-3/U2; reconcile = zero host-config impact, reversible).

HOW to verify (cold, from your clone / the VM).

  1. Unit tests (on cc-ci): cc-ci-run -m pytest tests/unit/test_dashboard.py -q → 9 passed.
  2. Deployed == source: ssh cc-ci 'sha256sum /etc/cc-ci/dashboard/dashboard.py | cut -c1-12' → 7b34ec8761df; docker service ls | grep ccci-dashboard shows that tag.
  3. Live grid: curl -s https://ci.commoninternet.net/ (200) → two recipe cards: custom-html (level 4, success, db9a95024e9d, thumbnail /runs/7/screenshot.png linking /runs/7/summary.png, ✔ teardown / ✔ no-leak, history → /recipe/custom-html) and uptime-kuma (level 4, success, dfed87a39f8a, /runs/12/...).
  4. Live history: curl -s https://ci.commoninternet.net/recipe/custom-html (200) → rows #7/#4/#3/#1 each L4/success/version + per-run card link to /runs/<n>/summary.png; …/recipe/uptime-kuma → #12 (success L4) and #11 (failure, level —, no card) — a real failed run shown honestly (it failed at fetch_recipe on a bogus ref, wrote no results.json → grid shows failure/level —).
  5. No inflation (cardinal): each card's level/status/version == /runs/<n>/results.json (curl -s https://ci.commoninternet.net/runs/7/results.json → custom-html level 4 all-pass; /runs/12/results.json → uptime-kuma level 4 all-pass). A failed/absent run shows level — + the failure pill + the "no screenshot" placeholder — never a level/screenshot it didn't earn.
  6. No secrets (R7): scan the grid + both history pages → only the title="no secret leak" flag label matches secret; embedded thumbnails are the U1-verified secret-safe landing pages.
  7. HEAD parity: curl -sI https://ci.commoninternet.net/ and …/recipe/custom-html → 200 (the do_HEAD/_route share with GET; A3-1 stays closed).

EXPECTED. (1) 9 passed. (2) tag 7b34ec8761df both places. (3) grid 200 with the two cards as described; (4) history 200 with the run rows + card links incl. the honest uptime-kuma failure row; (5) card fields == results.json (custom-html L4, uptime-kuma L4); (6) zero real secret hits; (7) HEAD 200.

The cardinal U4 invariant: the grid + history are a faithful, never-greener projection of each run's results.json; a failed/levelless run is shown as such (no inflated level, no screenshot it didn't produce); rendering is read-only over the RO-bind-mounted artifacts.
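
A sketch of the "never greener" row projection: every display field comes from results.json, and a run without results.json gets no level, thumbnail or card. The function name build_row, the row keys and the "version" key in results.json are assumptions for illustration; only "number" and "status" are taken as Drone build fields.

```python
from typing import Optional


def build_row(build: dict, results: Optional[dict]) -> dict:
    """Project a Drone build plus its results.json (or None) into a display row."""
    run_id = str(build["number"])
    row = {
        "run": build["number"],
        "status": build["status"],
        "level": None,       # rendered as a "no level" placeholder
        "version": None,
        "flags": {},
        "thumbnail": None,   # rendered as the "no screenshot" placeholder
        "card": None,
    }
    if results:
        # Copy values verbatim; nothing is computed or defaulted upwards.
        row["level"] = results.get("level")
        row["version"] = results.get("version")
        row["flags"] = results.get("flags", {})
        if results.get("screenshot"):
            row["thumbnail"] = f"/runs/{run_id}/{results['screenshot']}"
        row["card"] = f"/runs/{run_id}/summary.png"
    return row
```

With results=None this reproduces the shape of the honest failure row from step 4: status failure, no level, no card.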

Gate: U5 — PASS (Adversary REVIEW-3 @15b3057, 2026-05-31T13:13Z; R6+R7+R8 cold-verified, no VETO) (Badges + docs + hardening; R6, R7, R8 — FINAL gate)

WHAT. The last milestone: (a) R6 — a per-recipe latest-level badge endpoint /badge/<recipe>.svg (shields-style, coloured by level, embeddable in a recipe README; falls back to a status badge for a recipe with no level yet); (b) R8 — docs/results-ux.md now fully explains the level ladder + tier→rung mapping, results.json schema, card/screenshot generation, the PR-comment shape, and the badge endpoints + README embed snippet; (c) R7 hardening — render failure degrades to text/omission and never affects the verdict, proven by a forced render-kill run; a broad secret scan over every published artifact + all PR comments finds zero real secret values; plus a new defense-in-depth try/except around the screenshot call site so a screenshot can never crash the run.

WHERE (commits / files).

  • 91a69b8 dashboard/dashboard.py — render_level_badge + _badge_svg; /badge/<recipe>.svg route prefers the latest-run level (from results.json), status fallback. Deployed cc-ci-dashboard:8acd8b9cc51c (== sha256(dashboard.py), confirmed live). tests/unit/test_dashboard.py (+2 badge tests → 11 total).
  • 91a69b8 docs/results-ux.md §1-5 complete (R8).
  • 799cceb runner/run_recipe_ci.py — defense-in-depth try/except around screenshot_mod.capture call site (R7); a screenshot raise is now caught + logged non-fatal, verdict unaffected.

HOW to verify (cold, from your clone / the VM).

  1. R6 per-recipe level badge (live): curl -s https://ci.commoninternet.net/badge/custom-html.svg → SVG cc-ci: custom-html | level 4, message-box fill="#a0b93f" (= level_color(4)); …/badge/uptime-kuma.svg → level 4; …/badge/keycloak.svg (no runs) → 200, status-fallback cc-ci | unknown. README embed snippet in docs/results-ux.md §5.
  2. R8 docs: read docs/results-ux.md — §1 ladder + tier→rung mapping, §2 schema, §3 card+screenshot + stable URLs, §4 PR comment, §5 badges + embed snippet. No remaining TODOs.
  3. R7 render-kill degradation (verdict unaffected) — reproduce: drive run_recipe_ci.main() with the orchestrator-side cosmetic renderers forced to raise but the real (subprocess) test browser intact — monkeypatch run_recipe_ci.card_mod.render_card_html/render_card_png and run_recipe_ci.screenshot_mod.capture to raise, RECIPE=custom-html STAGES=install. Result (/var/lib/cc-ci-runs/u5-renderkill3 from my run): EXIT 0, install pass (test_serving + test_serving_and_content PASSED — real browser unaffected), results.json written (level=1, install=pass, screenshot=null), and NO summary.png / NO screenshot.png — both cosmetic failures swallowed (screenshot capture raised (non-fatal…) + summary card/badge render failed (non-fatal)). A renderer kill cannot change the verdict or block the run. (Note: globally breaking the browser path instead — /var/lib/cc-ci-runs/u5-renderkill2 — fails the install tier, because custom-html's test_serving_and_content is a REAL browser test; that is a real test failing correctly, NOT a cosmetics-vs-verdict datapoint. The clean isolation above breaks ONLY the cosmetic renderers.)
  4. R7 broad leak scan: over every published text artifact — for f in $(find /var/lib/cc-ci-runs -maxdepth 2 \( -name results.json -o -name summary.html -o -name badge.svg \)); do grep -EaoH 'password|passwd|secret|token|api_key|privkey|BEGIN [A-Z ]*PRIVATE KEY|AKIA[0-9A-Z]{16}|[0-9a-f]{40}' "$f"; done → the ONLY matches are the no_secret_leak JSON field + the ✔ no secret leak card label (a flag name, not a value); zero real secret values. Same scan over all bot comments on custom-html PR#2 → 0. The embedded screenshots are the U1/U4-verified secret-safe setup/landing pages (empty credential fields). (You are the R7 leak authority — this is my own pre-claim scan.)
  5. R7 comment text-fallback (render fail → text, not a broken image): unit-covered (tests/unit/test_bridge_trigger.py::test_result_comment_text_fallback_when_card_missing) + the bridge checks artifact_available (HEAD) before embedding (U3-verified structurally).
  6. Unit tests (cold): cc-ci-run -m pytest tests/unit/test_dashboard.py tests/unit/test_card.py tests/unit/test_bridge_trigger.py tests/unit/test_screenshot.py tests/unit/test_level.py tests/unit/test_results.py -q → all green (11+8+7+3+15+13).

EXPECTED. (1) badges render with level colour + status fallback; (2) docs complete, no TODOs; (3) render-kill: exit 0, install pass, results.json intact, no card/screenshot; (4) leak scan: only the flag name/label, zero real values, 0 in comments; (6) all unit tests green.
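
A sketch of the level-first, status-fallback badge from step 1. The two label/message shapes ("cc-ci: custom-html | level 4" and "cc-ci | unknown") and the level-4 fill #a0b93f are from this file; the function names, the SVG layout, the width estimate and the grey fallback colour are assumptions, and the rest of the level palette is deliberately left to the caller.

```python
from html import escape
from typing import Optional

FALLBACK_FILL = "#9f9f9f"  # assumed neutral grey for the status fallback


def badge_text(recipe: str, level: Optional[int], status: str = "unknown") -> tuple[str, str]:
    """Prefer the latest-run level; fall back to a plain status badge."""
    if level is None:
        return "cc-ci", status
    return f"cc-ci: {recipe}", f"level {level}"


def render_badge_svg(label: str, message: str, fill: str) -> str:
    """Deterministic two-box badge; widths are a crude per-character estimate."""
    lw = 6 * len(label) + 10
    mw = 6 * len(message) + 10
    font = 'font-family="sans-serif" font-size="11" fill="#fff"'
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{lw + mw}" height="20">'
        f'<rect width="{lw}" height="20" fill="#555"/>'
        f'<rect x="{lw}" width="{mw}" height="20" fill="{escape(fill, quote=True)}"/>'
        f'<text x="5" y="14" {font}>{escape(label)}</text>'
        f'<text x="{lw + 5}" y="14" {font}>{escape(message)}</text>'
        "</svg>"
    )
```

For example, render_badge_svg(*badge_text("custom-html", 4), "#a0b93f") yields the level badge, and render_badge_svg(*badge_text("keycloak", None), FALLBACK_FILL) the fallback one.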

The cardinal U5 invariant: cosmetics (card, screenshot, badge, comment image) never block/fail a run or change its verdict — they degrade to text/omission; and no published artifact leaks a secret.
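
A Python rendering of the step-4 leak scan, as a rough equivalent of the grep. The pattern is copied from the command above; the helper name real_hits and the way the two known flag labels are filtered out (a fixed window around each match) are assumptions, not the procedure actually used for the gate.

```python
import re

SECRET_RE = re.compile(
    r"password|passwd|secret|token|api_key|privkey|"
    r"BEGIN [A-Z ]*PRIVATE KEY|AKIA[0-9A-Z]{16}|[0-9a-f]{40}"
)
# The only matches expected in published artifacts: a flag name and its card label.
BENIGN = ("no_secret_leak", "no secret leak")


def real_hits(text: str) -> list[str]:
    """Return pattern matches that are not part of a known benign flag label."""
    hits = []
    for m in SECRET_RE.finditer(text):
        # Both labels are "no" + separator + match + separator + "leak":
        # 3 characters before the match and 5 after cover them exactly.
        window = text[max(0, m.start() - 3): m.end() + 5]
        if any(label in window for label in BENIGN):
            continue
        hits.append(m.group(0))
    return hits
```

An empty list for every results.json, summary.html, badge.svg and comment body corresponds to the "zero real secret values" outcome in EXPECTED (4).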

Adversary U5 PASS @15b3057 (2026-05-31T13:13Z) — all R1–R8 verified <24h, no VETO → STATUS-3 ## DONE flipped.

DONE

Phase 3 complete. All R1–R8 Adversary-verified (U0–U5 all PASS, no VETO, all within 24h).

  • R1 (level ladder) ← U0 PASS @07:05Z
  • R2 (image PR comment) ← U3 PASS @09:51Z
  • R3 (summary card) ← U2+U3+U4 PASS @07:48Z+09:51Z+10:04Z
  • R4 (screenshot) ← U1 PASS @07:15Z
  • R5 (dashboard polish) ← U4 PASS @10:04Z
  • R6 (badges) ← U5 PASS @13:13Z
  • R7 (safe & robust) ← U1+U2+U3+U5
  • R8 (docs) ← U5 PASS @13:13Z

Note — Drone repo reactivation (infra, recorded for the Adversary)

The Hetzner-migration Drone DB reset left recipe-maintainers/cc-ci inactive (bridge log drone trigger failed 404); the bridge can't trigger builds when the repo is inactive. I reactivated it (in-scope reconfig of my own CI, reversible): POST /api/user/repos?async=false then POST /api/repos/recipe-maintainers/cc-ci → active=true, config_path .drone.yml, timeout 60. This is why builds #1–#4 above exist (counter reset to 1 by the DB reset). Self-heal hardening filed as BACKLOG-3 U3.3 (fold activation into the drone reconcile) — not a U3 DoD item.