# Phase 3 — Beautiful YunoHost-style results — STATUS

SSOT: `/srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md`. DoD = R1–R8. Milestones U0–U5.
State files (this phase): `machine-docs/{STATUS,BACKLOG,REVIEW,JOURNAL}-3.md`. DECISIONS.md shared.

**WHAT + HOW + EXPECTED + WHERE live here; WHY → JOURNAL-3.md.**
## Phase context
- Phase 2b is `## DONE` (Adversary-verified, no VETO). Phase 3 kicked off **manually by the operator**.
  Note for honesty: Phase-2 `## DONE` not yet flipped (REVIEW-2 standing VETO on full Phase-2 DONE
  authorization); cross-phase sequencing is an operator call. Adversary concurs it's not a P3 blocker
  (REVIEW-3 @05:42Z).
- **Pre-existing repo-wide lint is RED on origin/main** (94 files `ruff format`-dirty + 36 `ruff check`
  errors; confirmed on cc-ci CI devshell against clean `origin/main`, ruff 0.7.3). This predates Phase 3
  and is NOT introduced by my work — my NEW Phase-3 files are fully `ruff`-clean, and I left
  `run_recipe_ci.py` with fewer ruff errors than main (1 vs 4). Flagged for the operator; not a Phase-3
  DoD item, and the U0 gate is verified by unit tests + real-run results.json, not repo-wide lint.

---
## Gate: U0 — PASS (Adversary REVIEW-3 @18d2bd1, 2026-05-31; R1 cold-verified, no VETO) (Results schema + level)

**WHAT.** `run_recipe_ci.py` now emits a per-run `results.json` with per-stage AND per-test ✔/✘
breakdown and a computed integer **level** (L0–L6, YunoHost gap-caps semantics). DoD R1 (level ladder)
satisfied; U0 milestone acceptance ("level correct for a recipe through L4 and one capped at L2")
demonstrated on two real end-to-end runs.

**WHERE (commits / files).**
- `9773e3f` `runner/harness/level.py` — pure `compute_level(rungs)->(level,cap_reason)` + helpers
  `backup_restore_status`, `tier_to_rung`. `tests/unit/test_level.py` (15 tests).
- `52e5d21` `runner/harness/results.py` — JUnit-XML parse, `collect_stages`, `derive_rungs` (the
  tier+deps/SSO→rung translation), `build_results`, `write_results`. `tests/unit/test_results.py`
  (13 tests). `runner/run_recipe_ci.py` — tiers emit `--junitxml` + append `{tier,source,file,rc,junit}`
  records; `main()` assembles+writes results.json wrapped so a failure NEVER changes the verdict (R7),
  incl. a narrow self leak-scan of the serialised artifact.
- `757511e` `machine-docs/DECISIONS.md` (Phase-3 section) — the documented ladder + exact rung-mapping
  contract `derive_rungs` implements + results.json schema + artifact-hosting decision.

**HOW to verify (cold, from your clone on cc-ci).**
1. **Unit tests** (deterministic; also fuzz-verifiable):
   `cc-ci-run -m pytest tests/unit/test_level.py tests/unit/test_results.py -q`
2. **Real-run L2-cap** (stateless, not backup-capable, ≥2 versions):
   `RECIPE=custom-html-tiny STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-cht cc-ci-run runner/run_recipe_ci.py`
   then read `/var/lib/cc-ci-runs/adv-cht/results.json`.
3. **Real-run L4-pass** (backup-capable, 3 functional tests, no deps):
   `RECIPE=uptime-kuma STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-uk cc-ci-run runner/run_recipe_ci.py`
   then read `/var/lib/cc-ci-runs/adv-uk/results.json`.
   (Compare the `level`/`rungs` against the `results` dict + DECISIONS contract — a level greener than
   the tiers would be a FAIL. Verify clean teardown: no orphan `*-pr*`/recipe service after.)

**EXPECTED.**
1. `28 passed`.
2. custom-html-tiny: `level=2`, `level_cap_reason="L3 backup/restore (data integrity) N/A"`,
   `rungs={install:pass, upgrade:pass, backup_restore:na, functional:na, integration:na, recipe_local:na}`,
   `results={install:pass, upgrade:pass, backup:skip, restore:skip, custom:skip}`,
   `flags={clean_teardown:true, no_secret_leak:true}`, stages=[install,upgrade] each w/ per-test rows.
   (My run: `/var/lib/cc-ci-runs/u0-cht-L2/results.json`.)
3. uptime-kuma: `level=4`, `level_cap_reason="L5 integration (SSO/OIDC + cross-app) N/A"`,
   `rungs={install:pass, upgrade:pass, backup_restore:pass, functional:pass, integration:na, recipe_local:na}`,
   all five tiers pass, `flags.clean_teardown=true`, stages=[install,upgrade,backup,restore,custom]
   with per-test rows (incl. 3 uptime-kuma functional tests, source `cc-ci`).
   (My run: `/var/lib/cc-ci-runs/u0-uk-L4/results.json`.)

These two bracket the gate: a recipe whose functional tests **pass** is still capped at **L2** when a
lower rung (L3 backup) is N/A (gap-caps; never inflates), and a full clean climb with no SSO surface
caps at **L4**.

---
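As a rough illustration of the gap-caps rule that `compute_level` is described as implementing for U0, the sketch below walks the ladder from the bottom and stops at the first rung that is not `pass`. It is not the code in `runner/harness/level.py`: the `LADDER` constant and the simplified cap-reason strings are assumptions made for illustration (the real reasons carry longer labels such as "L3 backup/restore (data integrity) N/A"), and the rung names are taken from the `rungs` dicts in the U0 EXPECTED list.

```python
from typing import Dict, Optional, Tuple

# Ladder order, lowest rung first. Names come from the `rungs` dicts in the
# U0 EXPECTED list; the constant itself is an assumption of this sketch.
LADDER = [
    "install",
    "upgrade",
    "backup_restore",
    "functional",
    "integration",
    "recipe_local",
]


def compute_level(rungs: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (level, cap_reason) under gap-caps semantics.

    The level is the number of consecutive passing rungs counted from the
    bottom of the ladder. The first rung that is not "pass" caps the level,
    even if higher rungs passed.
    """
    level = 0
    for index, rung in enumerate(LADDER, start=1):
        status = rungs.get(rung, "na")  # a missing rung is treated as N/A
        if status != "pass":
            label = "N/A" if status == "na" else status
            return level, f"L{index} {rung} {label}"
        level = index
    return level, None  # every rung passed, nothing caps the level


if __name__ == "__main__":
    capped = {"install": "pass", "upgrade": "pass", "backup_restore": "na",
              "functional": "pass", "integration": "na", "recipe_local": "na"}
    print(compute_level(capped))
```

A rung above the gap never raises the level, which is why a passing `functional` rung cannot lift a run past an N/A `backup_restore`.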
## Gate: U1 — PASS (Adversary REVIEW-3 @74a6993, 2026-05-31; R4 cold-verified, no VETO) (App screenshot)

**WHAT.** The harness now captures a **real Playwright screenshot of the deployed app** while it is
up (after deploy+health/readiness, before any tier mutates state, before teardown) and writes it to
the run artifact dir as `screenshot.png`. The capture is **secret-safe by default** (it shoots the
app **landing page**, never a credentials page; a recipe opts into a post-login view via an optional
`SCREENSHOT` meta hook that owns the no-secret-page guarantee — none used yet). It is **best-effort**:
`capture()` swallows every error and returns `None`, so it NEVER blocks/fails/hangs the run (R7); the
`results.json` `screenshot` field is set to `"screenshot.png"` ONLY when the capture actually produced
a file, else stays `null`. U1 milestone acceptance ("screenshot of a sample recipe shows the working
UI, no secrets") demonstrated on a real uptime-kuma run; graceful-degradation (R7) demonstrated on an
unreachable-domain capture.

**WHERE (commits / files).**
- `5fa15d4` `runner/run_recipe_ci.py` — imports `screenshot as screenshot_mod`; after deploy+readiness
  and OUTSIDE the deploy try/except (so a screenshot issue can never flip `deploy_ok`), under
  `if deploy_ok:` calls `screenshot_mod.capture(domain, screenshot_path(run_artifact_dir), recipe_meta=meta)`
  and sets `screenshot_rel`; passes `screenshot=screenshot_rel` into `build_results(...)`.
- `daa7edd` `runner/harness/screenshot.py` — `capture()` (default landing-page nav via
  `browser.goto_with_retry`, 45s deadline cap; optional `SCREENSHOT` hook), `screenshot_path()`,
  `_load_screenshot_hook()`. `tests/unit/test_screenshot.py` (pure helpers; 3 tests).

**HOW to verify (cold, from your clone on cc-ci).**
1. **Pure-helper unit tests:** `cc-ci-run -m pytest tests/unit/test_screenshot.py -q`
2. **Real positive capture** (working UI, no secret): `rm -rf /var/lib/cc-ci-runs/adv-u1 &&
   RECIPE=uptime-kuma STAGES=install CCCI_RUN_ID=adv-u1 cc-ci-run runner/run_recipe_ci.py`
   then `scp` back `/var/lib/cc-ci-runs/adv-u1/screenshot.png` and EYEBALL it; check
   `/var/lib/cc-ci-runs/adv-u1/results.json` has `"screenshot":"screenshot.png"`. Confirm NO orphan
   service after (`docker service ls | grep -i uptime` empty = clean teardown).
3. **Graceful degradation (R7)** — capture against an unreachable host returns None, never raises:
   `cc-ci-run -c 'import sys; sys.path.insert(0,"runner"); from harness import screenshot as S;
   print(S.capture("adv-u1-noexist.ci.commoninternet.net","/tmp/x.png"))'` → prints `None` (≈45s),
   no /tmp/x.png produced.

**EXPECTED.**
1. `3 passed` (test_screenshot.py has 3 pure-helper tests; corrected from an earlier "4" over-count
   per the Adversary's honest-reporting flag, REVIEW-3 @74a6993 — doc-only, no behavioural impact).
2. `screenshot.png` ~30 KB showing uptime-kuma's **"Uptime Kuma / Create your admin account"**
   landing page with **EMPTY** username/password/repeat fields (a setup form — it asks the user to
   set a password; it does NOT display any generated secret), i.e. real working app UI, no secret
   values. results.json `screenshot="screenshot.png"`, `flags.clean_teardown=true`; no orphan service.
   (My run: `/var/lib/cc-ci-runs/u1-uk-shot/{screenshot.png,results.json}`.)
3. `None` returned after the 45s deadline, no file written, no exception — proving a screenshot
   failure leaves the run/verdict untouched (cosmetics never block, R7). (My check log: capture
   "failed (non-fatal, verdict unaffected)" → `GRACEFUL_DEGRADATION= True`.)

The cardinal Phase-3 invariant for U1: the screenshot is a faithful capture of the live app, never a
credentials page, and its presence/absence never changes the verdict.

---
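The best-effort contract described for U1 (swallow every error, return `None`, set the `screenshot` field only when a file really exists) can be sketched without Playwright. In the sketch below the `shoot` callable stands in for the real browser capture and is an assumption, as are the names `capture_best_effort` and `screenshot_field`; the actual `capture()` in `runner/harness/screenshot.py` takes different arguments.

```python
from pathlib import Path
from typing import Callable, Optional


def capture_best_effort(
    shoot: Callable[[str, Path], None], url: str, out_path: Path
) -> Optional[Path]:
    """Call shoot(url, out_path); return out_path only if a file was produced.

    Any exception is logged and swallowed so a cosmetic failure can never
    propagate into the run verdict.
    """
    try:
        shoot(url, out_path)
    except Exception as exc:  # deliberately broad: cosmetics never block
        print(f"screenshot capture failed (non-fatal, verdict unaffected): {exc}")
        return None
    if out_path.is_file() and out_path.stat().st_size > 0:
        return out_path
    return None


def screenshot_field(captured: Optional[Path]) -> Optional[str]:
    """Value for the results.json `screenshot` field: file name or None."""
    return captured.name if captured is not None else None


if __name__ == "__main__":
    def unreachable(url: str, path: Path) -> None:
        raise TimeoutError("host did not answer before the deadline")

    shot = capture_best_effort(unreachable, "https://example.invalid", Path("x.png"))
    print(screenshot_field(shot))
```

Deriving the field from the returned path, not from the fact that a capture was attempted, is what keeps `results.json` from advertising a screenshot that was never written.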
## Gate: U2 — PASS (Adversary REVIEW-3 @324d84d, 2026-05-31; R3/R6 partial cold-verified, no VETO) (Summary card + badge)

**WHAT.** Each run now renders a **summary card PNG** (recipe+version, level badge, per-stage/per-test
✔/✘ table, embedded **real app screenshot**) and an **SVG level badge**, written into the run artifact
dir and **served at stable URLs**
`https://ci.commoninternet.net/runs/<run_id>/{summary.png,badge.svg,screenshot.png,results.json}`.
The card REPORTS results.json verbatim — it computes nothing, so it can
never look greener than the tiers (cardinal invariant). U2 acceptance ("card + badge render correctly
for a pass run AND a fail run") demonstrated: a real PASS run served live; a deterministic FAIL render
shown honest (L0/red/✘/no-screenshot).

**WHERE (commits / files).**
- `afe5e51` `runner/run_recipe_ci.py` — after results.json is written, a separate best-effort block
  renders `summary.html`→`summary.png` + `badge.svg` via `harness.card` (passes
  `screenshot_rel=data["screenshot"]` so the real shot embeds iff present). R7-wrapped — any failure
  is swallowed, never changes `overall`.
- `daa7edd`/`7217e0c`/`8179d3f` `runner/harness/card.py` — pure `render_card_html`, `render_badge_svg`/
  `level_badge_svg` (deterministic string builders), `render_card_png` (best-effort Playwright). Inline
  SVG sunflower (headless chromium has no colour-emoji font). `tests/unit/test_card.py` (8 tests).
- `fa56f6b` `dashboard/dashboard.py` + `nix/modules/dashboard.nix` — `/runs/<id>/<file>` route
  (allow-list + `run_id` regex + realpath-inside-runs-dir traversal guard); `/var/lib/cc-ci-runs`
  bind-mounted READ-ONLY into the dashboard swarm service; `CCCI_RUNS_DIR` env.

**HOW to verify (cold).** (See ADVERSARY-INBOX for the deploy gotcha — do NOT `nixos-rebuild switch`
the live host; `#cc-ci` targets the hetzner migration host. U2.3 was rolled via the dashboard module
reconcile only. DECISIONS.md Phase-3/U2 has the `diff-closures` evidence.)
1. **Unit tests:** `cc-ci-run -m pytest tests/unit/test_card.py -q` → `8 passed`.
2. **PASS card served live (real):**
   `curl -s -o /tmp/c.png -w '%{http_code} %{content_type} %{size_download}\n'
   https://ci.commoninternet.net/runs/u1-uk-shot/summary.png` → `200 image/png ~69313`. Eyeball
   `/tmp/c.png`: uptime-kuma, **orange LEVEL 1**, "capped: L2 upgrade N/A", install/test_serving ✔
   PASS rows, clean-teardown+no-secret-leak flags, and the **real uptime-kuma screenshot embedded**.
   Also `…/screenshot.png` (200 ~30858), `…/badge.svg` (200 image/svg+xml), `…/results.json` (200).
3. **Traversal/whitelist guard:** `…/runs/u1-uk-shot/../../../etc/passwd`, `…/runs/u1-uk-shot/evil.sh`,
   `…/runs/nonexist/results.json` → **404** with a **9-byte** body (the dashboard's own "not found",
   NOT Traefik's 19-byte 404 — proves the request reached the app and the guard rejected it).
4. **FAIL render is honest (cardinal invariant):** feed the card a fail dict (cmd in ADVERSARY-INBOX
   §3) → card shows **level 0**, `level_color(0)` (red), the **✘ FAIL** mark on the install row, and
   the **"no screenshot"** placeholder — never greener than the data.

**EXPECTED.** (1) `8 passed`. (2) PASS card 200/image-png/~69KB, embeds the real screenshot, level/marks
match results.json (`u1-uk-shot`: level 1, install pass). (3) all three guarded paths 404 with a 9B
body. (4) fail render: `>0<` (level 0), red colour, ✘ present, "no screenshot" present — no inflation.

The cardinal U2 invariant: the rendered card/level/badge are a faithful, never-greener projection of
results.json + the actual test outcomes, served at a stable URL, generated best-effort so a render
failure never blocks the run.
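The three checks named for the `/runs/<id>/<file>` route in U2 (file allow-list, `run_id` regex, realpath containment) could look roughly like the sketch below. The `resolve_artifact` name, the exact `RUN_ID_RE` pattern and the choice to return `None` for "not found" are assumptions for illustration, not the code in `dashboard/dashboard.py`; the allow-list entries are the four served files listed in the U2 WHAT paragraph.

```python
import os
import re
from typing import Optional

# The four artifacts the U2 gate says are served per run.
ALLOWED_FILES = {"summary.png", "badge.svg", "screenshot.png", "results.json"}

# Assumed pattern: letters, digits, "_" and "-" only, so "..", "/" and
# absolute paths can never appear in a run id.
RUN_ID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{0,63}")


def resolve_artifact(runs_dir: str, run_id: str, filename: str) -> Optional[str]:
    """Return the real path of an allowed artifact, or None (caller sends 404)."""
    if filename not in ALLOWED_FILES:
        return None
    if not RUN_ID_RE.fullmatch(run_id):
        return None
    root = os.path.realpath(runs_dir)
    candidate = os.path.realpath(os.path.join(root, run_id, filename))
    # Containment check after symlink resolution: the resolved file must
    # still live under the runs directory.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    print(resolve_artifact("/var/lib/cc-ci-runs", "../../etc", "passwd"))
```

Checking containment on the resolved path, after the allow-list and regex, means a symlink planted inside a run directory is rejected as well as a literal `../` in the URL.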
## Gate: U3 — PASS (Adversary REVIEW-3 @778b577, 2026-05-31T09:51Z; R2 cold-verified, no VETO) (YunoHost-style PR comment)

(Adversary cold-reproduced update-in-place via its own `!testme` → build #7; comment 13792 never
stacked; card == results.json, no inflation; no secrets. R3 "in comment" verified; R3 ticks at U4.)

**WHAT.** On a `!testme` run the bridge now posts/updates ONE Gitea PR comment in the YunoHost shape:
on run start a 🌻 + ⏳ **placeholder** ("level pending", live-logs link); on completion it edits the
**SAME** comment in place to 🌻 + a **level badge** image + a **summary card** image, BOTH linked to
the full run, plus full-logs/dashboard links. A re-`!testme` refreshes that same comment (back to ⏳,
then to the new result) — never stacks a new one (R2 "one comment per PR, updated in place"). Falls
back to a compact text verdict if the rendered card isn't served (R7). DoD **R2** satisfied; U3
acceptance ("live on a scratch PR — comment shows badge + card + screenshot, updates on re-run, no
secrets") demonstrated on a real scratch PR. (This also lands R3's "embedded in the comment"
sub-requirement; R3 still needs "in dashboard" at U4.)

**WHERE (commits / files).**
- `9a47aa2` `bridge/bridge.py` — `COMMENT_MARKER` (hidden HTML comment `<!-- cc-ci:testme -->`),
  `start_comment_body` (⏳ placeholder), `result_comment_body` (🌻 + badge + card, linked; text
  fallback), `find_existing_comment` (marker → update-in-place), `artifact_available` (HEAD existence
  check → image-vs-text), `watch_and_reflect` now edits to `result_comment_body`. Card/badge URLs are
  `${DASH_URL}/runs/<DRONE_BUILD_NUMBER>/{summary.png,badge.svg}` (run_id == Drone build number, see
  `runner/harness/results.py::run_id`).
- `9a47aa2` `dashboard/dashboard.py` — `do_HEAD` (shared `_route` with GET) so HEAD existence-checks +
  strict image clients get 200, not 501 (closes Adversary A3-1, already re-verified @8807240).
- `9a47aa2` `tests/unit/test_bridge_trigger.py` — covers placeholder shape, image-forward result,
  **text fallback when card missing**, marker-based find/update-in-place.
- **Deployed:** bridge swarm image `cc-ci-bridge:6377f9571f3b` == `sha256(bridge.py)` first-12 (content
  tag, confirmed live); dashboard image live with `do_HEAD`.

**HOW to verify (cold, from your clone / the VM).**
1. **Unit tests** (on cc-ci): `cc-ci-run -m pytest tests/unit/test_bridge_trigger.py tests/unit/test_card.py -q` → `15 passed`.
2. **Deployed bridge == source:** `ssh cc-ci 'sha256sum /etc/cc-ci/bridge/bridge.py | cut -c1-12'` →
   `6377f9571f3b`; `ssh cc-ci 'docker service ls | grep ccci-bridge'` shows image tag `6377f9571f3b`.
3. **LIVE demo on scratch PR** `recipe-maintainers/custom-html` **PR #2** (recipe == repo name; the
   bridge poller, 30s, fires on a NEW `!testme`). The bot comment carrying the marker is **id 13792**:
   `curl -s -u "$GITEA_USERNAME:$GITEA_PASSWORD" https://git.autonomic.zone/api/v1/repos/recipe-maintainers/custom-html/issues/comments/13792`
   → body has `<!-- cc-ci:testme -->`, 🌻, `✅ passed`, `[](…/4)`,
   `[](…/4)`, full-logs+dashboard links. (You may post your own `!testme`
   on PR #2 — the repo is active in Drone; it will refresh **the same** comment 13792.)
4. **Images render (served):** `for f in summary.png badge.svg screenshot.png results.json; do
   curl -s -o /dev/null -w "$f %{http_code}\n" https://ci.commoninternet.net/runs/4/$f; done` → all 200.
5. **Updates in place / no stacking:** the marked-comment set on PR #2 stays exactly `[13792]` across
   runs #3 (first `!testme`) and #4 (re-`!testme`); the comment cycled ⏳→result both times. (Filter
   comments for `<!-- cc-ci:testme -->` — there is exactly one.)
6. **No secrets:** scan the comment body + `/var/lib/cc-ci-runs/{3,4}/{results.json,summary.html}` for
   `password|secret|token|passwd|api_key|privkey|PRIVATE` → only the `no_secret_leak` flag-name matches;
   the embedded app screenshot is custom-html's **"Welcome to nginx!"** page (no values).
7. **No inflation:** the card for run #4 shows `level 4` / `capped: L5 integration N/A`, all
   install/upgrade/backup/restore/custom rows ✔ — matches `/runs/4/results.json` verbatim.

**EXPECTED.**
1. `15 passed`.
2. tag `6377f9571f3b` both places.
3. comment 13792 body exactly as above (run 4).
4. all four `/runs/4/` files 200 (`summary.png` ~178 KB, `badge.svg` 342 B, `screenshot.png` 35707 B).
5. exactly one marked comment (`13792`); no new comment stacked on re-run.
6. zero real secret hits.
7. card level 4, all rows ✔, == results.json (`recipe=custom-html`, `level=4`, all tiers pass,
   `flags.clean_teardown=true,no_secret_leak=true`).

The cardinal U3 invariant: ONE comment per PR, refreshed in place; the embedded card/badge are a
faithful never-greener projection of the run; image-gen failure degrades to text and never blocks the
run or the verdict.
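The marker-based update-in-place behaviour described for U3 could be sketched as follows. `COMMENT_MARKER` is the value given in the WHERE list; the `upsert_comment` helper, the `create`/`edit` callables (stand-ins for the Gitea API calls) and the comment dict shape are assumptions for illustration, and the real `find_existing_comment` in `bridge/bridge.py` may take different arguments.

```python
from typing import Callable, Dict, List, Optional

COMMENT_MARKER = "<!-- cc-ci:testme -->"  # hidden marker named in the U3 gate


def find_existing_comment(
    comments: List[Dict], marker: str = COMMENT_MARKER
) -> Optional[Dict]:
    """Return the first comment whose body carries the marker, if any."""
    for comment in comments:
        if marker in comment.get("body", ""):
            return comment
    return None


def upsert_comment(
    comments: List[Dict],
    body: str,
    create: Callable[[str], int],
    edit: Callable[[int, str], None],
) -> int:
    """Create the marked comment once, then edit that same comment afterwards.

    `create` and `edit` are hypothetical callables wrapping the forge API;
    the function returns the id of the single marked comment.
    """
    marked_body = f"{COMMENT_MARKER}\n{body}"
    existing = find_existing_comment(comments)
    if existing is None:
        return create(marked_body)
    edit(existing["id"], marked_body)
    return existing["id"]


if __name__ == "__main__":
    store: List[Dict] = []

    def create(text: str) -> int:
        store.append({"id": 1, "body": text})
        return 1

    def edit(comment_id: int, text: str) -> None:
        for item in store:
            if item["id"] == comment_id:
                item["body"] = text

    upsert_comment(store, "level pending", create, edit)
    upsert_comment(store, "level 4", create, edit)
    print(len(store))  # one comment, edited in place
```

Because the marker is part of every body the bridge writes, a restart that forgets the comment id can still find and refresh the same comment instead of stacking a new one.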
## Gate: U4 — PASS (Adversary REVIEW-3 @9ca39dc, 2026-05-31T10:04Z; R5 + R3-full cold-verified, no VETO) (Dashboard polish)

(Grid + history cold-verified never-greener vs results.json; honest #11 failure row (404 results.json
→ failure/level —/no card); no secrets; deployed == source; 9 tests. R5 satisfied, R3 fully satisfied.)

**WHAT.** The overview at `https://ci.commoninternet.net/` is now a **YunoHost-CI-style grid**: one
card per enrolled recipe showing a **level badge** (coloured by level), latest **pass/fail** status,
last-tested **version**, an **app screenshot thumbnail** (the run's `screenshot.png`, clickable →
the full `summary.png` card), the clean-teardown/no-secret-leak flags, and a **history** link. A new
per-recipe **history page** `/recipe/<name>` lists every run of that recipe (newest first): run #,
status, level, version, when, and a per-run card link. Every field is read from the run's
**`results.json`** (level/version/screenshot/flags) so the grid mirrors the artifact and is
**never greener than the run** (cardinal guardrail). It re-renders live on each request (30s cache +
auto-refresh), i.e. "regenerated on build completion". DoD **R5** satisfied; **R3** now also embedded
in the dashboard (was U3-verified in the comment) → R3 fully satisfied.

**WHERE (commits / files).**
- `e1d837e` `dashboard/dashboard.py` — `level_color`, `_results_for` (traversal-guarded results.json
  reader), `_custom_recipe_builds` (cached, shared by overview+history), `_build_row` (Drone build +
  results.json → display row), `latest_per_recipe` (augmented), `history_for`, `render_overview`
  (grid), `render_history`, `/recipe/<name>` route. `tests/unit/test_dashboard.py` (9 tests).
- **Deployed:** `cc-ci-dashboard:7b34ec8761df` (== `sha256(dashboard.py)` first-12, confirmed live),
  rolled via the dashboard **module reconcile** only (`nixos-rebuild build` non-activating →
  `cc-ci-reconcile-dashboard` = `docker load` + `docker stack deploy`). NOT `nixos-rebuild switch`
  (the `#cc-ci` config targets the migration host — DECISIONS Phase-3/U2; reconcile = zero host-config
  impact, reversible).

**HOW to verify (cold, from your clone / the VM).**
1. **Unit tests** (on cc-ci): `cc-ci-run -m pytest tests/unit/test_dashboard.py -q` → `9 passed`.
2. **Deployed == source:** `ssh cc-ci 'sha256sum /etc/cc-ci/dashboard/dashboard.py | cut -c1-12'` →
   `7b34ec8761df`; `docker service ls | grep ccci-dashboard` shows that tag.
3. **Live grid:** `curl -s https://ci.commoninternet.net/` (200) → two recipe cards: **custom-html**
   (level 4, success, `db9a95024e9d`, thumbnail `/runs/7/screenshot.png` linking `/runs/7/summary.png`,
   ✔ teardown / ✔ no-leak, `history →` `/recipe/custom-html`) and **uptime-kuma** (level 4, success,
   `dfed87a39f8a`, `/runs/12/...`).
4. **Live history:** `curl -s https://ci.commoninternet.net/recipe/custom-html` (200) → rows #7/#4/#3/#1
   each L4/success/version + per-run `card` link to `/runs/<n>/summary.png`; `…/recipe/uptime-kuma` →
   #12 (success L4) **and #11 (failure, level —, no card)** — a real failed run shown honestly (it
   failed at `fetch_recipe` on a bogus ref, wrote no results.json → grid shows failure/level —).
5. **No inflation (cardinal):** each card's level/status/version == `/runs/<n>/results.json`
   (`curl -s https://ci.commoninternet.net/runs/7/results.json` → custom-html level 4 all-pass;
   `/runs/12/results.json` → uptime-kuma level 4 all-pass). A failed/absent run shows `level —` +
   the failure pill + the "no screenshot" placeholder — never a level/screenshot it didn't earn.
6. **No secrets (R7):** scan the grid + both history pages → only the `title="no secret leak"` flag
   label matches `secret`; embedded thumbnails are the U1-verified secret-safe landing pages.
7. **HEAD parity:** `curl -sI https://ci.commoninternet.net/` and `…/recipe/custom-html` → 200
   (`do_HEAD` shares `_route` with GET; A3-1 stays closed).

**EXPECTED.** (1) `9 passed`. (2) tag `7b34ec8761df` both places. (3) grid 200 with the two cards as
described; (4) history 200 with the run rows + card links incl. the honest uptime-kuma failure row;
(5) card fields == results.json (custom-html L4, uptime-kuma L4); (6) zero real secret hits; (7) HEAD 200.

The cardinal U4 invariant: the grid + history are a faithful, never-greener projection of each run's
`results.json`; a failed/levelless run is shown as such (no inflated level, no screenshot it didn't
produce); rendering is read-only over the RO-bind-mounted artifacts.
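The "never greener than the run" rule for a grid or history row at U4 can be illustrated with a small sketch: the row starts from honest defaults and only copies what `results.json` actually contains. The function names, the `number`/`status` build keys and the row shape are assumptions for illustration; the real `_build_row` and `_results_for` in `dashboard/dashboard.py` are not shown in this document, and the sketch omits the traversal guard.

```python
import json
import os
from typing import Any, Dict, Optional

NO_LEVEL = "\u2014"  # the dash shown for a run that earned no level


def load_results(runs_dir: str, run_id: Any) -> Optional[Dict[str, Any]]:
    """Read <runs_dir>/<run_id>/results.json; None if absent or unreadable."""
    path = os.path.join(runs_dir, str(run_id), "results.json")
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):  # missing file or invalid JSON
        return None
    return data if isinstance(data, dict) else None


def build_row(build: Dict[str, Any], results: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge a CI build record with its results.json into a display row.

    Level, version and screenshot come only from results.json; without it
    the row keeps the build status and shows no level and no screenshot.
    """
    row: Dict[str, Any] = {
        "number": build["number"],
        "status": build["status"],
        "level": NO_LEVEL,
        "version": None,
        "screenshot": None,
    }
    if results is not None:
        row["level"] = results.get("level", NO_LEVEL)
        row["version"] = results.get("version")
        if results.get("screenshot"):
            row["screenshot"] = f"/runs/{build['number']}/{results['screenshot']}"
    return row


if __name__ == "__main__":
    failed = build_row({"number": 11, "status": "failure"}, None)
    print(failed["level"], failed["screenshot"])
```

A run that wrote no `results.json`, like the failed uptime-kuma build #11 above, therefore renders with the dash and no thumbnail instead of inheriting anything from an earlier run.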
## Gate: U5 — CLAIMED, awaiting Adversary (Badges + docs + hardening; R6, R7, R8 — FINAL gate)

**WHAT.** The last milestone: (a) **R6** — a per-recipe **latest-level badge** endpoint
`/badge/<recipe>.svg` (shields-style, coloured by level, embeddable in a recipe README; falls back to
a status badge for a recipe with no level yet); (b) **R8** — `docs/results-ux.md` now fully explains
the level ladder + tier→rung mapping, results.json schema, card/screenshot generation, the PR-comment
shape, and the badge endpoints + README embed snippet; (c) **R7 hardening** — render failure degrades
to text/omission and **never affects the verdict**, proven by a forced render-kill run; a broad secret
scan over every published artifact + all PR comments finds **zero** real secret values; plus a new
defense-in-depth try/except around the screenshot call site so a screenshot can never crash the run.

**WHERE (commits / files).**
- `91a69b8` `dashboard/dashboard.py` — `render_level_badge` + `_badge_svg`; `/badge/<recipe>.svg`
  route prefers the latest-run level (from results.json), status fallback. Deployed
  `cc-ci-dashboard:8acd8b9cc51c` (== `sha256(dashboard.py)`, confirmed live). `tests/unit/test_dashboard.py`
  (+2 badge tests → 11 total).
- `91a69b8` `docs/results-ux.md` §1-5 complete (R8).
- `799cceb` `runner/run_recipe_ci.py` — defense-in-depth try/except around `screenshot_mod.capture`
  call site (R7); a screenshot raise is now caught + logged non-fatal, verdict unaffected.

**HOW to verify (cold, from your clone / the VM).**
1. **R6 per-recipe level badge (live):**
   `curl -s https://ci.commoninternet.net/badge/custom-html.svg` → SVG `cc-ci: custom-html | level 4`,
   message-box `fill="#a0b93f"` (= `level_color(4)`); `…/badge/uptime-kuma.svg` → `level 4`;
   `…/badge/keycloak.svg` (no runs) → 200, status-fallback `cc-ci | unknown`. README embed snippet in
   `docs/results-ux.md` §5.
2. **R8 docs:** read `docs/results-ux.md` — §1 ladder + tier→rung mapping, §2 schema, §3 card+screenshot +
   stable URLs, §4 PR comment, §5 badges + embed snippet. No remaining TODOs.
3. **R7 render-kill degradation (verdict unaffected) — reproduce:** drive `run_recipe_ci.main()` with
   the orchestrator-side cosmetic renderers forced to raise but the real (subprocess) test browser
   intact — monkeypatch `run_recipe_ci.card_mod.render_card_html`/`render_card_png` and
   `run_recipe_ci.screenshot_mod.capture` to raise, `RECIPE=custom-html STAGES=install`. Result
   (`/var/lib/cc-ci-runs/u5-renderkill3` from my run): **EXIT 0**, install **pass** (test_serving +
   test_serving_and_content PASSED — real browser unaffected), `results.json` written
   (`level=1, install=pass, screenshot=null`), and **NO summary.png / NO screenshot.png** — both
   cosmetic failures swallowed (`screenshot capture raised (non-fatal…)` + `summary card/badge render
   failed (non-fatal)`). A renderer kill cannot change the verdict or block the run.
   (Note: globally breaking the *browser path* instead — `/var/lib/cc-ci-runs/u5-renderkill2` — fails
   the install tier, because custom-html's `test_serving_and_content` is a REAL browser test; that is a
   real test failing correctly, NOT a cosmetics-vs-verdict datapoint. The clean isolation above breaks
   ONLY the cosmetic renderers.)
4. **R7 broad leak scan:** over every published text artifact —
   `for f in $(find /var/lib/cc-ci-runs -maxdepth 2 \( -name results.json -o -name summary.html -o -name badge.svg \)); do grep -EaoH 'password|passwd|secret|token|api_key|privkey|BEGIN [A-Z ]*PRIVATE KEY|AKIA[0-9A-Z]{16}|[0-9a-f]{40}' "$f"; done`
   → the ONLY matches are the `no_secret_leak` JSON field + the `✔ no secret leak` card label (a
   flag name, not a value); **zero real secret values**. Same scan over all bot comments on
   custom-html PR#2 → **0**. The embedded screenshots are the U1/U4-verified secret-safe setup/landing
   pages (empty credential fields). (You are the R7 leak authority — this is my own pre-claim scan.)
5. **R7 comment text-fallback** (render fail → text, not a broken image): unit-covered
   (`tests/unit/test_bridge_trigger.py::test_result_comment_text_fallback_when_card_missing`) + the
   bridge checks `artifact_available` (HEAD) before embedding (U3-verified structurally).
6. **Unit tests** (cold): `cc-ci-run -m pytest tests/unit/test_dashboard.py tests/unit/test_card.py
   tests/unit/test_bridge_trigger.py tests/unit/test_screenshot.py tests/unit/test_level.py
   tests/unit/test_results.py -q` → all green (11+8+7+3+15+13).

**EXPECTED.** (1) badges render with level colour + status fallback; (2) docs complete, no TODOs;
(3) render-kill: exit 0, install pass, results.json intact, no card/screenshot; (4) leak scan: only the
flag name/label, zero real values, 0 in comments; (5) text-fallback unit test green, bridge HEAD-checks
before embedding; (6) all unit tests green.

The cardinal U5 invariant: cosmetics (card, screenshot, badge, comment image) **never** block/fail a
run or change its verdict — they degrade to text/omission; and no published artifact leaks a secret.

**When the Adversary's U5 PASS lands and REVIEW-3 shows all R1–R8 verified <24h with no VETO → I flip
STATUS-3 to `## DONE`.**
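For readers who prefer to re-run the U5 leak scan (step 4) outside the shell, the sketch below applies the same pattern in Python and filters out the two benign flag spellings that the scan is reported to match. The regular expression is copied from the `grep -E` command in step 4; the `real_hits` helper and the `BENIGN` tuple are assumptions for illustration, not part of the harness.

```python
import re
from typing import List

# Same alternation as the grep -E pattern in the U5 leak-scan step.
SECRET_RE = re.compile(
    r"password|passwd|secret|token|api_key|privkey"
    r"|BEGIN [A-Z ]*PRIVATE KEY|AKIA[0-9A-Z]{16}|[0-9a-f]{40}"
)

# The only matches the U5 scan reports: a flag name and a card label.
BENIGN = ("no_secret_leak", "no secret leak")


def real_hits(text: str) -> List[str]:
    """Return matches that are not part of a benign flag name or label."""
    hits = []
    for match in SECRET_RE.finditer(text):
        # Look 3 characters back ("no_" / "no ") and 5 ahead ("_leak" /
        # " leak") so the flag spelling is recognised around "secret".
        window = text[max(0, match.start() - 3): match.end() + 5]
        if any(flag in window for flag in BENIGN):
            continue
        hits.append(match.group(0))
    return hits


if __name__ == "__main__":
    print(real_hits('{"flags": {"no_secret_leak": true}}'))
    print(real_hits("db_password=hunter2"))
```

The filter is deliberately narrow: it only excuses the word `secret` when it sits inside one of the two known flag spellings, so any other occurrence still surfaces for manual review.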
## In flight
(none — U5 (final gate) claimed; parked awaiting the Adversary. On U5 PASS + all R1–R8 verified → DONE.)

## Blocked
(none)

## Note — Drone repo reactivation (infra, recorded for the Adversary)
The Hetzner-migration Drone DB reset left `recipe-maintainers/cc-ci` **inactive** (bridge log `drone
trigger failed 404`); the bridge can't trigger builds when the repo is inactive. I reactivated it
(in-scope reconfig of my own CI, reversible): `POST /api/user/repos?async=false` then
`POST /api/repos/recipe-maintainers/cc-ci` → `active=true`, config_path `.drone.yml`, timeout 60. This is
why builds #1–#4 above exist (counter reset to 1 by the DB reset). Self-heal hardening filed as
BACKLOG-3 U3.3 (fold activation into the drone reconcile) — not a U3 DoD item.
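The two reactivation calls in the note above could be scripted roughly as follows. The sketch only builds the requests and does not send them; the server URL, the token and the `reactivation_requests` helper are placeholders assumed for illustration, and the two endpoint paths are the ones quoted in the note. It does not set `config_path` or the timeout.

```python
import urllib.request
from typing import List


def reactivation_requests(
    server: str, token: str, slug: str = "recipe-maintainers/cc-ci"
) -> List[urllib.request.Request]:
    """Build (but do not send) the sync and activate requests for a repo."""
    headers = {"Authorization": f"Bearer {token}"}
    sync = urllib.request.Request(
        f"{server}/api/user/repos?async=false", method="POST", headers=headers
    )
    activate = urllib.request.Request(
        f"{server}/api/repos/{slug}", method="POST", headers=headers
    )
    return [sync, activate]


if __name__ == "__main__":
    # Placeholder server and token; pass each request to
    # urllib.request.urlopen(...) to actually perform the call.
    for request in reactivation_requests("https://drone.example.org", "TOKEN"):
        print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the step easy to dry-run before folding it into a reconcile job, as BACKLOG-3 U3.3 proposes.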