# cc-ci Results UX — level ladder, summary card, screenshot & badges (Phase 3, R8)

This doc explains how a cc-ci run is presented: the **level** a run earns, the **summary card** +
**app screenshot** rendered for it, the **PR comment** it posts, and the **badges** you can embed.
It is the R8 reference for Phase 3 (`plan-phase3-results-ux.md`).

> Presentation never changes the verdict. The level and card *report* the test outcomes; they can
> only ever understate, never overstate, what the tests actually verified (the cardinal guardrail).
> The authoritative pass/fail is the run's exit status + the per-tier results; the level is a summary.

---

## 1. The level ladder (R1)

Every run earns a single integer **level 0–6**. The ladder is cumulative with **YunoHost
gap-caps-the-level** semantics: you earn level `L` only if **every rung 1..L was a clean PASS**.
The first rung that is not a clean PASS — a real **FAIL** *or* genuinely **N/A** for this recipe —
stops the climb, and `level_cap_reason` records which rung and why.

| Level | Rung | Earned when |
|------:|------|-------------|
| **L0** | — | install failed / the app never became healthy. |
| **L1** | install | deploys and passes health/readiness. |
| **L2** | upgrade | previous published version → PR/latest, stays healthy, data intact. |
| **L3** | backup/restore | seeded data survives backup → wipe → restore. |
| **L4** | functional | the recipe-specific functional tests pass. |
| **L5** | integration | SSO/OIDC + cross-app integration tests pass. |
| **L6** | recipe-local | the recipe repo's own `tests/` (D4) pass and are merged. |

**N/A caps, fairly.** A rung that does not apply to a recipe (only one published version → no
upgrade; not backup-capable; no SSO/integration surface; no recipe-local tests) is **N/A**, which
caps the climb at the rung below it with a recorded reason — it is *not* counted as a failure.
This is the only fair reading of "a missing lower rung caps the level": e.g. a recipe with **no
integration surface caps at L4 by definition**, shown as
`level_cap_reason = "L5 integration … N/A"`. A stateless app whose functional tests pass but which
cannot be backed up is honestly capped at **L2** (`"L3 backup/restore … N/A"`) rather than shown
as L4 — understating is safe; overstating is forbidden.

Worked examples (real runs):

- `uptime-kuma` — install+upgrade+backup+restore+functional all pass, no SSO surface → **L4**
  (`cap = "L5 integration (SSO/OIDC + cross-app) N/A"`).
- `custom-html-tiny` — stateless, not backup-capable: install+upgrade pass, backup/restore N/A →
  **L2** (`cap = "L3 backup/restore (data integrity) N/A"`).

### How tiers map to rungs (the translation layer)

`run_recipe_ci.py` holds the run's per-tier results (`install/upgrade/backup/restore/custom`) +
deps/SSO signals; `runner/harness/results.py::derive_rungs` maps them to the rung-status dict that
`runner/harness/level.py::compute_level` scores. The mapping (also in `DECISIONS.md`, Phase 3):

- **install** ← install tier (pass/fail).
- **upgrade** ← upgrade tier; `skip` → **na** (only one published version).
- **backup_restore** ← backup AND restore tiers both pass → pass; either fail → fail; not
  backup-capable → **na**.
- **functional** ← the custom tier minus its SSO tests; a custom failure conservatively fails this
  rung (we don't split functional-vs-SSO failure → never inflate); no custom tests → **na**.
- **integration** ← applies only if the recipe declares deps; pass iff deps wired and SSO verified
  and custom didn't fail; recipes with no declared deps → **na** (the "caps at L4" rule).
- **recipe_local** ← the recipe repo's own `tests/` (discovery source `repo-local`) ran and passed;
  none present → **na**.

The pure scorer is exhaustively unit-tested + fuzz-verified (all 729 rung combinations: level ==
count of leading consecutive passes, zero inflation).
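
The scoring rule described in this section (level == count of leading consecutive passes, with the first non-pass rung recorded as the cap) can be sketched in a few lines. This is a minimal illustration, not the code in `runner/harness/level.py`: the function signature, the treatment of a missing rung as N/A, and the shortened cap-reason wording (the real runs show longer labels such as `"L5 integration (SSO/OIDC + cross-app) N/A"`) are all assumptions.

```python
from itertools import product

# Rung order follows the ladder table; position i (1-based) is level Li.
RUNGS = ["install", "upgrade", "backup_restore", "functional", "integration", "recipe_local"]


def compute_level(rungs):
    """Return (level, cap_reason) for a dict of rung -> 'pass' | 'fail' | 'na'."""
    level = 0
    for position, name in enumerate(RUNGS, start=1):
        status = rungs.get(name, "na")  # assumption: a missing rung counts as N/A
        if status != "pass":
            # The first rung that is not a clean PASS stops the climb.
            label = "FAIL" if status == "fail" else "N/A"
            return level, f"L{position} {name} {label}"  # simplified reason text
        level = position
    return level, None  # every rung passed: L6, nothing capped the climb


# The two worked examples from this section.
uptime_kuma = {**dict.fromkeys(RUNGS[:4], "pass"), "integration": "na", "recipe_local": "na"}
custom_html_tiny = {"install": "pass", "upgrade": "pass", "backup_restore": "na",
                    "functional": "pass", "integration": "na", "recipe_local": "na"}
print(compute_level(uptime_kuma))       # (4, 'L5 integration N/A')
print(compute_level(custom_html_tiny))  # (2, 'L3 backup_restore N/A')

# Exhaustive check over all 3**6 = 729 status combinations: the level must
# equal the number of leading passes, so a later pass can never inflate it.
checked = 0
for combo in product(("pass", "fail", "na"), repeat=len(RUNGS)):
    leading = 0
    for status in combo:
        if status != "pass":
            break
        leading += 1
    level, _reason = compute_level(dict(zip(RUNGS, combo)))
    assert level == leading
    checked += 1
print(checked)  # 729
```

Note how `custom_html_tiny` still has `functional: pass` but stops at L2: a pass above a gap never counts, which is the "understate, never overstate" property.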
### Invariant flags (shown, not climbed)

Two Phase-1 gating invariants are surfaced as flags on the card, not as ladder rungs:
`clean_teardown` (the run left no orphaned app/volume/secret and stayed within the deploy budget)
and `no_secret_leak` (no known secret value appears in the published artifact — the Adversary's
broader leak scan is the authority).

---

## 2. `results.json` (per run)

Each run writes `${CCCI_RUNS_DIR:-/var/lib/cc-ci-runs}//results.json` (`run_id` = the Drone build
number, or the run's unique app domain for a hand-run). Schema:

```json
{
  "schema": 1, "run_id": "...", "recipe": "...", "version": "...", "pr": "...", "ref": "...",
  "finished": 0.0,
  "level": 4, "level_cap_reason": "L5 integration (SSO/OIDC + cross-app) N/A",
  "rungs": {"install":"pass","upgrade":"pass","backup_restore":"pass","functional":"pass",
            "integration":"na","recipe_local":"na"},
  "stages": [{"name":"install","status":"pass",
              "tests":[{"name":"test_serving","status":"pass","ms":168,"source":"generic"}]}],
  "results": {"install":"pass","upgrade":"pass","backup":"pass","restore":"pass","custom":"pass"},
  "flags": {"clean_teardown": true, "no_secret_leak": true},
  "screenshot": "screenshot.png", "summary_card": "summary.png"
}
```

Assembly is **best-effort**: a failure to build/write `results.json` is logged but never changes
the run's exit code (cosmetics never block the pipeline, R7).

---

## 3. Summary card + app screenshot (R3/R4)

**App screenshot** (`runner/harness/screenshot.py`). After the app deploys and passes
health/readiness and **before any tier mutates state or teardown runs**, the harness captures a
real Playwright screenshot of the live app and writes `screenshot.png` to the run dir. It is
**secret-safe by default**: it shoots the **landing page** (login/setup forms show input *fields*,
not secret values), viewport-only (`full_page=False`, no scroll into a secrets panel), and the
harness never auto-fills an install wizard. A recipe whose landing page is uninformative may opt
into a post-login view via an optional `SCREENSHOT` hook in `tests//recipe_meta.py` — **that hook
owns the no-credential-page guarantee**. Capture is **best-effort**: any error returns `None`,
writes no file, and never blocks the run (R7); `results.json.screenshot` is set only when a file
was actually produced.

**Summary card** (`runner/harness/card.py`). After `results.json` is written, the harness builds
an HTML results card — recipe + version, the level badge, a per-stage/per-test ✔/✘ table with
timings, the embedded app screenshot (base64 data-URI so the PNG is self-contained), and the
invariant flags — and screenshots that HTML to `summary.png` via the harness Playwright browser.
The card **reports `results.json` verbatim — it computes nothing**, so it can never show a run
greener than its tests (cardinal guardrail). Rendering is best-effort (returns `None` on failure →
no card, run unaffected).

**Stable URLs.** The dashboard serves the run artifact dir read-only at:

```
https://ci.commoninternet.net/runs//summary.png      # the card
https://ci.commoninternet.net/runs//screenshot.png   # the app screenshot
https://ci.commoninternet.net/runs//badge.svg        # the per-run level badge
https://ci.commoninternet.net/runs//results.json     # the raw data
```

`` is the Drone build number. The route is whitelist + traversal-guarded (filenames from a fixed
set; `run_id` charset-restricted; realpath must stay inside the runs dir) and read-only.

## 4. PR comment (R2)

On a `!testme` run the comment-bridge (`bridge/bridge.py`) maintains **one comment per PR, updated
in place** (it carries a hidden `` marker so re-`!testme` finds and refreshes the same comment
rather than stacking new ones):

1. **On start** — a 🌻 + ⏳ placeholder: `testing @ ` + a live-logs link, "level pending".
2. **On completion** — the same comment is edited to the YunoHost-shaped result: 🌻 + a **level
   badge** image + the **summary card** image, **both linking to the run**, plus
   full-logs/dashboard links.

If the rendered card isn't served (render failed, build didn't finish), the comment **falls back
to a compact text verdict** with the run link (the bridge checks artifact availability with a
cheap HEAD request) — R7: a cosmetics failure degrades to text, never a broken image, never
affecting the verdict.

## 5. Badges (R6) + how to embed one

Two SVG badge endpoints, both shields-style and coloured by level (`level_color`):

- **Per-recipe latest-level** (for a recipe README): `https://ci.commoninternet.net/badge/.svg` →
  `cc-ci: | level N` for that recipe's most recent run (falls back to a status badge if the recipe
  has no level yet). Re-rendered live from the latest `results.json`.
- **Per-run** (pinned to one run, e.g. in the PR comment):
  `https://ci.commoninternet.net/runs//badge.svg`.

Embed the per-recipe badge in a recipe README (Markdown), linking to the cc-ci dashboard:

```markdown
[![cc-ci level](https://ci.commoninternet.net/badge/.svg)](https://ci.commoninternet.net/recipe/)
```

The link target `…/recipe/` is that recipe's run-history page (level/version/status per run, with
a link to each run's summary card).
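
As an aside on the artifact route from section 3 that serves the card, screenshot, per-run badge and raw data: its three guards (fixed filename set, charset-restricted `run_id`, realpath confined to the runs dir) could look roughly like the sketch below. This is an illustration under assumptions, not the dashboard's actual handler: `resolve_artifact` is a hypothetical name, and the `run_id` pattern is a guess, since this doc does not state the exact charset.

```python
import os
import re
import tempfile

# The four filenames come from the stable-URL list in section 3.
ALLOWED_FILES = frozenset({"summary.png", "screenshot.png", "badge.svg", "results.json"})

# Assumption: the real charset is not given here. This pattern accepts build
# numbers and domain-like hand-run ids, and rejects anything containing a slash.
RUN_ID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,127}")


def resolve_artifact(runs_dir, run_id, filename):
    """Return the real path of an allowed artifact, or None to refuse the request."""
    if filename not in ALLOWED_FILES:  # guard 1: fixed filename set
        return None
    if not RUN_ID_RE.fullmatch(run_id) or ".." in run_id:  # guard 2: run_id charset
        return None
    root = os.path.realpath(runs_dir)
    candidate = os.path.realpath(os.path.join(root, run_id, filename))
    # Guard 3: after resolving symlinks the path must still sit inside the runs dir.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate if os.path.isfile(candidate) else None


# Small demonstration against a throwaway runs dir.
with tempfile.TemporaryDirectory() as runs_dir:
    os.makedirs(os.path.join(runs_dir, "42"))
    with open(os.path.join(runs_dir, "42", "results.json"), "w") as fh:
        fh.write("{}")
    served = resolve_artifact(runs_dir, "42", "results.json")
    refused_traversal = resolve_artifact(runs_dir, "../etc", "results.json")
    refused_filename = resolve_artifact(runs_dir, "42", "secrets.env")
    refused_missing = resolve_artifact(runs_dir, "43", "results.json")
```

The checks are deliberately redundant: the filename and `run_id` guards reject bad input cheaply, while the realpath comparison still holds if a symlink inside the runs dir points elsewhere.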