Phase 3 — Beautiful YunoHost-style results — STATUS
SSOT: /srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md. DoD = R1–R8. Milestones U0–U5.
State files (this phase): machine-docs/{STATUS,BACKLOG,REVIEW,JOURNAL}-3.md. DECISIONS.md shared.
WHAT + HOW + EXPECTED + WHERE live here; WHY → JOURNAL-3.md.
Phase context
- Phase 2b is "## DONE" (Adversary-verified, no VETO). Phase 3 was kicked off manually by the operator. Note for honesty: Phase-2 "## DONE" is not yet flipped (REVIEW-2 has a standing VETO on full Phase-2 DONE authorization); cross-phase sequencing is an operator call. The Adversary concurs it is not a P3 blocker (REVIEW-3 @05:42Z).
- Pre-existing repo-wide lint is RED on origin/main (94 files ruff format-dirty + 36 ruff check errors; confirmed on the cc-ci CI devshell against clean origin/main, ruff 0.7.3). This predates Phase 3 and is NOT introduced by my work: my NEW Phase-3 files are fully ruff-clean, and I left run_recipe_ci.py with fewer ruff errors than main (1 vs 4). Flagged for the operator; not a Phase-3 DoD item, and the U0 gate is verified by unit tests + real-run results.json, not repo-wide lint.
Gate: U0 — PASS (Adversary REVIEW-3 @18d2bd1, 2026-05-31; R1 cold-verified, no VETO) (Results schema + level)
WHAT. run_recipe_ci.py now emits a per-run results.json with per-stage AND per-test ✔/✘
breakdown and a computed integer level (L0–L6, YunoHost gap-caps semantics). DoD R1 (level ladder)
satisfied; U0 milestone acceptance ("level correct for a recipe through L4 and one capped at L2")
demonstrated on two real end-to-end runs.
WHERE (commits / files).
- 9773e3f runner/harness/level.py: pure compute_level(rungs) -> (level, cap_reason) + helpers backup_restore_status, tier_to_rung. tests/unit/test_level.py (15 tests).
- 52e5d21 runner/harness/results.py: JUnit-XML parse, collect_stages, derive_rungs (the tier+deps/SSO→rung translation), build_results, write_results. tests/unit/test_results.py (13 tests). runner/run_recipe_ci.py: tiers emit --junitxml + append {tier,source,file,rc,junit} records; main() assembles+writes results.json, wrapped so a failure NEVER changes the verdict (R7), incl. a narrow self leak-scan of the serialised artifact.
- 757511e machine-docs/DECISIONS.md (Phase-3 section): the documented ladder + the exact rung-mapping contract that derive_rungs implements + results.json schema + artifact-hosting decision.
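To make the "JUnit-XML parse" step concrete, here is a minimal sketch of turning a pytest --junitxml report into per-test rows. It is not the code in runner/harness/results.py: the function name parse_junit and the row keys (name, classname, status) are assumptions chosen for illustration, and the real results.json schema is the one documented in DECISIONS.md.

```python
import xml.etree.ElementTree as ET


def parse_junit(xml_text: str) -> list[dict]:
    """Turn a JUnit XML report into per-test rows (hypothetical row shape)."""
    root = ET.fromstring(xml_text)
    rows = []
    # iter() finds every <testcase>, whether the root is <testsuites> or <testsuite>.
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            status = "fail"
        elif case.find("skipped") is not None:
            status = "skip"
        else:
            # A <testcase> with no failure/error/skipped child is a pass.
            status = "pass"
        rows.append(
            {
                "name": case.get("name", ""),
                "classname": case.get("classname", ""),
                "status": status,
            }
        )
    return rows
```

Keeping the parser a pure function of the XML text means it can be unit-tested on fixture strings without a deployed recipe.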
HOW to verify (cold, from your clone on cc-ci).
- Unit tests (deterministic; also fuzz-verifiable):
    cc-ci-run -m pytest tests/unit/test_level.py tests/unit/test_results.py -q
- Real-run L2-cap (stateless, not backup-capable, ≥2 versions):
    RECIPE=custom-html-tiny STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-cht cc-ci-run runner/run_recipe_ci.py
  then read /var/lib/cc-ci-runs/adv-cht/results.json.
- Real-run L4-pass (backup-capable, 3 functional tests, no deps):
    RECIPE=uptime-kuma STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-uk cc-ci-run runner/run_recipe_ci.py
  then read /var/lib/cc-ci-runs/adv-uk/results.json.

(Compare the level/rungs against the results dict + DECISIONS contract: a level greener than the tiers would be a FAIL. Verify clean teardown: no orphan *-pr*/recipe service after.)
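The "level greener than the tiers" comparison can be mechanised. The sketch below is a reviewer-side helper, not part of the harness: find_inflation, check_file and the RUNG_TIERS mapping are hypothetical, and the tier-to-rung mapping is only inferred from the two runs listed under EXPECTED. The authoritative contract is the one in DECISIONS.md.

```python
import json

# Rung order as it appears in the reported rungs dict.
RUNG_ORDER = [
    "install", "upgrade", "backup_restore",
    "functional", "integration", "recipe_local",
]

# ASSUMPTION: which tiers back which rung (inferred, not the DECISIONS.md contract).
RUNG_TIERS = {
    "install": ["install"],
    "upgrade": ["upgrade"],
    "backup_restore": ["backup", "restore"],
    "functional": ["custom"],
}


def find_inflation(doc: dict) -> list[str]:
    """Return human-readable problems where the level looks greener than the tiers."""
    problems = []
    rungs, results = doc["rungs"], doc["results"]
    # A rung may only be 'pass' if every tier behind it passed.
    for rung, tiers in RUNG_TIERS.items():
        if rungs.get(rung) == "pass":
            bad = [t for t in tiers if results.get(t) != "pass"]
            if bad:
                problems.append(f"rung {rung} is pass but tiers {bad} are not")
    # Gap-caps: the level cannot exceed the number of leading passing rungs.
    climbed = 0
    for rung in RUNG_ORDER:
        if rungs.get(rung) != "pass":
            break
        climbed += 1
    if doc["level"] > climbed:
        problems.append(f"level {doc['level']} exceeds {climbed} leading passing rungs")
    return problems


def check_file(path: str) -> list[str]:
    """Load a results.json from disk and run the inflation check on it."""
    with open(path, encoding="utf-8") as fh:
        return find_inflation(json.load(fh))
```

An empty list from check_file("/var/lib/cc-ci-runs/adv-cht/results.json") would mean no inflation was detected under the assumed mapping; it does not replace reading the artifact against the documented contract.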
EXPECTED.
- 28 passed.
- custom-html-tiny: level=2, level_cap_reason="L3 backup/restore (data integrity) N/A", rungs={install:pass, upgrade:pass, backup_restore:na, functional:na, integration:na, recipe_local:na}, results={install:pass, upgrade:pass, backup:skip, restore:skip, custom:skip}, flags={clean_teardown:true, no_secret_leak:true}, stages=[install,upgrade] each w/ per-test rows. (My run: /var/lib/cc-ci-runs/u0-cht-L2/results.json.)
- uptime-kuma: level=4, level_cap_reason="L5 integration (SSO/OIDC + cross-app) N/A", rungs={install:pass, upgrade:pass, backup_restore:pass, functional:pass, integration:na, recipe_local:na}, all five tiers pass, flags.clean_teardown=true, stages=[install,upgrade,backup,restore,custom] with per-test rows (incl. 3 uptime-kuma functional tests, source cc-ci). (My run: /var/lib/cc-ci-runs/u0-uk-L4/results.json.)

These two bracket the gate: a recipe whose functional tests pass is still capped at L2 when a lower rung (L3 backup) is N/A (gap-caps; never inflates), and a full clean climb with no SSO surface caps at L4.
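The gap-caps rule can be illustrated with a short sketch. It mirrors the compute_level(rungs) -> (level, cap_reason) signature named above but is not the code in runner/harness/level.py. The rung order and the L3/L5 labels are taken from the two reported runs; the L1, L2, L4 and L6 labels and the "failed" suffix are assumptions made up for the example.

```python
from typing import Optional

# (rung key, label). Only the L3 and L5 labels appear in the reported runs;
# the other labels are hypothetical placeholders.
LADDER = [
    ("install", "L1 install"),
    ("upgrade", "L2 upgrade"),
    ("backup_restore", "L3 backup/restore (data integrity)"),
    ("functional", "L4 functional"),
    ("integration", "L5 integration (SSO/OIDC + cross-app)"),
    ("recipe_local", "L6 recipe-local"),
]


def compute_level(rungs: dict[str, str]) -> tuple[int, Optional[str]]:
    """Climb the ladder until the first rung that is not 'pass' (gap-caps)."""
    level = 0
    for name, label in LADDER:
        status = rungs.get(name, "na")
        if status != "pass":
            # The first gap caps the level; higher rungs are never consulted,
            # so a passing rung above a gap cannot inflate the result.
            suffix = "N/A" if status == "na" else "failed"
            return level, f"{label} {suffix}"
        level += 1
    return level, None  # full climb: no cap


# A passing functional rung above an N/A backup rung still yields level 2.
capped = compute_level(
    {"install": "pass", "upgrade": "pass", "backup_restore": "na", "functional": "pass"}
)
print(capped)
```

Because the loop returns at the first non-passing rung, the level is always the count of consecutive passes from the bottom, which is what "never inflates" requires.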
Gate: U1 — CLAIMED, awaiting Adversary (App screenshot; R4)
WHAT. The harness now captures a real Playwright screenshot of the deployed app while it is
up (after deploy+health/readiness, before any tier mutates state, before teardown) and writes it to
the run artifact dir as screenshot.png. The capture is secret-safe by default (it shoots the
app landing page, never a credentials page; a recipe opts into a post-login view via an optional
SCREENSHOT meta hook that owns the no-secret-page guarantee — none used yet). It is best-effort:
capture() swallows every error and returns None, so it NEVER blocks/fails/hangs the run (R7); the
results.json screenshot field is set to "screenshot.png" ONLY when the capture actually produced
a file, else stays null. U1 milestone acceptance ("screenshot of a sample recipe shows the working
UI, no secrets") demonstrated on a real uptime-kuma run; graceful-degradation (R7) demonstrated on an
unreachable-domain capture.
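The best-effort contract described above (swallow every error, return None, set the field only when a file was produced) can be sketched without Playwright by injecting the capture step. The names capture_best_effort, screenshot_field and the shoot callable are hypothetical and are not the harness's capture(); the sketch also does not implement the 45s deadline, so it shows the error-swallowing half of R7 only, not the no-hang half.

```python
import os
from typing import Callable, Optional


def capture_best_effort(
    shoot: Callable[[str, str], None], url: str, out_path: str
) -> Optional[str]:
    """Run shoot(url, out_path); return out_path only if a non-empty file exists."""
    try:
        shoot(url, out_path)
    except Exception:
        # Cosmetics never block: any capture error degrades to "no screenshot".
        return None
    if os.path.isfile(out_path) and os.path.getsize(out_path) > 0:
        return out_path
    # The capture step returned without error but wrote nothing usable.
    return None


def screenshot_field(captured: Optional[str]) -> Optional[str]:
    """Value for the results.json screenshot field: file name, or None (JSON null)."""
    return os.path.basename(captured) if captured else None
```

Checking the file on disk, rather than trusting the capture step's return, is what keeps the screenshot field from pointing at an artifact that does not exist.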
WHERE (commits / files).
- 5fa15d4 runner/run_recipe_ci.py: imports screenshot as screenshot_mod; after deploy+readiness and OUTSIDE the deploy try/except (so a screenshot issue can never flip deploy_ok), inside an "if deploy_ok:" guard, calls screenshot_mod.capture(domain, screenshot_path(run_artifact_dir), recipe_meta=meta) and sets screenshot_rel; passes screenshot=screenshot_rel into build_results(...).
- daa7edd runner/harness/screenshot.py: capture() (default landing-page nav via browser.goto_with_retry, 45s deadline cap; optional SCREENSHOT hook), screenshot_path(), _load_screenshot_hook(). tests/unit/test_screenshot.py (pure helpers; 4 tests).
HOW to verify (cold, from your clone on cc-ci).
- Pure-helper unit tests:
    cc-ci-run -m pytest tests/unit/test_screenshot.py -q
- Real positive capture (working UI, no secret):
    rm -rf /var/lib/cc-ci-runs/adv-u1 && RECIPE=uptime-kuma STAGES=install CCCI_RUN_ID=adv-u1 cc-ci-run runner/run_recipe_ci.py
  then scp back /var/lib/cc-ci-runs/adv-u1/screenshot.png and EYEBALL it; check /var/lib/cc-ci-runs/adv-u1/results.json has "screenshot":"screenshot.png". Confirm NO orphan service after (docker service ls | grep -i uptime empty = clean teardown).
- Graceful degradation (R7): capture against an unreachable host returns None, never raises:
    cc-ci-run -c 'import sys; sys.path.insert(0,"runner"); from harness import screenshot as S; print(S.capture("adv-u1-noexist.ci.commoninternet.net","/tmp/x.png"))'
  → prints None (≈45s), no /tmp/x.png produced.
EXPECTED.
- 4 passed.
- screenshot.png ~30 KB showing uptime-kuma's "Uptime Kuma / Create your admin account" landing page with EMPTY username/password/repeat fields (a setup form: it asks the user to set a password; it does NOT display any generated secret), i.e. real working app UI, no secret values. results.json screenshot="screenshot.png", flags.clean_teardown=true; no orphan service. (My run: /var/lib/cc-ci-runs/u1-uk-shot/{screenshot.png,results.json}.)
- None returned after the 45s deadline, no file written, no exception, proving a screenshot failure leaves the run/verdict untouched (cosmetics never block, R7). (My check log: capture "failed (non-fatal, verdict unaffected)" → GRACEFUL_DEGRADATION= True.)

The cardinal Phase-3 invariant for U1: the screenshot is a faithful capture of the live app, never a credentials page, and its presence/absence never changes the verdict.
In flight (next, post-gate)
- U2 — summary card + badge (HTML→PNG via Playwright; SVG level badge; stable URLs). Render path already de-risked headless on cc-ci for pass+fail fixtures (JOURNAL-3 @06:50Z) — next is wiring the card/badge generation into the run + serving them. Held until U1 PASSes (no advance past the gate).
Blocked
(none)