Phase 3 = beautiful YunoHost-style results UX (level ladder + image-forward PR comment + summary card w/ app screenshot + polished dashboard + badges). Operator kicked off manually. Starting U0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 3 — Beautiful YunoHost-style results — JOURNAL (Builder-private reasoning)
SSOT: /srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md. WHY lives here; WHAT/HOW/EXPECTED/WHERE → STATUS-3.
2026-05-31T05:41Z — Phase-3 bootstrap + orientation
Read plan-phase3-results-ux.md in full (SSOT) + plan.md §6.1/§7/§9. Oriented on the existing Phase-1/2 artifacts I'll extend:
- runner/run_recipe_ci.py: orchestrates deploy-once → per-tier (install/upgrade/backup/restore/custom), produces an in-memory results dict {tier: 'pass'|'fail'|'skip'} printed to Drone logs. No results.json, no level, no screenshot today. Also tracks deploy-count (DG4.1), deps/SSO readiness (sso_dep_unverified → F2-11), teardown errors.
- bridge/bridge.py: posts a text PR comment with the Drone run URL; watch_and_reflect edits it to ✅/❌ on completion. No image/badge/level.
- dashboard/dashboard.py: stdlib HTTP service (swarm OCI image, Nix-built) that polls the Drone API only and renders a latest-per-recipe table + a basic per-recipe SVG badge (Drone status, not level). Runs as a container with no host volume mounts, which is relevant for artifact hosting (U0.4).
Key Phase-3 mapping insight: the level ladder (§4.1) maps cleanly onto the existing per-tier results:
- L1 install-tier pass; L2 upgrade pass; L3 backup AND restore pass; L4 custom (functional) pass; L5 SSO/integration (requires_deps tests actually ran + passed, i.e. deps_ready and not sso_dep_unverified); L6 recipe-local tests pass (D4: discovered repo-local overlay/custom).
- Gap-caps-level (YunoHost): level = highest rung L such that every rung ≤ L passed. A rung that is genuinely N/A (e.g. backup not BACKUP_CAPABLE, or no SSO/integration surface) must NOT block the climb but caps with a recorded reason ("L4: no integration surface" etc.) for fairness (§4.1 L5).
- Invariants surfaced as flags not levels: clean-teardown ✔ (no dep_teardown_error / DG4.1 ok), no-secret-leak ✔.
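A minimal sketch of how the gap-caps-level rule above could look as a pure function. This is not the contents of runner/harness/level.py: the rung names, the 'na' status, and the reason strings are hypothetical. It also assumes one particular reading of the N/A rule: an N/A rung earns no credit and does not stop the climb, and it is reported as the cap reason only if nothing above it passes.

```python
from __future__ import annotations

# Hypothetical rung names for L1..L6, in ladder order.
RUNGS = ["install", "upgrade", "backup_restore", "functional",
         "integration", "recipe_local"]


def level(rungs: dict[str, str]) -> tuple[int, str | None]:
    """Return (level, cap_reason) from a {rung: 'pass'|'fail'|'na'} mapping.

    Pure function: no I/O, so it can be unit-tested cold.
    A missing rung is treated as 'fail' (assumption).
    """
    reached = 0
    notes: dict[int, str] = {}
    for n, name in enumerate(RUNGS, start=1):
        status = rungs.get(name, "fail")
        if status == "pass":
            reached = n
        elif status == "na":
            # N/A: no credit for this rung, but keep climbing.
            notes[n] = f"L{n} {name} not applicable"
        else:
            # A real failure is a gap: nothing above it can count.
            notes[n] = f"L{n} {name} did not pass"
            break
    # The rung just above the reached level explains the cap, if any.
    return reached, notes.get(reached + 1)


all_pass = {name: "pass" for name in RUNGS}
print(level(all_pass))
print(level({**all_pass, "backup_restore": "fail"}))
print(level({**all_pass, "integration": "na", "recipe_local": "na"}))
```

The second call is the "L2-cap" case (backup/restore fails, so L4 passing above it does not count); the third is the "L4 with a recorded reason" case for a recipe with no integration surface.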
Adversary is live (REVIEW-3 @05:42Z), flagged the Phase-2-DONE prerequisite but is not treating it as a P3 blocker; operator kicked Phase 3 off manually. Proceeding.
Plan for U0 (foundation)
- Pure level() function in a new runner/harness/level.py: unit-testable (no I/O), so I can prove "L4-pass" and "L2-cap" semantics cheaply and the Adversary can re-run the unit test cold. This is the load-bearing logic; everything else (card, badge, dashboard) just renders what it returns.
- Capture per-test detail: run each tier's pytest with --junitxml to a run-scoped dir, parse the XML (stdlib xml.etree) into per-test rows {name, status, ms}. Aggregate per stage.
- run_recipe_ci.py assembles results.json {recipe, version, pr, ref, run_id, stages[], level, level_cap_reason, flags} and writes it to the artifact dir, wrapped so a failure here NEVER changes the run's exit code (R7: cosmetics never block).
- Artifact hosting (U0.4): runner writes to a host dir; dashboard bind-mounts it read-only to serve /runs/<id>/.... Decide details + record in DECISIONS.
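A rough sketch of the second and third steps of the plan above, using only the stdlib: turning a pytest JUnit XML report into {name, status, ms} rows, and writing results.json behind a guard that cannot raise. The function names, the sample XML, and the log message are hypothetical; the real shape of run_recipe_ci.py is not shown here.

```python
import json
import os
import xml.etree.ElementTree as ET


def parse_junit(xml_text: str) -> list:
    """Flatten a JUnit XML report into per-test rows {name, status, ms}."""
    rows = []
    for case in ET.fromstring(xml_text).iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            status = "fail"
        elif case.find("skipped") is not None:
            status = "skip"
        else:
            status = "pass"
        seconds = float(case.get("time") or 0.0)  # JUnit time is in seconds
        rows.append({"name": case.get("name", ""),
                     "status": status,
                     "ms": round(seconds * 1000)})
    return rows


def write_results_safely(path: str, payload: dict) -> bool:
    """Write results.json; on any error, log and return False (R7)."""
    try:
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(payload, fh, indent=2)
        return True
    except Exception as exc:  # cosmetic step: never propagate
        print(f"results.json not written: {exc}")
        return False


# Hypothetical report, shaped like pytest's --junitxml output.
SAMPLE = """<testsuites><testsuite name="pytest" tests="3">
<testcase classname="tests.test_install" name="test_http_ok" time="0.250"/>
<testcase classname="tests.test_install" name="test_login" time="1.5">
<failure message="boom">trace</failure></testcase>
<testcase classname="tests.test_install" name="test_sso" time="0.0">
<skipped message="no deps"/></testcase>
</testsuite></testsuites>"""

rows = parse_junit(SAMPLE)
print(rows)
```

Because write_results_safely reports failure through its return value rather than an exception, the caller can ignore it, and the run's exit code stays tied to the tier results alone.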