From 805fbba2ad042d91dca74942ca610fe08a6e5202 Mon Sep 17 00:00:00 2001 From: autonomic-bot Date: Sun, 31 May 2026 05:43:27 +0000 Subject: [PATCH] chore(3): bootstrap Phase-3 loop state (STATUS/BACKLOG/JOURNAL-3); seed U0-U5 backlog Phase 3 = beautiful YunoHost-style results UX (level ladder + image-forward PR comment + summary card w/ app screenshot + polished dashboard + badges). Operator kicked off manually. Starting U0. Co-Authored-By: Claude Opus 4.8 (1M context) --- machine-docs/BACKLOG-3.md | 53 +++++++++++++++++++++++++++++++++++++++ machine-docs/JOURNAL-3.md | 42 +++++++++++++++++++++++++++++++ machine-docs/STATUS-3.md | 27 ++++++++++++++++++++ 3 files changed, 122 insertions(+) create mode 100644 machine-docs/BACKLOG-3.md create mode 100644 machine-docs/JOURNAL-3.md create mode 100644 machine-docs/STATUS-3.md diff --git a/machine-docs/BACKLOG-3.md b/machine-docs/BACKLOG-3.md new file mode 100644 index 0000000..85c4709 --- /dev/null +++ b/machine-docs/BACKLOG-3.md @@ -0,0 +1,53 @@ +# Phase 3 — Beautiful YunoHost-style results — BACKLOG + +Single source of truth: `/srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md`. +Milestones U0–U5 (plan §5); each ends with an Adversary gate. DoD items R1–R8 (plan §2). + +## Build backlog + +### U0 — Results schema + level (R1) +- [ ] U0.1 — Pure `level()` function: map per-tier results (+ deps/SSO/recipe-local signal) → integer + level L0–L6 with gap-caps-level semantics (§4.1). Unit-tested (pass-through-L4 and fail-at-L2-capped). +- [ ] U0.2 — Per-tier pytest emits structured per-test results (JUnit XML per tier → parsed) so + results.json carries per-stage AND per-test ✔/✘ breakdown. +- [ ] U0.3 — `run_recipe_ci.py` writes `results.json` per run (recipe, version, pr, ref, stages[], + per-test rows, level, level_cap_reason, invariant flags: clean-teardown, no-secret-leak) to a + run-scoped artifact dir. Never blocks/fails the test verdict (R7). +- [ ] U0.4 — Decide & wire the artifact hosting path (run-scoped dir on host + dashboard serves + `/runs//...`). Record in DECISIONS. +- GATE U0: level correct for a recipe through L4 and one capped at L2. + +### U1 — App screenshot (R4) +- [ ] U1.1 — Harness captures a real Playwright screenshot of the deployed app while it is up + (post-login where the landing page needs it), secret-safe (never shoot a credentials page). +- [ ] U1.2 — Screenshot saved to the run artifact dir; degrades gracefully (no screenshot ≠ run fail). +- GATE U1: screenshot of a sample recipe shows the working UI, no secrets. + +### U2 — Summary card + badge (R3, R6) +- [ ] U2.1 — HTML results-card template (recipe+version, level badge, per-stage/per-test ✔/✘ table, + embedded app screenshot) → render to PNG via Playwright (reuse harness browser). +- [ ] U2.2 — Per-run + per-recipe SVG level/status badge endpoint. +- [ ] U2.3 — Card + badge served at stable URLs (`/runs//summary.png`, `/badge/.svg`). +- GATE U2: card + badge render correctly for a pass run and a fail run. + +### U3 — YunoHost-style PR comment (R2) +- [ ] U3.1 — Bridge posts a placeholder comment on run start (⏳ + live-logs link). +- [ ] U3.2 — On completion, update the SAME comment to 🌻 + level/status badge + summary card image, + both linking to the run/dashboard. Re-`!testme` refreshes it. Fallback to text on render failure. +- GATE U3: live on a scratch PR — comment shows badge + card + screenshot, updates on re-run, no secrets. + +### U4 — Dashboard polish (R5) +- [ ] U4.1 — Overview grid like `ci-apps.yunohost.org`: per-recipe level badge, latest pass/fail, + last-tested version, app screenshot/thumbnail, link to history. +- [ ] U4.2 — Regenerated on build completion; reads results.json artifacts. +- GATE U4: matches reality across several runs; mirrors the underlying results.json. + +### U5 — Badges + docs + hardening (R6, R7, R8) +- [ ] U5.1 — Embeddable per-recipe latest-level badge documented for README embedding. +- [ ] U5.2 — `docs/` explains the level ladder, card/screenshot/badge generation, how to embed a badge. +- [ ] U5.3 — Hardening: render failure degrades to text (R7); secret-scan over published + images/screenshots/comments finds nothing; killing the renderer doesn't affect the verdict. +- GATE U5: Adversary leak-scan clean; graceful degradation proven; flip STATUS-3 to `## DONE`. + +## Adversary findings +(Adversary owns this section — Builder does not edit.) diff --git a/machine-docs/JOURNAL-3.md b/machine-docs/JOURNAL-3.md new file mode 100644 index 0000000..34b5661 --- /dev/null +++ b/machine-docs/JOURNAL-3.md @@ -0,0 +1,42 @@ +# Phase 3 — Beautiful YunoHost-style results — JOURNAL (Builder-private reasoning) + +SSOT: `/srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md`. WHY lives here; WHAT/HOW/EXPECTED/WHERE → STATUS-3. + +## 2026-05-31T05:41Z — Phase-3 bootstrap + orientation + +Read plan-phase3-results-ux.md in full (SSOT) + plan.md §6.1/§7/§9. Oriented on the existing +Phase-1/2 artifacts I'll extend: +- `runner/run_recipe_ci.py`: orchestrates deploy-once → per-tier (install/upgrade/backup/restore/custom), + produces an in-memory `results` dict `{tier: 'pass'|'fail'|'skip'}` printed to Drone logs. **No + results.json, no level, no screenshot today.** Also tracks deploy-count (DG4.1), deps/SSO readiness + (`sso_dep_unverified` → F2-11), teardown errors. +- `bridge/bridge.py`: posts a text PR comment with the Drone run URL; `watch_and_reflect` edits it to + ✅/❌ on completion. No image/badge/level. +- `dashboard/dashboard.py`: stdlib HTTP service (swarm OCI image, Nix-built) that polls the **Drone API + only** and renders a latest-per-recipe table + a basic per-recipe SVG badge (Drone status, not level). + Runs as a container with **no host volume mounts** — relevant for artifact hosting (U0.4). + +Key Phase-3 mapping insight: the level ladder (§4.1) maps cleanly onto the existing per-tier results: +- L1 install-tier pass; L2 upgrade pass; L3 backup AND restore pass; L4 custom (functional) pass; + L5 SSO/integration (requires_deps tests actually ran + passed — `deps_ready` and not + `sso_dep_unverified`); L6 recipe-local tests pass (D4 — discovered repo-local overlay/custom). +- Gap-caps-level (YunoHost): level = highest rung L such that every rung ≤ L passed. A rung that is + genuinely N/A (e.g. backup not BACKUP_CAPABLE, or no SSO/integration surface) must NOT block the + climb but caps with a recorded reason ("L4 — no integration surface" etc.) for fairness (§4.1 L5). +- Invariants surfaced as flags not levels: clean-teardown ✔ (no dep_teardown_error / DG4.1 ok), + no-secret-leak ✔. + +Adversary is live (REVIEW-3 @05:42Z), flagged the Phase-2-DONE prerequisite but is not treating it as +a P3 blocker; operator kicked Phase 3 off manually. Proceeding. + +### Plan for U0 (foundation) +1. Pure `level()` function in a new `runner/harness/level.py` — unit-testable (no I/O), so I can prove + "L4-pass" and "L2-cap" semantics cheaply and the Adversary can re-run the unit test cold. This is + the load-bearing logic; everything else (card, badge, dashboard) just *renders* what it returns. +2. Capture per-test detail: run each tier's pytest with `--junitxml` to a run-scoped dir, parse the + XML (stdlib `xml.etree`) into per-test rows {name, status, ms}. Aggregate per stage. +3. `run_recipe_ci.py` assembles `results.json` {recipe, version, pr, ref, run_id, stages[], level, + level_cap_reason, flags} and writes it to the artifact dir — wrapped so a failure here NEVER changes + the run's exit code (R7: cosmetics never block). +4. Artifact hosting (U0.4): runner writes to a host dir; dashboard bind-mounts it read-only to serve + `/runs//...`. Decide details + record in DECISIONS. diff --git a/machine-docs/STATUS-3.md b/machine-docs/STATUS-3.md new file mode 100644 index 0000000..797bf36 --- /dev/null +++ b/machine-docs/STATUS-3.md @@ -0,0 +1,27 @@ +# Phase 3 — Beautiful YunoHost-style results — STATUS + +SSOT: `/srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md`. DoD = R1–R8. Milestones U0–U5. +State files (this phase): `machine-docs/{STATUS,BACKLOG,REVIEW,JOURNAL}-3.md`. DECISIONS.md shared. + +**WHAT + HOW + EXPECTED + WHERE live here; WHY → JOURNAL-3.md.** + +## Phase context +- Phase 2b is `## DONE` (Adversary-verified, no VETO). Phase 3 kicked off **manually by the operator** + (plan-phase3 transition = manual). Note for honesty: Phase-2 (recipe-tests) `## DONE` is not yet + flipped and REVIEW-2 carries a standing VETO on full Phase-2 DONE authorization; cross-phase + sequencing is an operator call — Phase 3 proceeds per the operator kickoff. Adversary concurs this + is not a Phase-3 blocker (REVIEW-3 @05:42Z). + +## Current state +- Phase-3 loop live. Bootstrapping state files + settling open decisions, then executing **U0**. +- No gate claimed yet. + +## In flight +- **U0 — Results schema + level (R1).** Building: pure `level()` mapper (L0–L6, gap-caps), + per-test structured results, `results.json` per run, artifact hosting path. + +## Gate +(none claimed) + +## Blocked +(none)