chore(3): bootstrap Phase-3 loop state (STATUS/BACKLOG/JOURNAL-3); seed U0-U5 backlog
Phase 3 = beautiful YunoHost-style results UX (level ladder + image-forward PR comment + summary card w/ app screenshot + polished dashboard + badges). Operator kicked off manually. Starting U0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
53
machine-docs/BACKLOG-3.md
Normal file
@@ -0,0 +1,53 @@
# Phase 3 — Beautiful YunoHost-style results — BACKLOG

Single source of truth: `/srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md`.
Milestones U0–U5 (plan §5); each ends with an Adversary gate. DoD items R1–R8 (plan §2).

## Build backlog

### U0 — Results schema + level (R1)
- [ ] U0.1 — Pure `level()` function: map per-tier results (+ deps/SSO/recipe-local signal) → integer
level L0–L6 with gap-caps-level semantics (§4.1). Unit-tested (pass-through-L4 and fail-at-L2-capped).
- [ ] U0.2 — Per-tier pytest emits structured per-test results (JUnit XML per tier → parsed) so
results.json carries per-stage AND per-test ✔/✘ breakdown.
- [ ] U0.3 — `run_recipe_ci.py` writes `results.json` per run (recipe, version, pr, ref, stages[],
per-test rows, level, level_cap_reason, invariant flags: clean-teardown, no-secret-leak) to a
run-scoped artifact dir. Never blocks/fails the test verdict (R7).
- [ ] U0.4 — Decide & wire the artifact hosting path (run-scoped dir on host + dashboard serves
`/runs/<id>/...`). Record in DECISIONS.
- GATE U0: level correct for a recipe through L4 and one capped at L2.
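
Sketch for the U0.3 "never blocks/fails the verdict" rule. This is a minimal illustration, not the runner's code: the helper name `write_results_safely` and the temp-file-then-rename approach are assumptions. The point is only that any serialisation or filesystem error is logged and swallowed, so the caller's exit code stays whatever the tests decided.

```python
import json
import os
from pathlib import Path


def write_results_safely(results: dict, artifact_dir: str) -> bool:
    """Best-effort write of results.json; returns False instead of raising.

    Hypothetical helper: name and signature are illustrative only.
    """
    try:
        out = Path(artifact_dir)
        out.mkdir(parents=True, exist_ok=True)
        tmp = out / "results.json.tmp"
        tmp.write_text(json.dumps(results, indent=2, sort_keys=True), encoding="utf-8")
        # Rename into place so a reader never sees a half-written file
        # (atomic when tmp and target are on the same filesystem).
        os.replace(tmp, out / "results.json")
        return True
    except Exception as exc:  # R7: cosmetics must never change the verdict
        print(f"results.json not written: {exc}")
        return False
```

The caller would ignore the return value when computing the exit code and use it only for logging.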

### U1 — App screenshot (R4)
- [ ] U1.1 — Harness captures a real Playwright screenshot of the deployed app while it is up
(post-login where the landing page needs it), secret-safe (never shoot a credentials page).
- [ ] U1.2 — Screenshot saved to the run artifact dir; degrades gracefully (no screenshot ≠ run fail).
- GATE U1: screenshot of a sample recipe shows the working UI, no secrets.

### U2 — Summary card + badge (R3, R6)
- [ ] U2.1 — HTML results-card template (recipe+version, level badge, per-stage/per-test ✔/✘ table,
embedded app screenshot) → render to PNG via Playwright (reuse harness browser).
- [ ] U2.2 — Per-run + per-recipe SVG level/status badge endpoint.
- [ ] U2.3 — Card + badge served at stable URLs (`/runs/<id>/summary.png`, `/badge/<recipe>.svg`).
- GATE U2: card + badge render correctly for a pass run and a fail run.
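
Sketch for the U2.2 level badge. Everything here is a placeholder under stated assumptions: the function name `level_badge_svg`, the fixed widths and the colour thresholds are invented for illustration and are not taken from the plan. It only shows that a badge can be a small stdlib-built SVG string with the recipe name escaped.

```python
from xml.sax.saxutils import escape


def level_badge_svg(recipe: str, level: int) -> str:
    """Return a two-segment SVG badge: recipe name on the left, L<n> on the right.

    Hypothetical helper; sizes and colours are arbitrary placeholders.
    """
    label = escape(recipe)  # recipe names end up inside XML text nodes
    if level == 0:
        colour = "#e05d44"  # red: nothing passed
    elif level < 4:
        colour = "#dfb317"  # yellow: partial climb
    else:
        colour = "#44cc11"  # green: L4 and above
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="160" height="20" role="img">'
        f"<title>{label}: level {level}</title>"
        '<rect width="110" height="20" fill="#555555"/>'
        f'<rect x="110" width="50" height="20" fill="{colour}"/>'
        '<g fill="#ffffff" font-family="Verdana,sans-serif" font-size="11">'
        f'<text x="6" y="14">{label}</text>'
        f'<text x="118" y="14">L{level}</text>'
        "</g></svg>"
    )
```

A `/badge/<recipe>.svg` handler could return this string with `Content-Type: image/svg+xml`.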

### U3 — YunoHost-style PR comment (R2)
- [ ] U3.1 — Bridge posts a placeholder comment on run start (⏳ + live-logs link).
- [ ] U3.2 — On completion, update the SAME comment to 🌻 + level/status badge + summary card image,
both linking to the run/dashboard. Re-`!testme` refreshes it. Fallback to text on render failure.
- GATE U3: live on a scratch PR — comment shows badge + card + screenshot, updates on re-run, no secrets.
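
Sketch for the U3.2 comment body with its text fallback. The function name `completion_comment`, its parameters and the exact wording are assumptions for illustration; the bridge's real comment format is not defined yet. It builds the image-forward Markdown when both image URLs are available and degrades to a plain text line otherwise.

```python
from typing import Optional


def completion_comment(
    recipe: str,
    level: int,
    passed: bool,
    run_url: str,
    badge_url: Optional[str] = None,
    card_url: Optional[str] = None,
) -> str:
    """Build the final PR comment body (hypothetical helper).

    With both image URLs: badge and summary card, each linking to the run.
    Without them (render failure): a text-only line with the run link.
    """
    verdict = "passed" if passed else "failed"
    headline = f"🌻 **{recipe}** {verdict} at level {level}"
    if badge_url and card_url:
        return (
            f"{headline}\n\n"
            f"[![level badge]({badge_url})]({run_url})\n\n"
            f"[![summary card]({card_url})]({run_url})"
        )
    # Fallback to text on render failure
    return f"{headline}. Details: {run_url}"
```

Because the same function produces both variants, a re-run can regenerate the body and edit the existing comment in place.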

### U4 — Dashboard polish (R5)
- [ ] U4.1 — Overview grid like `ci-apps.yunohost.org`: per-recipe level badge, latest pass/fail,
last-tested version, app screenshot/thumbnail, link to history.
- [ ] U4.2 — Regenerated on build completion; reads results.json artifacts.
- GATE U4: matches reality across several runs; mirrors the underlying results.json.

### U5 — Badges + docs + hardening (R6, R7, R8)
- [ ] U5.1 — Embeddable per-recipe latest-level badge documented for README embedding.
- [ ] U5.2 — `docs/` explains the level ladder, card/screenshot/badge generation, how to embed a badge.
- [ ] U5.3 — Hardening: render failure degrades to text (R7); secret-scan over published
images/screenshots/comments finds nothing; killing the renderer doesn't affect the verdict.
- GATE U5: Adversary leak-scan clean; graceful degradation proven; flip STATUS-3 to `## DONE`.

## Adversary findings
(Adversary owns this section — Builder does not edit.)
42
machine-docs/JOURNAL-3.md
Normal file
@@ -0,0 +1,42 @@
# Phase 3 — Beautiful YunoHost-style results — JOURNAL (Builder-private reasoning)

SSOT: `/srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md`. WHY lives here; WHAT/HOW/EXPECTED/WHERE → STATUS-3.

## 2026-05-31T05:41Z — Phase-3 bootstrap + orientation

Read plan-phase3-results-ux.md in full (SSOT) + plan.md §6.1/§7/§9. Oriented on the existing
Phase-1/2 artifacts I'll extend:
- `runner/run_recipe_ci.py`: orchestrates deploy-once → per-tier (install/upgrade/backup/restore/custom),
produces an in-memory `results` dict `{tier: 'pass'|'fail'|'skip'}` printed to Drone logs. **No
results.json, no level, no screenshot today.** Also tracks deploy-count (DG4.1), deps/SSO readiness
(`sso_dep_unverified` → F2-11), teardown errors.
- `bridge/bridge.py`: posts a text PR comment with the Drone run URL; `watch_and_reflect` edits it to
✅/❌ on completion. No image/badge/level.
- `dashboard/dashboard.py`: stdlib HTTP service (swarm OCI image, Nix-built) that polls the **Drone API
only** and renders a latest-per-recipe table + a basic per-recipe SVG badge (Drone status, not level).
Runs as a container with **no host volume mounts** — relevant for artifact hosting (U0.4).

Key Phase-3 mapping insight: the level ladder (§4.1) maps cleanly onto the existing per-tier results:
- L1 install-tier pass; L2 upgrade pass; L3 backup AND restore pass; L4 custom (functional) pass;
L5 SSO/integration (requires_deps tests actually ran + passed — `deps_ready` and not
`sso_dep_unverified`); L6 recipe-local tests pass (D4 — discovered repo-local overlay/custom).
- Gap-caps-level (YunoHost): level = highest rung L such that every rung ≤ L passed. A rung that is
genuinely N/A (e.g. backup not BACKUP_CAPABLE, or no SSO/integration surface) must NOT block the
climb but caps with a recorded reason ("L4 — no integration surface" etc.) for fairness (§4.1 L5).
- Invariants surfaced as flags not levels: clean-teardown ✔ (no dep_teardown_error / DG4.1 ok),
no-secret-leak ✔.
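
Rough sketch of the gap-caps walk, to pin down one reading of the semantics above before writing the real `level()`. Assumptions, all mine and not from the plan: rung keys such as `backup_restore` and `integration`, an extra `'na'` status next to `'pass'|'fail'|'skip'`, and the rule that an N/A rung is stepped over but is reported as the cap reason when nothing above it passes.

```python
from typing import Mapping, Optional, Tuple

# Rung order follows the L1..L6 mapping; the key names are hypothetical.
RUNGS = (
    (1, "install"),
    (2, "upgrade"),
    (3, "backup_restore"),
    (4, "custom"),
    (5, "integration"),
    (6, "recipe_local"),
)


def level(rungs: Mapping[str, str]) -> Tuple[int, Optional[str]]:
    """Return (level, cap_reason). Pure: no I/O, no globals mutated."""
    reached = 0
    na_note: Optional[str] = None
    for number, name in RUNGS:
        status = rungs.get(name, "skip")  # a missing rung counts as not run
        if status == "pass":
            reached = number
            na_note = None  # an N/A rung below a passing rung did not cap anything
        elif status == "na":
            # Assumed reading: N/A does not stop the climb, but is remembered
            if na_note is None:
                na_note = f"L{number} {name} not applicable"
        else:
            # 'fail' or 'skip': the gap caps the level here, whatever passes above
            return reached, f"L{number} {name} {status}"
    return reached, na_note
```

Under this reading a recipe with install and upgrade passing, backup/restore failing and custom passing stays at level 2, and a recipe with no integration surface and no recipe-local tests tops out at level 4 with the N/A rung as the recorded reason.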

Adversary is live (REVIEW-3 @05:42Z), flagged the Phase-2-DONE prerequisite but is not treating it as
a P3 blocker; operator kicked Phase 3 off manually. Proceeding.

### Plan for U0 (foundation)
1. Pure `level()` function in a new `runner/harness/level.py` — unit-testable (no I/O), so I can prove
"L4-pass" and "L2-cap" semantics cheaply and the Adversary can re-run the unit test cold. This is
the load-bearing logic; everything else (card, badge, dashboard) just *renders* what it returns.
2. Capture per-test detail: run each tier's pytest with `--junitxml` to a run-scoped dir, parse the
XML (stdlib `xml.etree`) into per-test rows {name, status, ms}. Aggregate per stage.
3. `run_recipe_ci.py` assembles `results.json` {recipe, version, pr, ref, run_id, stages[], level,
level_cap_reason, flags} and writes it to the artifact dir — wrapped so a failure here NEVER changes
the run's exit code (R7: cosmetics never block).
4. Artifact hosting (U0.4): runner writes to a host dir; dashboard bind-mounts it read-only to serve
`/runs/<id>/...`. Decide details + record in DECISIONS.
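
Sketch for step 2, to check that the stdlib is enough. It assumes the usual JUnit layout that pytest's `--junitxml` writes (`<testcase>` elements with `name` and `time` attributes and optional `<failure>`, `<error>` or `<skipped>` children). The function names `parse_junit` and `stage_status` are placeholders, and errors are folded into `fail` as a simplification.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]


def parse_junit(xml_text: str) -> List[Row]:
    """Turn JUnit XML text into per-test rows {name, status, ms}."""
    rows: List[Row] = []
    root = ET.fromstring(xml_text)
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            status = "fail"  # errors are treated as failures in this sketch
        elif case.find("skipped") is not None:
            status = "skip"
        else:
            status = "pass"
        rows.append({
            "name": case.get("name", ""),
            "status": status,
            # JUnit reports seconds; the rows carry whole milliseconds
            "ms": round(float(case.get("time", "0")) * 1000),
        })
    return rows


def stage_status(rows: List[Row]) -> str:
    """Aggregate per stage: any fail fails the stage; nothing run means skip."""
    if any(r["status"] == "fail" for r in rows):
        return "fail"
    if any(r["status"] == "pass" for r in rows):
        return "pass"
    return "skip"
```

Reading the file would then be `parse_junit(Path(xml_path).read_text())` per tier, with the rows stored under that tier's entry in `stages[]`.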
27
machine-docs/STATUS-3.md
Normal file
@@ -0,0 +1,27 @@
# Phase 3 — Beautiful YunoHost-style results — STATUS

SSOT: `/srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md`. DoD = R1–R8. Milestones U0–U5.
State files (this phase): `machine-docs/{STATUS,BACKLOG,REVIEW,JOURNAL}-3.md`. DECISIONS.md shared.

**WHAT + HOW + EXPECTED + WHERE live here; WHY → JOURNAL-3.md.**

## Phase context
- Phase 2b is `## DONE` (Adversary-verified, no VETO). Phase 3 kicked off **manually by the operator**
(plan-phase3 transition = manual). Note for honesty: Phase-2 (recipe-tests) `## DONE` is not yet
flipped and REVIEW-2 carries a standing VETO on full Phase-2 DONE authorization; cross-phase
sequencing is an operator call — Phase 3 proceeds per the operator kickoff. Adversary concurs this
is not a Phase-3 blocker (REVIEW-3 @05:42Z).

## Current state
- Phase-3 loop live. Bootstrapping state files + settling open decisions, then executing **U0**.
- No gate claimed yet.

## In flight
- **U0 — Results schema + level (R1).** Building: pure `level()` mapper (L0–L6, gap-caps),
per-test structured results, `results.json` per run, artifact hosting path.

## Gate
(none claimed)

## Blocked
(none)