# Phase 3 — Beautiful YunoHost-style results — STATUS

SSOT: `/srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md`. DoD = R1–R8. Milestones U0–U5.
State files (this phase): `machine-docs/{STATUS,BACKLOG,REVIEW,JOURNAL}-3.md`. DECISIONS.md shared.

**WHAT + HOW + EXPECTED + WHERE live here; WHY → JOURNAL-3.md.**

## Phase context
- Phase 2b is `## DONE` (Adversary-verified, no VETO). Phase 3 kicked off **manually by the operator**.
  Note for honesty: Phase-2 `## DONE` is not yet flipped (REVIEW-2 standing VETO on full Phase-2 DONE
  authorization); cross-phase sequencing is an operator call. Adversary concurs it's not a P3 blocker
  (REVIEW-3 @05:42Z).
- **Pre-existing repo-wide lint is RED on origin/main** (94 files `ruff format`-dirty + 36 `ruff check`
  errors; confirmed on cc-ci CI devshell against clean `origin/main`, ruff 0.7.3). This predates Phase 3
  and is NOT introduced by my work: my NEW Phase-3 files are fully `ruff`-clean, and I left
  `run_recipe_ci.py` with fewer ruff errors than main (1 vs 4). Flagged for the operator; not a Phase-3
  DoD item, and the U0 gate is verified by unit tests + real-run results.json, not repo-wide lint.

---

## Gate: U0 — PASS (Adversary REVIEW-3 @18d2bd1, 2026-05-31; R1 cold-verified, no VETO) (Results schema + level)

**WHAT.** `run_recipe_ci.py` now emits a per-run `results.json` with a per-stage AND per-test ✔/✘
breakdown and a computed integer **level** (L0–L6, YunoHost gap-caps semantics). DoD R1 (level ladder)
satisfied; U0 milestone acceptance ("level correct for a recipe through L4 and one capped at L2")
demonstrated on two real end-to-end runs.
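
To make the gap-caps semantics concrete, here is a minimal sketch of a level computation over an ordered rung ladder. It is NOT the code in `runner/harness/level.py`: the function name, the ladder order, the status strings (`pass`/`fail`/`na`) and the cap-reason format are assumptions inferred from this document, and the real cap reasons are worded differently.

```python
from typing import Dict, Optional, Tuple

# Assumed ladder order, lowest rung first (L1..L6). L0 means nothing passed.
LADDER = [
    "install", "upgrade", "backup_restore",
    "functional", "integration", "recipe_local",
]


def compute_level_sketch(rungs: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (level, cap_reason) under gap-caps semantics.

    The level is the count of consecutive "pass" rungs from the bottom.
    The first rung that is missing, "na" or "fail" caps the level, even
    when a higher rung passed, so the level can never be inflated.
    """
    level = 0
    for index, name in enumerate(LADDER, start=1):
        status = rungs.get(name, "na")
        if status != "pass":
            # Hypothetical reason format, not the wording used by the harness.
            return level, f"L{index} {name} {status}"
        level = index
    return level, None  # full climb, nothing capped


capped = compute_level_sketch({
    "install": "pass", "upgrade": "pass", "backup_restore": "na",
    "functional": "pass", "integration": "na", "recipe_local": "na",
})
print(capped)  # (2, 'L3 backup_restore na')
```

The passing `functional` rung in the example does not lift the level above 2, because the gap at L3 is reached first.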

**WHERE (commits / files).**
- `9773e3f` `runner/harness/level.py`: pure `compute_level(rungs)->(level,cap_reason)` + helpers
  `backup_restore_status`, `tier_to_rung`. `tests/unit/test_level.py` (15 tests).
- `52e5d21` `runner/harness/results.py`: JUnit-XML parse, `collect_stages`, `derive_rungs` (the
  tier+deps/SSO→rung translation), `build_results`, `write_results`. `tests/unit/test_results.py`
  (13 tests). `runner/run_recipe_ci.py`: tiers emit `--junitxml` + append `{tier,source,file,rc,junit}`
  records; `main()` assembles+writes results.json wrapped so a failure NEVER changes the verdict (R7),
  incl. a narrow self leak-scan of the serialised artifact.
- `757511e` `machine-docs/DECISIONS.md` (Phase-3 section): the documented ladder + exact rung-mapping
  contract `derive_rungs` implements + results.json schema + artifact-hosting decision.
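
For orientation, the per-test breakdown can be derived from a `--junitxml` file roughly as follows. This is a sketch using only the standard library, not the parser in `runner/harness/results.py`; the function name and the row keys (`name`, `status`) are assumptions, and the real schema is the one documented in DECISIONS.md.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def parse_junit_sketch(xml_text: str) -> List[Dict[str, str]]:
    """Return one {"name", "status"} row per <testcase> element."""
    root = ET.fromstring(xml_text)
    rows = []
    # iter() finds <testcase> whether the root is <testsuites> or <testsuite>.
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            status = "fail"
        elif case.find("skipped") is not None:
            status = "skip"
        else:
            status = "pass"
        rows.append({"name": case.get("name", ""), "status": status})
    return rows


SAMPLE = """<testsuites><testsuite name="pytest" tests="3">
  <testcase classname="t" name="test_ok"/>
  <testcase classname="t" name="test_bad"><failure message="boom"/></testcase>
  <testcase classname="t" name="test_skipped"><skipped/></testcase>
</testsuite></testsuites>"""

for row in parse_junit_sketch(SAMPLE):
    print(row["status"], row["name"])
```

A test case counts as failed when it carries a `<failure>` or `<error>` child, which is how pytest's JUnit output marks assertion failures and setup errors respectively.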

**HOW to verify (cold, from your clone on cc-ci).**
1. **Unit tests** (deterministic; also fuzz-verifiable):
   `cc-ci-run -m pytest tests/unit/test_level.py tests/unit/test_results.py -q`
2. **Real-run L2-cap** (stateless, not backup-capable, ≥2 versions):
   `RECIPE=custom-html-tiny STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-cht cc-ci-run runner/run_recipe_ci.py`
   then read `/var/lib/cc-ci-runs/adv-cht/results.json`.
3. **Real-run L4-pass** (backup-capable, 3 functional tests, no deps):
   `RECIPE=uptime-kuma STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-uk cc-ci-run runner/run_recipe_ci.py`
   then read `/var/lib/cc-ci-runs/adv-uk/results.json`.
   (Compare the `level`/`rungs` against the `results` dict + DECISIONS contract; a level greener than
   the tiers would be a FAIL. Verify clean teardown: no orphan `*-pr*`/recipe service after.)
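
The "level greener than the tiers" comparison can be partly automated. The sketch below only checks the `level` against the `rungs` recorded in the same file; it assumes the ladder order and the `level`/`rungs` keys shown in this document, and it does not replace a manual comparison against the `results` dict and the DECISIONS contract. The helper names are hypothetical.

```python
import json
from typing import Any, Dict

# Assumed ladder order, lowest rung first (L1..L6).
LADDER = [
    "install", "upgrade", "backup_restore",
    "functional", "integration", "recipe_local",
]


def max_justified_level(rungs: Dict[str, str]) -> int:
    """Highest level the rungs can justify: consecutive passes from the bottom."""
    level = 0
    for name in LADDER:
        if rungs.get(name) != "pass":
            break
        level += 1
    return level


def level_is_honest(results: Dict[str, Any]) -> bool:
    """True when the reported level is not greener than the rungs allow."""
    return int(results["level"]) <= max_justified_level(results.get("rungs", {}))


def check_file(path: str) -> bool:
    """Load a results.json from disk and run the same check."""
    with open(path, encoding="utf-8") as handle:
        return level_is_honest(json.load(handle))


sample = json.loads(
    '{"level": 2, "rungs": {"install": "pass", "upgrade": "pass", '
    '"backup_restore": "na", "functional": "na"}}'
)
print(level_is_honest(sample))  # True
```

On cc-ci, `check_file("/var/lib/cc-ci-runs/adv-cht/results.json")` would apply the same check to a real run.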

**EXPECTED.**
1. `28 passed`.
2. custom-html-tiny: `level=2`, `level_cap_reason="L3 backup/restore (data integrity) N/A"`,
   `rungs={install:pass, upgrade:pass, backup_restore:na, functional:na, integration:na, recipe_local:na}`,
   `results={install:pass, upgrade:pass, backup:skip, restore:skip, custom:skip}`,
   `flags={clean_teardown:true, no_secret_leak:true}`, stages=[install,upgrade] each with per-test rows.
   (My run: `/var/lib/cc-ci-runs/u0-cht-L2/results.json`.)
3. uptime-kuma: `level=4`, `level_cap_reason="L5 integration (SSO/OIDC + cross-app) N/A"`,
   `rungs={install:pass, upgrade:pass, backup_restore:pass, functional:pass, integration:na, recipe_local:na}`,
   all five tiers pass, `flags.clean_teardown=true`, stages=[install,upgrade,backup,restore,custom]
   with per-test rows (incl. 3 uptime-kuma functional tests, source `cc-ci`).
   (My run: `/var/lib/cc-ci-runs/u0-uk-L4/results.json`.)

These two bracket the gate: a recipe whose functional tests **pass** is still capped at **L2** when a
lower rung (L3 backup) is N/A (gap-caps; never inflates), and a full clean climb with no SSO surface
caps at **L4**.

---

## Gate: U1 — PASS (Adversary REVIEW-3 @74a6993, 2026-05-31; R4 cold-verified, no VETO) (App screenshot)

**WHAT.** The harness now captures a **real Playwright screenshot of the deployed app** while it is
up (after deploy+health/readiness, before any tier mutates state, before teardown) and writes it to
the run artifact dir as `screenshot.png`. The capture is **secret-safe by default** (it shoots the
app **landing page**, never a credentials page; a recipe opts into a post-login view via an optional
`SCREENSHOT` meta hook that owns the no-secret-page guarantee; none used yet). It is **best-effort**:
`capture()` swallows every error and returns `None`, so it NEVER blocks/fails/hangs the run (R7); the
`results.json` `screenshot` field is set to `"screenshot.png"` ONLY when the capture actually produced
a file, else stays `null`. U1 milestone acceptance ("screenshot of a sample recipe shows the working
UI, no secrets") demonstrated on a real uptime-kuma run; graceful-degradation (R7) demonstrated on an
unreachable-domain capture.
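
The best-effort contract can be illustrated without a browser. In the sketch below the actual screenshot call is injected as a callable, so `fake_ok` and `fake_broken` stand in for Playwright; all names are hypothetical and this is not the code in `runner/harness/screenshot.py`. It shows only the error-swallowing and the "field set only if a file exists" rule, not the deadline that keeps the real capture from hanging.

```python
import os
import tempfile
from typing import Callable, Optional


def capture_best_effort(shoot: Callable[[str], None], out_path: str) -> Optional[str]:
    """Run shoot(out_path); return out_path on success, None on any failure."""
    try:
        shoot(out_path)
    except Exception:
        # Cosmetic step: swallow the error so the verdict is untouched.
        return None
    # Report success only if a non-empty file really exists.
    if os.path.isfile(out_path) and os.path.getsize(out_path) > 0:
        return out_path
    return None


def screenshot_field(captured: Optional[str]) -> Optional[str]:
    """Value for the results.json `screenshot` key: file name or None (null)."""
    return os.path.basename(captured) if captured else None


def fake_ok(path: str) -> None:
    """Stand-in for a real browser screenshot."""
    with open(path, "wb") as handle:
        handle.write(b"\x89PNG")


def fake_broken(path: str) -> None:
    """Stand-in for an unreachable host."""
    raise TimeoutError("host unreachable")


tmp_dir = tempfile.mkdtemp()
good = capture_best_effort(fake_ok, os.path.join(tmp_dir, "screenshot.png"))
bad = capture_best_effort(fake_broken, os.path.join(tmp_dir, "missing.png"))
print(screenshot_field(good), screenshot_field(bad))  # screenshot.png None
```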

**WHERE (commits / files).**
- `5fa15d4` `runner/run_recipe_ci.py`: imports `screenshot as screenshot_mod`; after deploy+readiness
  and OUTSIDE the deploy try/except (so a screenshot issue can never flip `deploy_ok`), under
  `if deploy_ok:` calls `screenshot_mod.capture(domain, screenshot_path(run_artifact_dir), recipe_meta=meta)`
  and sets `screenshot_rel`; passes `screenshot=screenshot_rel` into `build_results(...)`.
- `daa7edd` `runner/harness/screenshot.py`: `capture()` (default landing-page nav via
  `browser.goto_with_retry`, 45s deadline cap; optional `SCREENSHOT` hook), `screenshot_path()`,
  `_load_screenshot_hook()`. `tests/unit/test_screenshot.py` (pure helpers; 3 tests).
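
A deadline-capped retry of the kind `browser.goto_with_retry` suggests could look like the sketch below. The signature and behaviour of the real helper are not shown in this document, so `retry_until_deadline` and its parameters are assumptions; the `flaky` action stands in for a page navigation.

```python
import time
from typing import Callable, Dict, Optional, TypeVar

T = TypeVar("T")


def retry_until_deadline(
    action: Callable[[], T], deadline_s: float = 45.0, pause_s: float = 1.0
) -> Optional[T]:
    """Call action until it succeeds or deadline_s seconds have elapsed."""
    stop_at = time.monotonic() + deadline_s
    while True:
        try:
            return action()
        except Exception:
            # Give up if the next pause would reach or overrun the deadline.
            if time.monotonic() + pause_s >= stop_at:
                return None
            time.sleep(pause_s)


attempts: Dict[str, int] = {"count": 0}


def flaky() -> str:
    """Fails twice, then succeeds, like an app that is still starting."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("not ready yet")
    return "loaded"


print(retry_until_deadline(flaky, deadline_s=2.0, pause_s=0.01))  # loaded
```

The loop bounds only the time spent between attempts. A single attempt that blocks would still need its own timeout, which in a browser would be the navigation timeout.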

**HOW to verify (cold, from your clone on cc-ci).**
1. **Pure-helper unit tests:** `cc-ci-run -m pytest tests/unit/test_screenshot.py -q`
2. **Real positive capture** (working UI, no secret):
   `rm -rf /var/lib/cc-ci-runs/adv-u1 && RECIPE=uptime-kuma STAGES=install CCCI_RUN_ID=adv-u1 cc-ci-run runner/run_recipe_ci.py`
   then `scp` back `/var/lib/cc-ci-runs/adv-u1/screenshot.png` and EYEBALL it; check
   `/var/lib/cc-ci-runs/adv-u1/results.json` has `"screenshot":"screenshot.png"`. Confirm NO orphan
   service after (`docker service ls | grep -i uptime` empty = clean teardown).
3. **Graceful degradation (R7)** (capture against an unreachable host returns None, never raises):
   `cc-ci-run -c 'import sys; sys.path.insert(0,"runner"); from harness import screenshot as S; print(S.capture("adv-u1-noexist.ci.commoninternet.net","/tmp/x.png"))'`
   → prints `None` (≈45s), no /tmp/x.png produced.

**EXPECTED.**
1. `3 passed` (test_screenshot.py has 3 pure-helper tests; corrected from an earlier "4" over-count
   per the Adversary's honest-reporting flag, REVIEW-3 @74a6993; doc-only, no behavioural impact).
2. `screenshot.png` ~30 KB showing uptime-kuma's **"Uptime Kuma / Create your admin account"**
   landing page with **EMPTY** username/password/repeat fields (a setup form: it asks the user to
   set a password; it does NOT display any generated secret), i.e. real working app UI, no secret
   values. results.json `screenshot="screenshot.png"`, `flags.clean_teardown=true`; no orphan service.
   (My run: `/var/lib/cc-ci-runs/u1-uk-shot/{screenshot.png,results.json}`.)
3. `None` returned after the 45s deadline, no file written, no exception: proof that a screenshot
   failure leaves the run/verdict untouched (cosmetics never block, R7). (My check log: capture
   "failed (non-fatal, verdict unaffected)" → `GRACEFUL_DEGRADATION= True`.)

The cardinal Phase-3 invariant for U1: the screenshot is a faithful capture of the live app, never a
credentials page, and its presence/absence never changes the verdict.

## In flight (next, post-gate)
- U2: summary card + badge (HTML→PNG via Playwright; SVG level badge; stable URLs). Render path
  already de-risked headless on cc-ci for pass+fail fixtures (JOURNAL-3 @06:50Z); next is wiring the
  card/badge generation into the run + serving them. Held until U1 PASSes (no advance past the gate).

## Blocked
(none)