# REVIEW-dash — Adversary verdicts for phase `dash` (per-recipe run history fix)
SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-dash-recipe-history.md
Gates: M1 (fix implemented + locally verified), M2 (deployed + verified live).
---
## Pre-claim independent ground truth (Adversary, @2026-06-17T16:20Z, cold)
Gathered directly from the host (`ssh cc-ci`), BEFORE any Builder claim — this is my own
baseline to verify the fix against, not the Builder's narrative.
**Run artifacts on host `/var/lib/cc-ci-runs`:**
- **432** run dirs total; **308** have a parseable `results.json`; **124** dirs have NO
parseable `results.json` (in-flight / failed-early — contain only `junit/`, `screenshot.png`,
`abra/`). The fix MUST skip these 124 gracefully (no 500).
- `results.json` (schema 2) keys: `customization, finished, flags, level, lint, pr, recipe, ref,
results, run_id, rungs, schema, screenshot, skips, stages, summary_card, version`.
Fields the history needs ARE present: `recipe`, `version`, `level`, `ref`, `finished` (epoch
float timestamp), `run_id`. Status is derivable from `results`/`rungs` (per-stage pass/fail).
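A tally like the one above can be reproduced with a short stdlib script. The following is a minimal sketch, not the command actually run on the host: the function name `tally_runs` is hypothetical, and it assumes one `results.json` directly inside each run dir with a top-level `recipe` key.

```python
import json
import os
from collections import Counter


def tally_runs(root):
    """Return (total_dirs, parseable, per_recipe) for a run-artifact root.

    Assumption: each run dir holds at most one results.json at its top level.
    """
    total = 0
    per_recipe = Counter()
    for entry in sorted(os.listdir(root)):
        run_dir = os.path.join(root, entry)
        if not os.path.isdir(run_dir):
            continue
        total += 1
        try:
            with open(os.path.join(run_dir, "results.json"), encoding="utf-8") as fh:
                data = json.load(fh)
        except (OSError, ValueError):
            # Missing or malformed results.json: in-flight or failed-early run.
            continue
        # Only dicts that name a recipe count as parseable here.
        if isinstance(data, dict) and data.get("recipe"):
            per_recipe[data["recipe"]] += 1
    parseable = sum(per_recipe.values())
    return total, parseable, per_recipe
```

Called as `tally_runs("/var/lib/cc-ci-runs")`, the difference `total - parseable` is the number of dirs the history page has to skip.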
**Per-recipe run counts (from parseable results.json):**
```
33 plausible     24 ghost             9 mailu                6 cryptpad
33 custom-html   24 custom-html-tiny  8 lasuite-drive        3 drone
28 immich        15 mattermost-lts    8 lasuite-docs         3 custom-html-rst-bad
25 discourse     12 uptime-kuma       8 gitea
24 (ghost)       12 mumble            8 bluesky-pds
                 11 matrix-synapse    7 custom-html-bkp-bad
                 10 lasuite-meet      6 keycloak
                  9 n8n               6 hedgedoc
```
- `bluesky-pds` (named M2 target) → **8 runs**. `plausible`/`custom-html` → 33 (exceed a 30 cap →
good cap test). A ~30 display cap should show 8 for bluesky-pds, 30 for plausible/custom-html.
**bluesky-pds runs — newest-first BY `finished` timestamp (the correct order):**
```
run_id                  ref           level  finished
753                     dcf933813df9  5      1781663348
556                     f7b6c8dfb81c  5      1781301301
435                     f7b6c8dfb81c  5      1781192858
427                     f7b6c8dfb81c  5      1781178768
423                     f7b6c8dfb81c  0      1781178063
ab-bluesky-pds-oldmain  b2d86efba3f1  0      1781126338
m2rr-bluesky-pds        b2d86efba3f1  0      1781123524
m2r-bluesky-pds         b2d86efba3f1  0      1781121610
```
**ADVERSARIAL TRAP TO CHECK:** run ids are MIXED numeric (753, 556, …) AND named
(`m2rr-bluesky-pds`, `ab-bluesky-pds-oldmain`). Sorting by `int(run_id)` would crash on the named
runs (or misorder them if they are coerced); sorting lexically compares ids character by character,
so `9...` lands after `7...` regardless of numeric value, and the named ones end up misplaced.
**Only a `finished`-timestamp sort yields the correct newest-first order.** I will verify that the
deployed page matches the timestamp order above, that 423 (older, finished 1781178063) sorts
BELOW 427 (finished 1781178768) even though the two ids are numerically close, and that the named
runs land in their timestamp positions, not bunched at top/bottom.
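The trap can be reproduced offline with the eight bluesky-pds rows from the table above. This is an illustrative sketch only: the three helper names are hypothetical and are not taken from `dashboard/dashboard.py`.

```python
# Rows copied from the bluesky-pds table above, deliberately shuffled:
# (run_id, finished).
RUNS = [
    ("m2r-bluesky-pds", 1781121610),
    ("423", 1781178063),
    ("753", 1781663348),
    ("ab-bluesky-pds-oldmain", 1781126338),
    ("556", 1781301301),
    ("427", 1781178768),
    ("m2rr-bluesky-pds", 1781123524),
    ("435", 1781192858),
]


def by_int_id(runs):
    # Raises ValueError on named ids such as "m2r-bluesky-pds".
    return sorted(runs, key=lambda r: int(r[0]), reverse=True)


def by_lexical_id(runs):
    # Does not crash, but orders by characters, not by time.
    return sorted(runs, key=lambda r: r[0], reverse=True)


def by_finished(runs):
    # Newest-first by completion timestamp; ids play no part in the order.
    return sorted(runs, key=lambda r: r[1], reverse=True)
```

On this data `by_int_id` raises, `by_lexical_id` bunches the three named runs at the top, and only `by_finished` reproduces the order in the table.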
**Current (buggy) code (`dashboard/dashboard.py`):** `history_for(recipe)` returns
`[_build_row(b) for b in _custom_recipe_builds() …]`; `_custom_recipe_builds` fetches a single
Drone page `…/builds?per_page=100`. So history is capped at whatever recipe runs fall in the
latest-100 Drone window → most recipes show 1 row. Confirmed root cause matches plan §1.
**Things I will break-test on the fix:**
1. Count + order per recipe match the host artifacts (esp. bluesky-pds 8, timestamp order above).
2. The 124 unparseable dirs don't 500 and don't appear as garbage rows.
3. Path-traversal guard + `/recipe/<name>` validation preserved (try `/recipe/../..`,
`/recipe/foo%2f..`, arg injection in recipe name).
4. Overview (`/`), `/badge/<recipe>.svg`, `/runs/<id>/<file>` unchanged.
5. stdlib-only (no new imports/deps); mount stays read-only.
6. Display cap actually bounds (plausible/custom-html show cap, not 33) AND newest are kept
(not oldest) when capped.
7. Run links resolve — for named run ids too (no Drone build number for m2r*/ab-*).
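Check 3 targets a guard of roughly the following shape. This is a sketch under stated assumptions: the pattern `NAME_RE` and the helpers `is_valid_name` and `safe_run_dir` are hypothetical stand-ins and may differ from the dashboard's actual validation.

```python
import os
import re

# Hypothetical allow-list: letters, digits, '-' and '_', starting alphanumeric.
NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]*")


def is_valid_name(name):
    # fullmatch: the whole string must fit the pattern, not just a prefix.
    return NAME_RE.fullmatch(name) is not None


def safe_run_dir(root, run_id):
    """Return the resolved run dir, or None if the id is invalid or escapes root."""
    if not is_valid_name(run_id):
        return None
    root_real = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root_real, run_id))
    # Second line of defence: a name that passes the pattern (for example a
    # symlink inside root) must still resolve to a path under root.
    if os.path.commonpath([root_real, candidate]) != root_real:
        return None
    return candidate
```

The pattern check rejects separators and shell metacharacters up front; the `realpath` comparison covers the case where a syntactically valid name resolves outside the artifact root.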
---
## Verdicts
### M1: PASS @2026-06-17T16:30Z (claim 3595e80, cold-verified)
`history_for` rewritten to source per-recipe history from local `/var/lib/cc-ci-runs` artifacts
(`_local_history` scans dirs → `_results_for` → groups by recipe → sorts newest-first by `finished`,
caps at `HISTORY_CAP=30`). All checks done COLD from my own fixture (tarred the 308 real
`results.json` off the host), against my own pre-claim baseline — not the Builder's word:
- **Count + order match host exactly.** `history_for("bluesky-pds")` → 8 rows in order
`['753','556','435','427','423','ab-bluesky-pds-oldmain','m2rr-bluesky-pds','m2r-bluesky-pds']`
— IDENTICAL to my independent timestamp-derived baseline. **The mixed numeric+named id trap is
handled correctly**: sort key is `(finished, _numeric_id)` reverse; `_numeric_id` returns -1 for
named ids (no `int()` crash); 423 (older) sorts below 427 though numerically close; named runs land
in their timestamp positions, not bunched. Total parseable grouped rows **308**, 23 recipes — match.
- **Display cap bounds AND keeps newest.** plausible 33→30, custom-html 33→30; verified
`min(finished in capped) >= max(finished dropped)` (oldest 3 dropped, not newest).
- **Malformed/empty dirs skipped, no 500.** Injected EMPTYDIR / dir-with-junit-no-json /
malformed-json dir into fixture → total stayed 308, no exception, none appear as rows
(`_results_for` returns `{}` on miss/malformed; `_local_history` skips no-recipe rows).
- **Security preserved.** `_RUN_ID_RE` rejects `../..`, `foo/..`, `a b`, `x;rm`, `..%2f`, the empty string, `.`,
`foo;`, `<script>`; accepts `bluesky-pds`. `_results_for("../../etc/passwd")` → `{}` (realpath
guard intact). Unchanged from before.
- **No regression to other routes.** `latest_per_recipe` / `_custom_recipe_builds` (overview + badge
source) untouched; only the history page changed source. Row-key parity: `_local_history_row` emits
the IDENTICAL 10 keys as `_build_row`, so `render_history` is unchanged.
- **stdlib-only.** Imports unchanged: html, json, os, re, sys, time, urllib, http.server. No new deps.
- **Renders.** `render_history("bluesky-pds", …)` → 5384 bytes, 8 data rows; numeric ids link to
Drone build, named ids link to `/runs/<id>/summary.html` — all four checked artifacts exist on host.
- **Unit suite: 13 passed** (incl. new `test_history_sourced_from_local_artifacts`).
No defects. M1 verified. (Consulted JOURNAL-dash.md only AFTER writing this verdict — no new concerns.)
M2 (deploy + live verify) not yet claimed.
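For reference, the group, sort and cap behaviour verified above can be sketched as follows. This is an independent illustration, not the Builder's code: `numeric_id` and `local_history` are hypothetical names that only mirror the roles of `_numeric_id` and `_local_history`, and the input is assumed to be already-parsed `results.json` dicts.

```python
from collections import defaultdict

HISTORY_CAP = 30  # cap value quoted in the verdict


def numeric_id(run_id):
    """Numeric ids sort by value; named ids get -1 instead of raising."""
    try:
        return int(run_id)
    except ValueError:
        return -1


def local_history(results, cap=HISTORY_CAP):
    """Group result dicts by recipe, newest-first, capped per recipe.

    Empty dicts and dicts without a recipe (unparseable runs) are skipped.
    """
    grouped = defaultdict(list)
    for row in results:
        if not row or not row.get("recipe"):
            continue
        grouped[row["recipe"]].append(row)
    history = {}
    for recipe, rows in grouped.items():
        # Primary key: finished timestamp; the id only breaks exact ties.
        rows.sort(
            key=lambda r: (r.get("finished") or 0, numeric_id(str(r.get("run_id", "")))),
            reverse=True,
        )
        # Slicing after the descending sort keeps the newest `cap` rows.
        history[recipe] = rows[:cap]
    return history
```

Because the slice is taken after the descending sort, the invariant checked in the verdict (`min(finished in capped) >= max(finished dropped)`) holds by construction.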
### M2: PASS @2026-06-17T16:40Z (claim 4c0b289, cold-verified live)
Dashboard redeployed with the M1 fix; per-recipe history verified on the LIVE site
(`https://ci.commoninternet.net`). All probes run cold against the live service + re-derived host
ground truth (host now 439 dirs / 23 recipes — re-counted fresh, not trusting the claim):
- **Deployed image rolled + healthy.** `docker service ls` → `1/1 cc-ci-dashboard:11ac2a1e6c07`
(the M1 content-hash tag, rolled from `15addbc7bf45`). The live page serving 8 bluesky-pds rows
incl. named ids is conclusive proof the NEW code is live (the old Drone-slice code could not).
- **Live counts = host counts.** bluesky-pds **8**=8, ghost **24**=24, immich **28**=28,
discourse **25**=25; plausible **30** and custom-html **30** correctly capped from 33. All match my
freshly re-derived host per-recipe counts.
- **Live order matches host timestamp order (mixed-id trap).** `/recipe/bluesky-pds` rows in exact
order `753 556 435 427 423 ab-bluesky-pds-oldmain m2rr-bluesky-pds m2r-bluesky-pds` — identical to
my baseline. Per-row status/level/version also match: 753/556/435/427 = success L5; 423 + the three
named runs = failure L0; refs correct.
- **Cap keeps NEWEST live.** `/recipe/plausible` top row = run **758**, which IS the host's newest
plausible run by `finished` (1781665203). Oldest dropped, not newest.
- **Other routes intact.** overview `/` → 200, `/badge/bluesky-pds.svg` → 200; overview still
latest-per-recipe (Drone-sourced, unchanged).
- **Security intact live.** Traversal/injection rejected at the live edge: `..%2f..%2fetc%2fpasswd`
→ 404, `%2e%2e%2f%2e%2e` → 404 (no `root:` leak); `;`-injection → 404. The only 200s are harmless:
`../..`/`%2e%2e` normalize to `/` (overview, no file content), and a valid-format-but-unknown name
renders an empty history (0 rows). `_RUN_ID_RE` + realpath guards hold.
- **Retention adequate (independently confirmed).** `grep -rniE cc-ci-runs nix/` shows NO
rm/find-delete/prune/maxage/tmpfiles trim — nothing reaps `/var/lib/cc-ci-runs`. 439 dirs span
2026-05-31 → 2026-06-17. No growth cap needed now (recorded in DECISIONS).
No defects. **M1 + M2 both fresh PASS, no VETO** → Builder may write `## DONE`.