claim(M1): per-recipe history sourced from local /var/lib/cc-ci-runs artifacts (full history, not Drone 100-build slice)

history_for() now enumerates run dirs' results.json, groups by recipe, sorts newest-first by finished timestamp (mixed numeric+named ids; timestamp is the only correct key), caps at HISTORY_CAP=30, skips malformed/empty/no-recipe dirs. Overview + badges + /runs + security guards + stdlib-only unchanged. Local verify: 13/13 unit tests; full-fixture vs 308 real results.json → bluesky-pds=8 in exact ts order, plausible capped 30 newest, edge dirs skipped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 16:25:39 +00:00
parent 2d5211f401
commit 3595e80d08
6 changed files with 277 additions and 8 deletions
--- a/machine-docs/BACKLOG-dash.md
+++ b/machine-docs/BACKLOG-dash.md
@@ -0,0 +1,14 @@
+# BACKLOG — phase `dash`
+
+## Build backlog
+
+- [x] Root-cause confirmed (Drone 100-build window) + host artifact schema inspected.
+- [x] M1: rewrite `history_for` to source from `/var/lib/cc-ci-runs` local artifacts, newest-first by
+      `finished`, capped at HISTORY_CAP, malformed/empty dirs skipped, security/other routes unchanged.
+- [x] M1: unit test for local sourcing (count/order/cap/skip) + full-fixture verify vs real data.
+- [ ] M1: awaiting Adversary PASS in REVIEW-dash.md.
+- [ ] M2: deploy (rebuild dashboard image via deploy-dashboard reconcile / nixos-rebuild; content-hash
+      tag rolls on dashboard.py change), verify live on `/recipe/bluesky-pds` + ≥2 recipes, overview +
+      badges still 200, host health after.
+- [ ] M2: confirm retention does not trim `/var/lib/cc-ci-runs` (record in DECISIONS if a cap needed).
+- [ ] DONE: both gates Adversary-PASS in REVIEW-dash.md → write `## DONE` in STATUS-dash.md.
--- a/machine-docs/DECISIONS.md
+++ b/machine-docs/DECISIONS.md
@@ -1566,3 +1566,16 @@ so the fallback (decouple version-record from retained volume) is NOT needed. Me
  at the full 20-enrolled set. WC8 disk-hygiene (`ci-docker-prune`) keeps residue bounded.
 Conclusion: keep all-enrolled with retained volumes; revisit only if `/` free drops below a single
 recipe's largest restore (~1–2G working set). No recipe dropped for disk.
+
+## phase dash — per-recipe history sourced from local run artifacts (2026-06-17)
+The dashboard's per-recipe history page (`/recipe/<recipe>`) sources its run list from the local
+`/var/lib/cc-ci-runs/*/results.json` artifacts (complete: 308 finished runs; durable; already
+bind-mounted read-only), NOT the Drone `…/builds?per_page=100` slice (root cause: that 100-build
+window dropped each recipe's older runs out of view after the regall sweep → most recipes showed 1
+run). Newest-first by the `results.json` `finished` timestamp (run ids are MIXED numeric + named, so
+only a timestamp sort is correct — `int(run_id)` would crash on `m2r-*`/`ab-*`); display-capped at
+`HISTORY_CAP=30`. Status derived from the per-stage `results` map (no top-level status field). The
+OVERVIEW (`/`) and badges keep their Drone latest-per-recipe source unchanged. Deliberately did NOT
+merge Drone live "running" status into history (optional per plan; re-adds the network dependency the
+local source removes; overview already shows live status). Retention: 308 parseable runs present, no
+trim job observed → adequate; revisit only if a cap is ever needed.
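
The local sourcing described in the DECISIONS entry above can be sketched as below. This is a minimal illustration, not the code from `dashboard/dashboard.py` (which this page does not show): the helper names `load_results` and `group_by_recipe` are hypothetical, and the sketch assumes each `results.json` is a JSON object carrying a `recipe` field.

```python
import json
import os


def load_results(run_dir):
    """Return the parsed results.json of one run dir, or {} if missing/malformed."""
    path = os.path.join(run_dir, "results.json")
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):  # missing file, unreadable file, or bad JSON
        return {}
    # Assumption: a valid artifact is a JSON object; anything else is skipped.
    return data if isinstance(data, dict) else {}


def group_by_recipe(runs_dir):
    """Map recipe -> list of (run_id, results) for every parseable run dir."""
    grouped = {}
    for run_id in sorted(os.listdir(runs_dir)):
        run_dir = os.path.join(runs_dir, run_id)
        if not os.path.isdir(run_dir):
            continue
        data = load_results(run_dir)
        recipe = data.get("recipe")
        if not recipe:  # empty dir, malformed JSON, or no recipe field
            continue
        grouped.setdefault(recipe, []).append((run_id, data))
    return grouped
```

Returning `{}` for every unreadable artifact keeps the skip logic in one place: the caller only has to check for a missing `recipe`.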
--- a/machine-docs/JOURNAL-dash.md
+++ b/machine-docs/JOURNAL-dash.md
@@ -0,0 +1,44 @@
+# JOURNAL — phase `dash` (reasoning; Adversary does not read before verdict)
+
+## 2026-06-17 — M1 design + implementation
+
+**Root cause (confirmed against plan §1 + host):** `history_for` read `_custom_recipe_builds()`,
+which fetches a single Drone page `…/builds?per_page=100`. The recent `regall` sweep `!testme`'d all
+21 recipes once, filling the latest-100 window, so each recipe's older runs fell outside it → most
+recipes rendered exactly 1 history row. Host has 432 run dirs (308 parseable `results.json`).
+
+**Why source from local artifacts, not paginate Drone:** the plan's chosen design. Local artifacts
+are complete (308 finished runs vs 100-build Drone window), durable (independent of Drone
+retention/pagination), already bind-mounted read-only, and already read per-run by `_results_for`.
+Pure-local also removes a network dependency + failure mode from the history page. I deliberately did
+NOT merge in Drone "currently running" live status (plan lists it as an optional "e.g." value-add):
+it re-introduces the Drone dependency and the overview already shows live status; the DoD asks only
+that the *historical* list come from local artifacts. Recorded as a decision.
+
+**Status derivation:** `results.json` (schema 2) has no top-level status field. Derived from the
+per-stage `results` map: any `fail`/`error` → failure; all `pass`/`skip` → success; else unknown.
+A skip alone is not a failure (e.g. custom-html-bkp-bad: backup=fail → failure; level-5 plausible:
+all pass → success). This matches what the run actually did without inventing a Drone call.
+
+**The sort trap (flagged by Adversary's pre-claim baseline too):** run ids are MIXED numeric
+(`753`,`556`) and named (`m2r-bluesky-pds`,`ab-bluesky-pds-oldmain`). `int(run_id)` would crash on
+named ids; lexical sort would scatter them and misorder `9…` vs `7…`. The ONLY correct order is by
+`finished` timestamp. Sort key = `(finished, _numeric_id)` reverse — finished is primary, numeric id
+is a stable tiebreak (named ids get -1, so timestamp always decides their slot). Verified the output
+matches the Adversary's independently-derived bluesky-pds order byte-for-byte.
+
+**Cap:** `HISTORY_CAP=30` (env-overridable). Sorted newest-first BEFORE slicing, so the cap keeps the
+30 newest and drops the oldest; verified that `plausible` (33 runs) keeps the newest 30 and drops the oldest 3.
+
+**Caching:** `_local_history` scans the whole runs dir once per `CACHE_TTL` (reuses the existing 30s
+TTL) and groups by recipe, so a busy page doesn't json-load 300+ files per request. `_results_for`
+(already traversal-guarded) is reused for each dir read, so the path-traversal guarantee is unchanged.
+
+**Retention:** 308 parseable runs present spanning many days — retention is adequate; no trimming of
+`/var/lib/cc-ci-runs` observed that would delete history. Will confirm no cleanlogs/prune job trims it
+during M2 and record in DECISIONS if a cap is ever needed (none needed now).
+
+**Local verification (M1):** 13/13 unit tests pass (incl. new local-sourcing test). Full-fixture run
+against all 308 real `results.json` + injected malformed/empty/no-recipe dirs: bluesky-pds=8 in exact
+timestamp order, plausible capped 30 (newest kept), 308 total grouped, edge dirs skipped without
+raising, security guards (`_RUN_ID_RE`, `_results_for`, `serve_run_file`) all still reject traversal.
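
The status derivation, sort key, and cap described in the journal above can be sketched as follows. This is an illustration under stated assumptions, not the committed implementation: the function names are hypothetical, stage values are assumed to be plain strings, an empty `results` map is treated as unknown, and all `finished` values are assumed to be mutually comparable (all numbers or all same-format strings). The row data in the usage note is made up.

```python
HISTORY_CAP = 30  # default stated in the journal; the env override is omitted here


def run_status(stage_results):
    """Derive a run status from the per-stage results map."""
    values = list(stage_results.values())
    if any(v in ("fail", "error") for v in values):
        return "failure"
    # Assumption: an empty map is "unknown", not a vacuous success.
    if values and all(v in ("pass", "skip") for v in values):
        return "success"
    return "unknown"


def numeric_id(run_id):
    """Numeric ids keep their value; named ids get -1 so the timestamp decides."""
    if run_id.isascii() and run_id.isdigit():
        return int(run_id)
    return -1


def newest_first(rows, cap=HISTORY_CAP):
    """Sort newest-first by (finished, numeric id), then keep the first `cap` rows."""
    ordered = sorted(
        rows,
        key=lambda r: (r["finished"], numeric_id(r["id"])),
        reverse=True,
    )
    # Slicing after the sort is what makes the cap drop the oldest rows.
    return ordered[:cap]
```

For example, `newest_first([{"id": "m2r-x", "finished": 1}, {"id": "753", "finished": 9}])` puts `753` first, and neither id is ever passed to `int()` unless it is purely numeric.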
--- a/machine-docs/STATUS-dash.md
+++ b/machine-docs/STATUS-dash.md
@@ -0,0 +1,68 @@
+# STATUS — phase `dash` (per-recipe run history fix)
+
+SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-dash-recipe-history.md
+Gates: M1 (fix implemented + locally verified) · M2 (deployed + verified live)
+
+## Gate: M1 CLAIMED, awaiting Adversary
+
+**WHAT** — `history_for(recipe)` in `dashboard/dashboard.py` now sources the FULL per-recipe run
+history from the local run artifacts under `/var/lib/cc-ci-runs` (each run dir's `results.json`),
+newest-first by the `finished` timestamp, display-capped at `HISTORY_CAP` (default 30). It no longer
+reads the Drone `…/builds?per_page=100` slice (the root cause: that window dropped a recipe's older
+runs out of view, so most recipes showed 1 run). Overview (`/`), `/badge/<recipe>.svg`,
+`/runs/<id>/<file>`, security guards, and stdlib-only constraint are unchanged.
+
+**WHERE** —
+- Commit: see `git log` on origin/main for the `claim(M1)` commit (this push).
+- Changed files: `dashboard/dashboard.py` (new `_run_status`, `_numeric_id`, `_local_history_row`,
+  `_local_history`; rewritten `history_for`; new `HISTORY_CAP`; new `_LOCAL` cache), and
+  `tests/unit/test_dashboard.py` (new `test_history_sourced_from_local_artifacts`).
+- Host artifacts the page reads: `/var/lib/cc-ci-runs/<id>/results.json` (bind-mounted read-only into
+  the dashboard container, unchanged from before).
+
+**HOW to verify (cold, from a fresh clone)** —
+1. Unit suite (stdlib render + new local-sourcing test):
+   ```
+   nix-shell -p 'python3.withPackages(ps:[ps.pytest])' --run \
+     'DRONE_TOKEN_FILE=$(mktemp) python3 -m pytest tests/unit/test_dashboard.py -q'
+   ```
+   EXPECTED: `13 passed`.
+2. Verify against the REAL host artifacts. Build a fixture of every `results.json` and run
+   `history_for` against it (no Drone, no network):
+   ```
+   FIX=/tmp/advfix; rm -rf $FIX; mkdir -p $FIX
+   ssh cc-ci 'cd /var/lib/cc-ci-runs && tar -cf - */results.json 2>/dev/null' | tar -xf - -C $FIX
+   printf x > /tmp/t.tok
+   DRONE_TOKEN_FILE=/tmp/t.tok CCCI_RUNS_DIR=$FIX python3 -c '
+   import sys; sys.path.insert(0,"dashboard"); import dashboard as d
+   r=d.history_for("bluesky-pds")
+   print("count", len(r), [x["number"] for x in r])
+   print("total parseable", sum(len(v) for v in d._local_history().values()))
+   print("plausible cap", len(d.history_for("plausible")))'
+   ```
+   EXPECTED:
+   - `bluesky-pds` count **8**, order EXACTLY
+     `['753','556','435','427','423','ab-bluesky-pds-oldmain','m2rr-bluesky-pds','m2r-bluesky-pds']`
+     (newest-first by `finished`; note 423 sorts BELOW 427 though id 423<427, and named ids land in
+     their timestamp positions — the mixed numeric+named id trap).
+   - total parseable grouped rows **308** (matches host: 432 dirs, 308 with parseable `results.json`).
+   - `plausible` capped at **30** (of 33), newest kept.
+
+**EXPECTED — invariants the Adversary's break-tests should confirm hold**
+- The 124 run dirs with no/malformed `results.json` are skipped (no 500, no garbage row): `_results_for`
+  returns `{}` on miss/malformed/non-dir, `_local_history` skips any row with no `recipe`.
+- Security preserved (untouched code paths): `/recipe/<name>` still gated by `_RUN_ID_RE`
+  (`^[A-Za-z0-9][A-Za-z0-9._-]*$` → rejects `../..`, `foo/..`, spaces, `;`); `_results_for` /
+  `serve_run_file` still realpath-guarded against escaping `/var/lib/cc-ci-runs`.
+- stdlib-only: no new imports (still `html,json,os,re,sys,time,urllib,http.server`).
+- Overview (`/`) and `/badge/<recipe>.svg` still sourced from Drone latest-per-recipe (`_custom_recipe_builds`
+  / `latest_per_recipe` unchanged) — only the *history* page changed source.
+- Run-link resolution: numeric id → `{DRONE_URL}/{CI_REPO}/<id>`; named id (`m2r-*`, `ab-*`) →
+  `/runs/<id>/summary.html` (local, since no Drone build number exists).
+- Status pill derived from the per-stage `results` map (`results.json` has no top-level status):
+  any `fail`/`error` → failure; all `pass`/`skip` → success; else unknown.
+
+## Gate: M2 — NOT STARTED (deploy + live verify; begins after M1 PASS)
+
+## Blocked
+(none)
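
The name gate and run-link resolution listed in the STATUS invariants above can be sketched like this. The regex is the one quoted in STATUS; everything else is an assumption for illustration: `is_safe_name` and `run_link` are hypothetical names, and the `DRONE_URL` / `CI_REPO` values are placeholders, not the deployment's real settings.

```python
import re

# Pattern quoted in STATUS-dash.md for _RUN_ID_RE.
RUN_ID_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")

DRONE_URL = "https://drone.example.invalid"  # placeholder value
CI_REPO = "owner/repo"  # placeholder value


def is_safe_name(name):
    """True if `name` is a single path-safe token (no slash, no leading dot)."""
    # fullmatch is used here so a trailing newline cannot slip past `$`.
    return RUN_ID_RE.fullmatch(name) is not None


def run_link(run_id):
    """Numeric ids link to the Drone build; named ids link to the local summary."""
    if run_id.isascii() and run_id.isdigit():
        return f"{DRONE_URL}/{CI_REPO}/{run_id}"
    return f"/runs/{run_id}/summary.html"
```

With this gate, `../..` fails on its first character and `foo/..` fails on the slash, which matches the rejections STATUS lists.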