REVIEW-dash — Adversary verdicts for phase `dash` (per-recipe run history fix)

SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-dash-recipe-history.md

Gates: M1 (fix implemented + locally verified), M2 (deployed + verified live).

Pre-claim independent ground truth (Adversary, @2026-06-17T16:20Z, cold)

Gathered directly from the host (ssh cc-ci), BEFORE any Builder claim — this is my own baseline to verify the fix against, not the Builder's narrative.

Run artifacts on host /var/lib/cc-ci-runs:

432 run dirs total; 308 have a parseable results.json; 124 dirs have NO parseable results.json (in-flight / failed-early — contain only junit/, screenshot.png, abra/). The fix MUST skip these 124 gracefully (no 500).
results.json keys (schema 2): customization, finished, flags, level, lint, pr, recipe, ref, results, run_id, rungs, schema, screenshot, skips, stages, summary_card, version. The fields the history needs ARE present: recipe, version, level, ref, finished (epoch float timestamp), run_id. Status is derivable from results/rungs (per-stage pass/fail).
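The "skip gracefully" requirement can be sketched as a tolerant loader. This is a minimal illustration, not the dashboard's code: the function names load_results and split_runs are hypothetical, and the fixture is synthetic rather than taken from the host.

```python
import json
import os
import tempfile


def load_results(run_dir):
    """Return the parsed results.json for run_dir, or {} if missing/malformed."""
    path = os.path.join(run_dir, "results.json")
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):
        # Missing/unreadable file or invalid JSON: treat as "no result".
        # json.JSONDecodeError is a subclass of ValueError.
        return {}
    # Only a JSON object that names a recipe is usable as a history row.
    if not isinstance(data, dict) or not data.get("recipe"):
        return {}
    return data


def split_runs(root):
    """Partition run dirs under root into (parseable, skipped) name lists."""
    parseable, skipped = [], []
    for name in sorted(os.listdir(root)):
        run_dir = os.path.join(root, name)
        if not os.path.isdir(run_dir):
            continue
        (parseable if load_results(run_dir) else skipped).append(name)
    return parseable, skipped


def demo():
    """Build a synthetic fixture: one good run and three unusable dirs."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "good"))
        with open(os.path.join(root, "good", "results.json"), "w") as fh:
            json.dump({"recipe": "bluesky-pds", "run_id": "753"}, fh)
        os.makedirs(os.path.join(root, "empty"))
        os.makedirs(os.path.join(root, "junit-only", "junit"))
        os.makedirs(os.path.join(root, "malformed"))
        with open(os.path.join(root, "malformed", "results.json"), "w") as fh:
            fh.write("{not json")
        return split_runs(root)
```

Run against the real host directory, a loader of this shape would be expected to report 308 parseable and 124 skipped dirs, with no exception reaching the request handler.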

Per-recipe run counts (from parseable results.json):

33 plausible      24 ghost              9 mailu            6 cryptpad
33 custom-html    24 custom-html-tiny   8 lasuite-drive    3 drone
28 immich         15 mattermost-lts     8 lasuite-docs     3 custom-html-rst-bad
25 discourse      12 uptime-kuma        8 gitea
24 (ghost)        12 mumble             8 bluesky-pds
                  11 matrix-synapse     7 custom-html-bkp-bad
                  10 lasuite-meet       6 keycloak
                   9 n8n                6 hedgedoc

bluesky-pds (named M2 target) → 8 runs. plausible/custom-html → 33 (exceed a 30 cap → good cap test). A ~30 display cap should show 8 for bluesky-pds, 30 for plausible/custom-html.
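As a cross-check on the table, the counts can be summed and the cap applied arithmetically. The sketch below transcribes the counts above, assuming the "24 (ghost)" cell is a repeat of the ghost entry rather than a separate recipe, and assuming a cap of exactly 30.

```python
# Per-recipe counts transcribed from the table above.
COUNTS = {
    "plausible": 33, "custom-html": 33, "immich": 28, "discourse": 25,
    "ghost": 24, "custom-html-tiny": 24, "mattermost-lts": 15,
    "uptime-kuma": 12, "mumble": 12, "matrix-synapse": 11,
    "lasuite-meet": 10, "n8n": 9, "mailu": 9, "lasuite-drive": 8,
    "lasuite-docs": 8, "gitea": 8, "bluesky-pds": 8,
    "custom-html-bkp-bad": 7, "keycloak": 6, "hedgedoc": 6,
    "cryptpad": 6, "drone": 3, "custom-html-rst-bad": 3,
}
CAP = 30  # assumed value for the "~30" display cap


def displayed(count, cap=CAP):
    """Rows a capped history page should show for a recipe with `count` runs."""
    return min(count, cap)


total = sum(COUNTS.values())
shown = {recipe: displayed(n) for recipe, n in COUNTS.items()}

print(len(COUNTS), total)   # 23 308
print(shown["bluesky-pds"], shown["plausible"], shown["custom-html"])  # 8 30 30
```

The 23 recipes sum to 308, which agrees with the number of parseable results.json files, and only plausible and custom-html exceed the cap.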

bluesky-pds runs — newest-first BY finished timestamp (the correct order):

run_id                    ref           level  finished
753                       dcf933813df9  5      1781663348
556                       f7b6c8dfb81c  5      1781301301
435                       f7b6c8dfb81c  5      1781192858
427                       f7b6c8dfb81c  5      1781178768
423                       f7b6c8dfb81c  0      1781178063
ab-bluesky-pds-oldmain    b2d86efba3f1  0      1781126338
m2rr-bluesky-pds          b2d86efba3f1  0      1781123524
m2r-bluesky-pds           b2d86efba3f1  0      1781121610

ADVERSARIAL TRAP TO CHECK: run ids are MIXED numeric (753, 556, …) AND named (m2rr-bluesky-pds, ab-bluesky-pds-oldmain). Sorting by int(run_id) would crash on the named runs; sorting lexically would misorder numeric ids of different lengths and bunch the named runs together instead of placing them by time. Only a finished-timestamp sort yields the correct newest-first order. I will verify that the deployed page matches the timestamp order above, that 423 (finished 1781178063) sorts BELOW 427 (finished 1781178768) even though the two ids are numerically close, and that the named runs land in their timestamp positions, not bunched at top/bottom.
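The trap can be reproduced offline with the eight bluesky-pds rows from the table above. This is an illustrative sketch: the helper names are hypothetical and the input order is deliberately scrambled so that no sort gets the right answer by accident.

```python
# (run_id, finished) pairs from the bluesky-pds table, in scrambled order.
RUNS = [
    ("427", 1781178768),
    ("m2r-bluesky-pds", 1781121610),
    ("753", 1781663348),
    ("ab-bluesky-pds-oldmain", 1781126338),
    ("423", 1781178063),
    ("556", 1781301301),
    ("m2rr-bluesky-pds", 1781123524),
    ("435", 1781192858),
]


def ids_by_finished(runs):
    """Newest-first by finished timestamp (the intended order)."""
    return [rid for rid, _ in sorted(runs, key=lambda r: r[1], reverse=True)]


def ids_by_lexical(runs):
    """Descending string sort on run_id: bunches the named runs together."""
    return [rid for rid, _ in sorted(runs, key=lambda r: r[0], reverse=True)]


def ids_by_int(runs):
    """Descending int(run_id) sort: raises ValueError on named runs."""
    return [rid for rid, _ in sorted(runs, key=lambda r: int(r[0]), reverse=True)]


print(ids_by_finished(RUNS))
print(ids_by_lexical(RUNS))
try:
    ids_by_int(RUNS)
except ValueError as exc:
    print("int sort failed:", exc)
```

With this data the lexical sort places all three named runs at the top, ahead of run 753, while the timestamp sort places them last because they are the three oldest runs.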

Current (buggy) code (dashboard/dashboard.py): history_for(recipe) returns [_build_row(b) for b in _custom_recipe_builds() …]; _custom_recipe_builds fetches a single Drone page …/builds?per_page=100. So history is capped at whatever recipe runs fall in the latest-100 Drone window → most recipes show 1 row. Confirmed root cause matches plan §1.

Things I will break-test on the fix:

Count + order per recipe match the host artifacts (esp. bluesky-pds 8, timestamp order above).
The 124 unparseable dirs don't 500 and don't appear as garbage rows.
Path-traversal guard + /recipe/<name> validation preserved (try /recipe/../.., /recipe/foo%2f.., arg injection in recipe name).
Overview (/), /badge/<recipe>.svg, /runs/<id>/<file> unchanged.
stdlib-only (no new imports/deps); mount stays read-only.
Display cap actually bounds (plausible/custom-html show cap, not 33) AND newest are kept (not oldest) when capped.
Run links resolve — for named run ids too (no Drone build number for m2r*/ab-*).
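The path-traversal probes in the list above can be expressed as a small harness. The pattern and helpers below are hypothetical stand-ins written for illustration; the dashboard's actual validation regex and realpath guard are not reproduced here and may differ.

```python
import os
import re
from urllib.parse import unquote

# Hypothetical allow-list: must start with an alphanumeric, then
# alphanumerics, dot, underscore or hyphen. Not the dashboard's real pattern.
NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def is_valid_name(raw):
    """Validate a recipe name or run id after percent-decoding it once."""
    return NAME_RE.fullmatch(unquote(raw)) is not None


def safe_run_path(root, run_id, filename):
    """Resolve root/run_id/filename, or return None if it escapes the run dir."""
    if not is_valid_name(run_id):
        return None
    base = os.path.realpath(os.path.join(root, run_id))
    candidate = os.path.realpath(os.path.join(base, filename))
    # realpath collapses ".." and symlinks; the result must stay under base.
    if os.path.commonpath([base, candidate]) != base:
        return None
    return candidate


PROBES = ["../..", "foo/..", "foo%2f..", "a b", "x;rm", "..%2f", "", ".", "<script>"]

for probe in PROBES:
    print(repr(probe), "->", "accepted" if is_valid_name(probe) else "rejected")
print(repr("bluesky-pds"), "->", "accepted" if is_valid_name("bluesky-pds") else "rejected")
```

Using fullmatch rather than match with a trailing `$` avoids accepting a name followed by a newline, and checking containment against the run directory (not only the root) also rejects a filename of `..`.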

Verdicts

M1: PASS @2026-06-17T16:30Z (claim `3595e80`, cold-verified)

history_for rewritten to source per-recipe history from local /var/lib/cc-ci-runs artifacts (_local_history scans dirs → _results_for → groups by recipe → sorts newest-first by finished, caps at HISTORY_CAP=30). All checks done COLD from my own fixture (tarred the 308 real results.json off the host), against my own pre-claim baseline — not the Builder's word:

Count + order match host exactly. history_for("bluesky-pds") → 8 rows in order ['753','556','435','427','423','ab-bluesky-pds-oldmain','m2rr-bluesky-pds','m2r-bluesky-pds'] — IDENTICAL to my independent timestamp-derived baseline. The mixed numeric+named id trap is handled correctly: sort key is (finished, _numeric_id) reverse; _numeric_id returns -1 for named ids (no int() crash); 423 (older) sorts below 427 though numerically close; named runs land in their timestamp positions, not bunched. Total parseable grouped rows 308, 23 recipes — match.
Display cap bounds AND keeps newest. plausible 33→30, custom-html 33→30; verified min(finished in capped) >= max(finished dropped) (oldest 3 dropped, not newest).
Malformed/empty dirs skipped, no 500. Injected EMPTYDIR / dir-with-junit-no-json / malformed-json dir into fixture → total stayed 308, no exception, none appear as rows (_results_for returns {} on miss/malformed; _local_history skips no-recipe rows).
Security preserved. _RUN_ID_RE rejects ../.., foo/.., a b, x;rm, ..%2f, the empty string, ., foo;, <script>; accepts bluesky-pds. _results_for("../../etc/passwd") → {} (realpath guard intact). Unchanged from before.
No regression to other routes. latest_per_recipe / _custom_recipe_builds (overview + badge source) untouched; only the history page changed source. Row-key parity: _local_history_row emits the IDENTICAL 10 keys as _build_row, so render_history is unchanged.
stdlib-only. Imports unchanged: html, json, os, re, sys, time, urllib, http.server. No new deps.
Renders. render_history("bluesky-pds", …) → 5384 bytes, 8 data rows; numeric ids link to Drone build, named ids link to /runs/<id>/summary.html — all four checked artifacts exist on host.
Unit suite: 13 passed (incl. new test_history_sourced_from_local_artifacts).
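The group, sort, and cap behaviour verified above, including the "newest are kept" invariant, can be sketched as follows. This is an illustration under stated assumptions, not the code in claim 3595e80: the function names are hypothetical, only the sort key (finished, numeric id, reversed), the -1 fallback for named ids, and HISTORY_CAP=30 come from the verdict, and the rows are synthetic.

```python
from collections import defaultdict

HISTORY_CAP = 30  # value stated in the verdict above


def numeric_id(run_id):
    """Return int(run_id) for numeric ids, -1 for named ids (never raises)."""
    try:
        return int(run_id)
    except (TypeError, ValueError):
        return -1


def local_history(rows, cap=HISTORY_CAP):
    """Group rows by recipe, sort newest-first, keep at most `cap` per recipe."""
    grouped = defaultdict(list)
    for row in rows:
        recipe = row.get("recipe")
        if not recipe:
            continue  # rows without a recipe never become history rows
        grouped[recipe].append(row)

    def sort_key(row):
        return (row.get("finished") or 0, numeric_id(row.get("run_id")))

    return {
        recipe: sorted(items, key=sort_key, reverse=True)[:cap]
        for recipe, items in grouped.items()
    }


def cap_keeps_newest(all_rows, kept_rows):
    """True if every kept row is at least as new as every dropped row."""
    kept_ids = {id(r) for r in kept_rows}
    dropped = [r for r in all_rows if id(r) not in kept_ids]
    if not dropped or not kept_rows:
        return True
    return min(r["finished"] for r in kept_rows) >= max(r["finished"] for r in dropped)


# Synthetic data: 33 runs of one recipe plus one row with no recipe.
rows = [{"recipe": "demo", "run_id": str(i), "finished": 1000.0 + i} for i in range(33)]
rows.append({"run_id": "orphan", "finished": 5000.0})

history = local_history(rows)
demo_rows = [r for r in rows if r.get("recipe") == "demo"]
print(len(history["demo"]), cap_keeps_newest(demo_rows, history["demo"]))  # 30 True
```

The numeric id only acts as a tie-breaker when two runs share a finished timestamp, so named runs are still positioned by time rather than pushed to one end of the list.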

No defects. M1 verified. (Consulted JOURNAL-dash.md only AFTER writing this verdict — no new concerns.) M2 (deploy + live verify) not yet claimed.
