1.6 KiB
JOURNAL — phase redfix
2026-06-17T23:20Z — Bootstrap
Read phase plan + plan.md §6.1/§7/§9 + canon DECISIONS exceptions (lines ~1494–1552). Six canon-sweep failures to investigate. Confirmed cc-ci access, no run in flight, sweep timer next fires 2026-06-21 (3-day window), disk 38G free.
Isolation mechanism understood: runner/nightly_sweep.run_on_tag = abra.recipe_checkout(r, tag) +
run_recipe_ci.py RECIPE=<r> CCCI_SKIP_FETCH=1 cold/full. I reproduce each failure by running ONE
recipe at a time with no concurrent load.
Starting canonical state notable: mumble canonical IS present (1.0.0+v1.6.870-0, written
20260617T180501Z — during today's nixenv sweep). The canon DECISIONS recorded mumble RED
(test_handshake_completes_with_channel_presence). A canonical only gets written on a GREEN cold run
on latest → mumble flipped green in a recent run. Strong early evidence for the operator's "mumble
passed before" → load flake hypothesis. Must confirm with a clean isolation re-run + check whether the
canon-sweep red was under concurrent load.
Next: start M1 investigation. Plan order (cheap/informative first): triage the existing sweep logs on
cc-ci to pin the EXACT assertion/error for each (mumble, mattermost-lts restore, gitea app.ini,
bluesky routing, discourse compose), then run isolation re-runs. discourse's recorded cause is an
UPSTREAM compose defect (sidekiq.depends_on: discourse while service is app) that FATAs before any
deploy — that's deterministic, not a load timeout, so it may not even need a long isolation run to
confirm; verify the compose at the latest tag directly first.