# JOURNAL — phase `redfix`

## 2026-06-17T23:20Z — Bootstrap

Read phase plan + plan.md §6.1/§7/§9 + canon DECISIONS exceptions (lines ~1494–1552). Six
canon-sweep failures to investigate. Confirmed cc-ci access, no run in flight, sweep timer next
fires 2026-06-21 (3-day window), disk 38G free.

Isolation mechanism understood: `runner/nightly_sweep.run_on_tag` = `abra.recipe_checkout(r, tag)` +
`run_recipe_ci.py RECIPE=<r> CCCI_SKIP_FETCH=1` cold/full. I reproduce each failure by running ONE
recipe at a time with no concurrent load.

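A minimal sketch of the one-at-a-time loop described above, not the actual `run_on_tag` code. Assumptions: `RECIPE` and `CCCI_SKIP_FETCH` are passed as environment variables (the journal's notation does not say), the tag checkout has already been done, and `run_ci`, `isolation_env` and `run_in_isolation` are hypothetical helper names.

```python
import os
import subprocess
import sys
from typing import Callable, Dict, List


def isolation_env(recipe: str) -> Dict[str, str]:
    """Environment for one isolated run.

    Assumption: RECIPE and CCCI_SKIP_FETCH are read from the environment.
    """
    env = dict(os.environ)
    env.update({"RECIPE": recipe, "CCCI_SKIP_FETCH": "1"})
    return env


def run_ci(recipe: str) -> int:
    """Hypothetical launcher: blocks until run_recipe_ci.py exits."""
    proc = subprocess.run(
        [sys.executable, "run_recipe_ci.py"],
        env=isolation_env(recipe),
    )
    return proc.returncode


def run_in_isolation(
    recipes: List[str],
    runner: Callable[[str], int] = run_ci,
) -> Dict[str, int]:
    """Run recipes strictly one after another and collect exit codes."""
    results: Dict[str, int] = {}
    for recipe in recipes:
        # The call blocks, so this loop never starts a second run
        # while one is still in flight.
        results[recipe] = runner(recipe)
    return results
```

Passing a stub as `runner` exercises the loop without touching cc-ci. Exit code 0 meaning GREEN is likewise an assumption.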
Starting canonical state is notable: **mumble canonical IS present** (`1.0.0+v1.6.870-0`, written
20260617T180501Z, during today's nixenv sweep). The canon DECISIONS recorded mumble as RED
(`test_handshake_completes_with_channel_presence`). A canonical only gets written on a GREEN cold run
on latest → mumble flipped green in a recent run. Strong early evidence for the operator's "mumble
passed before" load-flake hypothesis. Must confirm with a clean isolation re-run + check whether the
canon-sweep red happened under concurrent load.

Next: start M1 investigation. Plan order (cheap/informative first): triage the existing sweep logs on
cc-ci to pin the EXACT assertion/error for each (mumble, mattermost-lts restore, gitea app.ini,
bluesky routing, discourse compose), then do the isolation re-runs. discourse's recorded cause is an
UPSTREAM compose defect (`sidekiq.depends_on: discourse` while the service is named `app`) that FATAs
before any deploy. That is deterministic, not a load timeout, so it may not even need a long isolation
run to confirm; verify the compose at the latest tag directly first.
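The discourse check can be done statically. The sketch below assumes the compose file has already been parsed into a dict by a YAML loader of choice; `dangling_depends_on` is a hypothetical helper name and the image names are placeholders. It flags every `depends_on` entry that names a service not defined in the file.

```python
from typing import Any, Dict, List, Tuple


def dangling_depends_on(compose: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (service, missing_dependency) pairs for a parsed compose file."""
    services = compose.get("services") or {}
    problems: List[Tuple[str, str]] = []
    for name, spec in services.items():
        deps = (spec or {}).get("depends_on") or []
        # depends_on is a list in the short form and a mapping in the
        # long form; iterating either one yields the service names.
        for dep in deps:
            if dep not in services:
                problems.append((name, dep))
    return problems


# Hypothetical reduction of the recorded defect: the service is named
# `app`, but sidekiq depends on `discourse`.
compose = {
    "services": {
        "app": {"image": "example/app"},
        "sidekiq": {"image": "example/app", "depends_on": ["discourse"]},
    }
}
print(dangling_depends_on(compose))  # [('sidekiq', 'discourse')]
```

An empty result at the latest tag would mean the recorded cause no longer applies and discourse needs a normal isolation run after all.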
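For the log triage step, a rough sketch, assuming the sweep logs contain pytest's short test summary (lines of the form `FAILED <node id> - <error>`). Whether the cc-ci logs actually include that summary is not confirmed here, and `extract_failures` is a hypothetical helper name.

```python
import re
from typing import List, Tuple

# pytest's short test summary prints one line per failure:
#   FAILED path/to/test_file.py::test_name - ErrorType: message
FAILED_LINE = re.compile(r"^FAILED\s+(\S+)(?:\s+-\s+(.*))?$")


def extract_failures(log_text: str) -> List[Tuple[str, str]]:
    """Return (test node id, error summary) for each FAILED summary line."""
    failures: List[Tuple[str, str]] = []
    for line in log_text.splitlines():
        match = FAILED_LINE.match(line.strip())
        if match:
            # The error summary is empty when pytest prints none.
            failures.append((match.group(1), match.group(2) or ""))
    return failures
```

Running this over each recipe's existing sweep log would give the exact failing test and first error line before any re-run is spent on it.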