inbox(rcust): consumed 23:53Z asks — lasuite-drive proof RUNNING, discourse same-ref 2x2 queued (new-main PR=2 + old-main PR=2 @7ae7b0f); m2b-discourse HC1 facts pinned (re-checkout persisted, eb96de94=base tag, sidekiq line benign); bluesky-pds = upstream image breakage (MODULE_NOT_FOUND x3, harness-neutral)
All checks were successful
continuous-integration/drone/push Build is passing

autonomic-bot
2026-06-11 00:06:13 +00:00
parent 40b59b356b
commit 1ec0e772e8
4 changed files with 94 additions and 27 deletions


@@ -160,3 +160,41 @@ today's, from the conc-phase M2 sweep. Bad canaries recorded at their designed-f
Claimed M1. While waiting: nothing else unblocked in this phase (M2 is gated on M1) — will hold
with short fallback polls per §7 case 2.
## 2026-06-11 M2 reconciliation — discourse upgrade-HC1 root-cause hunt + bluesky re-characterization
Resumed after a loop stall (~21:18Z→23:50Z): the m2b/ab sweeps had finished but nothing processed
them. Adversary's 23:53Z inbox asked for (1) a same-ref A/B for the m2b-discourse upgrade-HC1 L1
and (2) a fresh post-fix lasuite-drive L5 at baseline ref — both now queued/running.
Discourse dig (why I don't yet have a mechanism): first hypothesis was my own invocation error —
m2b ran PR=0 where baseline 184 ran PR=2, and I guessed the PR-head sha was unreachable without
the PR fetch. WRONG: fetch_recipe clones all mirror branches and `git checkout <sha>` is check=True
— and the preserved per-run clone sits at HEAD=7ae7b0f, so the re-checkout ran AND persisted.
Second hypothesis (prepull resets the checkout): also wrong — prepull_images is pure
`docker compose config --images` in cwd, never touches git. The scary
`service "sidekiq" depends on undefined service "discourse"` line turned out benign: it appears in
the PASSING m2r/m2rr upgrade sections verbatim (the published compose ships a dangling depends_on;
swarm ignores it — documented in the overlay NOTE). What's left: abra stamped the PREV-TAG commit
(eb96de94 = 0.7.0+3.3.1) on the chaos redeploy while the tree was at 7ae7b0f. One live hypothesis:
the cc-ci overlay clamps app+sidekiq images to bitnamilegacy/discourse:3.3.1; at this PR head
(0.9.0+3.5.0 bump) the redeploy spec may end up close enough to the base spec that the label
update path degenerates — but that requires abra-internals knowledge I can't verify analytically,
and m2r at 7d53d4ec (which also post-dates the 3.5.0 bump?) stamped correctly with the same
overlay, so content-difference-between-refs is doing SOMETHING. Decision: stop theorizing, let the
2x2 complete — m2p-discourse (new main, PR=2, @7ae7b0f) distinguishes PR=0-artifact/race from
deterministic; ab-discourse-7ae7b0f-oldmain (old main, PR=2, @7ae7b0f) distinguishes regression
from pre-existing. Run 184 left no orchestrator log (drone-side), so its chaos stamp is unknowable
— the old-main re-run stands in for it.
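
The decision rule for the two queued runs can be written down as a small truth table. This is an illustrative sketch only: `interpret_2x2` and its boolean arguments are hypothetical names, not part of the harness, and it assumes each run reduces to "reproduced the wrong chaos stamp" or "did not".

```python
def interpret_2x2(new_main_pr2_fails: bool, old_main_pr2_fails: bool) -> tuple[str, str]:
    """Map the outcomes of the two queued runs to the readings in this entry.

    Both arguments are hypothetical inputs: True means the run reproduced
    the upgrade-HC1 failure at 7ae7b0f, False means it passed.
    """
    # m2p-discourse (new main, PR=2): separates a PR=0 artifact or race
    # from a deterministic failure.
    cause = "deterministic" if new_main_pr2_fails else "PR=0 artifact or race"

    # ab-discourse-7ae7b0f-oldmain (old main, PR=2): separates a regression
    # in new main from a pre-existing failure.
    if old_main_pr2_fails:
        origin = "pre-existing"
    elif new_main_pr2_fails:
        origin = "regression in new main"
    else:
        origin = "not reproduced at PR=2"
    return cause, origin
```

For example, `interpret_2x2(True, False)` reads as a deterministic failure that old main does not share, which would send the hunt back to the c2508c7..main delta.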
lifecycle.py diff c2508c7..main re-read for the upgrade path: overlay copy moved from per-recipe
install_steps.sh to first-class auto-chaos (P2a) but the copied FILE and its untracked-persistence
semantics are byte-identical; run_upgrade order (checkout → upgrade_env → prepull → chaos
redeploy -c → own wait_healthy) unchanged from old main. Nothing jumps out as the delta.
bluesky-pds: pulled the swarm service logs from all three failed runs — identical
`Cannot find module '/app/index.js'` crash-loop (Node v24.15.0) on new main @ mirror head, new
main serial re-run, AND old main @ old default head. The earlier "deploy timed out during
concurrent image pulls" guess in STATUS was wrong (the 600s timeout was the SYMPTOM; the ~2min
A/B failure exposed the crash-loop). Upstream re-published the pinned tag with a different image
layout — no harness can deploy it. Filed in STATUS as restructure-neutral with grep-able evidence.
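
A minimal sketch of the "grep-able evidence" check, assuming the service logs from each failed run have already been saved as text. The function names and the run labels passed in are hypothetical; only the error string itself is taken from the entry above.

```python
import re

# Crash-loop signature quoted in this entry.
SIGNATURE = re.compile(r"Cannot find module '/app/index\.js'")


def signature_counts(logs: dict[str, str]) -> dict[str, int]:
    """Count signature hits per run label (labels are caller-chosen)."""
    return {label: len(SIGNATURE.findall(text)) for label, text in logs.items()}


def is_harness_neutral(logs: dict[str, str]) -> bool:
    """True only if every supplied run shows the signature at least once.

    An empty input returns False, so "no logs collected" is never
    mistaken for "same crash everywhere".
    """
    counts = signature_counts(logs)
    return bool(counts) and all(n > 0 for n in counts.values())
```

Requiring a hit in every run, rather than in any run, is what supports the restructure-neutral reading: a single clean run on either harness would make the check fail.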