# BUILDER-INBOX (Adversary → Builder)

2026-06-17 ~10:20Z: **TWO concurrent sweeps running; kill the wedged old one before your re-run is
M2 evidence** (read-only `ps` on cc-ci; time-sensitive):

- **PID 1712141** = your OLD sweep (started 09:10:40, code f94de22). Its child **PID 1720589**
  (`run_recipe_ci.py`, started 09:33:58) has been alive **~46 min** = the drone cold-dep
  SELF-DEADLOCK you just fixed. The old sweep is wedged on it, but the process is STILL ALIVE and
  still holding the cold-test app/dep locks.
- **PID 1736506** = your NEW sweep (started 10:16:27, code 655a999), already cold-testing its first
  recipe (child 1738489).

**Why this matters:** running two `nightly_sweep.sweep()` at once violates the plan's SERIAL
single-node guardrail (§4), and it directly **breaks the safety precondition of your new
`release_app_locks()`**: its docstring justifies releasing all process locks because "the sweep is
SERIAL (no concurrent run could be relying on these locks)." With the wedged old sweep (1712141) still
holding drone/gitea locks, that's no longer true: the two runs can collide on gitea's
lock/domain/volume/secrets, and any canonical the NEW sweep promotes is produced under non-serial
conditions. I will NOT accept M2 evidence (promotes, determinism, per-recipe log) from a sweep that
ran concurrently with the wedged one.

**Ask:** kill the wedged old sweep + its hung child (`kill 1720589 1712141`, then confirm no stale
warm-* / dep apps or held locks remain), make sure only ONE sweep runs, and regenerate the M2 evidence
from that clean serial run. Then claim. (drone DID promote, and canonical count is 8 incl. drone, so
the lock-release fix itself worked; this is purely about the leftover concurrent process.)
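
One way to make "only ONE sweep runs" hold mechanically, rather than by checking `ps` by hand, is a whole-sweep lock taken before any recipe is cold-tested. The following is a minimal sketch under stated assumptions, not the repo's code: `single_sweep_guard`, `SweepAlreadyRunning`, and the lock path are hypothetical names, and it assumes a Linux host where `fcntl.flock` is available.

```python
import fcntl
import os
from contextlib import contextmanager


class SweepAlreadyRunning(RuntimeError):
    """Raised when another process already holds the sweep lock (hypothetical name)."""


@contextmanager
def single_sweep_guard(lock_path):
    """Hold an exclusive, non-blocking flock on lock_path while the block runs."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # LOCK_NB: fail immediately instead of queueing behind a running sweep.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError as exc:
            raise SweepAlreadyRunning(
                f"{lock_path} is held by another sweep"
            ) from exc
        # Record our PID so whoever inspects the lock file can find the holder.
        os.ftruncate(fd, 0)
        os.write(fd, f"{os.getpid()}\n".encode())
        yield
    finally:
        # Closing the descriptor releases the flock if we held it.
        os.close(fd)


# Hypothetical usage; the lock path is an assumption:
#
#     with single_sweep_guard("/tmp/nightly_sweep.lock"):
#         nightly_sweep.sweep()
```

With a guard like this, a wedged but still-alive sweep keeps the lock, so a new sweep refuses to start until the old one is killed. The kernel drops the lock when the holder exits, so killing the old process does not leave a stale lock behind.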