probe(dstamp): Adversary independent probe findings — Docker rollback root cause confirmed, fix 0cc31a5 assessed CORRECT, race-window concern flagged (covered by defence-in-depth). Anti-anchoring preserved: JOURNAL not read. Awaiting claim(dstamp) for formal verdict.
@@ -24,3 +24,30 @@
## Adversary findings
<!-- Adversary-owned. Do not edit above this line in this section. -->

**Root cause independently confirmed @2026-06-11T17:3x (JOURNAL not read, anti-anchoring preserved):**

Docker Swarm `failure_action: rollback` + `order: start-first` is set on the app service in
discourse's `compose.yml` (in BOTH the `eb96de94` base AND the `7ae7b0f` PR head). On the upgrade
chaos redeploy, `start-first` runs the OLD and NEW tasks co-resident (~2× memory); under host
memory pressure the heavy Rails/precompile app fails swarm's 5s update monitor → rollback fires →
the app service spec reverts to PreviousSpec (`chaos-version=eb96de94+U`). Because `start-first`
kept the OLD task serving, `wait_healthy` passed; `deployed_identity` then read the rolled-back
spec; and HC1 misreported the result as "stamp mismatch" (the real failure was "new task failed
the update monitor").

`services_converged` blind spot: `"rollback_completed"` is not among its blocking states, so it
returned True.

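The blind spot can be illustrated with a minimal sketch. The state strings are the `UpdateStatus.State` values Docker Swarm reports; the two sets and both function names are hypothetical stand-ins, and the contents of the pre-fix blocking set are an assumption, not the harness's actual code.

```python
# Hypothetical sketch of the blind spot; NOT the harness's actual code.
# State strings are Docker Swarm UpdateStatus.State values.

# ASSUMPTION: the pre-fix check only blocked on in-flight states.
BLOCKING_BEFORE = {"updating", "rollback_started"}

# States that mean the upgrade did not land as requested.
FAILED_UPGRADE = {"rollback_completed", "rollback_paused", "paused"}


def converged_before(state):
    """Blind spot: any state outside the blocking set counts as converged."""
    return state not in BLOCKING_BEFORE


def converged_after(state):
    """Fail loudly on rollback/pause states, otherwise apply the old rule."""
    if state in FAILED_UPGRADE:
        raise RuntimeError(f"upgrade did not converge: update state is {state!r}")
    return state not in BLOCKING_BEFORE
```

Under these assumptions, `converged_before("rollback_completed")` returns True, which is how a rolled-back service can look converged, while `converged_after` raises on the same input.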
Evidence: `docker service inspect disc-ae10f0_..._app` confirmed `UpdateConfig: {On failure:
rollback, Order: start-first, Monitoring Period: 5s}`. repro1 (isolated, no concurrency) ALSO
showed drift → pure-concurrency hypothesis REFUTED independently, before reading Builder evidence.

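The same fields can be read mechanically from the JSON form of `docker service inspect` (the evidence above quotes the pretty-printed form). A minimal sketch, assuming the standard Swarm service JSON layout (`Spec.UpdateConfig`, `UpdateStatus`, `PreviousSpec`); the function name and the returned keys are hypothetical:

```python
import json


def summarize_update(inspect_json):
    """Summarize update policy and status from `docker service inspect` JSON.

    Hypothetical helper: takes the raw JSON text so it can be exercised
    without a running swarm.
    """
    services = json.loads(inspect_json)
    svc = services[0]  # inspect prints a JSON array, one entry per service
    cfg = svc.get("Spec", {}).get("UpdateConfig", {})
    status = svc.get("UpdateStatus", {})
    return {
        "failure_action": cfg.get("FailureAction"),
        "order": cfg.get("Order"),
        # Monitor is a duration in nanoseconds in the service spec.
        "monitor_seconds": cfg.get("Monitor", 0) / 1e9,
        "update_state": status.get("State"),
        # PreviousSpec is what a rollback reverts the service to.
        "has_previous_spec": "PreviousSpec" in svc,
    }
```

Feeding it the output of `docker service inspect <service>` for the affected app service would be expected to show `rollback`, `start-first`, and a 5-second monitor, matching the evidence above.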
abra exonerated: abra reads `git HEAD = 7ae7b0f` and stamps `7ae7b0f7+U` CORRECTLY. Three
bail-at-secrets repros plus the repro2 debug line confirm this. The `+U` comes from
`compose.ccci.yml` being an untracked file in the per-run recipe dir (an rcust-era overlay,
absent from run 184's pre-rcust path).

Fix 0cc31a5 assessed CORRECT: the overlay sets `order: stop-first` (eliminating the 2×-memory OOM
trigger); `lifecycle.assert_upgrade_converged` closes the `wait_healthy` blind spot by catching
`"rollback_completed"|"rollback_paused"|"paused"` and failing HONESTLY. HC1 unchanged.
The minor race window in `assert_upgrade_converged` (the first poll could see "none" before Docker
starts the roll) is covered: with stop-first, a post-race rollback also fails `wait_healthy`.
No blocker. Formal verdict awaits Builder's `claim(dstamp)` commit.

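For illustration only, the race window can be pictured as a polling loop that sees no update state on its first poll. The sketch below is NOT the code in `lifecycle.assert_upgrade_converged`: the function name, the `get_state` callable, and the `grace` period are hypothetical, and the grace period is one possible way to narrow the window, whereas the fix as assessed relies on `wait_healthy` as the backstop.

```python
import time

# States treated as a failed upgrade (taken from the findings above).
FAILED = {"rollback_completed", "rollback_paused", "paused"}


def wait_for_update(get_state, timeout=120.0, interval=1.0, grace=10.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll an update state until it completes, fails, or times out.

    get_state is a hypothetical callable returning the current update
    state string, or None / "none" when no update has been recorded yet.
    """
    start = clock()
    seen_update = False
    while True:
        state = get_state()
        if state == "none":
            state = None  # ASSUMPTION: "none" means no update status yet
        if state in FAILED:
            raise RuntimeError(f"upgrade rolled back or paused: {state!r}")
        if state == "completed":
            return state
        if state is not None:
            seen_update = True
        elapsed = clock() - start
        # Race guard: do not accept "no update" on the first poll; only
        # conclude that nothing was rolled out once the grace period passes.
        if state is None and not seen_update and elapsed >= grace:
            return None
        if elapsed >= timeout:
            raise TimeoutError(f"update still in state {state!r} after {timeout}s")
        sleep(interval)
```

A caller would pass something like `lambda: read_update_state(service)` (also hypothetical) as `get_state`; injecting `clock` and `sleep` keeps the loop testable without real waiting.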