review(2w): watchdog [C1] ping — no formal gate yet; read-only pre-review (reconciler clean, alerts-dir flag) + inbox heads-up to coordinate live reproduce
machine-docs/BUILDER-INBOX.md
@@ -0,0 +1,23 @@
# Builder inbox — from Adversary (@2026-05-29)

**Re: watchdog pinged me on a [C1] claim, but I don't see a formal gate.**

Your `STATUS-2w.md ## Gate` reads "(none claimed yet)", and your STATUS still lists W0.7 + W0.8 as
remaining (with the lasuite-docs nginx-upstream `--chaos` race blocking the WC1 dependent-green
proof). So I'm treating WC1/WC1.1/WC1.2 as **NOT yet formally claimed** and have NOT logged a verdict.
The ping likely fired on the "reconciler-side WC1/WC1.1/WC1.2 proven" wording in 819c1bc.

**What I did (read-only, no live churn):** pre-reviewed `runner/warm_reconcile.py` (no defects — WC1.2
ordering/conservatism + WC1.1 deploy-fail-and-unhealthy rollback both look correct) and inspected live
state (warm-keycloak active, last_good=10.7.1+26.6.2 = recovered canonical). Logged in REVIEW-2w.

**Coordination:** I deliberately did NOT run the live marquee reproduce yet — it churns the warm
keycloak (undeploy/snapshot/deploy ×several) and would collide if you're driving keycloak for W0.8.
**When you formally claim WC1, set the `## Gate` line and I'll run the full cold reproduce then.**

**One flag to check on your side:** `/var/lib/ci-warm/alerts/` is currently EMPTY, but W0.9 claims a
rollback alert was written there and the alert-relay archiving (alerts/seen/) is deferred/unwired —
so a written alert should still be present. Probably you cleaned up the W0.9 test alert; just
confirming nothing silently dropped it. I'll verify an alert actually lands during my reproduce.

— Adversary
@@ -87,3 +87,37 @@ leftover phase-2 cold app `lasu-0a6fb2` is **fully gone**: `abra app ls -S -m` s
secrets. Disk `/` at **63% (9.8G free / 28G)** — consistent with the Builder's claimed 96%→62%
reclaim. Cold-teardown-sacred holds for this orphan; disk budget healthy. Will fold into the WC8
verdict when that gate is claimed. Still no WC gate CLAIMED; W0 → next is W0.9 WC1.1 live proofs.

## @2026-05-29 — Watchdog pinged [C1]; NO formal gate claim yet — read-only pre-review (NOT a verdict)

Watchdog signalled a [C1] claim, but `STATUS-2w.md ## Gate` reads "(none claimed yet)" and the
Builder's own STATUS lists **W0.7 + W0.8 as remaining** before claiming WC1/WC1.1/WC1.2, with a build
finding (lasuite-docs in-place `--chaos` redeploy nginx `host not found in upstream ...backend:8000`
race) currently **blocking the WC1 dependent-green proof**. Per §6.1 there is NO formal gate to pass
yet — ping likely fired on the "reconciler-side WC1/WC1.1/WC1.2 proven" wording in 819c1bc. I will
NOT log a WC1/WC1.1/WC1.2 PASS until the gate is formally CLAIMED and I run the marquee reproduce cold.

**Read-only pre-review done now (no live churn — avoids colliding with the Builder's W0.8 keycloak work):**
- Live state consistent with the W0.9 narrative: `warm-keycloak.service` active; live image
  `keycloak/keycloak:26.6.2` + `mariadb:12.2`; `/var/lib/ci-warm/keycloak/last_good = 10.7.1+26.6.2`
  (the recovered canonical — correctly NOT advanced to the simulated-broken 10.7.10).
- Static review of `runner/warm_reconcile.py` — no defects:
  - WC1.2 safety gate runs BEFORE any snapshot/deploy (L335-343); a hold returns with NO
    snapshot/deploy/rollback churn; both `held-major` + `held-manual-migration` alerts carry `release_notes`.
  - `is_major_bump` is conservative: holds on a major bump of EITHER the recipe-semver (pre-`+`) OR
    the app-version (post-`+`), so a keycloak app-major (25->26, the DB-migration case) is also held.
    Neutralizes a tag-format wording mismatch (plan §WC1.2 says `<upstream>+<recipe-semver>`; code's
    observed data says `<recipe-semver>+<app-version>`) — checking both sides covers intent either way.
    Not a defect; noted so I don't re-flag it.
  - WC1.1 rolls back on BOTH a deploy exception AND an unhealthy result (L356-362); stateful path
    restores the snapshot before redeploying the prior version; raises if the rollback itself is
    unhealthy. Alert `rollback` carries last_good/attempted/recovered/notes.
- **OPEN FLAG to confirm at the live reproduce:** `/var/lib/ci-warm/alerts/` is currently EMPTY,
  though W0.9 claims a rollback alert was written there and the alert-relay archiving to `alerts/seen/`
  is explicitly deferred/unwired. Likely benign (Builder cleaned up the W0.9 test alert), but I MUST
  confirm a `*rollback*.json` alert actually lands during my own cold reproduce (no silent no-alert).
- **PLAN for the formal gate:** when WC1 is CLAIMED, run the Builder's reproduce (STATUS L79-83):
  fake tags `10.7.9+26.6.2`(good) + `10.7.10+26.6.2`(broken KC_HOSTNAME), `CCCI_SKIP_FETCH=1
  cc-ci-run runner/warm_reconcile.py keycloak` x2 → expect `upgraded:` then `rolled-back:`, marker
  realm survives, last_good unchanged at prior, a `*rollback*.json` alert; PLUS the WC1 headline
  (dependent SSO custom test green vs warm keycloak + concurrent distinct realms + reaping) + a
  major/manual-migration WC1.2 hold proof. Sent a BUILDER-INBOX heads-up to coordinate keycloak timing.
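
Illustration only (not the code in `runner/warm_reconcile.py`): a minimal sketch of the "hold on a major bump of EITHER side" rule noted in the static review above, assuming tags of the observed form `<recipe-semver>+<app-version>`. The names `split_tag`, `_major` and `is_major_bump_sketch` are hypothetical, and treating only an increase of the leading integer as a "bump" is an assumption.

```python
def _major(version: str) -> int:
    """Leading integer of a dotted version string, e.g. '10.7.1' -> 10."""
    return int(version.split(".", 1)[0])


def split_tag(tag: str) -> tuple[str, str]:
    """Split '<recipe-semver>+<app-version>' into its two halves (assumed format)."""
    recipe, sep, app = tag.partition("+")
    if not sep or not recipe or not app:
        raise ValueError(f"tag is not of the form <recipe>+<app>: {tag!r}")
    return recipe, app


def is_major_bump_sketch(current: str, candidate: str) -> bool:
    """True if EITHER the pre-'+' or the post-'+' major component increases."""
    cur_recipe, cur_app = split_tag(current)
    new_recipe, new_app = split_tag(candidate)
    return (
        _major(new_recipe) > _major(cur_recipe)
        or _major(new_app) > _major(cur_app)
    )
```

Under these assumptions, `10.7.1+26.6.2` to `10.7.10+26.6.2` is not held (both majors unchanged), while an app-side 25 to 26 step is held even when the recipe side only moves by a patch. Checking both halves is what makes the result independent of which side of the `+` the plan wording assigns to the upstream version.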
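
For the OPEN FLAG above, one possible way to make "an alert actually lands" checkable during the reproduce: snapshot the `*rollback*.json` names before the run and diff afterwards. This is a sketch under assumptions; the helper names are hypothetical, and only the directory path and the glob pattern come from the notes above.

```python
from pathlib import Path

# Path as quoted in the review notes; everything else here is illustrative.
ALERTS_DIR = Path("/var/lib/ci-warm/alerts")


def rollback_alerts(alerts_dir: Path = ALERTS_DIR) -> set[str]:
    """Names of rollback alert files currently present (empty if the dir is missing)."""
    if not alerts_dir.is_dir():
        return set()
    return {p.name for p in alerts_dir.glob("*rollback*.json")}


def new_rollback_alerts(before: set[str], alerts_dir: Path = ALERTS_DIR) -> set[str]:
    """Rollback alerts that appeared since the `before` snapshot was taken."""
    return rollback_alerts(alerts_dir) - before
```

Taking the snapshot first means a stale alert left over from an earlier test cannot be mistaken for one written by the current run; an empty result after a `rolled-back:` outcome would be the silent no-alert case.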