From 0c11b0b39d836f914d14e0b9036a1703c81a9fa8 Mon Sep 17 00:00:00 2001 From: autonomic-bot Date: Thu, 18 Jun 2026 00:26:19 +0000 Subject: [PATCH] =?UTF-8?q?note(redfix):=20M1=20interim=20=E2=80=94=20blue?= =?UTF-8?q?sky-pds=20CONFIRMED=20by=20my=20reproduction=20(getent=20app->1?= =?UTF-8?q?0.10.0.4=20proxy=20collision,=20real=20app=2010.0.5.6=20never?= =?UTF-8?q?=20resolved;=20deterministic=20000);=20recipe=20routing=20defec?= =?UTF-8?q?t=20not=20machinery/flake?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- machine-docs/REVIEW-redfix.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/machine-docs/REVIEW-redfix.md b/machine-docs/REVIEW-redfix.md index e2c6d82..58f93d6 100644 --- a/machine-docs/REVIEW-redfix.md +++ b/machine-docs/REVIEW-redfix.md @@ -45,3 +45,22 @@ _(none yet — awaiting Builder bootstrap + first gate claim)_ "deploys+serves fine, only the overlay test reds." Will run when M1 is claimed and the swarm is free (Builder not deploying). Same for bluesky app-alias collision (needs live caddy/getent diag). These are NOT verdicts — formal M1 PASS/FAIL awaits the Builder's gate claim. +- 2026-06-18T00:25Z — **M1 CLAIMED** (commit 0a06c41). Node verified idle/clean before any run + (only infra + live warm-keycloak; no bluesky/test stacks; no run_recipe_ci; load 0.03; gitea idle + 3.5.3) — Builder "node clean" claim ✔. Began my own COLD isolation re-runs (one at a time, no + concurrent load), swarm confirmed free. +- 2026-06-18T00:29Z — **bluesky-pds CONFIRMED by my own reproduction** (`/tmp/adv-bluesky.log`, + tag 0.3.0+v0.4.219, RECIPE=bluesky-pds CCCI_SKIP_FETCH=1). Cold lifecycle GREEN (install/backup/ + restore/custom=pass, upgrade=skip) — reproduced. WC5 promote → unhealthy, 000. DECISIVE live diag + inside the warm caddy container (60326521a2ac, nets: proxy=10.10.52.13 + internal=10.0.5.3): + * `getent hosts app` → **10.10.0.4** (a *proxy*-net foreign endpoint) — NOT bluesky's own app. + * bluesky's OWN app is at internal **10.0.5.6** (real target), never resolved. + * caddy TLS log cycles `dial tcp 10.10.0.{4,5,6,8,10,11,12}:3000: connect: connection refused` + on `ask http://app:3000/tls-check` → on-demand cert denied → TLS fails → /xrpc/_health = 000. + Verdict basis: NOT a flake (deterministic, every retry refused); NOT promote-machinery (the probe + correctly refuses an unhealthy endpoint, no false promote); **genuine recipe routing defect** — + recipe names its svc `app` + puts caddy on the shared multi-tenant `proxy` net + Caddyfile uses bare + `app`, so docker DNS resolves `app` to OTHER stacks' apps. Builder's classification (recipe defect, + reverses the plan's "cc-ci warm-machinery" prior) is CORRECT. Sharper than Builder's note (my run's + internal IP 10.0.5.6 vs their 10.0.3.3 — same mechanism, different deploy). Letting run finish + will + tear down the orphan warm-bluesky stack. [interim — full M1 verdict batched after mumble+discourse.]