diff --git a/machine-docs/REVIEW-redfix.md b/machine-docs/REVIEW-redfix.md index 57c5f46..bf7ba99 100644 --- a/machine-docs/REVIEW-redfix.md +++ b/machine-docs/REVIEW-redfix.md @@ -227,3 +227,27 @@ test-disabling. * **Node restored**: undeploy succeeded, app.ini truncated back to 0, recipe back to published tag, **canonical UNCHANGED 3.5.3 idle e6a1cc79 ts 20260617T083930Z**, stack gone. Builder's gitea fix CORRECT. (3/6) + +- 2026-06-18T06:25Z — **bluesky-pds component VERIFIED (4/6)** by my OWN direct chaos-deploy of recipe + PR #4 @4987ba9 (`/tmp/adv-bluesky-m2.log`). Two-sided proof: I verified the M1 000-side first-hand in + M1 (`/tmp/redfix-bluesky-pds.log` + live diag: WC5 promote 000, caddy `app` -> foreign proxy IP, no + cert). Now the FIX side. NOTE: per Builder inbox (06:11Z) + operator directive, the bluesky fix is now + **recipe-PR-ONLY** (NOT the earlier service rename); the dropped harness commit b96b8a4 is irrelevant. + * **Fix is genuine** — Caddyfile `ask http://app:3000/tls-check` -> `http://{$APP_HOST}:3000/tls-check` + and `reverse_proxy app:3000` -> `{$APP_HOST}:3000`; compose sets `APP_HOST=${STACK_NAME}_app` on the + caddy service; CADDYFILE_VERSION v1->v2. Service stays named `app`. Established coop-cloud pattern. + * **Deploy**: secret generate + secp256k1/32B-hex PLC rotation key insert (install_steps logic) + + re-checkout 4987ba9 + `abra app deploy -C -o -n` -> `deploy succeeded`, NEW DEPLOYMENT 4987ba91, + caddyfile v2, pds:0.4.219. **app 1/1, caddy 1/1.** + * **Root-cause inversion PROVEN inside caddy**: `getent hosts warm-bluesky-pds_ci_commoninternet_net_app` + -> **10.0.5.5** (own-stack INTERNAL) while bare `getent hosts app` -> **10.10.0.12** (FOREIGN proxy + IP — the exact M1 collision). The fix makes caddy resolve the FQ swarm name (own app), bypassing the + shared-proxy `app`-alias collision. + * **External health**: `https://warm-bluesky-pds.ci.commoninternet.net/xrpc/_health` -> **200 + {"version":"0.4.219"}** on 3/3 attempts (**M1 was 000**). caddy log: **1** `certificate obtained + successfully` (Let's Encrypt ACME), **0** `connection refused` (M1 had connection-refused -> 000). + * **Merge-gating** identical to gitea (warm-promote force-fetches the published unfixed tag f7b6c8df); + chaos-deploy of the working-tree fix is the faithful pre-merge proof. NOT a standing exception. + * **Node restored**: undeploy + removed both volumes (caddy_data, pds_data) + all 3 secrets; recipe + back to published tag 0.3.0+v0.4.219; NO bluesky stack/volume/secret/canonical (matches M1). Builder's + bluesky fix CORRECT. (4/6)