fix(routing): rename main service app->pds so caddy resolves THIS stack on shared proxy #4

Open
autonomic-bot wants to merge 1 commit from ci/warm-routing-alias into main

Fixes the warm-domain HTTPS routing failure surfaced by the cc-ci canonical sweep (cold deploy is green, but the stable warm domain returns 000 on /xrpc/_health).

**Root cause:** the caddy sidecar uses on-demand TLS and calls `http://app:3000/tls-check` before issuing a cert. On a multi-tenant host, every co-located stack aliases its main service as `app` on the shared `proxy` overlay. Because caddy is attached to both `proxy` and `internal`, it resolves bare `app` to a FOREIGN stack's endpoint (observed: caddy dialed `proxy` IPs 10.10.0.x belonging to other stacks and got connection refused). The tls-check therefore fails, no cert is issued, and HTTPS is dead.

**Fix:** give the PDS a unique `pds` alias on the `internal` network and point caddy's `reverse_proxy` and `on_demand_tls ask` at `pds:3000`. The `pds` alias exists only on `internal`, so it always resolves to THIS stack's PDS. The service name stays `app` (no downstream breakage).
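
For illustration only, the alias approach described above might look roughly like the compose fragment below. This is a sketch, not the actual diff: the image names, the `external: true` setting, and the exact network layout are assumptions, and the Caddyfile directives appear only as comments because the real Caddyfile is not shown on this page.

```yaml
# Hypothetical fragment; images and network settings are placeholders.
services:
  app:
    image: example/pds:latest        # assumed image name
    networks:
      proxy: {}                      # shared overlay, where other stacks also expose "app"
      internal:
        aliases:
          - pds                      # unique alias, present only on this stack's internal network

  caddy:
    image: example/caddy:latest      # assumed image name
    networks:
      - proxy
      - internal
    # The matching Caddyfile changes would point both lookups at the alias, roughly:
    #   on_demand_tls { ask http://pds:3000/tls-check }
    #   reverse_proxy http://pds:3000

networks:
  internal: {}
  proxy:
    external: true                   # assumption: the shared overlay is created outside this stack
```

Since `pds` is declared only under `internal`, no other stack on the shared `proxy` overlay can answer for that name, which is the property the fix relies on.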

Verified by cc-ci on the warm-canonical deploy path (the cold per-run domain was never affected).

cc @trav @notplants

autonomic-bot added 1 commit 2026-06-18 01:28:57 +00:00
The caddy sidecar uses on-demand TLS and asks http://app:3000/tls-check before issuing a cert.
On a shared host every co-located stack aliases its main service 'app' on the 'proxy' overlay;
caddy (on both proxy+internal) resolves bare 'app' to a FOREIGN stack's endpoint, so the tls-check
connection is refused, no cert is issued, and the PDS is unreachable over HTTPS (xrpc/_health=000).
Give the PDS a unique 'pds' alias on the internal network and point caddy's reverse_proxy +
on_demand_tls ask at it; 'pds' exists only on internal, so it always resolves to this stack's PDS.
Service name stays 'app' (no downstream breakage).
autonomic-bot changed title from fix(routing): unique pds alias so caddy resolves THIS stack on shared proxy to fix(routing): rename main service app->pds so caddy resolves THIS stack on shared proxy 2026-06-18 01:59:18 +00:00
autonomic-bot force-pushed ci/warm-routing-alias from fdbd1e2fee to 11e41b0592 2026-06-18 01:59:18 +00:00 Compare
autonomic-bot force-pushed ci/warm-routing-alias from 11e41b0592 to 4987ba91c7 2026-06-18 05:41:31 +00:00 Compare
This pull request can be merged automatically.
This branch is out-of-date with the base branch
Reference: recipe-maintainers/bluesky-pds#4