fix: clickhouse-backup install must succeed loudly, never silently degrade

Replaces the previous best-effort (|| true) approach: a deploy without
clickhouse-backup would have silently broken backup/restore, so the
entrypoint now hard-fails (visibly, in service logs) if the tool truly
cannot be installed — but makes that case effectively unreachable:

- cache the VERIFIED binary on the persistent clickhouse volume, keyed
  by version: downloaded at most once per app; container restarts never
  re-fetch (kills the re-download amplification that turned a GitHub
  throttle into a permanent crash-loop)
- canonical Altinity release URL (project moved; old path is a redirect)
- bounded retries with backoff + wget read timeout (a stalled connection
  can no longer hang the deploy)
- verify the binary executes before trusting or caching it (catches
  truncated downloads and a corrupt cache)
- compose: fix app depends_on to the real service name
  (plausible_events_db) — docker compose config was failing on it, which
  disabled CI image prepull and pushed pulls into the deploy window
- bump CLICKHOUSE_ENTRYPOINT_VERSION v4 -> v5 (swarm configs immutable)

Verified on a dev deploy: fresh download path, cached-restart path,
clickhouse-backup create/list/delete, and /api/health all green.
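
The caching, retry, and verification steps listed above can be sketched roughly as follows. This is a minimal illustration and not the actual entrypoint: the function names (`retry_with_backoff`, `verify_binary`, `ensure_tool`), the cache file naming, the `FETCH_ATTEMPTS`/`FETCH_DELAY` variables, and the use of `--version` as the "does it execute" probe are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: every name, path and default here is a hypothetical stand-in,
# not taken from the real entrypoint script.

# retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY CMD [ARGS...]
# Runs CMD until it succeeds or MAX_ATTEMPTS is reached, doubling the
# sleep between attempts.
retry_with_backoff() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        "$@" && return 0
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}

# A binary is trusted only if it is executable and actually runs.
# Assumption: the tool answers --version with exit status 0.
verify_binary() {
    [ -x "$1" ] && "$1" --version >/dev/null 2>&1
}

# ensure_tool CACHE_DIR VERSION FETCH_CMD [ARGS...]
# FETCH_CMD is invoked with the destination path appended as its last
# argument. On success the verified binary is left at
# CACHE_DIR/clickhouse-backup-VERSION; on failure a loud error is printed
# and a non-zero status is returned.
ensure_tool() {
    cache_dir=$1
    version=$2
    shift 2
    cached="$cache_dir/clickhouse-backup-$version"

    # Cache hit: a verified binary for this version already exists.
    if verify_binary "$cached"; then
        return 0
    fi

    mkdir -p "$cache_dir"
    rm -f "$cached"              # discard a corrupt cache entry, if any
    tmp="$cached.tmp.$$"

    # Download to a temp file, verify it, and only then move it into place.
    if retry_with_backoff "${FETCH_ATTEMPTS:-3}" "${FETCH_DELAY:-1}" "$@" "$tmp" \
        && chmod +x "$tmp" \
        && verify_binary "$tmp"; then
        mv "$tmp" "$cached"
        return 0
    fi

    rm -f "$tmp"
    echo "FATAL: could not install clickhouse-backup $version" >&2
    return 1
}
```

A caller would supply its own fetch function, for instance one wrapping `wget` with a timeout and writing to the path it receives as `$1`, and would exit non-zero when `ensure_tool` fails so the failure shows up in the service logs. Moving the file into the cache only after verification is what keeps a truncated download from being reused on the next restart.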
autonomic-bot
2026-06-09 19:09:13 +00:00
parent b90a8c4239
commit 9f8bcbc9e3
3 changed files with 36 additions and 26 deletions

@@ -7,7 +7,7 @@ services:
     command: sh -c "sleep 10 && /entrypoint.sh db createdb && /entrypoint.sh db migrate && /entrypoint.sh run"
     depends_on:
       - db
-      - events_db
+      - plausible_events_db
     environment:
       - BASE_URL=https://$DOMAIN
       - SECRET_KEY_BASE