fix(clickhouse): resilient clickhouse-backup fetch (cache/retry/non-blocking) #1

Closed
autonomic-bot wants to merge 1 commit from ci/clickhouse-backup-resilient into main

1 Commit

Author SHA1 Message Date
bd8bd93d2e fix(clickhouse): make clickhouse-backup fetch resilient (cache on persistent volume, retry+backoff, never block server start)
Some checks failed
cc-ci/testme cc-ci: failure
The published entrypoint downloads the 22 MB clickhouse-backup binary from GitHub at container boot,
under 'set -ex', using a single silenced wget with no retry that writes to ephemeral /tmp. Any transient
failure of that download (rate limit or network error) exits the container BEFORE clickhouse-server
starts. Swarm then restarts the container, which downloads again, so the throttling is amplified into a
crash loop that ends in a deploy timeout.

clickhouse-backup is the BACKUP tool (used by the backupbot pre/post hooks) and is not required for
clickhouse-server to run. This hardening caches the binary on the persistent /var/lib/clickhouse volume
(fetched at most once, reused on restart), retries the download with backoff, never blocks server start
on a fetch failure, and un-silences wget so failures can be diagnosed. Behaviour is unchanged when the
first download succeeds.
2026-05-31 05:28:16 +00:00
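
The approach described in the commit message (cache on the persistent volume, retry with backoff, never block server start) could look roughly like the sketch below. This is not the script from this PR. The function names (download, fetch_backup_tool), the variable names, the .cache subdirectory, the attempt count and the delays are all assumptions made for illustration. The release URL is deliberately left unset, and any archive extraction step the real asset might need is omitted.

```shell
#!/bin/sh
# Illustrative sketch only: every name and default below is hypothetical.

CACHE_DIR="${CACHE_DIR:-/var/lib/clickhouse/.cache}"      # assumed subdirectory of the persistent volume
BACKUP_BIN="${BACKUP_BIN:-$CACHE_DIR/clickhouse-backup}"
BACKUP_URL="${BACKUP_URL:-}"                              # intentionally not filled in here
MAX_ATTEMPTS="${MAX_ATTEMPTS:-5}"                         # assumed retry budget
BACKOFF_START="${BACKOFF_START:-2}"                       # assumed first delay, in seconds

# Download URL $1 to file $2. wget is left un-silenced so errors reach the logs.
download() {
    wget -O "$2" "$1"
}

# Returns 0 when the binary is available (cached or freshly fetched), 1 otherwise.
fetch_backup_tool() {
    # Cache hit: a restart reuses the binary and makes no network request.
    if [ -x "$BACKUP_BIN" ]; then
        echo "clickhouse-backup: using cached $BACKUP_BIN" >&2
        return 0
    fi
    if [ -z "$BACKUP_URL" ]; then
        echo "clickhouse-backup: BACKUP_URL is not set" >&2
        return 1
    fi
    mkdir -p "$CACHE_DIR" || return 1

    tmp="$BACKUP_BIN.partial"
    attempt=1
    delay="$BACKOFF_START"
    while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
        # Write to a temporary name and rename on success, so an interrupted
        # download is never mistaken for a valid cached binary.
        if download "$BACKUP_URL" "$tmp" && chmod +x "$tmp" && mv "$tmp" "$BACKUP_BIN"; then
            return 0
        fi
        rm -f "$tmp"
        echo "clickhouse-backup: attempt $attempt/$MAX_ATTEMPTS failed" >&2
        if [ "$attempt" -lt "$MAX_ATTEMPTS" ]; then
            sleep "$delay"
            delay=$((delay * 2))    # exponential backoff between attempts
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Intended call site in an entrypoint (commented out so the sketch has no side effects):
#   fetch_backup_tool || echo "warning: clickhouse-backup unavailable, backups disabled" >&2
#   exec clickhouse-server "$@"
```

Calling the function on the left-hand side of `||` matters if the entrypoint keeps 'set -e': a failed fetch is then handled as a warning instead of terminating the script, so clickhouse-server starts either way.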