feat(db): switch to discourse/postgres image with install-user + checksum adapter

Replace the bitnami-era pgvector:pg17 db + hand-rolled pg_upgrade entrypoint
with discourse/postgres:pg18 (pgvector + discourse's auto-upgrade layer, as
suggested on coop-cloud/discourse#16). The image does the heavy lifting
(installs old binaries, runs pg_upgrade into the versioned PGDATA); a thin
cc-db-entrypoint.sh wrapper fills the three gaps it leaves:

- secrets: inject DB_PASSWORD/POSTGRES_PASSWORD from the docker secret (the
  image reads them from env, no *_FILE support);
- install user: detect the old cluster's bootstrap superuser (oid 10) and
  export POSTGRES_USER so pg_upgrade + the new cluster's initdb match it. Real
  deployments differ (bitnami-origin clusters install as 'postgres' + a
  'discourse' app role; others as 'discourse'). The image hardcodes
  --username=$POSTGRES_USER and never detects this, so the adapter is required;
- checksums: pg18's initdb enables data checksums by default, but the pg13-17
  clusters here have them off and pg_upgrade requires both clusters to match,
  so initdb the new cluster with --no-data-checksums unless the old one
  reports them on.
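
The secrets and checksum steps above could look roughly like the following.
This is a minimal sketch, not the actual cc-db-entrypoint.sh: the function
names are hypothetical, and reading the checksum state from pg_controldata
output is an assumption about how the old cluster is inspected.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not taken from cc-db-entrypoint.sh.

# Export DB_PASSWORD/POSTGRES_PASSWORD from a docker secret file, because the
# image reads both from the environment and has no *_FILE support.
load_db_password() {
  secret_file="${1:-/run/secrets/db_password}"
  [ -r "$secret_file" ] || return 1
  DB_PASSWORD="$(cat "$secret_file")"
  POSTGRES_PASSWORD="$DB_PASSWORD"
  export DB_PASSWORD POSTGRES_PASSWORD
}

# Read pg_controldata-style output on stdin and print the initdb flag the new
# cluster needs so its checksum setting matches the old cluster.
# "Data page checksum version: 0" means checksums are off; a missing line is
# treated the same way (assumption).
checksum_initdb_flag() {
  version="$(sed -n 's/^Data page checksum version:[[:space:]]*//p')"
  if [ "${version:-0}" = "0" ]; then
    echo "--no-data-checksums"
  fi
}
```

A wrapper built this way would call load_db_password first, then pipe the old
cluster's pg_controldata output through checksum_initdb_flag before handing
off to the image's own entrypoint. Install-user detection is left out here
because it depends on how the stopped old cluster is queried.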

Other changes:
- mount postgresql_data at /var/lib/postgresql (versioned PGDATA .../18/docker)
- pg_backup.sh: detect the superuser at runtime; fix paths for the new layout
- bump DB_ENTRYPOINT_VERSION to v6 and PG_BACKUP_VERSION to v3 (swarm configs
  are immutable)
- drop entrypoint.postgres.sh.tmpl
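
Runtime superuser detection for the backup hook might be sketched as below.
This is an illustration under assumptions, not the code in pg_backup.sh: the
detect_superuser helper and the PSQL variable are hypothetical, and it relies
on the trust auth configured for the db service so each candidate role can be
tried without a password.

```shell
#!/bin/sh
# Sketch only: detect_superuser and PSQL are hypothetical names.
PSQL="${PSQL:-psql}"

# Try each candidate role in turn and print the name of the bootstrap
# superuser, which always has oid 10. Returns 1 if no candidate can connect.
detect_superuser() {
  for candidate in "$@"; do
    name="$("$PSQL" -U "$candidate" -d template1 -Atc \
      "SELECT rolname FROM pg_roles WHERE oid = 10" 2>/dev/null)" || continue
    if [ -n "$name" ]; then
      echo "$name"
      return 0
    fi
  done
  return 1
}
```

Called as `detect_superuser postgres discourse`, this covers the two install
users named above; any other install user would need to be added to the
candidate list.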

Verified on cctest: upgrade from an existing pg17 cluster (install user
'postgres') -> pg18, all data preserved, serves over HTTPS via Traefik.
notplants
2026-06-22 16:50:08 +00:00
committed by notplants
parent 0c4539b7ad
commit 9b33fd8761
6 changed files with 146 additions and 91 deletions


@@ -63,35 +63,42 @@ services:
start_period: 25m
db:
image: pgvector/pgvector:pg17
# discourse/postgres = pgvector + discourse's postgres management layer, which
# auto-upgrades an older cluster in place on boot (pg_upgrade into the versioned
# PGDATA /var/lib/postgresql/${MAJOR}/docker). The cc-db-entrypoint wrapper
# injects the password secret and detects the old cluster's install user.
image: discourse/postgres:pg18
networks:
- internal
secrets:
- db_password
volumes:
- 'postgresql_data:/var/lib/postgresql/data'
# the image expects the whole cluster tree mounted here (not the data subdir);
# an existing pg17 cluster at the volume root is found and upgraded into /18/docker
- 'postgresql_data:/var/lib/postgresql'
configs:
- source: db_entrypoint
target: /docker-entrypoint.sh
target: /usr/local/bin/cc-db-entrypoint.sh
mode: 0555
- source: pg_backup
target: /pg_backup.sh
mode: 0555
entrypoint: /docker-entrypoint.sh
entrypoint: /usr/local/bin/cc-db-entrypoint.sh
environment:
# internal-only overlay network; keep all-trust so the app and the
# backup/restore hooks connect without juggling the superuser password
- POSTGRES_HOST_AUTH_METHOD=trust
- POSTGRES_USER=discourse
- POSTGRES_DB=discourse
- POSTGRES_PASSWORD_FILE=/run/secrets/db_password
- DB_USER=discourse
healthcheck:
test: "pg_isready -U discourse -d discourse"
interval: 30s
timeout: 10s
retries: 5
# generous: a postgres major-version upgrade (apt install + pg_upgrade) runs
# in the entrypoint before the server accepts connections — don't let the
# healthcheck kill an in-progress migration
start_period: 10m
# generous: a postgres major-version upgrade (apt install old binaries +
# pg_upgrade) runs in the entrypoint before the server accepts connections —
# don't let the healthcheck kill an in-progress migration
start_period: 15m
deploy:
labels:
backupbot.backup: "true"
@@ -140,8 +147,7 @@ configs:
file: migrate-uploads.sh
db_entrypoint:
name: ${STACK_NAME}_db_entrypoint_${DB_ENTRYPOINT_VERSION}
file: entrypoint.postgres.sh.tmpl
template_driver: golang
file: cc-db-entrypoint.sh
pg_backup:
name: ${STACK_NAME}_pg_backup_${PG_BACKUP_VERSION}
file: pg_backup.sh