journal(2): mattermost PASS (3rd this session); next bluesky-pds P4 scoped (account-based marker to catch running-app-sqlite-hold restore gap)

2026-05-30 02:39:05 +01:00
parent 32050885a8
commit 8e160af997

@@ -1286,3 +1286,32 @@ This is the 2nd recipe (after immich) where the P4 data-integrity overlay caught
backup/restore defect — strong evidence the phase's P4 requirement is doing real work. The remaining
backup-capable recipes (bluesky-pds, uptime-kuma, ghost) should be assumed similarly suspect until their
restore is proven to round-trip seeded data.
---
## 2026-05-30T~01:40 — Q4.5 mattermost PASS (3rd this session); next: bluesky-pds P4 (scoped)
Session tally: immich Q3.5 PASS (recipe-PR adds DB backup), matrix-synapse Q4.1 PASS (post-restore
DB-pool race fix), mattermost-lts Q4.5 PASS (recipe-PR fixes no-op restore; negative control proved
teeth). Two recipe-PRs fixing real coop-cloud data-loss bugs (immich + mattermost), both Adversary-
verified non-vacuous via PR=0 negative controls.
**NEXT: bluesky-pds P4 (Q4.3 already has strong functional coverage; only the P4 data-integrity
overlay is missing).** Recipe shape: service `app` (pds 0.4) mounts `pds_data:/pds` (PDS_DATA_DIRECTORY=/pds;
atproto account/repo sqlite + blobs under /pds/blocks). `backupbot.backup=true` on `app`, NO
backup.path / pre-hook / restore post-hook → whole-volume file-level backup (same shape as mattermost's
broken PGDATA backup). **Design decision for the P4 marker — DON'T use a bare /pds/ci_marker FILE:**
the PDS doesn't hold a loose file open, so a file marker would survive restore even if the running PDS
fails to reload its restored sqlite — i.e. it would NOT catch the "running app holds the data files"
class of bug (which IS what bit mattermost/immich). To have teeth, seed RECIPE-AWARE data, in four
steps: (1) seed: create an atproto account with a unique handle via the PDS API, as the §4.3 test does
(`com.atproto.server.createAccount` with an admin-minted invite code); (2) `test_backup` asserts the
account resolves (`com.atproto.repo.describeRepo`); (3) `pre_restore` deletes it
(`com.atproto.admin.deleteAccount`, admin auth via pds_admin_password) so a successful restore is
OBSERVABLE; (4) `test_restore` asserts the account resolves again. Expect this MAY reveal the same
running-app-holds-sqlite restore gap → if so, recipe-PR (restart the pds on restore, or a sqlite-aware
restore hook). Deploy-test first to find out (don't assume).
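
Sketch of the account-based marker lifecycle described above, not yet run against a deployed PDS. Assumptions, all to be checked at deploy-test time: the helper names (`seed_account`, `account_resolves`, `delete_account`, `http_transport`) are hypothetical; admin calls are assumed to use HTTP Basic auth with user `admin` and pds_admin_password; the request and response field names (`useCount`, `code`, `inviteCode`, `did`, `repo`) follow my reading of the atproto lexicons and may need adjusting. The transport is injected so the seed / assert / delete / assert sequence can be exercised without a live PDS.

```python
import base64
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable

# Transport signature: (http_method, nsid, payload, admin) -> decoded JSON dict.
Transport = Callable[[str, str, dict, bool], dict]


class XrpcError(Exception):
    """Raised when the PDS answers an XRPC call with an HTTP error status."""


def http_transport(base_url: str, admin_password: str) -> Transport:
    """urllib-based transport; base_url and Basic auth as 'admin' are assumptions."""
    token = base64.b64encode(f"admin:{admin_password}".encode()).decode()

    def call(method: str, nsid: str, payload: dict, admin: bool = False) -> dict:
        url = f"{base_url}/xrpc/{nsid}"
        headers = {"Authorization": f"Basic {token}"} if admin else {}
        data = None
        if method == "GET":
            url += "?" + urllib.parse.urlencode(payload)
        else:
            data = json.dumps(payload).encode()
            headers["Content-Type"] = "application/json"
        req = urllib.request.Request(url, data=data, headers=headers, method=method)
        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                body = resp.read()
        except urllib.error.HTTPError as exc:
            raise XrpcError(f"{nsid}: HTTP {exc.code}") from exc
        return json.loads(body) if body else {}

    return call


def seed_account(call: Transport, handle: str, email: str, password: str) -> str:
    """Seed step: mint a single-use invite as admin, create the account, return its DID."""
    invite = call("POST", "com.atproto.server.createInviteCode", {"useCount": 1}, True)
    account = call(
        "POST",
        "com.atproto.server.createAccount",
        {"handle": handle, "email": email, "password": password,
         "inviteCode": invite["code"]},
        False,
    )
    return account["did"]


def account_resolves(call: Transport, did: str) -> bool:
    """test_backup / test_restore check: describeRepo must answer for the seeded DID."""
    try:
        repo = call("GET", "com.atproto.repo.describeRepo", {"repo": did}, False)
    except XrpcError:
        return False
    return repo.get("did") == did


def delete_account(call: Transport, did: str) -> None:
    """pre_restore step: remove the account so a successful restore becomes observable."""
    call("POST", "com.atproto.admin.deleteAccount", {"did": did}, True)
```

Intended wiring: `seed_account` before the backup, `account_resolves` in `test_backup`, `delete_account` in `pre_restore`, and `account_resolves` again in `test_restore`. The check goes through the running PDS rather than reading the volume, which is what should make it fail if the restored sqlite is on disk but the live process is still serving its old state.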
- After bluesky: uptime-kuma (sqlite data-vol + Socket.IO §4.3 create-monitor) and ghost (mysql
backup + §4.3 create-post) remain; then plausible (clickhouse rate-limit) cold green; discourse/drone
stay BLOCKED. Then Q5 (docs + DONE).
Checkpointing here (node clean, no gate pending — all 3 claims this session PASSed) to take bluesky
fresh next cycle; the analysis above lets it start at the overlay, not the investigation.