From c2c66f21d8e3c526c0cfee0d9908fd2abaa8c9ab Mon Sep 17 00:00:00 2001 From: autonomic-bot Date: Sat, 30 May 2026 21:19:08 +0000 Subject: [PATCH] =?UTF-8?q?journal(2):=20backupbot=20enumerate-once=20flow?= =?UTF-8?q?=20=E2=86=92=20harness=20must=20verify+re-invoke=20backup=20if?= =?UTF-8?q?=20db=20volume=20missing=20(chosen=20fix)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- machine-docs/JOURNAL-2.md | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/machine-docs/JOURNAL-2.md b/machine-docs/JOURNAL-2.md index a7e260c..63d7a28 100644 --- a/machine-docs/JOURNAL-2.md +++ b/machine-docs/JOURNAL-2.md 
@@ -1434,3 +1434,21 @@ does it enumerate volumes relative to that?) to confirm the cycle+capture intera candidate = harness verifies the backup snapshot contains the db volume and retries if not, AND/OR the recipe-PR backup is made resilient (+ `set -o pipefail` + fail-loud on missing dump so it can never be silent again). 5 ghost runs done (full4 timeout-fixed; full5/6/7/8 restore race) — stop blind re-runs. + +## 2026-05-30T21:18Z — backupbot backup flow read: enumerate-once → no retry recovers a dropped volume + +Read backup-bot-two `/usr/bin/backup` `create`: it computes (pre_cmds, post_cmds, backup_paths) ONCE +via get_backup_details (which resolves each labelled volume's host path from the RUNNING service spec), +then runs pre_cmds (mysqldump via docker exec), then `backup_volumes(backup_paths, retries)` (restic), +then post_cmds. It does NOT stop/cycle the db. So the db cycle I observed during backup is swarm/mysqld, +NOT backupbot. Critically: backup_paths are enumerated ONCE up-front; if the db service is mid-cycle at +enumeration, its mysql path is omitted from backup_paths and abra's `--retries` (which only retries the +restic step) can NEVER recover it. So a per-restic retry is useless here. + +FIX (chosen, harness-side, general for all DB recipes): after perform_backup, VERIFY the resulting +snapshot includes the db service's backupbot-labelled volume path; if missing, RE-INVOKE the whole +`abra app backup create` (fresh enumeration) up to N times. This closes the enumerate-during-cycle race +generally. Pair with recipe-PR mysql_backup.sh `set -o pipefail` + fail-loud-on-missing-dump so a +dump-less restore can never silently no-op again. (Still-open minor: the db cycle's own trigger during +backup — not OOM/not-healthcheck — left as a separate observation; the harness verify+retry makes the +backup correct regardless.) Implement next tick, then ghost full run to verify green incl upgrade.
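Review bot: In the FIX paragraph, "RE-INVOKE the whole `abra app backup create` (fresh enumeration) up to N times" leaves N undefined and does not say what happens between attempts or once they are exhausted. Because the entry attributes the missing path to the db service being mid-cycle at enumeration, an immediate re-invoke can enumerate inside the same cycle. Suggest recording that the harness waits for the db service to be running again before each re-invoke, and that it fails the run loudly when N attempts are used up instead of going on to restore from a snapshot known to lack the db volume. It would also help to cite the evidence behind "its mysql path is omitted from backup_paths" (for example a snapshot listing from one of full5/6/7/8), since the entry derives it from reading `create` while the db cycle's own trigger is still listed as open.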