From 01fd43bcd5fd8cd2f7820ac89334088c499bc396 Mon Sep 17 00:00:00 2001
From: autonomic-bot <autonomic-bot@git.autonomic.zone>
Date: Sat, 30 May 2026 20:14:13 +0000
Subject: [PATCH] =?UTF-8?q?journal(2):=20ghost=20full5=20restore=20RED=20(?=
 =?UTF-8?q?ci=5Fmarker=20absent)=20=E2=80=94=20full6=20instrumented=20re-r?=
 =?UTF-8?q?un=20to=20characterize=20flaky=20vs=20systematic?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 machine-docs/JOURNAL-2.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/machine-docs/JOURNAL-2.md b/machine-docs/JOURNAL-2.md
index 02412b0..a5699a2 100644
--- a/machine-docs/JOURNAL-2.md
+++ b/machine-docs/JOURNAL-2.md
@@ -1368,3 +1368,24 @@ migration — pure can't-connect), so the only cost is the mysql-init head start
 Fix (4a160f6): bump ghost DEPLOY_TIMEOUT + EXTRA_ENV TIMEOUT 1200→2400s (matches discourse). Not a
 test weakening — the wait is bounded; a genuine hang still fails at 40min. Teardown after full4 was
 clean (no leftover stack/volume/secret). Re-running as full5.
+
+## 2026-05-30T20:10Z — ghost full5: P4 restore RED (ci_marker table absent post-restore) — investigating
+
+full5 (`/root/ccci-ghost-full5.log`, 2400s timeout): deploy-count=1, install/upgrade/backup/custom
+PASS, **restore FAIL**. `test_restore_returns_state`: `Table 'ghost.ci_marker' doesn't exist` after
+restore (generic test_restore_healthy PASSED → app up; my P4 overlay caught a data-integrity gap).
+
+Recipe-PR head ae43ffe DOES ship the reimport hook (verified ~/.abra/recipes/ghost on cc-ci):
+compose db service has `backupbot.backup.pre-hook=/mysql_backup.sh backup`,
+`backupbot.backup.volumes.mysql.path=backup.sql.gz`, `backupbot.restore.post-hook=/mysql_backup.sh
+restore`; mysql_backup.sh restore = `gunzip -c /var/lib/mysql/backup.sql.gz | mysql -u root`.
+
+Puzzle: full3 (17:23, app-ONLY overlay, db@native 1m) was FULLY GREEN incl restore; full5 (after
+3ca45c7 added db@15m grace) regressed on restore. db-grace was observed-necessary (run#2 mysql-init
+exit-137 redo-corruption deadlock under load), so I can't just drop it. But db start_period only
+changes WHEN swarm marks unhealthy — it shouldn't mechanically break the reimport. So leading
+hypotheses: (a) load-dependent flake in backupbot restore / the reimport; (b) recipe-hook robustness
+gap — `gunzip -c | mysql` has `set -e` but NOT `set -o pipefail`, so a failed/empty gunzip silently
+reimports nothing yet returns 0. Action: full6 re-run + instrument the restore tier live (capture
+backupbot restore output, backup.sql.gz presence, whether reimport populated ci_marker). NOT claiming
+ghost until restore is reliably green. Stack/vol teardown after full5 was clean.