feat(2): ghost P4 data-integrity overlay (MySQL ci_marker) + §4.3 create-post round-trip

- ops.py + test_{upgrade,backup,restore}.py: seed ci_marker into the MySQL `ghost` DB (db service) via the mysql CLI; rides the recipe's mysqldump --tab backup. recipe is MySQL not sqlite (stale comment fixed). Expect restore RED -> recipe-PR (no backupbot.restore hook; immich/mattermost class).
- functional/_ghost.py: cookie-aware Ghost Admin API client (stdlib http.cookiejar; Origin CSRF hdr).
- functional/test_post_roundtrip.py: §4.3 create published post + read back (unique marker, non-vacuous); closes the DEFERRED ghost create-post item.
- PARITY.md + recipe_meta.py updated.

Authored node-free; full-lifecycle run next, NOT yet claimed.
2026-05-30 04:14:06 +01:00
parent c8c3cc8858
commit b4d03ccafe
8 changed files with 355 additions and 14 deletions
--- a/tests/ghost/PARITY.md
+++ b/tests/ghost/PARITY.md
@@ -16,22 +16,35 @@ and a JSON Content/Admin API at `/ghost/api/*`. Defining behaviors exercised:

 Two specific tests + parity health_check = ≥2 floor met.

-## Plan §4.3 prescribed deeper test (deferred to Q4 follow-up)
+## Plan §4.3 prescribed deeper test — AUTHORED (closes DEFERRED ghost create-post)

-§4.3 named "create-a-post round-trip" for ghost. That requires:
-1. Setup the Ghost owner (POST `/ghost/api/v3/admin/authentication/setup/`) with a per-run
-   admin email+password.
-2. Login → JWT bearer token.
-3. POST `/ghost/api/v3/admin/posts/` to create a post.
-4. GET `/ghost/api/v3/admin/posts/<id>/` to read it back.
+§4.3 named "create-a-post round-trip" for ghost. Implemented in
+`tests/ghost/functional/test_post_roundtrip.py` (helper `functional/_ghost.py`):
+1. Wait for the Admin API healthcheck (`GET /ghost/api/admin/site/` → 200).
+2. Setup the Ghost owner (POST `/ghost/api/admin/authentication/setup/`, fresh deploy) + establish
+   an admin **session cookie** (POST `/ghost/api/admin/session/`) — cookie-aware stdlib opener,
+   version-negotiated (no `/v3/` in the path; recipe-versioned).
+3. POST `/ghost/api/admin/posts/?source=html` to create a published post with a unique marker in
+   title + body.
+4. GET `/ghost/api/admin/posts/<id>/?formats=html` to read it back; assert title + body marker
+   round-trip intact (unique-per-run → non-vacuous).

-Doable; adds a per-run setup secret + token-management. Tracked for Q4 follow-up.
+Admin creds are class-B run-scoped (destroyed at teardown with the app).

-## Backup data-integrity (P4)
+## Backup data-integrity (P4) — AUTHORED

-Lifecycle overlays not authored. The base recipe stores state in SQLite + a content volume;
-backup-capable is auto-detected from compose. Q5 catch-up if backup data-integrity proves
-needed for this recipe.
+`ops.py` seeds a deterministic `ci_marker` row into the **MySQL** `ghost` DB (the recipe's real
+state store) via the `mysql` CLI in the `db` service; the non-install lifecycle overlays
+(`test_upgrade.py` / `test_backup.py` / `test_restore.py`) assert on it post-op. The recipe's
+backupbot pre-hook (`mysqldump ghost --tab`) dumps that table into the backed-up path, so the
+marker rides backup→restore the way a real post's row would. pre_restore drops the table
+(divergence); the restore overlay asserts it returned.
+
+**Expected RED until a recipe-PR lands:** the ghost recipe has a logical mysqldump backup but **no
+`backupbot.restore.*` hook** (and the mysql data volume itself isn't backupbot-labelled), so a
+file-level restore never reimports the dump — same defect class fixed in immich#1 / mattermost-lts#1.
+If `test_restore_returns_state` goes RED, the durable fix is a recipe-PR adding a mysqldump-reimport
+restore post-hook. (See `test_restore.py` docstring + DECISIONS.md.)

 ## Playwright (P6)

--- a/tests/ghost/functional/_ghost.py
+++ b/tests/ghost/functional/_ghost.py
@@ -0,0 +1,116 @@
+"""Shared ghost test helper — a cookie-aware Ghost Admin API client.
+
+Ghost's Admin API authenticates a human session via a cookie (`ghost-admin-api-session`) set by
+`POST /ghost/api/admin/session/`, and enforces a CSRF check requiring the `Origin` header to match
+the site's configured `url` on state-changing requests. The shared `runner/harness/http` helpers are
+deliberately cookie-less (stateless status+json), so this helper builds a stdlib
+`urllib` opener with an `HTTPCookieProcessor` so the session cookie persists across
+setup → login → create → read within the test process.
+
+Auth path (version-independent — no `/v3/`/`/v5/` in the URL; Ghost negotiates the API version, and
+the recipe under test may be any 5.x/6.x):
+  1. `POST /authentication/setup/` — creates the owner on a FRESH deploy (idempotent: a re-run finds
+     "setup already completed" and we ignore it).
+  2. `POST /session/` — establishes the admin session cookie (always done, so the client is
+     authenticated whether or not THIS process ran setup).
+  3. `POST /posts/?source=html` / `GET /posts/<id>/` — create + read back.
+
+Admin credentials are class-B run-scoped (deterministic within a run; the whole app — DB + secrets —
+is destroyed at teardown). Password is ≥10 chars per Ghost's setup requirement.
+"""
+
+from __future__ import annotations
+
+import http.cookiejar
+import json
+import ssl
+import urllib.error
+import urllib.request
+
+# Per-run *.ci.commoninternet.net domains use the operator wildcard cert via Traefik file provider;
+# the real-cert check is done once in the generic install assertion, so content/API calls skip the
+# chain check (same rationale as runner/harness/http._CTX).
+_CTX = ssl.create_default_context()
+_CTX.check_hostname = False
+_CTX.verify_mode = ssl.CERT_NONE
+
+ADMIN_NAME = "CCCI Admin"
+ADMIN_EMAIL = "ccci-admin@ccci.example.com"
+ADMIN_PW = "Ccci-Test-Pw-2026!"  # >=10 chars (Ghost setup requirement)
+BLOG_TITLE = "CCCI Test Blog"
+
+
+def _json(raw: bytes) -> object | None:
+    try:
+        return json.loads(raw)
+    except ValueError:  # covers json.JSONDecodeError and UnicodeDecodeError (both subclasses)
+        return None
+
+
+class GhostAdmin:
+    def __init__(self, domain: str):
+        self.base = f"https://{domain}/ghost/api/admin"
+        self.origin = f"https://{domain}"
+        self._jar = http.cookiejar.CookieJar()
+        self._opener = urllib.request.build_opener(
+            urllib.request.HTTPCookieProcessor(self._jar),
+            urllib.request.HTTPSHandler(context=_CTX),
+        )
+
+    def req(self, method: str, path: str, body: dict | None = None, timeout: int = 60):
+        url = f"{self.base}{path}"
+        data = json.dumps(body).encode() if body is not None else None
+        request = urllib.request.Request(url, data=data, method=method)
+        if data is not None:
+            request.add_header("Content-Type", "application/json")
+        # CSRF: Ghost requires Origin to match the configured site url on state-changing requests.
+        request.add_header("Origin", self.origin)
+        try:
+            with self._opener.open(request, timeout=timeout) as resp:
+                return resp.getcode(), _json(resp.read())
+        except urllib.error.HTTPError as e:
+            return e.code, _json(e.read())
+        except Exception as e:  # noqa: BLE001 — transport-level: surface as status 0
+            return 0, {"transport_error": str(e)}
+
+    def ensure_authenticated(self) -> None:
+        # Ensure the owner exists (fresh deploy → 201; already set up → 4xx, ignored).
+        self.req(
+            "POST",
+            "/authentication/setup/",
+            {
+                "setup": [
+                    {
+                        "name": ADMIN_NAME,
+                        "email": ADMIN_EMAIL,
+                        "password": ADMIN_PW,
+                        "blogTitle": BLOG_TITLE,
+                    }
+                ]
+            },
+        )
+        # Always establish a fresh admin session (cookie persists in self._jar).
+        status, body = self.req(
+            "POST", "/session/", {"username": ADMIN_EMAIL, "password": ADMIN_PW}
+        )
+        assert status in (200, 201), (
+            f"ghost admin session login failed: HTTP {status}, body={body!r}"
+        )
+
+    def create_post(self, title: str, html: str) -> dict:
+        status, body = self.req(
+            "POST",
+            "/posts/?source=html",
+            {"posts": [{"title": title, "html": html, "status": "published"}]},
+        )
+        assert status in (200, 201), f"create post failed: HTTP {status}, body={body!r}"
+        posts = (body or {}).get("posts") or []
+        assert posts and posts[0].get("id"), f"create post returned no id: {body!r}"
+        return posts[0]
+
+    def get_post(self, post_id: str) -> dict:
+        status, body = self.req("GET", f"/posts/{post_id}/?formats=html")
+        assert status == 200, f"read post failed: HTTP {status}, body={body!r}"
+        posts = (body or {}).get("posts") or []
+        assert posts, f"read post returned empty: {body!r}"
+        return posts[0]
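
The cookie-persistence idea behind `GhostAdmin` can be seen in isolation with a small local sketch. This is not part of the commit and does not talk to Ghost: the handler, the `demo-session` cookie name, and the `/session/` and `/posts/` paths on the toy server are all hypothetical stand-ins. It only illustrates that an opener built with `HTTPCookieProcessor` replays a cookie set by an earlier response, while a plain opener does not.

```python
import http.cookiejar
import http.server
import threading
import urllib.error
import urllib.request


class _DemoHandler(http.server.BaseHTTPRequestHandler):
    """Toy stand-in for a session API: POST sets a cookie, GET requires it."""

    def _reply(self, status: int, cookie: str = "") -> None:
        self.send_response(status)
        if cookie:
            self.send_header("Set-Cookie", cookie)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_POST(self) -> None:
        # Drain the request body before answering.
        self.rfile.read(int(self.headers.get("Content-Length") or 0))
        self._reply(201, "demo-session=abc123; Path=/")

    def do_GET(self) -> None:
        has_cookie = "demo-session=abc123" in (self.headers.get("Cookie") or "")
        self._reply(200 if has_cookie else 403)

    def log_message(self, *args) -> None:  # keep the demo quiet
        pass


def _status(opener, url: str, data: bytes = None) -> int:
    """Return the HTTP status, treating 4xx/5xx as values rather than exceptions."""
    method = "POST" if data is not None else "GET"
    request = urllib.request.Request(url, data=data, method=method)
    try:
        with opener.open(request, timeout=10) as resp:
            return resp.getcode()
    except urllib.error.HTTPError as exc:
        return exc.code


def demo() -> tuple:
    server = http.server.HTTPServer(("127.0.0.1", 0), _DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    no_proxy = urllib.request.ProxyHandler({})  # never route loopback via a proxy
    try:
        stateless = urllib.request.build_opener(no_proxy)
        jar = http.cookiejar.CookieJar()
        stateful = urllib.request.build_opener(
            no_proxy, urllib.request.HTTPCookieProcessor(jar)
        )
        return (
            _status(stateless, f"{base}/posts/"),  # no cookie: rejected
            _status(stateful, f"{base}/session/", data=b"{}"),  # login sets cookie
            _status(stateful, f"{base}/posts/"),  # cookie replayed from the jar
        )
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` exercises the same setup → login → read ordering the helper relies on, just against a loopback server instead of a live deploy.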
--- a/tests/ghost/functional/test_post_roundtrip.py
+++ b/tests/ghost/functional/test_post_roundtrip.py
@@ -0,0 +1,59 @@
+"""ghost — Q4.4 recipe-specific functional test (plan §4.3: "create the app's primary object — a
+post — and read it back").
+
+Exercises Ghost's core publishing path end-to-end against the live per-run deploy, via the real
+Admin API:
+  1. Wait for the Admin API to answer (the recipe's own healthcheck hits /ghost/api/admin/site/).
+  2. Bootstrap the owner on a fresh deploy + establish an admin session (_ghost.GhostAdmin).
+  3. POST /ghost/api/admin/posts/ to create a published post carrying a unique marker (title + body).
+  4. GET /ghost/api/admin/posts/<id>/ to read it back and assert the marker round-tripped intact.
+
+NOT health-only: a Ghost whose DB/Admin-API/publishing path is broken fails here even though `/`
+(themed front) and `/ghost/` (admin SPA shell) return 200. The marker is unique per run, so a stale
+or echoed response cannot pass. This closes the DEFERRED.md ghost "create-a-post round-trip" item.
+"""
+
+from __future__ import annotations
+
+import os
+import sys
+import uuid
+
+sys.path.insert(0, os.path.dirname(__file__))
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "..", "runner"))
+import _ghost  # noqa: E402
+from harness import http as harness_http  # noqa: E402
+
+
+def test_create_post_roundtrip(live_app):
+    # 1) The Admin API (and its DB migrations) may settle slightly after the themed front is up —
+    #    poll the recipe's own admin healthcheck endpoint before authenticating.
+    harness_http.retry_http_get(
+        f"https://{live_app}/ghost/api/admin/site/",
+        expect_status=200,
+        max_wait=120,
+        interval=10,
+    )
+
+    admin = _ghost.GhostAdmin(live_app)
+    admin.ensure_authenticated()  # 2) bootstrap the owner (fresh deploy) + admin session
+
+    # 3) Create a published post with a unique marker in both title and body.
+    uniq = uuid.uuid4().hex[:10]
+    title = f"ccci-marker-{uniq}"
+    marker = f"ccci-body-marker-{uniq}-roundtrip"
+    created = admin.create_post(title, f"<p>{marker}</p>")
+    assert created.get("title") == title, (
+        f"created post title mismatch: sent {title!r}, got {created.get('title')!r}"
+    )
+
+    # 4) Read it back by id and assert the post survived the round-trip (title always returned;
+    #    html returned because we requested ?formats=html).
+    got = admin.get_post(created["id"])
+    assert got.get("title") == title, (
+        f"post title did not round-trip: sent {title!r}, got {got.get('title')!r}"
+    )
+    html = got.get("html") or ""
+    assert marker in html, (
+        f"post body did not round-trip: marker {marker!r} not in read-back html {html!r}"
+    )
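
Step 1 of the test polls the admin healthcheck before authenticating. The harness's `retry_http_get` is not shown in this diff, so the following is only a hypothetical stand-in for that kind of poll loop, not its actual implementation. The name `poll_until` and the injectable `sleep`/`clock` parameters are assumptions made so the sketch can be exercised without real waiting.

```python
import time
from typing import Callable


def poll_until(
    probe: Callable[[], int],
    expect_status: int = 200,
    max_wait: float = 120,
    interval: float = 10,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Call probe() until it returns expect_status; return the attempt count.

    Raises TimeoutError when another sleep would overrun max_wait.
    """
    deadline = clock() + max_wait
    attempts = 0
    while True:
        attempts += 1
        status = probe()
        if status == expect_status:
            return attempts
        if clock() + interval > deadline:
            raise TimeoutError(
                f"still HTTP {status} after {attempts} attempts ({max_wait}s budget)"
            )
        sleep(interval)
```

Injecting the clock and sleep functions keeps the retry policy testable on its own; in a real run the defaults would apply and `probe` would wrap the HTTP GET.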
--- a/tests/ghost/ops.py
+++ b/tests/ghost/ops.py
@@ -0,0 +1,58 @@
+"""ghost — pre-op seed hooks (Phase 1e HC3 / Phase 2 P4 backup data-integrity).
+
+The orchestrator runs these BEFORE each op; the matching test_<op>.py asserts post-op (assertion
+only). The CURRENT ghost recipe (1.2.0+6.21.2-alpine) stores ALL its content — posts, users,
+settings — in a **MySQL** `ghost` database (compose `db` service, `mysql:8.0`), NOT sqlite (the
+older recipe_meta comment was stale; the live compose uses `database__client: mysql`). The recipe's
+`db` service is backupbot-labelled with a **logical dump** pre-hook
+(`mysqldump -u root -p... ghost --tab /var/lib/mysql-files/`, `backup.path=/var/lib/mysql-files/`),
+so the marker must live in the `ghost` database to ride that dump.
+
+We seed a dedicated `ci_marker` table (Ghost's own knex migrations never touch it) via the `mysql`
+CLI in the `db` service. MYSQL_PWD (not `-p<pw>`) avoids the client's "password on the command line
+is insecure" stderr noise; `-N -s` strips column names + box decoration so read-backs are clean
+scalars. The marker rides backup→restore the same way a real post's row would.
+"""
+
+import os
+import sys
+
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
+from harness import lifecycle  # noqa: E402
+
+
+def _mysql(domain, sql):
+    cmd = (
+        'MYSQL_PWD="$(cat /run/secrets/db_password)" '
+        f'mysql -u root -N -s ghost -e "{sql}"'
+    )
+    return lifecycle.exec_in_app(domain, ["sh", "-c", cmd], service="db").strip()
+
+
+def _seed(domain, value):
+    _mysql(
+        domain,
+        "CREATE TABLE IF NOT EXISTS ci_marker(v VARCHAR(255)); DELETE FROM ci_marker; "
+        f"INSERT INTO ci_marker VALUES('{value}');",
+    )
+    got = _mysql(domain, "SELECT v FROM ci_marker;")
+    assert got == value, f"seed did not commit (read back {got!r}, expected {value!r})"
+
+
+def pre_upgrade(domain, meta):
+    _seed(domain, "upgrade-survives")
+
+
+def pre_backup(domain, meta):
+    _seed(domain, "original")
+
+
+def pre_restore(domain, meta):
+    # diverge from the backup so a successful restore is observable: drop the marker table.
+    _mysql(domain, "DROP TABLE IF EXISTS ci_marker;")
+    got = _mysql(
+        domain,
+        "SELECT COUNT(*) FROM information_schema.tables "
+        "WHERE table_schema='ghost' AND table_name='ci_marker';",
+    )
+    assert got == "0", f"drop did not take (information_schema still lists ci_marker: {got!r})"
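
The `_mysql` helper wraps the SQL in double quotes inside an `sh -c` string, which works for the fixed statements used here but would let the shell expand `$` or break on an embedded `"`. A minimal sketch of a quoting-safe variant, assuming the same secret path and `mysql` flags as the helper above; the function name `mysql_command` is hypothetical and this is not what the commit ships.

```python
import shlex


def mysql_command(sql: str, database: str = "ghost") -> list:
    """Build the argv for `sh -c` with the SQL passed as one shell-quoted word."""
    inner = (
        # Password comes from the secret file, as in the original helper.
        'MYSQL_PWD="$(cat /run/secrets/db_password)" '
        # shlex.quote single-quotes the SQL so sh performs no expansion on it.
        f"mysql -u root -N -s {shlex.quote(database)} -e {shlex.quote(sql)}"
    )
    return ["sh", "-c", inner]
```

The returned list has the same shape as the one the helper passes to `lifecycle.exec_in_app`, so it could be dropped in if arbitrary SQL ever needs to go through this path.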
--- a/tests/ghost/recipe_meta.py
+++ b/tests/ghost/recipe_meta.py
@@ -1,13 +1,17 @@
 # Per-recipe harness config for ghost (Phase 2 Q4.4 — Node.js publishing platform).
 # Ghost serves an HTML site at `/`; admin UI at `/ghost/`. The first GET to /ghost/ redirects
 # to the setup wizard (302). Ghost exposes a JSON Content API at /ghost/api/content/ which
-# requires an API key; the Admin API at /ghost/api/admin/ requires auth tokens.
+# requires an API key; the Admin API at /ghost/api/admin/ requires a session/token (see
+# functional/_ghost.py — version-negotiated, no /v3/ path).
+# State lives in a **MySQL** `ghost` DB (compose `db` service, mysql:8.0) + the `ghost_content`
+# volume (themes/images) — NOT sqlite. The `db` service is backupbot-labelled with a logical
+# mysqldump pre-hook; P4 (ops.py + test_{backup,restore,upgrade}.py) seeds a `ci_marker` row there.
 HEALTH_PATH = "/"  # Ghost serves a themed site HTML at root (200)
 HEALTH_OK = (200,)
 DEPLOY_TIMEOUT = 1200  # subprocess timeout for `abra app deploy` (cold-start ghost ~15-20min)
 HTTP_TIMEOUT = 900

-# Ghost's first-boot does theme + DB migrations on a fresh sqlite volume; default TIMEOUT=300
+# Ghost's first-boot does theme + DB migrations against a fresh MySQL `ghost` DB; default TIMEOUT=300
 # (abra's internal convergence wait) is too tight on cc-ci's single node. Bump to 1200s, matched
 # to DEPLOY_TIMEOUT so abra finishes its convergence wait before the Python subprocess timeout.
 EXTRA_ENV = {"TIMEOUT": "1200"}
--- a/tests/ghost/test_backup.py
+++ b/tests/ghost/test_backup.py
@@ -0,0 +1,28 @@
+"""ghost — BACKUP overlay (Phase 1e HC3 / Phase 2 P4): assertion-only + additive.
+
+ops.pre_backup seeded ci_marker='original' into the MySQL `ghost` DB before the backup op (the
+recipe's backupbot pre-hook runs `mysqldump ... ghost --tab /var/lib/mysql-files/`, dumping every
+`ghost` table — including ci_marker — into the backed-up path). The orchestrator performed the
+backup once. This overlay ADDS: the seeded row is intact in the live DB at backup time. The
+backup→restore divergence (dropping the table) is in ops.pre_restore.
+"""
+
+import os
+import sys
+
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
+from harness import lifecycle  # noqa: E402
+
+
+def _mysql(domain, sql):
+    cmd = (
+        'MYSQL_PWD="$(cat /run/secrets/db_password)" '
+        f'mysql -u root -N -s ghost -e "{sql}"'
+    )
+    return lifecycle.exec_in_app(domain, ["sh", "-c", cmd], service="db").strip()
+
+
+def test_backup_captures_state(live_app):
+    assert _mysql(live_app, "SELECT v FROM ci_marker;") == "original", (
+        "the seeded ghost MySQL marker was not present at backup time"
+    )
--- a/tests/ghost/test_restore.py
+++ b/tests/ghost/test_restore.py
@@ -0,0 +1,36 @@
+"""ghost — RESTORE overlay (Phase 1e HC3 / Phase 2 P4): data-integrity, assertion-only + additive.
+
+ops.pre_restore dropped the ci_marker table (diverge from the backup); the orchestrator restored
+once. This overlay ADDS: the restored DB carries the pre-mutation 'original' marker — proving the
+seeded data actually survived backup→restore, not just that the service came back up.
+
+NOTE (expected RED until a recipe-PR lands): the current ghost recipe backs the DB up as a LOGICAL
+mysqldump (`--tab` → SQL files under /var/lib/mysql-files/) but ships **no `backupbot.restore.*`
+hook**, and the actual mysql data volume is NOT itself backupbot-labelled. So a file-level restore
+puts the dump files back on disk but never reimports them into the running MySQL → the dropped marker
+does not return. This is the SAME defect class cc-ci already caught + fixed in immich and
+mattermost-lts (pg_dump backup with no reimport-on-restore). If this test goes RED, the fix is a
+recipe-PR adding a restore post-hook that reimports the dump (terminate/recreate `ghost` DB +
+`mysql ghost < dump` / `mysqlimport`), mirroring the mattermost-lts#1 / immich#1 pattern.
+"""
+
+import os
+import sys
+
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
+from harness import lifecycle  # noqa: E402
+
+
+def _mysql(domain, sql):
+    cmd = (
+        'MYSQL_PWD="$(cat /run/secrets/db_password)" '
+        f'mysql -u root -N -s ghost -e "{sql}"'
+    )
+    return lifecycle.exec_in_app(domain, ["sh", "-c", cmd], service="db").strip()
+
+
+def test_restore_returns_state(live_app):
+    assert _mysql(live_app, "SELECT v FROM ci_marker;") == "original", (
+        "restore did not return the pre-mutation ghost MySQL marker (data-integrity failure — "
+        "recipe likely lacks a mysqldump-reimport restore hook; see module docstring)"
+    )
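
The docstring above names the shape of the durable fix: reimport the `--tab` dump after a file-level restore. As a rough sketch of what such a reimport involves, assuming the usual `mysqldump --tab` layout (one `<table>.sql` with the CREATE TABLE statement and one `<table>.txt` with the rows per table), the function below only builds the shell commands. The name `reimport_commands` is hypothetical, and how a recipe restore hook would actually invoke them is not covered by this commit.

```python
import shlex
from pathlib import Path

# Same secret path the test helpers read; an assumption for any other recipe.
_PW = 'MYSQL_PWD="$(cat /run/secrets/db_password)"'


def reimport_commands(dump_dir: str, database: str = "ghost") -> list:
    """Return shell commands that reload every table found in a --tab dump dir."""
    db = shlex.quote(database)
    commands = []
    for schema in sorted(Path(dump_dir).glob("*.sql")):
        # 1) Recreate the table from its schema file.
        commands.append(f"{_PW} mysql -u root {db} < {shlex.quote(str(schema))}")
        rows = schema.with_suffix(".txt")
        if rows.exists():
            # 2) Load the rows; mysqlimport derives the table name from the file name.
            commands.append(f"{_PW} mysqlimport -u root {db} {shlex.quote(str(rows))}")
    return commands
```

For the recipe's dump path this would be called as `reimport_commands("/var/lib/mysql-files/")`, yielding a schema load followed by a data load for each dumped table, `ci_marker` included.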
--- a/tests/ghost/test_upgrade.py
+++ b/tests/ghost/test_upgrade.py
@@ -0,0 +1,27 @@
+"""ghost — UPGRADE overlay (Phase 1e HC3 / Phase 2 P4): assertion-only + additive.
+
+ops.pre_upgrade seeded ci_marker='upgrade-survives' into the MySQL `ghost` DB before the upgrade op
+(HC1 chaos redeploy prev-published → PR head). The mysql data volume persists across the redeploy,
+so the seeded row must still be there afterwards — proving the upgrade preserved app data, not just
+that the new version came up healthy. Read via the `mysql` CLI in the `db` service.
+"""
+
+import os
+import sys
+
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
+from harness import lifecycle  # noqa: E402
+
+
+def _mysql(domain, sql):
+    cmd = (
+        'MYSQL_PWD="$(cat /run/secrets/db_password)" '
+        f'mysql -u root -N -s ghost -e "{sql}"'
+    )
+    return lifecycle.exec_in_app(domain, ["sh", "-c", cmd], service="db").strip()
+
+
+def test_upgrade_preserves_state(live_app):
+    assert _mysql(live_app, "SELECT v FROM ci_marker;") == "upgrade-survives", (
+        "the seeded ghost MySQL marker did not survive the upgrade redeploy (data loss on upgrade)"
+    )