feat(1d): migrate keycloak/cryptpad/matrix-synapse/n8n/lasuite-docs overlays to deploy-once contract (DG7)

Mechanical port to the assertion-only contract (no softened/skipped assertions): install uses
live_app + generic.assert_serving (extend) + the recipe's http/playwright/api checks; upgrade seeds
its data marker then generic.do_upgrade + asserts survival; backup/restore split into test_backup.py
(seed->do_backup->mutate) + new test_restore.py (do_restore->assert original). Recipe-specifics
preserved verbatim (keycloak realm+admin-console+kc_admin, matrix/lasuite db-service psql markers,
cryptpad/n8n volume markers). No recipe now double-deploys under the deploy-once orchestrator.
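
For readers who do not know the harness, the backup/restore split described above can be sketched as follows. This is only an illustration under stated assumptions: `FakeApp`, `backup_overlay` and `restore_overlay` are hypothetical stand-ins, not the real `generic.do_backup` / `generic.do_restore` helpers, and the "volume" is a plain attribute rather than a container volume.

```python
"""Sketch of the backup/restore split on ONE shared deployment (all names hypothetical)."""


class FakeApp:
    """In-memory stand-in for a deployed app; NOT the real harness."""

    def __init__(self):
        self.marker = None    # state living in the "backed-up volume"
        self.snapshot = None  # last backup artifact
        self.deploys = 1      # assumption: the orchestrator deployed exactly once

    def do_backup(self):
        # Capture the current state and return the artifact.
        self.snapshot = self.marker
        return self.snapshot

    def do_restore(self):
        # Put the captured state back.
        self.marker = self.snapshot


def backup_overlay(app):
    # seed -> do_backup -> mutate (the shape of test_backup.py)
    app.marker = "original"
    snap = app.do_backup()
    assert snap, "backup produced no snapshot artifact"
    app.marker = "mutated"  # diverge from the backup and leave it that way


def restore_overlay(app):
    # do_restore -> assert original (the shape of test_restore.py)
    app.do_restore()
    assert app.marker == "original", "restore did not return the pre-mutation state"


app = FakeApp()
backup_overlay(app)
state_after_backup = app.marker
restore_overlay(app)
state_after_restore = app.marker
```

The mutated state has to survive between the two overlays, which is why both must run against the same deployment and why neither may deploy or tear down on its own.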

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 01:32:53 +01:00
parent 9b5bcff92a
commit afd75a48db
21 changed files with 315 additions and 325 deletions

@ -25,9 +25,8 @@ per-recipe overlay authoring is Phase 2.
**Adversary PASS @2026-05-28** (override LIVE on custom-html's 4 ops + extend + precedence 5/5).
- [x] **DG4.1** — Overlays reuse the deployment: ONE deploy + ONE teardown per run; no extra
new/deploy/undeploy (assert via deploy-count). **Adversary PASS @2026-05-28** (deploy-count=1).
- [~] **DG5** — Custom install-steps hook + graceful-generic rule; fail-without / pass-with proof.
**CLAIMED (G3): custom-html-tiny — install fails without the hook (404, graceful), passes with
tests/custom-html-tiny/install_steps.sh seeding content.**
- [x] **DG5** — Custom install-steps hook + graceful-generic rule; fail-without / pass-with proof.
**Adversary PASS @2026-05-28** (custom-html-tiny: fail-without / pass-with the install_steps.sh hook).
- [ ] **DG6** — `!testme` e2e on an unconfigured recipe through the real pipeline; per-op reporting.
- [ ] **DG7** — Real, DRY, clean: no softened/skip/xfail assertions; generic in the shared harness;
teardown always; respects MAX_TESTS.
@ -58,13 +57,8 @@ move-assertion so a no-op can't pass), awaiting Adversary re-test+close.
**G2 (DG4+DG4.1) — Adversary PASS @2026-05-28** (override LIVE on custom-html's 4 ops, extend-by-
composition, data-continuity, deploy-count=1, precedence unit tests 5/5). No VETO.
**Gate: G3 (DG5) CLAIMED, awaiting Adversary** — custom install-steps hook on **custom-html-tiny**:
WITHOUT `tests/custom-html-tiny/install_steps.sh` the generic install FAILS (404, graceful-generic —
reported per-op, not a crash); WITH it (seeds index.html into the content volume pre-deploy) install
PASSES. The same Run B also demonstrates DG3's N/A-skip: custom-html-tiny is non-backup-capable, so
backup/restore report **skip** while install/upgrade pass (deploy-count=1). Evidence in JOURNAL-1d.
Reproduce (cold): run `RECIPE=custom-html-tiny STAGES=install …` with the hook absent (install:fail)
then present (install:pass, backup/restore:skip).
**G3 (DG5 + DG3 N/A-skip) — Adversary PASS @2026-05-28.** No VETO. DG1-DG5 all Adversary-verified;
F1d-1 + F1d-2 closed. Only G4 (DG6 e2e + DG7 no-regression/DRY + DG8 docs + cold-verify) remains.
Design (DECISIONS.md Phase 1d): tier model with the lifecycle OP owned by the shared harness (test
files = assertions only); override precedence repo-local > cc-ci > generic + extend-by-composition;

@ -1,5 +1,7 @@
"""cryptpad — backup/restore stage (D2): write a marker into the backed-up cryptpad_data volume,
backup, mutate, restore, assert the restored state matches the pre-mutation (backed-up) state.
"""cryptpad — BACKUP overlay (Phase 1d, DG4): seed a known state into the backed-up cryptpad_data
volume, back it up (assert a snapshot artifact), then mutate so the RESTORE overlay (test_restore.py)
can prove the backed-up state returns. Runs on the shared deployment; the mutated marker persists for
the restore tier.
The cryptpad `app` service is labelled `backupbot.backup=true`, so its volumes (incl. cryptpad_data)
are backed up. Marker is checked via `exec_in_app` (data isn't HTTP-served)."""
@ -8,32 +10,21 @@ import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
MARKER = "/cryptpad/data/ci-marker.txt"
def test_backup_mutate_restore(deployed, meta):
domain = deployed
def test_backup_captures_state(live_app, meta):
domain = live_app
# 1) establish original state in the backed-up volume, then back it up
# 1) establish original state in the backed-up volume, then back it up (reuse the generic op:
# backup + assert a snapshot artifact was produced)
lifecycle.exec_in_app(domain, ["sh", "-c", f"echo original > {MARKER}"])
assert lifecycle.exec_in_app(domain, ["cat", MARKER]).strip() == "original"
lifecycle.backup_app(domain)
snap = generic.do_backup(domain)
assert snap, "backup produced no snapshot artifact"
# 2) mutate state (diverge from the backup)
lifecycle.exec_in_app(domain, ["sh", "-c", f"echo mutated > {MARKER}"])
assert lifecycle.exec_in_app(domain, ["cat", MARKER]).strip() == "mutated"
# 3) restore -> state returns to the backed-up "original"
lifecycle.restore_app(domain)
lifecycle.wait_healthy(
domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
assert (
lifecycle.exec_in_app(domain, ["cat", MARKER]).strip() == "original"
), "restore did not return the pre-mutation state"

@ -1,23 +1,29 @@
"""cryptpad — install stage (recipe #3, stateful/no-DB). D2 install + D3 Playwright."""
"""cryptpad — INSTALL overlay (Phase 1d, DG4): override + extend-by-composition.
Reuses the generic "really serving" assertion, then ADDS the recipe-specific checks: cryptpad answers
over real HTTPS through the gateway, and a real browser loads the live cryptpad landing page and sees
its served app (D2 install + D3 Playwright). Assertion-only on the shared deployment."""
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
def test_http_reachable(deployed_app):
"""cryptpad answers over real HTTPS through the gateway (nginx -> cryptpad app)."""
status = lifecycle.http_get(deployed_app, "/")
assert status in (200, 301, 302), f"expected 2xx/3xx from {deployed_app}, got {status}"
def test_serving_and_content(live_app, meta):
# extend-by-composition: reuse the generic "really serving" assertion first ...
generic.assert_serving(live_app, meta)
# ... then the recipe-specific assertions.
# cryptpad answers over real HTTPS through the gateway (nginx -> cryptpad app).
status = lifecycle.http_get(live_app, "/")
assert status in (200, 301, 302), f"expected 2xx/3xx from {live_app}, got {status}"
def test_playwright_loads_cryptpad(deployed_app):
"""A real browser loads the live cryptpad landing page and sees its served app."""
# A real browser loads the live cryptpad landing page and sees its served app.
from playwright.sync_api import sync_playwright
url = f"https://{deployed_app}/"
url = f"https://{live_app}/"
with sync_playwright() as p:
browser = p.chromium.launch(args=["--no-sandbox"])
try:

@ -0,0 +1,24 @@
"""cryptpad — RESTORE overlay (Phase 1d, DG4): data-integrity, extends the generic restore.
Runs after the backup overlay (test_backup.py) on the SAME shared deployment, which left the
cryptpad_data marker mutated to "mutated" after backing up "original". This restores the snapshot via
the shared op helper (`generic.do_restore`, which also asserts the app is healthy + serving
afterwards), then asserts the volume data returned to the pre-mutation "original" — the app-specific
data integrity the generic restore cannot check. Reads the marker via `exec_in_app` (data isn't
HTTP-served). Assertion-only (no deploy/teardown)."""
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import generic, lifecycle # noqa: E402
MARKER = "/cryptpad/data/ci-marker.txt"
def test_restore_returns_state(live_app, meta):
domain = live_app
generic.do_restore(domain, meta) # restore + assert healthy/serving
assert (
lifecycle.exec_in_app(domain, ["cat", MARKER]).strip() == "original"
), "restore did not return the pre-mutation state"

@ -1,53 +1,28 @@
"""cryptpad — upgrade stage (D2): deploy the previous published version, write a data marker into a
persistent volume, upgrade to current/$REF, assert the app stays healthy and the data survives.
"""cryptpad — UPGRADE overlay (Phase 1d, DG4): data-continuity, extends the generic upgrade.
cryptpad data isn't HTTP-served as a static file (it's an encrypted datastore), so the marker is
written into the cryptpad_data volume and read back via `exec_in_app` (docker exec), not HTTP."""
The orchestrator deployed the previous published version ONCE; this overlay writes a marker into the
persistent cryptpad_data volume (cryptpad data isn't HTTP-served as a static file — it's an encrypted
datastore — so the marker is read back via `exec_in_app`, not HTTP), performs the in-place upgrade via
the shared op helper (`generic.do_upgrade`, which also asserts reconverge + serving + that the
deployment moved), then asserts the data SURVIVED. Assertion-only on the shared deployment."""
import os
import sys
import pytest
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
MARKER = "/cryptpad/data/ci-marker.txt"
@pytest.fixture
def old_app(recipe, app_domain, meta, request):
prev = lifecycle.previous_version(recipe)
if not prev:
pytest.skip(f"{recipe}: no previous published version to upgrade from")
lifecycle.janitor()
request.addfinalizer(lambda: lifecycle.teardown_app(app_domain))
lifecycle.deploy_app(recipe, app_domain, version=prev)
lifecycle.wait_healthy(
app_domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
return app_domain, prev
def test_upgrade_preserves_data(old_app, meta):
domain, prev = old_app
def test_upgrade_preserves_data(live_app, meta):
domain = live_app
# write a data marker into the persistent cryptpad_data volume
lifecycle.exec_in_app(domain, ["sh", "-c", f"echo upgrade-survives > {MARKER}"])
assert lifecycle.exec_in_app(domain, ["cat", MARKER]).strip() == "upgrade-survives"
# upgrade previous -> current/$REF
lifecycle.upgrade_app(domain, version=os.environ.get("VERSION") or None)
lifecycle.wait_healthy(
domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
# in-place upgrade previous -> target (reuses the generic op: upgrade + assert reconverge/serving)
generic.do_upgrade(domain, os.environ.get("VERSION") or None, meta)
# app healthy and the data written before the upgrade is still there
assert lifecycle.http_get(domain, "/") in (200, 301, 302)

@ -1,32 +1,27 @@
"""keycloak — backup/restore stage (D2): create a realm, backup, delete it (mutate), restore,
assert the realm is back (mariadb restored to the backed-up state)."""
"""keycloak — BACKUP overlay (Phase 1d, DG4): seed a known state (the marker realm in mariadb),
back it up (assert a snapshot artifact), then mutate (delete the realm) so the RESTORE overlay
(test_restore.py) can prove the backed-up state returns. Runs on the shared deployment; the mutated
state persists for the restore tier."""
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
import kc_admin # noqa: E402
from harness import lifecycle # noqa: E402
from harness import generic # noqa: E402
def test_backup_mutate_restore(deployed):
domain = deployed
def test_backup_captures_state(live_app, meta):
domain = live_app
pw = kc_admin.admin_password(domain)
tok = kc_admin.admin_token(domain, pw)
# 1) create the marker realm, then back up
# 1) create the marker realm, then back up (reuse the generic op: backup + assert a snapshot)
assert kc_admin.create_marker_realm(domain, tok) in (201, 409)
assert kc_admin.marker_realm_exists(domain, tok)
lifecycle.backup_app(domain)
snap = generic.do_backup(domain)
assert snap, "backup produced no snapshot artifact"
# 2) mutate: delete the realm
# 2) mutate: delete the realm (diverge from the backup)
assert kc_admin.delete_marker_realm(domain, tok) in (204, 200)
assert not kc_admin.marker_realm_exists(domain, tok), "delete did not take"
# 3) restore -> realm returns
lifecycle.restore_app(domain)
lifecycle.wait_healthy(
domain, path="/realms/master", ok_codes=(200,), deploy_timeout=600, http_timeout=600
)
tok2 = kc_admin.admin_token(domain, pw)
assert kc_admin.marker_realm_exists(domain, tok2), "restore did not bring back the realm"

@ -1,22 +1,28 @@
"""keycloak — install stage (recipe #2, DB-backed SSO; D2 install + D3 Playwright)."""
"""keycloak — INSTALL overlay (Phase 1d, DG4): override + extend-by-composition.
Reuses the generic "really serving" assertion, then ADDS the recipe-specific checks: the master
realm endpoint answers 200 over HTTPS (keycloak + mariadb are up), and a real browser loads the
keycloak admin console (D2 install + D3 Playwright). Assertion-only on the shared deployment."""
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
def test_realm_endpoint_healthy(deployed_app):
"""The master realm endpoint answers 200 over HTTPS (keycloak + mariadb are up)."""
assert lifecycle.http_get(deployed_app, "/realms/master") == 200
def test_serving_and_admin_console(live_app, meta):
# extend-by-composition: reuse the generic "really serving" assertion first ...
generic.assert_serving(live_app, meta)
# ... then the recipe-specific assertions.
# The master realm endpoint answers 200 over HTTPS (keycloak + mariadb are up).
assert lifecycle.http_get(live_app, "/realms/master") == 200
def test_playwright_admin_login(deployed_app):
"""A real browser loads the keycloak admin console (renders the sign-in UI)."""
# A real browser loads the keycloak admin console (renders the sign-in UI).
from playwright.sync_api import sync_playwright
url = f"https://{deployed_app}/admin/master/console/"
url = f"https://{live_app}/admin/master/console/"
with sync_playwright() as p:
browser = p.chromium.launch(args=["--no-sandbox"])
try:

@ -0,0 +1,22 @@
"""keycloak — RESTORE overlay (Phase 1d, DG4): data-integrity, extends the generic restore.
Runs after the backup overlay (test_backup.py) on the SAME shared deployment, which left the marker
realm deleted after backing it up. This restores the snapshot via the shared op helper
(`generic.do_restore`, which also asserts the app is healthy + serving afterwards), then asserts the
marker realm returned (mariadb restored to the backed-up state) — the app-specific data integrity
the generic restore cannot check. Assertion-only (no deploy/teardown)."""
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
import kc_admin # noqa: E402
from harness import generic # noqa: E402
def test_restore_returns_state(live_app, meta):
domain = live_app
generic.do_restore(domain, meta) # restore + assert healthy/serving
pw = kc_admin.admin_password(domain)
tok = kc_admin.admin_token(domain, pw)
assert kc_admin.marker_realm_exists(domain, tok), "restore did not bring back the realm"

@ -1,49 +1,27 @@
"""keycloak — upgrade stage (D2): deploy previous version, create a realm (DB data), upgrade to
current/$REF, assert the app is healthy and the realm survived (mariadb data preserved)."""
"""keycloak — UPGRADE overlay (Phase 1d, DG4): data-continuity, extends the generic upgrade.
The orchestrator deployed the previous published version ONCE; this overlay creates a marker realm
(DB data in mariadb) on the live app, performs the in-place upgrade via the shared op helper
(`generic.do_upgrade`, which also asserts reconverge + serving + that the deployment moved), then
asserts the realm SURVIVED (mariadb data preserved). Assertion-only on the shared deployment."""
import os
import sys
import pytest
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
import kc_admin # noqa: E402
from harness import lifecycle # noqa: E402
from harness import generic # noqa: E402
@pytest.fixture
def old_app(recipe, app_domain, meta, request):
prev = lifecycle.previous_version(recipe)
if not prev:
pytest.skip(f"{recipe}: no previous published version")
lifecycle.janitor()
request.addfinalizer(lambda: lifecycle.teardown_app(app_domain))
lifecycle.deploy_app(recipe, app_domain, version=prev)
lifecycle.wait_healthy(
app_domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
return app_domain, prev
def test_upgrade_preserves_realm(old_app, meta):
domain, prev = old_app
def test_upgrade_preserves_realm(live_app, meta):
domain = live_app
pw = kc_admin.admin_password(domain)
tok = kc_admin.admin_token(domain, pw)
assert kc_admin.create_marker_realm(domain, tok) in (201, 409)
assert kc_admin.marker_realm_exists(domain, tok), "marker realm not created"
lifecycle.upgrade_app(domain, version=os.environ.get("VERSION") or None)
lifecycle.wait_healthy(
domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
# in-place upgrade previous -> target (reuses the generic op: upgrade + assert reconverge/serving)
generic.do_upgrade(domain, os.environ.get("VERSION") or None, meta)
# re-auth (token from the old instance is fine, but get a fresh one post-upgrade) and verify
tok2 = kc_admin.admin_token(domain, pw)

@ -1,5 +1,7 @@
"""lasuite-docs — backup/restore stage (D2): write a postgres marker, backup (pg_backup.sh pre-hook
dumps the DB), mutate (drop it), restore (post-hook reloads), assert the restored DB matches.
"""lasuite-docs — BACKUP overlay (Phase 1d, DG4): seed a postgres marker, back it up (pg_backup.sh
pre-hook dumps the DB; assert a snapshot artifact), then mutate (drop it) so the RESTORE overlay
(test_restore.py) can prove the backed-up state returns. Runs on the shared deployment; the mutated
state persists for the restore tier.
Exercises the recipe's real DB-dump backup hook (postgres + minio are both backupbot-labelled); the
postgres marker is the meaningful Docs-metadata data path."""
@ -8,7 +10,7 @@ import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
def _psql(domain, sql):
@ -16,31 +18,23 @@ def _psql(domain, sql):
return lifecycle.exec_in_app(domain, ["sh", "-c", cmd], service="db").strip()
def test_backup_mutate_restore(deployed, meta):
domain = deployed
def test_backup_captures_state(live_app, meta):
domain = live_app
# 1) establish original state in postgres, then back up (reuse the generic op: backup +
# assert a snapshot artifact; pg_backup.sh dumps the DB)
_psql(
domain,
"CREATE TABLE IF NOT EXISTS ci_marker(v text); DELETE FROM ci_marker; "
"INSERT INTO ci_marker VALUES('original');",
)
assert _psql(domain, "SELECT v FROM ci_marker;") == "original"
lifecycle.backup_app(domain)
snap = generic.do_backup(domain)
assert snap, "backup produced no snapshot artifact"
# 2) mutate: drop the marker table (diverge from the backup)
_psql(domain, "DROP TABLE ci_marker;")
assert _psql(domain, "SELECT to_regclass('public.ci_marker');") in (
"",
"NULL",
), "drop did not take"
lifecycle.restore_app(domain)
lifecycle.wait_healthy(
domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
assert (
_psql(domain, "SELECT v FROM ci_marker;") == "original"
), "restore did not return the pre-mutation postgres state"

@ -1,27 +1,30 @@
"""lasuite-docs — install stage (recipe #5, multi-service + object-storage/S3). D2 install: the
multi-service stack (frontend + Django backend + celery + y-provider + docspec + postgres + redis +
minio + nginx) converges and serves the app over real HTTPS through the gateway.
"""lasuite-docs — INSTALL overlay (Phase 1d, DG4): override + extend-by-composition.
Login is OIDC-gated (no live OIDC provider in CI), so the functional assertion is that the frontend
SPA is served (unauthenticated landing), not an authenticated flow."""
Reuses the generic "really serving" assertion, then ADDS the recipe-specific checks: the multi-service
stack serves over real HTTPS through the gateway, and a real browser loads the live Docs frontend (the
SPA shell). Login is OIDC-gated (no live OIDC provider in CI), so the functional assertion is that the
frontend SPA is served (unauthenticated landing), not an authenticated flow. Assertion-only on the
shared deployment."""
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
def test_http_reachable(deployed_app):
status = lifecycle.http_get(deployed_app, "/")
assert status in (200, 301, 302), f"expected 2xx/3xx from {deployed_app}, got {status}"
def test_serving_and_frontend(live_app, meta):
# extend-by-composition: reuse the generic "really serving" assertion first ...
generic.assert_serving(live_app, meta)
# ... then the recipe-specific assertions.
status = lifecycle.http_get(live_app, "/")
assert status in (200, 301, 302), f"expected 2xx/3xx from {live_app}, got {status}"
def test_playwright_loads_frontend(deployed_app):
"""A real browser loads the live Docs frontend (the SPA shell) over HTTPS."""
# A real browser loads the live Docs frontend (the SPA shell) over HTTPS.
from playwright.sync_api import sync_playwright
url = f"https://{deployed_app}/"
url = f"https://{live_app}/"
with sync_playwright() as p:
browser = p.chromium.launch(args=["--no-sandbox"])
try:

@ -0,0 +1,27 @@
"""lasuite-docs — RESTORE overlay (Phase 1d, DG4): data-integrity, extends the generic restore.
Runs after the backup overlay (test_backup.py) on the SAME shared deployment, which left the postgres
marker table dropped after dumping it. This restores the snapshot via the shared op helper
(`generic.do_restore`, which also asserts the app is healthy + serving afterwards; the recipe's
restore.post-hook reloads the dump), then asserts the restored DB matches the pre-mutation "original"
— the app-specific data integrity the generic restore cannot check. Reads via `psql` in the `db`
service. Assertion-only (no deploy/teardown)."""
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import generic, lifecycle # noqa: E402
def _psql(domain, sql):
cmd = f'PGPASSWORD=$(cat /run/secrets/postgres_p) psql -U docs -d docs -tAc "{sql}"'
return lifecycle.exec_in_app(domain, ["sh", "-c", cmd], service="db").strip()
def test_restore_returns_state(live_app, meta):
domain = live_app
generic.do_restore(domain, meta) # restore + assert healthy/serving
assert (
_psql(domain, "SELECT v FROM ci_marker;") == "original"
), "restore did not return the pre-mutation postgres state"

@ -1,16 +1,16 @@
"""lasuite-docs — upgrade stage (D2): deploy the previous published version, write a DB marker,
upgrade to current/$REF, assert the app stays healthy and the postgres data survives.
"""lasuite-docs — UPGRADE overlay (Phase 1d, DG4): data-continuity, extends the generic upgrade.
Docs metadata lives in postgres, so the marker is a row in a dedicated `ci_marker` table (the app's
own Django migrations don't touch it), read back via `psql` in the `db` service."""
The orchestrator deployed the previous published version ONCE; this overlay writes a marker row into
postgres (a dedicated `ci_marker` table the app's own Django migrations don't touch, read back via
`psql` in the `db` service), performs the in-place upgrade via the shared op helper
(`generic.do_upgrade`, which also asserts reconverge + serving + that the deployment moved), then
asserts the postgres data SURVIVED. Assertion-only on the shared deployment."""
import os
import sys
import pytest
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
def _psql(domain, sql):
@ -18,26 +18,8 @@ def _psql(domain, sql):
return lifecycle.exec_in_app(domain, ["sh", "-c", cmd], service="db").strip()
@pytest.fixture
def old_app(recipe, app_domain, meta, request):
prev = lifecycle.previous_version(recipe)
if not prev:
pytest.skip(f"{recipe}: no previous published version to upgrade from")
lifecycle.janitor()
request.addfinalizer(lambda: lifecycle.teardown_app(app_domain))
lifecycle.deploy_app(recipe, app_domain, version=prev)
lifecycle.wait_healthy(
app_domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
return app_domain, prev
def test_upgrade_preserves_data(old_app, meta):
domain, prev = old_app
def test_upgrade_preserves_data(live_app, meta):
domain = live_app
_psql(
domain,
"CREATE TABLE IF NOT EXISTS ci_marker(v text); DELETE FROM ci_marker; "
@ -45,14 +27,8 @@ def test_upgrade_preserves_data(old_app, meta):
)
assert _psql(domain, "SELECT v FROM ci_marker;") == "upgrade-survives"
lifecycle.upgrade_app(domain, version=os.environ.get("VERSION") or None)
lifecycle.wait_healthy(
domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
# in-place upgrade previous -> target (reuses the generic op: upgrade + assert reconverge/serving)
generic.do_upgrade(domain, os.environ.get("VERSION") or None, meta)
assert lifecycle.http_get(domain, "/") in (200, 301, 302)
assert (

@ -1,6 +1,7 @@
"""matrix-synapse — backup/restore stage (D2): write a postgres marker, backup (the recipe's
pg_backup.sh pre-hook dumps the DB to backup.sql), mutate (drop the marker), restore (post-hook
reloads the dump), assert the restored DB matches the pre-mutation state.
"""matrix-synapse — BACKUP overlay (Phase 1d, DG4): seed a postgres marker, back it up (the recipe's
pg_backup.sh pre-hook dumps the DB to backup.sql; assert a snapshot artifact), then mutate (drop the
marker) so the RESTORE overlay (test_restore.py) can prove the backed-up state returns. Runs on the
shared deployment; the mutated state persists for the restore tier.
This exercises the real DB-dump backup hook (backupbot.backup.pre-hook / restore.post-hook), not a
plain volume copy — the meaningful data path for a postgres-backed app."""
@ -9,7 +10,7 @@ import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
def _psql(domain, sql):
@ -17,17 +18,19 @@ def _psql(domain, sql):
return lifecycle.exec_in_app(domain, ["sh", "-c", cmd], service="db").strip()
def test_backup_mutate_restore(deployed, meta):
domain = deployed
def test_backup_captures_state(live_app, meta):
domain = live_app
# 1) establish original state in postgres, then back up (pg_backup.sh dumps the DB)
# 1) establish original state in postgres, then back up (reuse the generic op: backup +
# assert a snapshot artifact; pg_backup.sh dumps the DB)
_psql(
domain,
"CREATE TABLE IF NOT EXISTS ci_marker(v text); DELETE FROM ci_marker; "
"INSERT INTO ci_marker VALUES('original');",
)
assert _psql(domain, "SELECT v FROM ci_marker;") == "original"
lifecycle.backup_app(domain)
snap = generic.do_backup(domain)
assert snap, "backup produced no snapshot artifact"
# 2) mutate: drop the marker table (diverge from the backup)
_psql(domain, "DROP TABLE ci_marker;")
@ -35,16 +38,3 @@ def test_backup_mutate_restore(deployed, meta):
"",
"NULL",
), "drop did not take"
# 3) restore -> the dumped DB (with the marker) is reloaded
lifecycle.restore_app(domain)
lifecycle.wait_healthy(
domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
assert (
_psql(domain, "SELECT v FROM ci_marker;") == "original"
), "restore did not return the pre-mutation postgres state"

@ -1,23 +1,30 @@
"""matrix-synapse — install stage (recipe #4, DB + media store). D2 install: the synapse client API
answers 200 over real HTTPS through the gateway (nginx -> synapse). The base recipe has no browser
UI (element-web is an addon), so the functional assertion is the JSON client API, not Playwright."""
"""matrix-synapse — INSTALL overlay (Phase 1d, DG4): override + extend-by-composition.
Reuses the generic "really serving" assertion, then ADDS the recipe-specific checks: the synapse
client API answers 200 over real HTTPS through the gateway, and the client-API version document is
real synapse JSON (proves the app, not just a proxy 200). The base recipe has no browser UI
(element-web is an addon), so the functional assertion is the JSON client API, not Playwright.
Assertion-only on the shared deployment."""
import json
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
def test_client_api_healthy(deployed_app):
status = lifecycle.http_get(deployed_app, "/_matrix/client/versions")
assert status == 200, f"expected 200 from {deployed_app}/_matrix/client/versions, got {status}"
def test_serving_and_client_api(live_app, meta):
# extend-by-composition: reuse the generic "really serving" assertion first ...
generic.assert_serving(live_app, meta)
# ... then the recipe-specific assertions.
# The synapse client API answers 200 over real HTTPS through the gateway (nginx -> synapse).
status = lifecycle.http_get(live_app, "/_matrix/client/versions")
assert status == 200, f"expected 200 from {live_app}/_matrix/client/versions, got {status}"
def test_client_api_advertises_versions(deployed_app):
"""The client-API version document is real synapse JSON (proves the app, not just a proxy 200)."""
body = lifecycle.http_body(deployed_app, "/_matrix/client/versions")
# The client-API version document is real synapse JSON (proves the app, not just a proxy 200).
body = lifecycle.http_body(live_app, "/_matrix/client/versions")
doc = json.loads(body)
assert (
isinstance(doc.get("versions"), list) and doc["versions"]

@ -0,0 +1,27 @@
"""matrix-synapse — RESTORE overlay (Phase 1d, DG4): data-integrity, extends the generic restore.
Runs after the backup overlay (test_backup.py) on the SAME shared deployment, which left the postgres
marker table dropped after dumping it. This restores the snapshot via the shared op helper
(`generic.do_restore`, which also asserts the app is healthy + serving afterwards; the recipe's
restore.post-hook reloads the dump), then asserts the restored DB matches the pre-mutation "original"
— the app-specific data integrity the generic restore cannot check. Reads via `psql` in the `db`
service. Assertion-only (no deploy/teardown)."""
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import generic, lifecycle # noqa: E402
def _psql(domain, sql):
cmd = f'PGPASSWORD=$(cat /run/secrets/db_password) psql -U synapse -d synapse -tAc "{sql}"'
return lifecycle.exec_in_app(domain, ["sh", "-c", cmd], service="db").strip()
def test_restore_returns_state(live_app, meta):
domain = live_app
generic.do_restore(domain, meta) # restore + assert healthy/serving
assert (
_psql(domain, "SELECT v FROM ci_marker;") == "original"
), "restore did not return the pre-mutation postgres state"
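
The `_psql` helper above places the SQL inside a double-quoted shell string, which works for the simple marker queries used here but would misbehave if a query ever contained double quotes or `$`. As a hedged sketch only (the function name `build_psql_command` and its parameters are hypothetical, not part of the harness), the command string could instead be built with the standard library's `shlex.quote`, leaving the `$(cat ...)` secret lookup for the shell to expand:

```python
import shlex


def build_psql_command(sql, user="synapse", database="synapse"):
    """Build a shell command string that runs `sql` through psql.

    Hypothetical helper for illustration: only the SQL, user and database
    are quoted; the PGPASSWORD substitution is left unquoted on purpose so
    the shell inside the `db` service still expands it.
    """
    return (
        "PGPASSWORD=$(cat /run/secrets/db_password) "
        f"psql -U {shlex.quote(user)} -d {shlex.quote(database)} "
        f"-tAc {shlex.quote(sql)}"
    )
```

The resulting string would be passed to `sh -c` exactly as the existing helper does; whether the extra quoting is worth it depends on how varied the marker queries become.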


@ -1,16 +1,16 @@
"""matrix-synapse — upgrade stage (D2): deploy the previous published version, write a DB marker,
upgrade to current/$REF, assert the app stays healthy and the postgres data survives.
"""matrix-synapse — UPGRADE overlay (Phase 1d, DG4): data-continuity, extends the generic upgrade.
Matrix data lives in postgres, so the marker is a row in a dedicated `ci_marker` table (synapse's
own schema migrations don't touch it), read back via `psql` in the `db` service."""
The orchestrator deployed the previous published version ONCE; this overlay writes a marker row into
postgres (a dedicated `ci_marker` table synapse's own schema migrations don't touch, read back via
`psql` in the `db` service), performs the in-place upgrade via the shared op helper
(`generic.do_upgrade`, which also asserts reconverge + serving + that the deployment moved), then
asserts the postgres data SURVIVED. Assertion-only on the shared deployment."""
import os
import sys
import pytest
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
def _psql(domain, sql):
@ -18,26 +18,8 @@ def _psql(domain, sql):
return lifecycle.exec_in_app(domain, ["sh", "-c", cmd], service="db").strip()
@pytest.fixture
def old_app(recipe, app_domain, meta, request):
prev = lifecycle.previous_version(recipe)
if not prev:
pytest.skip(f"{recipe}: no previous published version to upgrade from")
lifecycle.janitor()
request.addfinalizer(lambda: lifecycle.teardown_app(app_domain))
lifecycle.deploy_app(recipe, app_domain, version=prev)
lifecycle.wait_healthy(
app_domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
return app_domain, prev
def test_upgrade_preserves_data(old_app, meta):
domain, prev = old_app
def test_upgrade_preserves_data(live_app, meta):
domain = live_app
# write a marker row into postgres (independent of synapse's own tables)
_psql(
domain,
@ -46,15 +28,8 @@ def test_upgrade_preserves_data(old_app, meta):
)
assert _psql(domain, "SELECT v FROM ci_marker;") == "upgrade-survives"
# upgrade previous -> current/$REF
lifecycle.upgrade_app(domain, version=os.environ.get("VERSION") or None)
lifecycle.wait_healthy(
domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
# in-place upgrade previous -> target (reuses the generic op: upgrade + assert reconverge/serving)
generic.do_upgrade(domain, os.environ.get("VERSION") or None, meta)
# app healthy and the data written before the upgrade is still there
assert lifecycle.http_get(domain, meta["HEALTH_PATH"]) == 200


@ -1,5 +1,7 @@
"""n8n — backup/restore stage (D2): write a marker into the backed-up /home/node/.n8n path, backup,
mutate, restore, assert the restored state matches the pre-mutation state.
"""n8n — BACKUP overlay (Phase 1d, DG4): seed a known state into the backed-up /home/node/.n8n path,
back it up (assert a snapshot artifact), then mutate so the RESTORE overlay (test_restore.py) can
prove the backed-up state returns. Runs on the shared deployment; the mutated marker persists for the
restore tier.
The n8n `app` service is labelled `backupbot.backup=true` with `backupbot.backup.path=/home/node/.n8n`,
so a marker file there is backed up; checked via `exec_in_app`."""
@ -8,29 +10,21 @@ import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
MARKER = "/home/node/.n8n/ci-marker.txt"
def test_backup_mutate_restore(deployed, meta):
domain = deployed
def test_backup_captures_state(live_app, meta):
domain = live_app
# 1) establish original state in the backed-up path, then back it up (reuse the generic op:
# backup + assert a snapshot artifact was produced)
lifecycle.exec_in_app(domain, ["sh", "-c", f"echo original > {MARKER}"])
assert lifecycle.exec_in_app(domain, ["cat", MARKER]).strip() == "original"
lifecycle.backup_app(domain)
snap = generic.do_backup(domain)
assert snap, "backup produced no snapshot artifact"
# 2) mutate state (diverge from the backup)
lifecycle.exec_in_app(domain, ["sh", "-c", f"echo mutated > {MARKER}"])
assert lifecycle.exec_in_app(domain, ["cat", MARKER]).strip() == "mutated"
lifecycle.restore_app(domain)
lifecycle.wait_healthy(
domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
assert (
lifecycle.exec_in_app(domain, ["cat", MARKER]).strip() == "original"
), "restore did not return the pre-mutation state"


@ -1,22 +1,28 @@
"""n8n — install stage (recipe #6, workflow automation). D2 install + D3 Playwright."""
"""n8n — INSTALL overlay (Phase 1d, DG4): override + extend-by-composition.
Reuses the generic "really serving" assertion, then ADDS the recipe-specific checks: /healthz answers
200, and a real browser loads the live n8n editor SPA over HTTPS (D2 install + D3 Playwright).
Assertion-only on the shared deployment."""
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
def test_healthz(deployed_app):
status = lifecycle.http_get(deployed_app, "/healthz")
assert status == 200, f"expected 200 from {deployed_app}/healthz, got {status}"
def test_serving_and_editor(live_app, meta):
# extend-by-composition: reuse the generic "really serving" assertion first ...
generic.assert_serving(live_app, meta)
# ... then the recipe-specific assertions.
status = lifecycle.http_get(live_app, "/healthz")
assert status == 200, f"expected 200 from {live_app}/healthz, got {status}"
def test_playwright_loads_editor(deployed_app):
"""A real browser loads the live n8n editor SPA over HTTPS."""
# A real browser loads the live n8n editor SPA over HTTPS.
from playwright.sync_api import sync_playwright
url = f"https://{deployed_app}/"
url = f"https://{live_app}/"
with sync_playwright() as p:
browser = p.chromium.launch(args=["--no-sandbox"])
try:

tests/n8n/test_restore.py Normal file

@ -0,0 +1,24 @@
"""n8n — RESTORE overlay (Phase 1d, DG4): data-integrity, extends the generic restore.
Runs after the backup overlay (test_backup.py) on the SAME shared deployment, which left the
/home/node/.n8n marker mutated to "mutated" after backing up "original". This restores the snapshot
via the shared op helper (`generic.do_restore`, which also asserts the app is healthy + serving
afterwards), then asserts the data returned to the pre-mutation "original" — the app-specific data
integrity the generic restore cannot check. Reads via `exec_in_app`. Assertion-only (no
deploy/teardown)."""
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import generic, lifecycle # noqa: E402
MARKER = "/home/node/.n8n/ci-marker.txt"
def test_restore_returns_state(live_app, meta):
domain = live_app
generic.do_restore(domain, meta) # restore + assert healthy/serving
assert (
lifecycle.exec_in_app(domain, ["cat", MARKER]).strip() == "original"
), "restore did not return the pre-mutation state"
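
The split between the backup and restore overlays relies on state carried across two test files on one shared deployment. The sketch below illustrates that ordering contract with a purely in-memory stand-in; `FakeApp`, `backup_stage` and `restore_stage` are hypothetical names for illustration and do not model the real `generic.do_backup` / `generic.do_restore` helpers:

```python
class FakeApp:
    """Hypothetical stand-in for the shared deployment: one marker plus snapshots."""

    def __init__(self):
        self.marker = None
        self.snapshots = []

    def backup(self):
        # Record the current marker and return the snapshot's index as its id.
        self.snapshots.append(self.marker)
        return len(self.snapshots) - 1

    def restore(self):
        # Roll the marker back to the most recent snapshot.
        if not self.snapshots:
            raise RuntimeError("no snapshot to restore")
        self.marker = self.snapshots[-1]


def backup_stage(app):
    """Seed a known state, back it up, then mutate so restore has something to undo."""
    app.marker = "original"
    snap = app.backup()
    assert snap is not None, "backup produced no snapshot artifact"
    app.marker = "mutated"
    return snap


def restore_stage(app):
    """Restore and assert the pre-mutation state came back."""
    app.restore()
    assert app.marker == "original", "restore did not return the pre-mutation state"


shared = FakeApp()
backup_stage(shared)   # leaves the marker at "mutated"
restore_stage(shared)  # only meaningful because the backup stage ran first
```

The point of the mutation step is that a restore which does nothing cannot pass: without it, the marker would already read "original" and the final assertion would prove nothing.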


@ -1,51 +1,27 @@
"""n8n — upgrade stage (D2): deploy the previous published version, write a data marker into the
persistent /home/node/.n8n volume, upgrade to current/$REF, assert health + data survival.
"""n8n — UPGRADE overlay (Phase 1d, DG4): data-continuity, extends the generic upgrade.
n8n state lives in the .n8n volume (sqlite + config); the marker is a file there, read back via
`exec_in_app` (not HTTP-served)."""
The orchestrator deployed the previous published version ONCE; this overlay writes a marker file into
the persistent /home/node/.n8n volume (n8n state = sqlite + config; the marker is read back via
`exec_in_app`, not HTTP-served), performs the in-place upgrade via the shared op helper
(`generic.do_upgrade`, which also asserts reconverge + serving + that the deployment moved), then
asserts the data SURVIVED. Assertion-only on the shared deployment."""
import os
import sys
import pytest
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
from harness import lifecycle # noqa: E402
from harness import generic, lifecycle # noqa: E402
MARKER = "/home/node/.n8n/ci-marker.txt"
@pytest.fixture
def old_app(recipe, app_domain, meta, request):
prev = lifecycle.previous_version(recipe)
if not prev:
pytest.skip(f"{recipe}: no previous published version to upgrade from")
lifecycle.janitor()
request.addfinalizer(lambda: lifecycle.teardown_app(app_domain))
lifecycle.deploy_app(recipe, app_domain, version=prev)
lifecycle.wait_healthy(
app_domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
return app_domain, prev
def test_upgrade_preserves_data(old_app, meta):
domain, prev = old_app
def test_upgrade_preserves_data(live_app, meta):
domain = live_app
lifecycle.exec_in_app(domain, ["sh", "-c", f"echo upgrade-survives > {MARKER}"])
assert lifecycle.exec_in_app(domain, ["cat", MARKER]).strip() == "upgrade-survives"
lifecycle.upgrade_app(domain, version=os.environ.get("VERSION") or None)
lifecycle.wait_healthy(
domain,
ok_codes=tuple(meta["HEALTH_OK"]),
path=meta["HEALTH_PATH"],
deploy_timeout=meta["DEPLOY_TIMEOUT"],
http_timeout=meta["HTTP_TIMEOUT"],
)
# in-place upgrade previous -> target (reuses the generic op: upgrade + assert reconverge/serving)
generic.do_upgrade(domain, os.environ.get("VERSION") or None, meta)
assert lifecycle.http_get(domain, meta["HEALTH_PATH"]) == 200
assert (