feat(1e): HC3 additive generic + op/assertion split (orchestrator owns the op)

- orchestrator: per mutating tier, run optional pre-op seed hook (ops.py pre_<op>) → perform the op
  ONCE (harness-owned) → run generic assertion (unless opted out) AND overlay assertion, both against
  the shared post-op deployment. Op results passed op→assertion via run-scoped CCCI_OP_STATE_FILE.
- opt-out: CCCI_SKIP_GENERIC / CCCI_SKIP_GENERIC_<OP> / recipe_meta.SKIP_GENERIC (declarative).
- generic.py: split do_* into op primitives (perform_upgrade/backup/restore) + assertions
  (assert_upgraded/backup_artifact/restore_healthy) reading op_state(); deployed_identity now returns
  {version,image,chaos} (chaos label ready for HC1).
- generic test_<op>.py + all 6 recipe overlays migrated to assertion-only; pre-op seeding moved to
  per-recipe ops.py (pre_upgrade/pre_backup/pre_restore). install overlays unchanged (no op).
- deploy-count stays 1 (op primitives never call deploy_app). lint PASS; 8 unit tests PASS on cc-ci.
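
The per-tier flow and opt-out rules in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the harness code: `run_tier`, `skip_generic`, the local `op_state` helper and the callables passed in are hypothetical. Only the `CCCI_SKIP_GENERIC`, `CCCI_SKIP_GENERIC_<OP>`, `SKIP_GENERIC` and `CCCI_OP_STATE_FILE` names come from the commit message. It assumes that any non-empty env value means "skip", that `recipe_meta.SKIP_GENERIC` is a collection of op names, and that the state file holds JSON.

```python
import json
import os
import tempfile


def skip_generic(op, env, recipe_skip=()):
    """Return True if the generic assertion for `op` is opted out.

    Assumption: a non-empty env value means "skip"; the declarative
    recipe_meta.SKIP_GENERIC is modelled here as a collection of op names.
    """
    if env.get("CCCI_SKIP_GENERIC"):
        return True
    if env.get(f"CCCI_SKIP_GENERIC_{op.upper()}"):
        return True
    return op in recipe_skip


def op_state(env):
    """Load the op result written by the orchestrator (assumed to be JSON)."""
    with open(env["CCCI_OP_STATE_FILE"]) as fh:
        return json.load(fh)


def run_tier(op, perform, generic_assert, overlay_assert,
             pre_hook=None, env=None, recipe_skip=()):
    """Hypothetical flow: seed hook -> op ONCE -> generic + overlay assertions."""
    env = dict(os.environ if env is None else env)
    ran = []
    with tempfile.TemporaryDirectory() as tmp:
        env["CCCI_OP_STATE_FILE"] = os.path.join(tmp, "op_state.json")
        if pre_hook is not None:
            pre_hook()
            ran.append(f"pre_{op}")
        # The op runs exactly once; its result reaches the assertions via the state file.
        with open(env["CCCI_OP_STATE_FILE"], "w") as fh:
            json.dump(perform(), fh)
        ran.append(op)
        if not skip_generic(op, env, recipe_skip):
            generic_assert(op_state(env))
            ran.append("generic")
        # The overlay assertion always runs, against the same post-op state.
        overlay_assert(op_state(env))
        ran.append("overlay")
    return ran
```

Calling `run_tier("backup", ...)` with an env containing `CCCI_SKIP_GENERIC_BACKUP` would, under these assumptions, skip only the generic assertion while still performing the op once and running the overlay.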

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 03:12:04 +01:00
parent 6a59343996
commit b7e6cbd7be
31 changed files with 623 additions and 412 deletions

tests/custom-html/ops.py

@@ -0,0 +1,32 @@
+"""custom-html — pre-op seed hooks (Phase 1e HC3). The orchestrator runs `pre_<op>(domain, meta)`
+BEFORE it performs the op; the matching test_<op>.py asserts the post-op state (assertion-only).
+nginx serves the volume at /usr/share/nginx/html, so the marker file survives an upgrade / a
+backup+restore of that volume and is both HTTP-readable and exec-readable."""
+import os
+import sys
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
+from harness import lifecycle  # noqa: E402
+MARKER_PATH = "/usr/share/nginx/html/ci-marker.txt"
+def _write(domain: str, val: str) -> None:
+    lifecycle.exec_in_app(domain, ["sh", "-c", f"echo {val} > {MARKER_PATH}"])
+def pre_upgrade(domain, meta):
+    # seed a marker before the upgrade so the overlay can prove the data survives it
+    _write(domain, "upgrade-survives")
+def pre_backup(domain, meta):
+    # establish a known original state before the backup op captures it
+    _write(domain, "original")
+def pre_restore(domain, meta):
+    # diverge from the backed-up state so a successful restore (back to "original") is observable
+    _write(domain, "mutated")
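
The docstring above says the orchestrator runs `pre_<op>(domain, meta)` before the op, and the commit message calls the hook optional. One way such an optional hook could be looked up is sketched below. `find_pre_hook`, the in-memory stand-in module and the `example.test` domain are hypothetical; the real orchestrator may resolve hooks differently.

```python
import types


def find_pre_hook(ops_module, op):
    """Return the module's pre_<op> callable, or None if the recipe defines no hook."""
    hook = getattr(ops_module, f"pre_{op}", None)
    return hook if callable(hook) else None


# Stand-in for a recipe's ops.py, built in memory so the sketch runs anywhere.
fake_ops = types.ModuleType("ops")
seeded = []
fake_ops.pre_backup = lambda domain, meta: seeded.append((domain, "original"))

hook = find_pre_hook(fake_ops, "backup")
if hook is not None:
    # Same call shape as the hooks in ops.py: (domain, meta).
    hook("example.test", {})
```

A tier with no matching hook (for example install, which has no op) would simply get `None` back and skip the seeding step.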


@@ -1,34 +1,21 @@
-"""custom-html — BACKUP overlay (Phase 1d, DG4): seed a known state, back it up (assert artifact),
-then mutate so the RESTORE overlay (test_restore.py) can prove the backed-up state returns. Runs on
-the shared deployment; the marker it leaves ("mutated") persists for the restore tier.
-Reads the marker via `exec_in_app` (the file in the volume), NOT http: backup/restore preserve the
-VOLUME, and reading it directly is immune to the serving/container-routing race right after
-backup-bot-two cycles the app container (HTTP briefly served empty). Serving is proven separately by
-the install/upgrade tiers' assert_serving."""
+"""custom-html — BACKUP overlay (Phase 1e HC3): assertion-only + additive.
+The orchestrator ran `ops.pre_backup` (seeded "original" into the served volume), then performed the
+backup ONCE. The generic backup tier already asserted a snapshot artifact was produced; this overlay
+ADDS the recipe-specific check: the seeded "original" state is intact in the volume post-backup
+(pre-mutation). The backup→restore divergence happens in `ops.pre_restore`. Reads via exec_in_app
+(volume-direct), immune to the post-backup serving race after backup-bot-two cycles the container."""
 import os
 import sys
 sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
-from harness import generic, lifecycle  # noqa: E402
+from harness import lifecycle  # noqa: E402
 MARKER_PATH = "/usr/share/nginx/html/ci-marker.txt"
-def _marker(domain: str) -> str:
-    return lifecycle.exec_in_app(domain, ["cat", MARKER_PATH]).strip()
-def test_backup_captures_state(live_app, meta):
-    domain = live_app
-    # 1) establish a known original state, then back it up (reuse the generic op: backup + assert a
-    #    snapshot artifact was produced)
-    lifecycle.exec_in_app(domain, ["sh", "-c", f"echo original > {MARKER_PATH}"])
-    assert _marker(domain) == "original"
-    snap = generic.do_backup(domain)
-    assert snap, "backup produced no snapshot artifact"
-    # 2) mutate state so a successful restore is observable (diverge from the backup)
-    lifecycle.exec_in_app(domain, ["sh", "-c", f"echo mutated > {MARKER_PATH}"])
-    assert _marker(domain) == "mutated"
+def test_backup_captures_state(live_app):
+    assert (
+        lifecycle.exec_in_app(live_app, ["cat", MARKER_PATH]).strip() == "original"
+    ), "the seeded state was not present at backup time"


@@ -1,25 +1,22 @@
-"""custom-html — RESTORE overlay (Phase 1d, DG4): data-integrity, extends the generic restore.
-Runs after the backup overlay (test_backup.py) on the SAME shared deployment, which left state
-mutated to "mutated" after backing up "original". This restores the snapshot via the shared op
-helper (`generic.do_restore`, which also asserts the app is healthy + serving afterwards), then
-asserts the VOLUME data returned to the pre-mutation "original" — the app-specific data integrity the
-generic restore cannot check. Reads the marker via exec_in_app (volume-direct, robust to the
-post-restore serving race). Assertion-only (no deploy/teardown)."""
+"""custom-html — RESTORE overlay (Phase 1e HC3): data-integrity, assertion-only + additive.
+The orchestrator ran `ops.pre_restore` (mutated the marker to "mutated", diverging from the backed-up
+"original"), then performed the restore ONCE. The generic restore tier already asserted healthy +
+serving; this overlay ADDS the recipe-specific check: the volume data returned to the pre-mutation
+(backed-up) "original". Reads via exec_in_app (volume-direct), robust to the post-restore serving
+race."""
 import os
 import sys
 sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
-from harness import generic, lifecycle  # noqa: E402
+from harness import lifecycle  # noqa: E402
 MARKER_PATH = "/usr/share/nginx/html/ci-marker.txt"
-def test_restore_returns_state(live_app, meta):
-    domain = live_app
-    generic.do_restore(domain, meta)  # restore + assert healthy/serving
-    restored = lifecycle.exec_in_app(domain, ["cat", MARKER_PATH]).strip()
+def test_restore_returns_state(live_app):
+    restored = lifecycle.exec_in_app(live_app, ["cat", MARKER_PATH]).strip()
     assert (
         restored == "original"
     ), f"restore did not return the pre-mutation (backed-up) state: got {restored!r}"


@@ -1,29 +1,21 @@
-"""custom-html — UPGRADE overlay (Phase 1d, DG4): data-continuity, extends the generic upgrade.
-The orchestrator deployed the previous published version ONCE; this overlay seeds a marker into the
-served volume, performs the in-place upgrade via the shared op helper (`generic.do_upgrade`, which
-also asserts reconverge + serving), then asserts the data SURVIVED. Assertion-only on the shared
-deployment (no deploy/teardown here)."""
+"""custom-html — UPGRADE overlay (Phase 1e HC3): data-continuity, assertion-only + additive.
+The orchestrator deployed the base version, ran `ops.pre_upgrade` (seeded a marker into the served
+volume), then performed the upgrade ONCE. The generic upgrade tier already asserted reconverge +
+serving + moved; this overlay runs ALONGSIDE it and ADDS the recipe-specific check: the data written
+before the upgrade survived it. No op, no deploy/teardown here."""
 import os
 import sys
 sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
-from harness import generic, lifecycle  # noqa: E402
+from harness import lifecycle  # noqa: E402
 MARKER_PATH = "/usr/share/nginx/html/ci-marker.txt"
-def test_upgrade_preserves_data(live_app, meta):
-    domain = live_app
-    # write a data marker into the served volume (nginx serves /usr/share/nginx/html)
-    lifecycle.exec_in_app(domain, ["sh", "-c", f"echo upgrade-survives > {MARKER_PATH}"])
-    assert lifecycle.http_fetch(domain, "/ci-marker.txt")[1].strip() == "upgrade-survives"
-    # in-place upgrade previous -> target (reuses the generic op: upgrade + assert reconverge/serving)
-    generic.do_upgrade(domain, os.environ.get("VERSION") or None, meta)
-    # the data written before the upgrade is still there
+def test_upgrade_preserves_data(live_app):
+    # the marker seeded by ops.pre_upgrade (before the harness upgraded) is still served
     assert (
-        lifecycle.http_fetch(domain, "/ci-marker.txt")[1].strip() == "upgrade-survives"
+        lifecycle.http_fetch(live_app, "/ci-marker.txt")[1].strip() == "upgrade-survives"
     ), "data did not survive the upgrade"
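
Taken together, the seed hooks and the assertion-only overlays form a seed, op, assert sequence for the backup and restore tiers. The toy simulation below walks that sequence with an in-memory dict in place of the nginx volume and a copied dict in place of the snapshot. `FakeApp` and its methods are invented for illustration and do not correspond to the harness API; only the marker values are taken from the diff.

```python
class FakeApp:
    """In-memory stand-in for the deployed app's volume and its backup snapshot."""

    def __init__(self):
        self.volume = {}
        self.snapshot = None

    def write_marker(self, value):
        self.volume["ci-marker.txt"] = value

    def read_marker(self):
        return self.volume.get("ci-marker.txt")

    def backup(self):
        # Capture a copy of the volume, as a snapshot would.
        self.snapshot = dict(self.volume)

    def restore(self):
        # Replace the volume contents with the captured snapshot.
        self.volume = dict(self.snapshot)


app = FakeApp()

# backup tier: pre_backup seeds, the op runs once, the overlay asserts the seed is intact
app.write_marker("original")
app.backup()
assert app.read_marker() == "original"

# restore tier: pre_restore diverges, the op runs once, the overlay asserts the rollback
app.write_marker("mutated")
app.restore()
restored = app.read_marker()
assert restored == "original"
```

Moving the "mutated" write into the restore tier's seed step is what lets the backup overlay check the state the snapshot actually captured, rather than the state after the mutation.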