fix(1e): HC1 upgrade/restore tier calls now pass head_ref (multi-line edit miss)

An earlier perl substitution missed the multi-line upgrade and restore run_lifecycle_tier calls. They
still passed `target` (the VERSION env, which is None for !testme runs), so perform_upgrade got
head_ref=None for the upgrade tier → re-checkout skipped → chaos redeploy of the leftover prev
checkout (a vacuous prev→prev that 'passed' via the chaos-label move fallback).

Verified e2e on hedgedoc (install,upgrade; commit pending push):
  upgrade→PR-head: head_ref=09bf4d54 chaos-version=09bf4d54 version=3.0.9+1.10.7→3.0.10+1.10.8
deploy-count=1, install/upgrade=pass, clean teardown. The chaos-version label deterministically
matches head_ref — direct proof PR-head code was deployed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 04:04:13 +01:00
parent 1a9632c2e8
commit 74725610ab
5 changed files with 192 additions and 89 deletions


@@ -15,24 +15,28 @@ those are discovered and run against the live app (D4 — see below).
tests/<recipe>/
├── recipe_meta.py # optional per-recipe harness config (see below)
├── install_steps.sh # optional custom install-steps hook (pre-deploy setup)
├── test_install.py # optional install overlay (else the generic install tier runs)
├── test_upgrade.py # optional upgrade overlay (else the generic upgrade tier runs)
├── test_backup.py # optional backup overlay (else the generic backup tier runs)
└── test_restore.py # optional restore overlay (else the generic restore tier runs)
├── ops.py # optional pre-op seed hooks (pre_install/pre_upgrade/pre_backup/pre_restore)
├── test_install.py # optional install overlay (runs ADDITIVELY alongside generic)
├── test_upgrade.py # optional upgrade overlay (runs ADDITIVELY alongside generic)
├── test_backup.py # optional backup overlay (runs ADDITIVELY alongside generic)
└── test_restore.py # optional restore overlay (runs ADDITIVELY alongside generic)
```
**A recipe is testable with ZERO config:** with no overlay files, the **generic lifecycle suite**
runs (install/upgrade/backup/restore) against a single shared deployment — see `docs/testing.md` for
the full model (tiers, deploy-once, override-vs-extend, precedence, the install-steps hook). The
per-recipe dir only holds the bits where the recipe needs *more* than the generic.
the full model (deploy-once, additive generic+overlay, the chaos PR-head upgrade, the HC2 repo-local
allowlist, the install-steps hook). The per-recipe dir only holds the bits where the recipe needs
*more* than the generic.
To add recipe-specific coverage, drop a `tests/<recipe>/test_<op>.py` **overlay** (it OVERRIDES the
generic for that op; absent ⇒ generic runs). Overlays are **assertion-only** against the shared live
deployment (the `live_app` fixture; they never deploy), and reuse the generic op + serving check by
composition (`from harness import generic; generic.do_upgrade(...)` etc.), adding recipe-specific
assertions. Copy an existing overlay (`tests/custom-html/` simple/volume marker; `tests/keycloak/`
admin-API; `tests/matrix-synapse/` `db`-service psql marker). **Do not edit the shared
`tests/conftest.py` / `runner/harness/` to add a recipe** — set per-recipe config in `recipe_meta.py`:
To add recipe-specific coverage, drop a `tests/<recipe>/test_<op>.py` **overlay**; it runs
**ALONGSIDE** the generic for that op (HC3 additive, Phase 1e); the generic floor is never silently
dropped. Overlays are **assertion-only** against the shared live deployment (the `live_app` fixture;
they never perform the op or deploy/teardown — the orchestrator owns those). If the overlay needs to
SEED pre-op state (data-continuity markers, the backup→restore divergence), put `pre_<op>(domain,
meta)` callables in `tests/<recipe>/ops.py` — the orchestrator runs them BEFORE the op. Copy an
existing recipe (`tests/custom-html/` simple/volume marker; `tests/keycloak/` admin-API;
`tests/matrix-synapse/` `db`-service psql marker). **Do not edit the shared `tests/conftest.py` /
`runner/harness/` to add a recipe** — set per-recipe knobs in `recipe_meta.py`:
```python
HEALTH_PATH = "/realms/master" # path that returns a healthy status (default "/")
@@ -41,19 +45,24 @@ DEPLOY_TIMEOUT = 600 # seconds for services to converge (default 600
HTTP_TIMEOUT = 600 # seconds for the app to answer (default 300)
BACKUP_CAPABLE = True # override backup-capability auto-detect (default: scan compose)
EXTRA_ENV = {"KEY": "value"} # or EXTRA_ENV(domain) -> dict; extra .env keys set at deploy
SKIP_GENERIC = ["upgrade"] # per-recipe opt-out from the generic floor for the listed ops
# ("all"/"*" = every op); rarely needed — generic is the floor
```
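The two less obvious knobs above, `EXTRA_ENV` (a dict or a callable taking the domain) and `SKIP_GENERIC` (an op list with the `"all"`/`"*"` wildcards), could be consumed roughly as follows. This is a minimal sketch, not the harness code: the function names `resolve_extra_env` and `generic_skipped` are hypothetical, and only the knob semantics are taken from the comments above.

```python
from typing import Callable, Iterable, Mapping, Union

# EXTRA_ENV may be a plain mapping, a callable EXTRA_ENV(domain) -> dict, or unset.
ExtraEnv = Union[Mapping[str, str], Callable[[str], Mapping[str, str]], None]


def resolve_extra_env(extra_env: ExtraEnv, domain: str) -> dict:
    """Return the extra .env keys for this deploy as a plain dict (hypothetical helper)."""
    if extra_env is None:
        return {}
    if callable(extra_env):
        # Callable form: the per-run domain is passed in so values can depend on it.
        return dict(extra_env(domain))
    return dict(extra_env)


def generic_skipped(skip_generic: Union[str, Iterable[str], None], op: str) -> bool:
    """True if the recipe opted out of the generic floor for `op` (hypothetical helper)."""
    if skip_generic is None:
        return False
    # ASSUMPTION: a bare string such as "all" is accepted as well as a list.
    ops = {skip_generic} if isinstance(skip_generic, str) else set(skip_generic)
    return op in ops or bool(ops & {"all", "*"})
```

With these definitions, `generic_skipped(["upgrade"], "backup")` is false, so the generic backup tier would still run for a recipe that only opts out of the upgrade.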
Useful `harness.lifecycle` helpers for overlays: `http_get`, `http_fetch`, `http_body`,
`exec_in_app` (use this for data markers — volume/DB, robust to the serving layer); the lifecycle ops
themselves come from `harness.generic` (`assert_serving`, `do_upgrade`, `do_backup`, `do_restore`).
The harness forces `LETS_ENCRYPT_ENV=""` (no ACME), a unique short domain per run, and guarantees
teardown.
`exec_in_app` (use this for data markers — volume/DB, hardened with returncode+retry); the lifecycle
ops themselves are orchestrator-owned (you never call them from an overlay). The harness forces
`LETS_ENCRYPT_ENV=""` (no ACME), a unique short domain per run, and guarantees teardown.
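To illustrate the division of labour described above (the orchestrator owns the op; `ops.py` only seeds state beforehand), here is a rough sketch of how a `pre_<op>(domain, meta)` hook might be looked up and run before the op. The dispatcher `run_pre_op` and the stand-in `fake_ops` object are hypothetical; only the hook naming and the `(domain, meta)` signature come from this README.

```python
from types import SimpleNamespace


def run_pre_op(ops_module, op, domain, meta):
    """Run pre_<op>(domain, meta) if the recipe defines it; report whether it ran.

    Hypothetical dispatcher: the real orchestrator may resolve hooks differently.
    """
    hook = getattr(ops_module, f"pre_{op}", None)
    if not callable(hook):
        return False  # no seed hook for this op: nothing to do
    hook(domain, meta)
    return True


# Stand-in for a recipe's tests/<recipe>/ops.py, recording what it would seed.
seeded = []


def pre_upgrade(domain, meta):
    # A real hook would write a data-continuity marker into the live app here.
    seeded.append(("upgrade", domain))


fake_ops = SimpleNamespace(pre_upgrade=pre_upgrade)
```

Calling `run_pre_op(fake_ops, "upgrade", domain, meta)` runs the hook and returns `True`; for an op with no hook, such as `"backup"` here, it returns `False` and the orchestrator would proceed straight to the op.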
## 3. Recipe-local tests (D4)
## 3. Recipe-local tests (D4) — default-deny (HC2)
If the recipe's own repo contains `tests/test_*.py`, the runner snapshots them right after fetch and
runs them against the **live deployment** as a `recipe-local` stage. Contract: those tests receive
env `CCCI_BASE_URL` (e.g. `https://<app>.ci.commoninternet.net/`) and `CCCI_APP_DOMAIN`.
If the recipe's own repo contains `tests/test_*.py` / `install_steps.sh` / `ops.py`, the runner
snapshots them right after fetch — but per Phase 1e HC2 it executes them **only** for recipes on the
cc-ci approval allowlist `tests/repo-local-approved.txt` (default empty ⇒ default-deny). PR-author
code runs on the CI host with `/run/secrets/*` present, so adding a recipe to the allowlist is a
deliberate cc-ci-maintainer act (in a cc-ci PR, after reviewing that recipe's repo-local tests).
Without approval, only the cc-ci overlays in this repo + the generic floor run. Approved recipe-local
files receive env `CCCI_BASE_URL` (e.g. `https://<app>.ci.commoninternet.net/`) and `CCCI_APP_DOMAIN`.
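The default-deny rule can be sketched as a small allowlist check. This is an assumption-laden illustration rather than the runner's code: the helper names are hypothetical, and the file format (one recipe name per line, with blank lines and `#` comments ignored) is assumed, since the README only gives the path and that it is empty by default.

```python
from pathlib import Path


def load_allowlist(path):
    """Return the set of approved recipe names; a missing file approves nothing."""
    p = Path(path)
    if not p.is_file():
        return set()
    approved = set()
    for raw in p.read_text(encoding="utf-8").splitlines():
        # ASSUMED format: one recipe per line, '#' starts a comment.
        name = raw.split("#", 1)[0].strip()
        if name:
            approved.add(name)
    return approved


def repo_local_allowed(recipe, path="tests/repo-local-approved.txt"):
    """Default-deny: repo-local files run only for recipes listed in the allowlist."""
    return recipe in load_allowlist(path)
```

Because an empty or absent file yields an empty set, every recipe is denied until a maintainer adds its name, which matches the default-deny behaviour described above.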
## 4. Add the repo to the bridge poll list