# cc-ci Phase 1e — Generic-harness corrections (Autonomous Build Plan)

**Status:** QUEUED — runs **after Phase 1d** and **before Phase 2** (`plan-phase2-recipe-tests.md`). It corrects the **shared generic-test harness** from 1d, so it must land before Phase 2 authors overlays on top of it.

**Transition:** **manual** (operator kicks it off).

**Builds on:** the Phase-1d generic suite (`runner/run_recipe_ci.py`, `runner/harness/*`, `tests/_generic/*`, `tests/conftest.py`) — see `plan-phase1d-generic-test-suite.md`.

**Owner agents:** same Builder + Adversary loops (`plan.md` §6/§7); Adversary cold-verifies.

**This file's path:** `/srv/cc-ci/cc-ci-plan/plan-phase1e-harness-corrections.md`

**Phase order:** 1c → 1b → 1d → **1e** → 2 → 2b → 3.

---

## 0. Why this phase

An operator review of the 1d generic suite (2026-05-28) found three corrections to the **shared harness** — the foundation every recipe overlay (Phase 2) builds on. Fixing them now, once, is far cheaper than after overlays exist. All three are small in code but change behavior, so each needs a fresh Adversary cold-verification and must not weaken any existing test.

---

## 1. Definition of Done (Phase 1e exit condition)

Terminates when every item holds **and the Adversary has independently cold-verified** (logged in `machine-docs/REVIEW-1e.md`):

- [ ] **HC1 — Upgrade tier upgrades to the code under test (PR head), not a published tag.** The upgrade tier deploys the **previous published version** (last release before the PR) and then **upgrades to the PR head via `abra app deploy --chaos`** (chaos = the current checkout). The PR's actual changes are exercised by the upgrade path. (§2.1)
- [ ] **HC2 — Repo-local (PR-authored) code is not executed unless the recipe is approved.** By default the harness runs **only cc-ci-authored** overlays/install-steps (`tests//…`) + the generic; PR-authored repo-local `test_*.py` and `install_steps.sh` are **not run**.
  Repo-local code is honored **only for recipes on an explicit cc-ci-maintained approval allowlist** (default-deny). (§2.2)
- [ ] **HC3 — Generic runs by default (additive); skipping it is explicit.** When a recipe ships an overlay for an op, the **generic still runs** alongside it by default; the generic is skipped **only** when an explicit env/flag opts out. The baseline floor is never lost silently. (§2.3)
- [ ] **HC4 — No regression, cold-verified.** The Adversary re-runs the relevant D1–D10 / DG1–DG8 acceptance from a cold start: nothing weakened, deploy-once (DG4.1) still holds, teardown still sacred, and the three new behaviors are demonstrated (HC1: a PR-head upgrade proven to deploy PR-head; HC2: a repo-local test is *ignored* for a non-approved recipe and *run* for an approved one; HC3: generic runs with an overlay present, and is skipped only with the opt-out set).

When HC1–HC4 hold and are confirmed, write `## DONE` to `machine-docs/STATUS-1e.md`.

---

## 2. The three corrections

### 2.1 HC1 — Upgrade to the PR head (not a published tag)

Current 1d behavior: deploy previous published version, then `abra app upgrade` to the **newest published tag** — and because deploying the prev tag re-checks-out the recipe, the **PR-head code is never deployed**, so a recipe PR's changes aren't exercised by upgrade.

Corrected:

1. Deploy the **previous published version** (the last release before the code under test) as the "before" state.
2. **Restore the PR-head checkout** (re-checkout the PR ref / re-use the post-fetch snapshot — the prev-tag deploy will have reset `~/.abra/recipes/`).
3. **Upgrade to it via `abra app deploy --chaos`** (chaos = current checkout = PR head) in place on the shared deployment.
4. Assert reconverge + still serving (as today).
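
The four corrected steps can be illustrated with a toy model. This is a minimal sketch, not the harness: `FakeHost`, its methods, and the ref names are hypothetical stand-ins, and `deploy_chaos` only models the "deploy whatever is checked out" behavior of `abra app deploy --chaos`. It shows why step 2 matters (without it the chaos deploy re-deploys the previous tag) and one way a deploy-once guard could count fresh installs instead of in-place redeploys.

```python
from dataclasses import dataclass


@dataclass
class FakeHost:
    """Toy stand-in for the CI host; not the real harness or abra."""
    checkout: str        # ref currently checked out in the recipe directory
    deployed: str = ""   # ref the running app was last deployed from
    installs: int = 0    # fresh app creations (what a deploy-once guard could count)

    def new_app(self) -> None:
        self.installs += 1

    def deploy_version(self, tag: str) -> None:
        # Deploying a published tag re-checks-out the recipe at that tag.
        self.checkout = tag
        self.deployed = tag

    def checkout_ref(self, ref: str) -> None:
        self.checkout = ref

    def deploy_chaos(self) -> None:
        # Models a chaos deploy: deploy whatever is currently checked out.
        self.deployed = self.checkout


def upgrade_to_pr_head(host: FakeHost, prev_tag: str, pr_head: str) -> None:
    host.deploy_version(prev_tag)     # 1. "before" state
    host.checkout_ref(pr_head)        # 2. restore the PR-head checkout
    host.deploy_chaos()               # 3. in-place upgrade to the current checkout
    assert host.deployed == pr_head   # 4. stand-in for reconverge + serving checks


host = FakeHost(checkout="pr-head")
host.new_app()
upgrade_to_pr_head(host, prev_tag="prev-release", pr_head="pr-head")
print(host.deployed, host.installs)  # pr-head 1
```

In this model the install count stays at 1 across the upgrade, which matches the suggestion to count `abra app new` installs instead of in-place redeploys.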
- **Adapt the "deployment moved" assertion** (`generic.do_upgrade`): prev→PR-head may *not* bump the coop-cloud version label (a PR can change a recipe without a version bump), so also accept an image/config change, or assert the running config now matches the PR head — keep it non-vacuous without false-failing a legit unbumped PR.
- **Non-PR `!testme`** (no PR head): "current checkout" = the catalogue current, so upgrade tests prev→current — still valid.
- Preserve **deploy-once** spirit: this is still one app deployment mutated in place (prev → chaos redeploy of PR head is the upgrade op, not a fresh second app). Reconcile with the DG4.1 deploy-count guard — define whether a chaos redeploy counts as a "deploy" and adjust the guard so the legitimate upgrade isn't flagged (e.g. count `abra app new` installs, not in-place redeploys).

### 2.2 HC2 — Repo-local trust gate (default-deny; cc-ci overlays only)

`install_steps.sh` and repo-local `test_*.py` are PR-author-controlled code that runs on the CI host with `/run/secrets/*` present — an untrusted-code risk. Operator decision (2026-05-28):

- **Default:** the harness runs **only cc-ci-authored** overlays + install-steps (`tests//…`) and the generic. Repo-local (`/tests/`) `test_*.py` and `install_steps.sh` are **discovered-but-not-executed**.
- **Approved recipes only:** repo-local code is honored **only** when the recipe is on an explicit, **cc-ci-maintained approval allowlist** (default-empty ⇒ default-deny). Adding a recipe to the allowlist is a deliberate cc-ci-maintainer act after reviewing that recipe's tests.
- Update `discovery.resolve_op` / `custom_tests` / `install_steps` so the **repo-local source is only consulted for allowlisted recipes**; otherwise precedence is **cc-ci > generic** only.
- **Open (settle in DECISIONS):** the allowlist's form + location (a checked-in file like `tests/repo-local-approved.txt`, or a field in a cc-ci config), and the approval workflow. Keep it simple + auditable + in git.
- (Future hardening, → IDEAS, not this phase: sandbox/network-restrict even cc-ci overlays.)

### 2.3 HC3 — Generic by default (additive), explicit opt-out

Supersedes 1d's pure-override default. New rule: when a recipe ships an overlay for an op, **both the generic and the overlay run** for that op by default; the generic is skipped **only** when an explicit opt-out is set.

- **Opt-out mechanism (propose; settle in DECISIONS):** an env flag `CCCI_SKIP_GENERIC` (all ops) and per-op `CCCI_SKIP_GENERIC_` (e.g. `..._UPGRADE`), settable via the recipe's `recipe_meta.py` (a `SKIP_GENERIC` list) so it's declarative per recipe, not a hidden global.
- **Op-vs-assertion split (required by additive + deploy-once):** a mutating op (upgrade/backup/restore) must run **once**, then **both** the generic assertions and the overlay assertions evaluate the post-op state — never upgrade/backup twice. So refactor the tiers: the **orchestrator performs the op once** (the harness owns the op), then runs generic assertions (unless opted out) + overlay assertions against the shared post-op deployment. For `install` (no op) both assertion sets just run. This keeps deploy-once and one-op-per-tier intact.
- Net effect: the generic "is it actually serving / did the upgrade move / snapshot produced" floor is **always** exercised unless a recipe explicitly declares it skips generics — overlays add, they don't silently subtract.

---

## 3. Method / milestones (bounded)

- **E0 — HC2 trust gate.** Gate repo-local behind the approval allowlist (default-deny); cc-ci+generic only otherwise. *Accept:* repo-local ignored for a non-approved recipe, run for an approved one.
- **E1 — HC3 additive + op/assertion split.** Generic runs alongside overlays by default; op runs once; opt-out env skips the generic assertions. *Accept:* overlay + generic both run on one deployment; opt-out skips generic; deploy-count still 1.
- **E2 — HC1 upgrade-to-PR-head.** prev-release → PR-head via `deploy --chaos`; moved-assertion adapted; deploy-count guard reconciled. *Accept:* upgrade demonstrably deploys PR-head.
- **E3 — HC4 cold re-verification + docs.** Adversary cold-verifies no regression + the three new behaviors; update `docs/` + `machine-docs/DECISIONS.md`; flip `STATUS-1e.md` to `## DONE`.

---

## 4. Guardrails

- **Never weaken a test** — these are correctness/security fixes; the cardinal rule still wins.
- **Default-secure** — repo-local PR code is off unless the recipe is explicitly approved; the allowlist lives in git and is auditable.
- **Floor-by-default** — the generic baseline always runs unless a recipe explicitly opts out.
- **Deploy-once preserved** — one app deployment, one teardown; ops run once; reconcile the DG4.1 guard with the chaos-upgrade redeploy.
- **Bounded** — three fixes + verification, then stop; bigger hardening (sandboxing) → IDEAS.

## 5. Open decisions (log in machine-docs/DECISIONS.md)

- HC2: approval-allowlist form/location + the approval workflow.
- HC3: opt-out flag name/granularity + declaring it via `recipe_meta.py`.
- HC1: how the DG4.1 deploy-count guard treats an in-place chaos upgrade (don't flag the legit op).
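
To make the HC2 and HC3 decisions concrete, here is a minimal sketch of one way they could land. Everything in it is an assumption pending DECISIONS: the function names (`load_allowlist`, `generic_skipped`, `assertion_sources`) are hypothetical and are not the existing `discovery` API, the allowlist is assumed to be a plain text file with one recipe per line, an env flag is assumed to be active when set to `"1"`, and repo-local is assumed to take precedence over a cc-ci overlay once a recipe is approved.

```python
import os
from typing import Iterable, List, Mapping, Optional, Set


def load_allowlist(lines: Iterable[str]) -> Set[str]:
    """Parse allowlist text: one recipe per line, '#' comments and blanks ignored."""
    approved = set()
    for line in lines:
        name = line.split("#", 1)[0].strip()
        if name:
            approved.add(name)
    return approved


def generic_skipped(op: str, skip_generic: Iterable[str], env: Mapping[str, str]) -> bool:
    """True only when an explicit opt-out covers this op (assumed flag value: "1")."""
    per_op_flag = f"CCCI_SKIP_GENERIC_{op.upper()}"
    return (
        op in set(skip_generic)                  # declarative, from recipe_meta.py
        or env.get("CCCI_SKIP_GENERIC") == "1"   # all ops
        or env.get(per_op_flag) == "1"           # this op only
    )


def assertion_sources(
    recipe: str,
    op: str,
    *,
    has_ccci_overlay: bool,
    has_repo_local: bool,
    allowlist: Set[str],
    skip_generic: Iterable[str] = (),
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return which assertion sets run against the shared post-op deployment."""
    env = os.environ if env is None else env
    sources = []
    if not generic_skipped(op, skip_generic, env):
        sources.append("generic")        # additive floor, on by default
    if has_repo_local and recipe in allowlist:
        sources.append("repo-local")     # honored only for approved recipes
    elif has_ccci_overlay:
        sources.append("cc-ci")
    return sources


approved = load_allowlist(["# reviewed by a cc-ci maintainer", "approved-recipe", ""])
print(assertion_sources("other-recipe", "upgrade", has_ccci_overlay=True,
                        has_repo_local=True, allowlist=approved, env={}))
# ['generic', 'cc-ci']
```

Passing `env` explicitly keeps the decision testable without touching the process environment; an empty allowlist yields default-deny with no extra code path.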