# REVIEW-1e — Adversary verdicts (Phase 1e: generic-harness corrections) Adversary-owned, append-only. Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase1e-harness-corrections.md`. Definition of Done = HC1–HC4 each cold-verified PASS here (handshake per plan.md §6.1). ## Definition-of-Done tracker - [ ] **HC1** — Upgrade tier upgrades to PR head (prev published → PR-head via `abra app deploy --chaos`), not a published tag; moved-assertion adapted; DG4.1 deploy-count guard reconciled. - [x] **HC2** — Repo-local (PR-authored) `test_*.py` / `install_steps.sh` NOT executed unless recipe is on the cc-ci approval allowlist (default-deny). **PASS @2026-05-28 (E0, commit c7ae296).** - [x] **HC3** — Generic runs by default alongside an overlay (additive); skipped only via explicit opt-out; op runs once. **PASS @2026-05-28 (E1 re-claim, fix commit 6eabfdc).** - [ ] **HC4** — No regression: D1–D10 / DG1–DG8 re-verified cold; deploy-once (DG4.1) holds; teardown sacred; three new behaviors demonstrated. Maps to Builder milestones: E0=HC2, E1=HC3, E2=HC1, E3=HC4+docs. ## Cold-start access (re-verified each phase) - @2026-05-28 — `ssh cc-ci` OK (NixOS 24.11), dashboard HTTP 200 via SOCKS proxy 127.0.0.1:1055. Proxy/SSH path healthy. ## Verdicts ### E0 / HC2 — repo-local trust gate (default-deny) — PASS @2026-05-28 Builder claim (STATUS-1e, commit c7ae296 / feat d38a695): repo-local (PR-authored) `test_*.py`/`install_steps.sh`/`ops.py` consulted only for recipes on `tests/repo-local-approved.txt` (empty ⇒ deny); centralized `_gated()` in `discovery.py`; 8 unit tests pass. **Cold verification (own clone HEAD=c7ae296, shipped to cc-ci, run via `cc-ci-run`):** 1. **Unit suite, independent run:** `cd /tmp/adv-1e && cc-ci-run -m pytest tests/unit -v` → **8 passed in 0.06s** (incl. repo-local-ignored-when-unapproved / wins-when-approved for overlay+custom+install_steps+pre_op, and default-allowlist-is-empty). 2. **My own break-it probe** (`hc2_probe.py`, planted a HOSTILE repo-local `install_steps.sh` `rm -rf /` + `ops.py` `os.system('id')` + `test_install.py`): - real checked-in allowlist → `approved_recipes() == set()` (default-deny). - `real-default` → `approved=False`, overlay falls back to **cc-ci**, `install_steps=None`, `pre_op=None` (hostile repo-local code NOT selected). - lone `*` → **DENY** (not a wildcard, as the file header promises). - only-comment / whitespace lines → **DENY**. - approving a *different* recipe (hedgedoc) → custom-html still **DENY** (no leak). - `custom-html` listed → `approved=True`, overlay/install_steps/pre_op all flip to **repo-local**. 3. **No bypass:** every execution path in `runner/run_recipe_ci.py` routes through gated `discovery.*` (`resolve_op`→`resolve_overlay_op`, `custom_tests`, `install_steps`→lifecycle hook). `snapshot_recipe_tests` reads the repo-local dir ungated but only **copies** it (discover), never executes — matches the plan's "discovered-but-NOT-executed". `pre_op_hook` not yet wired into the orchestrator (E1/HC3 work); its discovery fn is already gated. Verdict: **PASS** — default-secure, centralized gate, flips only on explicit per-recipe approval; hostile repo-local code provably not executed under the shipped default. No finding. **Note (not a defect):** orchestrator still uses single-file override `resolve_op` (1d semantics); the additive generic floor (HC3) is E1 in-flight — will re-check the gate survives the HC3 refactor. ### E1 / HC3 — additive generic + op/assertion split — FAIL (PASS WITHHELD) @2026-05-28 Builder claim (STATUS-1e gate, commit b7e6cbd): generic runs additively alongside overlays; orchestrator owns each op (once); opt-out via `CCCI_SKIP_GENERIC[_]`/`recipe_meta.SKIP_GENERIC`; deploy-count stays 1; two e2e (default + opt-out) "clean." **Cold verification (own clone HEAD=b7e6cbd shipped to cc-ci `/tmp/adv-1e`, run via `cc-ci-run`):** - **Structure (PASS):** read the refactor — `run_lifecycle_tier` performs the op ONCE (`_perform_op`→`generic.perform_{upgrade,backup,restore}`, none call `deploy_app`), then runs generic (unless `_skip_generic`) + overlay as separate pytests vs the shared post-op state. Generic+overlay test files are assertion-only; seeding moved to `ops.py pre_`. `assert_upgraded` keeps the non-vacuous move check (F1d-2). `_record_deploy()` lives only in `deploy_app`. - **Default e2e** (custom-html, all stages): EVERY tier ran BOTH `assert (generic)` AND `assert (cc-ci)`; pre_upgrade/backup/restore seeds fired; **deploy-count=1**; install/upgrade/backup/ restore all PASS; custom=skip; clean teardown (no leftover stack/volume). ✓ additive confirmed. - **Opt-out e2e** (`CCCI_SKIP_GENERIC=1`): generic skipped on every tier (**0** `_generic/` files ran), overlay-only, **deploy-count=1** ✓ — **but backup=FAIL**: `test_backup_captures_state` → `AssertionError: '' == 'original'`. Same code/recipe; only diff is the opt-out flag. **Interim verdict (commit 4334e19): FAIL — opt-out flipped backup RED**, theorised cause was the opt-out path removing an accidental ~1s generic-pytest timing buffer. **Filed F1e-1.** ### CORRECTION @2026-05-28 (isolated repro disproved the opt-out theory) Isolated, no-concurrency repro of `STAGES=install,backup,restore` on custom-html: - **opt-out × 3** (`CCCI_SKIP_GENERIC=1`): backup PASS, restore PASS, deploy-count=1. **3/3.** - **default × 1**: backup PASS, restore PASS, deploy-count=1. So opting out of the generic is **NOT** what flips the backup RED — the original symptom occurred while the Builder was running concurrent custom-html e2e on the same node. The real trigger is **load / concurrency** putting the post-backup container cycle into a window where `exec_in_app`'s `docker exec` fails. The **static defect stays the same** (and the fix direction in F1e-1 is still correct): `exec_in_app` silently returns empty stdout on a failed exec (returncode ignored) + no readiness retry. F1e-1 reframed in BACKLOG-1e; my earlier "opt-out is not behavior-neutral" framing is **withdrawn**. ### Builder's fix (commit 6eabfdc) — verification pending `exec_in_app` now polls (re-resolves container + re-execs) until `rc==0` or 90s, then **raises** — never masks a failed exec as empty data. No assertion weakened. Same commit also lands HC1 plumbing (`chaos_redeploy`, `recipe_head_commit`, `.chaos-version` parsing in `deployed_identity`, head_ref match in `assert_upgraded`) — out-of-scope for this re-verification, will check at E2 claim. **Fix verified cold @2026-05-28 (own clone HEAD=6eabfdc shipped to `/tmp/adv-fix`):** `CCCI_SKIP_GENERIC=1 RECIPE=custom-html STAGES=install,backup,restore cc-ci-run runner/run_recipe_ci.py` → install/backup/restore **all PASS**, deploy-count=1, generic skipped on every tier (overlay-only), clean teardown (no leftover stack/volume). The `exec_in_app` poll+raise is structurally watertight: re-resolves the container each try, raises on persistent failure — no silent-empty data path remains; a real exec failure becomes a real test failure rather than an `'' == 'original'` false-RED. **F1e-1 closed by Adversary @2026-05-28** (BACKLOG-1e). ### Final E1/HC3 verdict — PASS @2026-05-28 (re-claim commit e75ec1b; fix commit 6eabfdc) Cold-verified: (1) additive — every lifecycle tier runs both `assert (generic)` and `assert (cc-ci)` on the shared post-op deployment (default run, all stages PASS); (2) opt-out — `CCCI_SKIP_GENERIC=1` skips the generic on every tier with **0** `_generic/` files run and overlay-only, deploy-count=1; (3) op-once — op primitives `perform_{upgrade,backup,restore}` never call `deploy_app`, deploy-count stays 1 in both modes; (4) assertion-only overlays — no double-op risk; (5) no assertion weakened — `assert_upgraded` keeps the non-vacuous move check (F1d-2 honored). HC2 gate survives the refactor. **Open robustness item:** F1e-2 (recipe-fetch concurrency race) — pre-existing, orthogonal, tracked for HC4. ### Separate observation while testing (NOT F1e-1) A controlled 2-concurrent same-recipe test (PR=8001/PR=8002, both custom-html) on the **OLD** code showed run-a die in `abra recipe fetch custom-html -n` (rc=1) — concurrent rm-rf + abra-fetch on the same `~/.abra/recipes/custom-html` collide. Pre-existing (in 1d too), orthogonal to E1/HC3, not the F1e-1 trigger. Filing separately as **F1e-2 [adversary]** for HC4 visibility (§6 D-gate requires concurrent runs to be safe). Drone caps `MAX_TESTS=1-2` today, so practical impact is bounded.