Files
cc-ci/machine-docs/REVIEW-1e.md

111 lines
8.4 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# REVIEW-1e — Adversary verdicts (Phase 1e: generic-harness corrections)
Adversary-owned, append-only. Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase1e-harness-corrections.md`.
Definition of Done = HC1HC4 each cold-verified PASS here (handshake per plan.md §6.1).
## Definition-of-Done tracker
- [ ] **HC1** — Upgrade tier upgrades to PR head (prev published → PR-head via `abra app deploy --chaos`), not a published tag; moved-assertion adapted; DG4.1 deploy-count guard reconciled.
- [x] **HC2** — Repo-local (PR-authored) `test_*.py` / `install_steps.sh` NOT executed unless recipe is on the cc-ci approval allowlist (default-deny). **PASS @2026-05-28 (E0, commit c7ae296).**
- [x] **HC3** — Generic runs by default alongside an overlay (additive); skipped only via explicit opt-out; op runs once. **PASS @2026-05-28 (E1 re-claim, fix commit 6eabfdc).**
- [ ] **HC4** — No regression: D1D10 / DG1DG8 re-verified cold; deploy-once (DG4.1) holds; teardown sacred; three new behaviors demonstrated.
Maps to Builder milestones: E0=HC2, E1=HC3, E2=HC1, E3=HC4+docs.
## Cold-start access (re-verified each phase)
- @2026-05-28 — `ssh cc-ci` OK (NixOS 24.11), dashboard HTTP 200 via SOCKS proxy 127.0.0.1:1055. Proxy/SSH path healthy.
## Verdicts
### E0 / HC2 — repo-local trust gate (default-deny) — PASS @2026-05-28
Builder claim (STATUS-1e, commit c7ae296 / feat d38a695): repo-local (PR-authored)
`test_*.py`/`install_steps.sh`/`ops.py` consulted only for recipes on `tests/repo-local-approved.txt`
(empty ⇒ deny); centralized `_gated()` in `discovery.py`; 8 unit tests pass.
**Cold verification (own clone HEAD=c7ae296, shipped to cc-ci, run via `cc-ci-run`):**
1. **Unit suite, independent run:** `cd /tmp/adv-1e && cc-ci-run -m pytest tests/unit -v`
**8 passed in 0.06s** (incl. repo-local-ignored-when-unapproved / wins-when-approved for
overlay+custom+install_steps+pre_op, and default-allowlist-is-empty).
2. **My own break-it probe** (`hc2_probe.py`, planted a HOSTILE repo-local `install_steps.sh`
`rm -rf /` + `ops.py` `os.system('id')` + `test_install.py`):
- real checked-in allowlist → `approved_recipes() == set()` (default-deny).
- `real-default``approved=False`, overlay falls back to **cc-ci**, `install_steps=None`,
`pre_op=None` (hostile repo-local code NOT selected).
- lone `*`**DENY** (not a wildcard, as the file header promises).
- only-comment / whitespace lines → **DENY**.
- approving a *different* recipe (hedgedoc) → custom-html still **DENY** (no leak).
- `custom-html` listed → `approved=True`, overlay/install_steps/pre_op all flip to **repo-local**.
3. **No bypass:** every execution path in `runner/run_recipe_ci.py` routes through gated
`discovery.*` (`resolve_op``resolve_overlay_op`, `custom_tests`, `install_steps`→lifecycle hook).
`snapshot_recipe_tests` reads the repo-local dir ungated but only **copies** it (discover), never
executes — matches the plan's "discovered-but-NOT-executed". `pre_op_hook` not yet wired into the
orchestrator (E1/HC3 work); its discovery fn is already gated.
Verdict: **PASS** — default-secure, centralized gate, flips only on explicit per-recipe approval;
hostile repo-local code provably not executed under the shipped default. No finding.
**Note (not a defect):** orchestrator still uses single-file override `resolve_op` (1d semantics);
the additive generic floor (HC3) is E1 in-flight — will re-check the gate survives the HC3 refactor.
### E1 / HC3 — additive generic + op/assertion split — FAIL (PASS WITHHELD) @2026-05-28
Builder claim (STATUS-1e gate, commit b7e6cbd): generic runs additively alongside overlays;
orchestrator owns each op (once); opt-out via `CCCI_SKIP_GENERIC[_<OP>]`/`recipe_meta.SKIP_GENERIC`;
deploy-count stays 1; two e2e (default + opt-out) "clean."
**Cold verification (own clone HEAD=b7e6cbd shipped to cc-ci `/tmp/adv-1e`, run via `cc-ci-run`):**
- **Structure (PASS):** read the refactor — `run_lifecycle_tier` performs the op ONCE
(`_perform_op``generic.perform_{upgrade,backup,restore}`, none call `deploy_app`), then runs generic
(unless `_skip_generic`) + overlay as separate pytests vs the shared post-op state. Generic+overlay
test files are assertion-only; seeding moved to `ops.py pre_<op>`. `assert_upgraded` keeps the
non-vacuous move check (F1d-2). `_record_deploy()` lives only in `deploy_app`.
- **Default e2e** (custom-html, all stages): EVERY tier ran BOTH `assert (generic)` AND
`assert (cc-ci)`; pre_upgrade/backup/restore seeds fired; **deploy-count=1**; install/upgrade/backup/
restore all PASS; custom=skip; clean teardown (no leftover stack/volume). ✓ additive confirmed.
- **Opt-out e2e** (`CCCI_SKIP_GENERIC=1`): generic skipped on every tier (**0** `_generic/` files ran),
overlay-only, **deploy-count=1** ✓ — **but backup=FAIL**: `test_backup_captures_state`
`AssertionError: '' == 'original'`. Same code/recipe; only diff is the opt-out flag.
**Interim verdict (commit 4334e19): FAIL — opt-out flipped backup RED**, theorised cause was the
opt-out path removing an accidental ~1s generic-pytest timing buffer. **Filed F1e-1.**
### CORRECTION @2026-05-28 (isolated repro disproved the opt-out theory)
Isolated, no-concurrency repro of `STAGES=install,backup,restore` on custom-html:
- **opt-out × 3** (`CCCI_SKIP_GENERIC=1`): backup PASS, restore PASS, deploy-count=1. **3/3.**
- **default × 1**: backup PASS, restore PASS, deploy-count=1.
So opting out of the generic is **NOT** what flips the backup RED — the original symptom occurred while
the Builder was running concurrent custom-html e2e on the same node. The real trigger is **load /
concurrency** putting the post-backup container cycle into a window where `exec_in_app`'s `docker exec`
fails. The **static defect stays the same** (and the fix direction in F1e-1 is still correct):
`exec_in_app` silently returns empty stdout on a failed exec (returncode ignored) + no readiness retry.
F1e-1 reframed in BACKLOG-1e; my earlier "opt-out is not behavior-neutral" framing is **withdrawn**.
### Builder's fix (commit 6eabfdc) — verification pending
`exec_in_app` now polls (re-resolves container + re-execs) until `rc==0` or 90s, then **raises**
never masks a failed exec as empty data. No assertion weakened. Same commit also lands HC1 plumbing
(`chaos_redeploy`, `recipe_head_commit`, `.chaos-version` parsing in `deployed_identity`, head_ref
match in `assert_upgraded`) — out-of-scope for this re-verification, will check at E2 claim.
**Fix verified cold @2026-05-28 (own clone HEAD=6eabfdc shipped to `/tmp/adv-fix`):**
`CCCI_SKIP_GENERIC=1 RECIPE=custom-html STAGES=install,backup,restore cc-ci-run runner/run_recipe_ci.py`
→ install/backup/restore **all PASS**, deploy-count=1, generic skipped on every tier (overlay-only),
clean teardown (no leftover stack/volume). The `exec_in_app` poll+raise is structurally watertight:
re-resolves the container each try, raises on persistent failure — no silent-empty data path remains;
a real exec failure becomes a real test failure rather than an `'' == 'original'` false-RED.
**F1e-1 closed by Adversary @2026-05-28** (BACKLOG-1e).
### Final E1/HC3 verdict — PASS @2026-05-28 (re-claim commit e75ec1b; fix commit 6eabfdc)
Cold-verified: (1) additive — every lifecycle tier runs both `assert (generic)` and `assert (cc-ci)` on
the shared post-op deployment (default run, all stages PASS); (2) opt-out — `CCCI_SKIP_GENERIC=1`
skips the generic on every tier with **0** `_generic/` files run and overlay-only, deploy-count=1;
(3) op-once — op primitives `perform_{upgrade,backup,restore}` never call `deploy_app`, deploy-count
stays 1 in both modes; (4) assertion-only overlays — no double-op risk; (5) no assertion weakened —
`assert_upgraded` keeps the non-vacuous move check (F1d-2 honored). HC2 gate survives the refactor.
**Open robustness item:** F1e-2 (recipe-fetch concurrency race) — pre-existing, orthogonal, tracked
for HC4.
### Separate observation while testing (NOT F1e-1)
A controlled 2-concurrent same-recipe test (PR=8001/PR=8002, both custom-html) on the **OLD** code
showed run-a die in `abra recipe fetch custom-html -n` (rc=1) — concurrent rm-rf + abra-fetch on the
same `~/.abra/recipes/custom-html` collide. Pre-existing (in 1d too), orthogonal to E1/HC3, not the
F1e-1 trigger. Filing separately as **F1e-2 [adversary]** for HC4 visibility (§6 D-gate requires
concurrent runs to be safe). Drone caps `MAX_TESTS=1-2` today, so practical impact is bounded.