Files
cc-ci/machine-docs/REVIEW-1e.md

8.4 KiB
Raw Blame History

REVIEW-1e — Adversary verdicts (Phase 1e: generic-harness corrections)

Adversary-owned, append-only. Phase plan: /srv/cc-ci/cc-ci-plan/plan-phase1e-harness-corrections.md. Definition of Done = HC1HC4 each cold-verified PASS here (handshake per plan.md §6.1).

Definition-of-Done tracker

  • HC1 — Upgrade tier upgrades to PR head (prev published → PR-head via abra app deploy --chaos), not a published tag; moved-assertion adapted; DG4.1 deploy-count guard reconciled.
  • HC2 — Repo-local (PR-authored) test_*.py / install_steps.sh NOT executed unless recipe is on the cc-ci approval allowlist (default-deny). PASS @2026-05-28 (E0, commit c7ae296).
  • HC3 — Generic runs by default alongside an overlay (additive); skipped only via explicit opt-out; op runs once. PASS @2026-05-28 (E1 re-claim, fix commit 6eabfdc).
  • HC4 — No regression: D1D10 / DG1DG8 re-verified cold; deploy-once (DG4.1) holds; teardown sacred; three new behaviors demonstrated.

Maps to Builder milestones: E0=HC2, E1=HC3, E2=HC1, E3=HC4+docs.

Cold-start access (re-verified each phase)

  • @2026-05-28 — ssh cc-ci OK (NixOS 24.11), dashboard HTTP 200 via SOCKS proxy 127.0.0.1:1055. Proxy/SSH path healthy.

Verdicts

E0 / HC2 — repo-local trust gate (default-deny) — PASS @2026-05-28

Builder claim (STATUS-1e, commit c7ae296 / feat d38a695): repo-local (PR-authored) test_*.py/install_steps.sh/ops.py consulted only for recipes on tests/repo-local-approved.txt (empty ⇒ deny); centralized _gated() in discovery.py; 8 unit tests pass.

Cold verification (own clone HEAD=c7ae296, shipped to cc-ci, run via cc-ci-run):

  1. Unit suite, independent run: cd /tmp/adv-1e && cc-ci-run -m pytest tests/unit -v8 passed in 0.06s (incl. repo-local-ignored-when-unapproved / wins-when-approved for overlay+custom+install_steps+pre_op, and default-allowlist-is-empty).
  2. My own break-it probe (hc2_probe.py, planted a HOSTILE repo-local install_steps.sh rm -rf / + ops.py os.system('id') + test_install.py):
    • real checked-in allowlist → approved_recipes() == set() (default-deny).
    • real-defaultapproved=False, overlay falls back to cc-ci, install_steps=None, pre_op=None (hostile repo-local code NOT selected).
    • lone *DENY (not a wildcard, as the file header promises).
    • only-comment / whitespace lines → DENY.
    • approving a different recipe (hedgedoc) → custom-html still DENY (no leak).
    • custom-html listed → approved=True, overlay/install_steps/pre_op all flip to repo-local.
  3. No bypass: every execution path in runner/run_recipe_ci.py routes through gated discovery.* (resolve_opresolve_overlay_op, custom_tests, install_steps→lifecycle hook). snapshot_recipe_tests reads the repo-local dir ungated but only copies it (discover), never executes — matches the plan's "discovered-but-NOT-executed". pre_op_hook not yet wired into the orchestrator (E1/HC3 work); its discovery fn is already gated.

Verdict: PASS — default-secure, centralized gate, flips only on explicit per-recipe approval; hostile repo-local code provably not executed under the shipped default. No finding. Note (not a defect): orchestrator still uses single-file override resolve_op (1d semantics); the additive generic floor (HC3) is E1 in-flight — will re-check the gate survives the HC3 refactor.

E1 / HC3 — additive generic + op/assertion split — FAIL (PASS WITHHELD) @2026-05-28

Builder claim (STATUS-1e gate, commit b7e6cbd): generic runs additively alongside overlays; orchestrator owns each op (once); opt-out via CCCI_SKIP_GENERIC[_<OP>]/recipe_meta.SKIP_GENERIC; deploy-count stays 1; two e2e (default + opt-out) "clean."

Cold verification (own clone HEAD=b7e6cbd shipped to cc-ci /tmp/adv-1e, run via cc-ci-run):

  • Structure (PASS): read the refactor — run_lifecycle_tier performs the op ONCE (_perform_opgeneric.perform_{upgrade,backup,restore}, none call deploy_app), then runs generic (unless _skip_generic) + overlay as separate pytests vs the shared post-op state. Generic+overlay test files are assertion-only; seeding moved to ops.py pre_<op>. assert_upgraded keeps the non-vacuous move check (F1d-2). _record_deploy() lives only in deploy_app.
  • Default e2e (custom-html, all stages): EVERY tier ran BOTH assert (generic) AND assert (cc-ci); pre_upgrade/backup/restore seeds fired; deploy-count=1; install/upgrade/backup/ restore all PASS; custom=skip; clean teardown (no leftover stack/volume). ✓ additive confirmed.
  • Opt-out e2e (CCCI_SKIP_GENERIC=1): generic skipped on every tier (0 _generic/ files ran), overlay-only, deploy-count=1 ✓ — but backup=FAIL: test_backup_captures_stateAssertionError: '' == 'original'. Same code/recipe; only diff is the opt-out flag.

Interim verdict (commit 4334e19): FAIL — opt-out flipped backup RED, theorised cause was the opt-out path removing an accidental ~1s generic-pytest timing buffer. Filed F1e-1.

CORRECTION @2026-05-28 (isolated repro disproved the opt-out theory)

Isolated, no-concurrency repro of STAGES=install,backup,restore on custom-html:

  • opt-out × 3 (CCCI_SKIP_GENERIC=1): backup PASS, restore PASS, deploy-count=1. 3/3.
  • default × 1: backup PASS, restore PASS, deploy-count=1.

So opting out of the generic is NOT what flips the backup RED — the original symptom occurred while the Builder was running concurrent custom-html e2e on the same node. The real trigger is load / concurrency putting the post-backup container cycle into a window where exec_in_app's docker exec fails. The static defect stays the same (and the fix direction in F1e-1 is still correct): exec_in_app silently returns empty stdout on a failed exec (returncode ignored) + no readiness retry. F1e-1 reframed in BACKLOG-1e; my earlier "opt-out is not behavior-neutral" framing is withdrawn.

Builder's fix (commit 6eabfdc) — verification pending

exec_in_app now polls (re-resolves container + re-execs) until rc==0 or 90s, then raises — never masks a failed exec as empty data. No assertion weakened. Same commit also lands HC1 plumbing (chaos_redeploy, recipe_head_commit, .chaos-version parsing in deployed_identity, head_ref match in assert_upgraded) — out-of-scope for this re-verification, will check at E2 claim.

Fix verified cold @2026-05-28 (own clone HEAD=6eabfdc shipped to /tmp/adv-fix): CCCI_SKIP_GENERIC=1 RECIPE=custom-html STAGES=install,backup,restore cc-ci-run runner/run_recipe_ci.py → install/backup/restore all PASS, deploy-count=1, generic skipped on every tier (overlay-only), clean teardown (no leftover stack/volume). The exec_in_app poll+raise is structurally watertight: re-resolves the container each try, raises on persistent failure — no silent-empty data path remains; a real exec failure becomes a real test failure rather than an '' == 'original' false-RED. F1e-1 closed by Adversary @2026-05-28 (BACKLOG-1e).

Final E1/HC3 verdict — PASS @2026-05-28 (re-claim commit e75ec1b; fix commit 6eabfdc)

Cold-verified: (1) additive — every lifecycle tier runs both assert (generic) and assert (cc-ci) on the shared post-op deployment (default run, all stages PASS); (2) opt-out — CCCI_SKIP_GENERIC=1 skips the generic on every tier with 0 _generic/ files run and overlay-only, deploy-count=1; (3) op-once — op primitives perform_{upgrade,backup,restore} never call deploy_app, deploy-count stays 1 in both modes; (4) assertion-only overlays — no double-op risk; (5) no assertion weakened — assert_upgraded keeps the non-vacuous move check (F1d-2 honored). HC2 gate survives the refactor. Open robustness item: F1e-2 (recipe-fetch concurrency race) — pre-existing, orthogonal, tracked for HC4.

Separate observation while testing (NOT F1e-1)

A controlled 2-concurrent same-recipe test (PR=8001/PR=8002, both custom-html) on the OLD code showed run-a die in abra recipe fetch custom-html -n (rc=1) — concurrent rm-rf + abra-fetch on the same ~/.abra/recipes/custom-html collide. Pre-existing (in 1d too), orthogonal to E1/HC3, not the F1e-1 trigger. Filing separately as F1e-2 [adversary] for HC4 visibility (§6 D-gate requires concurrent runs to be safe). Drone caps MAX_TESTS=1-2 today, so practical impact is bounded.