# cc-ci Phase 2b — Confirm the test sequence minimizes deploys (no redundant deploys)

**Status:** QUEUED — starts after Phase 2 (`plan-phase2-recipe-tests.md`) reaches `## DONE`, before Phase 3.
**Transition:** manual (operator kicks it off).
**Owner:** Builder + Adversary loops.
**This file:** `/srv/cc-ci/cc-ci-plan/plan-phase2b-test-performance.md`

---

## 0. Scope (NARROWED — operator, 2026-05-30)

The original Phase 2b was a broad empirical performance program (instrument → baseline → attribute → optimize). **That has been removed and parked in `IDEAS.md`** ("Phase-2b empirical performance work").

**Why:** the real deploy-speed bottleneck was **hardware**, not software — the cc-ci VM was **2 vCPU on a 4-core host** and **disk-I/O-bound** (load ~8, io pressure ~65%), with warm-keycloak (JVM) + all infra resident; RAM was never the constraint. That was fixed **directly**: cc-nix-test bumped to **4 vCPU** and made the **only running VM** on b1 (full host CPU). The software micro-optimizations are judged unlikely to be worth the effort and are deferred to IDEAS, to be revisited only if measurement later proves a specific software bottleneck.

**So Phase 2b is reduced to ONE thing:** confirm the per-recipe test sequence already uses the **minimum number of deploys** — and fix it if it doesn't — **without weakening any test**. (Operator's expectation: we have probably already done this via the deploy-once / deploy-sharing design.)

## 1. Mission

Verify that a recipe's full test sequence does **not** redeploy more than necessary, and document the deploy budget. Reuse a single deployment across the stages that can safely share one; only deploy again where a stage genuinely requires a distinct deployment.

## 2. Definition of Done (Adversary cold-verifies → REVIEW.md)

- [ ] **B1 — Deploy budget is documented and minimal.** Write down, per recipe run, exactly how many `abra app deploy`/`upgrade` cycles happen and why each is necessary. Expected minimum:
  - **one** base deploy shared by **install + functional/custom + backup→restore** (restore redeploys onto the same app only as the restore mechanism itself requires);
  - **one** additional prior-version deploy **only** for the **upgrade** tier (old→new is the whole point of that tier);
  - **one** deploy per declared **dependency** (e.g. an SSO provider), deployed once and reused.

  i.e. `deploys == 1 (base) + 1 (upgrade tier) + N_deps` — no extra/redundant redeploys.
- [ ] **B2 — Enforced, not just claimed.** The harness already emits a deploy count and fails on a mismatch (the DG4.1 `deploy-count != expected` check + the `RUN SUMMARY` `deploy-count` line) — point to that as the enforcement and confirm `expected_deploy_count` reflects the minimal budget in B1. If any recipe exceeds it, **remove the redundant deploy** (e.g. collapse a needless re-deploy between install and functional) and re-verify.
- [ ] **B3 — No test weakened to save a deploy.** Every stage still runs its real assertions and real isolation/teardown; sharing a deployment must not skip or soften any check. Adversary confirms from a cold start that suite coverage is unchanged — only the deploy count is reduced/confirmed.
- [ ] **B4 — Recorded.** A short note (`docs/perf/deploys.md` or DECISIONS.md) states the confirmed per-recipe deploy budget and that it is minimal. If it was already minimal, say so explicitly (the likely outcome); if a redundant deploy was removed, record before/after counts.

When B1–B4 hold and are Adversary-verified, write `## DONE` to Phase-2b `STATUS.md`.

## 3. Method

1. Read `run_recipe_ci.py`/harness: trace every `abra app deploy`/`abra app upgrade` call across the stage sequence; count them; map each to a stage and a justification.
2. Compare to the minimal budget (B1). The existing `deploy-count`/`expected_deploy_count` logic is the reference — verify it equals the minimum and that runs pass it.
3. If over budget on any recipe, eliminate the redundant deploy **without** changing what's tested; re-run the full suite (Adversary cold-verifies green + isolation intact).
4. If already minimal, document the confirmation and finish — do NOT add speculative perf changes (those live in IDEAS).

## 4. Guardrails

- **Correctness first:** never weaken/skip/soften a test or break isolation/teardown to cut a deploy.
- **Bounded:** this phase ONLY confirms/fixes deploy count. Any other perf idea → `IDEAS.md` ("Phase-2b empirical performance work"); do not re-import them here.
- **Real abra path** throughout (no docker-level shortcuts).
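
## 5. Illustrative sketch: budget arithmetic (non-normative)

The B1 formula `deploys == 1 (base) + 1 (upgrade tier) + N_deps` and the B2 mismatch check can be sketched as below. This is a minimal illustration under stated assumptions, not the harness's actual code: the function names `minimal_deploy_budget`, `redundant_deploys`, and `check_deploy_count` are hypothetical, and the sketch does not model how the harness counts any redeploy that the restore mechanism itself requires. The real reference remains the existing `deploy-count`/`expected_deploy_count` logic.

```python
def minimal_deploy_budget(n_deps: int) -> int:
    """Return the B1 minimum: 1 base deploy + 1 upgrade-tier deploy + one per dependency.

    Hypothetical helper for illustration; not part of run_recipe_ci.py.
    """
    if n_deps < 0:
        raise ValueError("n_deps must be non-negative")
    base = 1          # shared by install + functional/custom + backup->restore
    upgrade_tier = 1  # prior-version deploy, needed only for the old->new tier
    return base + upgrade_tier + n_deps


def redundant_deploys(observed: int, n_deps: int) -> int:
    """Number of deploys above the B1 minimum (negative means fewer than expected)."""
    return observed - minimal_deploy_budget(n_deps)


def check_deploy_count(recipe: str, observed: int, n_deps: int) -> None:
    """Fail on any mismatch, mirroring the spirit of the `deploy-count != expected` check."""
    expected = minimal_deploy_budget(n_deps)
    if observed != expected:
        raise AssertionError(
            f"{recipe}: deploy-count {observed} != expected {expected} "
            f"(difference {observed - expected})"
        )


if __name__ == "__main__":
    # A recipe with one declared dependency (e.g. an SSO provider): budget is 3.
    print(minimal_deploy_budget(1))
    check_deploy_count("example-recipe", observed=3, n_deps=1)  # passes silently
```

A mismatch in either direction is treated as a failure here: more deploys than the budget points at a redundant redeploy (Method step 3), while fewer suggests a stage was skipped, which B3 forbids.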