REVIEW — Phase 2b (Adversary) — confirm minimal deploy budget
Phase plan (SSOT): /srv/cc-ci/cc-ci-plan/plan-phase2b-test-performance.md
Loop state for THIS phase: STATUS-2b / BACKLOG-2b / REVIEW-2b / JOURNAL-2b (DECISIONS.md shared).
Phase 1*/2 STATUS/BACKLOG/REVIEW files are other phases' state — not this phase's.
Standing state
- No Phase-2b gate CLAIMED yet. As of @2026-05-31T05:33Z there is no STATUS-2b.md, no docs/perf/deploys.md/DECISIONS Phase-2b note, and no B1–B4 claim. The Builder is still finishing Phase 2 (plausible Q4.7b + drone Q4.10 + Q5; Phase-2 STATUS not yet ## DONE).
- Queue dependency (plan §0 / status line): Phase 2b is documented as starting after Phase 2 reaches ## DONE. Operator kicked off the Phase-2b Adversary loop now (manual transition). Phase-2b DoD (B1–B4) is independent of Phase-2 completion (it is a property of the already-existing harness), so the cold analysis below can be done now; the formal verdict awaits the Builder's claim.
- No VETO from this phase. (The standing Phase-2 DONE VETO lives in REVIEW-2.md and is unaffected.)
Pre-claim independent cold analysis (anti-anchoring baseline) @2026-05-31T05:33Z
Done from a cold read of the harness ONLY (code + git), with NO Builder narrative consulted — this is my own minimal-budget expectation, to be compared against whatever the Builder later claims.
Deploy call sites (every lifecycle.deploy_app = one abra app new = one counted deploy)
_record_deploy() (lifecycle.py:107) is invoked ONLY from inside deploy_app (lifecycle.py:211), so
the run's deploy-count == number of deploy_app calls during the run. Call sites:
1. run_recipe_ci.py:819 is the single base deploy of the recipe under test, with version=base where base = UPGRADE_BASE_VERSION-or-previous if "upgrade" in stages else target. Shared by ALL tiers.
2. runner/harness/deps.py:100 makes one deploy per COLD declared dependency (warm/live deps deploy 0; they only get a per-run realm).
3. run_recipe_ci.py:699 is the WC5 promote-on-green-cold reseed. It is NOT part of the test sequence and NOT counted: at line 697 the run pops CCCI_DEPLOY_COUNT_FILE (countfile already asserted+removed at 958–961) before this deploy. It is a post-run, green-cold-only canonical warm-cache reseed.
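The counting rule above can be sketched as follows. This is an illustration, not the harness code: only the names deploy_app, _record_deploy and CCCI_DEPLOY_COUNT_FILE come from the analysis above. The count-file format (one line per deploy), the function bodies, and the demo app names are assumptions. Unlike the real run, the demo keeps the count file around after the env var is popped, purely so it can show the file is unchanged by the reseed.

```python
import os
import tempfile

COUNT_ENV = "CCCI_DEPLOY_COUNT_FILE"  # env var name taken from the review


def _record_deploy(app: str) -> None:
    """Append one line per deploy to the count file, if one is configured.

    Assumption: one line per deploy; the real file format may differ.
    """
    path = os.environ.get(COUNT_ENV)
    if path is None:
        return  # counting disabled, as during the post-run reseed
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(app + "\n")


def deploy_app(app: str) -> None:
    """Stand-in for lifecycle.deploy_app, the only caller of _record_deploy."""
    _record_deploy(app)
    # The real function would run `abra app new` here.


def read_deploy_count(path: str) -> int:
    """Count recorded deploys (one line each, per the assumption above)."""
    with open(path, encoding="utf-8") as fh:
        return sum(1 for _ in fh)


def demo() -> int:
    """Run call sites 1-3 in order and return the counted deploys."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "deploy-count")
        os.environ[COUNT_ENV] = path
        try:
            deploy_app("recipe-base")        # call site 1: base deploy
            deploy_app("cold-dep")           # call site 2: one cold dependency
            count = read_deploy_count(path)
            os.environ.pop(COUNT_ENV)        # popped before the reseed
            deploy_app("warm-cache-reseed")  # call site 3: not recorded
            assert read_deploy_count(path) == count
            return count
        finally:
            os.environ.pop(COUNT_ENV, None)
```

Because the counter lives inside deploy_app itself, no caller can deploy without being counted unless the env var has been removed first, which is exactly what distinguishes call site 3 from the other two.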
Tiers that do NOT add a deploy (deploy-sharing — the heart of the budget)
_perform_op (run_recipe_ci.py:242, docstring 246–251 explicit): "None of these call deploy_app, so
the deploy-count guard (DG4.1) stays 1."
- upgrade → generic.perform_upgrade = in-place abra app deploy --force --chaos to PR-head (HC1 reconciliation, real old→new crossover); reuses the base deploy, no new abra app new.
- backup / restore → operate on the same live deployment.
- install → has no op (assertion-only on the base deploy).
- custom / OIDC wiring → in-place --chaos redeploy (_run_setup_custom_tests_hook), not counted.
Enforcement (B2)
run_recipe_ci.py:958–1010 reads the countfile into deploy_count and computes
expected_deploy_count = 1 + deps_deployed_count (deps_deployed = cold deps only; warm excluded,
984/982). It prints RUN SUMMARY → deploy-count = N (expect M). If deploy_count != expected, it sets
overall = 1 and writes !! deploy-count N != M (DG4.1 violation) to stderr. So a redundant deploy_app
ANYWHERE in the sequence fails the run. This is a genuine, non-vacuous guard.
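A minimal sketch of the check described above, assuming the guard reduces to a single comparison. The function name check_deploy_count and its signature are hypothetical; the expected-count formula and the two message formats are taken from this section.

```python
import sys


def check_deploy_count(deploy_count: int, cold_deps_deployed: int) -> int:
    """Return 0 if the run used exactly the expected deploys, else 1.

    Hypothetical helper: expected = 1 base deploy + one per COLD dependency.
    Warm dependencies are excluded because they deploy nothing.
    """
    expected = 1 + cold_deps_deployed
    print(f"deploy-count = {deploy_count} (expect {expected})")
    if deploy_count != expected:
        print(
            f"!! deploy-count {deploy_count} != {expected} (DG4.1 violation)",
            file=sys.stderr,
        )
        return 1
    return 0
```

The comparison is an equality rather than an upper bound, so under this sketch a missing deploy fails the run just as a redundant one does.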
My independent minimal-budget conclusion
Per-recipe test sequence: deploys == 1 (base, shared by install+upgrade+backup+restore+custom) + N_cold_deps, enforced by DG4.1. This is MINIMAL — and tighter than B1's stated expectation of
1 (base) + 1 (upgrade tier) + N_deps: the upgrade tier needs NO separate deploy because the base
deploy IS the prior version and the upgrade is an in-place chaos reconcile. So B1's stated minimum is
conservative; the implementation already beats it. Nothing to remove — already minimal.
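The comparison between the two budgets can be written out as arithmetic. Both function names are hypothetical, and the worst case of every declared dependency being cold is an assumption made only so the two formulas share one input.

```python
def implemented_budget(n_cold_deps: int) -> int:
    """Deploys in the test sequence as implemented: base + cold deps."""
    return 1 + n_cold_deps


def b1_stated_budget(n_deps: int) -> int:
    """B1's stated expectation: base + separate upgrade deploy + deps."""
    return 1 + 1 + n_deps


def saving(n_deps: int) -> int:
    """Deploys saved versus B1's figure, assuming every dependency is cold."""
    return b1_stated_budget(n_deps) - implemented_budget(n_deps)
```

With all dependencies cold the saving is exactly one deploy, the upgrade tier's. Any warm dependency widens the gap further, since warm deps deploy 0.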
Open item for the Builder's B1/B4 doc (must be addressed honestly, not a defect yet)
The B1 doc must NOT claim "exactly 1+N_deps deploys per run, full stop" without noting the WC5
green-cold reseed (call site 3): on a green COLD run there is one additional uncounted abra app new
for canonical warm-cache maintenance. It is outside the test-sequence budget and is not redundant, but
B1 asks for "exactly how many deploy cycles happen and why each is necessary" — the doc must mention it
or it is materially incomplete. I will check the doc for this when claimed.
Verdicts
(none yet — awaiting B1–B4 claim in STATUS-2b.md)