# REVIEW — Phase 2b (Adversary) — confirm minimal deploy budget
**Phase plan (SSOT):** `/srv/cc-ci/cc-ci-plan/plan-phase2b-test-performance.md`
**Loop state for THIS phase:** STATUS-2b / BACKLOG-2b / REVIEW-2b / JOURNAL-2b (DECISIONS.md shared).
Phase 1*/2 STATUS/BACKLOG/REVIEW files are other phases' state — not this phase's.
## Standing state
- **No Phase-2b gate CLAIMED yet.** As of @2026-05-31T05:33Z there is no STATUS-2b.md, no
`docs/perf/deploys.md`/DECISIONS Phase-2b note, and no B1-B4 claim. The Builder is still finishing
Phase 2 (plausible Q4.7b + drone Q4.10 + Q5; Phase-2 STATUS not yet `## DONE`).
- **Queue dependency (plan §0 / status line):** Phase 2b is documented as starting *after* Phase 2
reaches `## DONE`. Operator kicked off the Phase-2b Adversary loop now (manual transition). Phase-2b
DoD (B1-B4) is independent of Phase-2 completion — it is a property of the already-existing harness —
so the cold analysis below can be done now; the formal verdict awaits the Builder's claim.
- No VETO from this phase. (The standing Phase-2 DONE VETO lives in REVIEW-2.md and is unaffected.)
## Pre-claim independent cold analysis (anti-anchoring baseline) @2026-05-31T05:33Z
Done from a cold read of the harness ONLY (code + git), with NO Builder narrative consulted — this is
my own minimal-budget expectation, to be compared against whatever the Builder later claims.
### Deploy call sites (every `lifecycle.deploy_app` = one `abra app new` = one counted deploy)
`_record_deploy()` (lifecycle.py:107) is invoked ONLY from inside `deploy_app` (lifecycle.py:211), so
the run's deploy-count == number of `deploy_app` calls during the run. Call sites:
1. `run_recipe_ci.py:819`: **the single base deploy** of the recipe under test. `version=base` where
`base = UPGRADE_BASE_VERSION-or-previous if "upgrade" in stages else target`. Shared by ALL tiers.
2. `runner/harness/deps.py:100`: **one deploy per COLD declared dependency** (warm/live deps deploy 0;
they only get a per-run realm).
3. `run_recipe_ci.py:699`: **WC5 promote-on-green-cold reseed** — NOT part of the test sequence and
NOT counted: at line 697 the run pops `CCCI_DEPLOY_COUNT_FILE` (countfile already asserted+removed
at 958-961) before this deploy. It is a post-run, green-cold-only canonical warm-cache reseed.
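
A minimal sketch of the counting mechanism as I read it, not the harness code itself. Only the env var name `CCCI_DEPLOY_COUNT_FILE` and the "counter lives inside `deploy_app`" property come from the source; the one-line-per-deploy file format, `record_deploy`, and `simulate_run` are hypothetical stand-ins.

```python
import os
import tempfile

COUNT_ENV = "CCCI_DEPLOY_COUNT_FILE"  # env var name as quoted in this review


def record_deploy() -> None:
    """Hypothetical stand-in for _record_deploy: append one line per deploy."""
    path = os.environ.get(COUNT_ENV)
    if not path:
        return  # no countfile configured, so this deploy is not counted
    with open(path, "a", encoding="utf-8") as fh:
        fh.write("deploy\n")


def deploy_app(app: str) -> None:
    """Stand-in for lifecycle.deploy_app; the only caller of record_deploy."""
    record_deploy()
    # The real function would run `abra app new` here; omitted in this sketch.


def simulate_run(cold_deps: int) -> int:
    """Simulate one green cold run and return the number of counted deploys."""
    fd, path = tempfile.mkstemp(prefix="deploy-count-")
    os.close(fd)
    os.environ[COUNT_ENV] = path
    try:
        for i in range(cold_deps):
            deploy_app(f"dep-{i}")        # call site 2: one per cold dependency
        deploy_app("recipe-under-test")   # call site 1: the single base deploy
        with open(path, encoding="utf-8") as fh:
            counted = sum(1 for _ in fh)
        os.environ.pop(COUNT_ENV, None)   # mirrors the pop before the reseed
        deploy_app("warm-cache-reseed")   # call site 3: runs, but is not counted
        return counted
    finally:
        os.environ.pop(COUNT_ENV, None)
        os.remove(path)
```

Under these assumptions `simulate_run(2)` yields 3 counted deploys even though `deploy_app` ran four times, which is the counted/uncounted split described in call sites 1 to 3.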
### Tiers that do NOT add a deploy (deploy-sharing — the heart of the budget)
`_perform_op` (run_recipe_ci.py:242, docstring 246-251 explicit): "None of these call deploy_app, so
the deploy-count guard (DG4.1) stays 1."
- **upgrade** → `generic.perform_upgrade` = in-place `abra app deploy --force --chaos` to PR-head
(HC1 reconciliation, real old→new crossover) — reuses the base deploy, no new `app new`.
- **backup / restore** → operate on the same live deployment.
- **install** → has no op (assertion-only on the base deploy).
- **custom / OIDC wiring** → in-place `--chaos` redeploy (`_run_setup_custom_tests_hook`), not counted.
### Enforcement (B2)
`run_recipe_ci.py:958-1010`: reads countfile → `deploy_count`; computes
`expected_deploy_count = 1 + deps_deployed_count` (deps_deployed = cold deps only; warm excluded,
984/982). Prints `RUN SUMMARY → deploy-count = N (expect M)`. If `deploy_count != expected` →
`overall = 1` + stderr `!! deploy-count N != M (DG4.1 violation)`. So a redundant `deploy_app` ANYWHERE
in the sequence fails the run. This is a genuine, non-vacuous guard.
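
The guard logic above, sketched under assumptions. The function name `check_deploy_budget` and its signature are hypothetical; the formula and the message wording follow what I read in the source, with `->` standing in for the arrow in the summary line.

```python
import sys


def check_deploy_budget(deploy_count: int, deps_deployed_count: int) -> int:
    """Return 0 if the run stayed on budget, 1 on a DG4.1 violation.

    deps_deployed_count is the number of COLD dependencies deployed in this
    run; warm dependencies are excluded because they deploy nothing.
    """
    expected = 1 + deps_deployed_count  # one base deploy plus cold deps
    print(f"RUN SUMMARY -> deploy-count = {deploy_count} (expect {expected})")
    if deploy_count != expected:
        print(
            f"!! deploy-count {deploy_count} != {expected} (DG4.1 violation)",
            file=sys.stderr,
        )
        return 1
    return 0
```

The check is an equality, not an upper bound, so a missing deploy fails the run just as a redundant one does.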
### My independent minimal-budget conclusion
Per-recipe test sequence: **`deploys == 1 (base, shared by install+upgrade+backup+restore+custom) +
N_cold_deps`**, enforced by DG4.1. This is **MINIMAL — and tighter than B1's stated expectation** of
`1 (base) + 1 (upgrade tier) + N_deps`: the upgrade tier needs NO separate deploy because the base
deploy IS the prior version and the upgrade is an in-place chaos reconcile. So B1's stated minimum is
conservative; the implementation already beats it. Nothing to remove — already minimal.
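
The comparison as arithmetic, in a short sketch. Both function names are hypothetical, and I assume every declared dependency is cold so that `N_deps` and `N_cold_deps` coincide.

```python
def b1_stated_budget(n_deps: int, has_upgrade_tier: bool = True) -> int:
    """B1's stated expectation: base + separate upgrade deploy + deps."""
    return 1 + (1 if has_upgrade_tier else 0) + n_deps


def implemented_budget(n_cold_deps: int) -> int:
    """What DG4.1 enforces: the base deploy is shared by every tier."""
    return 1 + n_cold_deps


def saving(n_deps: int) -> int:
    """Deploys saved versus B1's expectation when all deps are cold."""
    return b1_stated_budget(n_deps) - implemented_budget(n_deps)
```

With an upgrade tier present the saving is one deploy regardless of the dependency count; warm dependencies would widen the gap further, since they deploy nothing.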
### Open item for the Builder's B1/B4 doc (must be addressed honestly, not a defect yet)
The B1 doc must NOT claim "exactly 1+N_deps deploys per run, full stop" without noting the **WC5
green-cold reseed** (call site 3): on a green COLD run there is one additional uncounted `abra app new`
for canonical warm-cache maintenance. It is outside the test-sequence budget and is not redundant, but
B1 asks for "exactly how many deploy cycles happen and why each is necessary" — the doc must mention it
or it is materially incomplete. I will check the doc for this when claimed.
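
What I expect the doc's accounting to distinguish, as a sketch. The function name `app_new_invocations` and the returned keys are hypothetical; the rule that the reseed happens only on a green cold run is from the source.

```python
def app_new_invocations(n_cold_deps: int, green: bool, cold: bool) -> dict:
    """Split `abra app new` invocations into counted and uncounted."""
    counted = 1 + n_cold_deps                 # test-sequence budget (DG4.1)
    reseed = 1 if (green and cold) else 0     # WC5 warm-cache reseed, uncounted
    return {
        "counted": counted,
        "uncounted_reseed": reseed,
        "total": counted + reseed,
    }
```

A doc that reports only `counted` is correct about the test-sequence budget but understates `total` on green cold runs.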
## Verdicts
_(none yet — awaiting B1-B4 claim in STATUS-2b.md)_