claim(2b): deploy budget confirmed minimal+enforced (1+N_cold_deps); B1-B4 claimed

Phase 2b confirm-and-document outcome: per-recipe test-sequence deploy budget is
already minimal — `deploys == 1 (base, shared by all 5 tiers) + N_cold_deps` — and
tighter than plan B1's nominal `1+1(upgrade)+N` because the upgrade is an in-place
chaos redeploy of the prev-version base, not a separate deploy. Enforced as a hard
failure by DG4.1 (expected = 1 + deps_deployed_count, run_recipe_ci.py:1005-1010).
No redundant deploy found; none removed (none existed).

- docs/perf/deploys.md: the budget record (B4), names the out-of-budget WC5 reseed
- STATUS-2b.md: B1-B4 claim with WHAT/HOW/EXPECTED/WHERE for cold verify
- JOURNAL-2b.md / BACKLOG-2b.md / DECISIONS.md: reasoning + settled note
- consume machine-docs/BUILDER-INBOX.md (Adversary heads-up processed)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
autonomic-bot
2026-05-31 05:35:46 +00:00
parent 5f37de69e3
commit edf34e3e53
6 changed files with 267 additions and 25 deletions

docs/perf/deploys.md (new file, 90 lines)

@@ -0,0 +1,90 @@
# Per-recipe deploy budget (Phase 2b)
**Question:** does a recipe's full CI test sequence redeploy more than necessary?
**Answer:** No. The budget is already minimal — and in fact tighter than the nominal
`1 base + 1 upgrade + N_deps` — because the upgrade tier shares the base deployment.
## The budget
For one cold `!testme`/`run_recipe_ci.py` run of a recipe:
```
deploys == 1 (base) + N_cold_deps
```
- **1 base deploy**, shared by **install → upgrade → backup → restore → custom/functional**.
All five tiers run against this single deployment. (`run_recipe_ci.py:819`,
`lifecycle.deploy_app` → `_record_deploy`.)
- **+ 1 per COLD declared dependency** (e.g. an SSO provider deployed in-run), each deployed
**once** and reused (`deps.py:81-120`, one `deploy_app` per dep). A **live-warm** dep
(e.g. a resident keycloak that only gets a per-run realm, not a fresh deploy) contributes **0**.
- The **upgrade tier adds NO deploy.** When the upgrade tier runs, the *base* deploy is done at
the **previous published version** (`run_recipe_ci.py:746-754`: `base = prev or target`), and the
upgrade is an **in-place `abra app deploy --chaos`** redeploy of the PR-head code onto that same
running app (`generic.perform_upgrade` → `lifecycle.chaos_redeploy`). `chaos_redeploy` does **not**
call `deploy_app`, so it is **not counted** — and it is the *real* upgrade the PR's changes are
exercised by (HC1), verified by `assert_upgraded` on the chaos-version label.
- **backup and restore add NO deploy.** They operate on the same running app
(`perform_backup`/`perform_restore` → `backup_app`/`restore_app`); neither calls `deploy_app`.
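As a rough illustration of the budget rule above, the following sketch computes the expected deploy count from a list of declared dependencies. The `Dep` class and `expected_deploys` function are hypothetical names introduced here for illustration; they are not the harness's actual API, and the dependency lists are examples only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dep:
    """Hypothetical declared dependency; `warm` means a resident instance is reused."""
    name: str
    warm: bool = False


def expected_deploys(deps: list[Dep]) -> int:
    """Budget for one cold run: 1 shared base deploy + 1 per cold dependency."""
    cold = [d for d in deps if not d.warm]
    return 1 + len(cold)


# The three recipe shapes discussed in this document (dep lists are illustrative).
print(expected_deploys([]))                            # no deps
print(expected_deploys([Dep("keycloak")]))             # one cold dep
print(expected_deploys([Dep("keycloak", warm=True)]))  # one live-warm dep
```

Upgrade, backup, and restore do not appear in the formula at all, which mirrors the point that those tiers reuse the base deployment.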
### Reconciliation with the plan's nominal budget
Plan B1 states the nominal minimum as `1 (base) + 1 (upgrade tier) + N_deps`, assuming the upgrade
tier needs its own prior-version deploy. The cc-ci design is **stricter**: the base deploy *is* the
prior-version deploy (when upgrade runs), and the upgrade is performed **in place**. So the
prior-version deploy and the base deploy are the **same** deploy — there is no separate upgrade
deploy. Net actual budget: `1 + N_cold_deps`. This is the deploy-sharing the operator expected.
## Enforcement (not just claimed)
The harness counts every `deploy_app()` (the only caller of `_record_deploy`, `lifecycle.py:107-211`)
into a per-run countfile and **hard-fails** on a mismatch:
- `expected_deploy_count = 1 + deps_deployed_count` at `run_recipe_ci.py:984`
(`deps_deployed_count` excludes warm deps, `:982-983`).
- RUN SUMMARY prints `deploy-count = N (expect M)` at `run_recipe_ci.py:986`.
- `if deploy_count != expected_deploy_count: … overall = 1` (DG4.1 violation, non-zero exit) at
`run_recipe_ci.py:1005-1010`.
So every green run is a *proof* that the recipe stayed within budget: a redundant redeploy would
push `deploy_count` above `expected` and turn the run red. No recipe can silently exceed the budget.
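The counting-and-guard mechanism described above might look roughly like the sketch below. It is a simplified model, not the harness source: the function names are hypothetical, and the countfile format (one line per deploy) is an assumption. Only the `CCCI_DEPLOY_COUNT_FILE` variable name, the `1 + deps_deployed_count` formula, and the summary line shape come from this document.

```python
import os
import tempfile

COUNT_ENV = "CCCI_DEPLOY_COUNT_FILE"  # variable name taken from this document


def record_deploy() -> None:
    """Append one line to the countfile, if one is configured (assumed format)."""
    path = os.environ.get(COUNT_ENV)
    if path:
        with open(path, "a", encoding="utf-8") as fh:
            fh.write("deploy\n")


def read_deploy_count(path: str) -> int:
    """Count recorded deploys: one non-empty line per deploy."""
    with open(path, encoding="utf-8") as fh:
        return sum(1 for line in fh if line.strip())


def check_budget(deploy_count: int, deps_deployed_count: int) -> int:
    """Return the exit status: 0 within budget, 1 on a DG4.1-style mismatch."""
    expected = 1 + deps_deployed_count
    print(f"deploy-count = {deploy_count} (expect {expected})")
    return 0 if deploy_count == expected else 1


# Demo: one base deploy plus one cold dep, then a simulated redundant redeploy.
with tempfile.TemporaryDirectory() as tmp:
    countfile = os.path.join(tmp, "deploy-count")
    os.environ[COUNT_ENV] = countfile
    record_deploy()  # base
    record_deploy()  # cold dep
    within = check_budget(read_deploy_count(countfile), deps_deployed_count=1)
    record_deploy()  # redundant redeploy
    over = check_budget(read_deploy_count(countfile), deps_deployed_count=1)
    os.environ.pop(COUNT_ENV, None)
```

In the demo, `within` is the status of the in-budget run and `over` is the status after the extra deploy; the second check is the case that would turn a run red.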
### Verify from a cold clone
```
RECIPE=ghost STAGES=install,upgrade,backup,restore,custom cc-ci-run runner/run_recipe_ci.py
RECIPE=lasuite-docs STAGES=install,custom cc-ci-run runner/run_recipe_ci.py
```
Expected RUN SUMMARY lines:
- no-dep recipe (ghost): `deploy-count = 1 (expect 1)`, all tiers `pass`.
- cold-dep recipe (lasuite-docs + cold keycloak): `deploy-count = 2 (expect 2)`,
`deps deployed: ['keycloak']`, all tiers `pass`, `DEPS teardown` clean.
- warm-dep recipe (lasuite-meet, live-warm keycloak): `deploy-count = 1 (expect 1)`,
`deps deployed: ['keycloak']`.
Observed across all Phase 2 recipe runs: every recipe ran at `deploy-count = 1` (no/warm deps)
or `deploy-count = 2 (expect 2)` (one cold dep). No run exceeded `1 + N_cold_deps`.
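To check these lines mechanically rather than by eye, a verifier could extract the two numbers from a captured log. The sketch below assumes only the `deploy-count = N (expect M)` shape quoted in this document; the helper name `parse_deploy_count` and the sample log text are illustrative.

```python
import re
from typing import Optional, Tuple

# Matches the RUN SUMMARY fragment, e.g. "deploy-count = 2 (expect 2)".
SUMMARY_RE = re.compile(r"deploy-count\s*=\s*(\d+)\s*\(expect\s+(\d+)\)")


def parse_deploy_count(log_text: str) -> Optional[Tuple[int, int]]:
    """Return (actual, expected) from a run log, or None if the line is absent."""
    match = SUMMARY_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Illustrative log excerpt; the surrounding text is made up for the example.
sample_log = "RUN SUMMARY\n  deploy-count = 2 (expect 2)\n"
result = parse_deploy_count(sample_log)
if result is not None:
    actual, expected = result
    print("within budget" if actual == expected else "budget exceeded")
```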
## No test weakened to share the deploy
Sharing one deployment does **not** skip or soften any check:
- install, upgrade, backup, restore, custom each still run their **real generic + overlay
assertions** against the shared app (`run_lifecycle_tier`, `ALL_STAGES`).
- the upgrade is a **real** prev→PR-head crossover (`assert_upgraded` on the chaos-version label),
not a no-op.
- backup→restore is **real data-integrity** (P4: seed → backup → mutate → restore → assert the
seeded data survived), not health-only.
- per-run isolation/teardown is unchanged (`DEPS teardown`, app undeploy, volume/secret cleanup).
Only the **deploy count** is constrained; coverage is untouched.
## Out of scope of the budget (intentionally)
- **WC5 canonical promote** (`promote_canonical`, `run_recipe_ci.py:682-707`) deploys a separate
`warm-<recipe>` app to (re)seed the warm-cache canonical. It runs **only** on a green cold run on
LATEST, **after** the deploy-count assertion, and explicitly **pops** `CCCI_DEPLOY_COUNT_FILE`
(`:697`) so it does not perturb the per-run test budget. It is warm-cache maintenance, not a test
deploy.
- **`--quick` fast lane** (`run_quick`) reuses an existing data-warm canonical and is a separate
optimization path; the cold full run above is the budget of record.
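The effect of popping the countfile variable before the WC5 deploy can be modelled in a few lines. This is a toy model under stated assumptions: `deploy_app` and `promote_canonical` here are stand-ins that share only their names with the harness functions, the environment is a plain dict, and counting is represented by appending to a list.

```python
COUNT_ENV = "CCCI_DEPLOY_COUNT_FILE"  # variable name taken from this document


def deploy_app(name: str, env: dict, counted: list) -> None:
    """Stand-in deploy: it is counted only while the countfile variable is set."""
    if env.get(COUNT_ENV):
        counted.append(name)


def promote_canonical(recipe: str, env: dict, counted: list) -> None:
    """Stand-in WC5 promote: drop the countfile first, then deploy warm-<recipe>."""
    env.pop(COUNT_ENV, None)
    deploy_app(f"warm-{recipe}", env, counted)


env = {COUNT_ENV: "/tmp/deploy-count"}  # illustrative path
counted: list = []
deploy_app("ghost", env, counted)         # base deploy: counted
promote_canonical("ghost", env, counted)  # warm-cache reseed: not counted
print(counted)
```

Only the base deploy ends up in `counted`, which is why the reseed does not perturb the per-run assertion.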
## Conclusion
The per-recipe deploy budget is **already minimal** and **enforced**: `1 + N_cold_deps`, with the
upgrade tier sharing the base deploy in place. No redundant deploy was found; none was removed
because none existed. (Phase 2b, 2026-05-31.)


@@ -4,6 +4,13 @@ The "## Build backlog" section is the Builder's. The "## Adversary findings" sec
(only the Adversary closes items there, after re-test). Phase plan SSOT:
`/srv/cc-ci/cc-ci-plan/plan-phase2b-test-performance.md`.
## Build backlog
- [x] **B1/B2/B3** — trace + confirm the per-recipe deploy budget is minimal and enforced
(`1 + N_cold_deps`; upgrade shares the base deploy in place). Done — claimed in STATUS-2b.md.
- [x] **B4** — record the budget in `docs/perf/deploys.md` (+ DECISIONS.md pointer). Done.
- No redundant deploy found → nothing to remove. Confirm-and-document outcome (no harness change).
- Awaiting Adversary cold-verify of B1-B4 in REVIEW-2b.md.
## Adversary findings
_(none open — Phase 2b not yet claimed. Pre-claim deploy-budget trace recorded in REVIEW-2b.md;
the WC5 green-cold reseed is flagged there as a B1-doc-completeness item to check at claim time, not a


@@ -1,25 +0,0 @@
# BUILDER-INBOX (from Adversary)
## @2026-05-31T05:33Z — Phase 2b Adversary loop is LIVE (non-urgent; Phase 2 still in flight)
Heads-up, not a gate. Operator kicked off the Phase-2b Adversary loop. I created REVIEW-2b.md and
BACKLOG-2b.md (my files). No verdict yet — nothing claimed. This is non-urgent: Phase 2 isn't `## DONE`
(plausible Q4.7b / drone Q4.10 / Q5 remain), and Phase 2b is queued behind that per the plan.
I did my own COLD trace of the deploy budget (REVIEW-2b.md) so I'm ready to verify B1-B4 fast when you
claim. Two things to save you a round-trip:
1. The budget is already minimal — and **tighter than B1's stated `1 + 1(upgrade) + N_deps`**: the
upgrade tier reuses the base deploy via the in-place `--force --chaos` reconcile (`_perform_op`
never calls `deploy_app`), so the real budget is `1 (base, shared by install+upgrade+backup+restore
+custom) + N_cold_deps`, enforced by DG4.1 (`expected = 1 + deps_deployed_count`). Likely outcome:
B1 = "already minimal," no redundant deploy to remove. Your B4 doc should state this and that B1's
plan-text minimum is conservative.
2. **One completeness item I WILL check** in your B1/B4 doc: the WC5 promote-on-green-cold path
(`run_recipe_ci.py:699`) does one *additional, uncounted* `abra app new` on a green COLD run for
canonical warm-cache reseed (countfile is popped at :697 first). It's outside the test-sequence
budget and not redundant — but B1 asks for "exactly how many deploy cycles happen and why each is
necessary," so the doc must mention it or I'll mark it materially incomplete. Just name it.
When you write `docs/perf/deploys.md` (or the DECISIONS Phase-2b note) + claim B1-B4 in STATUS-2b.md
with WHAT/HOW/EXPECTED, I'll cold-verify (re-trace + confirm a real run's RUN SUMMARY deploy-count).


@@ -1131,3 +1131,28 @@ recipe whose upgrade TARGET needs different app .env than the base (e.g. an over
newer version) can switch it without a cc-ci fork. Added `abra.env_get` (symmetric reader). mumble's
`READY_PROBE` + install-overlay now read the live COMPOSE_FILE and self-gate the tcp 64738 probe to the
host-ports (latest) phase. No cc-ci fork of any upstream file remains for mumble.
---
## Phase 2b — Per-recipe deploy budget (SETTLED 2026-05-31)
The per-recipe CI test sequence deploy budget is **minimal and enforced**:
```
deploys == 1 (base) + N_cold_deps
```
- **1 base deploy** shared by ALL five tiers (install → upgrade → backup → restore → custom).
- **+1 per COLD declared dep** (deployed once, reused); a **live-warm** dep contributes **0**.
- The **upgrade tier adds NO deploy**: the base is deployed at the previous published version
(`base = prev or target`, `run_recipe_ci.py:746-754`) and the upgrade is an in-place chaos redeploy
to PR-head (`chaos_redeploy`, not counted). backup/restore reuse the same app.
- This is **tighter** than plan B1's nominal `1 + 1(upgrade) + N` — the base deploy IS the
prior-version deploy. Nothing redundant; nothing removed because nothing existed to remove.
- **Enforced** by DG4.1: `expected_deploy_count = 1 + deps_deployed_count` (`run_recipe_ci.py:984`),
hard-fails on mismatch (`:1005-1010`). Every green run proves it stayed within budget.
- **Out of budget by design:** WC5 `promote_canonical` (`:682-707`) does one additional *uncounted*
`abra app new` on a green-cold run for warm-cache reseed (pops the countfile at `:697` first); it is
not a test-sequence deploy.
Full record: `docs/perf/deploys.md`.


@@ -0,0 +1,46 @@
# JOURNAL — Phase 2b (reasoning; WHY) — confirm minimal deploy budget
## 2026-05-31 — Bootstrap + analysis (Builder)
Operator manually kicked off Phase 2b (narrowed scope, plan §0): the ONLY task is to confirm the
per-recipe test sequence uses the minimum number of deploys, and fix it if not, without weakening any
test. Broad empirical-perf work is parked in IDEAS. Phase 2 is not yet `## DONE` (plausible/drone/Q5
remain), but B1-B4 are a property of the already-existing harness, so the analysis is independent of
Phase-2 completion.
### Method
Traced every `abra app deploy`/`upgrade`/`new` path through the harness. Key realization: the only
thing that increments the DG4.1 deploy counter is `lifecycle._record_deploy()`, and it is called from
exactly one place — inside `lifecycle.deploy_app` (`:211`). So "deploy count" == number of `deploy_app`
calls in a run. Enumerated all `deploy_app` callers: base deploy (`run_recipe_ci.py:819`), per-dep
(`deps.py:100`), and WC5 promote (`:699`, which pops the countfile first so it's outside the budget).
### Why the budget is minimal (and tighter than plan B1's nominal text)
Plan B1 frames the minimum as `1 base + 1 upgrade + N_deps`, assuming the upgrade tier needs its own
prior-version deploy. The cc-ci design avoids that: when the upgrade tier runs, the *base* deploy is
done at the **previous published version** (`base = prev or target`, `:746-754`), and the upgrade is an
**in-place chaos redeploy** of PR-head onto that same app (`perform_upgrade` → `chaos_redeploy`, which
does NOT call `deploy_app`). So the prior-version deploy and the base deploy are the SAME deploy — the
upgrade tier adds zero deploys. backup/restore also operate on the same app. Net: `1 + N_cold_deps`.
This is the deploy-sharing the operator expected; nothing to remove because nothing is redundant.
### Why I trust the enforcement (B2 is real, not vacuous)
`run_recipe_ci.py:1005-1010` turns `deploy_count != expected_deploy_count` into a non-zero exit. So
every GREEN run is itself a proof the recipe stayed within `1 + N_cold_deps` — a redundant redeploy
would push the count over and fail the run red. The historical Phase-2 runs (recorded in
STATUS-2/REVIEW-2) corroborate: every recipe ran at `deploy-count = 1`, or `2 (expect 2)` for the one
cold-dep recipe (lasuite-docs + cold keycloak). Warm keycloak (lasuite-meet) → 0 dep deploys → expect 1.
### Why B3 holds
Sharing one deploy does not skip assertions: all five tiers still run their generic+overlay assertions
against the shared app; upgrade is a real prev→PR-head crossover verified by `assert_upgraded`; P4
backup→restore is real data-integrity; per-run isolation/teardown is unchanged. Only the deploy COUNT
is constrained, never the coverage.
### Cross-loop note
The Adversary's independent pre-claim cold trace (REVIEW-2b @05:33Z) reached the identical conclusion
and flagged exactly one completeness item: the B1/B4 doc must NAME the WC5 green-cold reseed
(`run_recipe_ci.py:699`) — one additional uncounted `abra app new` for canonical warm-cache
maintenance, outside the test-sequence budget. `docs/perf/deploys.md` addresses this in its
"Out of scope of the budget (intentionally)" section, and STATUS-2b names it in verify-step (a).
Claimed B1-B4 accordingly.

machine-docs/STATUS-2b.md (new file, 99 lines)

@@ -0,0 +1,99 @@
# STATUS — Phase 2b (confirm the test sequence minimizes deploys)
**Phase plan (SSOT):** `/srv/cc-ci/cc-ci-plan/plan-phase2b-test-performance.md`
**Loop state for THIS phase:** STATUS-2b / BACKLOG-2b / REVIEW-2b / JOURNAL-2b (DECISIONS.md shared).
Phase 1/1*/2/2* STATUS/BACKLOG/REVIEW files are HISTORY — not this phase's state.
## Phase
NARROWED scope (operator 2026-05-30): the only task is to **confirm the per-recipe test sequence
already uses the minimum number of deploys** (and fix it if not) **without weakening any test**.
The broad empirical-perf program is parked in IDEAS. Likely outcome (operator's expectation):
already minimal via the deploy-once / deploy-sharing design.
## Definition of Done (Phase 2b): B1-B4, each Adversary cold-verified in REVIEW-2b
- [ ] **B1 — Deploy budget documented and minimal.**
- [ ] **B2 — Enforced, not just claimed** (deploy-count guard + RUN SUMMARY, expected reflects budget).
- [ ] **B3 — No test weakened to save a deploy** (coverage/isolation/teardown unchanged).
- [ ] **B4 — Recorded** (`docs/perf/deploys.md`).
---
## Gate: 2b CLAIMED, awaiting Adversary (@2026-05-31, commit on origin/main)
**Outcome: the per-recipe deploy budget is ALREADY MINIMAL and ENFORCED. No redundant deploy found;
none removed because none existed.** This is a confirm-and-document result (no harness behavior
change). Deliverable: `docs/perf/deploys.md`.
### WHAT is claimed (the budget)
Per cold `run_recipe_ci.py` run of a recipe:
```
deploys == 1 (base) + N_cold_deps # enforced as a hard failure
```
- **1 base deploy** shared by ALL five tiers: install → upgrade → backup → restore → custom.
- **+1 per COLD declared dep**, deployed once and reused; a **live-warm** dep contributes **0**.
- The **upgrade tier adds NO deploy**: the base is deployed at the **previous published version**
when upgrade runs (`base = prev or target`), and the upgrade is an **in-place chaos redeploy** of
PR-head onto that same app — NOT counted, and the real HC1 upgrade under test.
- **backup/restore add NO deploy** (operate on the same running app).
- This is **tighter** than plan B1's nominal `1 + 1(upgrade) + N` because the base deploy *is* the
prior-version deploy — the prior-version and base deploy are the same deploy.
### HOW the Adversary can verify (from a fresh clone)
**(a) Static — only `deploy_app` increments the count, and it's called in exactly 3 sites:**
```
grep -n "_record_deploy" runner/harness/lifecycle.py # called ONLY inside deploy_app (:107, :211)
grep -rn "deploy_app(" runner/ | grep -v "def deploy_app" # 3 callers: :699 :819 (+ deps.py:100)
```
- `lifecycle.py:211`: `deploy_app` is the sole caller of `_record_deploy`.
- `run_recipe_ci.py:819`: the single base deploy (cold main path).
- `runner/harness/deps.py:100`: one per declared dep.
- `run_recipe_ci.py:699`: `promote_canonical` (WC5), which **pops** `CCCI_DEPLOY_COUNT_FILE` first
(`:697`) so it is OUTSIDE the per-run budget (post-green warm-cache maintenance, not a test deploy).
- `lifecycle.chaos_redeploy` (the upgrade, `lifecycle.py:418-435`) does **NOT** call `deploy_app`
→ not counted (docstring states this explicitly).
- `generic.perform_backup`/`perform_restore` → `backup_app`/`restore_app`: no `deploy_app` → not counted.
- Base-version selection that makes upgrade share the base deploy: `run_recipe_ci.py:746-754`
(`want_upgrade`; `prev = UPGRADE_BASE_VERSION or previous_version`; `base = prev or target`).
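As a stricter alternative to the `grep` commands in step (a), the call sites can be located from the syntax tree, which ignores matches in comments and strings. The sketch below uses Python's standard `ast` module; `find_calls` is a hypothetical helper, and `SAMPLE` is a made-up stand-in for the harness modules, not their real contents.

```python
import ast


def find_calls(source: str, func_name: str) -> list[int]:
    """Return sorted line numbers of calls to `func_name` (bare or as an attribute)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            target = node.func
            if isinstance(target, ast.Attribute):
                name = target.attr              # e.g. lifecycle.deploy_app(...)
            else:
                name = getattr(target, "id", None)  # e.g. deploy_app(...)
            if name == func_name:
                hits.append(node.lineno)
    return sorted(hits)


# Made-up miniature of the call structure described in this document.
SAMPLE = '''\
def deploy_app(app):
    _record_deploy()

def chaos_redeploy(app):
    run("abra app deploy --chaos")

def main():
    lifecycle.deploy_app("base")
    chaos_redeploy("base")
'''

print(find_calls(SAMPLE, "deploy_app"))      # call sites of deploy_app
print(find_calls(SAMPLE, "_record_deploy"))  # call sites of _record_deploy
```

Run against the real files, the same helper would be expected to report the three `deploy_app` callers and the single `_record_deploy` call listed above, assuming the line references in this claim are accurate.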
**(b) Enforcement — DG4.1 guard hard-fails on mismatch:**
```
sed -n '958,1010p' runner/run_recipe_ci.py
```
- `expected_deploy_count = 1 + deps_deployed_count` (`:984`); warm deps excluded (`:982-983`).
- RUN SUMMARY prints `deploy-count = N (expect M)` (`:986`).
- `if deploy_count != expected_deploy_count: … overall = 1` → non-zero exit (`:1005-1010`).
⇒ every GREEN run proves the recipe stayed within budget; a redundant redeploy turns it RED.
**(c) Dynamic (optional, cold) — re-run a no-dep and a cold-dep recipe:**
```
RECIPE=ghost STAGES=install,upgrade,backup,restore,custom cc-ci-run runner/run_recipe_ci.py
RECIPE=lasuite-docs STAGES=install,custom cc-ci-run runner/run_recipe_ci.py
```
**(d) B3 — coverage unchanged:** confirm all five tiers still run their real generic+overlay
assertions against the shared app (`run_lifecycle_tier`, `ALL_STAGES` `run_recipe_ci.py:56`), the
upgrade is a real prev→PR-head crossover (`assert_upgraded`), and P4 backup→restore is real
data-integrity (seed→backup→mutate→restore→assert). Nothing is skipped/softened to share the deploy.
**(e) B4 — the record:** `docs/perf/deploys.md` (this deliverable).
### EXPECTED outcomes
- (a) `_record_deploy` appears only inside `deploy_app`; exactly the 3 `deploy_app` callers above.
- (b) guard present and hard-failing as quoted; `expected = 1 + cold_deps`.
- (c) ghost: `deploy-count = 1 (expect 1)`, all tiers `pass`.
lasuite-docs + cold keycloak: `deploy-count = 2 (expect 2)`, `deps deployed: ['keycloak']`,
all tiers `pass`, `DEPS teardown` clean.
- Historical corroboration (Phase 2 runs, recorded in STATUS-2/REVIEW-2): every recipe ran at
`deploy-count = 1` (no/warm dep) or `deploy-count = 2 (expect 2)` (one cold dep, lasuite-docs
Q2.4 — REVIEW-2 `:114`). No run ever exceeded `1 + N_cold_deps`.
### WHERE the inputs live
- Deliverable doc: `docs/perf/deploys.md`.
- Code: `runner/run_recipe_ci.py` (`:56`, `:746-754`, `:819`, `:958-1010`),
`runner/harness/lifecycle.py` (`:107-211`, `:418-435`), `runner/harness/deps.py` (`:81-120`),
`runner/harness/generic.py` (`perform_upgrade`/`perform_backup`/`perform_restore`).
- Commit: see `git log origin/main` for the `claim(2b)` commit.
## Gates
- Gate 2b — CLAIMED, awaiting Adversary PASS in REVIEW-2b.