plan: per-test image pre-pull sub-plan (warm images before deploy + upgrade; cheap on warm cache)
Resolve a recipe's images (docker compose config --images) and docker pull them (skip-if-present for pinned tags) at the start of the recipe sequence + before the upgrade-new-version deploy, then the normal abra deploy. Separates pull from converge (clear pull failures vs murky convergence timeouts), speeds convergence (fits abra-native window). No layer re-download on warm cache; nightly all-recipes run warms everything. Complements (not replaces) the recipe healthcheck for slow-init convergence. Near-term Phase-2 harness unit; real abra deploy unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cc-ci-plan/plan-prepull-images.md
# Sub-plan: pre-pull a recipe's images at the start of its test sequence

**Status:** QUEUED. A **small harness addition** to `runner/run_recipe_ci.py` (+ `harness`), to be picked up
as a near-term Phase-2 harness unit (not a phase-pause). Auto-applies in the nightly all-recipes run.
**Owner:** Builder + Adversary loops. **This file:** `/srv/cc-ci/cc-ci-plan/plan-prepull-images.md`

---

## What
At the **start of a recipe's test sequence, before the first `abra app deploy`** (and again before the
**upgrade tier's** new-version deploy), resolve the recipe's images and `docker pull` them so they
are in the local store before the deploy runs.

```
# At the top of the per-recipe run, after `abra app new` + checkout + .env set.
# APP_ENV (path to the app .env) and COMPOSE_FILES ("-f a.yml -f b.yml") are placeholders.
set -eu
imgs=$(docker compose --env-file "$APP_ENV" $COMPOSE_FILES config --images)   # resolves interpolation
for img in $imgs; do
    docker image inspect "$img" >/dev/null 2>&1 || docker pull "$img"          # skip-if-present (pinned tags)
done
# then the normal: abra app deploy … (unchanged, real abra)
# repeat the pre-pull for the UPGRADE target image set before `abra app upgrade`.
```
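
Since the step is planned for the Python runner, the same flow could look roughly like the sketch below. It is only an illustration under stated assumptions: `PrePullError`, `resolve_images`, `prepull`, and the injectable `run` parameter are hypothetical names, not existing harness code. The `run` hook is there only so the logic can be exercised without a Docker daemon.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical names: none of these exist in the harness today.
Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


class PrePullError(RuntimeError):
    """An image could not be resolved or pulled before the deploy."""


def _run(cmd: Sequence[str]) -> subprocess.CompletedProcess:
    # Capture output so a failure can be reported verbatim.
    return subprocess.run(list(cmd), capture_output=True, text=True)


def resolve_images(env_file: str, compose_files: Sequence[str],
                   run: Runner = _run) -> List[str]:
    """Ask docker compose for the fully interpolated image list."""
    cmd = ["docker", "compose", "--env-file", env_file]
    for path in compose_files:
        cmd += ["-f", path]
    cmd += ["config", "--images"]
    proc = run(cmd)
    if proc.returncode != 0:
        raise PrePullError(f"could not resolve images: {proc.stderr.strip()}")
    lines = (line.strip() for line in proc.stdout.splitlines())
    # One image reference per line; de-duplicate while keeping order.
    return list(dict.fromkeys(line for line in lines if line))


def prepull(images: Sequence[str], run: Runner = _run) -> List[str]:
    """Pull only images missing from the local store; return what was pulled."""
    pulled = []
    for img in images:
        if run(["docker", "image", "inspect", img]).returncode == 0:
            continue  # skip-if-present: safe for pinned tags
        proc = run(["docker", "pull", img])
        if proc.returncode != 0:
            raise PrePullError(f"pull failed for {img}: {proc.stderr.strip()}")
        pulled.append(img)
    return pulled
```

A caller would run `prepull(resolve_images(env_file, compose_files))` right before the unchanged `abra app deploy`, and again with the upgrade target's image set before the upgrade.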

## Why
- **Separate "pull" from "converge."** A rate limit, bad tag, or slow pull then **fails fast and
  clearly as a pull error**, instead of surfacing later as a murky `not converged` deploy timeout
  (the F2-12-class confusion).
- **Faster, more reliable convergence.** Images already local → swarm starts services immediately →
  the deploy fits **abra's native convergence window** better. This supports "prefer abra convergence"
  and can reduce the `-c`/READY_PROBE workaround for *pull-bound* cases.
- **Nightly:** the all-recipes nightly run pre-pulls implicitly, warming the cache for everything.
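
To make the "pull vs converge" separation concrete, here is a minimal sketch of how a runner might label a failure by the phase it occurred in. `StageResult` and `run_sequence` are hypothetical names, and the two steps are passed in as plain callables so the sketch assumes nothing about how the harness invokes Docker or abra.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StageResult:
    phase: str        # "prepull" or "deploy"
    ok: bool
    detail: str = ""


def run_sequence(prepull_step: Callable[[], None],
                 deploy_step: Callable[[], None]) -> StageResult:
    """Run pre-pull, then deploy; report which phase failed."""
    try:
        prepull_step()
    except Exception as exc:
        # Fail fast: the deploy is never attempted, so this cannot be
        # mistaken for a convergence timeout.
        return StageResult("prepull", False, str(exc))
    try:
        deploy_step()
    except Exception as exc:
        return StageResult("deploy", False, str(exc))
    return StageResult("deploy", True)
```

With this shape, a bad tag produces a result whose `phase` is `"prepull"`, while a slow-initialising app still surfaces under `"deploy"`.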

## Cheap on a warm cache (the key property)
`docker pull image:tag` **does not re-download cached layers**: it does a cheap manifest check and
reports `Already exists` for layers it already has; only missing or changed layers download. With coop-cloud's **pinned**
(immutable) tags, the **skip-if-present** check (`docker image inspect`) makes it **zero network**
when already cached. So per-test pre-pull is near-free after the first/nightly pull; it only does
real work on a cold image or an upgrade to a genuinely new version.
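
One way to check the warm-cache property from captured `docker pull` output is sketched below. It assumes the classic CLI progress lines (`<layer>: Already exists`, `<layer>: Pull complete`, `Status: ...`); the exact wording can differ between Docker versions and image stores, so treat the string matching as an assumption. `summarize_pull` and `was_warm` are hypothetical helper names.

```python
from typing import Dict, Union


def summarize_pull(output: str) -> Dict[str, Union[int, str]]:
    """Count cached vs downloaded layers in captured `docker pull` output."""
    cached = 0
    downloaded = 0
    status = ""
    for raw in output.splitlines():
        line = raw.strip()
        if line.endswith(": Already exists"):
            cached += 1
        elif line.endswith(": Pull complete"):
            downloaded += 1
        elif line.startswith("Status: "):
            status = line[len("Status: "):]
    return {"cached_layers": cached,
            "downloaded_layers": downloaded,
            "status": status}


def was_warm(output: str) -> bool:
    """True when the pull completed without downloading any layer."""
    return summarize_pull(output)["downloaded_layers"] == 0
```

When the skip-if-present branch is taken there is no pull output at all, so a check like this only applies to runs that actually invoke `docker pull`.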

## Honest caveats
- **Removes *pull* time/variance from convergence, NOT *app-init* time.** Slow-starting apps
  (collabora's heavy init, F2-12) still need the **recipe healthcheck/`start_period`**
  (`plan-lasuite-drive-recipe-pr.md`). Pre-pull is complementary, not a replacement.
- **Resolve via `docker compose config --images`** (handles `$VERSION`-style interpolation) using the
  **same `COMPOSE_FILE` set abra uses** (read from the app `.env`). A naive `grep image:` misses
  interpolated tags and multi-compose recipes.
- **Not an abra bypass.** `docker pull` only warms the local store; the deploy is still real
  `abra app deploy`/`upgrade`. Consistent with the "real abra commands" guardrail.
- **Don't weaken anything.** A failed pre-pull is a real (clearer) test failure, reported as such.
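
Reading the `COMPOSE_FILE` set from the app `.env` could be sketched as follows. This assumes the value is a colon-separated list and that later lines may append to it by referencing `$COMPOSE_FILE`; both are assumptions about the `.env` layout, and abra's own parsing should be treated as authoritative. `compose_files_from_env` and `compose_file_args` are hypothetical helper names.

```python
from typing import List


def compose_files_from_env(env_text: str) -> List[str]:
    """Collect the compose files named by COMPOSE_FILE lines in a .env file."""
    value = ""
    for raw in env_text.splitlines():
        line = raw.strip()
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        if not line.startswith("COMPOSE_FILE="):
            continue  # other keys and commented-out (#COMPOSE_FILE=...) lines
        rhs = line.split("=", 1)[1].strip().strip('"').strip("'")
        # Expand self-references so later lines can append to earlier ones.
        value = rhs.replace("${COMPOSE_FILE}", value).replace("$COMPOSE_FILE", value)
    return [part for part in value.split(":") if part]


def compose_file_args(files: List[str]) -> List[str]:
    """Turn ['a.yml', 'b.yml'] into ['-f', 'a.yml', '-f', 'b.yml']."""
    args: List[str] = []
    for path in files:
        args += ["-f", path]
    return args
```

The resulting argument list can then be spliced into the `docker compose ... config --images` call, so the resolved image set matches the compose files the deploy will use.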

## Definition of done (Adversary cold-verifies)
- [ ] Pre-pull step runs at the start of the recipe sequence (before the deploy) **and** before the
  upgrade tier's new-version deploy; images resolved correctly (incl. interpolation, multi-compose).
- [ ] **Cheap on warm cache:** a 2nd run's pre-pull does **no layer re-download** (skip-if-present, or
  `Already exists`); proven.
- [ ] **Clear failure mode:** a pull failure (e.g. a deliberately bad tag) is reported as a **pull
  error before deploy**, not as a deploy/convergence timeout.
- [ ] No test weakened; deploy path unchanged (real abra). Bounded: this one step only.

## Scope note
Per-test pre-pull only (at the recipe sequence start + upgrade). NOT a boot-time "pull all enrolled
recipes" sweep: the local cache already accumulates across runs and the nightly all-recipes run
warms everything, so per-test is the simple, targeted form.