feat(2): HQ1 image pre-pull (plan-prepull-images.md) — warm local store before deploy

lifecycle.prepull_images(recipe, domain): resolve images via docker compose config --images (COMPOSE_FILE
from the app .env — handles $VERSION interpolation + multi-compose) → docker pull each, skip-if-present
(zero network for cached pinned tags). Called in deploy_app before the (unchanged, real) abra.deploy AND
in generic.perform_upgrade before the chaos redeploy (warms new-version images). A pull failure RAISES a
clear pre-deploy error (not a converge timeout); deploy path unchanged (no docker service update/scale).
Removes PULL time not app-INIT time. 4 unit tests (tests/unit/test_prepull.py): present→skip, missing→
pull, pull-fail→raise, no-images→skip. NOT claimed yet — validating cold-verify criteria next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-29 16:02:21 +01:00
parent e6e5436942
commit 2bf40d69d6
3 changed files with 142 additions and 0 deletions

View File

@ -237,6 +237,9 @@ def perform_upgrade(
before = lifecycle.deployed_identity(domain)
if head_ref:
lifecycle.recipe_checkout_ref(recipe, head_ref)
# HQ1: warm the NEW-version image set before the chaos redeploy (the head_ref checkout's pinned
# tags) so a pull failure is a clear pre-deploy error and convergence isn't pull-bound.
lifecycle.prepull_images(recipe, domain)
lifecycle.chaos_redeploy(domain, deploy_timeout=deploy_timeout, no_converge_checks=True)
# Own the convergence verification (abra's monitor was skipped via -c).
lifecycle.wait_healthy(