fix(2): Q4.2 mumble — CHAOS_BASE_DEPLOY meta flag for chaos base deploy (clean-tree gate)

mumble's pinned base deploy (prev version 0.2.0) FATAs 'has locally unstaged changes' because
install_steps provides an untracked compose.host-ports.yml. New recipe_meta CHAOS_BASE_DEPLOY=True +
lifecycle._recipe_meta_flag + deploy_app branch -> base uses chaos (skips clean-tree/lint, deploys the
checked-out pinned version, not LATEST), mirroring the lightweight-tag chaos-base path. DECISIONS.md
records the full mumble enrollment design.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 19:32:48 +01:00
parent 1b6c77c76a
commit 999dd0d564
3 changed files with 64 additions and 0 deletions

@@ -857,3 +857,34 @@ under an active throttle. Therefore the gate's full-lifecycle green still depend
the install-tier deploy's first download (achievable after a rate-limit cooldown with a single clean
run). The recipe PR is filed as a deferred robustness follow-up (Q4.7b), mirroring the Q3.2b/immich
pattern; Adversary/operator weigh whether it gates Phase-2 DONE.
## mumble: TCP/voice recipe enrollment — mumbleweb HTTP readiness + host-ports + CHAOS_BASE_DEPLOY (2026-05-29)
**Decision (settled):** mumble is a non-HTTP TLS voice server (port 64738). To enroll it in cc-ci's
HTTP-readiness + on-host (cc-ci-run) test model, deploy it with the two upstream overlays
`COMPOSE_FILE=compose.yml:compose.mumbleweb.yml:compose.host-ports.yml` (recipe_meta.EXTRA_ENV):
- **compose.mumbleweb.yml** — the upstream mumble-web HTTP client, routed through Traefik on the app
domain. Gives the generic harness a real HTTP serving/readiness signal (HEALTH_PATH "/") AND the
web_client.py parity surface. Present in every published mumble version.
- **compose.host-ports.yml** — publishes 64738 (tcp+udp, mode:host) on the cc-ci host, so the on-host
protocol tests connect to 127.0.0.1:64738. The voice server has NO HTTP API and cc-ci's Traefik only
exposes 80/443 (no `mumble` TCP entrypoint; the gateway forwards 443 only, out of our control), so a
host-published port is the reachable path. The `proxy` overlay network is attachable (so an
ephemeral-container path was considered), but host-ports is simpler and needs no extra image.
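
The host-published port can be checked with a plain TCP connect before any protocol test runs. The following is a minimal sketch, not the harness's actual probe: `tcp_port_open` is a hypothetical helper name, and it only confirms that something accepts TCP connections on the port (it does not speak the mumble protocol or check UDP).

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds.

    Hypothetical helper for illustration; not part of cc-ci.
    """
    try:
        # create_connection resolves the address and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

On the cc-ci host, `tcp_port_open("127.0.0.1", 64738)` would be expected to return `True` once the host-ports overlay is deployed, assuming the port is published as described above.
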
**Two enrollment hazards + their fixes:**
1. The upstream `compose.host-ports.yml` exists only from version **1.0.0+**, but the upgrade tier's
base deploy is the **previous** published version (0.2.0+), which predates it → COMPOSE_FILE fails
to resolve on the base deploy. Fix: `tests/mumble/install_steps.sh` provides a cc-ci-owned identical
`compose.host-ports.yml` to the recipe checkout when absent (no-op when the version ships it natively).
2. That provided file is UNTRACKED in the older checkout → abra's PINNED base-deploy clean-tree check
FATAs ("has locally unstaged changes"). Fix: new recipe_meta flag **`CHAOS_BASE_DEPLOY=True`** +
harness support (`lifecycle._recipe_meta_flag` + a `deploy_app` branch) → the base deploy uses chaos
(skips lint/clean-tree, deploys the EXPLICITLY-checked-out pinned version — not LATEST), mirroring the
existing lightweight-tag chaos-base mechanism. HC1/deploy-count unaffected (upgrade still chaos-redeploys
to PR-head; base chaos-version=prev-commit != head → real crossover).
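
The two fixes can be sketched in a few lines of Python. This is an illustration under assumptions, not the shipped code: the real provisioning step lives in the shell script `tests/mumble/install_steps.sh`, and both `provide_overlay` and `base_deploy_uses_chaos` are hypothetical names. The `lightweight_tag` parameter stands in for the harness's existing lightweight-tag check, whose details are not shown here.

```python
import shutil
from pathlib import Path


def provide_overlay(checkout: Path, overlay_src: Path) -> bool:
    """Copy a cc-ci-owned overlay into the recipe checkout only when it is absent.

    Returns True when the file was copied, False when the checkout already has it
    (the no-op case for versions that ship the overlay natively).
    """
    target = checkout / overlay_src.name
    if target.exists():
        return False
    shutil.copyfile(overlay_src, target)
    return True


def base_deploy_uses_chaos(lightweight_tag: bool, meta: dict) -> bool:
    """Decide whether the base deploy should run in chaos mode.

    `meta` is assumed to hold the names defined by a recipe's recipe_meta.py.
    """
    if lightweight_tag:
        # Existing mechanism: lightweight tags already force a chaos base deploy.
        return True
    # New mechanism: the recipe opts in explicitly.
    return bool(meta.get("CHAOS_BASE_DEPLOY"))
```

The copied file is untracked in the older checkout, which is why the second function has to return `True` for mumble: without chaos, the pinned deploy's clean-tree check would reject the checkout.
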
**P4 (backup data-integrity):** mumble persists server state in `/data/mumble-server.sqlite` (the exact
file that the recipe's backupbot hooks `.backup` and restore). ops.py seeds a `ci_marker` row there (using
`PRAGMA busy_timeout` to wait out the running murmur server's transient sqlite locks), then runs
backup, drop, and restore, and asserts that the row survived.
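
The seed, backup, drop, restore, assert cycle can be sketched as follows. This is a sketch under stated assumptions, not the contents of ops.py: the marker is modelled as a dedicated `ci_marker` table (the real schema is not shown in this entry), the function names are hypothetical, and a plain file copy stands in for the backupbot backup and restore steps.

```python
import shutil
import sqlite3
from pathlib import Path


def _connect(db_path: Path, busy_ms: int = 5000) -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    # Wait up to busy_ms for another writer's lock instead of failing immediately.
    conn.execute(f"PRAGMA busy_timeout = {int(busy_ms)}")
    return conn


def seed_marker(db_path: Path, value: str) -> None:
    conn = _connect(db_path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS ci_marker (value TEXT NOT NULL)")
        conn.execute("INSERT INTO ci_marker (value) VALUES (?)", (value,))
        conn.commit()
    finally:
        conn.close()


def drop_marker(db_path: Path) -> None:
    conn = _connect(db_path)
    try:
        conn.execute("DROP TABLE IF EXISTS ci_marker")
        conn.commit()
    finally:
        conn.close()


def marker_present(db_path: Path, value: str) -> bool:
    conn = _connect(db_path)
    try:
        row = conn.execute(
            "SELECT 1 FROM ci_marker WHERE value = ?", (value,)
        ).fetchone()
        return row is not None
    except sqlite3.OperationalError:
        # Raised when the ci_marker table does not exist (i.e. after the drop).
        return False
    finally:
        conn.close()


def round_trip(db_path: Path, backup_path: Path, value: str) -> bool:
    seed_marker(db_path, value)
    shutil.copyfile(db_path, backup_path)  # stand-in for the backup step
    drop_marker(db_path)
    if marker_present(db_path, value):
        return False  # the drop did not take effect, so the test proves nothing
    shutil.copyfile(backup_path, db_path)  # stand-in for the restore step
    return marker_present(db_path, value)
```

Checking that the marker is gone after the drop matters: without it, a restore that silently did nothing would still pass the final assertion.
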

@@ -90,6 +90,19 @@ def _recipe_extra_env(recipe: str, domain: str) -> dict[str, str]:
    return {str(k): str(v) for k, v in (ee or {}).items()}


def _recipe_meta_flag(recipe: str, key: str) -> bool:
    """Read a boolean flag from tests/<recipe>/recipe_meta.py (e.g. CHAOS_BASE_DEPLOY). Returns
    False if the recipe ships no meta or the flag is absent/falsey. Trusted in-repo exec, same as
    _recipe_extra_env."""
    path = os.path.join(os.path.dirname(__file__), "..", "..", "tests", recipe, "recipe_meta.py")
    if not os.path.exists(path):
        return False
    ns: dict = {}
    with open(path) as fh:
        exec(compile(fh.read(), path, "exec"), ns)  # noqa: S102 (trusted, in-repo)
    return bool(ns.get(key))


def _record_deploy() -> None:
    """Increment the per-run deploy counter (DG4.1: one deploy per run). No-op unless the
    orchestrator set CCCI_DEPLOY_COUNT_FILE — so it never affects standalone/manual use."""
@@ -217,6 +230,19 @@ def deploy_app(
            flush=True,
        )
        chaos = True
    # A recipe may force a chaos base deploy via recipe_meta CHAOS_BASE_DEPLOY=True when cc-ci adds
    # an untracked compose overlay to the recipe checkout (e.g. mumble's host-ports.yml, provided
    # by install_steps for older versions that predate it). The untracked file makes abra's
    # pinned-deploy clean-tree check FATA ('has locally unstaged changes'); chaos skips lint +
    # the clean-tree gate and deploys the EXPLICITLY-checked-out pinned version (we already ran
    # recipe_checkout(version) above) — NOT latest. Same mechanism as the lightweight-tag branch.
    elif _recipe_meta_flag(recipe, "CHAOS_BASE_DEPLOY"):
        print(
            f" deploy_app({recipe}@{version}): CHAOS_BASE_DEPLOY set → chaos base deploy of the "
            "checked-out pinned version (skips clean-tree/lint; deploys version, not LATEST)",
            flush=True,
        )
        chaos = True
    # Pin DOMAIN to the run domain explicitly. `abra app new -D` fills it for recipes whose
    # .env.sample uses a literal placeholder, but NOT for ones using a `{{ .Domain }}` Go-template
    # (this abra version leaves it unexpanded → deploy fails "can't evaluate field Domain"). Setting

@@ -16,6 +16,13 @@
HEALTH_PATH = "/" # mumble-web client UI
HEALTH_OK = (200,)
# install_steps.sh provides compose.host-ports.yml to recipe versions that predate it (the upgrade
# tier's base deploy is the previous published version, 0.2.0+, which lacks the upstream overlay).
# That untracked file makes abra's PINNED base-deploy clean-tree check FATA, so deploy the
# explicitly-checked-out pinned version with chaos (skips lint/clean-tree; deploys the version, not
# LATEST). No-op for the upgrade tier (already a PR-head chaos redeploy). See DECISIONS.md.
CHAOS_BASE_DEPLOY = True
DEPLOY_TIMEOUT = 900 # two images to pull (mumble-server + mumble-web) on a cold node
HTTP_TIMEOUT = 300