Per REVIEW-2 ## Q2 FAIL: runner/harness/deps.py::teardown_deps suppressed ALL exceptions via
contextlib.suppress(Exception), silently swallowing teardown failures. The 'DEPS teardown' print
fired even when undeploy actually raised — leaving leftover swarm services/volumes/secrets that
broke the NEXT run targeting the same deterministic dep domain (this is what caused the Q3.1 dep
flake I saw immediately after the Q2.4 acceptance run).
Fix:
- runner/harness/deps.py: teardown_deps now uses lifecycle.teardown_app(..., verify=True) so
residuals raise TeardownError. Errors are LOGGED LOUDLY per-dep but we continue to other deps
so one failure doesn't strand the rest. After all attempts: raise a combined TeardownError if
any dep failed.
- runner/run_recipe_ci.py: orchestrator catches the dep TeardownError in finally, prints it,
captures into dep_teardown_error; the run summary surfaces it and the exit code is non-zero.
The run STILL prints the diagnosable summary so a leak doesn't hide other failures.
Per §9 teardown sacred / DG7: a green run that leaks state is not 'green'. F2-5 now correctly
fails the run instead of silently passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
F1e-1 (Adversary): exec_in_app silently returned '' on a failed docker exec, flipping a healthy
recipe RED under opt-out (post-backup container cycle, no readiness buffer). Now polls (re-resolve
container + re-exec) until rc==0 or 90s, then RAISES — never masks an exec failure as empty data.
No assertion weakened. Verified: opt-out install,backup,restore on custom-html now PASS.
HC1: head_ref = ref or recipe_head_commit (prefer explicit PR head sha $REF — robust, no git race;
production !testme always sets REF). assert_upgraded, when head_ref known, REQUIRES the deployed
chaos-version commit to MATCH head_ref (direct + non-vacuous proof the PR-head code was deployed; a
stale prev-checkout chaos redeploy fails). Falls back to version/image/chaos move check otherwise.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- orchestrator: per mutating tier, run optional pre-op seed hook (ops.py pre_<op>) → perform the op
ONCE (harness-owned) → run generic assertion (unless opted out) AND overlay assertion, both against
the shared post-op deployment. Op results passed op→assertion via run-scoped CCCI_OP_STATE_FILE.
- opt-out: CCCI_SKIP_GENERIC / CCCI_SKIP_GENERIC_<OP> / recipe_meta.SKIP_GENERIC (declarative).
- generic.py: split do_* into op primitives (perform_upgrade/backup/restore) + assertions
(assert_upgraded/backup_artifact/restore_healthy) reading op_state(); deployed_identity now returns
{version,image,chaos} (chaos label ready for HC1).
- generic test_<op>.py + all 6 recipe overlays migrated to assertion-only; pre-op seeding moved to
per-recipe ops.py (pre_upgrade/pre_backup/pre_restore). install overlays unchanged (no op).
- deploy-count stays 1 (op primitives never call deploy_app). lint PASS; 8 unit tests PASS on cc-ci.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
git fetch --tags <url> without a refspec errors 'couldn't find remote ref HEAD'; use
'refs/tags/*:refs/tags/*'. Verified: brings custom-html's 18 upstream version tags into the mirror
PR clone so the upgrade stage finds a previous published version (was skipping).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fetch_recipe (SRC+REF/PR path) now read-only fetches published version tags from the public upstream
into the mirror clone, so the upgrade stage finds a previous published version (mirror PR branches
carry no tags → upgrade would skip). Guardrail-safe: only fetches tags, never pushes to the recipe
repo; plain git so the bot token isn't sent to upstream. Adds the 6 D10 recipes to the bridge
POLL_REPOS so !testme on their PRs triggers runs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/secrets.md documents the 3 secret classes (A1 external, A2 internal-generated, B recipe-app),
the sops-nix decryption chain, and rotation procedures for each (cert version bump, sops re-encrypt +
swarm-secret version bump, recipe-app ephemeral). run_recipe_ci streams each stage's output through a
redaction filter that masks any /run/secrets/* value (>=8 chars) before it reaches Drone logs —
belt-and-suspenders over 'harness never prints secrets + abra doesn't echo'. Live streaming + exit
code preserved (locally tested). Recipe-ci clones cc-ci fresh per build, so this applies next run.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two fixes surfaced by the first real recipe-ci run through Drone:
- abra app backup/restore now pass -C -o (current checkout, no remote fetch) like
every other recipe-touching call — without -o they fetch recipe tags from the
(private) remote and fail 'authentication required: Unauthorized'.
- fetch_recipe's catalogue path rm's the recipe dir first so a leftover private-mirror
remote from a prior SRC+REF run can't poison version resolution / backup.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
D4 snapshots recipe-shipped tests/ and runs them against the live app. abra -C -o
everywhere + token clone for private mirror PRs. keycloak install green with no
harness surgery (D5). docs/enroll-recipe.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Recipe-agnostic harness (no surgery to enroll a recipe): recipe_meta.py for
health path/codes/timeouts; run_recipe_local discovers + runs recipe-shipped
tests/ against the live app. install non-regressed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3-stage run green (install/upgrade/backup), clean teardown. backupbot deployed
via reconcile oneshot; PTY (script) for abra backup/restore; -m for secret generate
(no value leak). M5 CLAIMED.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>