When a DEPS-declaring recipe's setup_custom_tests fails, its @requires_deps (SSO/OIDC)
tests skip; a skip-only pytest file exits 0 so the run previously reported overall=0
(GREEN) while the only SSO test never ran (violates P7). Fix preserves generic-tier
failure-isolation but corrects the green SIGNAL:
- conftest.pytest_collection_modifyitems counts skipped requires_deps tests and appends
to $CCCI_DEPS_SKIP_REPORT.
- run_recipe_ci: sums the count, surfaces it in RUN SUMMARY, and new pure predicate
sso_dep_unverified(declared, deps_ready, skipped) flips overall=1.
- 7 new unit tests (tests/unit/test_f211_sso_skip.py).
Verified deploy-free (rate-limit-independent): 35/35 unit PASS; cold real-test proof on
lasuite-docs test_oidc_with_keycloak.py -> 1 skipped + skip-report==1 -> orchestrator
would set overall=1. Full e2e deferred until Docker Hub rate limit lifts.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Harness change (small, surgical):
- runner/harness/lifecycle.deploy_app gains a deploy_timeout param (default 900s); passes
through to abra.deploy(timeout=...). For heavy recipes (ghost, matrix-synapse, lasuite-meet),
the orchestrator + dep resolver now read recipe_meta.DEPLOY_TIMEOUT and pass it so the Python
subprocess wrapping abra deploy doesn't SIGKILL it before the recipe's INTERNAL TIMEOUT
(via EXTRA_ENV) finishes swarm convergence.
- runner/run_recipe_ci.py + runner/harness/deps.py: thread recipe_meta.DEPLOY_TIMEOUT into
the per-recipe deploy_app call.
Q4.4 ghost enrollment:
- recipe_meta.py: HEALTH_PATH=/, DEPLOY_TIMEOUT=1200 (subprocess), EXTRA_ENV={TIMEOUT: 1200}
(recipe internal). Ghost cold-start with theme + DB migration runs ~12-15min on cc-ci.
- functional/test_health_check.py: GET / returns 200 (themed site).
- functional/test_content_api.py: GET /ghost/api/content/settings/ returns 200 (settings JSON)
or 401/403 (Ghost error envelope) — distinguishes ghost-server up + JSON API working from
static fallback.
- functional/test_admin_redirect.py: GET /ghost/ returns 200 or 302 + Ghost branding;
proves admin route is wired through nginx proxy.
- PARITY.md: recipe-maintainer corpus has no ghost tests/, Phase-2 health_check is the
parity baseline; create-a-post deeper test deferred (DEFERRED.md, --extra-tests linked).
Cold-verifiable (log /root/ccci-q44-ghost-r3.log):
RECIPE=ghost STAGES=install,custom cc-ci-run runner/run_recipe_ci.py
install + 3 functional tests PASS, deploy-count=1. 28/28 unit tests still PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per REVIEW-2 ## Q2 FAIL: runner/harness/deps.py::teardown_deps suppressed ALL exceptions via
contextlib.suppress(Exception), silently swallowing teardown failures. The 'DEPS teardown' print
fired even when undeploy actually raised — leaving leftover swarm services/volumes/secrets that
broke the NEXT run targeting the same deterministic dep domain (this is what caused the Q3.1 dep
flake I saw immediately after the Q2.4 acceptance run).
Fix:
- runner/harness/deps.py: teardown_deps now uses lifecycle.teardown_app(..., verify=True) so
residuals raise TeardownError. Errors are LOGGED LOUDLY per-dep but we continue to other deps
so one failure doesn't strand the rest. After all attempts: raise a combined TeardownError if
any dep failed.
- runner/run_recipe_ci.py: orchestrator catches the dep TeardownError in finally, prints it,
captures into dep_teardown_error; the run summary surfaces it and the exit code is non-zero.
The run STILL prints the diagnosable summary so a leak doesn't hide other failures.
Per §9 teardown sacred / DG7: a green run that leaks state is not 'green'. F2-5 now correctly
fails the run instead of silently passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
F1e-1 (Adversary): exec_in_app silently returned '' on a failed docker exec, flipping a healthy
recipe RED under opt-out (post-backup container cycle, no readiness buffer). Now polls (re-resolve
container + re-exec) until rc==0 or 90s, then RAISES — never masks an exec failure as empty data.
No assertion weakened. Verified: opt-out install,backup,restore on custom-html now PASS.
HC1: head_ref = ref or recipe_head_commit (prefer explicit PR head sha $REF — robust, no git race;
production !testme always sets REF). assert_upgraded, when head_ref known, REQUIRES the deployed
chaos-version commit to MATCH head_ref (direct + non-vacuous proof the PR-head code was deployed; a
stale prev-checkout chaos redeploy fails). Falls back to version/image/chaos move check otherwise.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- orchestrator: per mutating tier, run optional pre-op seed hook (ops.py pre_<op>) → perform the op
ONCE (harness-owned) → run generic assertion (unless opted out) AND overlay assertion, both against
the shared post-op deployment. Op results passed op→assertion via run-scoped CCCI_OP_STATE_FILE.
- opt-out: CCCI_SKIP_GENERIC / CCCI_SKIP_GENERIC_<OP> / recipe_meta.SKIP_GENERIC (declarative).
- generic.py: split do_* into op primitives (perform_upgrade/backup/restore) + assertions
(assert_upgraded/backup_artifact/restore_healthy) reading op_state(); deployed_identity now returns
{version,image,chaos} (chaos label ready for HC1).
- generic test_<op>.py + all 6 recipe overlays migrated to assertion-only; pre-op seeding moved to
per-recipe ops.py (pre_upgrade/pre_backup/pre_restore). install overlays unchanged (no op).
- deploy-count stays 1 (op primitives never call deploy_app). lint PASS; 8 unit tests PASS on cc-ci.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
git fetch --tags <url> without a refspec errors 'couldn't find remote ref HEAD'; use
'refs/tags/*:refs/tags/*'. Verified: brings custom-html's 18 upstream version tags into the mirror
PR clone so the upgrade stage finds a previous published version (was skipping).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fetch_recipe (SRC+REF/PR path) now read-only fetches published version tags from the public upstream
into the mirror clone, so the upgrade stage finds a previous published version (mirror PR branches
carry no tags → upgrade would skip). Guardrail-safe: only fetches tags, never pushes to the recipe
repo; plain git so the bot token isn't sent to upstream. Adds the 6 D10 recipes to the bridge
POLL_REPOS so !testme on their PRs triggers runs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/secrets.md documents the 3 secret classes (A1 external, A2 internal-generated, B recipe-app),
the sops-nix decryption chain, and rotation procedures for each (cert version bump, sops re-encrypt +
swarm-secret version bump, recipe-app ephemeral). run_recipe_ci streams each stage's output through a
redaction filter that masks any /run/secrets/* value (>=8 chars) before it reaches Drone logs —
belt-and-suspenders over 'harness never prints secrets + abra doesn't echo'. Live streaming + exit
code preserved (locally tested). Recipe-ci clones cc-ci fresh per build, so this applies next run.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two fixes surfaced by the first real recipe-ci run through Drone:
- abra app backup/restore now pass -C -o (current checkout, no remote fetch) like
every other recipe-touching call — without -o they fetch recipe tags from the
(private) remote and fail 'authentication required: Unauthorized'.
- fetch_recipe's catalogue path rm's the recipe dir first so a leftover private-mirror
remote from a prior SRC+REF run can't poison version resolution / backup.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
D4 snapshots recipe-shipped tests/ and runs them against the live app. abra -C -o
everywhere + token clone for private mirror PRs. keycloak install green with no
harness surgery (D5). docs/enroll-recipe.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Recipe-agnostic harness (no surgery to enroll a recipe): recipe_meta.py for
health path/codes/timeouts; run_recipe_local discovers + runs recipe-shipped
tests/ against the live app. install non-regressed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3-stage run green (install/upgrade/backup), clean teardown. backupbot deployed
via reconcile oneshot; PTY (script) for abra backup/restore; -m for secret generate
(no value leak). M5 CLAIMED.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>