When a DEPS-declaring recipe's setup_custom_tests fails, its @requires_deps (SSO/OIDC)
tests skip; a skip-only pytest file exits 0 so the run previously reported overall=0
(GREEN) while the only SSO test never ran (violates P7). Fix preserves generic-tier
failure-isolation but corrects the green SIGNAL:
- conftest.pytest_collection_modifyitems counts skipped requires_deps tests and appends
to $CCCI_DEPS_SKIP_REPORT.
- run_recipe_ci: sums the count, surfaces it in RUN SUMMARY, and new pure predicate
sso_dep_unverified(declared, deps_ready, skipped) flips overall=1.
- 7 new unit tests (tests/unit/test_f211_sso_skip.py).
Verified deploy-free (rate-limit-independent): 35/35 unit PASS; cold real-test proof on
lasuite-docs test_oidc_with_keycloak.py -> 1 skipped + skip-report==1 -> orchestrator
would set overall=1. Full e2e deferred until Docker Hub rate limit lifts.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per orchestrator's SSO-dep plan + the refactor in 41ede13, DEFERRED.md entry #5 (lasuite-docs
OIDC parity ports + create-a-doc) closes by execution.
- tests/lasuite-docs/functional/test_oidc_login.py: parity port of recipe-maintainer
oidc_login.py. Anonymous GET /api/v1.0/users/me/ → 302 to keycloak realm OR 401/403;
password-grant token → 200 with user.email matching the provisioned test user.
- tests/lasuite-docs/functional/test_create_doc.py: plan §4.3 prescribed create-an-object +
read-it-back. POST /api/v1.0/documents/ with OIDC Bearer → captured id; GET
/api/v1.0/documents/<id>/ → asserts id+title round-trip.
Both marked \@pytest.mark.requires_deps; skipped with 'deps-not-ready' if setup_custom_tests
fails (failure isolation per plan-sso-dep-testing.md §4).
Cold-verifiable: ssh cc-ci 'RECIPE=lasuite-docs STAGES=install,custom cc-ci-run runner/run_recipe_ci.py'
install: 2 PASS; custom: 5 PASS incl. test_oidc_login_via_keycloak +
test_create_doc_and_read_back; deploy-count=2 (recipe + keycloak dep).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tests/plausible/recipe_meta.py + tests/plausible/functional/test_health_check.py drafted with
EXTRA_ENV setting required Phoenix vars (DISABLE_AUTH, DISABLE_REGISTRATION, SECRET_KEY_BASE).
Stack converges 1/1 but the served app returns HTTP 500 from / for the full 600s HTTP_TIMEOUT
window — config-class failure, not a deploy-timing issue. Diagnosing needs live container-log
inspection + iterative env tuning, more debug cycles than fit autonomous mode. Committing the
draft + a DEFERRED.md entry; operator can iterate when they want.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harness change (small, surgical):
- runner/harness/lifecycle.deploy_app gains a deploy_timeout param (default 900s); passes
through to abra.deploy(timeout=...). For heavy recipes (ghost, matrix-synapse, lasuite-meet),
the orchestrator + dep resolver now read recipe_meta.DEPLOY_TIMEOUT and pass it so the Python
subprocess wrapping abra deploy doesn't SIGKILL it before the recipe's INTERNAL TIMEOUT
(via EXTRA_ENV) finishes swarm convergence.
- runner/run_recipe_ci.py + runner/harness/deps.py: thread recipe_meta.DEPLOY_TIMEOUT into
the per-recipe deploy_app call.
Q4.4 ghost enrollment:
- recipe_meta.py: HEALTH_PATH=/, DEPLOY_TIMEOUT=1200 (subprocess), EXTRA_ENV={TIMEOUT: 1200}
(recipe internal). Ghost cold-start with theme + DB migration runs ~12-15min on cc-ci.
- functional/test_health_check.py: GET / returns 200 (themed site).
- functional/test_content_api.py: GET /ghost/api/content/settings/ returns 200 (settings JSON)
or 401/403 (Ghost error envelope) — distinguishes ghost-server up + JSON API working from
static fallback.
- functional/test_admin_redirect.py: GET /ghost/ returns 200 or 302 + Ghost branding;
proves admin route is wired through nginx proxy.
- PARITY.md: recipe-maintainer corpus has no ghost tests/, Phase-2 health_check is the
parity baseline; create-a-post deeper test deferred (DEFERRED.md, --extra-tests linked).
Cold-verifiable (log /root/ccci-q44-ghost-r3.log):
RECIPE=ghost STAGES=install,custom cc-ci-run runner/run_recipe_ci.py
install + 3 functional tests PASS, deploy-count=1. 28/28 unit tests still PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per orchestrator note: my prior append (commit 650ab47) accidentally landed under the
'## Closed deferrals' header instead of '## Open deferrals'. All 5 entries (lasuite-docs OIDC
parity, cryptpad create-a-pad, uptime-kuma create-a-monitor, ghost create-a-post, authentik
enrollment) are still OPEN (unchecked boxes) — section relocation only, no content change.
'## Closed deferrals' restored to its (none yet) placeholder.
Per orchestrator note: machine-docs/DEFERRED.md is now the single canonical registry for any
deliberately-deferred work. Every entry MUST carry a specific RE-ENTRY TRIGGER. The orchestrator
seeded 4 matrix-synapse entries; this commit migrates the other Phase-2 deferrals I'd buried
in JOURNAL/PARITY/DECISIONS:
- lasuite-docs OIDC parity ports + create-a-doc (re-entry: before any Q3 gate claim — Adversary
already flagged this in Q3/Q4 checkpoint).
- cryptpad create-a-pad + content round-trip Playwright (re-entry: Adversary F2-9 conditional —
MUST lift before Phase-2 DONE; Q5.2 cold-sample must include).
- uptime-kuma create-a-monitor via Socket.IO (re-entry: --extra-tests flag OR another recipe
needing Socket.IO).
- ghost create-a-post round-trip (re-entry: --extra-tests flag OR Q4 deeper-test pass before
Phase-2 DONE).
- Q2.2 authentik enrollment + setup_authentik_realm backend (re-entry: when cryptpad oidc_login
parity lifts — uses authentik — OR Phase-2 DONE review).
All linked to IDEAS.md --extra-tests flag where relevant. Phase-4 cleanup pass MUST review this
file per plan.md §6.1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per REVIEW-2 ## Q3/Q4 partial checkpoint, F2-8: 'goat CLI in container / account state cleanup'
was the §7.1-prohibited 'needs X' excuse class (same shape as F2-4). The recipe-maintainer
corpus literally calls the goat CLI via abra app run — it works fine.
Added tests/bluesky-pds/functional/test_account_and_post.py:
- goat pds describe → assert did:web:<live_app> in output (PDS self-identifies correctly).
- goat pds admin account create with UUID-suffixed handle + email + per-run password (class-B);
parse new account's did:plc:<id>.
- POST /xrpc/com.atproto.server.createSession with the new handle+password → accessJwt.
- POST /xrpc/com.atproto.repo.createRecord (collection=app.bsky.feed.post) with a UUID-marker
text → returns at://<did>/app.bsky.feed.post/<rkey>.
- GET /xrpc/com.atproto.repo.getRecord with that rkey → assert value.text == marker (round-trip).
- Best-effort goat account delete cleanup in finally.
This is the §4.3 prescribed test in full (create account + create post + fetch back + delete).
Cold-verifiable: ssh cc-ci 'RECIPE=bluesky-pds STAGES=install,custom cc-ci-run runner/run_recipe_ci.py'
install + 4 functional tests (health_check + describe_server + session_auth + account_and_post)
all PASS, deploy-count=1.
PARITY.md updated to show goat_account.py as ported.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code-only commit. The Phase-2 functional tests + PARITY.md are written and locally consistent,
but the e2e cold-verify on cc-ci is BLOCKED by abra deploy timing out (900s) on the
matrix-synapse stack. The deploy hits the orchestrator's wait_healthy timeout — synapse +
postgres-autoupgrade are too slow on this host (28GB disk, 3.5GB RAM, single node).
Even after pruning Docker images (freed disk from 90% → 55% used), the deploy still times out.
Root cause appears to be CPU/IO-bound startup on this host rather than disk space.
What's landed (code-only):
- tests/matrix-synapse/PARITY.md: parity table; the 3 recipe-maintainer shell-script tests
(compress_state / test_complexity_limit / test_purge) deferred with technical rationale
(operational regressions against persistent state — incompatible with the ephemeral per-run
model). Phase-2 health_check added (the corpus has no health_check.py).
- tests/matrix-synapse/functional/test_health_check.py: GET /_matrix/client/versions → 200 + JSON.
- tests/matrix-synapse/functional/test_federation_version.py: GET /_matrix/federation/v1/version
→ 200, asserts server.name='Synapse' + non-empty server.version (plan §4.3 prescribed).
- tests/matrix-synapse/functional/test_register_and_message.py: plan §4.3 prescribed test —
registers two users via the public client API (m.login.dummy UIAA flow), logs in, creates a
private_chat room, invites + joins user_b, sends an m.room.message with a uuid marker, reads
the room's messages, asserts the marker appears in user_b's view. Non-vacuous full client-API
roundtrip.
- tests/matrix-synapse/recipe_meta.py: EXTRA_ENV adds ENABLE_REGISTRATION=true (lets the test
use public client registration; admin endpoints aren't routed publicly by this recipe) and
TIMEOUT=900 (overrides the recipe's default 300s abra-deploy convergence timeout).
**Cold-verify status: BLOCKED on cc-ci host capacity for matrix-synapse deploys** — needs
operator review (more disk / RAM / a heavier-recipe sequencing strategy). Filed in JOURNAL-2 +
PushNotification.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Q3.1 lasuite-docs: parity + 2 specific (oidc_with_keycloak + auth_required); deeper oidc_login
+ upload_conversion + create-a-doc need lasuite-docs OIDC env wiring (install_steps.sh). Tracked.
Q3.4 cryptpad: parity + 2 specific (spa_assets + Playwright render); §4.3-prescribed create-pad
deeper test deferred with technical rationale (version-specific UI selectors). DECISIONS.md
Phase-2 Q3.4 section logs the deferral for Adversary sign-off per §7.1.
Both meet the ≥2 specific floor; both have open follow-ups documented for the Q3 gate (and/or
Q5 catch-up).
Initial Q3.4 (commit 0fb1458) shipped two tests that failed cold:
- test_api_config.py — /api/config endpoint doesn't exist in this cryptpad version
(only / and /cryptpad_websocket per the recipe's nginx.conf.tmpl). REMOVED.
- test_pad_create.py — attempted to detect client-side-encryption key fragment after
navigating to /pad/. CryptPad's pad-creation flow is version-specific; this release
(10.6.0+5.7.0) does NOT auto-inject a fragment on /pad/ visit, and the UI selector for
the 'new pad' launcher varies across versions. Deeper test deferred.
Revised:
- tests/cryptpad/functional/test_spa_assets.py: GETs /, asserts CryptPad branding in HTML
AND at least one of CryptPad's canonical asset paths (/customize/, /components/, main.js,
/api/broadcast). Non-vacuous: catches the wedged-cryptpad-server-fallback-page case.
- tests/cryptpad/playwright/test_pad_create.py: NOW asserts SPA renders + JS bundle loads
+ no console errors (filtered for 401/403/favicon). Documents the create-pad deeper test
as deferred in-file. The maximal testable subset per §7.1 is what's shipped here.
- PARITY.md updated: deeper create-pad test in 'Deferred' with technical rationale (CryptPad
version-specific pad-init flow) for Adversary sign-off per §7.1.
Cold-verifiable on cc-ci (log /root/ccci-q34-cryptpad-r4.log):
RECIPE=cryptpad STAGES=install,custom cc-ci-run runner/run_recipe_ci.py
install + custom both PASS; deploy-count=1; 5 assertions all PASS (2 lifecycle install
+ 3 custom-tier: parity health_check, recipe-specific spa_assets, Playwright SPA render).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- tests/cryptpad/PARITY.md: parity table for health_check.py (ported);
oidc_login.py documented as authentik-deferred (cross-recipe; needs Q2.2 enrollment).
- tests/cryptpad/functional/test_health_check.py: parity port, SOURCE comment present.
- tests/cryptpad/functional/test_api_config.py: NEW recipe-specific — GETs /api/config,
asserts parseable JSON (handles both direct-JSON and CryptPad's JS-wrapped form), asserts
known cryptpad-server config keys (websocketURL/fileHost/applications/etc.). Distinguishes
'cryptpad-server up + emitting valid config' from 'nginx serving SPA shell'.
- tests/cryptpad/playwright/test_pad_create.py: NEW Playwright create-and-read-back. Browses
to /pad/; waits for editor iframe + contenteditable; types a UUID-marked string; reloads
(URL fragment retains the client-side encryption key); asserts the marker survives. This
is the plan §4.3-prescribed CryptPad-specific test ('use Playwright, not bare curl').
- STATUS-2 updated to record Q2 Adversary PASS (REVIEW-2 ## Q2 — PASS).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per REVIEW-2 ## Q2 FAIL @2026-05-28 (F2-5 dep teardown leak + F2-6 cold install flake + F2-7
SSO setup keycloak-hardcoded):
F2-5 closed by commit c6e94af: teardown_deps now uses verify=True so residuals raise; failures
propagate to orchestrator exit code + run summary. Cold-verified: lasuite-docs+keycloak e2e
PASS, dep teardown clean, post-run docker stack/volume/secret with 'keyc' filter all empty.
This also explained my Q3.1 flake — the leaked Q2.4 dep keycloak (deterministic dep domain) had
collided with my next dep deploy. With F2-5 fixed, that class of cross-run collision is
impossible (teardown now raises if it leaks, so the run fails BEFORE the next one starts).
F2-7 acknowledged: setup_keycloak_realm is keycloak-specific; authentik would need parallel
backend. Logged for Q2.2/Q5.
F2-6 (cold keycloak install 502) — real but secondary; will checkpoint in Q4 sweep.
Side-effect: Q3.1 partial also landed (PARITY.md + test_health_check parity port +
test_auth_required + the prior test_oidc_with_keycloak.py as Q3.1 third specific test).
Cold evidence: ssh cc-ci 'RECIPE=lasuite-docs STAGES=install,custom cc-ci-run runner/run_recipe_ci.py'
deploy-count=2 (expect 2), all 5 assertions PASS, dep teardown clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- tests/lasuite-docs/PARITY.md: parity table for health_check.py (ported);
oidc_login.py + upload_conversion.py documented as Q3.1 follow-up needing OIDC env wiring;
≥2 recipe-specific tests rationale (test_oidc_with_keycloak + test_auth_required).
- tests/lasuite-docs/functional/test_health_check.py: parity port of
recipe-info/lasuite-docs/tests/health_check.py — HTTP 200/301/302 from root.
- tests/lasuite-docs/functional/test_auth_required.py: NEW recipe-specific —
GET /api/v1.0/users/me/ asserts 401/403 (auth required). Non-vacuous: distinguishes
correctly-wired OIDC gate from anonymous access (200), missing route (404), broken (5xx).
The Q2.4 acceptance test (test_oidc_with_keycloak.py) continues to verify the dep resolver +
SSO harness against the per-run keycloak dep (F2-5 fix verified cold; see ccci-f25-verify.log).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per REVIEW-2 ## Q2 FAIL: runner/harness/deps.py::teardown_deps suppressed ALL exceptions via
contextlib.suppress(Exception), silently swallowing teardown failures. The 'DEPS teardown' print
fired even when undeploy actually raised — leaving leftover swarm services/volumes/secrets that
broke the NEXT run targeting the same deterministic dep domain (this is what caused the Q3.1 dep
flake I saw immediately after the Q2.4 acceptance run).
Fix:
- runner/harness/deps.py: teardown_deps now uses lifecycle.teardown_app(..., verify=True) so
residuals raise TeardownError. Errors are LOGGED LOUDLY per-dep but we continue to other deps
so one failure doesn't strand the rest. After all attempts: raise a combined TeardownError if
any dep failed.
- runner/run_recipe_ci.py: orchestrator catches the dep TeardownError in finally, prints it,
captures into dep_teardown_error; the run summary surfaces it and the exit code is non-zero.
The run STILL prints the diagnosable summary so a leak doesn't hide other failures.
Per §9 teardown sacred / DG7: a green run that leaks state is not 'green'. F2-5 now correctly
fails the run instead of silently passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2 lesson from F2-3 (n8n install Playwright flake on net::ERR_NETWORK_CHANGED): every
install overlay that does page.goto needs the same try/except PlaywrightError + status retry.
Centralize in runner/harness/browser.py::goto_with_retry; apply to ALL install overlays.
- runner/harness/browser.py: shared helper. Polls page.goto until status in accept_statuses;
catches PlaywrightError (net::ERR_*) as a retryable signal, not a failure. Raises AssertionError
with last_status + last_err diagnostic only on deadline expiry.
- tests/custom-html/test_install.py: now uses goto_with_retry (200 only, wait_until=load).
- tests/custom-html/playwright/test_browser_smoke.py: same.
- tests/n8n/test_install.py: replaced inline retry loop with goto_with_retry (200, 304).
- tests/keycloak/test_install.py: goto_with_retry for admin console (200, 302, 303; 45s goto).
- tests/cryptpad/test_install.py: goto_with_retry (200, 304; 60s goto, wait_until=load).
- tests/lasuite-docs/test_install.py: goto_with_retry (200, 301, 302; 60s goto).
Cold-verifiable: ssh cc-ci 'RECIPE=custom-html cc-ci-run runner/run_recipe_ci.py'
all 5 stages PASS (including the install overlay that flaked in the deps_smoke run),
deploy-count=1, head_ref=8a026066==chaos-version=8a026066 (HC1 non-vacuous).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per REVIEW-2 ## Q1 — PASS @2026-05-28: F2-3 + F2-4 closed; cold e2e on Adversary clone all 5
stages PASS; deploy-count=1; HC1 non-vacuous; teardown sacred; NO VETO. Builder may advance to Q2.
Q2.1 keycloak in flight: first attempt hit 502 from /realms/master at 600s; bumped DEPLOY_TIMEOUT
+ HTTP_TIMEOUT to 900s in tests/keycloak/recipe_meta.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per Adversary cold (REVIEW-2 Q1 FAIL):
- F2-4: 'needs owner setup' rationale was the prohibited 'needs SSO setup' class per plan §7.1.
Fixed by tests/n8n/functional/test_workflow_roundtrip.py (commit fc89552) — the plan §4.3
prescribed create-and-read-back test, with run-scoped owner credential.
- F2-3: page.goto raised PlaywrightError outside the retry loop on net::ERR_*. Fixed by wrapping
page.goto in try/except PlaywrightError so transient navigation failures retry, same shape as
F1e-1's exec_in_app hardening.
Cold-verifiable: ssh cc-ci 'RECIPE=n8n cc-ci-run runner/run_recipe_ci.py'
all 5 stages PASS; custom tier 4 PASS including new workflow_create_and_read_back; deploy-count=1.
Keycloak Q2.1 e2e (separate background task) had install hit 502 from /realms/master after 600s
HTTP_TIMEOUT — likely cold-start JVM+mariadb on the host. Will investigate post Q1 verdict.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>