b4a6c02dde
journal(2): DONE-VETO checklist complete; plausible Q4.7b mirror+PR#1+run launched
2026-05-31 05:28:57 +00:00
470afbff98
fix(discourse F2-15): add N/A PARITY.md (P2 §4.1) — parity genuinely N/A (no upstream corpus); documents functional tests + P4 integrity
2026-05-31 05:24:19 +00:00
88ad05ac5c
journal(2): mumble F2-14c green+claimed; DONE-VETO checklist complete; remaining plausible Q4.7b + drone Q4.10
2026-05-31 05:18:04 +00:00
7ee4c2b717
decisions+journal(2): mumble F2-14c disposition + UPGRADE_EXTRA_ENV hook; run launched
2026-05-31 05:08:46 +00:00
190247f3a1
journal(2): discourse full7 (category fix worked, title_prettify hit); fixed 588a087; full8 launched
2026-05-31 04:49:52 +00:00
0c31af1b50
journal(2): discourse full6 all-green except create-topic category bug; fixed ( 1f92776); full7 relaunched
2026-05-31 04:41:34 +00:00
3dc8fdf507
journal(2): consumed orchestrator inbox + re-baseline (new Hetzner box 8GB/135GB free); launched discourse full6
2026-05-31 04:34:54 +00:00
87823b195b
journal(2): RESUMED — cc-ci migrated to Hetzner node (still ~8GB); discourse full6 setup + memory-shed
2026-05-31 04:20:55 +00:00
707752cd14
journal(2): cc-ci VM offline mid discourse full5 — likely OOM on 7-GiB node; polling recovery
2026-05-31 01:43:55 +00:00
cc952903df
journal(2): discourse full4 timeout root-cause + full5 fixes (warm image cache + 3600s)
2026-05-31 01:26:41 +00:00
7d07f1f79b
journal(2): full8 flaky-green (restore won the race this time) — intermittent, not claiming; harness verify+retry fix next
2026-05-30 21:21:32 +00:00
c2c66f21d8
journal(2): backupbot enumerate-once flow → harness must verify+re-invoke backup if db volume missing (chosen fix)
2026-05-30 21:19:08 +00:00
ad7b3d0e8c
journal(2): ghost full8 instrumented — DEFINITIVE root cause = db container cycled by backup op, racing backupbot volume capture (not OOM/not-healthcheck); next: read backupbot backup flow
2026-05-30 21:17:44 +00:00
1aca09d4db
journal(2): ghost full6 restore RED = SYSTEMATIC (db-grace correlated); ruled out label-drop; full7 live restore-tier diagnosis
2026-05-30 20:31:51 +00:00
01fd43bcd5
journal(2): ghost full5 restore RED (ci_marker absent) — full6 instrumented re-run to characterize flaky vs systematic
2026-05-30 20:14:13 +00:00
3a706bd96e
journal(2): ghost full4 timeout root-cause (mysql init + migration > 1200s) + DEPLOY_TIMEOUT bump
2026-05-30 19:55:33 +00:00
8ff5ad246a
journal+decisions(2): ghost migration-lock deadlock root cause + healthcheck-overlay fix + abra +U chaos-version normalization
2026-05-30 05:54:54 +01:00
8e160af997
journal(2): mattermost PASS (3rd this session); next bluesky-pds P4 scoped (account-based marker to catch running-app-sqlite-hold restore gap)
2026-05-30 02:39:05 +01:00
11d6d82aad
status/journal(2): Q4.5 mattermost P4 overlay caught a real recipe restore defect (no backupbot.restore.post-hook → DB not reimported); recipe-PR queued (immich pattern); node clean
...
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-30 01:30:19 +01:00
b73018c9ab
journal(2): Q4.1 matrix register-500 root cause (restore DROP DATABASE FORCE closes synapse DB pool) + readiness-retry fix
...
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-30 01:01:09 +01:00
191a647dcf
journal(2): immich claimed; remaining-recipe scope + backup-capability survey (ghost/bluesky/uptime-kuma/mattermost all backup-capable → P4 overlays required)
...
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-30 00:22:12 +01:00
7e2a5bc09c
journal(2): immich Q3.5 P4 decision — recipe-PR to add postgres backup (recipe backs up NO DB as published); validate vchord dump/restore empirically first
...
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-29 23:44:47 +01:00
f86a58addf
journal(2): drone+gitea integration fully scoped (gitea dep config + admin/token/OAuth-app + install_steps wiring; §4.3 build-creation deferred)
...
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-29 21:59:07 +01:00
3c79e3de32
journal(2): drone Q4.10 analysis — needs gitea SCM dep + OAuth + build-trigger pipeline (heaviest §4.3)
...
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-29 21:46:06 +01:00
3ab04cd07a
journal(2): mailu Q4.9 deeper recon — certdumper/ACME TLS friction; start with TLS_FLAVOR=notls
...
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-29 20:57:39 +01:00
7282caef30
journal(2): mailu Q4.9 enrollment plan + discourse Q4.6 block recorded (handoff to next iteration)
...
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-29 20:54:21 +01:00
f4e11d4cca
journal(2): next-recipe recon — discourse chosen (only remaining recipe with a backup mechanism for real P4)
...
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-29 20:33:03 +01:00
f9ebb3f610
journal(2): Q4.7 plausible — root cause of clickhouse-backup boot-download crash-loop + decision
2026-05-29 18:48:56 +01:00
475ad5c774
claim(2): HQ1 image pre-pull — warm local store before deploy (4 unit tests + warm-cache-skip + bad-tag-clear-error + abra-unchanged)
...
lifecycle.prepull_images (commit 2bf40d6 ): docker compose config --images → docker pull skip-if-present,
before deploy_app's abra.deploy + perform_upgrade's chaos redeploy. Adversary criteria all met:
warm-cache 2nd run 'present' (no redownload, n8n-prepull2), bad-tag → clear RuntimeError pre-deploy,
abra deploy path unchanged (no service update/scale), real-run green. 4 unit tests pass. Gate evidence
in STATUS-2.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-29 16:14:25 +01:00
9272c20727
journal/deferred(2): Q3.5 immich PARTIAL — restore P4 blocked by upstream recipe (volume backup, no pg_dump hook); recipe-PR unit filed (drive/meet pg_backup.sh pattern)
2026-05-29 15:53:22 +01:00
62ac9b59e0
journal/status(2): F2-13 cryptpad read-back robustness FIXED ( b44d75b, poll-all-frames) — 3x green vs cold probe; awaiting Adversary re-verify/F2-9 close
2026-05-29 15:26:25 +01:00
9c9a0059c1
journal(2): record operator clarification — 3x repeat-green is flakiness-specific (lasuite-drive), not the general gate standard (normal = 1 cold-verified green)
2026-05-29 13:25:56 +01:00
3a8c5ca076
journal(2): both Phase-2 blockers cleared (Q3.2 PASS, F2-9 resolved); scout Q3.3 lasuite-meet as next (reuses lasuite-drive OIDC-at-install machinery)
2026-05-29 13:13:32 +01:00
a48543f57b
status/journal/deferred(2): cryptpad F2-9 RESOLVED — roundtrip green in full harness custom tier (cold deploy); awaiting Adversary close
...
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-29 13:11:35 +01:00
5e0af07b86
journal(2): Q3.2a fixed-code run 1 FULL SUITE GREEN (collabora-ready gate fixed upgrade tier); launching 3x repeat-green
2026-05-29 10:52:44 +01:00
a151489996
feat(2): lasuite-drive Q3.2a Part A — wire OIDC at INSTALL, eliminate flaky redeploy
...
Q3.2a / plan-lasuite-drive-oidc-robustness.md Part A. The old setup_custom_tests.sh did a
post-deploy in-place `abra app deploy --force --chaos` of the heavy 12-service stack to apply
the OIDC env — flaky (collabora WOPI-discovery race + gunicorn-perms; JOURNAL Step 0). Since
the OIDC env only affects backend/app and keycloak is live-warm, provision the per-run realm
BEFORE the single deploy and wire OIDC into the .env at install time (no reconverge).
- runner/run_recipe_ci.py: new _provision_deps() helper (warm/cold split + SSO enrich + write
$CCCI_DEPS_FILE), used by both paths. New per-recipe OIDC_AT_INSTALL meta flag (added to
_load_meta whitelist). When set + deps live-warm: provision BEFORE deploy_app; the install
tier's install_steps.sh wires OIDC into the single deploy; post-deploy step runs only the
MinIO bucket one-shot — no re-provision, no redeploy. Legacy post-deploy path unchanged for
all other dep recipes (gated on `not oidc_at_install`).
- tests/lasuite-drive/install_steps.sh (NEW): install-time OIDC env + secret wiring; no-ops on
empty deps file (recipe still boots, OIDC test skips → F2-11 RED).
- tests/lasuite-drive/setup_custom_tests.sh: trimmed to MinIO-bucket-only (OIDC moved out).
- tests/lasuite-drive/recipe_meta.py: OIDC_AT_INSTALL = True.
- JOURNAL-2: Step-0 root-cause failure logs captured before the fix.
NOT a claim — validating 3x green (incl. now-required upgrade tier) before claiming Q3.2.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-29 10:10:05 +01:00
dd137f9683
status(2): disk resize 30->70GB in progress (orchestrator) — disk-blocker LIFTING; deploys paused; plan to re-run lasuite-drive FULL lifecycle + mattermost after cc-ci healthy
2026-05-29 08:36:17 +01:00
9df900d1cc
journal(2): mumble scope correction — non-HTTP health = high-blast-radius core-harness feature (wait_healthy/canonical/generic), deserves dedicated effort; re-pick next unit = mattermost-lts (HTTP-native, no core changes)
2026-05-29 08:06:03 +01:00
7997b98935
journal(2): scouted mumble (Q4.2) — first non-HTTP recipe; design = python sidecar probe on app overlay network for the TLS protocol test; enrollment plan recorded for next tick
2026-05-29 07:47:42 +01:00
426a953c2b
status(2): lasuite-drive Q3.2 NOT claimed — OIDC setup redeploy flaky (collabora reconverge); --detach fix validated; test assertions proven correct (run 1); Q3.2a robustness item added; prune-during-deploy lesson recorded
2026-05-29 07:27:50 +01:00
b78d708c49
decisions/deferred(2): lasuite-drive upgrade tier = disk env-blocker (28GB host, dual multi-GB office image crossover); maximal subset in flight; operator disk-resize escalation; adversary heads-up
2026-05-29 05:51:31 +01:00
2c245c83c7
journal(2): Phase 2 RESUMED post-2w — foundation re-confirmed (72 unit + custom-html full e2e green), reference-corpus mapping, lasuite-drive e2e in flight
2026-05-29 05:03:46 +01:00
5b34496557
fix(2): F2-11 — SSO-dep deps-not-ready SKIP no longer yields GREEN !testme
...
When a DEPS-declaring recipe's setup_custom_tests fails, its @requires_deps (SSO/OIDC)
tests skip; a skip-only pytest file exits 0 so the run previously reported overall=0
(GREEN) while the only SSO test never ran (violates P7). Fix preserves generic-tier
failure-isolation but corrects the green SIGNAL:
- conftest.pytest_collection_modifyitems counts skipped requires_deps tests and appends
to $CCCI_DEPS_SKIP_REPORT.
- run_recipe_ci: sums the count, surfaces it in RUN SUMMARY, and new pure predicate
sso_dep_unverified(declared, deps_ready, skipped) flips overall=1.
- 7 new unit tests (tests/unit/test_f211_sso_skip.py).
Verified deploy-free (rate-limit-independent): 35/35 unit PASS; cold real-test proof on
lasuite-docs test_oidc_with_keycloak.py -> 1 skipped + skip-report==1 -> orchestrator
would set overall=1. Full e2e deferred until Docker Hub rate limit lifts.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-28 21:25:27 +01:00
4a118eafee
journal(2): correct drive note — cannot trim onlyoffice (recipe-as-is); registry creds is the fix
...
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-28 20:56:31 +01:00
1138d77cbb
blocked(2): Q3.2 drive base-deploy hits Docker Hub rate limit + Gitea outage
...
- recipe_meta: bump drive abra TIMEOUT 900->1500, DEPLOY_TIMEOUT 1200->1800 (12-svc
stack w/ onlyoffice+collabora; cold pulls need a wide window).
- STATUS-2 ## Blocked: two Class-A1 external blocks documented w/ verify commands —
(1) Docker Hub anon pull rate limit (registry-creds finding per plan §1.5; blocks all
new deploys), (2) Gitea git.autonomic.zone 404 outage (coordination down; 2 watchdog
pings unconsumable until recovery). JOURNAL-2: full disk->prune->rate-limit chain.
- Queued locally; push + Adversary-inbox processing deferred to Gitea recovery.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com >
2026-05-28 20:48:52 +01:00
076fa31552
status(2): Q4.1+Q4.3 GREEN; Q3.1+Q3.4 partial; pausing for Adversary cold-verify
...
After capacity unblock:
- Q4.1 matrix-synapse: parity-aligned + 3 specific (incl. §4.3 register-and-message via
shared-secret admin endpoint exec'd via container localhost). Cold green.
- Q4.3 bluesky-pds: enrolled (install_steps.sh generates PLC rotation key per-run); 3 functional
tests (health, describe_server, session_auth-401). Cold green.
- Q3.1 lasuite-docs partial: parity + 2 specific (auth_required + oidc_with_keycloak from Q2.4).
- Q3.4 cryptpad partial: parity + 2 specific (spa_assets + Playwright SPA-render).
Remaining substantial: Q3.2 lasuite-drive (needs mirror), Q3.3 lasuite-meet (mirrored + needs
OIDC wire), Q3.5 immich (needs mirror), Q4.2/4-10 (mostly need mirror). Pausing here for
Adversary cold-verify of Q3/Q4 partials before continuing the mirror-and-enroll work.
2026-05-28 16:07:57 +01:00
374e755aac
journal(2): Q4.1 matrix-synapse code-only; cc-ci host capacity ceiling reached
2026-05-28 11:38:15 +01:00
f79416bcf4
journal(2): Q2 PASS + Q3 partial checkpoint + 'probe before assert' lesson
2026-05-28 10:21:23 +01:00
54b1fe326c
status(2): Q2 RE-CLAIMED — F2-5 dep-teardown-verify fix cold-verified clean
...
Per REVIEW-2 ## Q2 FAIL @2026-05-28 (F2-5 dep teardown leak + F2-6 cold install flake + F2-7
SSO setup keycloak-hardcoded):
F2-5 closed by commit c6e94af : teardown_deps now uses verify=True so residuals raise; failures
propagate to orchestrator exit code + run summary. Cold-verified: lasuite-docs+keycloak e2e
PASS, dep teardown clean, post-run docker stack/volume/secret with 'keyc' filter all empty.
This also explained my Q3.1 flake — the leaked Q2.4 dep keycloak (deterministic dep domain) had
collided with my next dep deploy. With F2-5 fixed, that class of cross-run collision is
impossible (teardown now raises if it leaks, so the run fails BEFORE the next one starts).
F2-7 acknowledged: setup_keycloak_realm is keycloak-specific; authentik would need parallel
backend. Logged for Q2.2/Q5.
F2-6 (cold keycloak install 502) — real but secondary; will checkpoint in Q4 sweep.
Side-effect: Q3.1 partial also landed (PARITY.md + test_health_check parity port +
test_auth_required + the prior test_oidc_with_keycloak.py as Q3.1 third specific test).
Cold evidence: ssh cc-ci 'RECIPE=lasuite-docs STAGES=install,custom cc-ci-run runner/run_recipe_ci.py'
deploy-count=2 (expect 2), all 5 assertions PASS, dep teardown clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-28 09:22:24 +01:00
ad6b25982f
status(2): Q2 CLAIMED — dep resolver + SSO harness + Q2.4 acceptance proven cold
...
Q2.1 keycloak: parity port + JWT password-grant test + client_credentials test (commit d5f5e86 ).
Q2.2 authentik DEFERRED: SSO harness is provider-pluggable; Q2.4 already proven via keycloak.
Q2.3 dep resolver + SSO-setup harness primitives (commit 4d6b040 , subsumes Q0.4). 28/28 unit PASS.
Q2.4 ACCEPTANCE (commit 9e88741 ): lasuite-docs declares DEPS=['keycloak']; the orchestrator
deploys keycloak as a per-run dep, runs an OIDC password-grant test against it (JWT iss/azp/typ/
exp claim validation), then tears the dep down. deploy-count=2 (1 parent + 1 dep, DG4.1 reconciled
with deps).
Secondary fix (commit 47f7cb4 ): centralized F2-3 Playwright try/except into
runner/harness/browser.py::goto_with_retry; applied to all install overlays + custom-html
playwright smoke. Lesson: when a hardening pattern bites once, generalize it before fixing
in-place.
Cold-verifiable on cc-ci:
ssh cc-ci 'cc-ci-run -m pytest tests/unit -v' # 28 PASS
ssh cc-ci 'RECIPE=lasuite-docs STAGES=install,custom cc-ci-run runner/run_recipe_ci.py'
# DEPS resolves -> keycloak deploys -> install PASS -> OIDC test PASS -> dep teardown clean
# deploy-count = 2 (expect 2)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-28 08:09:56 +01:00