cc-ci

Author	SHA1	Message	Date
autonomic-bot	0f2cc2d704	feat(2): ghost F2-14b overlay migration — start_period bump moved to recipe-PR (ghost#1 head ae43ffe, literal 15m on app healthcheck); DELETE cc-ci compose.ccci-health.yml + install_steps.sh + COMPOSE_FILE/CHAOS_BASE_DEPLOY. Anti-drift (plan §9): recipe-as-tested == recipe-as-published. env-var start_period impossible (abra pre-subst duration validation, Adversary-reproduced `4b862f6`). Next: run ghost on ae43ffe head.	2026-05-30 17:20:20 +01:00
autonomic-bot	fb20321bd9	feat(2): discourse start_period via literal recipe-PR bump (abra can't env-interpolate start_period) abra rejects env-interpolation in healthcheck start_period (FATA 'Does not match format duration' for both ${VAR} and quoted forms — validates the literal compose duration before .env substitution). So §9 pt1's env-var route is impossible for this field; the §9-compliant fix is a LITERAL start_period:20m bump in the recipe-PR (recipe everyone runs, not a cc-ci overlay; strictly safer). Remove APP_START_PERIOD from recipe_meta EXTRA_ENV; record the finding in DECISIONS (ghost E1 must use the same approach); STATUS-2 → new PR head 7a2e0e0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 16:24:45 +01:00
autonomic-bot	c346b9763b	feat(2): discourse Q4.6 policy-compliant shape (plan §9) — env-var start_period, delete cc-ci overlay, upgrade N/A Migrate discourse off the cc-ci compose overlay per plan §9 / plan-prefer-env-over-compose-overlay.md: - recipe_meta: drop UPGRADE_BASE_VERSION + COMPOSE_FILE + CHAOS_BASE_DEPLOY; set APP_START_PERIOD=1200s via EXTRA_ENV (the recipe-PR exposes start_period: ${APP_START_PERIOD:-5m}); declare upgrade tier N/A (both published prev bases pin removed bitnami images; Adversary §7.1 granted, REVIEW-2 `efe3790`). - delete tests/discourse/compose.ccci-health.yml + install_steps.sh (existed only to copy the overlay). - DECISIONS.md + STATUS-2 record the §9 guardrail + discourse shape (upgrade N/A, env start_period, pg_backup restore-hook recipe-PR = 5th data-loss recipe cc-ci caught). recipe-PR head now 8b8df17 (start_period env var added). Not a claim — run STAGES=install,backup,restore,custom next. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 15:47:28 +01:00
autonomic-bot	a750937fb0	feat(2): discourse Q4.6 honest upgrade crossover — UPGRADE_BASE_VERSION override (base-on-[-1]) + uniform bitnamilegacy image overlay Implements the real 0.7.0+3.3.1 -> 0.8.0+3.3.1 upgrade crossover instead of a §7.1 skip-with-sign-off (Adversary leans DENY on the deferral; agreed): - recipe_meta UPGRADE_BASE_VERSION=0.7.0+3.3.1 + generic support in run_recipe_ci (prev = meta override or previous_version). Harness default [-2]=0.6.3+3.1.2 is a hollow base (img 3.1.2 != head 3.3.1); [-1]=0.7.0+3.3.1 is the PR's true predecessor and shares head's servable 3.3.1 image. - compose.ccci-health.yml re-pins services.{app,sidekiq}.image to bitnamilegacy/discourse:3.3.1 so the 0.7.0 base (compose pins 404 bitnami:3.3.1) is servable; idempotent on the head (PR already bitnamilegacy). Consumes Adversary BUILDER-INBOX (deleted), leaves ADVERSARY-INBOX ack; STATUS-2 discourse section updated. Full lifecycle run launching next. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 14:20:06 +01:00
autonomic-bot	d822550c7d	feat(2): discourse P3 functional tests — §4.3 create-topic round-trip + site.json config + admin-bootstrap helper _discourse.py: bootstrap an admin (recipe seeds none) + mint an ApiKey via rails runner in the app container (class-B run-scoped). test_create_topic.py: POST /posts.json (unique marker) -> GET /t/<id>.json title+cooked round-trip. test_site_basic.py: GET /site.json asserts discourse categories config. Meets P3 (>=2 functional beyond health).	2026-05-30 12:52:30 +01:00
autonomic-bot	0e3049b677	fix(2): discourse health overlay add version 3.8 (lint R011/R012 version-mismatch FATA vs compose.yml 3.8)	2026-05-30 12:09:51 +01:00
autonomic-bot	b2ed6cf989	fix(2): discourse recipe_meta — wire COMPOSE_FILE+CHAOS_BASE_DEPLOY+TIMEOUT 2400 (the overlay's missing half; prior commit `a432058` only added the files)	2026-05-30 11:49:51 +01:00
autonomic-bot	a432058aca	fix(2): discourse healthcheck start_period overlay (slow Rails boot) + CHAOS_BASE_DEPLOY + TIMEOUT 2400 Install timed out at 1800s: discourse's 15-25min Rails cold boot overran both the deploy timeout and the recipe healthcheck start_period:5m (swarm killed the booting app). Add compose.ccci-health.yml (app healthcheck start_period 1200s) via install_steps.sh + recipe_meta COMPOSE_FILE + CHAOS_BASE_DEPLOY, bump DEPLOY_TIMEOUT/TIMEOUT to 2400. Image re-pin (bitnamilegacy) already proven working. NO test weakened.	2026-05-30 11:48:18 +01:00
autonomic-bot	13da216f8d	fix(2): ghost healthcheck start_period overlay — fixes fresh-migration lock deadlock Root cause: Ghost's fresh-DB first boot runs a ~6-9min schema migration (round-trip-bound, not CPU); the recipe healthcheck start_period:1m (~6min grace) kills the still-migrating task, leaving a stale migrations_lock → every later task deadlocks (MigrationsAreLockedError). Hit on both 2- and 4-vCPU. Fix (cc-ci deploy overlay, NOT a recipe/test change): compose.ccci-health.yml raises app healthcheck start_period to 900s, wired via recipe_meta COMPOSE_FILE + install_steps.sh (+ CHAOS_BASE_DEPLOY for the untracked overlay). No assertion weakened. Budget 1200s = migration + convergence. Only the install tier needs it (upgrade redeploys on the populated DB → fast boot).	2026-05-30 05:23:47 +01:00
autonomic-bot	9771b6e16a	fix(2): ghost timeout 2400->900 — VM now 4 dedicated vCPU (operator), migration converges in minutes; short bounded budget fails fast on the migrations_lock deadlock instead of a long blackout	2026-05-30 05:06:22 +01:00
autonomic-bot	bdaeb41496	fix(2): ghost DEPLOY_TIMEOUT/TIMEOUT 1200->2400 — MySQL cold-boot migration + healthcheck-kill+retry needs >20min on slow node (install timed out as it converged)	2026-05-30 04:41:59 +01:00
autonomic-bot	b4d03ccafe	feat(2): ghost P4 data-integrity overlay (MySQL ci_marker) + §4.3 create-post round-trip - ops.py + test_{upgrade,backup,restore}.py: seed ci_marker into the MySQL `ghost` DB (db service) via the mysql CLI; rides the recipe's mysqldump --tab backup. recipe is MySQL not sqlite (stale comment fixed). Expect restore RED -> recipe-PR (no backupbot.restore hook; immich/mattermost class). - functional/_ghost.py: cookie-aware Ghost Admin API client (stdlib http.cookiejar; Origin CSRF hdr). - functional/test_post_roundtrip.py: §4.3 create published post + read back (unique marker, non-vacuous); closes the DEFERRED ghost create-post item. - PARITY.md + recipe_meta.py updated. Authored node-free; full-lifecycle run next, NOT yet claimed.	2026-05-30 04:14:13 +01:00
autonomic-bot	74da6dc46b	feat(2): bluesky-pds P4 data-integrity overlay — deterministic atproto account marker (recipe-aware; catches running-app-holds-sqlite restore gap) via _p4.py + ops/test_upgrade/backup/restore Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 02:46:50 +01:00
autonomic-bot	e9d1e894b2	fix(2): mattermost functional tests share a deterministic admin bootstrap (_mm.bootstrap_admin) — only ONE unauthenticated first-user creation is allowed, so the multi-user test no longer collides with create_message Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 01:58:32 +01:00
autonomic-bot	7672f110f6	feat(2): mattermost-lts P3 2nd characteristic test (multi-user message visibility) + PARITY/DECISIONS for the postgres-restore recipe-PR Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 01:48:08 +01:00
autonomic-bot	012a477540	fix(2): mattermost-lts P4 overlay — postgres service is named 'postgres' not 'db' (exec_in_app container discovery) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 01:18:57 +01:00
autonomic-bot	80ad0a9ed1	feat(2): mattermost-lts P4 data-integrity overlay (ops.py postgres ci_marker seed + test_install/upgrade/backup/restore) — verifying recipe's PGDATA-dir restore brings the marker back Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 01:11:10 +01:00
autonomic-bot	db124d5107	fix(2): matrix register test — bounded readiness-retry on transient post-restore 5xx (synapse re-establishing DB pool after restore-tier DROP DATABASE); assertion unchanged, RAISEs on persistent failure Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 00:52:18 +01:00
autonomic-bot	ecd770b9ca	feat(2): immich P3 2nd functional test (asset-processing: metadata extraction + library statistics) + PARITY/DECISIONS for immich postgres-backup recipe-PR Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 00:08:10 +01:00
autonomic-bot	88449431e1	fix(2): Q4.9 mailu — rewrite mail-flow via in-container sendmail+doveadm; drop network IMAP-auth test Root cause of the 2 failing custom tests: TLS_FLAVOR=notls → dovecot refuses plaintext auth over network 143, so host-side IMAP login/auth isn't a meaningful signal. Smoke2 PROVED the in-container path: sendmail (postfix container) local-injects a marker mail → doveadm search (imap container) finds it in INBOX. test_mail_flow now exercises the real postfix→rspamd→dovecot deliver/store/fetch via exec_in_app(service=smtp/imap). Dropped test_imap_login (network plaintext-auth disallowed under notls). test_mailbox (create+config-export read-back) unchanged. PARITY.md updated. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 21:33:11 +01:00
autonomic-bot	916bdd8b68	feat(2): Q4.9 mailu — recipe_meta + health + 3 functional (create-mailbox/imap-login/mail-flow); P4 N/A deferred mailu (full email stack). TLS_FLAVOR=notls avoids certdumper/ACME dep (cc-ci file-provider cert); MAIL_DOMAIN/HOSTNAMES=run domain; TRAEFIK_STACK_NAME for the letsencrypt-volume mount. P2 vacuous (no corpus). P3: test_mailbox (flask mailu user create + config-export read-back), test_imap_login (mailbox authenticates over dovecot IMAP:143), test_mail_flow (SMTP submission send → IMAP retrieve, auth to avoid greylisting). P4 N/A (no backupbot label) — DEFERRED.md + PARITY.md, Adversary §7.1 sign-off pending. Smoke-validated: 8 services converge, mail ports 25/587/143/993 host-open, flask CLI. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 21:13:56 +01:00
autonomic-bot	ca7acf3d52	feat(2): Q4.6 discourse — recipe_meta + postgres P4 overlays + health (WIP, §4.3 create-topic next) discourse (forum: postgres+redis+sidekiq). HEALTH_PATH=/srv/status (slow Rails boot, DEPLOY_TIMEOUT=1800). P4 via postgres ci_marker (db service, pg_dump backupbot — matrix-synapse pattern). Health functional test. §4.3 create-a-topic + PARITY.md to follow after smoke discovers the admin/API bootstrap path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 20:38:25 +01:00
autonomic-bot	ec76072489	fix(2): Q4.2 mumble — TCP voice-server READY_PROBE gates backup past upgrade host-port churn Diagnostic (RECIPE=mumble STAGES=install,backup,restore,custom, no upgrade) PROVED backup+restore green on a stable 1.0.0 deploy incl. ci_marker survival (P4). The full-run backup 409 ('container not running') was the chaos UPGRADE redeploy: host-mode 64738 must be released by the old task + rebound by the new, and HEALTH_PATH '/' only proves the mumble-web sidecar (not the voice server), so wait_healthy passed while the app churned → backup-bot execed a not-running container. Fix: extend lifecycle.wait_ready_probes to support a TCP probe ({tcp_host,tcp_port,stable=N consecutive connects}); mumble recipe_meta READY_PROBE returns 64738 (stable=3) so the harness waits for the voice server up after install AND upgrade before backup. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 20:19:07 +01:00
autonomic-bot	a0fd58b4c5	fix(2): Q4.2 mumble — set sqlite busy timeout via silent .timeout dot-command, not PRAGMA PRAGMA busy_timeout=N emits its own result row, polluting the read-back parse (seed read back '20000\nupgrade-survives' → AssertionError 'seed did not commit', failing upgrade/backup/restore ops — though the INSERT actually committed). Switch _sqlite to 'sqlite3 -cmd ".timeout 20000"' which sets the busy timeout silently. install+custom already green (handshake/welcome/web/tcp PASS); this fixes the P4 lifecycle ops. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 19:54:10 +01:00
autonomic-bot	999dd0d564	fix(2): Q4.2 mumble — CHAOS_BASE_DEPLOY meta flag for chaos base deploy (clean-tree gate) mumble's pinned base deploy (prev version 0.2.0) FATAs 'has locally unstaged changes' because install_steps provides an untracked compose.host-ports.yml. New recipe_meta CHAOS_BASE_DEPLOY=True + lifecycle._recipe_meta_flag + deploy_app branch -> base uses chaos (skips clean-tree/lint, deploys the checked-out pinned version, not LATEST), mirroring the lightweight-tag chaos-base path. DECISIONS.md records the full mumble enrollment design. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 19:32:48 +01:00
autonomic-bot	6bf0425f50	fix(2): Q4.2 mumble — provide host-ports overlay for every version via install_steps The upstream compose.host-ports.yml exists only from v1.0.0+, but the upgrade-tier base deploy is the previous published version (0.2.0+), which predates it — so EXTRA_ENV's COMPOSE_FILE failed to resolve on the base deploy (config --images rc=14, deploy FATA). install_steps.sh now copies a cc-ci-owned identical overlay into the recipe checkout when absent, so 64738 is host-published for every version (base + upgrade) and on-host protocol tests reach 127.0.0.1:64738. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 19:27:38 +01:00
autonomic-bot	6841048aae	feat(2): Q4.2 mumble — parity port (health/protocol-handshake/web) + 2 specific + P4 sqlite - functional/_mumble_proto.py: stdlib Mumble TLS protocol client (adapted from corpus mumble_connect.py) - 3 parity ports: test_tcp_health, test_protocol_handshake (channel presence+ServerSync), test_web_client - 2 NEW recipe-specific (P3): welcome-text + max-users config round-trips over the protocol - P4: ops.py + test_backup/test_restore seed ci_marker in /data/mumble-server.sqlite (recipe's own backupbot DB), busy_timeout for live-server locks - test_install overlay: voice server listening on 64738 (beyond web-sidecar readiness) - recipe_meta: COMPOSE_FILE=compose.yml:mumbleweb:host-ports; WELCOME_TEXT/USERS markers - PARITY.md mapping table Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 19:20:56 +01:00
autonomic-bot	b4f39cb51a	fix(2): plausible install overlay — assert /api/health subsystems, not `/` (auth_controller 500s under headless DISABLE_AUTH; / is not a valid readiness probe) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 18:13:20 +01:00
autonomic-bot	3943cd80e5	feat(2): Q4.7 plausible — §4.3 event-tracking functional tests + PARITY.md; /api/health readiness probe - functional/test_event_tracking.py: 2 recipe-specific tests (P3) — register site → POST /api/event (browser UA) → read back from clickhouse events_v2. test_pageview_event_roundtrip asserts stored name/pathname/hostname; test_custom_event_roundtrip asserts a custom-named goal lands under that name. - test_health_check.py: probe /api/health (200, asserts clickhouse+postgres+sites_cache ready) — fixes the broken/unterminated docstring from the prior WIP edit; / is unreliable (500 init / 302 ready). - recipe_meta.py: HEALTH_PATH=/api/health, HEALTH_OK=(200,); comment corrected. - PARITY.md: P2 vacuous (no recipe-maintainer corpus); documents P3/P4 coverage. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 18:05:16 +01:00
autonomic-bot	baae41fe10	fix(2): plausible HTTP_TIMEOUT 600→1200 + DEPLOY_TIMEOUT 1200 — app 500s until clickhouse/migrations ready v1 failed wait_healthy 'not healthy / (last status 500)': plausible's app starts before clickhouse (plausible_events_db) is ready (recipe depends_on names events_db, mismatched → no swarm ordering) and returns 500 until DB migrations finish (several min on cold deploy). It serves 302 once ready; widen the health window. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 16:34:11 +01:00
autonomic-bot	f0f6b6f545	feat(2): Q4.7 plausible — ops + lifecycle overlays (postgres ci_marker; pg_dump backup hook) plausible (analytics; app + postgres db + clickhouse events_db). recipe_meta stub (DISABLE_AUTH/ REGISTRATION + SECRET_KEY_BASE) + health test pre-existing. Added ops.py (postgres ci_marker via db service, container-env psql) + test_install/upgrade/backup/restore overlays. plausible's postgres has a real pg_dump backup/restore hook (so P4 marker survives, unlike immich). §4.3 event-tracking test next (after live-API discovery). Tags annotated. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 16:21:15 +01:00
autonomic-bot	2bf40d69d6	feat(2): HQ1 image pre-pull (plan-prepull-images.md) — warm local store before deploy lifecycle.prepull_images(recipe, domain): resolve images via docker compose config --images (COMPOSE_FILE from the app .env — handles $VERSION interpolation + multi-compose) → docker pull each, skip-if-present (zero network for cached pinned tags). Called in deploy_app before the (unchanged, real) abra.deploy AND in generic.perform_upgrade before the chaos redeploy (warms new-version images). A pull failure RAISES a clear pre-deploy error (not a converge timeout); deploy path unchanged (no docker service update/scale). Removes PULL time not app-INIT time. 4 unit tests (tests/unit/test_prepull.py): present→skip, missing→ pull, pull-fail→raise, no-images→skip. NOT claimed yet — validating cold-verify criteria next. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 16:02:21 +01:00
autonomic-bot	82dc2d733d	feat(2): immich §4.3 asset upload→read-back→thumbnail test + PARITY test_asset_upload.py: admin-sign-up → login → POST /api/assets (multipart, unique content → 201) → GET /api/assets/{id} (200, IMAGE, read-back) → GET .../thumbnail (200, derivative generated, polled). Verified GREEN against a live immich probe (app v2.7.5). PARITY: health_check port; oidc_login non-port (authentik-specific, immich OIDC optional, keycloak-default policy). §4.3 floor + characteristic derivative-generation feature met. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 15:13:11 +01:00
autonomic-bot	b44d75b89c	fix(2): F2-13 cryptpad roundtrip read-back robustness — poll all frames for marker Adversary cold-verify of F2-9 FAILED: the read-back's CKEditor-frame-attach wait timed out on a fresh cold context (flaky, not 3x-reliable). Fix: read-back now polls EVERY frame's body text for the marker (don't require the specific ckeditor-inner frame to attach — that's the flaky part) with a generous ~240s deadline + periodic reloads to unstick cold loads. The marker appearing in a fresh context still proves server-side E2E-encrypted persistence (only URL+fragment key carried over). Also bumped the session-1 post-type sync wait 9s→12s. F2-13 Adversary-owned; will validate cold before it closes F2-9. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 15:08:52 +01:00
autonomic-bot	98a37d44b5	feat(2): Q3.5 immich enrollment (recipe_meta + ops + lifecycle overlays + health parity) immich (object-storage/large-volume photo mgmt; D10 category): 3 services (app incl. ML + web, redis, database/postgres), self-contained (no SSO dep — local admin; OIDC optional). recipe_meta (HTTP health, DEPLOY_TIMEOUT=1500), ops.py postgres ci_marker (postgres/immich, backupbot-labelled), lifecycle overlays, health_check parity. §4.3 upload-asset→list→thumbnail test next (after live-API discovery). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 14:40:57 +01:00
autonomic-bot	1f7806a9c4	fix(2): lasuite-meet meeting_flow — tolerant best-effort delete-verify (meet 0.3.0 soft-deletes) Full suite #5: install/upgrade/backup/restore + OIDC + create-room/read-back/LiveKit-token ALL pass (R014 chaos-base fix validated: upgrade crossover real 0.2.0→0.3.0). Only the final 404-after-DELETE assert failed — meet 0.3.0+v1.16.0 soft/async-deletes (DELETE 2xx, re-GET still 200). The §4.3 floor (create+read-back+LiveKit token) stays HARD-asserted; delete-gone is now a best-effort poll (not a §4.3 requirement). PARITY.md noted. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 14:24:21 +01:00
autonomic-bot	9c6cb539ee	feat(2): Q3.3 lasuite-meet §4.3 meeting_flow test + PARITY.md test_meeting_flow.py: OIDC token → POST /api/v1.0/rooms/ (201 + LiveKit token) → GET read-back (200) → assert LiveKit JWT grants the room → DELETE (204) → verify gone (404). The §4.3 create-an-object+ read-it-back + the distinctive WebRTC-signaling feature (LiveKit token issuance). PARITY.md maps health_check/oidc_login/meeting_flow ports + documents webrtc-media/relay non-port (UDP media relay = env-blocker per §7.1; maximal subset = LiveKit token issuance, shipped). install+OIDC already validated green (/root/ccci-meet-v1.log). Note: first-deploy 'No such image' was a one-time cold-pull race (images now cached + kept by conservative prune); deploy converges reliably. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 13:39:32 +01:00
autonomic-bot	31bda3995d	feat(2): Q3.3 lasuite-meet — install_steps (OIDC-at-install) + lifecycle overlays + health/OIDC parity tests Mirrors lasuite-drive machinery (sibling La Suite recipe): install_steps.sh wires OIDC at install (client_id from deps, scopes 'openid email'); ops.py + test_{install,upgrade,backup,restore}.py lifecycle overlays (postgres meet/meet ci_marker data-integrity); functional/test_health_check.py (parity) + test_oidc_with_keycloak.py (password-grant JWT vs dep keycloak, realm lasuite-meet-<6hex>). §4.3 meeting_flow + webrtc specifics next (after install+OIDC validated). No setup_custom_tests.sh (no post-deploy step — OIDC at install, no minio/collabora). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 13:22:30 +01:00
autonomic-bot	32a743f501	feat(2): Q3.3 lasuite-meet recipe_meta — DEPS=keycloak + OIDC_AT_INSTALL + livekit-domain flatten (reuses lasuite-drive machinery) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 13:14:42 +01:00
autonomic-bot	3484d25b5c	fix(2): cryptpad roundtrip — more patient pad-creation wait (240s + reload) for cold fresh deploy Full-suite custom-tier run showed the pad #/2/pad/edit fragment didn't appear within 80s on a fresh cold deploy (passed on the warm probe). Bump _open_pad hash-wait to ~240s + one mid-way reload. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 13:01:43 +01:00
autonomic-bot	6506c4ac3a	test(2): F2-12 P7-negative unit tests — owned upgrade-convergence wait fails on stuck convergence Proactively addresses the Adversary's pre-claim recon (`f7c5681`): since the F2-12 fix replaces abra's converge monitor (-c) with the harness's own wait, prove the replacement genuinely FAILS a broken convergence (non-vacuous), not just passes a slow one. 5 deterministic tests (fake clock, no deploy): - wait_ready_probes RAISES TimeoutError when the READY_PROBE never returns 200 (collabora wedged). - wait_ready_probes returns when it reaches 200; no-op without a READY_PROBE. - wait_healthy RAISES when services never converge, and when converged-but-never-serving. Run: cc-ci-run -m pytest tests/unit/test_f212_upgrade_convergence.py -q → 5 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 12:23:34 +01:00
autonomic-bot	e1147b5fe3	fix(2): F2-12 lasuite-drive upgrade tier — own convergence wait (abra -c) + collabora READY_PROBE Adversary cold-verify FAILed Q3.2 (F2-12): the prev→PR-head chaos upgrade's abra converge monitor FATAs while the NEW collabora 25.04.9.4.1's healthcheck is still in start_period (jail/config init), even though it converges given swarm's healthcheck retries. My WOPI pre-gate fixed the OLD collabora being killed mid-boot but not the NEW collabora's convergence. Flaky (3x green for me, 1x fail cold). Fix (cc-ci-side, stronger verification — not weaker): - abra.deploy gains no_converge_checks (`-c`); chaos_redeploy passes it for the upgrade op so abra's impatient monitor no longer FATAs (the stack spec is applied regardless). - perform_upgrade now OWNS the convergence verification after the redeploy: wait_healthy (services N/N + app HEALTH_PATH) + new lifecycle.wait_ready_probes (recipe READY_PROBE), bounded by the recipe DEPLOY_TIMEOUT (generous) not abra's impatient window. meta threaded _perform_op→perform_upgrade. - recipe_meta READY_PROBE hook (added to _load_meta whitelist): lasuite-drive probes collabora WOPI discovery (/hosting/discovery on collabora-<domain>) → 200. Called after install deploy AND after the upgrade redeploy. No-op for recipes without a READY_PROBE. NOT re-claiming yet — validating the upgrade tier is now reliably green (incl. the slow-collabora crossover) across multiple runs before re-claiming Q3.2. F2-12 stays open (Adversary-owned). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 11:55:53 +01:00
autonomic-bot	05d0dc14eb	feat(2): cryptpad create-pad content roundtrip Playwright test — resolves F2-9 (§4.3 create+read-back) Adds tests/cryptpad/playwright/test_pad_content_roundtrip.py: open /pad/ → CryptPad auto-creates a fragment-keyed pad → type a unique marker into the CKEditor body → wait for encrypted sync → open a FRESH browser context (no shared localStorage/cookies) → navigate to the captured pad URL → assert the marker survives in the re-decrypted body. Proves genuine end-to-end-encrypted server-side persistence (the fresh session carries only the URL+fragment key), the §4.3 create-and-read-back floor F2-9 requires — not a health/SPA stand-in. Empirically mapped against CryptPad 2026.2.0 (the prior deferral cited version-fragility on 5.7.0): editor is the deep nested frame …/pad/ckeditor-inner.html; ~15s cold-cache LESS-compile init; the fragment-keyed pad URL DOES appear after init; transient net::ERR_NETWORK_CHANGED handled by the shared goto_with_retry + a mid-load reload retry in the frame wait. PASSED against a live probe instance. PARITY.md updated (roundtrip = the P3/§4.3 test; SPA-render test kept as fast liveness). F2-9 is Adversary-owned — left for the Adversary to close after cold-verify. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 11:46:02 +01:00
autonomic-bot	4b38b66fa5	fix(2): lasuite-drive Q3.2a — gate upgrade redeploy on collabora-ready + plumb DEPLOY_TIMEOUT Q3.2a run 1: Part A (install-time OIDC) GREEN — deploy-count=1, install/backup/restore/custom + OIDC test all PASS. BUT upgrade tier FAILED: the in-place `abra app deploy --chaos` redeploy landed on a STILL-BOOTING collabora (coolwsd ~2min boot: 1300+ l10n files + RSA keygen) and SIGTERMed it mid-init ("Shutdown requested while starting up", forced exit 70) → abra aborted the deploy. The install wait_healthy returns on container 1/1 while coolwsd is still loading. Fixes (plan §C readiness-gating, no test weakened): - tests/lasuite-drive/ops.py::pre_upgrade — wait for collabora WOPI discovery (/hosting/discovery on collabora-<domain>) → 200 BEFORE the chaos redeploy, so it replaces a ready collabora cleanly. - runner/harness/lifecycle.chaos_redeploy + generic.perform_upgrade + run_recipe_ci._perform_op — plumb the recipe DEPLOY_TIMEOUT to the upgrade chaos redeploy (was abra.deploy's 900s default, while the .env internal TIMEOUT is 1500s → Python could SIGKILL abra mid-wait on the slow collabora/onlyoffice reconverge). Mirrors the install deploy_app timeout plumbing. Also (operator naming change 2026-05-29): renamed `--extra-tests` -> `--extra` in DEFERRED.md + BACKLOG-2.md Build-backlog section. 3 refs remain in BACKLOG-2 Adversary-findings section (241/248/292, closed findings) — left for the Adversary (single-writer); orchestrator updated IDEAS.md/plan-sso-dep-testing.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 10:37:55 +01:00
autonomic-bot	a151489996	feat(2): lasuite-drive Q3.2a Part A — wire OIDC at INSTALL, eliminate flaky redeploy Q3.2a / plan-lasuite-drive-oidc-robustness.md Part A. The old setup_custom_tests.sh did a post-deploy in-place `abra app deploy --force --chaos` of the heavy 12-service stack to apply the OIDC env — flaky (collabora WOPI-discovery race + gunicorn-perms; JOURNAL Step 0). Since the OIDC env only affects backend/app and keycloak is live-warm, provision the per-run realm BEFORE the single deploy and wire OIDC into the .env at install time (no reconverge). - runner/run_recipe_ci.py: new _provision_deps() helper (warm/cold split + SSO enrich + write $CCCI_DEPS_FILE), used by both paths. New per-recipe OIDC_AT_INSTALL meta flag (added to _load_meta whitelist). When set + deps live-warm: provision BEFORE deploy_app; the install tier's install_steps.sh wires OIDC into the single deploy; post-deploy step runs only the MinIO bucket one-shot — no re-provision, no redeploy. Legacy post-deploy path unchanged for all other dep recipes (gated on `not oidc_at_install`). - tests/lasuite-drive/install_steps.sh (NEW): install-time OIDC env + secret wiring; no-ops on empty deps file (recipe still boots, OIDC test skips → F2-11 RED). - tests/lasuite-drive/setup_custom_tests.sh: trimmed to MinIO-bucket-only (OIDC moved out). - tests/lasuite-drive/recipe_meta.py: OIDC_AT_INSTALL = True. - JOURNAL-2: Step-0 root-cause failure logs captured before the fix. NOT a claim — validating 3x green (incl. now-required upgrade tier) before claiming Q3.2. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 10:10:05 +01:00
autonomic-bot	fc6e35d617	feat(2): mattermost-lts create-message round-trip (§4.3 P3) — first-user→login→team→channel→post→read-back; harness http.post_with_headers (returns response headers, for mattermost login Token)	2026-05-29 08:31:37 +01:00
autonomic-bot	8ce62c4fa6	feat(2): enroll mattermost-lts (Q4.5) — recipe_meta (HTTP-native, self-contained postgres) + health_check (root + /api/v4/system/ping) + PARITY (no corpus → P2 vacuous; create-message §4.3 + P4 ops planned)	2026-05-29 08:24:41 +01:00
autonomic-bot	f1c626cc67	fix(2): lasuite-drive setup_custom_tests — docker service scale --detach for the run-once minio-createbuckets job (blocking scale hung the custom tier forever; --detach submits + returns, bucket-poll confirms)	2026-05-29 06:21:42 +01:00
autonomic-bot	40b03a9bf1	claim(2w): WC8 + WC9 (FINAL gates) — resource-safety consolidation + stale-warm prune + docs/warm.md + --quick rollback proof WC8: canonical.prune_stale (drop de-enrolled warm data + volumes) wired into the nightly sweep + df log; consolidated evidence (DRONE_RUNNER_CAPACITY=MAX_TESTS serialize; autoPrune drops --volumes so warm vols survive; cold teardown sacred; warm excluded from D8 — no nix source ref). +1 unit (72 pass). WC9: docs/warm.md documents the full warm/quick model; --quick rollback proof already proven live (W2 FAIL restores exact known-good; WC4 PASS byte-identical snapshot). On PASS, all WC1-WC9 (incl WC1.1/WC1.2) verified → DONE. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 04:43:34 +01:00
autonomic-bot	465e1059b0	claim(2w): WC6 nightly full-cold sweep — timer+service roll warm/infra (health-gated) then serial cold sweep promoting canonicals (WC5); proven live canonical.enrolled_recipes; runner/nightly_sweep.py (roll keycloak+traefik → serial full-cold over enrolled on latest → green promotes; skip if test active; operate against CCCI_REPO checkout for tests/); nix/modules/nightly-sweep.nix (timer 03:00 Persistent + oneshot service) wired in. 2 bugs fixed via live service run (repo-relative enrolled scan; util-linux for backup PTY). Live SERVICE sweep: enrolled=['custom-html'] → all tiers green → canonical advanced 1.10.0→1.11.0; red-run correctly does NOT promote. 71 unit pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 04:33:08 +01:00
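
Commit a0fd58b4c5 above reports that `PRAGMA busy_timeout=N` emits its own result row, which polluted the marker read-back ('20000\nupgrade-survives'). The sketch below reproduces that effect with Python's stdlib `sqlite3` module rather than the `sqlite3` CLI the commit actually uses, so it is an analogue, not the harness code. The `ci_marker` table name comes from the log; the `value` column and all function names are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """In-memory DB with one marker row (the `value` column is an assumed schema)."""
    # timeout=20.0 sets the busy timeout through the connection API,
    # which produces no result row.
    conn = sqlite3.connect(":memory:", timeout=20.0)
    conn.execute("CREATE TABLE ci_marker (value TEXT)")
    conn.execute("INSERT INTO ci_marker VALUES ('upgrade-survives')")
    conn.commit()
    return conn


def read_with_pragma(conn: sqlite3.Connection) -> list[str]:
    """Collect every row both statements emit, as a stdout-parsing caller would."""
    rows = conn.execute("PRAGMA busy_timeout=20000").fetchall()  # emits (20000,)
    rows += conn.execute("SELECT value FROM ci_marker").fetchall()
    return [str(r[0]) for r in rows]


def read_with_connect_timeout(conn: sqlite3.Connection) -> list[str]:
    """Read the marker only; the busy timeout was already set silently at connect."""
    rows = conn.execute("SELECT value FROM ci_marker").fetchall()
    return [str(r[0]) for r in rows]
```

The connect-time timeout plays the role that `-cmd ".timeout 20000"` plays in the commit: the busy timeout is set without adding a row to the output that the read-back parses.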


106 Commits
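
Commit ec76072489 in the log above describes a TCP readiness probe that requires `stable=N` consecutive successful connects before the backup tier may proceed. The following is a minimal sketch of that idea only, assuming nothing about the real `lifecycle.wait_ready_probes` signature. The names `tcp_connects` and `wait_tcp_stable`, the parameters, and the defaults are all hypothetical.

```python
import socket
import time
from typing import Callable


def tcp_connects(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_tcp_stable(
    host: str,
    port: int,
    stable: int = 3,
    deadline_s: float = 60.0,
    interval_s: float = 1.0,
    connect: Callable[[str, int], bool] = tcp_connects,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until `stable` consecutive connects succeed; return attempts made.

    Any failed connect resets the streak, so a port that flaps during a
    redeploy does not count as ready. Raises TimeoutError at the deadline.
    """
    start = clock()
    streak = 0
    attempts = 0
    while True:
        attempts += 1
        if connect(host, port):
            streak += 1
            if streak >= stable:
                return attempts
        else:
            streak = 0  # churn observed: start counting again
        if clock() - start >= deadline_s:
            raise TimeoutError(
                f"{host}:{port} not stable after {attempts} attempts"
            )
        sleep(interval_s)
```

A caller might use `wait_tcp_stable("127.0.0.1", 64738, stable=3)` after install and again after the upgrade redeploy. The injectable `connect`, `clock`, and `sleep` arguments let both the success path and the timeout path be exercised with a fake clock and no deploy, which is the style of test commit 6506c4ac3a describes.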