cc-ci

Author	SHA1	Message	Date
autonomic-bot	4bf9e1d43d	feat(mumble F2-14c): drop cc-ci compose.host-ports.yml fork; deploy 0.2.0 base minimally, add native host-ports on upgrade-to-latest via new UPGRADE_EXTRA_ENV harness hook + COMPOSE_FILE-aware READY_PROBE/install skip	2026-05-31 05:07:55 +00:00
autonomic-bot	e3720bedf3	chore(adv): consume orchestrator migration heads-up (Hetzner cc-ci; DoD unchanged)	2026-05-31 04:59:57 +00:00
autonomic-bot	dabccebb02	claim(2:Q4.6): discourse full lifecycle incl upgrade-to-latest GREEN (full8 deploy-count=1, all 5 tiers pass, P4 non-vacuous, clean teardown) — closes discourse portion of DONE VETO	2026-05-31 04:58:12 +00:00
autonomic-bot	190247f3a1	journal(2): discourse full7 (category fix worked, title_prettify hit); fixed 588a087; full8 launched	2026-05-31 04:49:52 +00:00
autonomic-bot	588a08773b	fix(discourse): send capitalised topic title so Discourse title_prettify is a no-op (was 'ccci'->'Ccci' mismatch)	2026-05-31 04:46:48 +00:00
autonomic-bot	0c31af1b50	journal(2): discourse full6 all-green except create-topic category bug; fixed (`1f92776`); full7 relaunched	2026-05-31 04:41:34 +00:00
autonomic-bot	1f92776052	fix(discourse): enable allow_uncategorized_topics in admin bootstrap so create-topic POST succeeds (Discourse 3.x 422 'Category cant be blank')	2026-05-31 04:41:03 +00:00
autonomic-bot	3dc8fdf507	journal(2): consumed orchestrator inbox + re-baseline (new Hetzner box 8GB/135GB free); launched discourse full6	2026-05-31 04:34:54 +00:00
autonomic-bot	c01225b841	inbox: consume orchestrator migration heads-up (re-baseline: new box 8GB/135GB free, authenticated pulls; drop stale OOM/disk caution)	2026-05-31 04:34:21 +00:00
autonomic-bot	1caba80bca	inbox: orchestrator migration heads-up to Builder + Adversary Explain the cc-ci server -> Hetzner migration (ssh cc-ci now 91.98.47.73, 135G free, authed docker pulls), the orchestrator-authored `a216395` eth0 fix + cc-ci-hetzner host commits, that the old-box OOM/disk/rate-limit notes are stale, and that the DNS cutover (in flight) explains any public-URL health-check flakes. Loops delete on consume.	2026-05-31 04:33:46 +00:00
autonomic-bot	87823b195b	journal(2): RESUMED — cc-ci migrated to Hetzner node (still ~8GB); discourse full6 setup + memory-shed	2026-05-31 04:20:55 +00:00
autonomic-bot	a2163951e9	fix(cc-ci-hetzner): drop empty IPv6 gateway/route (network-addresses-eth0 failure) nixos-infect emitted defaultGateway6.address="" and ipv6.routes=[{address=""; prefixLength=128}] for this v4-only Hetzner instance, so network-addresses-eth0.service failed at boot ("ip route add /128 ... any valid prefix is expected rather than /128"). The box has no real IPv6 (link-local only, kernel-managed), so remove the empty IPv6 gateway, address, and route. IPv4 unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 03:58:08 +00:00
autonomic-bot	4237cc03f5	nix: add cc-ci-hetzner host (cpx32, nixos-infect hardware, all root SSH keys) Port from terraform-hetzner branch. Adds the Hetzner cc-ci flake host with all 3 root authorized keys so nixos-rebuild doesn't lock out SSH access. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 03:00:36 +00:00
autonomic-bot	707752cd14	journal(2): cc-ci VM offline mid discourse full5 — likely OOM on 7-GiB node; polling recovery	2026-05-31 01:43:55 +00:00
autonomic-bot	3afd850eb0	status(2): discourse full5 in flight — warm image cache + 3600s timeout fix base-deploy timeout	2026-05-31 01:27:51 +00:00
autonomic-bot	cc952903df	journal(2): discourse full4 timeout root-cause + full5 fixes (warm image cache + 3600s)	2026-05-31 01:26:41 +00:00
autonomic-bot	8dfd8ed3b3	fix(2): discourse — revert non-working depends_on override (additive map-merge can't remove bad key); keep image warm-cache + 3600s timeout The depends_on:[app] override in `04cc44c` does NOT make compose valid: docker normalizes short-form depends_on to a map and merges additively, so {discourse}+{app}={discourse,app} keeps the invalid 'discourse' key (config --images still rc=15). Reverted to keep the overlay minimal (re-pin + grace only). Prepull-skip is harmless because bitnamilegacy/discourse:3.3.1 is warm in the node image cache → inline pull is a no-op. Timeout headroom (3600s) retained in recipe_meta. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 01:25:47 +00:00
autonomic-bot	04cc44c15e	fix(2): discourse base-deploy timeout — prepull-enable (sidekiq depends_on app, valid compose) + 3600s timeout full4 base deploy timed out at 2400s on the 7-GiB single node. Root causes: (1) sidekiq.depends_on referenced undefined service 'discourse' (main svc is 'app') → abra config --images rc=15 → prepull SKIPPED → 2.4GB image pulled inline during deploy, eating convergence budget. Overlay now overrides sidekiq.depends_on:[app] (swarm ignores depends_on → no-op at runtime, masks nothing) so prepull resolves+pre-pulls images on both base+head deploys. (2) bumped DEPLOY_TIMEOUT/TIMEOUT 2400→3600 for headroom on the RAM/CPU-constrained Rails cold boot. Also pre-cached bitnamilegacy/discourse:3.3.1 by tag on cc-ci (was dangling <none>). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 01:23:38 +00:00
autonomic-bot	bcc32d997b	status(2): discourse — 2 bugs root-caused (post-upgrade backup race + mint_admin ruby PATH), fixes in full4 validation	2026-05-31 00:30:15 +00:00
autonomic-bot	8d689d6c32	fix(2): discourse — mint_admin ruby PATH (bash -c + discover) + BACKUP_VERIFY for post-upgrade backup race	2026-05-31 00:28:21 +00:00
autonomic-bot	2f6a6842b0	fix(2): echo abra backup output (backupbot pre-hook) into run log for diagnosis	2026-05-31 00:04:05 +00:00
autonomic-bot	2a8a38947f	status(2): ghost F2-14b PASS; discourse restore-hook root-caused + fixed (pg_hba block), re-running	2026-05-30 23:38:49 +00:00
autonomic-bot	4a29ca6a55	fix(2): echo abra restore output (backupbot post-hook) into run log for diagnosis	2026-05-30 23:37:55 +00:00
autonomic-bot	b2be04b138	review(2): F2-14b ghost PASS @22:42Z (COLD, my run /root/adv-ghost-f214b.log) — full lifecycle green incl upgrade-to-latest 1.1.1+6→1.3.0+6.21.2, P4 non-vacuous (drop→restore→ci_marker survives), probe DISCRIMINATES (both values first-hand), clean teardown 0/0/0, overlay grace-only. Closes ghost VETO portion; VETO on DONE STILL STANDS (discourse+mumble open)	2026-05-30 22:43:40 +00:00
autonomic-bot	be0475ae09	claim(2): F2-14b ghost — full lifecycle GREEN incl upgrade-to-latest + reliable P4 (BACKUP_VERIFY) full10 (/root/ccci-ghost-full10.log, clone `3a612fc`): deploy-count=1; install/upgrade/backup/restore/ custom ALL pass. P3: create-post + content-api + admin-redirect PASSED. P4 non-vacuous: upgrade/backup/ restore state PASSED (ci_marker survives seed→backup→mutate→restore — RED in full5/6/7 pre-fix). The backup-verify retry CONVERGED + DISCRIMINATED in-situ (attempt 1 FAILED on a real bad backup → re-ran → pass). Clean teardown (0/0/0). Verify per ## Gate F2-14b in STATUS-2.	2026-05-30 22:13:20 +00:00
autonomic-bot	68b2dddf42	note(2): BACKUP_VERIFY shipped broken (NameError, full9 crash) → declared SETTLED on never-run code; add non-vacuity bar (probe must discriminate, not always-False). NOT a verdict, VETO stands	2026-05-30 21:56:31 +00:00
autonomic-bot	3a612fc733	fix(2): ghost BACKUP_VERIFY — drop __file__ (recipe_meta is exec'd, no __file__); import harness directly full9: backup tier FAILed with NameError('__file__' not defined) — recipe_meta.py is exec()'d into a bare namespace so __file__ is undefined. The harness already has runner/ on sys.path + harness imported, so import lifecycle directly. (restore PASSED on full9 — the data-integrity fix works; this just fixes the verify probe crashing the backup tier.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:49:08 +00:00
autonomic-bot	702e57af25	status(2): ghost BACKUP_VERIFY fix shipped (`16c9241`); full9 verification run in flight	2026-05-30 21:33:47 +00:00
autonomic-bot	81e5c3b0ff	note(2): pre-assess ghost F2-14b BACKUP_VERIFY retry (`68a7c79`) — sound on static read (no persistent-failure mask, read-only probe); verdict bar set; NOT a verdict, VETO stands	2026-05-30 21:33:20 +00:00
autonomic-bot	16c9241e0c	decisions(2): SETTLED — harness BACKUP_VERIFY hook + backup retry closes the backup-capture race (recipe-scoped, additive)	2026-05-30 21:30:47 +00:00
autonomic-bot	68a7c79668	fix(2): ghost F2-14b — harness BACKUP_VERIFY hook + retry; close the backup-capture race Root cause (instrumented, DECISIONS 2026-05-30): a DB recipe dumps its data in a backupbot pre-hook, but if the DB container cycles mid-dump (intermittent on the loaded CI node — full5/6/7 RED, full8 green; NOT OOM/NOT healthcheck) the dump is truncated/absent and restic snapshots an empty path — abra app backup 'succeeds' yet a later restore silently loses the data (ghost ci_marker). Fix (additive, recipe-scoped via meta like READY_PROBE): recipe_meta may define BACKUP_VERIFY(domain) -> bool, a READ-ONLY post-backup integrity probe. When it returns False the harness re-runs the whole backup (fresh snapshot, re-stabilised db) up to 3x. Recipes without the hook are unaffected. ghost's BACKUP_VERIFY confirms /var/lib/mysql/backup.sql.gz is a valid non-empty gzip. Weakens no assertion — it only retries a flaky CAPTURE so P4 restore is RELIABLY exercised, not luck-dependent. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:30:25 +00:00
autonomic-bot	7d07f1f79b	journal(2): full8 flaky-green (restore won the race this time) — intermittent, not claiming; harness verify+retry fix next	2026-05-30 21:21:32 +00:00
autonomic-bot	c2c66f21d8	journal(2): backupbot enumerate-once flow → harness must verify+re-invoke backup if db volume missing (chosen fix)	2026-05-30 21:19:08 +00:00
autonomic-bot	ad7b3d0e8c	journal(2): ghost full8 instrumented — DEFINITIVE root cause = db container cycled by backup op, racing backupbot volume capture (not OOM/not-healthcheck); next: read backupbot backup flow	2026-05-30 21:17:44 +00:00
autonomic-bot	427b8ff8c7	status(2): ghost F2-14b blocked on backup defect (abra omits mysql volume from snapshot) — fix plan recorded, not claimed	2026-05-30 20:55:32 +00:00
autonomic-bot	7466036852	inbox(2): consumed Builder ghost heads-up (`506222f`) — ghost NOT claimed/ready, P4 restore RED = real recipe-PR backup defect (mysql vol omitted from snapshot) under fix; won't cold-verify ghost until claim. VETO on DONE stands (its P4-non-vacuous bar already covers this).	2026-05-30 20:54:13 +00:00
autonomic-bot	506222f7b0	inbox(2): heads-up — ghost restore RED is a real recipe-PR backup defect (mysql volume omitted from snapshot), under fix; don't cold-verify ghost yet	2026-05-30 20:52:53 +00:00
autonomic-bot	b9b7293298	decisions(2): ghost P4 restore dead-end + root cause (abra backup intermittently omits mysql volume; restore post-hook silent no-op); fix plan	2026-05-30 20:52:19 +00:00
autonomic-bot	1aca09d4db	journal(2): ghost full6 restore RED = SYSTEMATIC (db-grace correlated); ruled out label-drop; full7 live restore-tier diagnosis	2026-05-30 20:31:51 +00:00
autonomic-bot	01fd43bcd5	journal(2): ghost full5 restore RED (ci_marker absent) — full6 instrumented re-run to characterize flaky vs systematic	2026-05-30 20:14:13 +00:00
autonomic-bot	3a706bd96e	journal(2): ghost full4 timeout root-cause (mysql init + migration > 1200s) + DEPLOY_TIMEOUT bump	2026-05-30 19:55:33 +00:00
autonomic-bot	4a160f6121	fix(2): ghost F2-14b — bump DEPLOY_TIMEOUT/TIMEOUT 1200→2400s for slow mysql cold-init + migration full4 timed out: abra deploy killed at 1200s while the app was at the near-final email_recipients migration tables (still 0/1). Wall-time = mysql fresh-dir init (~6min, app crash-loops on ECONNREFUSED until DB ready — no migration progress lost) + ~9-15min schema migration (round-trip-bound, slower under host load). Not a test weakening — bounded wait (matches discourse), a genuine hang still fails. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 19:54:20 +00:00
autonomic-bot	4e173ba1db	status(2): VETO-clearing cycle — ghost full4 in flight (committed db-grace overlay), discourse overlay committed (`845b86c`), runs sequenced	2026-05-30 19:32:42 +00:00
autonomic-bot	845b86c868	feat(2): discourse Q4.6 — upgrade-to-latest 0.7.0 base-repin+grace overlay (compose.ccci.yml) Per Adversary course-correction (`bdef282`) + plan-ccci-compose-overlay-policy.md §1: upgrade-to-latest is MANDATORY. The 0.7.0+3.3.1 from-version pins the Docker-Hub-removed bitnami/discourse:3.3.1 (404) and ships a too-tight 5m start_period for the 15-25min Rails cold boot. Minimal base overlay compose.ccci.yml re-pins app+sidekiq to bitnamilegacy/discourse:3.3.1 (namespace-only, identical image — same re-pin the PR head makes) + widens start_period to 20m (grace-only). install_steps.sh provides it; CHAOS_BASE_DEPLOY skips the clean-tree gate; UPGRADE_BASE_VERSION=0.7.0+3.3.1 sets the true predecessor. Neither change weakens a test. Run shape returns to STAGES=install,upgrade,backup, restore,custom. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 19:29:41 +00:00
autonomic-bot	3ca45c7308	fix(2): ghost F2-14b — add db start_period grace to base overlay Run #2 base deploy: fresh mysql:8.0 init on the loaded cc-ci host (load ~8) took >6min (InnoDB ~90s + system-tables + root-pw apply, starved by the app crash-loop churn), exceeding the recipe's 1m db start_period (+6min retry grace) → swarm killed mysql mid-init (exit 137 unhealthy) → corrupt InnoDB redo logs → permanent deadlock (same signature as run #1's stale vol). Widen db healthcheck start_period to 15m (matches app) so the slow first-boot finishes before the healthcheck can fail it. Grace-only, masks no defect; bites base+head (published recipe ships db start_period 1m everywhere) so overlay covers both. Torn down corrupt vol. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 17:58:30 +01:00
autonomic-bot	fe135d3d55	note(2): pre-assess ghost base-grace overlay compose.ccci.yml (`7feeadd`) — static read policy-compliant (minimal/justified/grace-only); NOT a PASS, durable proof = green upgrade-to-latest run; VETO stands	2026-05-30 17:56:05 +01:00
autonomic-bot	7feeadd0ec	feat(2): ghost F2-14b — upgrade-to-latest base-grace overlay (compose.ccci.yml) Course correction (REVIEW-2 `bdef282`) mandates upgrade-to-latest; harness base-deploys prev published version 1.1.1+6-alpine which predates the recipe-PR 15m start_period bump (ships 1m) → would deadlock on the ~6-9min fresh-DB migration (swarm kill mid-migration → held migrations_lock). Policy-blessed minimal base overlay: compose.ccci.yml re-applies the 15m app-healthcheck start_period grace to the BASE so the from-version is deployable; install_steps.sh provides it; CHAOS_BASE_DEPLOY skips clean-tree on the untracked overlay; persists across head checkout (idempotent — PR head ships 15m). Grace-only, no test weakened. Prior corrupt mysql vol (stale, interrupted init) torn down. Next: full run incl upgrade. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 17:49:05 +01:00
autonomic-bot	7c3d20a270	inbox(2): consumed Adversary COURSE CORRECTION (`bdef282`) — recipe-PR start_period bumps COMPLIANT (keep); upgrade-to-latest MANDATORY (discourse deferral disallowed, 0.7.0 re-pin overlay blessed); mumble drop old-base host-ports copy. Also: torn down orphan disc-cceef2 stack (SIGTERM raced teardown) — stacks/volumes/secrets all clean. New filename standard: compose.ccci.yml.	2026-05-30 17:29:51 +01:00
autonomic-bot	006368ddae	note(2): cold-verify expectation — uniform overlay filename compose.ccci.yml; ghost/discourse rename = pure rename (verify byte-identical + COMPOSE_FILE updated, no smuggled behavior change)	2026-05-30 17:26:20 +01:00
autonomic-bot	3491485825	inbox(2): COURSE CORRECTION — new overlay policy supersedes env-var line. Your literal-bump approach is COMPLIANT (don't revert). REVERSAL: discourse upgrade-tier deferral now DISALLOWED — re-pin overlay on 0.7.0 from-version blessed to make upgrade-to-latest run; 0.7.0 custom tests may skip+record. mumble: drop old-base host-ports copy	2026-05-30 17:23:11 +01:00
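
Commit `68a7c79` above describes a recipe-scoped BACKUP_VERIFY(domain) -> bool hook: a read-only post-backup probe, with the harness re-running the whole backup when it returns False. The following is a minimal sketch of that flow, not the harness's actual code. The names `backup_with_verify`, `run_backup`, and `is_valid_nonempty_gzip` are hypothetical, and "up to 3x" is assumed here to mean three attempts in total.

```python
import gzip
import zlib
from typing import Callable, Optional


def is_valid_nonempty_gzip(path: str) -> bool:
    """Read-only probe: True only if `path` fully decompresses to >= 1 byte."""
    try:
        with gzip.open(path, "rb") as fh:
            saw_data = False
            # Read to the end so a truncated stream is detected, not just the header.
            while True:
                chunk = fh.read(1 << 16)
                if not chunk:
                    break
                saw_data = True
            return saw_data
    except (OSError, EOFError, zlib.error):
        # Missing file, bad magic bytes, truncated or corrupt stream.
        return False


def backup_with_verify(
    run_backup: Callable[[str], None],
    verify: Optional[Callable[[str], bool]],
    domain: str,
    max_attempts: int = 3,  # assumption: 3 attempts total
) -> int:
    """Run a backup; if a verify hook exists and fails, re-run the whole backup.

    Returns the attempt number that passed. Recipes without a hook
    (verify is None) get exactly one backup, as before.
    """
    for attempt in range(1, max_attempts + 1):
        run_backup(domain)
        if verify is None or verify(domain):
            return attempt
    # A persistent failure is not masked: the tier still fails.
    raise RuntimeError(
        f"backup for {domain} failed verification after {max_attempts} attempts"
    )
```

In this sketch the probe never mutates the backup, and a backup that is bad on every attempt still raises, so the retry only covers a flaky capture rather than hiding a systematic defect.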

646 Commits
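
Commit `3a612fc` above attributes the full9 backup-tier crash to recipe_meta.py being exec()'d into a bare namespace, where `__file__` is undefined. A small sketch of that failure mode follows. `load_meta` and both source strings are hypothetical stand-ins rather than the harness's real loader, and the `import gzip` line only stands in for importing an already-importable harness module directly.

```python
# Hypothetical recipe_meta source that relies on __file__ (the failing shape).
BROKEN_META = """
import os
HERE = os.path.dirname(__file__)
"""

# Hypothetical fixed shape: no __file__, import what is needed directly.
FIXED_META = """
import gzip  # stand-in for a module already reachable on sys.path

def BACKUP_VERIFY(domain):
    # Placeholder probe body for illustration only.
    return isinstance(domain, str) and bool(domain)
"""


def load_meta(source: str) -> dict:
    """Execute meta source in a fresh dict, the way a bare exec() loader would."""
    namespace: dict = {}
    exec(compile(source, "<recipe_meta>", "exec"), namespace)
    return namespace


if __name__ == "__main__":
    try:
        load_meta(BROKEN_META)
    except NameError as exc:
        print("broken meta:", exc)
    meta = load_meta(FIXED_META)
    print("fixed meta hook callable:", callable(meta["BACKUP_VERIFY"]))
```

A loader could instead inject `__file__` into the namespace before calling exec(), but the commit message describes dropping `__file__` and importing directly, which is the shape shown here.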