cc-ci

Author	SHA1	Message	Date
autonomic-bot	f94de22234	fix(canon): promote does a FAITHFUL warm install (clean tree + deps + install_steps) M2 finding (Adversary-flagged): promote_canonical did a bare `abra app deploy` that lacked the cold install's wiring, so recipes that passed the cold test still failed to promote: - ghost: `abra app new` FATA 'locally unstaged changes' — the CCCI_SKIP_FETCH per-run tree was left dirty by the tier suite. Fix: force re-checkout the tag + `git clean -fd` before deploy. - bluesky-pds: missing pds_plc_rotation_key (install_steps inserts it, #generate=false). - custom-html-tiny: 404 (install_steps seeds index.html). Fix: run install_steps_hook in promote. - OIDC recipes would miss their realm. Fix: provision DEPS in promote like the cold install. promote_canonical now: clean tree → provision deps → deploy_app with install_steps_hook + overlay + ready-probes, then snapshot. Also: sweep result label now derives from whether the canonical was actually written (promote is non-fatal; rc==0 did not imply promoted) — fixes the misleading 'PASS (promoted)'. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-17 08:50:59 +00:00
autonomic-bot	d4cc9e4530	fix(canon): promote the TESTED release version, not a re-derived latest tag Closes the head_version-vs-latest_version divergence: should_promote gates on head_version (code under test) but promote_canonical recorded latest_version(recipe_tags). In a manual RECIPE=<r> run whose main checkout sits on a tag OLDER than the newest published tag, the gate would pass on the older tag yet promote the newer (never-tested) one. promote_canonical now takes the tested `version` (head_version, guaranteed a release tag by the tagged-gate) and records exactly that. Sweep path unaffected (head==tag by construction). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-17 06:47:33 +00:00
autonomic-bot	27e06289f8	feat(canon): M1.1 tagged-promote gate — canonical only advances to a published release tag - should_promote_canonical gains a `tagged` requirement (canon §2.A): a green cold latest run promotes only when the tested head version is a published release tag; an untagged main commit never becomes a canonical. - warm_reconcile.is_released_version(recipe, version): release-tag membership (exact or by version_key). Caller computes `tagged` so the gate stays pure. - unit tests: untagged -> no promote; is_released_version cases. - drive-by (pre-existing reds, unrelated to canon, now green): test_warm_reconcile traefik assertion was stale vs the phase-pxgate spec (probes /api/version, no health_domain); meta.py UPGRADE_BASE_VERSION KEYS help synced to the prevb doc text. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-17 06:34:09 +00:00
autonomic-bot	b29bb3f804	feat(samever): step back to older base when last-green canonical == head version resolve_upgrade_base now reads the head's published version (abra.head_compose_version, the coop-cloud.<stack>.version label) and, when the last-green warm-canonical version equals it, steps back to the newest published version strictly older than head instead of deploying a same-version no-op. warm_reconcile gains version_key + newest_older_version (single coop-cloud ordering source; sort_versions refactored onto version_key, no behavior change). Skip only when no older published predecessor exists. Step-back returns kind=version so it inherits F1d-2 pinned-tag checkout. Extends tests/unit/test_upgrade_base.py (13 pass).	2026-06-17 04:24:14 +00:00
autonomic-bot	bb2e3c6b2c	feat(prevb): dynamic upgrade base (last-green→main→skip) + per-recipe previous/ overlay; migrate discourse off static base + leaky overlay - resolve_upgrade_base: BasePlan(kind=version|ref|skip); last-green (warm canonical) primary, main-tip fallback, declared skip else. UPGRADE_BASE_VERSION retained as optional override. - deploy_app: base_ref path (chaos-deploy a main-tip/last-green commit) + apply_previous wiring. - lifecycle: previous/ surface (has_previous, previous_target_version, previous_status decision, provide/remove overlay, compose_file add/remove, recipe_branch_commit, stack_service_names). - generic.perform_upgrade: strip previous/ overlay + COMPOSE_FILE entry before head redeploy. - discourse: compose.ccci.yml now environmental-only (order: stop-first); removed bitnamilegacy pins + sidekiq + UPGRADE_BASE_VERSION; test_upgrade.py asserts head image == official 3.5.3 + no sidekiq. - unit tests: resolve_upgrade_base matrix + previous/ apply/skip/stale + COMPOSE_FILE layering.	2026-06-17 00:15:06 +00:00
autonomic-bot	a121d2c069	fix(gtea): fix M2 blockers — LFS upgrade and REF=main HC1 Blocker 1 (LFS roundtrip fails on PR #1): - Add UPGRADE_EXTRA_ENV to gitea recipe_meta.py — after PR-head checkout (compose.lfs.yml now in ABRA_DIR), add compose.lfs.yml to COMPOSE_FILE and set SECRET_LFS_JWT_SECRET_VERSION=v1 so the upgrade chaos redeploy actually runs with LFS enabled. Without this, the base install checks out the 3.5.x tag (compose.lfs.yml removed), EXTRA_ENV sees no LFS, and the upgrade chaos redeploy inherits the no-LFS .env — so the LFS test runs (compose.lfs.yml is restored by recipe_checkout_ref) but LFS is off. - Add abra.secret_generate(domain) in generic.perform_upgrade when upgrade_env is non-empty — generates lfs_jwt_secret before chaos redeploy. Blocker 2 (REF=main upgrade fails HC1): - Always use recipe_head_commit (git rev-parse HEAD) for head_ref instead of using ref directly. When ref="main" (a branch name), the HC1 commit check "head_ref.startswith(chaos_commit)" always fails since "main" ≠ SHA. recipe_head_commit returns the actual SHA after the fetch/checkout. Side-fix (stale creds — build #675): - ops.py pre_install: delete the per-domain creds file before calling _ensure_admin. A fresh install wipes gitea's DB; any creds file from a prior run on the same domain is stale and causes 401s in all API calls. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-15 21:01:21 +00:00
autonomic-bot	1be74fb9e1	fix(lint): F821 undefined 'e' in test_scm_configured; shfmt/ruff auto-fixes - test_scm_configured.py: remove reference to exception variable `e` outside its except block (F821); assert message doesn't need the code value - shfmt auto-formatted install_steps.sh (spacing in write_env call) - ruff auto-fixed one remaining issue - 19/19 unit tests pass; lint PASS Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-11 22:17:19 +00:00
autonomic-bot	0aa46dbe72	fix(drone-dep): ADV-drone-02 — teardown fallback when SSO enrichment fails after deploy When _enrich_deps_with_sso raises after deploy_deps succeeds (e.g., gitea API call fails), deps_state stays {} and the finally block's `if deps_state:` guard skips teardown, orphaning the dep at its deterministic domain. Fix: add an `else` branch after the `if deps_state:` block that reads $CCCI_DEPS_FILE (the legacy-list written by deploy_deps) and calls teardown_deps on the cold entries so no dep is left running. Unit tests: test_load_run_state_provides_fallback_for_enrichment_failure and test_fallback_skips_warm_entries verify the data-flow that the fallback relies on. 19/19 unit tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-11 22:03:29 +00:00
autonomic-bot	51c3280163	feat(drone): enroll drone + gitea SCM dep (M1 implementation) - tests/gitea/recipe_meta.py: gitea as install-time dep provider; sqlite3 overlay EXTRA_ENV, health path /api/healthz, relaxed access for CI use - tests/drone/recipe_meta.py: DEPS=["gitea"]; health /healthz; 600s timeout - tests/drone/install_steps.sh: wires GITEA_CLIENT_ID + GITEA_DOMAIN + client_secret Docker secret + DRONE_USER_CREATE before single drone deploy - tests/drone/functional/test_scm_configured.py: Playwright-free SCM test — follows /login redirect, asserts final URL is gitea dep's OAuth2 authorize endpoint with matching client_id (per Adversary pre-probe REVIEW-drone.md) - tests/drone/PARITY.md: backup structural-skip justified (no backupbot labels) - runner/harness/sso.py: setup_gitea_oauth() — creates gitea admin user via CLI + OAuth2 app via API, returns {admin_user, admin_password, client_id, client_secret} for install_steps.sh consumption - runner/run_recipe_ci.py: _enrich_deps_with_sso now handles gitea dep (calls setup_gitea_oauth; keycloak path unchanged) - tests/unit/test_gitea_dep.py: unit tests for gitea dep path — meta loading, SSO routing, SCM redirect assertion logic (parametrized) - machine-docs: STATUS/JOURNAL/BACKLOG-drone.md phase state files initialized Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-11 21:31:43 +00:00
autonomic-bot	e9745c8c74	feat(bsky): EXPECTED_NA['upgrade'] suppresses the upgrade-tier base deploy — single deploy = PR head; bluesky-pds declares it (no deployable base: every published tag pins the republished moving :0.4). upgrade_base() extracted pure + 6 unit tests; meta-key doc regenerated. 253 unit tests + repo lint PASS	2026-06-11 11:51:12 +00:00
autonomic-bot	e219a7891d	feat(lvl5): P1 — 5-rung ladder (L5=abra recipe lint) + de-capped level semantics level.py: RUNGS += lint; statuses {pass,fail,skip,unver}; compute_level = max passed rung with all below pass-or-skip (fail/unver block); cap_reason/capped DELETED. harness/lint.py: lint executor — pristine scratch clone of the per-run tree at the exact tested ref (mirror-origin + untracked-overlay pollution solved by context, no rule filtered), PTY via script -qec, 60s hard budget, lint.txt artifact, table-parse classifier (rc only signals FATA), unver on any non-run (never silent pass). results.py: derive_rungs classifies every N/A source (structural/declared → skip, else unver), lint rung + synthetic lint stage + lint block in results.json, schema 2, cap fields removed. run_recipe_ci.py: lint call before tiers (double-wrapped, verdict-neutral), badge = level only. card/dashboard: 0-5 ramp, cap line → 'level N of {4|5}', unverified rows, badge number+colour only, lint.txt servable, old schema-1 artifacts render untouched. Unit suite rewritten: 245 passed on cc-ci venv.	2026-06-11 07:42:30 +00:00
autonomic-bot	68954be53e	feat(harness): P5 — customization manifest (rcust) One block at run start answering "what does this recipe customize?" across every surface (non-default recipe_meta keys, ops.py pre-ops, install_steps.sh, compose.ccci.yml, lifecycle overlays by source, custom-test counts, active CCCI_SKIP_GENERIC* env overrides — !!-flagged when riding a CI run, P2c), printed to the run log and embedded verbatim in results.json under "customization". Pure presentation — building/printing it never influences a verdict; the manifest honors the HC2 repo-local gate so it never advertises code the run will not execute. Unit tests: synthetic recipe exercising every surface -> complete + deterministic + JSON-clean; HC2 invisibility; env-override flagging; render golden lines; build_results threads the dict verbatim (key always present, None when absent). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 18:57:26 +00:00
autonomic-bot	fd02d9f4b8	feat(harness): P3 — uniform ctx hook convention (rcust) harness.meta.HookCtx (frozen): .domain, .base_url, .meta (RecipeMeta), .deps (provisioned dep creds from $CCCI_DEPS_FILE or None), .op (current lifecycle op or None); built via meta.hook_ctx() at each hook call site. All recipe callables now take ctx: EXTRA_ENV(ctx), UPGRADE_EXTRA_ENV(ctx), READY_PROBE(ctx), BACKUP_VERIFY(ctx), SCREENSHOT(page, ctx), ops.py pre_<op>(ctx). Dict-valued EXTRA_ENV/UPGRADE_EXTRA_ENV unchanged (only the callable signature moved). Call sites converted: deploy_app env shaping, perform_upgrade, wait_ready_probes (gains op=), _perform_op BACKUP_VERIFY, screenshot.capture, _run_pre_hook. Legacy signatures fail FAST with a clear migration message: the registry carries hook_params per hook key, enforced at meta.load() (MetaError names the old vs new signature); ops.py pre-op hooks get the same check at the orchestrator call site (meta.check_hook_signature) — no silent TypeError mid-run. Migrated every in-repo user mechanically (17 ops.py files; cryptpad/lasuite-*/mailu EXTRA_ENV; mumble+lasuite-drive READY_PROBE; ghost/discourse BACKUP_VERIFY) — seeded values, probes and assertions byte-identical (domain -> ctx.domain; keycloak pre_restore's meta arg -> ctx.meta). Unit tests: hook_ctx field contract, ctx.deps from the run deps file, legacy-signature MetaError (READY_PROBE/EXTRA_ENV/SCREENSHOT + pre-op checker), ctx signatures accepted. Docs table regenerated (signature docs in key docs). Verified on cc-ci: cc-ci-run -m pytest tests/unit -q -> 180 passed; scripts/lint.sh -> PASS.	2026-06-10 17:10:26 +00:00
autonomic-bot	8cd72fd78d	feat(harness): P2 — delete legacy customization keys & paths (rcust) a) compose.ccci.yml is FIRST-CLASS: the harness auto-copies tests/<recipe>/compose.ccci.yml into the run's recipe checkout (ABRA_DIR-aware, lifecycle.provide_ccci_overlay) and auto-chaoses the pinned base deploy on its presence (kills the R7 implicit coupling). ghost/discourse install_steps.sh (copy-only boilerplate) deleted; CHAOS_BASE_DEPLOY removed from both metas + the registry. b) install-time deps wiring is the ONLY mode: deps with DEPS provision BEFORE the single deploy; legacy post-deploy provisioning + the setup_custom_tests.sh invocation machinery deleted. lasuite-docs migrated to install_steps.sh OIDC wiring (same env names/values as the old hook — only the timing moved); lasuite-drive's remaining post-deploy MinIO bucket one-shot moved to ops.py pre_install; both setup_custom_tests.sh files deleted; OIDC_AT_INSTALL removed from drive/meet metas + the registry. c) SKIP_GENERIC meta key deleted (zero users). Env form CCCI_SKIP_GENERIC* stays as the documented dev-only escape hatch; when active in a drone CI run the orchestrator prints a loud !! warning (manifest embedding lands in P5). d) conftest cleanup: dead pre-deploy-once fixtures deployed/deployed_app deleted (zero users), app_domain + _short + _wait_healthy dropped (only users were the deleted fixtures); deps_apps+deps_creds consolidated into ONE deps fixture (entries expose .domain etc. as attributes; dict access intact); the 6 lasuite test files renamed deps_creds->deps (fixture name only — assertions and flows byte-identical). requires_deps marker + F2-11 skip-report plumbing unchanged. Registry is now exactly the 14 final keys; docs §4 table regenerated. Stale setup_custom_tests/OIDC_AT_INSTALL prose in docstrings/comments/assert MESSAGES updated (no assert logic or expected value touched). Verified on cc-ci: cc-ci-run -m pytest tests/unit -q -> 175 passed; scripts/lint.sh -> PASS.	2026-06-10 17:01:33 +00:00
autonomic-bot	472a68b32c	feat(harness): P1 — single registry-backed meta loader (rcust) One loader: runner/harness/meta.py::load(recipe) -> RecipeMeta (frozen dataclass, attribute access), backed by the declarative KEYS registry (14 final keys + 3 P2-deprecated). The ONLY exec() of tests/<recipe>/recipe_meta.py. Validation per the locked decision: unknown ALL-CAPS top-level name or type mismatch = MetaError (hard error at load); underscore-prefixed names recipe-private; callables only on hook-typed keys. Migrated all six legacy loaders (spec §4 L1–L6): - run_recipe_ci.py::_load_meta deleted; orchestrator loads once, passes meta down - tests/conftest.py::_recipe_meta deleted; meta fixture returns full RecipeMeta (R3) - lifecycle.py::_recipe_extra_env/_recipe_meta_flag deleted; deploy_app takes meta - deps.py::declared_deps deleted; callers read meta.DEPS - canonical.py::is_enrolled reads through meta.load() - screenshot.py now actually receives SCREENSHOT through the orchestrator path (R2 fix; proven by unit test through the real load path) Mumble private constants underscore-prefixed (_WELCOME_TEXT_MARKER/_MAX_USERS) + importers fixed. New tests/unit/test_meta.py (all-recipes-load-clean typo gate, MetaError cases, spec §2 baseline defaults, underscore exemption, doc sync). Docs §4 key table now GENERATED from the registry (scripts/gen-meta-docs.py); drift fails CI. Verified on cc-ci: cc-ci-run -m pytest tests/unit -q -> 175 passed; scripts/lint.sh -> PASS.	2026-06-10 16:46:58 +00:00
autonomic-bot	b6e12ef428	fix(harness): run-keyed run-scoped state files — CONC-A1 (same-domain runs corrupted shared deploy-count) The four CCCI state files (deploys countfile, opstate, deps, depskip) were keyed by app domain in shared /tmp. A second run of the same domain executes its main() preamble + deploy_app's pre-lock _record_deploy BEFORE blocking at the app lock, so it reset/polluted the live first run's counter (false DG4.1 deploy-count=2, build 279) and the first run's end-of-run os.remove crashed the second (FileNotFoundError, build 281). Masked pre-restructure by the end-to-end recipe flock. Now keyed by run id + harness pid via _run_state_path(); children receive exact paths via the CCCI_*_FILE env vars, so domain keying was never load-bearing. tests/concurrency/test_run_state.py: path-invariant cases + a real-process regression (helpers.py deploy-count-run) reproducing the live interleaving — verified to FAIL under simulated shared keying. docs/concurrency.md §3 updated.	2026-06-10 08:16:09 +00:00
autonomic-bot	17ebdf39ac	feat(harness): P3 per-run ABRA_DIR — structural recipe-tree isolation, recipe flock deleted - run_recipe_ci.setup_run_abra_dir(): builds <runs_dir>/<run-id>/abra with servers/ and catalogue/ symlinked to the canonical ~/.abra (app .env files keep landing in the shared canonical path, so janitor discovery and env-based teardown are unchanged; per-domain filenames + the P2 app-domain lock prevent write conflicts) and a FRESH empty recipes/ — each run clones + checkouts its own recipe trees. Exported as $ABRA_DIR (honored by the abra CLI, verified on-host) before ANY abra call. Manual runs get manual-<pid> isolation. - fetch_recipe(): plain clone into $ABRA_DIR/recipes/<recipe> — no shared-tree rm-rf, no lock. CCCI_SKIP_FETCH=1 now copies the canonically-staged clone into the per-run tree (same staging workflow, run reads staged state). - abra.abra_dir()/recipe_dir(): single resolution rule ($ABRA_DIR else ~/.abra), used by recipe_checkout, has_lightweight_version_tags, recipe_head_commit, recipe_versions, generic._recipe_dir, lifecycle.prepull_images, snapshot_recipe_tests, and warm_reconcile._recipe_dir (which keeps the canonical default for its own systemd runs but follows the per-run tree when imported by promote_canonical inside a run). - deleted: lifecycle.acquire_recipe_lock, RECIPE_LOCK_DIR, the main() call site and the must-lock-before-fetch ordering rule. - tests/{ghost,discourse}/install_steps.sh: RECIPE_DIR resolves ${ABRA_DIR:-$HOME/.abra} so the compose.ccci.yml overlay lands in the tree the run actually deploys from (mechanical path fix required by per-run trees; no assertion/gate touched — see DECISIONS.md). - .drone.yml comments updated (HOME=/root rationale now via the servers symlink).	2026-06-10 04:18:33 +00:00
autonomic-bot	b492f995bd	feat(harness): P1 lock-lifetime hardening — PDEATHSIG + SIGTERM/SIGALRM teardown funnel + 60-min hard deadline - new harness/lifetime.py: install_lifetime_guards() arms PR_SET_PDEATHSIG(SIGTERM) (with post-prctl ppid==1 orphan refusal), a SIGTERM handler raising SystemExit through the run's finally: teardown funnel (exit 143), and signal.alarm(3600) funnelling SIGALRM the same way with a distinct deadline log line (exit 142). Re-entrant signals during teardown are logged and ignored (begin_teardown guard) so a second signal can't abort the running cleanup. - run_recipe_ci.main(): guards installed first thing, before any abra call/lock; both teardown finally: blocks (cold + quick) mark begin_teardown(). - .drone.yml recipe-ci step: harness runs under setsid in its own process group; a trap forwards the step shell's TERM/EXIT to the whole group so drone cancel reaches the harness instead of leaking it (docs/concurrency.md §8.1). - PEP 446 note on the recipe-lock open(): the fd is non-inheritable, children never carry it.	2026-06-10 04:04:28 +00:00
autonomic-bot	c0df77d0d9	fix(harness): make concurrent recipe runs safe (per-recipe flock + active-run registry) capacity=2 went live with three stale capacity=1-era assumptions that corrupted concurrent runs (immich 229/230 '/pg_backup.sh: No such file'): - ~/.abra/recipes/<recipe> is ONE shared working tree that fetch_recipe rm-rf's/reclones and the upgrade tier git-checkouts mid-run. Same-recipe runs now serialise on an exclusive flock (/run/lock/cc-ci-recipe-<recipe>.lock), taken in main() BEFORE fetch_recipe and held for the whole run; the kernel releases it on any process death, so there is no stale-lock failure mode. Different recipes still run in parallel. - CCCI_JANITOR_MAX_AGE=0 made a starting build reap ANY in-flight run app. Every run now registers its app domain + pid in /run/cc-ci-active/<domain> before app creation; the janitor checks the owner: alive (pid is a live run_recipe_ci process) -> never reaped; dead -> reaped immediately; unknown (pre-registry or post-reboot) -> age fallback (default 2h). The MAX_AGE=0 env override is gone from .drone.yml. - .drone.yml: concurrency.limit 1 -> 2 to match DRONE_RUNNER_CAPACITY=2; the 'safe because capacity=1' comments now describe the flock+registry model. lint: PASS, unit tests: 138 passed.	2026-06-09 21:56:25 +00:00
autonomic-bot	c51cd84159	feat(harness): intentional skips + custom-html-tiny functional test; 4-rung ladder (#6) Declare intentional skips + custom-html-tiny functional test; 4-rung level ladder - recipe_meta.EXPECTED_NA = {rung: reason} lists intentionally-skipped rungs; any essential rung skipped and not listed is unintentional. Skips still cap the level (never inflate). results.json: skips:{intentional,unintentional} + level_cap_rung. - Level ladder = the four essential rungs (install, upgrade, backup/restore, functional; top = L4). integration & recipe-local are optional, not leveled (SSO still enforced for the run verdict, unchanged). - Card shows skipped rungs as INTENTIONAL SKIP (green, reason below) / UNINTENTIONAL SKIP (amber); level badge gains an expected/gap? third segment. - custom-html-tiny: functional serve test (exact-byte round-trip + 404); declares backup_restore intentionally skipped (stateless static server). Independently verified by the adversary: 138 unit tests pass cold; live full-stage run on custom-html-tiny green (upgrade tier ran; level 2; correct skips/badge); clean teardown.	2026-06-09 03:12:11 +00:00
autonomic-bot	799cceb54a	fix(3 U5.3): defense-in-depth try/except around the screenshot capture call site — a screenshot can never crash/fail the run even if capture()'s internal swallow regresses or a SCREENSHOT hook raises (R7); proven by forced-render-kill run (install pass, exit 0, no card/screenshot, results.json intact)	2026-05-31 10:13:30 +00:00
autonomic-bot	afe5e51057	feat(3 U2-wiring): render summary card PNG + level badge SVG into run artifact dir (best-effort, R7; not yet served)	2026-05-31 07:03:10 +00:00
autonomic-bot	5fa15d4949	feat(3 U1): wire app screenshot capture into run_recipe_ci (best-effort, post-healthy, secret-safe; sets results.json screenshot)	2026-05-31 06:56:20 +00:00
autonomic-bot	52e5d210d8	feat(3 U0.2+U0.3): per-test results + results.json with computed level harness/results.py: JUnit-XML parsing (stdlib) → per-stage/per-test rows; derive_rungs (documented tier+deps/SSO → rung mapping); build_results assembles results.json {recipe,version,pr,ref,run_id, stages[],level,level_cap_reason,rungs,flags{clean_teardown,no_secret_leak},screenshot,summary_card}; write_results (atomic). run_recipe_ci.py: tiers emit --junitxml + append {tier,source,file,rc,junit} records; main() assembles+writes results.json wrapped so a failure NEVER changes the verdict (R7), incl. a narrow leak-scan of the serialised artifact. 17 new unit tests (test_results.py). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 05:55:58 +00:00
autonomic-bot	4bf9e1d43d	feat(mumble F2-14c): drop cc-ci compose.host-ports.yml fork; deploy 0.2.0 base minimally, add native host-ports on upgrade-to-latest via new UPGRADE_EXTRA_ENV harness hook + COMPOSE_FILE-aware READY_PROBE/install skip	2026-05-31 05:07:55 +00:00
autonomic-bot	68a7c79668	fix(2): ghost F2-14b — harness BACKUP_VERIFY hook + retry; close the backup-capture race Root cause (instrumented, DECISIONS 2026-05-30): a DB recipe dumps its data in a backupbot pre-hook, but if the DB container cycles mid-dump (intermittent on the loaded CI node — full5/6/7 RED, full8 green; NOT OOM/NOT healthcheck) the dump is truncated/absent and restic snapshots an empty path — abra app backup 'succeeds' yet a later restore silently loses the data (ghost ci_marker). Fix (additive, recipe-scoped via meta like READY_PROBE): recipe_meta may define BACKUP_VERIFY(domain) -> bool, a READ-ONLY post-backup integrity probe. When it returns False the harness re-runs the whole backup (fresh snapshot, re-stabilised db) up to 3x. Recipes without the hook are unaffected. ghost's BACKUP_VERIFY confirms /var/lib/mysql/backup.sql.gz is a valid non-empty gzip. Weakens no assertion — it only retries a flaky CAPTURE so P4 restore is RELIABLY exercised, not luck-dependent. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:30:25 +00:00
autonomic-bot	aebe93c299	fix(2): _load_meta whitelist UPGRADE_BASE_VERSION (override was silently dropped → base fell back to [-2]) The override added in `a750937` had no effect: _load_meta only copies a fixed key whitelist into the meta dict, and UPGRADE_BASE_VERSION wasn't in it, so meta.get(...) returned None and the upgrade base fell back to previous_version() = recipe_versions[-2] (0.6.3+3.1.2). Add it to the whitelist so discourse's honest 0.7.0 base is selected. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 14:30:39 +01:00
autonomic-bot	a750937fb0	feat(2): discourse Q4.6 honest upgrade crossover — UPGRADE_BASE_VERSION override (base-on-[-1]) + uniform bitnamilegacy image overlay Implements the real 0.7.0+3.3.1 -> 0.8.0+3.3.1 upgrade crossover instead of a §7.1 skip-with-sign-off (Adversary leans DENY on the deferral; agreed): - recipe_meta UPGRADE_BASE_VERSION=0.7.0+3.3.1 + generic support in run_recipe_ci (prev = meta override or previous_version). Harness default [-2]=0.6.3+3.1.2 is a hollow base (img 3.1.2 != head 3.3.1); [-1]=0.7.0+3.3.1 is the PR's true predecessor and shares head's servable 3.3.1 image. - compose.ccci-health.yml re-pins services.{app,sidekiq}.image to bitnamilegacy/discourse:3.3.1 so the 0.7.0 base (compose pins 404 bitnami:3.3.1) is servable; idempotent on the head (PR already bitnamilegacy). Consumes Adversary BUILDER-INBOX (deleted), leaves ADVERSARY-INBOX ack; STATUS-2 discourse section updated. Full lifecycle run launching next. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 14:20:06 +01:00
autonomic-bot	e1147b5fe3	fix(2): F2-12 lasuite-drive upgrade tier — own convergence wait (abra -c) + collabora READY_PROBE Adversary cold-verify FAILed Q3.2 (F2-12): the prev→PR-head chaos upgrade's abra converge monitor FATAs while the NEW collabora 25.04.9.4.1's healthcheck is still in start_period (jail/config init), even though it converges given swarm's healthcheck retries. My WOPI pre-gate fixed the OLD collabora being killed mid-boot but not the NEW collabora's convergence. Flaky (3x green for me, 1x fail cold). Fix (cc-ci-side, stronger verification — not weaker): - abra.deploy gains no_converge_checks (`-c`); chaos_redeploy passes it for the upgrade op so abra's impatient monitor no longer FATAs (the stack spec is applied regardless). - perform_upgrade now OWNS the convergence verification after the redeploy: wait_healthy (services N/N + app HEALTH_PATH) + new lifecycle.wait_ready_probes (recipe READY_PROBE), bounded by the recipe DEPLOY_TIMEOUT (generous) not abra's impatient window. meta threaded _perform_op→perform_upgrade. - recipe_meta READY_PROBE hook (added to _load_meta whitelist): lasuite-drive probes collabora WOPI discovery (/hosting/discovery on collabora-<domain>) → 200. Called after install deploy AND after the upgrade redeploy. No-op for recipes without a READY_PROBE. NOT re-claiming yet — validating the upgrade tier is now reliably green (incl. the slow-collabora crossover) across multiple runs before re-claiming Q3.2. F2-12 stays open (Adversary-owned). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 11:55:53 +01:00
autonomic-bot	4b38b66fa5	fix(2): lasuite-drive Q3.2a — gate upgrade redeploy on collabora-ready + plumb DEPLOY_TIMEOUT Q3.2a run 1: Part A (install-time OIDC) GREEN — deploy-count=1, install/backup/restore/custom + OIDC test all PASS. BUT upgrade tier FAILED: the in-place `abra app deploy --chaos` redeploy landed on a STILL-BOOTING collabora (coolwsd ~2min boot: 1300+ l10n files + RSA keygen) and SIGTERMed it mid-init ("Shutdown requested while starting up", forced exit 70) → abra aborted the deploy. The install wait_healthy returns on container 1/1 while coolwsd is still loading. Fixes (plan §C readiness-gating, no test weakened): - tests/lasuite-drive/ops.py::pre_upgrade — wait for collabora WOPI discovery (/hosting/discovery on collabora-<domain>) → 200 BEFORE the chaos redeploy, so it replaces a ready collabora cleanly. - runner/harness/lifecycle.chaos_redeploy + generic.perform_upgrade + run_recipe_ci._perform_op — plumb the recipe DEPLOY_TIMEOUT to the upgrade chaos redeploy (was abra.deploy's 900s default, while the .env internal TIMEOUT is 1500s → Python could SIGKILL abra mid-wait on the slow collabora/onlyoffice reconverge). Mirrors the install deploy_app timeout plumbing. Also (operator naming change 2026-05-29): renamed `--extra-tests` -> `--extra` in DEFERRED.md + BACKLOG-2.md Build-backlog section. 3 refs remain in BACKLOG-2 Adversary-findings section (241/248/292, closed findings) — left for the Adversary (single-writer); orchestrator updated IDEAS.md/plan-sso-dep-testing.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 10:37:55 +01:00
autonomic-bot	a151489996	feat(2): lasuite-drive Q3.2a Part A — wire OIDC at INSTALL, eliminate flaky redeploy Q3.2a / plan-lasuite-drive-oidc-robustness.md Part A. The old setup_custom_tests.sh did a post-deploy in-place `abra app deploy --force --chaos` of the heavy 12-service stack to apply the OIDC env — flaky (collabora WOPI-discovery race + gunicorn-perms; JOURNAL Step 0). Since the OIDC env only affects backend/app and keycloak is live-warm, provision the per-run realm BEFORE the single deploy and wire OIDC into the .env at install time (no reconverge). - runner/run_recipe_ci.py: new _provision_deps() helper (warm/cold split + SSO enrich + write $CCCI_DEPS_FILE), used by both paths. New per-recipe OIDC_AT_INSTALL meta flag (added to _load_meta whitelist). When set + deps live-warm: provision BEFORE deploy_app; the install tier's install_steps.sh wires OIDC into the single deploy; post-deploy step runs only the MinIO bucket one-shot — no re-provision, no redeploy. Legacy post-deploy path unchanged for all other dep recipes (gated on `not oidc_at_install`). - tests/lasuite-drive/install_steps.sh (NEW): install-time OIDC env + secret wiring; no-ops on empty deps file (recipe still boots, OIDC test skips → F2-11 RED). - tests/lasuite-drive/setup_custom_tests.sh: trimmed to MinIO-bucket-only (OIDC moved out). - tests/lasuite-drive/recipe_meta.py: OIDC_AT_INSTALL = True. - JOURNAL-2: Step-0 root-cause failure logs captured before the fix. NOT a claim — validating 3x green (incl. now-required upgrade tier) before claiming Q3.2. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 10:10:05 +01:00
autonomic-bot	125453df20	claim(2w): WC5 promote-on-green-cold proven — green cold run advances canonical (1.10.0→1.11.0); --quick never promotes; only cold advances should_promote_canonical (enrolled+green+cold+latest) + promote_canonical (re-seed canonical at green-verified latest, snapshot+registry, old known-good replaced only on green). +5 unit (70 pass). Live: custom-html canonical advanced 1.10.0+1.28.0 → 1.11.0+1.29.0 via a full green cold run; snapshot refreshed; idle; per-run app torn down. WC6 nightly sweep next. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 04:08:14 +01:00
autonomic-bot	191ebde466	fix(2w): W2 --quick live-proof fixes (time import + stale-TYPE reset) 3 bugs found by the live PASS+FAIL proof on the custom-html canonical: - import time (run_quick._wait_undeployed used it → the FAIL rollback crashed with NameError before restore ran). - canonical.deploy_canonical now resets .env TYPE=<recipe>:<version> before redeploy, so a stale TYPE left by a prior --quick upgrade (pointing at a since-removed broken PR commit) can't FATAL abra 'unable to resolve <commit>'. - run_quick FAIL rollback resets TYPE to known-good after restore (idle .env agrees with the registry). LIVE PROOF (custom-html canonical), ALL PASS: (A) PASS quick run → undeploy keep-volume, known-good UNCHANGED, marker intact; (B) FAIL quick run (broken image) → 'rolling back' → 'restored known-good data; canonical idle' → exit 1, known-good UNCHANGED, DATA RESTORED. Canonical left clean (idle, 1.11.0+1.29.0). 61 unit pass; cold path untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 03:05:39 +01:00
autonomic-bot	f68e9d463f	feat(2w): W2 --quick mode in run_recipe_ci.py (WC4+WC7) run_quick(): opt-in fast lane (CCCI_QUICK=1 / MODE=quick) — reattach the data-warm canonical (canonical.deploy_canonical, known-good volume) → deps wiring (warm keycloak + per-run realm) → UPGRADE to PR head (chaos, run_lifecycle_tier 'upgrade': reconverge+moved+serving + overlay) → custom tier. PASS → undeploy_keep_volume, known-good UNCHANGED (NEVER promote); FAIL → warmsnap.restore last-known-good + undeploy (roll back, data safe). Always deletes per-run warm realm. mode=quick labelled lower-confidence (WC7); skips install/backup/restore; no deploy-count guard (no deploy_app). main() dispatches to run_quick when a canonical exists, else clean no-canonical fallback to COLD. Cold path byte-identical (deps wiring intentionally mirrored, not refactored). 61 unit pass; cold untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 02:45:44 +01:00
autonomic-bot	1b8d26b504	feat(2w): W0.2 live-warm keycloak dep mode in orchestrator (WC1) - runner/harness/warm.py: stable-domain scheme (warm-<recipe>), is_warm_up probe, live_app_hexes scan, per-run realm_for naming, reap_orphan_realms. - run_recipe_ci.py: split declared deps into live-warm (shared provider + per-run realm, no deploy, realm deleted at teardown) vs cold (co-deploy). Warm path used only when provider is up; cold fallback otherwise. Reap orphan realms at run start (concurrency-safe). deploy-count excludes warm deps. Realm naming now per-run namespaced (<parent>-<6hex>). - dependent tests assert the namespaced realm pattern (stronger than ==parent). Live proof on warm keycloak: realm create -> password-grant JWT -> discovery issuer -> delete(idempotent) -> reap(keeps live hex, deletes orphan): PASS. 43 unit pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-28 23:26:02 +01:00
autonomic-bot	5b34496557	fix(2): F2-11 — SSO-dep deps-not-ready SKIP no longer yields GREEN !testme When a DEPS-declaring recipe's setup_custom_tests fails, its @requires_deps (SSO/OIDC) tests skip; a skip-only pytest file exits 0 so the run previously reported overall=0 (GREEN) while the only SSO test never ran (violates P7). Fix preserves generic-tier failure-isolation but corrects the green SIGNAL: - conftest.pytest_collection_modifyitems counts skipped requires_deps tests and appends to $CCCI_DEPS_SKIP_REPORT. - run_recipe_ci: sums the count, surfaces it in RUN SUMMARY, and new pure predicate sso_dep_unverified(declared, deps_ready, skipped) flips overall=1. - 7 new unit tests (tests/unit/test_f211_sso_skip.py). Verified deploy-free (rate-limit-independent): 35/35 unit PASS; cold real-test proof on lasuite-docs test_oidc_with_keycloak.py -> 1 skipped + skip-report==1 -> orchestrator would set overall=1. Full e2e deferred until Docker Hub rate limit lifts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-28 21:25:27 +01:00
autonomic-bot	41ede13042	feat(2): refactor — SSO-dep plan refinement (deps AFTER generic + setup_custom_tests + failure isolation) Per operator-2026-05-28 SSO-dep plan (plan-sso-dep-testing.md). Substantial orchestrator restructuring: NEW LIFECYCLE ORDER: 1. Recipe deploy ALONE (no deps). 2. install / upgrade / backup / restore — recipe-only generic tiers. 3. setup_custom_tests step (NEW): a. Deploy each declared dep + provision realm/client/test-user via harness.sso. b. Write $CCCI_DEPS_FILE in dict shape {dep_recipe: {domain, realm, client_id, client_secret, admin_user, admin_password, discovery_url, token_url, ...}}. c. Run tests/<recipe>/setup_custom_tests.sh hook (jq-readable; wires OIDC env via abra secret insert + .env edits + in-place 'abra app deploy --force --chaos'). 4. CUSTOM tier with deps-ready flag; @pytest.mark.requires_deps tests skip with 'deps-not-ready: <reason>' when setup_custom_tests fails. NON-deps custom tests still run normally — FAILURE ISOLATION (a DoD item per plan). 5. Teardown: recipe first, deps in reverse declaration order. Harness changes: - runner/run_recipe_ci.py: deps deploy moves from BEFORE recipe deploy to AFTER restore tier. Adds _enrich_deps_with_sso() + _run_setup_custom_tests_hook(). DG4.1 generalised to 'one abra app new per app' (recipe + each dep); in-place redeploys (--force) don't count. - runner/harness/deps.py: write_run_state + load_run_state accept dict OR list shape; deps_as_dict() coerces either to a recipe→entry map. - runner/harness/sso.py: admin_password_inside() public re-export. - tests/conftest.py: deps_creds fixture (full creds dict); deps_apps fixture flattens to recipe→domain string. pytest_collection_modifyitems hook skips @pytest.mark.requires_deps tests when CCCI_DEPS_READY=0. pytest_configure registers the marker. Recipe content: - tests/lasuite-docs/setup_custom_tests.sh: NEW hook reads $CCCI_DEPS_FILE via jq; inserts oidc_rpcs secret at BUMPED version (v1→v2) since abra app new -S generates v1 first and Swarm forbids overwriting; updates SECRET_OIDC_RPCS_VERSION in .env; writes 9 OIDC env vars (REALM/DISCOVERY/AUTH/TOKEN/USERINFO/LOGOUT/JWKS/CLIENT_ID/SCOPES); ensures trailing newline on .env so writes don't concatenate (caught a 'TIMEOUT=900OIDC_REALM=...' bug); triggers in-place 'abra app deploy --force --chaos --no-input'. - tests/lasuite-docs/functional/test_oidc_with_keycloak.py: refactored to consume deps_creds fixture (no longer calls setup_keycloak_realm itself — the orchestrator does it in setup_custom_tests). Marked @pytest.mark.requires_deps. Cold-verifiable on cc-ci (log /root/ccci-refactor-lasuite-r5.log): RECIPE=lasuite-docs STAGES=install,custom cc-ci-run runner/run_recipe_ci.py install: PASS, custom: 3 PASS incl. test_oidc_password_grant_against_dep_keycloak. deploy-count = 2 (expect 2) — DG4.1 generalised holds. Smoke regression: RECIPE=custom-html STAGES=install,custom → 5 PASS, deploy-count=1. Closes DEFERRED.md #5 (lasuite-docs OIDC parity ports via this plan). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 19:11:42 +01:00
autonomic-bot	1bd7c7a1d3	feat(2): Q4.4 ghost + DEPLOY_TIMEOUT plumb-through for heavy recipes Harness change (small, surgical): - runner/harness/lifecycle.deploy_app gains a deploy_timeout param (default 900s); passes through to abra.deploy(timeout=...). For heavy recipes (ghost, matrix-synapse, lasuite-meet), the orchestrator + dep resolver now read recipe_meta.DEPLOY_TIMEOUT and pass it so the Python subprocess wrapping abra deploy doesn't SIGKILL it before the recipe's INTERNAL TIMEOUT (via EXTRA_ENV) finishes swarm convergence. - runner/run_recipe_ci.py + runner/harness/deps.py: thread recipe_meta.DEPLOY_TIMEOUT into the per-recipe deploy_app call. Q4.4 ghost enrollment: - recipe_meta.py: HEALTH_PATH=/, DEPLOY_TIMEOUT=1200 (subprocess), EXTRA_ENV={TIMEOUT: 1200} (recipe internal). Ghost cold-start with theme + DB migration runs ~12-15min on cc-ci. - functional/test_health_check.py: GET / returns 200 (themed site). - functional/test_content_api.py: GET /ghost/api/content/settings/ returns 200 (settings JSON) or 401/403 (Ghost error envelope) — distinguishes ghost-server up + JSON API working from static fallback. - functional/test_admin_redirect.py: GET /ghost/ returns 200 or 302 + Ghost branding; proves admin route is wired through nginx proxy. - PARITY.md: recipe-maintainer corpus has no ghost tests/, Phase-2 health_check is the parity baseline; create-a-post deeper test deferred (DEFERRED.md, --extra-tests linked). Cold-verifiable (log /root/ccci-q44-ghost-r3.log): RECIPE=ghost STAGES=install,custom cc-ci-run runner/run_recipe_ci.py install + 3 functional tests PASS, deploy-count=1. 28/28 unit tests still PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 17:23:40 +01:00
autonomic-bot	c6e94af766	fix(2): F2-5 — dep teardown verify=True, errors propagate to run-fail (Adversary cold) Per REVIEW-2 ## Q2 FAIL: runner/harness/deps.py::teardown_deps suppressed ALL exceptions via contextlib.suppress(Exception), silently swallowing teardown failures. The 'DEPS teardown' print fired even when undeploy actually raised — leaving leftover swarm services/volumes/secrets that broke the NEXT run targeting the same deterministic dep domain (this is what caused the Q3.1 dep flake I saw immediately after the Q2.4 acceptance run). Fix: - runner/harness/deps.py: teardown_deps now uses lifecycle.teardown_app(..., verify=True) so residuals raise TeardownError. Errors are LOGGED LOUDLY per-dep but we continue to other deps so one failure doesn't strand the rest. After all attempts: raise a combined TeardownError if any dep failed. - runner/run_recipe_ci.py: orchestrator catches the dep TeardownError in finally, prints it, captures into dep_teardown_error; the run summary surfaces it and the exit code is non-zero. The run STILL prints the diagnosable summary so a leak doesn't hide other failures. Per §9 teardown sacred / DG7: a green run that leaks state is not 'green'. F2-5 now correctly fails the run instead of silently passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 09:00:37 +01:00
autonomic-bot	4d6b040ba7	feat(2): Q2.3 — dep resolver + SSO-setup harness primitives - runner/harness/deps.py: dep resolver primitive (Phase 2 §4.2 / Q2.3). - declared_deps(recipe) reads DEPS list from tests/<recipe>/recipe_meta.py - dep_domain(parent, pr, ref, dep) — per-run domain per (parent, dep) pair so two recipes' deps of the same kind don't collide on a host - deploy_deps / teardown_deps — sequential deploy + reverse-order teardown - read/write of run-scoped $CCCI_DEPS_FILE - runner/harness/sso.py: SSO-setup / OIDC-flow primitive (Phase 2 §4.2 / Q2.3). - setup_keycloak_realm: idempotent realm + confidential OIDC client + test user with generated 25-char alphanumeric password (class-B per §4.4-B); returns SsoCreds dict with discovery_url, token_url, all identifiers. - oidc_password_grant: exercises the password-grant OIDC flow; returns access_token (a JWT) or raises. - assert_discovery_endpoint: GET /.well-known/openid-configuration; asserts issuer matches the per-run provider domain+realm. - runner/run_recipe_ci.py: wired in dep deploy BEFORE recipe-under-test, dep teardown LAST in finally (reverse order). DG4.1 deploy-count guard now expects 1 + len(deps_state) — accommodates declared deps without breaking the no-extra-deploys invariant. - tests/conftest.py: deps_apps fixture reads $CCCI_DEPS_FILE -> dict mapping dep_recipe -> dep_domain. - tests/unit/test_deps.py: 7 unit tests covering declared_deps parsing, per-(parent,dep) domain distinctness, run-state JSON write/load, env-var no-op semantics. 28/28 unit tests PASS on cc-ci. Smoke test confirmed deploy_count == expected (1) when no deps declared (custom-html install run, log /root/ccci-q2-deps-smoke.log). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 07:41:56 +01:00
autonomic-bot	74725610ab	fix(1e): HC1 upgrade/restore tier calls now pass head_ref (multi-line edit miss) Earlier perl substitution missed the multi-line upgrade and restore run_lifecycle_tier calls (still passed `target` = VERSION env, None for !testme runs), so perform_upgrade got head_ref=None for upgrade tier → re-checkout skipped → chaos redeploy of leftover prev checkout (vacuous prev→prev that 'passed' via the chaos-label move fallback). Verified e2e on hedgedoc (install,upgrade; commit pending push): upgrade→PR-head: head_ref=09bf4d54 chaos-version=09bf4d54 version=3.0.9+1.10.7→3.0.10+1.10.8 deploy-count=1, install/upgrade=pass, clean teardown. The chaos-version label deterministically matches head_ref — direct proof PR-head code was deployed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 04:04:13 +01:00
autonomic-bot	6eabfdc0fb	fix(1e): F1e-1 exec_in_app race + HC1 head_ref/move hardening F1e-1 (Adversary): exec_in_app silently returned '' on a failed docker exec, flipping a healthy recipe RED under opt-out (post-backup container cycle, no readiness buffer). Now polls (re-resolve container + re-exec) until rc==0 or 90s, then RAISES — never masks an exec failure as empty data. No assertion weakened. Verified: opt-out install,backup,restore on custom-html now PASS. HC1: head_ref = ref or recipe_head_commit (prefer explicit PR head sha $REF — robust, no git race; production !testme always sets REF). assert_upgraded, when head_ref known, REQUIRES the deployed chaos-version commit to MATCH head_ref (direct + non-vacuous proof the PR-head code was deployed; a stale prev-checkout chaos redeploy fails). Falls back to version/image/chaos move check otherwise. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 03:41:42 +01:00
autonomic-bot	b7e6cbd7be	feat(1e): HC3 additive generic + op/assertion split (orchestrator owns the op) - orchestrator: per mutating tier, run optional pre-op seed hook (ops.py pre_<op>) → perform the op ONCE (harness-owned) → run generic assertion (unless opted out) AND overlay assertion, both against the shared post-op deployment. Op results passed op→assertion via run-scoped CCCI_OP_STATE_FILE. - opt-out: CCCI_SKIP_GENERIC / CCCI_SKIP_GENERIC_<OP> / recipe_meta.SKIP_GENERIC (declarative). - generic.py: split do_* into op primitives (perform_upgrade/backup/restore) + assertions (assert_upgraded/backup_artifact/restore_healthy) reading op_state(); deployed_identity now returns {version,image,chaos} (chaos label ready for HC1). - generic test_<op>.py + all 6 recipe overlays migrated to assertion-only; pre-op seeding moved to per-recipe ops.py (pre_upgrade/pre_backup/pre_restore). install overlays unchanged (no op). - deploy-count stays 1 (op primitives never call deploy_app). lint PASS; 8 unit tests PASS on cc-ci. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 03:12:04 +01:00
autonomic-bot	ef44d4658b	feat(1d): G0 — generic install + deploy-once orchestrator (DG1 green on hedgedoc) - harness/generic.py: recipe-agnostic assert_serving (converged + real HTTP, 404-excluded + not Traefik 404 body + CA-verified trusted wildcard cert), op helpers, backup_capable detect - harness/discovery.py: per-op overlay resolution (repo-local > cc-ci > generic), custom + hook - tests/_generic/: assertion-only tiers (install/upgrade/backup/restore) on the shared deployment - run_recipe_ci.py: deploy-ONCE orchestrator, per-op summary, deploy-count guard (DG4.1) - conftest live_app fixture; lifecycle deploy-count + install-steps hook + pin DOMAIN to run domain DG1 cold-verified green on hedgedoc (pure generic, deploy-count=1, clean teardown). G0 CLAIMED. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 23:27:55 +01:00
autonomic-bot	2cede01ed7	style(1b): auto-format + lint-clean the whole codebase (RL1) Mechanical, semantics-preserving cleanup so the codebase passes the new lint stage: - ruff format: all 32 Python files (wraps long signatures, normalizes quotes/blank lines). - nixpkgs-fmt: modules/drone-runner.nix. - shfmt (-i 2 -ci): scripts/*.sh. Lint fixes (reviewed, behavior-preserving — no test weakened): - ruff SIM105: try/except-pass -> contextlib.suppress (abra.py app_config rm; lifecycle.py janitor). - ruff SIM115: open().read() -> with open() (run_recipe_ci.py redaction-values + gitea-token). - statix: merge repeated sops `secrets.*` keys into one `secrets = { ... }` (comments kept); empty fn pattern `{ ... }:` -> `_:` (packages.nix). - deadnix: drop unused lambda args (flake `self`; configuration.nix `lib`; overlay `final` -> `_`). Verified on cc-ci: `scripts/lint.sh` -> lint: PASS; nixosConfigurations.cc-ci evaluates; all Python byte-compiles. The deployed bridge/dashboard/runner source changes hash (reformat), so cc-ci will be rebuilt to the new closure in W2 before the cold D1-D10 re-verification. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 20:52:05 +01:00
autonomic-bot	a2f3b14745	fix: upstream tag fetch needs explicit refspec (bare --tags errors 'no remote HEAD') Some checks failed continuous-integration/drone/push Build is passing Details continuous-integration/drone Build is failing Details git fetch --tags <url> without a refspec errors 'couldn't find remote ref HEAD'; use 'refs/tags/*:refs/tags/*'. Verified: brings custom-html's 18 upstream version tags into the mirror PR clone so the upgrade stage finds a previous published version (was skipping). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 08:28:22 +01:00
autonomic-bot	c277029f84	M10/D10: enable real-!testme path — fetch upstream tags + enroll 6 recipes in POLL_REPOS All checks were successful continuous-integration/drone/push Build is passing Details continuous-integration/drone Build is passing Details fetch_recipe (SRC+REF/PR path) now read-only fetches published version tags from the public upstream into the mirror clone, so the upgrade stage finds a previous published version (mirror PR branches carry no tags → upgrade would skip). Guardrail-safe: only fetches tags, never pushes to the recipe repo; plain git so the bot token isn't sent to upstream. Adds the 6 D10 recipes to the bridge POLL_REPOS so !testme on their PRs triggers runs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 08:21:43 +01:00
autonomic-bot	fc07d15800	M7/D6: secrets rotation doc + log redaction filter All checks were successful continuous-integration/drone/push Build is passing Details docs/secrets.md documents the 3 secret classes (A1 external, A2 internal-generated, B recipe-app), the sops-nix decryption chain, and rotation procedures for each (cert version bump, sops re-encrypt + swarm-secret version bump, recipe-app ephemeral). run_recipe_ci streams each stage's output through a redaction filter that masks any /run/secrets/* value (>=8 chars) before it reaches Drone logs — belt-and-suspenders over 'harness never prints secrets + abra doesn't echo'. Live streaming + exit code preserved (locally tested). Recipe-ci clones cc-ci fresh per build, so this applies next run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 07:44:53 +01:00
autonomic-bot	7aa0346902	harness: backup/restore pass -C -o; catalogue fetch re-clones clean Some checks failed continuous-integration/drone/push Build is passing Details continuous-integration/drone Build is failing Details Two fixes surfaced by the first real recipe-ci run through Drone: - abra app backup/restore now pass -C -o (current checkout, no remote fetch) like every other recipe-touching call — without -o they fetch recipe tags from the (private) remote and fail 'authentication required: Unauthorized'. - fetch_recipe's catalogue path rm's the recipe dir first so a leftover private-mirror remote from a prior SRC+REF run can't poison version resolution / backup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 03:05:03 +01:00
autonomic-bot	9b33fdf6e6	M6: D4 recipe-local discovery + recipe #2 (keycloak, DB-backed) enrolled; M6 CLAIMED All checks were successful continuous-integration/drone/push Build is passing Details D4 snapshots recipe-shipped tests/ and runs them against the live app. abra -C -o everywhere + token clone for private mirror PRs. keycloak install green with no harness surgery (D5). docs/enroll-recipe.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 01:48:06 +01:00


53 Commits