cc-ci

Author	SHA1	Message	Date
autonomic-bot	fd02d9f4b8	feat(harness): P3 — uniform ctx hook convention (rcust) harness.meta.HookCtx (frozen): .domain, .base_url, .meta (RecipeMeta), .deps (provisioned dep creds from $CCCI_DEPS_FILE or None), .op (current lifecycle op or None); built via meta.hook_ctx() at each hook call site. All recipe callables now take ctx: EXTRA_ENV(ctx), UPGRADE_EXTRA_ENV(ctx), READY_PROBE(ctx), BACKUP_VERIFY(ctx), SCREENSHOT(page, ctx), ops.py pre_<op>(ctx). Dict-valued EXTRA_ENV/UPGRADE_EXTRA_ENV unchanged (only the callable signature moved). Call sites converted: deploy_app env shaping, perform_upgrade, wait_ready_probes (gains op=), _perform_op BACKUP_VERIFY, screenshot.capture, _run_pre_hook. Legacy signatures fail FAST with a clear migration message: the registry carries hook_params per hook key, enforced at meta.load() (MetaError names the old vs new signature); ops.py pre-op hooks get the same check at the orchestrator call site (meta.check_hook_signature) — no silent TypeError mid-run. Migrated every in-repo user mechanically (17 ops.py files; cryptpad/lasuite-*/mailu EXTRA_ENV; mumble+lasuite-drive READY_PROBE; ghost/discourse BACKUP_VERIFY) — seeded values, probes and assertions byte-identical (domain -> ctx.domain; keycloak pre_restore's meta arg -> ctx.meta). Unit tests: hook_ctx field contract, ctx.deps from the run deps file, legacy-signature MetaError (READY_PROBE/EXTRA_ENV/SCREENSHOT + pre-op checker), ctx signatures accepted. Docs table regenerated (signature docs in key docs). Verified on cc-ci: cc-ci-run -m pytest tests/unit -q -> 180 passed; scripts/lint.sh -> PASS.	2026-06-10 17:10:26 +00:00
autonomic-bot	8cd72fd78d	feat(harness): P2 — delete legacy customization keys & paths (rcust) a) compose.ccci.yml is FIRST-CLASS: the harness auto-copies tests/<recipe>/compose.ccci.yml into the run's recipe checkout (ABRA_DIR-aware, lifecycle.provide_ccci_overlay) and auto-chaoses the pinned base deploy on its presence (kills the R7 implicit coupling). ghost/discourse install_steps.sh (copy-only boilerplate) deleted; CHAOS_BASE_DEPLOY removed from both metas + the registry. b) install-time deps wiring is the ONLY mode: deps with DEPS provision BEFORE the single deploy; legacy post-deploy provisioning + the setup_custom_tests.sh invocation machinery deleted. lasuite-docs migrated to install_steps.sh OIDC wiring (same env names/values as the old hook — only the timing moved); lasuite-drive's remaining post-deploy MinIO bucket one-shot moved to ops.py pre_install; both setup_custom_tests.sh files deleted; OIDC_AT_INSTALL removed from drive/meet metas + the registry. c) SKIP_GENERIC meta key deleted (zero users). Env form CCCI_SKIP_GENERIC* stays as the documented dev-only escape hatch; when active in a drone CI run the orchestrator prints a loud !! warning (manifest embedding lands in P5). d) conftest cleanup: dead pre-deploy-once fixtures deployed/deployed_app deleted (zero users), app_domain + _short + _wait_healthy dropped (only users were the deleted fixtures); deps_apps+deps_creds consolidated into ONE deps fixture (entries expose .domain etc. as attributes; dict access intact); the 6 lasuite test files renamed deps_creds->deps (fixture name only — assertions and flows byte-identical). requires_deps marker + F2-11 skip-report plumbing unchanged. Registry is now exactly the 14 final keys; docs §4 table regenerated. Stale setup_custom_tests/OIDC_AT_INSTALL prose in docstrings/comments/assert MESSAGES updated (no assert logic or expected value touched). Verified on cc-ci: cc-ci-run -m pytest tests/unit -q -> 175 passed; scripts/lint.sh -> PASS.	2026-06-10 17:01:33 +00:00
autonomic-bot	17ebdf39ac	feat(harness): P3 per-run ABRA_DIR — structural recipe-tree isolation, recipe flock deleted - run_recipe_ci.setup_run_abra_dir(): builds <runs_dir>/<run-id>/abra with servers/ and catalogue/ symlinked to the canonical ~/.abra (app .env files keep landing in the shared canonical path, so janitor discovery and env-based teardown are unchanged; per-domain filenames + the P2 app-domain lock prevent write conflicts) and a FRESH empty recipes/ — each run clones + checkouts its own recipe trees. Exported as $ABRA_DIR (honored by the abra CLI, verified on-host) before ANY abra call. Manual runs get manual-<pid> isolation. - fetch_recipe(): plain clone into $ABRA_DIR/recipes/<recipe> — no shared-tree rm-rf, no lock. CCCI_SKIP_FETCH=1 now copies the canonically-staged clone into the per-run tree (same staging workflow, run reads staged state). - abra.abra_dir()/recipe_dir(): single resolution rule ($ABRA_DIR else ~/.abra), used by recipe_checkout, has_lightweight_version_tags, recipe_head_commit, recipe_versions, generic._recipe_dir, lifecycle.prepull_images, snapshot_recipe_tests, and warm_reconcile._recipe_dir (which keeps the canonical default for its own systemd runs but follows the per-run tree when imported by promote_canonical inside a run). - deleted: lifecycle.acquire_recipe_lock, RECIPE_LOCK_DIR, the main() call site and the must-lock-before-fetch ordering rule. - tests/{ghost,discourse}/install_steps.sh: RECIPE_DIR resolves ${ABRA_DIR:-$HOME/.abra} so the compose.ccci.yml overlay lands in the tree the run actually deploys from (mechanical path fix required by per-run trees; no assertion/gate touched — see DECISIONS.md). - .drone.yml comments updated (HOME=/root rationale now via the servers symlink).	2026-06-10 04:18:33 +00:00
autonomic-bot	9a7772563a	style: repo-wide lint pass — make the lint gate green again Push builds have been RED on the lint step since ~build 209 from accumulated formatting drift. This is the mechanical cleanup: ruff format + ruff --fix (UP038 isinstance unions, SIM105 contextlib.suppress, UP031 f-strings, SIM115 tempfile context manager), shfmt -i 2 -ci, nixpkgs-fmt/statix/deadnix (merged attrsets, dropped unused lib args), yamllint, and shell quoting fixes in tests/lasuite-docs/setup_custom_tests.sh. No behaviour changes intended; lint: PASS, unit tests: 138 passed.	2026-06-09 21:56:15 +00:00
autonomic-bot	3a612fc733	fix(2): ghost BACKUP_VERIFY — drop __file__ (recipe_meta is exec'd, no __file__); import harness directly full9: backup tier FAILed with NameError('__file__' not defined) — recipe_meta.py is exec()'d into a bare namespace so __file__ is undefined. The harness already has runner/ on sys.path + harness imported, so import lifecycle directly. (restore PASSED on full9 — the data-integrity fix works; this just fixes the verify probe crashing the backup tier.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:49:08 +00:00
autonomic-bot	68a7c79668	fix(2): ghost F2-14b — harness BACKUP_VERIFY hook + retry; close the backup-capture race Root cause (instrumented, DECISIONS 2026-05-30): a DB recipe dumps its data in a backupbot pre-hook, but if the DB container cycles mid-dump (intermittent on the loaded CI node — full5/6/7 RED, full8 green; NOT OOM/NOT healthcheck) the dump is truncated/absent and restic snapshots an empty path — abra app backup 'succeeds' yet a later restore silently loses the data (ghost ci_marker). Fix (additive, recipe-scoped via meta like READY_PROBE): recipe_meta may define BACKUP_VERIFY(domain) -> bool, a READ-ONLY post-backup integrity probe. When it returns False the harness re-runs the whole backup (fresh snapshot, re-stabilised db) up to 3x. Recipes without the hook are unaffected. ghost's BACKUP_VERIFY confirms /var/lib/mysql/backup.sql.gz is a valid non-empty gzip. Weakens no assertion — it only retries a flaky CAPTURE so P4 restore is RELIABLY exercised, not luck-dependent. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:30:25 +00:00
autonomic-bot	4a160f6121	fix(2): ghost F2-14b — bump DEPLOY_TIMEOUT/TIMEOUT 1200→2400s for slow mysql cold-init + migration full4 timed out: abra deploy killed at 1200s while the app was at the near-final email_recipients migration tables (still 0/1). Wall-time = mysql fresh-dir init (~6min, app crash-loops on ECONNREFUSED until DB ready — no migration progress lost) + ~9-15min schema migration (round-trip-bound, slower under host load). Not a test weakening — bounded wait (matches discourse), a genuine hang still fails. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 19:54:20 +00:00
autonomic-bot	3ca45c7308	fix(2): ghost F2-14b — add db start_period grace to base overlay Run #2 base deploy: fresh mysql:8.0 init on the loaded cc-ci host (load ~8) took >6min (InnoDB ~90s + system-tables + root-pw apply, starved by the app crash-loop churn), exceeding the recipe's 1m db start_period (+6min retry grace) → swarm killed mysql mid-init (exit 137 unhealthy) → corrupt InnoDB redo logs → permanent deadlock (same signature as run #1's stale vol). Widen db healthcheck start_period to 15m (matches app) so the slow first-boot finishes before the healthcheck can fail it. Grace-only, masks no defect; bites base+head (published recipe ships db start_period 1m everywhere) so overlay covers both. Torn down corrupt vol. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 17:58:30 +01:00
autonomic-bot	7feeadd0ec	feat(2): ghost F2-14b — upgrade-to-latest base-grace overlay (compose.ccci.yml) Course correction (REVIEW-2 `bdef282`) mandates upgrade-to-latest; harness base-deploys prev published version 1.1.1+6-alpine which predates the recipe-PR 15m start_period bump (ships 1m) → would deadlock on the ~6-9min fresh-DB migration (swarm kill mid-migration → held migrations_lock). Policy-blessed minimal base overlay: compose.ccci.yml re-applies the 15m app-healthcheck start_period grace to the BASE so the from-version is deployable; install_steps.sh provides it; CHAOS_BASE_DEPLOY skips clean-tree on the untracked overlay; persists across head checkout (idempotent — PR head ships 15m). Grace-only, no test weakened. Prior corrupt mysql vol (stale, interrupted init) torn down. Next: full run incl upgrade. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 17:49:05 +01:00
autonomic-bot	0f2cc2d704	feat(2): ghost F2-14b overlay migration — start_period bump moved to recipe-PR (ghost#1 head ae43ffe, literal 15m on app healthcheck); DELETE cc-ci compose.ccci-health.yml + install_steps.sh + COMPOSE_FILE/CHAOS_BASE_DEPLOY. Anti-drift (plan §9): recipe-as-tested == recipe-as-published. env-var start_period impossible (abra pre-subst duration validation, Adversary-reproduced `4b862f6`). Next: run ghost on ae43ffe head.	2026-05-30 17:20:20 +01:00
autonomic-bot	13da216f8d	fix(2): ghost healthcheck start_period overlay — fixes fresh-migration lock deadlock Root cause: Ghost's fresh-DB first boot runs a ~6-9min schema migration (round-trip-bound, not CPU); the recipe healthcheck start_period:1m (~6min grace) kills the still-migrating task, leaving a stale migrations_lock → every later task deadlocks (MigrationsAreLockedError). Hit on both 2- and 4-vCPU. Fix (cc-ci deploy overlay, NOT a recipe/test change): compose.ccci-health.yml raises app healthcheck start_period to 900s, wired via recipe_meta COMPOSE_FILE + install_steps.sh (+ CHAOS_BASE_DEPLOY for the untracked overlay). No assertion weakened. Budget 1200s = migration + convergence. Only the install tier needs it (upgrade redeploys on the populated DB → fast boot).	2026-05-30 05:23:47 +01:00
autonomic-bot	9771b6e16a	fix(2): ghost timeout 2400->900 — VM now 4 dedicated vCPU (operator), migration converges in minutes; short bounded budget fails fast on the migrations_lock deadlock instead of a long blackout	2026-05-30 05:06:22 +01:00
autonomic-bot	bdaeb41496	fix(2): ghost DEPLOY_TIMEOUT/TIMEOUT 1200->2400 — MySQL cold-boot migration + healthcheck-kill+retry needs >20min on slow node (install timed out as it converged)	2026-05-30 04:41:59 +01:00
autonomic-bot	b4d03ccafe	feat(2): ghost P4 data-integrity overlay (MySQL ci_marker) + §4.3 create-post round-trip - ops.py + test_{upgrade,backup,restore}.py: seed ci_marker into the MySQL `ghost` DB (db service) via the mysql CLI; rides the recipe's mysqldump --tab backup. recipe is MySQL not sqlite (stale comment fixed). Expect restore RED -> recipe-PR (no backupbot.restore hook; immich/mattermost class). - functional/_ghost.py: cookie-aware Ghost Admin API client (stdlib http.cookiejar; Origin CSRF hdr). - functional/test_post_roundtrip.py: §4.3 create published post + read back (unique marker, non-vacuous); closes the DEFERRED ghost create-post item. - PARITY.md + recipe_meta.py updated. Authored node-free; full-lifecycle run next, NOT yet claimed.	2026-05-30 04:14:13 +01:00
autonomic-bot	1bd7c7a1d3	feat(2): Q4.4 ghost + DEPLOY_TIMEOUT plumb-through for heavy recipes Harness change (small, surgical): - runner/harness/lifecycle.deploy_app gains a deploy_timeout param (default 900s); passes through to abra.deploy(timeout=...). For heavy recipes (ghost, matrix-synapse, lasuite-meet), the orchestrator + dep resolver now read recipe_meta.DEPLOY_TIMEOUT and pass it so the Python subprocess wrapping abra deploy doesn't SIGKILL it before the recipe's INTERNAL TIMEOUT (via EXTRA_ENV) finishes swarm convergence. - runner/run_recipe_ci.py + runner/harness/deps.py: thread recipe_meta.DEPLOY_TIMEOUT into the per-recipe deploy_app call. Q4.4 ghost enrollment: - recipe_meta.py: HEALTH_PATH=/, DEPLOY_TIMEOUT=1200 (subprocess), EXTRA_ENV={TIMEOUT: 1200} (recipe internal). Ghost cold-start with theme + DB migration runs ~12-15min on cc-ci. - functional/test_health_check.py: GET / returns 200 (themed site). - functional/test_content_api.py: GET /ghost/api/content/settings/ returns 200 (settings JSON) or 401/403 (Ghost error envelope) — distinguishes ghost-server up + JSON API working from static fallback. - functional/test_admin_redirect.py: GET /ghost/ returns 200 or 302 + Ghost branding; proves admin route is wired through nginx proxy. - PARITY.md: recipe-maintainer corpus has no ghost tests/, Phase-2 health_check is the parity baseline; create-a-post deeper test deferred (DEFERRED.md, --extra-tests linked). Cold-verifiable (log /root/ccci-q44-ghost-r3.log): RECIPE=ghost STAGES=install,custom cc-ci-run runner/run_recipe_ci.py install + 3 functional tests PASS, deploy-count=1. 28/28 unit tests still PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 17:23:40 +01:00
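
Commit 68a7c79668 describes a read-only post-backup probe (BACKUP_VERIFY) that makes the harness re-run the whole backup when it returns False, and notes that ghost's probe checks for a valid non-empty gzip. A minimal sketch of that idea follows. The helper names `is_valid_nonempty_gzip` and `backup_with_verify` are hypothetical, the probe works on in-memory bytes rather than a file inside the db container, and "up to 3x" is read here as three attempts in total, which is an assumption.

```python
import gzip
import zlib
from typing import Callable, Optional


def is_valid_nonempty_gzip(data: bytes) -> bool:
    """Return True only if data is a complete gzip stream with a non-empty payload."""
    try:
        return len(gzip.decompress(data)) > 0
    except (OSError, EOFError, zlib.error):
        # Bad magic bytes, truncated stream, or corrupt deflate data.
        return False


def backup_with_verify(
    run_backup: Callable[[], None],
    verify: Optional[Callable[[], bool]] = None,
    max_attempts: int = 3,  # assumption: three attempts in total
) -> int:
    """Run the backup, re-running it while the verify probe reports a bad capture.

    Returns the attempt number that passed. Recipes without a probe
    (verify is None) get exactly one backup and no retry.
    """
    for attempt in range(1, max_attempts + 1):
        run_backup()
        if verify is None or verify():
            return attempt
    raise RuntimeError(f"backup failed verification after {max_attempts} attempts")
```

The probe only reads, so a failing check never mutates the snapshot; it just triggers a fresh capture.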

15 Commits
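
Commit fd02d9f4b8 describes a frozen hook context and a load-time check that rejects legacy hook signatures instead of letting a TypeError surface mid-run. The sketch below illustrates one way such a check could work. The field names come from the commit message; the `HOOK_PARAMS` table, the error wording, and the body of `check_hook_signature` are assumptions for illustration, not the harness's actual code.

```python
from __future__ import annotations

import inspect
from dataclasses import dataclass
from typing import Any, Callable


class MetaError(Exception):
    """Raised when a recipe hook does not match the expected signature."""


@dataclass(frozen=True)
class HookCtx:
    domain: str
    base_url: str
    meta: Any = None           # RecipeMeta in the harness; untyped in this sketch
    deps: dict | None = None   # provisioned dep creds, or None
    op: str | None = None      # current lifecycle op, or None


# Hypothetical registry excerpt: expected parameter names per hook key.
HOOK_PARAMS = {
    "READY_PROBE": ("ctx",),
    "BACKUP_VERIFY": ("ctx",),
    "SCREENSHOT": ("page", "ctx"),
}


def check_hook_signature(key: str, hook: Callable[..., Any]) -> None:
    """Fail fast, naming the old and new signature, if a hook is still legacy."""
    expected = HOOK_PARAMS[key]
    actual = tuple(inspect.signature(hook).parameters)
    if actual != expected:
        raise MetaError(
            f"{key}({', '.join(actual)}) is a legacy signature; "
            f"migrate to {key}({', '.join(expected)})"
        )


def ready_probe(ctx: HookCtx) -> bool:
    # Example hook written against the ctx convention.
    return ctx.base_url.startswith("https://")


check_hook_signature("READY_PROBE", ready_probe)
ctx = HookCtx(domain="app.example.test", base_url="https://app.example.test", op="install")
print(ready_probe(ctx))
```

Comparing parameter names rather than only their count lets the error message say exactly which old parameter (for example `domain`) needs to become `ctx`.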