Commit Graph

59 Commits

Author SHA1 Message Date
cd19c1b172 feat(settings): server settings.toml loader + SKIP_CANONICALS_FOR_UPGRADE + release-tag-first no-canonical fallback
Some checks failed
continuous-integration/drone/push Build is failing
- harness/settings.py: stdlib tomllib loader, [upgrade].skip_canonicals_for_upgrade
  (bool, default false), _SCHEMA single-source defaults+validation; graceful on
  absent/malformed (WARN+defaults), warn-and-ignore unknown keys/tables, TypeError on
  wrong type. Path $CCCI_SETTINGS / /etc/cc-ci/settings.toml. + tracked settings.toml.example.
- resolve_upgrade_base: flag true bypasses the canonical lookup -> no-canonical fallback;
  canonical-present path (incl. samever step-back) unchanged when false.
- _no_canonical_base (always-on, §2.C): newest release tag < head (reuse
  warm_reconcile.newest_older_version) -> main-tip -> skip; replaces jump-to-main-tip.
- unit: full resolution matrix + loader tests; 315 unit pass, ruff clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 16:55:22 +00:00
3595e80d08 claim(M1): per-recipe history sourced from local /var/lib/cc-ci-runs artifacts (full history, not Drone 100-build slice)
Some checks failed
continuous-integration/drone/push Build is failing
history_for() now enumerates run dirs' results.json, groups by recipe, sorts
newest-first by finished timestamp (mixed numeric+named ids — timestamp is the
only correct key), caps at HISTORY_CAP=30, skips malformed/empty/no-recipe dirs.
Overview + badges + /runs + security guards + stdlib-only unchanged.
Local verify: 13/13 unit tests; full-fixture vs 308 real results.json →
bluesky-pds=8 in exact ts order, plausible capped 30 newest, edge dirs skipped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 16:25:39 +00:00
83c183d985 feat(canon): §2.G strip UPGRADE_BASE_VERSION entirely (plausible verified dynamic-base green)
All checks were successful
continuous-integration/drone/push Build is passing
Gate satisfied — live: with the pin removed, plausible's upgrade tier resolves base 3.0.1+v2.0.0 via
the same-version step-back (canonical 3.1.0 == head 3.1.0 → newest-older = 3.0.1, NOT the broken
3.0.0) and passes install+upgrade green (level 5/5). The pin is redundant, so removed everywhere:
- meta.py KEYS entry (RecipeMeta field auto-drops; 15→14 keys).
- run_recipe_ci.resolve_upgrade_base override branch + docstrings.
- tests/unit/test_meta.py (count 15→14, dropped None-assert), test_upgrade_base.py (override test).
- docs/recipe-customization.md (regenerated table + mentions), docs/testing.md.
- tests/plausible/recipe_meta.py (pin removed), tests/bluesky-pds (re-enable note → dynamic base).
294 unit tests pass; lint clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 12:31:53 +00:00
a20890a363 feat(canon): M1.2 release-tag trigger + faithful mirror-sync in the weekly sweep (§2.C/§2.D)
All checks were successful
continuous-integration/drone/push Build is passing
- warm_reconcile.sweep_decision(latest_tag, canon_version): pure new-release-tag trigger
  keyed on version_key (NOT commit) — new tag>canon → run; ==/older → skip no-new-version
  (even with untagged main commits); no tag → skip never-released. Unit-tested.
- scripts/recipe-mirror-sync.sh: faithful mirror sync (adapted from open-recipe-pr.sh
  --reconcile-only) — explicit coopcloud `upstream` remote (robust to inconsistent clone
  remotes), syncs main+TAGS, closes merged-upstream PRs, leaves unrelated PRs, bot-token auth.
- nightly_sweep rewritten: per enrolled recipe → mirror_sync → fetch → sweep_decision →
  run_on_tag (checkout the release tag + CCCI_SKIP_FETCH=1 so head IS the tag → tagged-promote
  gate passes, REF empty → promote allowed). Skips logged; run-twice → skip-all determinism.
- smoke-tested recipe-mirror-sync.sh live on custom-html: faithful no-op main/tags push,
  closed merged-upstream PR #2, left pending PR #5.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 06:45:43 +00:00
27e06289f8 feat(canon): M1.1 tagged-promote gate — canonical only advances to a published release tag
All checks were successful
continuous-integration/drone/push Build is passing
- should_promote_canonical gains a `tagged` requirement (canon §2.A): a green cold
  latest run promotes only when the tested head version is a published release tag;
  an untagged main commit never becomes a canonical.
- warm_reconcile.is_released_version(recipe, version): release-tag membership (exact or
  by version_key). Caller computes `tagged` so the gate stays pure.
- unit tests: untagged -> no promote; is_released_version cases.
- drive-by (pre-existing reds, unrelated to canon, now green): test_warm_reconcile
  traefik assertion was stale vs the phase-pxgate spec (probes /api/version, no
  health_domain); meta.py UPGRADE_BASE_VERSION KEYS help synced to the prevb doc text.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 06:34:09 +00:00
b29bb3f804 feat(samever): step back to older base when last-green canonical == head version
resolve_upgrade_base now reads the head's published version (abra.head_compose_version,
the coop-cloud.<stack>.version label) and, when the last-green warm-canonical version
equals it, steps back to the newest published version strictly older than head instead
of deploying a same-version no-op. warm_reconcile gains version_key + newest_older_version
(single coop-cloud ordering source; sort_versions refactored onto version_key, no behavior
change). Skip only when no older published predecessor exists. Step-back returns kind=version
so it inherits F1d-2 pinned-tag checkout. Extends tests/unit/test_upgrade_base.py (13 pass).
2026-06-17 04:24:14 +00:00
e1b32ea650 fix(prevb): prune orphan services on upgrade redeploy (head's dropped services); re-add EXPECTED_NA-other-rung test; consume Adversary inbox
All checks were successful
continuous-integration/drone/push Build is passing
docker stack deploy doesn't prune services the head compose dropped (discourse PR#4 drops sidekiq),
leaving them orphaned on the base image. perform_upgrade now reconciles the live stack to the head
compose service set (lifecycle.prune_orphan_services). Makes the deployed stack faithfully reflect
the head — no test weakened. No-op when service sets match / compose unresolvable.
2026-06-17 00:29:00 +00:00
bb2e3c6b2c feat(prevb): dynamic upgrade base (last-green→main→skip) + per-recipe previous/ overlay; migrate discourse off static base + leaky overlay
All checks were successful
continuous-integration/drone/push Build is passing
- resolve_upgrade_base: BasePlan(kind=version|ref|skip); last-green (warm canonical) primary,
  main-tip fallback, declared skip else. UPGRADE_BASE_VERSION retained as optional override.
- deploy_app: base_ref path (chaos-deploy a main-tip/last-green commit) + apply_previous wiring.
- lifecycle: previous/ surface (has_previous, previous_target_version, previous_status decision,
  provide/remove overlay, compose_file add/remove, recipe_branch_commit, stack_service_names).
- generic.perform_upgrade: strip previous/ overlay + COMPOSE_FILE entry before head redeploy.
- discourse: compose.ccci.yml now environmental-only (order: stop-first); removed bitnamilegacy
  pins + sidekiq + UPGRADE_BASE_VERSION; test_upgrade.py asserts head image == official 3.5.3 + no sidekiq.
- unit tests: resolve_upgrade_base matrix + previous/ apply/skip/stale + COMPOSE_FILE layering.
2026-06-17 00:15:06 +00:00
2d865f06cb fix(gtea): ruff format + check all gtea files and bridge.py
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone Build is passing
Clears cc-ci self-test lint failures:
- ruff format: 9 files reformatted (all gtea test files + test_discovery.py)
- ruff check --fix: bridge.py UP017 (datetime.UTC alias) + 6 gtea check errors
- manifest.py B007: rename unused loop variable path → _path (no auto-fix available)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 21:52:01 +00:00
d832b353e4 fix(gtea): UPGRADE_SECRET_PREP hook — pre-insert lfs_jwt_secret with correct 43-char format
Some checks failed
continuous-integration/drone/push Build is failing
Blocker 4 fix: abra `secret generate --all` uses .env.sample for length hints; the
lfs-plain-gitea PR has SECRET_LFS_JWT_SECRET_VERSION=v1 COMMENTED OUT, so abra produces
a wrong-length secret. gitea requires exactly 43 chars (32 bytes base64 URL-safe); wrong
length → gitea fatals trying to save the JWT secret to the read-only Docker Config
app.ini → health check fails → swarm rolls back.

Fix: new UPGRADE_SECRET_PREP hook (meta.py) called before `abra secret generate --all`
in the upgrade path. abra's `--all` is idempotent (skips existing secrets), so the
correctly pre-inserted secret survives. gitea's recipe_meta.py implements the hook using
`docker secret create` directly to guarantee correct format regardless of .env.sample.

Also consumes machine-docs/BUILDER-INBOX.md (Adversary Blocker 4 digest).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 21:46:28 +00:00
23f1861b7a fix(bridge): ignore pre-start trigger comments
Some checks failed
continuous-integration/drone/push Build is failing
2026-06-13 00:27:22 +00:00
44e02425ab feat(cfold): canonicalize custom test layout
Some checks failed
continuous-integration/drone/push Build is failing
2026-06-12 16:08:18 +00:00
1be74fb9e1 fix(lint): F821 undefined 'e' in test_scm_configured; shfmt/ruff auto-fixes
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone Build is passing
- test_scm_configured.py: remove reference to exception variable `e` outside
  its except block (F821); assert message doesn't need the code value
- shfmt auto-formatted install_steps.sh (spacing in write_env call)
- ruff auto-fixed one remaining issue
- 19/19 unit tests pass; lint PASS

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-11 22:17:19 +00:00
0aa46dbe72 fix(drone-dep): ADV-drone-02 — teardown fallback when SSO enrichment fails after deploy
Some checks failed
continuous-integration/drone/push Build is failing
When _enrich_deps_with_sso raises after deploy_deps succeeds (e.g., gitea API
call fails), deps_state stays {} and the finally block's `if deps_state:` guard
skips teardown, orphaning the dep at its deterministic domain.

Fix: add an `else` branch after the `if deps_state:` block that reads
$CCCI_DEPS_FILE (the legacy-list written by deploy_deps) and calls
teardown_deps on the cold entries so no dep is left running.

Unit tests: test_load_run_state_provides_fallback_for_enrichment_failure and
test_fallback_skips_warm_entries verify the data-flow that the fallback relies on.
19/19 unit tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-11 22:03:29 +00:00
7e7e84df34 fix(drone): ADV-drone-01 — no-follow redirect pattern in SCM test
Some checks failed
continuous-integration/drone/push Build is failing
test_scm_configured.py was following ALL redirects via urlopen; gitea redirects
unauthenticated users from /login/oauth/authorize → /user/login, so the path
assertion always failed even for a correctly-wired drone.

Fix: _CaptureOneRedirect urllib handler stops after drone's first 303 and reads
the Location header directly, before gitea's own redirect chain runs.

- Consume BUILDER-INBOX.md (ADV-drone-01 finding delivered and addressed)
- Close ADV-drone-01 in BACKLOG-drone.md
- Update test_gitea_dep.py terminology: "location_url" not "final_url"
- All 10 unit tests pass

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-11 21:48:36 +00:00
51c3280163 feat(drone): enroll drone + gitea SCM dep (M1 implementation)
Some checks failed
continuous-integration/drone/push Build is failing
- tests/gitea/recipe_meta.py: gitea as install-time dep provider; sqlite3
  overlay EXTRA_ENV, health path /api/healthz, relaxed access for CI use
- tests/drone/recipe_meta.py: DEPS=["gitea"]; health /healthz; 600s timeout
- tests/drone/install_steps.sh: wires GITEA_CLIENT_ID + GITEA_DOMAIN +
  client_secret Docker secret + DRONE_USER_CREATE before single drone deploy
- tests/drone/functional/test_scm_configured.py: Playwright-free SCM test —
  follows /login redirect, asserts final URL is gitea dep's OAuth2 authorize
  endpoint with matching client_id (per Adversary pre-probe REVIEW-drone.md)
- tests/drone/PARITY.md: backup structural-skip justified (no backupbot labels)
- runner/harness/sso.py: setup_gitea_oauth() — creates gitea admin user via
  CLI + OAuth2 app via API, returns {admin_user, admin_password, client_id,
  client_secret} for install_steps.sh consumption
- runner/run_recipe_ci.py: _enrich_deps_with_sso now handles gitea dep (calls
  setup_gitea_oauth; keycloak path unchanged)
- tests/unit/test_gitea_dep.py: unit tests for gitea dep path — meta loading,
  SSO routing, SCM redirect assertion logic (parametrized)
- machine-docs: STATUS/JOURNAL/BACKLOG-drone.md phase state files initialized

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-11 21:31:43 +00:00
e9745c8c74 feat(bsky): EXPECTED_NA['upgrade'] suppresses the upgrade-tier base deploy — single deploy = PR head; bluesky-pds declares it (no deployable base: every published tag pins the republished moving :0.4). upgrade_base() extracted pure + 6 unit tests; meta-key doc regenerated. 253 unit tests + repo lint PASS
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone Build is passing
2026-06-11 11:51:12 +00:00
68c3486216 fix(lvl5): lint executor PR-path — abra lint selects+checks out the repo DEFAULT BRANCH; scratch clone of a detached per-run tree has none (FATA, live 400-402), and a stale default would be silently linted instead of the PR head. Force local main AT the tested ref + repoint origin to the scratch itself (offline tag fetch, no drift). Regression test with detached two-commit source proves exact-ref content is linted. 247 unit tests green; real-abra detached-source smoke pass.
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone Build is passing
2026-06-11 10:56:56 +00:00
1d3b61c6c2 fix(lvl5): lint table parser — abra renders HEAVY box verticals (┃ U+2503); accept both; meta registry EXPECTED_NA/BACKUP_CAPABLE wording → regenerated doc table
Some checks failed
continuous-integration/drone/push Build is failing
Found by real-abra smoke on cc-ci: hedgedoc clean → pass; +lightweight tag →
fail R014. Full suite 246 passed on cc-ci venv.
2026-06-11 07:49:29 +00:00
e219a7891d feat(lvl5): P1 — 5-rung ladder (L5=abra recipe lint) + de-capped level semantics
All checks were successful
continuous-integration/drone/push Build is passing
level.py: RUNGS += lint; statuses {pass,fail,skip,unver}; compute_level = max passed
rung with all below pass-or-skip (fail/unver block); cap_reason/capped DELETED.
harness/lint.py: lint executor — pristine scratch clone of the per-run tree at the
exact tested ref (mirror-origin + untracked-overlay pollution solved by context, no
rule filtered), PTY via script -qec, 60s hard budget, lint.txt artifact, table-parse
classifier (rc only signals FATA), unver on any non-run (never silent pass).
results.py: derive_rungs classifies every N/A source (structural/declared → skip,
else unver), lint rung + synthetic lint stage + lint block in results.json, schema 2,
cap fields removed. run_recipe_ci.py: lint call before tiers (double-wrapped,
verdict-neutral), badge = level only. card/dashboard: 0-5 ramp, cap line → 'level N
of {4|5}', unverified rows, badge number+colour only, lint.txt servable, old schema-1
artifacts render untouched. Unit suite rewritten: 245 passed on cc-ci venv.
2026-06-11 07:42:30 +00:00
3c33129ebd fix(shot): mattermost hook v2 — interstitial appears on ANY first-visit route incl /login (proven byte-identical PNG); click 'View in Browser' best-effort then settle; unit test covers click + no-interstitial fallback; 207 pass, lint PASS
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-11 06:45:43 +00:00
7ad7d1f20d fix(shot): A1 — blank-retry keeps the LARGER frame (retry snapped to temp path, os.replace only if >= first; worse late frame discarded + temp cleaned); regression test [9999,4801]->9999; 207 unit tests pass, lint PASS
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-11 06:24:01 +00:00
80e5713c5c feat(shot): mattermost-lts SCREENSHOT hook → /login (default lands the desktop-or-browser interstitial; watch-list wants the real sign-in form) + public screenshot.settle() for hooks; unit test via real loader; 206 unit tests pass, lint PASS
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-11 06:19:39 +00:00
ce50f641cc feat(shot): harness default capture fix — bounded networkidle settle after domcontentloaded + blank-frame retry (≤60s wait budget, R7 best-effort preserved); 6 unit tests; lint PASS, 205 unit tests pass via cc-ci-run
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-11 01:31:03 +00:00
be2026aafb fix(harness): services_converged — a replica deficit explained entirely by Complete tasks is converged (triggered one-shot, rcust M2 lasuite-drive root cause)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-11 00:26:53 +00:00
858e0f582f fix(harness): redact secret-named meta values in the customization manifest (rcust)
All checks were successful
continuous-integration/drone/push Build is passing
Adversary heads-up (inbox 2026-06-10T19:06Z): meta values are repo-public by construction, but
the manifest lands on the dashboard — a field literally named SECRET_KEY_BASE showing a value
(plausible's committed CI dummy) is needless secret-scan noise. Mask values whose key NAME is
secret-shaped (SECRET|PASSWORD|TOKEN|CREDENTIAL|word-segment KEY), top-level and nested dict
keys; the key name stays visible. Unit test pins redacted vs passthrough (KEYCLOAK_URL).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 19:09:09 +00:00
68954be53e feat(harness): P5 — customization manifest (rcust)
All checks were successful
continuous-integration/drone/push Build is passing
One block at run start answering "what does this recipe customize?" across every surface
(non-default recipe_meta keys, ops.py pre-ops, install_steps.sh, compose.ccci.yml, lifecycle
overlays by source, custom-test counts, active CCCI_SKIP_GENERIC* env overrides — !!-flagged when
riding a CI run, P2c), printed to the run log and embedded verbatim in results.json under
"customization". Pure presentation — building/printing it never influences a verdict; the
manifest honors the HC2 repo-local gate so it never advertises code the run will not execute.

Unit tests: synthetic recipe exercising every surface -> complete + deterministic + JSON-clean;
HC2 invisibility; env-override flagging; render golden lines; build_results threads the dict
verbatim (key always present, None when absent).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 18:57:26 +00:00
29a28e2028 feat(harness): P4 — custom-test ergonomics (rcust)
All checks were successful
continuous-integration/drone/push Build is passing
Placement RULE: discovery.custom_tests covers ONLY functional/ + playwright/ —
the top-level test_*.py glob for recipe dirs is removed (top level is reserved
for lifecycle overlays; zero in-repo users of top-level custom tests, verified
by sweep). Lifecycle-name exclusion inside the subdirs stays as the double-run
safety net. HC2 default-deny unchanged (repo-local custom now pinned via
functional/ in the gate test).

New conftest fixture op_state: parses $CCCI_OP_STATE_FILE (op context: versions,
artifact paths), skipping with a clear reason when unset/absent/unparseable —
overlay tests read op facts from the fixture instead of hand-parsing env (zero
existing hand-parsers found; the fixture is the documented path forward). deps
fixture landed in P2d.

Unit tests: placement-rule discovery tests (top-level custom NOT discovered;
functional/playwright are; misfiled lifecycle names excluded), op_state fixture
contract (reads file / skips without env / skips on missing file), deps fixture
attribute sugar.

Verified on cc-ci: cc-ci-run -m pytest tests/unit -q -> 184 passed; scripts/lint.sh -> PASS.
2026-06-10 17:14:21 +00:00
fd02d9f4b8 feat(harness): P3 — uniform ctx hook convention (rcust)
All checks were successful
continuous-integration/drone/push Build is passing
harness.meta.HookCtx (frozen): .domain, .base_url, .meta (RecipeMeta), .deps
(provisioned dep creds from $CCCI_DEPS_FILE or None), .op (current lifecycle op
or None); built via meta.hook_ctx() at each hook call site.

All recipe callables now take ctx: EXTRA_ENV(ctx), UPGRADE_EXTRA_ENV(ctx),
READY_PROBE(ctx), BACKUP_VERIFY(ctx), SCREENSHOT(page, ctx), ops.py pre_<op>(ctx).
Dict-valued EXTRA_ENV/UPGRADE_EXTRA_ENV unchanged (only the callable signature
moved). Call sites converted: deploy_app env shaping, perform_upgrade,
wait_ready_probes (gains op=), _perform_op BACKUP_VERIFY, screenshot.capture,
_run_pre_hook.

Legacy signatures fail FAST with a clear migration message: the registry carries
hook_params per hook key, enforced at meta.load() (MetaError names the old vs new
signature); ops.py pre-op hooks get the same check at the orchestrator call site
(meta.check_hook_signature) — no silent TypeError mid-run.

Migrated every in-repo user mechanically (17 ops.py files; cryptpad/lasuite-*/
mailu EXTRA_ENV; mumble+lasuite-drive READY_PROBE; ghost/discourse BACKUP_VERIFY)
— seeded values, probes and assertions byte-identical (domain -> ctx.domain;
keycloak pre_restore's meta arg -> ctx.meta).

Unit tests: hook_ctx field contract, ctx.deps from the run deps file, legacy-
signature MetaError (READY_PROBE/EXTRA_ENV/SCREENSHOT + pre-op checker), ctx
signatures accepted. Docs table regenerated (signature docs in key docs).

Verified on cc-ci: cc-ci-run -m pytest tests/unit -q -> 180 passed; scripts/lint.sh -> PASS.
2026-06-10 17:10:26 +00:00
8cd72fd78d feat(harness): P2 — delete legacy customization keys & paths (rcust)
All checks were successful
continuous-integration/drone/push Build is passing
a) compose.ccci.yml is FIRST-CLASS: the harness auto-copies tests/<recipe>/
   compose.ccci.yml into the run's recipe checkout (ABRA_DIR-aware, lifecycle.
   provide_ccci_overlay) and auto-chaoses the pinned base deploy on its presence
   (kills the R7 implicit coupling). ghost/discourse install_steps.sh (copy-only
   boilerplate) deleted; CHAOS_BASE_DEPLOY removed from both metas + the registry.

b) install-time deps wiring is the ONLY mode: deps with DEPS provision BEFORE the
   single deploy; legacy post-deploy provisioning + the setup_custom_tests.sh
   invocation machinery deleted. lasuite-docs migrated to install_steps.sh OIDC
   wiring (same env names/values as the old hook — only the timing moved);
   lasuite-drive's remaining post-deploy MinIO bucket one-shot moved to ops.py
   pre_install; both setup_custom_tests.sh files deleted; OIDC_AT_INSTALL removed
   from drive/meet metas + the registry.

c) SKIP_GENERIC meta key deleted (zero users). Env form CCCI_SKIP_GENERIC* stays
   as the documented dev-only escape hatch; when active in a drone CI run the
   orchestrator prints a loud !! warning (manifest embedding lands in P5).

d) conftest cleanup: dead pre-deploy-once fixtures deployed/deployed_app deleted
   (zero users), app_domain + _short + _wait_healthy dropped (only users were the
   deleted fixtures); deps_apps+deps_creds consolidated into ONE deps fixture
   (entries expose .domain etc. as attributes; dict access intact); the 6 lasuite
   test files renamed deps_creds->deps (fixture name only — assertions and flows
   byte-identical). requires_deps marker + F2-11 skip-report plumbing unchanged.

Registry is now exactly the 14 final keys; docs §4 table regenerated. Stale
setup_custom_tests/OIDC_AT_INSTALL prose in docstrings/comments/assert MESSAGES
updated (no assert logic or expected value touched).

Verified on cc-ci: cc-ci-run -m pytest tests/unit -q -> 175 passed; scripts/lint.sh -> PASS.
2026-06-10 17:01:33 +00:00
472a68b32c feat(harness): P1 — single registry-backed meta loader (rcust)
All checks were successful
continuous-integration/drone/push Build is passing
One loader: runner/harness/meta.py::load(recipe) -> RecipeMeta (frozen dataclass,
attribute access), backed by the declarative KEYS registry (14 final keys + 3
P2-deprecated). The ONLY exec() of tests/<recipe>/recipe_meta.py. Validation per
the locked decision: unknown ALL-CAPS top-level name or type mismatch = MetaError
(hard error at load); underscore-prefixed names recipe-private; callables only on
hook-typed keys.

Migrated all six legacy loaders (spec §4 L1–L6):
- run_recipe_ci.py::_load_meta deleted; orchestrator loads once, passes meta down
- tests/conftest.py::_recipe_meta deleted; meta fixture returns full RecipeMeta (R3)
- lifecycle.py::_recipe_extra_env/_recipe_meta_flag deleted; deploy_app takes meta
- deps.py::declared_deps deleted; callers read meta.DEPS
- canonical.py::is_enrolled reads through meta.load()
- screenshot.py now actually receives SCREENSHOT through the orchestrator path (R2
  fix; proven by unit test through the real load path)

Mumble private constants underscore-prefixed (_WELCOME_TEXT_MARKER/_MAX_USERS) +
importers fixed. New tests/unit/test_meta.py (all-recipes-load-clean typo gate,
MetaError cases, spec §2 baseline defaults, underscore exemption, doc sync). Docs
§4 key table now GENERATED from the registry (scripts/gen-meta-docs.py); drift
fails CI.

Verified on cc-ci: cc-ci-run -m pytest tests/unit -q -> 175 passed; scripts/lint.sh -> PASS.
2026-06-10 16:46:58 +00:00
9a7772563a style: repo-wide lint pass — make the lint gate green again
Push builds have been RED on the lint step since ~build 209 from accumulated
formatting drift. This is the mechanical cleanup: ruff format + ruff --fix
(UP038 isinstance unions, SIM105 contextlib.suppress, UP031 f-strings, SIM115
tempfile context manager), shfmt -i 2 -ci, nixpkgs-fmt/statix/deadnix (merged
attrsets, dropped unused lib args), yamllint, and shell quoting fixes in
tests/lasuite-docs/setup_custom_tests.sh. No behaviour changes intended;
lint: PASS, unit tests: 138 passed.
2026-06-09 21:56:15 +00:00
c51cd84159 feat(harness): intentional skips + custom-html-tiny functional test; 4-rung ladder (#6)
Some checks failed
continuous-integration/drone/push Build is failing
Declare intentional skips + custom-html-tiny functional test; 4-rung level ladder

- recipe_meta.EXPECTED_NA = {rung: reason} lists intentionally-skipped rungs; any
  essential rung skipped and not listed is unintentional. Skips still cap the level
  (never inflate). results.json: skips:{intentional,unintentional} + level_cap_rung.
- Level ladder = the four essential rungs (install, upgrade, backup/restore,
  functional; top = L4). integration & recipe-local are optional, not leveled
  (SSO still enforced for the run verdict, unchanged).
- Card shows skipped rungs as INTENTIONAL SKIP (green, reason below) / UNINTENTIONAL
  SKIP (amber); level badge gains an expected/gap? third segment.
- custom-html-tiny: functional serve test (exact-byte round-trip + 404); declares
  backup_restore intentionally skipped (stateless static server).

Independently verified by the adversary: 138 unit tests pass cold; live full-stage
run on custom-html-tiny green (upgrade tier ran; level 2; correct skips/badge);
clean teardown.
2026-06-09 03:12:11 +00:00
91a69b8971 feat(3 U5.1+U5.2): per-recipe latest-level badge endpoint /badge/<recipe>.svg (R6, level-coloured, status fallback) + complete docs/results-ux.md §3-5 (card/screenshot/PR-comment/badge-embedding, R8); +2 badge unit tests
Some checks failed
continuous-integration/drone/push Build is failing
2026-05-31 10:04:14 +00:00
e1d837ee97 feat(3 U4): YunoHost-style dashboard grid — per-recipe level badge + status + version + app screenshot thumbnail + per-recipe /recipe/<name> history; reads results.json artifacts (R5); 9 dashboard unit tests
Some checks failed
continuous-integration/drone/push Build is failing
2026-05-31 09:52:06 +00:00
9a47aa28e3 feat(3 U3): YunoHost-style PR comment (🌻 + level badge + summary card images, linked) updated in place per PR; text fallback; bridge tests + dashboard do_HEAD 2026-05-31 07:46:00 +00:00
7217e0c98c feat(3 U2-scaffold): summary card + level/status SVG badge renderers (offline; pure)
harness/card.py: render_badge_svg/level_badge_svg (shields-style SVG, colour-by-level, R6) +
render_card_html (recipe+version, level badge, per-stage/per-test ✔/✘ table, embedded screenshot,
invariant flags — REPORTS results.json verbatim, never recomputes; cardinal no-inflation guardrail)
+ render_card_png (best-effort Playwright HTML->PNG, R7). 8 pure unit tests. Orchestrator wiring +
stable-URL serving + live PNG demo come after U0 PASSes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 06:11:47 +00:00
daa7edd3a7 feat(3 U1-scaffold): app screenshot capture module (offline; not yet wired)
harness/screenshot.py: best-effort Playwright capture of the live app (reuses harness browser).
Default = landing page (credential-free, secret-safe R7); recipes needing post-login opt into a
recipe-meta SCREENSHOT hook responsible for avoiding secret pages. Every failure swallowed -> None
(cosmetics never block, R7). Pure helpers unit-tested. Orchestrator wiring + live demo come after U0
PASSes (avoid deploy contention with the Adversary's cold U0 re-runs).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 06:05:39 +00:00
52e5d210d8 feat(3 U0.2+U0.3): per-test results + results.json with computed level
harness/results.py: JUnit-XML parsing (stdlib) → per-stage/per-test rows; derive_rungs (documented
tier+deps/SSO → rung mapping); build_results assembles results.json {recipe,version,pr,ref,run_id,
stages[],level,level_cap_reason,rungs,flags{clean_teardown,no_secret_leak},screenshot,summary_card};
write_results (atomic). run_recipe_ci.py: tiers emit --junitxml + append {tier,source,file,rc,junit}
records; main() assembles+writes results.json wrapped so a failure NEVER changes the verdict (R7),
incl. a narrow leak-scan of the serialised artifact. 17 new unit tests (test_results.py).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 05:55:58 +00:00
9773e3ff63 feat(3 U0.1): pure level() ladder mapper (L0-L6, gap-caps) + unit tests
Phase-3 R1 foundation. harness.level.compute_level(rungs)->(level,cap_reason) with YunoHost
gap-caps semantics: level = highest rung 1..L all clean PASS; first non-PASS (FAIL or N/A) caps,
recorded in cap_reason. N/A caps like fail but distinctly (L5 'no integration surface' example).
Helpers backup_restore_status + tier_to_rung. 16 unit tests incl U0 gate cases (L4-pass, L2-cap).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 05:46:23 +00:00
2bf40d69d6 feat(2): HQ1 image pre-pull (plan-prepull-images.md) — warm local store before deploy
lifecycle.prepull_images(recipe, domain): resolve images via docker compose config --images (COMPOSE_FILE
from the app .env — handles $VERSION interpolation + multi-compose) → docker pull each, skip-if-present
(zero network for cached pinned tags). Called in deploy_app before the (unchanged, real) abra.deploy AND
in generic.perform_upgrade before the chaos redeploy (warms new-version images). A pull failure RAISES a
clear pre-deploy error (not a converge timeout); deploy path unchanged (no docker service update/scale).
Removes PULL time not app-INIT time. 4 unit tests (tests/unit/test_prepull.py): present→skip, missing→
pull, pull-fail→raise, no-images→skip. NOT claimed yet — validating cold-verify criteria next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 16:02:21 +01:00
6506c4ac3a test(2): F2-12 P7-negative unit tests — owned upgrade-convergence wait fails on stuck convergence
Proactively addresses the Adversary's pre-claim recon (f7c5681): since the F2-12 fix replaces abra's
converge monitor (-c) with the harness's own wait, prove the replacement genuinely FAILS a broken
convergence (non-vacuous), not just passes a slow one. 5 deterministic tests (fake clock, no deploy):
- wait_ready_probes RAISES TimeoutError when the READY_PROBE never returns 200 (collabora wedged).
- wait_ready_probes returns when it reaches 200; no-op without a READY_PROBE.
- wait_healthy RAISES when services never converge, and when converged-but-never-serving.
Run: cc-ci-run -m pytest tests/unit/test_f212_upgrade_convergence.py -q → 5 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 12:23:34 +01:00
40b03a9bf1 claim(2w): WC8 + WC9 (FINAL gates) — resource-safety consolidation + stale-warm prune + docs/warm.md + --quick rollback proof
WC8: canonical.prune_stale (drop de-enrolled warm data + volumes) wired into the
nightly sweep + df log; consolidated evidence (DRONE_RUNNER_CAPACITY=MAX_TESTS
serialize; autoPrune drops --volumes so warm vols survive; cold teardown sacred;
warm excluded from D8 — no nix source ref). +1 unit (72 pass). WC9: docs/warm.md
documents the full warm/quick model; --quick rollback proof already proven live
(W2 FAIL restores exact known-good; WC4 PASS byte-identical snapshot). On PASS,
all WC1-WC9 (incl WC1.1/WC1.2) verified → DONE.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 04:43:34 +01:00
465e1059b0 claim(2w): WC6 nightly full-cold sweep — timer+service roll warm/infra (health-gated) then serial cold sweep promoting canonicals (WC5); proven live
canonical.enrolled_recipes; runner/nightly_sweep.py (roll keycloak+traefik →
serial full-cold over enrolled on latest → green promotes; skip if test active;
operate against CCCI_REPO checkout for tests/); nix/modules/nightly-sweep.nix
(timer 03:00 Persistent + oneshot service) wired in. 2 bugs fixed via live
service run (repo-relative enrolled scan; util-linux for backup PTY). Live
SERVICE sweep: enrolled=['custom-html'] → all tiers green → canonical advanced
1.10.0→1.11.0; red-run correctly does NOT promote. 71 unit pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 04:33:08 +01:00
125453df20 claim(2w): WC5 promote-on-green-cold proven — green cold run advances canonical (1.10.0→1.11.0); --quick never promotes; only cold advances
should_promote_canonical (enrolled+green+cold+latest) + promote_canonical
(re-seed canonical at green-verified latest, snapshot+registry, old known-good
replaced only on green). +5 unit (70 pass). Live: custom-html canonical advanced
1.10.0+1.28.0 → 1.11.0+1.29.0 via a full green cold run; snapshot refreshed; idle;
per-run app torn down. WC6 nightly sweep next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 04:08:14 +01:00
e678d2e006 claim(2w): W0.10a traefik WC1.1 migrated onto shared health-gated reconciler — no-op converge proven; destructive rollback = Adversary cold proof
warm_reconcile.py: per-spec setup hook + health_domain; SPECS[traefik]
(stateful=False, version-rollback-only, _traefik_setup preserves wildcard-cert/
file-provider config, health on routed dashboard host). keycloak path unchanged.
proxy.nix: deploy-proxy.service now execs warm_reconcile.py traefik. ZERO-disruption
migration (traefik already at latest 5.1.1+v3.6.15; pre-seeded TYPE+last_good →
clean no-op converge; traefik 200 + keycloak-through-traefik 200 + 0 failed).
65 unit pass. Per operator out: code+converge delivered; destructive rollback
(brief TLS blip) = Adversary's required cold proof. Closes the W0.10a tracked-open.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:50:32 +01:00
9afc7f64b9 feat(2w): W2 WC7 trigger surface — bridge parses !testme --quick
bridge/bridge.py: parse_trigger(body) → (is_trigger, quick); accepts exactly
'!testme' (cold, default) and '!testme --quick' (opt-in fast lane), rejects
'!testmexyz'/'!testme foo'/etc. Threaded through both poll + webhook paths and
process_testme → trigger_build adds the CCCI_QUICK=1 Drone param (auto-exposed
to run_recipe_ci). PR comment labels a quick run lower-confidence. .drone.yml
echoes quick=. +3 unit tests (incl. the !testmexyz negative). 64 unit pass.
WC7: default !testme stays full cold; --quick opt-in, never gates merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:10:56 +01:00
b6ef83ab0b feat(2w): W1 canonical registry module (WC2) + alerts archived
runner/harness/canonical.py: data-warm canonical registry + lifecycle —
is_enrolled (recipe_meta.WARM_CANONICAL), canonical_domain (warm.stable_domain
warm-<recipe>), registry read/write (/var/lib/ci-warm/<recipe>/canonical.json),
has_canonical (record + retained volume), deploy_canonical (reattach volume at
known-good version), undeploy_keep_volume (idle data-warm), seed_canonical
(record + warmsnap snapshot). warm.stable_domain helper added (keycloak path
unchanged). +4 unit tests (61 unit pass).

Also archived the Adversary's verification alert sentinels to alerts/seen/
(simulated rollback + 2 holds — evidentiary, gate PASSED; dir clean for real alerts).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 02:15:11 +01:00
32f00717ac fix(2w): W0.9 WC1.1 hardening (proven live: healthy upgrade + marquee rollback)
Bugs found by the live proof, fixed:
- warmsnap: snapshot now swaps a <recipe>/snapshot/ SUBDIR, not the whole
  <recipe>/ dir — so the reconciler's sibling last_good file survives a
  snapshot swap (was being clobbered).
- warm_reconcile: deploy_version captures abra's stdout (it writes FATA to
  stdout) in the error; add wait_undeployed() after every undeploy so
  snapshot/restore/redeploy don't race a half-removed swarm stack; the upgrade
  deploy is wrapped so a deploy FAILURE (not just unhealthy) also triggers
  rollback. (57 unit pass.)

LIVE PROOF on warm keycloak (annotated fake tags via CCCI_SKIP_FETCH):
(a) healthy upgrade 10.7.1->10.7.9: snapshot+deploy+health-pass, last_good
    committed=10.7.9, marker realm preserved.
(b) MARQUEE rollback: broken latest 10.7.10 (lint-fail) -> rollback to 10.7.9,
    HEALTHY, marker realm INTACT (data preserved through broken-upgrade+restore),
    last_good NOT advanced, rollback alert written (attempted=10.7.10,
    last_good=10.7.9, recovered=True). keycloak recovered to canonical
    10.7.1+26.6.2 healthy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 01:21:05 +01:00
a044abb298 feat(2w): W0.6 unpinned warm reconciler + WC1.2 safety gate + WC1.1 scaffold
runner/warm_reconcile.py (python, packaged into nix store, replaces bash
reconcile): UNPIN keycloak (deploy latest published version TAG; recipe fetched
at runtime -> D8 closure byte-identical). WC1.2 pre-deploy safety gate (runs
FIRST): major recipe/app-version bump OR releaseNotes manual-migration marker
-> hold-on-current + alert sentinel (no deploy churn). WC1.1 health-gated
upgrade-with-rollback: record last-good -> [keycloak: undeploy->warmsnap.snapshot
->deploy latest] -> health-gate -> commit-or-(restore+redeploy-prior+alert).
Alerts = /var/lib/ci-warm/alerts/*.json (Builder loop relays). current version
read from abra TYPE=<recipe>:<version>. CCCI_SKIP_FETCH test hook.
+8 unit tests for the version gate (56 unit pass).

Proven on cc-ci: nixos-rebuild switch -> warm-keycloak.service runs the python
reconciler -> noop-healthy (system 0-failed, /realms/master=200). WC1.2 holds
proven live: MAJOR bump -> held-major (keycloak untouched); minor+manual-
migration notes -> held-manual-migration (alert carries notes); no deploy churn.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 00:42:02 +01:00