psql -tAc still prints INSERT/CREATE command tags (e.g. "INSERT 0 1"), so
_register_site asserted out == site against "INSERT 0 1\nsite" and both
event-tracking roundtrip tests failed on their very first run (build 237 —
the custom tier had never executed before; install always failed earlier).
-q suppresses the tags; verified against the recipe db container.
Push builds have been RED on the lint step since ~build 209 from accumulated
formatting drift. This is the mechanical cleanup: ruff format + ruff --fix
(UP038 isinstance unions, SIM105 contextlib.suppress, UP031 f-strings, SIM115
tempfile context manager), shfmt -i 2 -ci, nixpkgs-fmt/statix/deadnix (merged
attrsets, dropped unused lib args), yamllint, and shell quoting fixes in
tests/lasuite-docs/setup_custom_tests.sh. No behaviour changes intended;
lint: PASS, unit tests: 138 passed.
The harness default base (recipe_versions[-2]) resolves to 3.0.0+v2.0.0 for
the open 3.1.0 upgrade PR. That release predates x86_64 support in the
clickhouse entrypoint (added 3.0.1): on this amd64 host it downloads
clickhouse-backup-linux-x86_64.tar.gz — a deterministic HTTP 404 — and with
set -e + a silenced wget the container exits 1 before logging anything,
crash-looping until the deploy times out. The base therefore can never
converge, regardless of the PR content (the published tag is immutable).
This is exactly the case the harness documents for UPGRADE_BASE_VERSION:
a PR adding its version ABOVE the newest published tag, where the true
predecessor is [-1] (3.0.1+v2.0.0), not [-2]. The upgrade tier then tests
the real operator path 3.0.1 -> 3.1.0.
Pairs with recipe-maintainers/plausible#3 (its !testme can only go green
once this lands).
Declare intentional skips + custom-html-tiny functional test; 4-rung level ladder
- recipe_meta.EXPECTED_NA = {rung: reason} lists intentionally-skipped rungs; any
essential rung skipped and not listed is unintentional. Skips still cap the level
(never inflate). results.json: skips:{intentional,unintentional} + level_cap_rung.
- Level ladder = the four essential rungs (install, upgrade, backup/restore,
functional; top = L4). integration & recipe-local are optional, not leveled
(SSO still enforced for the run verdict, unchanged).
- Card shows skipped rungs as INTENTIONAL SKIP (green, reason below) / UNINTENTIONAL
SKIP (amber); level badge gains an expected/gap? third segment.
- custom-html-tiny: functional serve test (exact-byte round-trip + 404); declares
backup_restore intentionally skipped (stateless static server).
Independently verified by the adversary: 138 unit tests pass cold; live full-stage
run on custom-html-tiny green (upgrade tier ran; level 2; correct skips/badge);
clean teardown.
Having test_backup.py in custom-html-bkp-bad caused both bad-backup and bad-restore
to fail at the backup tier. Create custom-html-rst-bad with its own cc-ci test dir
that has ops.py+test_restore.py but NO test_backup.py, so:
- backup: only generic test_backup_artifact → PASS (snapshot exists)
- restore: pre_restore writes 'mutated', marker stays 'mutated' after restore → FAIL
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
backup-bot-two ignores backupbot.backup.path labels and always backs up
the full volume, making path-based restore-RED infeasible.
New approach: custom-html-bkp-bad has no pre_backup → marker never seeded
→ backup snapshot has no ci-marker.txt. pre_restore writes 'mutated'.
After restore: marker is MISSING or 'mutated' → test_restore_returns_state FAILS.
upgrade=skip (no version tags) is acceptable since passing_tiers_before=[install,backup].
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
No ops.py::pre_backup for custom-html-bkp-bad → ci-marker.txt never seeded.
test_backup_captures_state asserts marker=='original' → MISSING → FAIL → backup=RED.
This works regardless of backupbot label behavior.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
backupbot-two ignores nonexistent backup paths and backs up the whole
volume, making the bad-path approach unreliable. New approach:
- Create recipe-maintainers/custom-html-bkp-bad on Gitea (custom-html
without backupbot.backup=true label) — SHA 4e584063a99a
- Add tests/custom-html-bkp-bad/recipe_meta.py with BACKUP_CAPABLE=True
so the harness runs the backup tier despite auto-detect returning False
- Without a labeled container, backup-bot-two produces no snapshot →
parse_snapshot_id=None → test_backup_artifact fails → backup=RED ✓
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both compose.yml uploads had empty files due to a bash encoding bug.
Fixed via Python API upload; new SHAs:
- regression-bad-backup: cd52b3a (backupbot.backup.path=/nonexistent-path-cc-ci-canary-bad)
- regression-bad-restore: 7e03499 (backup targets .backup-data subdir + command creates it)
Adversary confirmed bad-install ✓ and bad-upgrade ✓ from run artifacts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- tests/regression/test_canaries.py: replace `from .conftest import ...`
(relative import fails when not a package) with sys.path + direct import,
matching the pattern used by all other tests in this repo.
- Delete machine-docs/BUILDER-INBOX.md (Adversary inbox consumed).
- Update STATUS-regression.md + JOURNAL-regression.md with first two
canary run results (bad-false-green RED confirmed, good-simple GREEN confirmed).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three canaries (@pytest.mark.canary) drive the real cold CI lifecycle:
- good-simple: custom-html-tiny @ main (435df8fc) — fast signal, expects GREEN
- good-significant: lasuite-docs @ main (290a8ad7) — multi-service, expects GREEN
- bad-false-green: custom-html @ v5-stale-docroot (71e7326a) — expects RED
Semantic teeth: beyond exit-code, each test asserts that specific named tests
ran in results.json stages (test_serving, test_serving_and_frontend, test_content_type).
If an assertion is removed, the named test disappears → regression test fails.
Includes conftest (run_recipe_ci helper + stage_has_{passing,failing}_test),
README (cadence policy, how to run, how to add), and phase state files.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
harness/screenshot.py: best-effort Playwright capture of the live app (reuses harness browser).
Default = landing page (credential-free, secret-safe R7); recipes needing post-login opt into a
recipe-meta SCREENSHOT hook responsible for avoiding secret pages. Every failure swallowed -> None
(cosmetics never block, R7). Pure helpers unit-tested. Orchestrator wiring + live demo come after U0
PASSes (avoid deploy contention with the Adversary's cold U0 re-runs).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The depends_on:[app] override in 04cc44c does NOT make compose valid: docker normalizes short-form
depends_on to a map and merges additively, so {discourse}+{app}={discourse,app} keeps the invalid
'discourse' key (config --images still rc=15). Reverted to keep the overlay minimal (re-pin + grace
only). Prepull-skip is harmless because bitnamilegacy/discourse:3.3.1 is warm in the node image cache
→ inline pull is a no-op. Timeout headroom (3600s) retained in recipe_meta.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
full4 base deploy timed out at 2400s on the 7-GiB single node. Root causes:
(1) sidekiq.depends_on referenced undefined service 'discourse' (main svc is 'app') → abra config
--images rc=15 → prepull SKIPPED → 2.4GB image pulled inline during deploy, eating convergence
budget. Overlay now overrides sidekiq.depends_on:[app] (swarm ignores depends_on → no-op at
runtime, masks nothing) so prepull resolves+pre-pulls images on both base+head deploys.
(2) bumped DEPLOY_TIMEOUT/TIMEOUT 2400→3600 for headroom on the RAM/CPU-constrained Rails cold boot.
Also pre-cached bitnamilegacy/discourse:3.3.1 by tag on cc-ci (was dangling <none>).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
full9: backup tier FAILed with NameError('__file__' not defined) — recipe_meta.py is exec()'d into a
bare namespace so __file__ is undefined. The harness already has runner/ on sys.path + harness imported,
so import lifecycle directly. (restore PASSED on full9 — the data-integrity fix works; this just fixes
the verify probe crashing the backup tier.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Root cause (instrumented, DECISIONS 2026-05-30): a DB recipe dumps its data in a backupbot pre-hook,
but if the DB container cycles mid-dump (intermittent on the loaded CI node — full5/6/7 RED, full8
green; NOT OOM/NOT healthcheck) the dump is truncated/absent and restic snapshots an empty path —
abra app backup 'succeeds' yet a later restore silently loses the data (ghost ci_marker).
Fix (additive, recipe-scoped via meta like READY_PROBE): recipe_meta may define BACKUP_VERIFY(domain)
-> bool, a READ-ONLY post-backup integrity probe. When it returns False the harness re-runs the whole
backup (fresh snapshot, re-stabilised db) up to 3x. Recipes without the hook are unaffected. ghost's
BACKUP_VERIFY confirms /var/lib/mysql/backup.sql.gz is a valid non-empty gzip. Weakens no assertion —
it only retries a flaky CAPTURE so P4 restore is RELIABLY exercised, not luck-dependent.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
full4 timed out: abra deploy killed at 1200s while the app was at the near-final email_recipients
migration tables (still 0/1). Wall-time = mysql fresh-dir init (~6min, app crash-loops on ECONNREFUSED
until DB ready — no migration progress lost) + ~9-15min schema migration (round-trip-bound, slower
under host load). Not a test weakening — bounded wait (matches discourse), a genuine hang still fails.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per Adversary course-correction (bdef282) + plan-ccci-compose-overlay-policy.md §1: upgrade-to-latest
is MANDATORY. The 0.7.0+3.3.1 from-version pins the Docker-Hub-removed bitnami/discourse:3.3.1 (404)
and ships a too-tight 5m start_period for the 15-25min Rails cold boot. Minimal base overlay
compose.ccci.yml re-pins app+sidekiq to bitnamilegacy/discourse:3.3.1 (namespace-only, identical
image — same re-pin the PR head makes) + widens start_period to 20m (grace-only). install_steps.sh
provides it; CHAOS_BASE_DEPLOY skips the clean-tree gate; UPGRADE_BASE_VERSION=0.7.0+3.3.1 sets the
true predecessor. Neither change weakens a test. Run shape returns to STAGES=install,upgrade,backup,
restore,custom.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Run #2 base deploy: fresh mysql:8.0 init on the loaded cc-ci host (load ~8) took >6min
(InnoDB ~90s + system-tables + root-pw apply, starved by the app crash-loop churn), exceeding
the recipe's 1m db start_period (+6min retry grace) → swarm killed mysql mid-init (exit 137
unhealthy) → corrupt InnoDB redo logs → permanent deadlock (same signature as run #1's stale
vol). Widen db healthcheck start_period to 15m (matches app) so the slow first-boot finishes
before the healthcheck can fail it. Grace-only, masks no defect; bites base+head (published
recipe ships db start_period 1m everywhere) so overlay covers both. Torn down corrupt vol.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Course correction (REVIEW-2 bdef282) mandates upgrade-to-latest; harness base-deploys
prev published version 1.1.1+6-alpine which predates the recipe-PR 15m start_period bump
(ships 1m) → would deadlock on the ~6-9min fresh-DB migration (swarm kill mid-migration →
held migrations_lock). Policy-blessed minimal base overlay: compose.ccci.yml re-applies the
15m app-healthcheck start_period grace to the BASE so the from-version is deployable;
install_steps.sh provides it; CHAOS_BASE_DEPLOY skips clean-tree on the untracked overlay;
persists across head checkout (idempotent — PR head ships 15m). Grace-only, no test weakened.
Prior corrupt mysql vol (stale, interrupted init) torn down. Next: full run incl upgrade.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
abra rejects env-interpolation in healthcheck start_period (FATA 'Does not match
format duration' for both ${VAR} and quoted forms — validates the literal compose
duration before .env substitution). So §9 pt1's env-var route is impossible for
this field; the §9-compliant fix is a LITERAL start_period:20m bump in the
recipe-PR (recipe everyone runs, not a cc-ci overlay; strictly safer). Remove
APP_START_PERIOD from recipe_meta EXTRA_ENV; record the finding in DECISIONS
(ghost E1 must use the same approach); STATUS-2 → new PR head 7a2e0e0.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Migrate discourse off the cc-ci compose overlay per plan §9 / plan-prefer-env-over-compose-overlay.md:
- recipe_meta: drop UPGRADE_BASE_VERSION + COMPOSE_FILE + CHAOS_BASE_DEPLOY; set APP_START_PERIOD=1200s
via EXTRA_ENV (the recipe-PR exposes start_period: ${APP_START_PERIOD:-5m}); declare upgrade tier N/A
(both published prev bases pin removed bitnami images; Adversary §7.1 granted, REVIEW-2 efe3790).
- delete tests/discourse/compose.ccci-health.yml + install_steps.sh (existed only to copy the overlay).
- DECISIONS.md + STATUS-2 record the §9 guardrail + discourse shape (upgrade N/A, env start_period,
pg_backup restore-hook recipe-PR = 5th data-loss recipe cc-ci caught).
recipe-PR head now 8b8df17 (start_period env var added). Not a claim — run STAGES=install,backup,restore,custom next.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Implements the real 0.7.0+3.3.1 -> 0.8.0+3.3.1 upgrade crossover instead of a
§7.1 skip-with-sign-off (Adversary leans DENY on the deferral; agreed):
- recipe_meta UPGRADE_BASE_VERSION=0.7.0+3.3.1 + generic support in
run_recipe_ci (prev = meta override or previous_version). Harness default
[-2]=0.6.3+3.1.2 is a hollow base (img 3.1.2 != head 3.3.1); [-1]=0.7.0+3.3.1
is the PR's true predecessor and shares head's servable 3.3.1 image.
- compose.ccci-health.yml re-pins services.{app,sidekiq}.image to
bitnamilegacy/discourse:3.3.1 so the 0.7.0 base (compose pins 404 bitnami:3.3.1)
is servable; idempotent on the head (PR already bitnamilegacy).
Consumes Adversary BUILDER-INBOX (deleted), leaves ADVERSARY-INBOX ack; STATUS-2
discourse section updated. Full lifecycle run launching next.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
_discourse.py: bootstrap an admin (recipe seeds none) + mint an ApiKey via rails runner in the app
container (class-B run-scoped). test_create_topic.py: POST /posts.json (unique marker) -> GET
/t/<id>.json title+cooked round-trip. test_site_basic.py: GET /site.json asserts discourse categories
config. Meets P3 (>=2 functional beyond health).
Install timed out at 1800s: discourse's 15-25min Rails cold boot overran both the deploy timeout and
the recipe healthcheck start_period:5m (swarm killed the booting app). Add compose.ccci-health.yml
(app healthcheck start_period 1200s) via install_steps.sh + recipe_meta COMPOSE_FILE + CHAOS_BASE_DEPLOY,
bump DEPLOY_TIMEOUT/TIMEOUT to 2400. Image re-pin (bitnamilegacy) already proven working. NO test weakened.
Root cause: Ghost's fresh-DB first boot runs a ~6-9min schema migration (round-trip-bound, not CPU);
the recipe healthcheck start_period:1m (~6min grace) kills the still-migrating task, leaving a stale
migrations_lock → every later task deadlocks (MigrationsAreLockedError). Hit on both 2- and 4-vCPU.
Fix (cc-ci deploy overlay, NOT a recipe/test change): compose.ccci-health.yml raises app healthcheck
start_period to 900s, wired via recipe_meta COMPOSE_FILE + install_steps.sh (+ CHAOS_BASE_DEPLOY for
the untracked overlay). No assertion weakened. Budget 1200s = migration + convergence. Only the
install tier needs it (upgrade redeploys on the populated DB → fast boot).
- ops.py + test_{upgrade,backup,restore}.py: seed ci_marker into the MySQL `ghost` DB (db service)
via the mysql CLI; rides the recipe's mysqldump --tab backup. recipe is MySQL not sqlite (stale
comment fixed). Expect restore RED -> recipe-PR (no backupbot.restore hook; immich/mattermost class).
- functional/_ghost.py: cookie-aware Ghost Admin API client (stdlib http.cookiejar; Origin CSRF hdr).
- functional/test_post_roundtrip.py: §4.3 create published post + read back (unique marker, non-vacuous);
closes the DEFERRED ghost create-post item.
- PARITY.md + recipe_meta.py updated. Authored node-free; full-lifecycle run next, NOT yet claimed.