cc-ci/machine-docs/JOURNAL-settings.md
2026-06-17 16:58:59 +00:00


JOURNAL — phase settings (WHY / reasoning; Adversary does not read before verdict)

2026-06-17 — bootstrap + M1 design

Phase: server-level settings.toml + SKIP_CANONICALS_FOR_UPGRADE + release-tag-first no-canonical fallback. Plan: /srv/cc-ci/cc-ci-plan/plan-phase-settings-ci-server-config.md.

Why a new harness/settings.py (not extending an env-var module)

Checked for an existing cc-ci config mechanism first (plan §2.A "extend rather than spawn a parallel one"). The server config today is scattered ad-hoc env reads (os.environ.get for MAX_TESTS, CCCI_RUNS_DIR, CCCI_REPO, STAGES, CCCI_QUICK, …) — there is no central config module/class to extend (grep for tomllib|settings\.toml|class Settings → none). So a small dedicated loader IS the minimal, extensible home rather than threading another env var. Stdlib tomllib (py3.12 on the server, confirmed). One [upgrade] table, one key now; _SCHEMA is the single source of defaults+validation so adding a key/table later is a one-line change.

Settings file path: /etc/cc-ci/settings.toml (override $CCCI_SETTINGS)

The harness runs from /etc/cc-ci in BOTH execution contexts (nightly sweep sets CCCI_REPO=/etc/cc-ci and cds there; the Drone recipe-CI runner runs from its checkout but an absolute host path is read identically by both). /etc/cc-ci is a git checkout kept current by git pull + nixos-rebuild on deploy — an untracked settings.toml there survives pulls (git pull never deletes untracked files) and sits next to the tracked settings.toml.example. Chose this over /srv/cc-ci/settings.toml (the plan's suggestion) because /srv/cc-ci is the orchestrator path, ambiguous on the server; /etc/cc-ci is unambiguous and discoverable. The loader is graceful if the file/dir is absent → defaults.
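The path rule above can be sketched as follows. The default path and the `CCCI_SETTINGS` variable are from this entry; the helper name `settings_path` and the choice to treat an empty override as unset are assumptions.

```python
import os
from pathlib import Path

DEFAULT_SETTINGS_PATH = Path("/etc/cc-ci/settings.toml")


def settings_path(environ=None):
    """Return the settings file path, honoring a $CCCI_SETTINGS override.

    `environ` is injectable for tests; it defaults to the process environment.
    An empty override falls back to the default (an assumption of this sketch).
    """
    env = os.environ if environ is None else environ
    override = env.get("CCCI_SETTINGS")
    return Path(override) if override else DEFAULT_SETTINGS_PATH
```

Since the default is absolute, the nightly sweep and the Drone runner resolve the same file regardless of their working directory.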

Why the canonical-present path (incl. samever step-back) is byte-for-byte unchanged

Guardrail §4: the default (false) must be a no-op for current behavior. Structure: if rec and rec.version and not flag: → the entire existing prevb/samever block runs verbatim (canonical ≠ head → canonical; canonical == head → step back to an older tag, else skip). Only when no canonical is in play (rec falsy, OR flag true) do we enter the new _no_canonical_base. So with flag false + a canonical, nothing changes; the step-back's "no older predecessor → skip" is preserved (NOT routed to main-tip). That is correct: routing it to main-tip could reintroduce the same-version no-op that samever exists to prevent.

The plan §2.C "unified chain ... (==head)" is satisfied because the step-back already uses the same release-tag helper as step 1; I deliberately did NOT add a main-tip tail to the step-back skip, to keep samever's guarantee intact. This is the one place where a literal reading of §2.C ("==head → ... → main-tip → skip") and the §4 no-op guardrail + samever's intent diverge slightly; I chose the conservative path that preserves both samever and the no-op guardrail. If the Adversary reads §2.C literally and wants the step-back-no-older case to fall to main-tip, that is a one-line change, but I believe it would be a regression (vacuous upgrade), so the choice is recorded here.
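The branching described in this section can be sketched as below. This is an illustration under assumptions, not the harness code: `choose_base` and its tuple return value are hypothetical, versions are assumed to be plain dotted integers, and `newest_older_version` is a simplified stand-in for the real helper.

```python
from types import SimpleNamespace


def _key(version):
    # Assumption: versions are dotted integers such as "1.2.0".
    return tuple(int(part) for part in version.split("."))


def newest_older_version(tags, head_version):
    """Newest tag strictly older than head, or None (None head -> None)."""
    if head_version is None:
        return None
    older = [t for t in tags if _key(t) < _key(head_version)]
    return max(older, key=_key) if older else None


def choose_base(rec, head_version, flag, tags):
    """Return (kind, version) for the upgrade base."""
    if rec and rec.version and not flag:
        # Canonical-present path: unchanged behavior.
        if rec.version != head_version:
            return ("canonical", rec.version)
        older = newest_older_version(tags, head_version)
        # samever step-back: no older tag means skip, never main-tip.
        return ("release-tag", older) if older else ("skip", None)
    # No canonical in play (rec falsy, or flag true).
    older = newest_older_version(tags, head_version)
    return ("release-tag", older) if older else ("main-tip", None)


rec = SimpleNamespace(version="1.1.0")
print(choose_base(rec, "1.1.0", False, ["1.1.0"]))   # samever, nothing older
print(choose_base(None, "1.1.0", False, ["1.1.0"]))  # no canonical, nothing older
```

The two calls differ only in whether a canonical exists, which isolates the design choice: the same "no older tag" condition ends in skip on the canonical path and in main-tip on the no-canonical path.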

Why _no_canonical_base guards on head_version before calling recipe_tags

newest_older_version(tags, None) returns None, but evaluating recipe_tags(recipe) eagerly would shell out to git -C <per-run recipe dir> tag even when head_version is None (e.g. callers/tests that don't pass it). Guarding if head_version else None avoids a needless/erroring git call and preserves the prevb behavior for the no-head_version caller shape (→ main-tip).
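A small sketch of the guard, assuming simplified stand-ins for both helpers. Here `recipe_tags` records its calls instead of shelling out to git, so the effect of the conditional expression is observable; the tag values and the `"main-tip"` return marker are illustrative only.

```python
calls = []


def recipe_tags(recipe):
    # Stand-in for the real helper, which runs `git ... tag` in the recipe dir.
    calls.append(recipe)
    return ["1.0.0", "1.1.0"]


def newest_older_version(tags, head_version):
    # Simplified: dotted-integer versions, newest tag strictly below head.
    if head_version is None:
        return None
    key = lambda v: tuple(int(p) for p in v.split("."))
    older = [t for t in tags if key(t) < key(head_version)]
    return max(older, key=key) if older else None


def no_canonical_base(recipe, head_version):
    # The conditional expression short-circuits: recipe_tags() is never
    # evaluated when head_version is None or empty.
    older = (
        newest_older_version(recipe_tags(recipe), head_version)
        if head_version
        else None
    )
    return older if older else "main-tip"
```

Without the guard, `newest_older_version(recipe_tags(recipe), None)` would still return None, but only after Python had evaluated the argument and made the git call.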

Why wrong-type raises but malformed/absent doesn't

Plan M1: "malformed file handled" (graceful) AND "wrong type errors clearly". Reconciled: absent / unreadable / TOML-syntax-error → WARN + all-defaults (a red file degrades to today's behavior, can't crash CI). A syntactically-valid file with a known key of the wrong type → TypeError (a typo'd value should be loud, not silently mis-parsed). bool-is-int-subclass handled: 1/0 for a bool key is rejected, not coerced.

Pre-existing, OUT OF SCOPE: dashboard lint drift on main

scripts/lint.sh reports dashboard/dashboard.py + tests/unit/test_dashboard.py would be reformatted by the pinned ruff — confirmed present at HEAD f68f1c5 (git show HEAD:... through pinned ruff), NOT in my diff. Not touched by this phase (narrow scope). Recorded in DECISIONS as an observation. My 5 phase files are format-clean + ruff check clean.

Verification (commands + output)

  • nix shell nixpkgs#python311Packages.pytest -c pytest tests/unit/test_upgrade_base.py tests/unit/test_settings.py -q → 32 passed.
  • full unit suite pytest tests/unit/ -q → 315 passed.
  • ruff check runner/ tests/unit/ bridge/ dashboard/ → All checks passed.
  • ruff format --check (pinned) on my 5 files → all formatted.

2026-06-17 — M2 prep (read-only; not advancing past M1 gate)

Server canonical registry (/var/lib/ci-warm/<recipe>/canonical.json, status all idle):

  • WITH canonical (16): cryptpad, custom-html, custom-html-tiny, drone, ghost, gitea, hedgedoc, immich, lasuite-docs, lasuite-drive, lasuite-meet, mailu, matrix-synapse, n8n, plausible, uptime-kuma.
  • warm dir but NO canonical.json (candidates for M2 evidence (a) "recipe without a canonical → newest release tag < head"): keycloak, alerts, traefik.
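A split like the one above can be reproduced with a short scan of the warm root. This is a sketch assuming the layout stated in this entry (`<recipe>/canonical.json` under one root directory); the function name is hypothetical and the `status` field is not inspected.

```python
from pathlib import Path


def split_by_canonical(warm_root):
    """Return (with_canonical, without_canonical) recipe names, each sorted."""
    with_canonical, without_canonical = [], []
    for recipe_dir in sorted(p for p in Path(warm_root).iterdir() if p.is_dir()):
        has_file = (recipe_dir / "canonical.json").is_file()
        (with_canonical if has_file else without_canonical).append(recipe_dir.name)
    return with_canonical, without_canonical
```

On the server this would be called as `split_by_canonical("/var/lib/ci-warm")`.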

M2 plan (after M1 PASS):

  • (a) pick a no-canonical recipe WITH published release tags (keycloak has many) → show resolve_upgrade_base returns a release-tag base, not raw main-tip. Likely via a harness dry-run / targeted invocation on the server reading the live settings (absent file → default false).
  • (b) drop a scratch /etc/cc-ci/settings.toml with skip_canonicals_for_upgrade = true, show a canonical-bearing recipe (e.g. gitea/ghost) now resolves to the release-tag base (canonical bypassed), then remove the scratch file → restore default false.
  • Deploy: ensure /etc/cc-ci is at the phase commit (git pull); settings.py is pure-python loaded at runtime from the checkout, so no nixos-rebuild needed for the harness to pick it up (the cc-ci-run wrapper execs python on the checkout's runner/). Confirm on server.