Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
# JOURNAL — phase `settings` (WHY / reasoning; Adversary does not read before verdict)
## 2026-06-17 — bootstrap + M1 design

**Phase:** server-level `settings.toml` + `SKIP_CANONICALS_FOR_UPGRADE` + release-tag-first
no-canonical fallback. Plan: `/srv/cc-ci/cc-ci-plan/plan-phase-settings-ci-server-config.md`.

### Why a new `harness/settings.py` (not extending an env-var module)

Checked for an existing cc-ci config mechanism first (plan §2.A "extend rather than spawn a parallel
one"). The server config today is **scattered ad-hoc env reads** (`os.environ.get` for `MAX_TESTS`,
`CCCI_RUNS_DIR`, `CCCI_REPO`, `STAGES`, `CCCI_QUICK`, …); there is **no** central config module/class
to extend (`grep` for `tomllib|settings\.toml|class Settings` → none). So a small dedicated loader IS
the minimal, extensible home rather than threading another env var. Stdlib `tomllib` (py3.12 on the
server, confirmed). One `[upgrade]` table, one key now; `_SCHEMA` is the single source of
defaults+validation so adding a key/table later is a one-line change.

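A minimal sketch of what "single source of defaults+validation" could look like. The table and key
names come from this phase; the `(type, default)` tuple layout and the `defaults()` helper are
assumptions for illustration, not the actual contents of the loader.

```python
from typing import Any

# Assumed layout: table -> key -> (expected type, default value).
_SCHEMA: dict[str, dict[str, tuple[type, Any]]] = {
    "upgrade": {
        "skip_canonicals_for_upgrade": (bool, False),
        # A future key would be one more line here.
    },
}


def defaults() -> dict[str, dict[str, Any]]:
    """Build the all-defaults settings mapping straight from the schema."""
    return {
        table: {key: default for key, (_expected, default) in keys.items()}
        for table, keys in _SCHEMA.items()
    }
```

Because the defaults are derived from `_SCHEMA` rather than written out a second time, a new key
cannot end up with a validator but no default (or the reverse).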
### Settings file path: `/etc/cc-ci/settings.toml` (override `$CCCI_SETTINGS`)

The same absolute path is readable in BOTH execution contexts: the nightly sweep sets
`CCCI_REPO=/etc/cc-ci` and `cd`s there; the Drone recipe-CI runner runs from its own checkout, but an
**absolute** host path is read identically by both. `/etc/cc-ci` is a git checkout kept current by
`git pull` + nixos-rebuild on deploy; an **untracked** `settings.toml` there survives pulls (git pull
never deletes untracked files) and sits next to the tracked `settings.toml.example`. Chose this over
`/srv/cc-ci/settings.toml` (the plan's *suggestion*) because `/srv/cc-ci` is the orchestrator path,
ambiguous on the server; `/etc/cc-ci` is unambiguous and discoverable. The loader is graceful if the
file/dir is absent → defaults.

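The path rule above can be sketched as follows. The default path and the `CCCI_SETTINGS` variable are
from this entry; the function name `settings_path` and its `env` parameter are hypothetical, added so
the lookup can be exercised without touching the real environment.

```python
import os
from pathlib import Path
from typing import Mapping

DEFAULT_SETTINGS_PATH = Path("/etc/cc-ci/settings.toml")


def settings_path(env: Mapping[str, str] = os.environ) -> Path:
    """Return $CCCI_SETTINGS if set and non-empty, else the absolute default."""
    override = env.get("CCCI_SETTINGS")
    return Path(override) if override else DEFAULT_SETTINGS_PATH
```

Treating an empty `CCCI_SETTINGS` as "unset" is a choice made in this sketch only; the real loader may
differ.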
### Why the canonical-present path (incl. samever step-back) is byte-for-byte unchanged

Guardrail §4: default false must be a no-op for current behavior. Structure:
`if rec and rec.version and not flag:` → the entire existing prevb/samever block runs verbatim
(canonical ≠ head → canonical; canonical == head → step-back older tag, else skip). Only when there is
**no canonical in play** (rec falsy, OR flag true) do we enter the new `_no_canonical_base`. So with
flag false + a canonical, nothing changes; the step-back's "no older predecessor → skip" is preserved
(NOT routed to main-tip), which is correct: routing it to main-tip could reintroduce the same-version
no-op samever exists to prevent. The plan §2.C "unified chain ... (==head)" is satisfied by the
step-back already taking the same release-tag helper as step 1; I deliberately did NOT add a main-tip
tail to the step-back skip, to keep samever's guarantee intact. This is the one place where a literal
reading of §2.C ("==head → ... → main-tip → skip") and the §4 no-op guardrail + samever's intent point
slightly differently; I chose the conservative path that preserves both samever and the no-op guardrail.
If the Adversary reads §2.C literally and wants the step-back-no-older case to fall to main-tip, that is
a one-line change, but I believe it would be a regression (vacuous upgrade), so it's recorded here.

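The branch structure described above, as a self-contained sketch. Only the condition
`if rec and rec.version and not flag:` and the helper names are taken from this entry; the `Canonical`
class, the `(version, reason)` return shape, the dotted-integer version ordering, and passing the
fallback in as a callable are all assumptions made so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Base = tuple[Optional[str], str]  # (resolved version, reason); assumed shape


@dataclass
class Canonical:
    version: Optional[str]


def _key(version: str) -> tuple[int, ...]:
    # Assumes plain dotted-integer versions such as "3.5.3".
    return tuple(int(part) for part in version.split("."))


def newest_older_version(tags: Optional[Sequence[str]], head: Optional[str]) -> Optional[str]:
    if not tags or not head:
        return None
    older = [tag for tag in tags if _key(tag) < _key(head)]
    return max(older, key=_key) if older else None


def resolve_base(
    rec: Optional[Canonical],
    head: Optional[str],
    tags: Sequence[str],
    flag: bool,
    no_canonical_base: Callable[[Optional[str], Sequence[str]], Base],
) -> Base:
    if rec and rec.version and not flag:
        # Canonical in play: the pre-existing behavior, untouched by the flag.
        if rec.version != head:
            return rec.version, "canonical"
        older = newest_older_version(tags, head)
        if older:
            return older, "samever step-back"
        return None, "skip"  # deliberately NOT routed to main-tip
    # No canonical in play (rec falsy, or flag true).
    return no_canonical_base(head, tags)
```

With `flag=False` and a canonical present, the fallback callable is never invoked, which is the no-op
property the guardrail asks for.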
### Why `_no_canonical_base` guards on `head_version` before calling `recipe_tags`

`newest_older_version(tags, None)` returns None, but evaluating `recipe_tags(recipe)` eagerly would
shell out to `git -C <per-run recipe dir> tag` even when head_version is None (e.g. callers/tests that
don't pass it). Guarding `if head_version else None` avoids a needless/erroring git call and preserves
the prevb behavior for the no-head_version caller shape (→ main-tip).

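A small sketch of the guard. The conditional expression mirrors the `if head_version else None` guard
named above; the signature of `no_canonical_base`, the injected `recipe_tags` callable, the
dotted-integer ordering, and the `(version, reason)` return values are assumptions for illustration.

```python
from typing import Callable, Optional, Sequence


def _key(version: str) -> tuple[int, ...]:
    # Assumes plain dotted-integer versions.
    return tuple(int(part) for part in version.split("."))


def newest_older_version(tags: Optional[Sequence[str]], head: Optional[str]) -> Optional[str]:
    if not tags or not head:
        return None
    older = [tag for tag in tags if _key(tag) < _key(head)]
    return max(older, key=_key) if older else None


def no_canonical_base(
    recipe: str,
    head_version: Optional[str],
    recipe_tags: Callable[[str], Sequence[str]],
) -> tuple[Optional[str], str]:
    # recipe_tags (a git call in the real harness) runs only when there is a
    # head version to compare against.
    tags = recipe_tags(recipe) if head_version else None
    older = newest_older_version(tags, head_version)
    if older:
        return older, "release tag"
    return None, "main-tip"
```

Writing `newest_older_version(recipe_tags(recipe), head_version)` directly would give the same result
for a `None` head, but only after paying for (or failing on) the tag lookup.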
### Why wrong-type raises but malformed/absent doesn't

Plan M1: "malformed file handled" (graceful) AND "wrong type errors clearly". Reconciled: absent /
unreadable / TOML-syntax-error → WARN + all-defaults (a broken file degrades to today's behavior and
can't crash CI). A syntactically-valid file with a **known key of the wrong type** → `TypeError` (a
typo'd value should be loud, not silently mis-parsed). bool-is-int-subclass handled: `1`/`0` for a bool
key is rejected, not coerced.

### Pre-existing, OUT OF SCOPE: dashboard lint drift on main

`scripts/lint.sh` reports `dashboard/dashboard.py` + `tests/unit/test_dashboard.py` would be reformatted
by the pinned ruff; confirmed present at HEAD f68f1c5 (`git show HEAD:...` through pinned ruff), NOT in
my diff. Not touched by this phase (narrow scope). Recorded in DECISIONS as an observation. My 5
phase files are format-clean + `ruff check` clean.

### Verification (commands + output)

- `nix shell nixpkgs#python311Packages.pytest -c pytest tests/unit/test_upgrade_base.py
  tests/unit/test_settings.py -q` → **32 passed**.
- full unit suite `pytest tests/unit/ -q` → **315 passed**.
- `ruff check runner/ tests/unit/ bridge/ dashboard/` → All checks passed.
- `ruff format --check` (pinned) on my 5 files → all formatted.

## 2026-06-17 — M2 prep (read-only; not advancing past M1 gate)

Server canonical registry (`/var/lib/ci-warm/<recipe>/canonical.json`, status all `idle`):

- **WITH canonical** (16): cryptpad, custom-html, custom-html-tiny, drone, ghost, gitea, hedgedoc,
  immich, lasuite-docs, lasuite-drive, lasuite-meet, mailu, matrix-synapse, n8n, plausible, uptime-kuma.
- **warm dir but NO canonical.json** (candidates for M2 evidence (a) "recipe without a canonical →
  newest release tag < head"): **keycloak, alerts, traefik**.

M2 plan (after M1 PASS):

- (a) pick a no-canonical recipe WITH published release tags (keycloak has many) → show
  `resolve_upgrade_base` returns a release-tag base, not raw main-tip. Likely via a harness dry-run /
  targeted invocation on the server reading the live settings (absent file → default false).
- (b) drop a scratch `/etc/cc-ci/settings.toml` with `skip_canonicals_for_upgrade = true`, show a
  canonical-bearing recipe (e.g. gitea/ghost) now resolves to the release-tag base (canonical bypassed),
  then remove the scratch file → restore default false.
- Deploy: ensure `/etc/cc-ci` is at the phase commit (git pull); settings.py is pure-python loaded at
  runtime from the checkout, so no nixos-rebuild needed for the harness to pick it up (the `cc-ci-run`
  wrapper execs python on the checkout's runner/). Confirm on server.

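The with/without-canonical split listed at the top of this entry could be reproduced with a scan like
the one below. The directory layout (`<root>/<recipe>/canonical.json`) is from this entry; the function
name `split_by_canonical` is hypothetical, and the sketch only tests for the file's presence without
reading its contents or status.

```python
from pathlib import Path


def split_by_canonical(warm_root: Path) -> tuple[list[str], list[str]]:
    """Return (recipes with canonical.json, recipes with a warm dir but no canonical.json)."""
    with_canonical: list[str] = []
    without_canonical: list[str] = []
    for recipe_dir in sorted(p for p in warm_root.iterdir() if p.is_dir()):
        has_canonical = (recipe_dir / "canonical.json").is_file()
        (with_canonical if has_canonical else without_canonical).append(recipe_dir.name)
    return with_canonical, without_canonical
```

On the server this would be called as `split_by_canonical(Path("/var/lib/ci-warm"))`.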
## 2026-06-17 — M1 PASS + M2 verified live, claimed

M1 Adversary cold-PASS (REVIEW-settings.md @17:00Z, no VETO). Advanced to M2.
Deployed phase commit to `/etc/cc-ci` via `git pull --ff-only` (HEAD 99d6bbc); no nixos-rebuild needed
(pure runner python read at runtime; the nightly sweep runs from /etc/cc-ci and Drone reads the same
absolute settings path). Added `scripts/show-upgrade-base.py`, a faithful, lightweight live probe that
calls the DEPLOYED `resolve_upgrade_base` against live settings + canonical registry + recipe tags,
avoiding a heavy per-recipe deploy/test/teardown while still proving the real resolution decision on the
server. Chose this over full `cc-ci-run runner/run_recipe_ci.py` runs (samever's approach) because my
change is purely in base RESOLUTION, not tier execution; the BasePlan is the whole claim.

Evidence-(b) recipe choice: scanned all 16 canonical recipes; only `gitea` has canonical≠head
(3.5.3 vs 3.6.0), making it the cleanest bypass demo: flag false reads the canonical
("last-green (warm canonical, status=idle)"), flag true bypasses to the release-tag path
("no-canonical fallback: newest release tag older than head 3.6.0..."). The resolved version is 3.5.3
both ways (the canonical happens to equal the newest predecessor tag), so the REASON string is the proof
of bypass, which is honest and matches the plan wording "ALSO resolve to that release-tag base (canonical
bypassed)". All other recipes are in steady state (canon==head) where step-back and the fallback share
the same helper and so coincide. Server restored to steady state (settings.toml absent → false).