# REVIEW-rcust.md — Adversary ledger for the recipe-customization restructure phase

SSOT for this phase: `/srv/cc-ci/cc-ci-plan/recipe-custom-restructure-full-plan.md`.
Gates: **M1** (implementation verified — branch `restructure/recipe-custom`, unit+concurrency+lint
green on cold clone, resolved-customization diff clean for all 21 recipes, adversarial diff review)
and **M2** (merged + real-CI regression sweep matching baseline matrix). DONE requires fresh PASS
for both with no open VETO.

I own this file and the `## Adversary findings` section of BACKLOG-rcust.md only.

---

## Standing watch items (what I will hunt at M1/M2)

- **Coverage loss** (cardinal risk): for every migrated recipe, old loaders' effective customization
values must equal new `meta.load()` values. Throwaway diff script over all 21 recipe dirs; any
delta = finding.
- **Assertion weakening** in `tests/<recipe>/` diffs — migrations must be mechanical only (signatures,
fixture/key renames, underscore prefixes). Any changed assert/expected value = VETO.
- **Deleted-code fallout** — dangling refs to `_recipe_meta`, `_load_meta`, `_recipe_extra_env`,
`_recipe_meta_flag`, `declared_deps`, `is_canonical_enrolled`, `OIDC_AT_INSTALL`,
`CHAOS_BASE_DEPLOY`, `SKIP_GENERIC`, `setup_custom_tests`, `deps_apps`, `deps_creds`, `deployed_app`.
- **Validation gaps** — typo'd key / wrong type / callable-on-data-key must raise MetaError, not pass.
- **R2 fixed end-to-end** — orchestrator load path delivers SCREENSHOT to screenshot.py.
- **HC2 / F2-11 integrity** — repo-local default-deny, requires_deps skip-report, generic floor
semantics all unchanged.

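
The coverage-loss check above amounts to resolving each recipe twice and diffing the results. A minimal sketch, assuming hypothetical `load_old` / `load_new` callables that each return a plain dict of already-resolved values. The real loaders and key registry live in the harness; the hook-key set and the normalization choices here are assumptions for illustration. An empty `sweep()` result corresponds to "zero deltas".

```python
from typing import Any, Callable, Iterable

# Assumed hook keys: only their presence is compared, never their bodies.
HOOK_KEYS = {"READY_PROBE", "BACKUP_VERIFY", "SCREENSHOT"}


def normalize(values: dict[str, Any]) -> dict[str, Any]:
    """Reduce one resolved customization dict to a comparable form."""
    out: dict[str, Any] = {}
    for key, value in values.items():
        if key in HOOK_KEYS or callable(value):
            out[key] = True if value is not None else None  # presence only
        elif value is None or value == {}:
            out[key] = None  # unset and empty-dict defaults count as equal
        else:
            out[key] = value
    return out


def diff_recipe(old: dict[str, Any], new: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {key: (old, new)} for every key whose normalized value differs."""
    a, b = normalize(old), normalize(new)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


def sweep(
    recipes: Iterable[str],
    load_old: Callable[[str], dict[str, Any]],
    load_new: Callable[[str], dict[str, Any]],
) -> dict[str, dict[str, tuple[Any, Any]]]:
    """Diff every recipe; any entry in the result is a finding."""
    findings = {}
    for recipe in recipes:
        delta = diff_recipe(load_old(recipe), load_new(recipe))
        if delta:
            findings[recipe] = delta
    return findings
```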
---

## Verdicts

_(no GATE verdict yet — M1 is not claimed, and it can only be claimed once P1–P6 are all on the branch.
Builder has landed P1 (472a68b) + P2 (8cd72fd) and is mid-P3. The interim pre-review below is
front-loaded break-it work on the FROZEN P1/P2 commits — NOT an M1 PASS.)_

### Interim pre-review of frozen P1+P2 (branch @ 8cd72fd) — @2026-06-10, cold from upstream clone

Done as idle-time break-it work while no gate is pending. P1/P2 phase commits won't be rewritten
(Builder adds P3+ on top), so reviewing them now is non-wasted and front-loads M1. Cold clone of
`origin/restructure/recipe-custom` into `/tmp/rcust-verify` from the true upstream remote.

**No defects found so far.** Results:

1. **Deleted-code fallout — CLEAN.** Grepped `runner/ tests/ scripts/` for live refs to every deleted
symbol (`_recipe_meta`, `_load_meta`, `_recipe_extra_env`, `_recipe_meta_flag`, `declared_deps`,
`is_canonical_enrolled`, `OIDC_AT_INSTALL`, `CHAOS_BASE_DEPLOY`, `SKIP_GENERIC`,
`setup_custom_tests`, `deps_apps`, `deps_creds`, `deployed_app`). All hits are comments/docstrings
explaining the deletion, test names, or the intentionally-RETAINED `CCCI_SKIP_GENERIC*` env form
(kept per P2c). Zero live call-sites. `setup_custom_tests.sh` files gone.
2. **All-recipes-load-clean (typo gate) — PASS, independently.** Ran `meta.load()` (pure stdlib) over
all 21 recipe dirs cold via plain python3 (did NOT trust the Builder's test_meta.py). All 21 load;
non-default key sets sane. Every ALL-CAPS key used in any recipe_meta.py is in the 14-key registry.
3. **Coverage-loss diff (CARDINAL check) — ZERO deltas on data keys + hook presence.** Throwaway
harness (`/tmp/diff_meta.py`) reproduces main's six-loader effective resolution (`_load_meta`,
`declared_deps`, `is_enrolled`, `_recipe_extra_env`) from MAIN's recipe_meta files and diffs vs the
BRANCH's `meta.load()` for all 21 recipes. After correcting one harness artifact (EXTRA_ENV default
is `{}` not None), **0/21 recipes show any delta** for HEALTH_PATH/HEALTH_OK/DEPLOY_TIMEOUT/
HTTP_TIMEOUT/BACKUP_CAPABLE/EXPECTED_NA/UPGRADE_BASE_VERSION/DEPS/WARM_CANONICAL + presence of
READY_PROBE/BACKUP_VERIFY/UPGRADE_EXTRA_ENV/EXTRA_ENV/SCREENSHOT.
4. **Validation gaps — CLOSED.** Crafted tmp recipe_metas: typo'd key → MetaError (with "did you mean
DEPLOY_TIMEOUT?"); wrong type (`DEPLOY_TIMEOUT="str"`) → MetaError; callable on data key
(`DEPLOY_TIMEOUT=lambda ctx:...`) → MetaError; `_PRIVATE`/lowercase-helper → loads clean (exemption
works). All four behave per the locked decision.
5. **meta.py read** — single `exec()`, frozen `RecipeMeta` generated from `KEYS`, `_coerce` rejects
bool-as-int and callable-on-data-key; `non_default` compares vs registry default. No issues.

**Still UNVERIFIED for M1 (do NOT treat above as M1 PASS):** full `pytest tests/unit -q` +
`pytest tests/concurrency -q` + `scripts/lint.sh` cold on the cc-ci host; R2 end-to-end through the
real orchestrator screenshot path; P3 ctx-hook signature migration (assert byte-identical, legacy
`lambda domain:` raises clear MetaError); P4/P5/P6; re-run the coverage diff on the FINAL branch
(P3 changes hook signatures); recipe-test diffs are mechanical-only (no assertion weakening);
HC2/F2-11/generic-floor integrity. These wait for the `claim(rcust): M1`.

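
The validation behavior probed in items 4 and 5 above can be illustrated with a small stand-alone validator. This is a sketch under assumptions, not the actual `meta.py`: the two-key `KEYS` registry, its defaults, and the `validate` helper are hypothetical. Only the rejection rules (unknown key with a suggestion, wrong type, bool-as-int, callable on a data key, underscore/lowercase exemption) follow the notes above.

```python
import difflib
from typing import Any


class MetaError(Exception):
    """Raised when a recipe_meta namespace fails validation."""


# Hypothetical registry: key -> (expected type, default value).
KEYS: dict[str, tuple[type, Any]] = {
    "DEPLOY_TIMEOUT": (int, 600),
    "HEALTH_PATH": (str, "/"),
}


def validate(namespace: dict[str, Any]) -> dict[str, Any]:
    """Resolve a namespace against KEYS, raising MetaError on any bad entry."""
    resolved = {key: default for key, (_, default) in KEYS.items()}
    for name, value in namespace.items():
        if name.startswith("_") or not name.isupper():
            continue  # private names and lowercase helpers are exempt
        if name not in KEYS:
            close = difflib.get_close_matches(name, list(KEYS), n=1)
            hint = f" (did you mean {close[0]}?)" if close else ""
            raise MetaError(f"unknown key {name}{hint}")
        expected, _default = KEYS[name]
        if callable(value):
            raise MetaError(f"{name} is a data key and cannot be a callable")
        if isinstance(value, bool) and expected is not bool:
            # bool is a subclass of int, so it must be rejected explicitly
            raise MetaError(f"{name} expects {expected.__name__}, got bool")
        if not isinstance(value, expected):
            raise MetaError(
                f"{name} expects {expected.__name__}, got {type(value).__name__}"
            )
        resolved[name] = value
    return resolved
```

With this shape, `validate({"DEPLOY_TIMOUT": 900})` raises `MetaError` with a "did you mean DEPLOY_TIMEOUT?" hint, while `_PRIVATE` names and lowercase helpers are skipped without error.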
### Interim pre-review of frozen P3 (branch @ fd02d9f) — @2026-06-10, cold from upstream clone

Builder landed P3 (uniform ctx hook convention) and moved to P4, so P3 is frozen. Pre-reviewed it.
**No defects found.**

1. **Mechanical-migration discipline — HELD (no VETO trigger).** `git diff 8cd72fd..fd02d9f` over
`tests/*/` shows ZERO changed assert/expected literals. Every hook change is purely
`def HOOK(domain[, meta])` → `def HOOK(ctx)` + `domain` → `ctx.domain` in the body. Spot-checked
cryptpad/mumble/ghost/lasuite-drive recipe_meta.py + lasuite-drive ops.py: seeded values, return
dicts, paths, status codes, and the `pre_restore` `assert _psql(...) in (...)` are byte-identical
apart from the `ctx.` deref.
2. **HookCtx — present + complete.** `meta.HookCtx` frozen dataclass has all 5 documented fields
(`.domain`, `.base_url`, `.meta`, `.deps`, `.op`); `meta.hook_ctx(domain, meta, op=…)` factory
builds it and pulls `deps` from `$CCCI_DEPS_FILE`. All call sites migrated: run_recipe_ci
`pre_<op>`, BACKUP_VERIFY; lifecycle `extra_env` + READY_PROBE; screenshot `SCREENSHOT(page, ctx)`.
(NB my first pass falsely flagged "no HookCtx" — that was a STALE WORKTREE at P2; corrected by
checking out fd02d9f. Logged here for honesty.)
3. **Legacy-signature guard (P3.4) — PRESENT + works, live-probed.** `meta.check_hook_signature`
exact-matches positional params and raises a CLEAR MetaError naming the P3 migration + HookCtx
fields. Wired into both `load()` (recipe_meta hooks; SCREENSHOT expects `(page, ctx)`, rest
`(ctx)`) and the orchestrator (ops.py `pre_<op>`). Crafted tmp metas: legacy `READY_PROBE(domain)`,
`SCREENSHOT(page, domain, meta)`, `EXTRA_ENV(domain)` all → MetaError at load; `READY_PROBE(ctx)`
loads clean. No silent mid-run TypeError path.
4. **Coverage diff re-run at P3 head — still 0/21 deltas** (hook presence + all data keys unchanged).

Net: P1+P2+P3 all clean under cold adversarial probing. M1 still gated on full unit+concurrency+lint
on the cc-ci host, P4–P6, R2 end-to-end via the real screenshot orchestrator path, and a final
coverage re-diff. No findings filed; no VETO.

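
The legacy-signature guard described in item 3 above can be approximated with `inspect.signature`. A minimal sketch, assuming the guard compares positional parameter names exactly. The field names on `HookCtx` come from the notes above, but the field types, defaults, and the error wording are assumptions, and this is not the harness's own `check_hook_signature`.

```python
import inspect
from dataclasses import dataclass, field
from typing import Any, Callable, Optional, Sequence


class MetaError(Exception):
    """Raised when a hook does not follow the ctx convention."""


@dataclass(frozen=True)
class HookCtx:
    # Field names follow the review notes; types and defaults are assumed.
    domain: str
    base_url: str
    meta: Any = None
    deps: dict = field(default_factory=dict)
    op: Optional[str] = None


def check_hook_signature(
    name: str, hook: Callable[..., Any], expected: Sequence[str] = ("ctx",)
) -> None:
    """Reject hooks whose positional parameters are not exactly `expected`."""
    positional = tuple(
        p.name
        for p in inspect.signature(hook).parameters.values()
        if p.kind
        in (inspect.Parameter.POSITIONAL_ONLY, inspect.Parameter.POSITIONAL_OR_KEYWORD)
    )
    if positional != tuple(expected):
        raise MetaError(
            f"{name}({', '.join(positional)}) uses a legacy signature; "
            f"expected {name}({', '.join(expected)}). Hooks receive a HookCtx "
            "with .domain, .base_url, .meta, .deps and .op"
        )
```

Checking at load time, rather than at call time, is what turns a would-be mid-run `TypeError` into an early, named error.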
### Interim pre-review of frozen P4 (branch @ 29a28e2) — @2026-06-10T18:55Z, cold from fresh host clone

Builder landed P4 (custom-test ergonomics) and moved to P5, so P4 is frozen. Pre-reviewed it cold.
**No defects found.** NOT an M1 verdict — M1 stays gated (see "Still UNVERIFIED" below).

Cold acceptance (fresh `git clone` on cc-ci host at 29a28e2, my own checkout — not the Builder's):
- `cc-ci-run -m pytest tests/unit -q` → **184 passed** (exact match to claim; full suite, no
cross-fixture pollution from the session-scoped `deps` fixture).
- `cc-ci-run -m pytest tests/unit/test_discovery.py test_discovery_phase2.py
test_conftest_fixtures.py -q` → 14 passed.
- `nix develop .#lint --command scripts/lint.sh` → **lint: PASS** (ruff format/check, deadnix,
shfmt, shellcheck, yamllint all clean).

Correctness probes:
1. **Placement-rule claim ("zero in-repo users of top-level custom tests") — HOLDS.** Filesystem
sweep of every `tests/<recipe>/test_*.py`: ALL are lifecycle names (test_{install,upgrade,
backup,restore}.py). No top-level non-lifecycle custom exists in-repo, so dropping the top-level
glob in `discovery.custom_tests` loses ZERO coverage. The lifecycle-name exclusion is retained
inside functional/playwright as the double-run safety net.
2. **Discovery diff — clean.** Top-level `glob(test_*.py)` branch removed; functional/ + playwright/
subdir globs retained with `basename not in lifecycle_names` guard. Docstring + module header
updated to state the placement RULE.
3. **Test changes are adaptation + strengthening, NOT weakening (no VETO trigger).**
   - `test_discovery_phase2`: renamed to `..._placement_rule_...`; now ASSERTS the top-level
`test_sso_smoke.py` is `not in names` (new negative assertion proving the behavior change),
while functional/playwright customs are still `in names` and lifecycle name excluded.
   - `test_discovery::test_custom_tests_repo_local_gated`: repo-local custom moved from top-level
into `functional/`; HC2 default-deny (`== []` when unapproved) and approved-case
(`functional/test_sso.py in names`, `test_install.py` excluded) both INTACT. HC2 integrity
preserved.
4. **op_state fixture — correct.** Skips with clear reason on unset env / missing file / non-JSON
(`except ValueError` catches JSONDecodeError); reads & returns parsed dict otherwise. Tests
cover 3 of 4 paths (the non-JSON skip path is untested — minor coverage gap, not a defect; the
branch is trivially correct by inspection).

Net: P1+P2+P3+P4 all clean under cold adversarial probing; both halves of every phase claim
(unit count + lint) reproduced cold on a fresh clone. No findings filed; no VETO.

**Still UNVERIFIED for M1 (do NOT treat above as M1 PASS):** P5 (manifest) + P6 (docs);
`pytest tests/concurrency -q` cold; R2 end-to-end through the real orchestrator screenshot path;
final coverage re-diff on the COMPLETE branch (P1–P6, all 21 recipes, effective customization set
unchanged); recipe-test diffs mechanical-only across the whole branch; HC2/F2-11/generic-floor
integrity at the final head. These wait for `claim(rcust): M1`.

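
The placement rule checked in probes 1 and 2 above can be sketched as a small discovery function: custom tests are picked up only from `functional/` and `playwright/`, and lifecycle basenames are excluded even there. This is an illustration under assumptions. The names `custom_tests`, `LIFECYCLE_NAMES`, and `demo` are hypothetical stand-ins, and the HC2 approval gating that the real `discovery` module applies is deliberately left out.

```python
import tempfile
from pathlib import Path

LIFECYCLE_NAMES = {"test_install.py", "test_upgrade.py", "test_backup.py", "test_restore.py"}
CUSTOM_SUBDIRS = ("functional", "playwright")


def custom_tests(recipe_dir: Path) -> list[Path]:
    """Collect custom tests from the two subdirs only; top-level files are ignored."""
    found: list[Path] = []
    for sub in CUSTOM_SUBDIRS:
        subdir = recipe_dir / sub
        if not subdir.is_dir():
            continue
        for path in sorted(subdir.glob("test_*.py")):
            if path.name not in LIFECYCLE_NAMES:  # double-run safety net
                found.append(path)
    return found


def demo() -> list[str]:
    """Build a throwaway recipe dir and return the discovered relative paths."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for rel in (
            "test_install.py",             # lifecycle test, not a custom
            "test_sso_smoke.py",           # top-level custom: no longer discovered
            "functional/test_sso.py",
            "functional/test_install.py",  # lifecycle name inside a subdir: excluded
            "playwright/test_ui.py",
        ):
            target = root / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text("")
        return [p.relative_to(root).as_posix() for p in custom_tests(root)]
```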
### Interim pre-review of frozen P5 (branch @ 68954be) — @2026-06-10T19:06Z, cold from fresh host clone

Builder landed P5 (customization manifest) and moved to P6, so P5 is frozen. Pre-reviewed it cold.
**No blocking defect; one secret-SURFACE observation raised (heads-up to Builder, NOT a VETO, NOT
an M1 secret-leak failure).** NOT an M1 verdict.

Cold acceptance (fresh `git clone` on cc-ci host at 68954be, my own checkout):
- `cc-ci-run -m pytest tests/unit -q` → **191 passed** (exact match to claim).
- `nix develop .#lint --command scripts/lint.sh` → **lint: PASS**.

Primary adversarial target — SECRET LEAKAGE via the new manifest surface (D-gate: published logs +
dashboard contain NO secrets, incl. generated app passwords):
1. **Generated/runtime secrets — NOT exposed (gate holds).** `manifest.build` collects only:
`meta_non_default` (static recipe_meta), hook NAMES (pre-ops/install_steps.sh/compose.ccci.yml),
overlay FILENAMES, custom-test COUNTS, and env-override KEY names (printed `KEY=1`, value never
rendered). It never touches `deps` (client_secret), `op_state`, abra-generated app passwords, or
any env VALUE. The cardinal concern — generated app passwords on the dashboard — is structurally
absent from this surface.
2. **Cold all-recipes sweep.** Built+rendered the manifest for all 21 recipes on the host; grepped
the rendered blocks AND the results.json `customization` payload for secret/password/token/key/
credential and for any 32+ char high-entropy string. The ONLY hit, across every recipe, is
plausible's `EXTRA_ENV.SECRET_KEY_BASE` =
`"ccciplausibletestkeybase64charsexactlyforCIephemeral4567890123"`.
3. **OBSERVATION (not a leak):** that value is a HARDCODED, committed, PUBLIC dummy CI constant
(tests/plausible/recipe_meta.py, in the open-source repo) — not a generated or real secret.
`meta_non_default` dumps EXTRA_ENV literal dicts verbatim into the log AND results.json (→
dashboard), so a field literally named `SECRET_KEY_BASE` with a value now appears on the
dashboard. No real secret is exposed (it's public), so this is NOT a D-gate failure and does NOT
block P5. BUT it's a standing surface: (a) a dashboard secret-scan gets a true-positive-shaped
hit on a public dummy (noise that could mask a real leak), and (b) if any recipe ever set a real
secret-ish literal in a meta dict, the manifest would surface it unredacted. Flagged to Builder
via BUILDER-INBOX as a heads-up to consider redacting values of sensitive-named meta keys before
M1. Will re-examine on the real dashboard at the M1 cold-verify.
4. **HC2-honoring — confirmed.** Manifest routes ALL repo-local reads through `discovery._gated`
(ops.py loop direct; `install_steps`/`resolve_overlay_op`/`custom_tests` each call `_gated`
internally). An unapproved repo-local recipe contributes nothing to the manifest.
5. **Pure presentation — holds.** `build()` only reads files/env and returns a dict; `render()`
formats a string. Called at run_recipe_ci.py:889-890 (print) + embedded at :1261 into results;
no state mutation, no verdict influence. `_jsonable` renders callables as `'<hook>'` (so a
callable EXTRA_ENV/READY_PROBE never leaks closure internals) and tuples→lists for JSON.

Net: P1–P5 all clean under cold adversarial probing; every phase claim (unit count + lint)
reproduced cold. No findings filed; no VETO. One non-blocking secret-surface heads-up sent.

**Still UNVERIFIED for M1:** P6 (docs); `pytest tests/concurrency -q` cold; R2 end-to-end via the
real orchestrator screenshot path; final coverage re-diff on the COMPLETE branch (all 21 recipes,
effective customization unchanged); recipe-test diffs mechanical-only across the whole branch;
HC2/F2-11/generic-floor integrity at final head; AND — at the M1 dashboard check — confirm the
SECRET_KEY_BASE-named field on the real dashboard is the accepted public dummy (or redacted).
These wait for `claim(rcust): M1`.

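
To make the secret-surface heads-up concrete, here is a minimal sketch of a manifest value renderer that combines the `_jsonable` behavior from item 5 (callables become `'<hook>'`, tuples become lists) with the redaction suggested in item 3. It is a sketch under assumptions, not the Builder's code: the function name `jsonable`, the `SENSITIVE` word list (taken from the grep terms in item 2), and the `<redacted>` placeholder are illustrative. Matching whole underscore-separated segments is one way to avoid flagging names such as `KEYCLOAK_URL`.

```python
import re
from typing import Any

# Match a sensitive word only as a whole underscore-separated segment of the key.
SENSITIVE = re.compile(
    r"(^|_)(SECRET|PASSWORD|TOKEN|KEY|CREDENTIAL)(_|$)", re.IGNORECASE
)


def jsonable(value: Any, key: str = "") -> Any:
    """Render a meta value for JSON without exposing hooks or secret-named values."""
    if key and SENSITIVE.search(key):
        return "<redacted>"  # the name decides, the value is never inspected
    if callable(value):
        return "<hook>"  # never serialize closure internals
    if isinstance(value, dict):
        return {str(k): jsonable(v, str(k)) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [jsonable(v) for v in value]
    return value
```

Because the rule is keyed on names, a secret stored under an innocuous key would still pass through, so a value-level scan of the rendered output remains a useful second check.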
## M1 — implementation verified: **PASS** @2026-06-10T19:27Z (branch `restructure/recipe-custom` @ 858e0f5)

Cold-verified from TWO fresh clones on the cc-ci host (NEW=858e0f5, OLD=main pre-restructure;
merge-base 49fb818 confirmed → `main..858e0f5` is exactly P1–P6). Verdict formed from the phase plan
(SSOT), the code/git history, the STATUS verification facts, and my own cold re-runs — NOT from
JOURNAL rationale (isolation discipline; I did not need to consult JOURNAL).

**All M1 Definition-of-Done items PASS:**

1. **Cold test suites — match claim exactly.** Fresh clone @858e0f5:
`cc-ci-run -m pytest tests/unit -q` → **192 passed**; `tests/concurrency -q` → **23 passed**
(untouched by this plan, proven); `nix develop .#lint --command scripts/lint.sh` → **lint: PASS**.

2. **Coverage diff (cardinal risk) — 0 REAL deltas / 21 recipes.** Wrote throwaway extractors that
resolve EVERY recipe's effective customization in BOTH worlds — OLD via the legacy loaders
(`_load_meta` + `lifecycle._recipe_extra_env` + `deps.declared_deps` + `_recipe_meta_flag`),
NEW via `meta.load()` + `meta.extra_env/upgrade_extra_env` — for the common keys (HEALTH_*,
timeouts, DEPS, EXTRA_ENV resolved at a fixed domain, UPGRADE_EXTRA_ENV, BACKUP_CAPABLE,
EXPECTED_NA, UPGRADE_BASE_VERSION, READY_PROBE/BACKUP_VERIFY presence). Diff = **0 behavioral
deltas**; the only raw diffs were 20× `UPGRADE_EXTRA_ENV: None→{}` (unset default representation,
behaviorally identical). mumble (the most-customized recipe: callable EXTRA_ENV→dict,
UPGRADE_EXTRA_ENV, READY_PROBE) is **byte-identical** old↔new.
Deleted keys accounted for (no silent loss): `SKIP_GENERIC` (0 recipe users); `CHAOS_BASE_DEPLOY`
→ overlay-presence (discourse+ghost, exactly the two shipping compose.ccci.yml — perfect 1:1, no
change either direction); `OIDC_AT_INSTALL` → install-time made universal (drive+meet were
already install-time). **lasuite-docs** declared DEPS but NOT OIDC_AT_INSTALL → OLD post-install,
NEW install-time: an INTENTIONAL P2b consolidation, not a drop — flagged below for M2 validation.

3. **Assertion weakening (VETO-class) — NONE.** Full branch diff over all recipe test files
(excl. harness unit/concurrency/regression): 18 removed asserts, 18 added. After mechanical
normalization (`domain`→`ctx.domain`, `deps_creds`→`deps`, `MAX_USERS`→`_MAX_USERS`, whitespace)
the removed and added assert sets are **IDENTICAL** — zero unmatched in either direction. Every
change is a pure signature/fixture/constant rename; no expected value altered, no assert deleted.
Spot-confirmed discourse/ghost `_psql(domain,…ci_marker…) in (…)` → `ctx.domain` only (expected
tuple + SQL byte-identical). **No VETO.**

4. **Deleted-code fallout — clean.** No dangling LIVE refs to any of the 13 deleted symbols
(`_recipe_meta`/`_load_meta`/`_recipe_extra_env`/`_recipe_meta_flag`/`declared_deps`/
`is_canonical_enrolled`/`OIDC_AT_INSTALL`/`CHAOS_BASE_DEPLOY`/`SKIP_GENERIC`/`setup_custom_tests`/
`deps_apps`/`deps_creds`/`deployed_app`). Only residue: stale DOC/comment mentions of
`OIDC_AT_INSTALL` + `setup_custom_tests.sh` in PARITY.md files (non-blocking P6 cosmetic nit).

5. **Validation gaps — closed.** Cold-probed `meta.load()` with synthetic bad metas: typo'd key,
str-on-int, bool-as-int, callable-on-data-key, legacy hook sig `READY_PROBE(domain)`, and unknown
key ALL → `MetaError` (clear, names the offending file/key). Clean + underscore-private-helper
metas load fine (no false positives). No silent pass.

6. **R2 fixed end-to-end.** Cold proof through the REAL load path: a recipe declaring
`def SCREENSHOT(page, ctx)` is surfaced by `meta.load()` and resolved callable by
`screenshot._load_screenshot_hook` (old L1 allowlist dropped it — now arrives); orchestrator wires
it `run_recipe_ci.py:1029 capture(…, recipe_meta=meta)` → `hook(page, hook_ctx(domain, meta))`.
Absent recipe → None (default landing-page path). Legacy `SCREENSHOT(page, domain, meta)` sig
rejected at load.

7. **HC2 / F2-11 / generic-floor integrity — preserved.** Cold-probed `discovery.custom_tests` +
`install_steps`: UNAPPROVED repo-local → `[]` / `None` (default-deny holds); APPROVED → surfaced.
`sso_dep_unverified` (F2-11) logic UNCHANGED (only a comment edited) — a deps-not-ready run that
skips ≥1 `requires_deps` test still suppresses the green signal. Generic floor `_skip_generic`
default = run (additive); opt-out now env-only (same env vars as before; the 0-user meta key
removed) and surfaced LOUDLY in CI + flagged `!!` in the manifest — strictly stronger, never
silent.

8. **(Bonus) P5 secret-surface heads-up RESOLVED + verified.** The Builder landed `858e0f5`
redacting secret-named meta values in the manifest (my P5 BUILDER-INBOX ask). Cold-verified:
`plausible.EXTRA_ENV.SECRET_KEY_BASE` → `<redacted>` in BOTH the log block and results.json;
recursive into nested dict keys; word-segment `(^|_)KEY(_|$)` regex avoids over-match
(KEYCLOAK_* passes). All-21-recipe sweep: exactly 1 redaction, ZERO over-redaction, ZERO
under-redaction (no secret-shaped value remains). Regression test
`test_manifest_redacts_sensitive_named_values` present.

**Verdict: M1 PASS.** No findings filed, no VETO.

**This does NOT clear `## DONE`.** Per the phase DoD, DONE requires a fresh Adversary PASS for BOTH
M1 *and* M2. M2 (merged-main real-CI regression sweep vs the committed baseline matrix) is still
unverified. M2 watch-items I will specifically re-check from run logs:
- **lasuite-docs OIDC is now install-time** (post→install change above) — must pass a real run with
OIDC wired at install (skip-count 0 on its `requires_deps` tests).
- the customization spot-checks the plan §M2.4 enumerates (mumble READY_PROBE tcp lines, cryptpad
SANDBOX_DOMAIN, ghost/discourse BACKUP_VERIFY + overlay copy + auto-chaos base deploy, lasuite-*
deps provisioning + OIDC tests ran, immich ops.py seeds, manifest block present in every log,
screenshot.png where capture succeeded).
- canary suite (RED canaries still caught at intended tier) + per-recipe level == baseline matrix.
- zero leaked apps after teardown.

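
The assertion-weakening comparison in item 3 above (normalize the mechanical renames, then require the removed and added assertion sets to match) can be sketched with the standard `ast` module. This is an illustrative sketch, not the throwaway script actually used. The `RENAMES` table mirrors the renames listed in item 3, while the helper names are hypothetical. As a deliberate extension, the sketch also counts `raise AssertionError(...)` statements, which a plain text match on `assert ` would not see.

```python
import ast
import re
from collections import Counter

# Map both spellings onto one canonical form before comparing.
RENAMES = [
    (r"\bctx\.domain\b", "domain"),
    (r"\bdeps_creds\b", "deps"),
    (r"\b_MAX_USERS\b", "MAX_USERS"),
]


def _raises_assertion(node: ast.Raise) -> bool:
    """True for `raise AssertionError` and `raise AssertionError(...)`."""
    exc = node.exc.func if isinstance(node.exc, ast.Call) else node.exc
    return isinstance(exc, ast.Name) and exc.id == "AssertionError"


def assertion_like(source: str) -> Counter:
    """Count normalized assert statements and AssertionError raises in source."""
    found: Counter = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assert) or (
            isinstance(node, ast.Raise) and _raises_assertion(node)
        ):
            text = ast.unparse(node)  # also normalizes whitespace and quoting
            for pattern, replacement in RENAMES:
                text = re.sub(pattern, replacement, text)
            found[text] += 1
    return found


def unmatched(old_source: str, new_source: str) -> tuple[Counter, Counter]:
    """Return (removed, added); both empty means a purely mechanical migration."""
    old, new = assertion_like(old_source), assertion_like(new_source)
    return old - new, new - old
```

A hook that gains a terminal `raise AssertionError` shows up in the `added` counter even though no `assert` statement changed.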
### M2-prep — independent hook-port audit (shell→python / best-effort↔fatal drift) @2026-06-10T20:55Z

Triggered by the lasuite-drive regression (below), which my M1 PASS MISSED: my M1 coverage diff
compared recipe_meta KEYS (resolved values), not ops.py hook BODIES, and my assertion scan matched
`assert ` not `raise AssertionError`. So a hook that flipped best-effort→fatal was invisible to my
M1 method. M2 (real-CI sweep) caught it — the safety net working as designed. I then audited ALL
hook ports cold (`git diff c2508c7..origin/main` per recipe ops.py + the 2 setup_custom_tests.sh
ports), filtering for non-mechanical error-handling (raise/assert/except/exit/timeout/poll changes):

- **lasuite-drive `pre_install`** — GENUINE rcust regression (Builder-disclosed, I confirmed):
OLD setup_custom_tests.sh bucket poll fell through on 90s timeout (best-effort, no failure; the
custom-tier `test_minio_storage.py` upload→list→download is the real gate); NEW port added a
terminal `raise AssertionError` → deterministic install RED when the bucket appears just after
90s. Fix-forward APPROVED (restore best-effort print+return, scoped to line-54 only; conditioned
on an L5 re-run + my diff re-verify). See approval entry in BUILDER-INBOX history (commit 57c66ad).
- **lasuite-docs `install_steps.sh`** — INTENTIONAL P2b change, NOT a defect: OLD setup_custom_tests
did `exit 1` on missing deps/null KC creds; NEW does `exit 0` (no-op) for missing-deps (gated now
by F2-11: the `@requires_deps` OIDC test skips → `sso_dep_unverified` suppresses green) BUT
preserves `exit 1` on secret-insert failure. Consistent with the install-time-deps redesign.
WATCH-ITEM (residual): the missing-deps path now relies entirely on F2-11; the sweep didn't
exercise it (deps were ready, skip-count 0). Mechanism verified present at M1; not blocking.
- **All other ops.py** (cryptpad, discourse, ghost, immich, keycloak, lasuite-meet, matrix-synapse,
mattermost-lts, mumble, n8n, plausible, custom-html) — pure mechanical ctx migration
(`domain`→`ctx.domain`, `meta`→`ctx.meta`); expected tuples/strings byte-identical (spot-checked
keycloak 201/409 + 204/200, discourse/ghost _psql ci_marker). No error-handling drift.

Net: exactly ONE accidental hook-port regression (lasuite-drive), now under approved fix. No other
best-effort↔fatal flips. This audit closes the M1-method gap for the hook bodies.

---

### M2 proof-run independent analysis (cold, Adversary) @2026-06-10T23:53Z

M2 is NOT yet claimed by the Builder; this is my independent read of the proof runs sitting on
cc-ci (`/var/lib/cc-ci-runs/{m2b-*,ab-*-oldmain}`), parsed myself via jq (NOT trusting Builder
narrative). The 6 first-sweep mismatches break down as follows.

**Confirmed root fact — REF MISMATCH is real (I verified, not taken on faith).** Every baseline
matrix run used a *PR-head* ref; the first M2.3 sweep used each mirror's *default-branch head* — a
different commit. Independently confirmed via `results.json.ref`:

| recipe | baseline run/ref/level | sweep ref/level |
|---|---|---|
| discourse | 184 / 7ae7b0f76efb / L4 | 7d53d4ec390f / L2 |
| plausible | 308 / 13458fac56a1 / L4 | da159375d89a / L2 |
| mattermost-lts | 196 / a333e31a6002 / L4 | 41c9eb8e5f34 / L2 |
| immich | 307 / 107d7220adce / L4 | 7eb3937a82d0 / L2 |
| lasuite-drive | 189 / ffa7d585afa2 / L5 | f4135d78201e / L0 |

So the sweep was NOT apples-to-apples vs the baseline matrix. Reconciliation requires either
(a) re-run at the baseline ref on new main == baseline level, or (b) A/B same-ref old-vs-new main
== same level. Status per recipe:

- **immich** — m2b-immich (new main, baseline ref 107d7220adce) = **L4 == baseline L4. CLEAN.**
- **mattermost-lts** — m2b (new main, a333e31a6002) = **L4 == baseline L4. CLEAN.**
- **plausible** — m2b (new main, 13458fac56a1) = **L4 == baseline L4. CLEAN.**
  → these three: restructure proven INNOCENT (baseline ref reproduces baseline level on merged main).
- **bluesky-pds** — ab-bluesky-pds-oldmain (OLD main, b2d86efba3f1) = L0 == new-main sweep L0 at
same ref → restructure-NEUTRAL at the sweep ref. (Baseline is "L4-equiv, pre-results-era", no run
id — softer baseline; A/B neutrality is the available evidence.)
- **discourse — NOT yet clean. OPEN.** Two *distinct* flake modes seen, and the A/B was run at the
wrong ref to close the gap:
  - baseline 184 (OLD main, 7ae7b0f): all pass → L4.
  - m2b-discourse (NEW main, SAME ref 7ae7b0f): **upgrade FAILED**, HC1 guard fired —
"upgrade deployed chaos commit 'eb96de94+U', not intended PR-head '7ae7b0f76efb' — re-checkout
to code-under-test failed (HC1)" → L1. ← same-ref old=L4 vs new=L1 discrepancy, UNexplained.
  - ab-discourse-oldmain (OLD main, 7d53d4ec): **restore FAILED** (ci_marker truncated-dump race)
→ L2 == new-main sweep L2 at that ref → neutrality proven, but for the RESTORE mode at the
DEFAULT-head ref, NOT for the L1/upgrade-HC1 mode at the baseline ref.
  - Net: the clean A/B (ref 7ae7b0f on OLD main vs NEW main) that would explain L4→L1 was NOT run.
The upgrade re-checkout/HC1 path lives in run_recipe_ci.py/lifecycle which the meta-param
threading DID touch — so "pre-existing flake" is plausible but UNPROVEN here. To clear: run
discourse @7ae7b0f on OLD main (does it deterministically reproduce L4, or also flake to L1?),
and/or repeat @7ae7b0f on new main to characterise the HC1 re-checkout as a race. The HC1 guard
FIRING (not silently passing the wrong commit) is the safety net working — good — but it means
the upgrade did not exercise the PR code, so the run is inconclusive, not a clean baseline match.
- **lasuite-drive** — fix-forward 1357544 (restore best-effort bucket poll) landed; needs a fresh
L5 run at the baseline ref ffa7d585afa2 on merged main to confirm baseline. m2rr/earlier runs
predate or used the default head — NOT yet a clean baseline match. OPEN.

**M2 disposition: still OPEN — no PASS.** 3/6 cleanly reconciled (immich/mattermost/plausible);
bluesky neutral-at-sweep-ref; discourse + lasuite-drive NOT yet closed. I will require, at the M2
claim: (1) discourse same-ref A/B (or repeat) explaining L4→L1; (2) a clean lasuite-drive L5 at
baseline ref; (3) my own cold re-parse of every per-recipe level vs baseline; (4) the M2.4
customization-executed spot-greps; (5) zero leaked apps. Recorded a BUILDER-INBOX heads-up on the
discourse-HC1 gap so it is addressed in the claim, not glossed as "the restore flake".

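
The reconciliation rule used in this analysis (compare levels only when both runs tested the same ref) can be written down as a small check. A minimal sketch using values from the table and status list above. The `Run` record and `reconcile` helper are hypothetical, and although `ref` is read from `results.json` as noted above, treating the level as a plain string field is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    recipe: str
    ref: str    # commit under test, as recorded in results.json
    level: str  # achieved level, e.g. "L4"


def reconcile(baseline: Run, candidate: Run) -> str:
    """Classify a candidate run against its baseline run."""
    if baseline.recipe != candidate.recipe:
        raise ValueError("runs belong to different recipes")
    if baseline.ref != candidate.ref:
        return "ref-mismatch"  # levels are not comparable across refs
    return "clean" if candidate.level == baseline.level else "open"


# Values taken from the table and per-recipe status above.
IMMICH_BASELINE = Run("immich", "107d7220adce", "L4")
IMMICH_SWEEP = Run("immich", "7eb3937a82d0", "L2")
IMMICH_M2B = Run("immich", "107d7220adce", "L4")
DISCOURSE_BASELINE = Run("discourse", "7ae7b0f76efb", "L4")
DISCOURSE_M2B = Run("discourse", "7ae7b0f76efb", "L1")
```

Here the first immich sweep is classified as a ref mismatch rather than a regression, the m2b re-run is clean, and the discourse m2b run stays open because the same ref produced a different level.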
### M2 proof-run progress + self-correction @2026-06-11T00:05Z

Builder is running (independently, matching my inbox ask) the decisive A/B serially on the box:
`m2-proof.sh` → lasuite-drive @ffa7d585afa2 PR=1 (post-fix-forward 1357544) on merged main 5c0676b,
then discourse @7ae7b0f76efb **PR=2** on merged main (m2p-discourse); `m2-proof2.sh` (queued) →
discourse @7ae7b0f76efb **PR=2** on OLD main (/root/m2-oldmain, ab-discourse-7ae7b0f-oldmain).

**Self-correction to my 23:53Z discourse analysis:** my m2b-discourse run used **PR=0**, but the
upgrade HC1 guard resolves the *PR head* for the re-checkout. The L1 failure message ("deployed
chaos commit 'eb96de94+U', not PR-head 7ae7b0f — re-checkout failed") is plausibly a **PR=0
artifact** (no real PR to resolve the head from), NOT a restructure regression. The Builder's proof
runs correctly use PR=2 (matching baseline run 184's pr=2). So the apples-to-apples comparison I
need is m2p-discourse (PR=2, new main) vs ab-discourse-7ae7b0f-oldmain (PR=2, old main) vs baseline
184 (PR=2, old main, L4). I will cold-verify those three when they land; my L4→L1 concern is on
hold pending the PR=2 result, not yet a confirmed regression. Live lasu-f68b63 stack = active
lasuite-drive proof run (expected, not a leak).

### M2 fix-forward APPROVE: be2026a (services_converged completed-one-shot rule) @2026-06-11T00:31Z
|
||
|
||
Builder proposed a 2nd lasuite-drive P2b fix on branch `fix/converged-oneshot @ be2026a` and asked
|
||
approval before merging to main (M2 "trivial fix-forward w/ Adversary approval" path). Cold-verified
|
||
independently (fresh clone of be2026a at /root/adv-be2026a on cc-ci, NOT the Builder's working tree):
|
||
|
||
- **Diff** (`git diff origin/main..be2026a runner/harness/lifecycle.py`, read myself): in
|
||
`services_converged`, a `cur != want` deficit now passes ONLY if `docker service ps <svc>` shows
|
||
ALL task states == `Complete`. Conservative: any Running/Preparing/Pending (spinning up) or
|
||
Failed/Rejected (broken) in the deficit still returns False; no-tasks-yet still False; plain N/N
|
||
and 0/0 unchanged. Targeted addition, not a rewrite.
|
||
- **False-green analysis (my own):** only `restart_policy:none` one-shots ever show `Complete`; a
|
||
normal crashed service shows Failed/Running(restarting), never Complete. Even if converge passed
|
||
on a completed-but-ineffective one-shot, two INDEPENDENT gates still catch it — the generic
|
||
`test_serving` HTTP floor and the custom-tier functional test (lasuite-drive
|
||
`test_minio_storage.py` upload→list→download is the real bucket gate). Defense-in-depth holds; I
|
||
could not construct a false-green path.
|
||
- **Tests** `tests/unit/test_converged_oneshot.py` (read + cold-ran): 7 cases pin exactly the
|
||
non-vacuity criteria — completed→converged, Failed→NOT, mixed Complete+Failed→NOT (covers the
|
||
`docker service ps` history concern), Preparing→NOT, no-tasks→NOT, N/N→converged, 0/0→converged.
|
||
- **Cold suite+lint from fresh be2026a checkout:** `cc-ci-run -m pytest tests/unit -q` → **199
|
||
passed**; the 7 new tests pass alone; `nix develop .#lint --command scripts/lint.sh` → **lint:
|
||
PASS**. Matches Builder's claim.
|
||
- **Root cause judged genuine P2b regression** (hook moved into ops.py pre_install runs BEFORE the
|
||
install assert; the completed one-shot's 0/1 then burns DEPLOY_TIMEOUT in the converge poll). The
|
||
fix accepts a genuinely-healthy deploy (HTTP 200, all other services 1/1) the old `cur!=want`
|
||
wrongly rejected — correction, not masking.
|
||
- **Not on main** — confirmed `all(s == "Complete")` absent from origin/main; Builder held the gate.
- **Disclosed semantic delta** (a failing one-shot now blocks install convergence earlier vs later
  at custom-tier): ACCEPTED — both paths RED, no false-green, no enrolled recipe has a
  baseline-failing one-shot.

**VERDICT: fix-forward be2026a APPROVED, conditional on:**
1. Post-merge lasuite-drive proof re-run @ffa7d585afa2 PR=1 lands **L5** (binding end-to-end proof
   the fix resolves the converge hang — if it doesn't, the diagnosis was wrong and approval voids).
2. I re-verify the MERGED diff == be2026a diff (no extra change sneaks in at merge).
3. discourse PR=2 A/B pair (m2p-discourse / ab-discourse-7ae7b0f-oldmain — no one-shots, unaffected
   by this fix) completes and I cold-verify those levels too.

This APPROVE does NOT clear M2; M2 still needs all per-recipe levels reconciled + my independent
sample re-check + zero-leak teardown.

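The converge rule described in the **Diff** bullet above can be sketched as a pure predicate. This is a minimal illustration under stated assumptions, not the code in `runner/harness/lifecycle.py`: the function name `service_converged`, its arguments, and the idea of passing task states in as a list are all hypothetical (the real `services_converged` presumably reads them from `docker service ps`).

```python
from typing import Sequence


def service_converged(cur: int, want: int, task_states: Sequence[str]) -> bool:
    """Hypothetical per-service check mirroring the reviewed rule.

    cur/want are the running/desired replica counts (the N/N pair);
    task_states are the task states reported for the service.
    """
    if cur == want:
        # Plain N/N and 0/0 are unchanged: already converged.
        return True
    if not task_states:
        # Deficit with no tasks yet: still spinning up, not converged.
        return False
    # A deficit passes ONLY when every task is a completed one-shot.
    # Any Running/Preparing/Pending or Failed/Rejected task keeps it False.
    return all(s == "Complete" for s in task_states)
```

The seven cases listed for `tests/unit/test_converged_oneshot.py` map directly onto this predicate, for example `service_converged(0, 1, ["Complete", "Failed"])` is False because one task in the history failed.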
### be2026a merge cold-verify — condition #2 SATISFIED @2026-06-11T00:42Z

Builder merged be2026a as 6cabbe7 (build 350 green, origin/main now b4505ac). Independently checked:
`diff origin/main:runner/harness/lifecycle.py be2026a:...` → **IDENTICAL**; the merged
`tests/unit/test_converged_oneshot.py` → **IDENTICAL** to be2026a. Clean merge, no extra change
slipped in — approval condition #2 met. m2p-lasuite-drive (pre-fix) landed L0 (install/converge
timeout) = the diagnosed symptom (Builder disclosed in b4505ac that it SIGINT-shortcut the doomed
burn; binding proof is the post-fix m2p2 re-run). REMAINING be2026a conditions: #1 post-fix
lasuite-drive L5, #3 discourse PR=2 A/B cold-check — both pending (m2p-discourse running, then
ab-oldmain, then m2p2-lasuite-drive).

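A merged-vs-branch identity check like the one above could be scripted roughly as follows. This is a sketch under assumptions: `git show <ref>:<path>` is the standard way to print a file at a given ref, but the helper names `git_blob` and `blobs_identical` are hypothetical and are not part of the harness.

```python
import subprocess
from typing import Callable


def git_blob(ref: str, path: str) -> bytes:
    """Return the raw content of `path` at `ref` using `git show ref:path`."""
    result = subprocess.run(
        ["git", "show", f"{ref}:{path}"],
        check=True,            # raise if the ref or path does not exist
        capture_output=True,
    )
    return result.stdout


def blobs_identical(
    ref_a: str,
    ref_b: str,
    path: str,
    read_blob: Callable[[str, str], bytes] = git_blob,
) -> bool:
    """True when `path` is byte-identical at both refs."""
    return read_blob(ref_a, path) == read_blob(ref_b, path)
```

Run from inside the clone, `blobs_identical("origin/main", "be2026a", "runner/harness/lifecycle.py")` would express the check for one file; the injectable `read_blob` only exists so the comparison can be exercised without a repository.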
### be2026a conditions CLEARED + SSO-baseline staleness finding (independent) @2026-06-11T01:12Z

Reached the conclusions below COLD (own git archaeology + run-dir jq) BEFORE reading the Builder's
01:10Z inbox — which then concurred. Anti-anchoring preserved (no JOURNAL read; inbox read after my
own derivation).

**be2026a fix-forward — ALL 3 CONDITIONS SATISFIED → fix-forward FULLY CLEARED:**
1. **Post-fix lasuite-drive (m2p2, merged main 6cabbe7, ffa7d585afa2, PR=1): L4, rc=0, 3m19s.**
   Independently verified: flags clean_teardown=true + no_secret_leak=true; all 4 essential rungs
   pass; `test_minio_storage::...object_roundtrip` PASSED; `test_oidc_..._keycloak` PASSED. The
   install converge no longer hangs — both fix-forwards (1357544 best-effort poll + 6cabbe7
   completed-one-shot converge) exercised in one run. The literal "L5" in my condition is
   **unmeetable on current code and NOT an rcust effect** — see staleness finding below; I accept
   the L4-equivalence. Fix works end-to-end.
2. **Merged diff == branch diff** — verified earlier (4428e76): lifecycle.py + test file
   byte-identical to be2026a.
3. **discourse A/B — restructure-NEUTRAL.** m2p-discourse (NEW main, 7ae7b0f, PR=2) = L1 and
   ab-discourse-7ae7b0f-oldmain (OLD main, SAME ref, SAME PR=2) = L1, SAME stage (upgrade), SAME
   message (`eb96de94+U` HC1 re-checkout). old==new byte-identical → rcust did NOT regress discourse.
   The L4(184)→L1 vs baseline is pre-existing env drift since 06-05 (filed below), not rcust.

**FINDING [adversary] — M2 baseline matrix has 3 STALE L5 entries (lasuite-docs/drive/meet).**
Independently established: the level ladder dropped 6-rung(L5)→4-rung(max L4, integration &
recipe-local now OPTIONAL/non-laddered) in mainline PR#6 (c51cd84 "4-rung ladder", + 46e2cdb),
which `git merge-base --is-ancestor c51cd84 01e6d49^` confirms is an ANCESTOR OF PRE-RCUST MAIN.
The rcust merge touches level.py NOT AT ALL and results.py by +4 cosmetic P5 lines; compute_level
+ derive_rungs are byte-identical old-main↔merged-main. So NO current-code run (rcust or pre-rcust)
can produce L5; baselines 188/189/204 (L5, integration:pass) were recorded under the OLD schema
(run 204 ran 06-09 hours before the refactor deployed). **rcust is INNOCENT of L4≠L5.** Integration
coverage is NOT lost: the requires_deps OIDC tests EXECUTE and PASS (skip-count 0) on current code —
verified in m2p2 AND the sweep's m2r-lasuite-docs (`test_oidc_login_via_keycloak` +
`test_oidc_password_grant_...` PASSED) and m2r-lasuite-meet (`...password_grant...` PASSED).
ACCEPTED equivalence for the M2 matrix: **old L5 ≡ new L4 (all 4 essential rungs pass) + requires_deps
OIDC test PASSED (skip-count 0)**. Under this, lasuite-docs (m2r L4) / lasuite-meet (m2r L4) /
lasuite-drive (m2p2 L4) all MATCH. (Note: this validates — but corrects the basis of — the Builder's
first-sweep "lasuite-docs/meet matched baseline"; they are L4+OIDC, not numeric L5.) This is a
matrix-staleness correction, NOT an rcust regression; no VETO.

**Still OPEN for the M2 verdict (my side):** (a) per-recipe levels reconciled vs the CORRECTED
baseline for all 21; (b) bluesky-pds is L0 on BOTH old & new main (upstream image
`Cannot find module index.js`) — restructure-neutral but also cannot match its L4-equiv baseline on
ANY current run → needs a DECISIONS/DEFERRED note as non-rcust upstream breakage, not a silent
mismatch; (c) the 2 drone-path !testme runs (immich#2/plausible#3); (d) zero-leak teardown sweep;
(e) my own independent re-check of ≥5 recipes' logs + ALL mismatches before any M2 PASS.

---

## M2 — merged-main real-CI regression sweep: **PASS** @2026-06-11T01:15Z

Cold-verified the M2 claim (STATUS gate "M2 CLAIMED ~01:30Z") from my own clone + direct on cc-ci,
re-running/re-parsing rather than trusting Builder logs. Every M2.0–M2.4 item holds.

**M2.2 canaries — cold RE-RAN myself** from a fresh `origin/main` checkout (/root/adv-be2026a @
origin/main): `cc-ci-run -m pytest tests/regression/ -m canary -v` → **7/7 passed (301s)**, incl.
`bad-false-green` (the false-green detector) + all four RED canaries (bad-install/upgrade/backup/
restore) caught at their designed tier. The level system is NOT inflating. (log /root/adv-canary.log)

**M2.3 per-recipe — all 21 reconciled (cold jq on each run dir):**
- 13 clean: cryptpad/custom-html/ghost/hedgedoc/keycloak/matrix-synapse/n8n/uptime-kuma = L4;
  mailu/custom-html-tiny = L2 (backup_restore N/A); mumble = L4 (deploy-count=1) — all == baseline,
  clean_teardown=true.
- 2 designed-bad canaries genuinely exercised: bkp-bad rungs backup_restore=**fail** (backup=fail);
  rst-bad backup_restore=**fail** (backup=pass→restore=fail). The L1 cap is upgrade-N/A ladder
  semantics; the designed failure is recorded in the rung (verified — NOT a coincidental
  level-match).
- immich/mattermost-lts/plausible: **L4 @ exact baseline refs** (m2b-*) — baseline REPRODUCED on the
  restructured harness (cold-verified earlier this session).
- discourse: m2p-discourse (NEW main) == ab-discourse-7ae7b0f-oldmain (OLD main) — SAME ref/PR=2,
  SAME stage, SAME upgrade-HC1 message (`eb96de94+U`), SAME L1. **old==new ⇒ rcust-neutral**; the
  L4(184)→L1 is pre-existing env drift since 06-05 (DEFERRED.md), NOT caused by the restructure.
- lasuite-docs/-meet/-drive: L4 all-rungs-pass + requires_deps OIDC test PASSED (skip-count 0)
  [lasuite-drive m2p2 also MinIO PASSED, post-both-fixes, rc=0]. Their "L5" baselines are STALE:
  the 6→4-rung ladder landed in mainline c51cd84 (PR#6), which `git merge-base --is-ancestor
  c51cd84 01e6d49^` confirms PREDATES the rcust merge; level.py untouched by the merge, derive_rungs
  byte-identical old↔new. **rcust-innocent; integration coverage preserved** (OIDC tests execute &
  pass). Accepted equivalence old L5 ≡ new L4-all-pass + OIDC-pass.
- bluesky-pds: EXCLUDED — `Cannot find module /app/index.js` crash-loop on BOTH old & new main at
  every ref → upstream image breakage, rcust-neutral. DEFERRED.md note present.

**M2.3 drone→harness path:** drone builds **356 (immich) + 357 (plausible)** = `build_event=custom`
(bridge-triggered; distinct from push builds 358-361), trigger=autonomic-bot, both **success**
(verified in drone sqlite DB); run dirs 356/357 = immich L4 pr=2 / plausible L4 pr=3, customization
manifest present, clean_teardown=true.

**M2.4 customizations actually executed (cold-grep):** manifest block **21/21** logs; mumble
`ready-probe OK (tcp 3x) 127.0.0.1:64738`; ghost `ccci-overlay: provided compose.ccci.yml ...
base deploy auto-chaos` (P2a first-class path live); cryptpad `EXTRA_ENV='<hook>'`; immich
`ops.py[pre_backup,pre_restore,pre_upgrade]` + `pre-op seed` lines (migrated ctx hooks run).

**Teardown:** `docker stack ls` = infra (backups/bridge/dashboard/reports/drone/traefik) +
warm-keycloak ONLY, **zero leaked app stacks** (checked after ALL runs incl. drone-path).

**Fix-forwards (both Adversary-approved, additive):** 1357544 (lasuite-drive best-effort poll, appr
57c66ad) + be2026a/6cabbe7 (services_converged completed-one-shot, appr a531746) — merged diff ==
branch diff, all 3 be2026a conditions cleared (24a203a). Cold unit suite on post-fix main = 199
passed, lint PASS.

**VERDICT: M2 PASS.** No regression CAUSED BY the restructure: every deviation from the baseline
matrix is proven rcust-neutral by same-ref old-vs-new A/B (discourse, bluesky) or is a pre-rcust
stale-schema artifact with coverage preserved (3 lasuite), all documented in DEFERRED.md — not a
silent mismatch. The false-green detector is green on my own cold canary run. No findings filed,
no VETO.

**M1 PASS (01f9f70) + M2 PASS (this entry) both stand** → the phase DoD handshake is satisfied; the
Builder may write `## DONE` to STATUS-rcust.md. (M1's unit+lint acceptance still holds on post-fix
main: 199 passed / lint PASS, the fix-forwards being additive + separately approved.)
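The baseline reconciliation applied in M2.3 (old L5 ≡ new L4 with all four essential rungs passing and the requires_deps OIDC test PASSED at skip-count 0) can be written down as a small rule. This is a sketch of the acceptance rule as stated in this ledger only; `RunSummary`, its field names, and `matches_baseline` are hypothetical and do not describe `level.py`, `compute_level`, or the run-dir schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSummary:
    """Hypothetical digest of one run dir (field names are assumptions)."""
    level: int                  # level reported on the current 4-rung ladder
    essential_rungs_pass: bool  # all 4 essential rungs passed
    oidc_passed: bool           # requires_deps OIDC test PASSED
    oidc_skipped: int           # skip-count for the requires_deps tests


def matches_baseline(baseline_level: int, run: RunSummary) -> bool:
    """Apply the accepted equivalence for stale L5 baselines."""
    if baseline_level == 5:
        # L5 is unreachable on the 4-rung ladder, so accept
        # L4 + all essential rungs + OIDC executed and passed.
        return (
            run.level == 4
            and run.essential_rungs_pass
            and run.oidc_passed
            and run.oidc_skipped == 0
        )
    # Every other baseline must match numerically.
    return run.level == baseline_level
```

Under this rule an L4 run whose OIDC test was skipped would not match an L5 baseline, which is why the skip-count is checked separately from the level.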