Compare commits
22 Commits
fix/conver
...
main
| SHA1 |
|---|
| ce50f641cc |
| ae10b553b0 |
| e005897cb9 |
| 8978fa6ae3 |
| 4f3a74759d |
| 1bcb2ed8fe |
| 3245150982 |
| f7b9b6f167 |
| d7f85c3f28 |
| 89dec5188f |
| 24a203a098 |
| f359069d40 |
| a13a83a775 |
| 4428e76f48 |
| b4505acbbd |
| 9715ab5c50 |
| 914c1663b5 |
| 6cabbe73b7 |
| a531746e53 |
| 49d796d9ac |
| 73421dabb4 |
| 77a9415b37 |
78
BACKLOG-shot.md
Normal file
@ -0,0 +1,78 @@
# BACKLOG-shot.md — phase `shot` (recipe screenshot audit & repair)

SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-shot-screenshots.md. Gates: M1 (audit+diagnosis), M2 (all OK / agreed N/A).

## Build backlog

### P1 — Audit matrix (status: complete, all 19 PNGs visually inspected 2026-06-11)

Enrolled set (19) = `tests/<r>/recipe_meta.py` minus fixtures (`_generic`, `regression`, `concurrency`,
`custom-html-bkp-bad`, `custom-html-rst-bad`). Evidence: `/var/lib/cc-ci-runs/<run>/` on cc-ci;
PNGs pulled to /tmp/shot-audit/ on the builder host and each one Read (visually).

| recipe | latest run w/ artifacts | screenshot field | PNG bytes | visual content (I looked) | class |
|---|---|---|---|---|---|
| bluesky-pds | ab-bluesky-pds-oldmain | null | — | no PNG; install=fail level=0 (upstream image breakage, rcust DEFERRED) → capture correctly skipped (`if deploy_ok`) | N-A-candidate (blocked upstream) |
| cryptpad | m2r-cryptpad | screenshot.png | 4802 | solid light-grey frame, nothing else | BLANK |
| custom-html | m2r-custom-html | screenshot.png | 35707 | "Welcome to nginx!" default page | OK? (diagnose: is this the recipe's true fresh-install content?) |
| custom-html-tiny | m2r-custom-html-tiny | screenshot.png | 12950 | seeded CI content ("cc-ci custom-html-tiny … DG5") | OK |
| discourse | m2p-discourse | screenshot.png | 66121 | real forum UI, welcome topic, Sign Up/Log In | OK |
| ghost | m2r-ghost | screenshot.png | 444183 | real blog landing ("Thoughts, stories and ideas") | OK |
| hedgedoc | m2r-hedgedoc | screenshot.png | 131967 | real landing (logo, Sign In, feature intro) | OK |
| immich | 356 | screenshot.png | 4801 | pure white frame | BLANK |
| keycloak | m2r-keycloak | screenshot.png | 8764 | spinner + "Loading the Administration Console" | LOADING |
| lasuite-docs | m2r-lasuite-docs | screenshot.png | 6022 | lone spinner on white | LOADING |
| lasuite-drive | m2p2-lasuite-drive | screenshot.png | 5895 | lone spinner on white | LOADING |
| lasuite-meet | m2r-lasuite-meet | screenshot.png | 4801 | pure white frame | BLANK |
| mailu | m2r-mailu | screenshot.png | 33800 | real sign-in page (empty fields) | OK |
| matrix-synapse | m2r-matrix-synapse | screenshot.png | 33296 | "It works! Synapse is running" landing | OK |
| mattermost-lts | m2b-mattermost-lts | screenshot.png | 242139 | brand splash/loading screen (logo on blue), NOT the login form | LOADING (borderline — brand-recognizable but a loading state) |
| mumble | m2r-mumble | screenshot.png | 7913 | spinner on grey — a web page IS served on the domain | LOADING (diagnose what serves it; N/A may NOT be justified) |
| n8n | m2r-n8n | screenshot.png | 4801 | off-white blank frame. Flaky: run 197 (30256 B) shows the real "Set up owner account" form (empty fields, credential-free) | BLANK (flaky) |
| plausible | 357 | null | — | no PNG on ANY run (122→357) | NULL |
| uptime-kuma | m2r-uptime-kuma | screenshot.png | 30858 | real "Create your admin account" setup form (empty fields) | OK |

PNG-size note: 4801/4802 B at 1280×800 is a byte-stable blank-frame fingerprint (3 different apps, same size).
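A repeated size like this can be surfaced mechanically by grouping the audited PNG sizes and flagging any size that several recipes share. The sketch below is illustrative only: `find_size_fingerprints`, `min_recipes` and `tolerance` are hypothetical names that do not exist in the harness, and the sample sizes are copied from the matrix above.

```python
from collections import defaultdict


def find_size_fingerprints(sizes, min_recipes=3, tolerance=1):
    """Group recipes whose PNG sizes sit within `tolerance` bytes of a group's smallest size.

    Returns {smallest_size_in_group: sorted recipe names} for every group
    shared by at least `min_recipes` recipes.
    """
    groups = defaultdict(list)
    anchor = None
    for recipe, size in sorted(sizes.items(), key=lambda kv: kv[1]):
        # Start a new group once the size drifts past the tolerance window.
        if anchor is None or size - anchor > tolerance:
            anchor = size
        groups[anchor].append(recipe)
    return {s: sorted(r) for s, r in groups.items() if len(r) >= min_recipes}


# Sizes taken from the audit matrix (subset).
audit = {
    "cryptpad": 4802, "immich": 4801, "lasuite-meet": 4801, "n8n": 4801,
    "keycloak": 8764, "lasuite-docs": 6022, "lasuite-drive": 5895,
}
fingerprints = find_size_fingerprints(audit)
```

A shared size is only a hint that frames are identical; the visual read in the matrix remains the ground truth.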

### P2 — Root-cause diagnoses

- [x] **NULL — plausible** (evidence: Drone build 357 ci-step log, t=73s):
  `screenshot: capture failed (non-fatal, verdict unaffected): page.goto(https://plau-b51425.ci.commoninternet.net/) never returned a status in (200, 301, 302, 303, 401, 403) after 15 attempts (45s); last status=500`.
  Plausible's `/` 500s **by design** under `DISABLE_AUTH=true` (auth_controller; documented in
  `tests/plausible/functional/test_health_check.py` docstring and recipe_meta — that's why HEALTH_PATH
  is `/api/health`). Default landing-page capture can NEVER succeed → needs a per-recipe SCREENSHOT
  hook to a path that actually renders (probe live: e.g. /login or /sites).
- [x] **NULL — bluesky-pds**: install fails (level=0) before the app is up → `if deploy_ok:` gate in
  runner/run_recipe_ci.py:1024 correctly skips capture. Not a screenshot defect; upstream image
  breakage already filed in machine-docs/DEFERRED.md (rcust). → documented N/A while upstream is broken.
- [x] **BLANK class — immich, lasuite-meet, n8n(flaky), cryptpad**: SPA paint race. capture() navigates
  with `wait_until="domcontentloaded"` (runner/harness/screenshot.py:91) and screenshots immediately;
  SPA shell HTML has loaded but JS hasn't painted → solid 4801-2 B frame. n8n flakiness = same race,
  sometimes JS wins (run 197 captured the real form).
- [x] **LOADING class — keycloak, lasuite-docs, lasuite-drive, mumble, mattermost-lts(borderline)**:
  same race, caught mid-paint (spinner/splash rendered, app JS still loading/connecting).
- [x] **mumble** web stack identified: recipe deploys a `web` service (mumble-web client) on the domain —
  spinner is its connecting state; landing renders a connect dialog once JS settles. NOT an N/A.
- [x] **custom-html** nginx-welcome question: the recipe's fresh install genuinely serves the nginx
  default page at `/` (no content seeded for this recipe's install; only custom-html-tiny seeds via
  install_steps.sh). Screenshot is an honest representative view of a fresh install. → OK as-is.
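The plausible diagnosis implies a probe for a path that does return an accepted status. A minimal sketch of such a probe follows, assuming the accepted-status tuple quoted in the log line and a 3 s spacing (15 attempts in roughly 45 s). `first_rendering_path` and the injected `get_status` callable are hypothetical, and the candidate paths `/login` and `/sites` are the unverified guesses from the diagnosis, not confirmed routes.

```python
import time

# Statuses quoted in the build-357 log line.
ACCEPTED = (200, 301, 302, 303, 401, 403)


def first_rendering_path(get_status, paths, attempts=15, delay_s=3.0, sleep=time.sleep):
    """Return the first path whose status lands in ACCEPTED, else None.

    `get_status(path) -> int` is injected so the loop can be exercised
    without a live deploy; `sleep` is injectable for the same reason.
    """
    for path in paths:
        for attempt in range(attempts):
            if get_status(path) in ACCEPTED:
                return path
            if attempt < attempts - 1:
                sleep(delay_s)  # no sleep after the final attempt
    return None
```

An accepted status only shows the path responds; whether it renders something worth capturing still needs a visual check.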

### P3 — Fixes

- [ ] Harness default improvement (fixes BLANK+LOADING classes): after domcontentloaded nav, bounded
  network-idle/paint wait + blank-frame detect (tiny PNG → one retry with stronger wait), all within
  NAV_DEADLINE_S=45 / step worst-case ≤ ~60s. Unit tests in tests/unit/test_screenshot.py.
- [ ] plausible SCREENSHOT hook (tests/plausible/recipe_meta.py) to a rendering, credential-free path.
- [ ] Re-audit mattermost-lts / mumble / keycloak / lasuite-* after harness fix; per-recipe hooks only
  where the default still can't work.
- [ ] bluesky-pds: document N/A in matrix (Adversary agreement at M1/M2).
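The harness default improvement in the first item could take roughly the following shape. This is a sketch under assumptions, not the contents of runner/harness/screenshot.py: `capture_settled`, `_bounded_idle`, `BLANK_MAX_BYTES` and the timeout values are hypothetical. The `page` argument is any object exposing the Playwright sync `Page` methods used here (`goto`, `wait_for_load_state`, `wait_for_timeout`, `screenshot`).

```python
BLANK_MAX_BYTES = 6_000  # assumption: a "tiny PNG" threshold, not a harness constant


def _bounded_idle(page, timeout_ms):
    """Wait for network idle, but never let a timeout escape."""
    try:
        page.wait_for_load_state("networkidle", timeout=timeout_ms)
    except Exception:
        pass  # an app that never goes idle should still get a screenshot


def capture_settled(page, url, idle_ms=12_000, retry_settle_ms=8_000):
    """Navigate, wait (bounded) for the page to settle, then screenshot.

    If the first PNG is small enough to look blank, settle once more and
    retry exactly once. Returns the PNG bytes of the last attempt.
    """
    page.goto(url, wait_until="domcontentloaded")
    _bounded_idle(page, idle_ms)
    png = page.screenshot()
    if len(png) <= BLANK_MAX_BYTES:
        page.wait_for_timeout(retry_settle_ms)
        _bounded_idle(page, idle_ms)
        png = page.screenshot()
    return png
```

With these placeholder values the waits after navigation add up to at most 32 s, which leaves headroom under the ~60 s step budget. Because the function only relies on four methods, a fake page can drive both the retry and no-retry branches in a unit test.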

### P4 — Proof runs

- [ ] Fresh real-CI run per fixed recipe (immich, lasuite-meet, n8n, cryptpad, keycloak, lasuite-docs,
  lasuite-drive, mumble, mattermost-lts, plausible), ≥2 via drone `!testme`; visual check each PNG;
  card + dashboard render. Healthy class: cite existing artifact + visual check (done in P1).

## Adversary findings

(Adversary-owned section.)
107
JOURNAL-rcust.md
@ -198,3 +198,110 @@ main serial re-run, AND old main @ old default head. The earlier "deploy timed o
concurrent image pulls" guess in STATUS was wrong (the 600s timeout was the SYMPTOM; the ~2min
A/B failure exposed the crash-loop). Upstream re-published the pinned tag with a different image
layout — no harness can deploy it. Filed in STATUS as restructure-neutral with grep-able evidence.

## 2026-06-11 lasuite-drive root cause #2 — completed one-shot poisons convergence (caught live)

Watching the m2p proof run instead of just waiting paid off: the fix-forward's best-effort line
printed (so #1 is fixed), but the install assert then sat in pytest for 25+ minutes. Live state:
app serving 200, every service 1/1 EXCEPT minio-createbuckets 0/1 with its task **Complete 28
minutes ago**. services_converged demands cur==want for every service; a completed
restart_policy-none one-shot never returns to 1/1, so the bounded converge poll (DEPLOY_TIMEOUT
1800s for this recipe) was always going to burn to the deadline and fail install.

Why nobody ever saw this before P2b: the old setup_custom_tests.sh ran AFTER the install asserts
(post-deploy hook path), so converge never observed desired=1 on the one-shot, and the upgrade
tier's chaos redeploy reapplied the compose spec (replicas: 0) before its own converge checks.
P2b folded the trigger into ops.py pre_install — which the orchestrator runs BEFORE the generic
install assert. Also explains m2rr's odd "install fail but upgrade/backup/restore/custom all pass"
shape exactly (redeploy resets the spec).

Fix options weighed: (a) hook scales the one-shot back to 0 after the poll — rejected: on the
timeout path the task is typically still Preparing (image pull) and scale-to-0 CANCELS it, so the
observed "bucket lands just after the window" runs would become custom-tier RED, i.e. strictly
worse than baseline; (b) move the trigger to a post-assert hook point — no such hook exists in the
new convention and inventing one mid-M2 is scope creep; (c) teach services_converged that a
replica deficit consisting entirely of Complete tasks IS converged — chosen: semantically correct
(the one-shot did its job), restores baseline behavior for any triggered one-shot, and the
converge window doubles as the late-landing grace. Disclosed delta: a genuinely FAILING one-shot
now reds at install (converge timeout) instead of at the custom bucket test — both red, no false
green. Guard: Failed/mixed/spinning-up/no-tasks-yet still block (unit-pinned, 7 cases).

Branch fix/converged-oneshot @ be2026a, proposal in ADVERSARY-INBOX, awaiting approval per the M2
fix-forward protocol. Unit suite 199 passed + lint PASS from the cc-ci working-tree rsync.
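The rule chosen in option (c) can be written as a small pure function. The sketch below is not the code in runner/harness/lifecycle.py: `is_converged` and `all_converged` are hypothetical names, and the inputs (replica counts plus a list of task-state words such as `Complete` or `Failed`) are an assumed shape for what the harness reads from Docker.

```python
def is_converged(cur, want, task_states):
    """Decide whether one service counts as converged.

    cur / want:   running and desired replica counts.
    task_states:  state words of the service's tasks, e.g. ["Complete"].
    """
    if cur == want:
        return True  # plain N/N and 0/0 are unchanged
    if not task_states:
        return False  # no tasks yet: still spinning up
    # A replica deficit is acceptable only when every task finished cleanly;
    # any Failed, Preparing or Running task keeps the service unconverged.
    return all(state == "Complete" for state in task_states)


def all_converged(services):
    """services: iterable of (cur, want, task_states) tuples."""
    return all(is_converged(cur, want, states) for cur, want, states in services)
```

Requiring every task to be `Complete` is what keeps a mixed history (one completed task, one failed) from passing.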

## 2026-06-11 ~01:00Z — merge landed, queue shortened

be2026a approved (REVIEW a531746, cold-verified independently) and merged as 6cabbe7; drone build
350 green on the push head 914c166. Merged diff verified == branch diff (empty git diff be2026a..
main for the two files). Post-fix proof m2p2-lasuite-drive queued from a FRESH clone
/root/m2-postfix @6cabbe7 rather than git-updating /root/m2-sweep, because the serial queue's
discourse runs exec from m2-sweep and swapping code under an active/imminent run is how you get
unexplainable results. The discourse A/B therefore runs at 5c0676b (pre-converge-fix) — irrelevant
to discourse (no one-shots), and the Adversary's approval explicitly noted that.

Shortened the doomed m2p run: the generic install assert had already burned its 1800s converge
deadline and failed; the overlay install test then started an IDENTICAL second 1800s burn (same
assert_serving). SIGINT'd the overlay pytest child only — KeyboardInterrupt surfaced at
generic.py:97, the exact diagnosed converge-poll line (a nice live confirmation), and the
orchestrator advanced to the upgrade tier on its normal path. Teardown semantics untouched.
Disclosed in STATUS so the log's KeyboardInterrupt is pre-explained.

Drone API note for future me: no token on disk; fastest read-only check is docker cp the drone
sqlite out and query builds (documented in STATUS). The Gitea statuses API returned empty for
these shas (drone evidently doesn't post commit statuses here).

## 2026-06-11 ~00:55Z — discourse A/B closed (harness-neutral), mechanism still unattributed

m2p-discourse (new main, PR=2, @7ae7b0f) and ab-discourse-7ae7b0f-oldmain (old main, PR=2, same
ref) failed the upgrade IDENTICALLY: HC1, chaos-version=eb96de94+U, all other tiers pass, L2.
Same invocation as baseline 184 which was L4 five days ago. So: deterministic, harness-neutral,
and something outside both harnesses drifted since 06-05. Eliminated: branch-tip existence (7ae7b0f
still tips upgrade-0.8.0+3.5.0 + pr/2), upstream tag set (0.7.0+3.3.1 still latest), abra pin
(flake.lock untouched by the restructure). Not eliminated: abra-internal interaction with repo/app
state (the chaos stamp lands on the prev-base TAG commit despite the tree being at the PR head —
my best guess remains something in how abra resolves the version/commit for the chaos label when
COMPOSE_FILE includes the overlay and the project normalizes invalid, but m2r at 7d53d4ec stamping
correctly with the same dangling depends_on kills the simple version of that theory). The
`service "sidekiq" depends on...` line appears in passing AND failing upgrades, position-identical,
so it discriminates nothing. M2-wise the question is settled — the restructure is exonerated by
byte-identical old==new failure; chasing abra's stamp resolution further is post-phase work, filed
as a DEFERRED note rather than burning more M2 wall-clock on a non-rcust mechanism.

m2p2-lasuite-drive (the binding post-fix proof) auto-started at 00:48:58Z from /root/m2-postfix
@6cabbe7. Watching for: no 1800s converge burn after the one-shot completes, then L5.

## 2026-06-11 ~01:10Z — m2p2 green; "L5" turned out to be a moved goalpost (mainline, not ours)

m2p2-lasuite-drive: rc=0, 3m19s, all stages pass, OIDC + MinIO custom tests green, and the
fix-forward pair demonstrably exercised (one-shot overshot 90s again → best-effort line → late
Complete → converge fix admitted it). But results.json said level=4 where the binding condition
said L5 — heart-stopper until the git archaeology: run 189's level-5 + "L6 recipe-local N/A" cap
didn't match ANY derive_rungs I could find in either world, because the 6-rung ladder was removed
on MAIN by 46e2cdb+c51cd84 (PR #6) on 06-09, between the baseline runs and the merge — by the
mirror/report phase, not rcust. The merge didn't touch level.py (checked 01e6d49^1..01e6d49), and
run 204 on 06-09 (hours pre-deploy of the refactor) still shows 6 rungs — clean timeline. So the
baseline matrix's "L5" rows need a schema-equivalence reading, declared in STATUS BEFORE the claim
rather than negotiated after the Adversary trips on it. Lesson re-learned: a baseline matrix
should pin the SCHEMA VERSION of its evidence, not just the level number.
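One way to act on that lesson is to store the ladder schema next to each level and refuse to compare levels across schemas. The sketch below is hypothetical throughout: `LevelRecord`, `levels_match` and the schema labels `"6-rung"` / `"4-rung"` are illustrative and not part of results.json or level.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelRecord:
    recipe: str
    run: str
    level: int
    schema: str  # hypothetical label for the ladder in force, e.g. "6-rung"


def levels_match(baseline, observed):
    """Compare levels only when both records use the same ladder schema."""
    if baseline.schema != observed.schema:
        raise ValueError(
            f"{baseline.recipe}: baseline schema {baseline.schema!r} differs from "
            f"observed schema {observed.schema!r}; reconcile before comparing levels"
        )
    return baseline.level == observed.level
```

Raising instead of returning `False` forces an explicit equivalence decision rather than letting a schema change look like a regression.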

## 2026-06-11 ~01:30Z — M2 claim assembled

Drone-path runs landed green (356 immich#2 L4, 357 plausible#3 L4, both with embedded
customization manifests + clean flags, triggered by real !testme comments). Zero-leak verified
after everything. Plausible's missing screenshot.png checked against its other runs — it never
produces one (no screenshot surface), so not a capture regression. Claimed M2 with the full
21-recipe reconciliation table against the corrected baseline; the three lasuite rows ride the
Adversary-accepted L5≡L4+OIDC equivalence, bluesky-pds is the one justified exclusion, discourse
is reconciled as env-drift with byte-identical old==new evidence. Nothing else unblocked in this
phase while the verdict is out — holding per §7 case 2.

## 2026-06-11 ~01:20Z — M2 PASS → ## DONE

Adversary cold-verified the whole claim independently (re-ran the canaries themselves, jq'd all 21
run dirs, re-checked the drone DB and the zero-leak state) and passed M2 with no findings and no
VETO. M1 + M2 both stand; ## DONE written. Phase summary: 6 plan phases landed on one branch,
merged after M1; the real-CI sweep then caught exactly TWO genuine regressions (both in the same
lasuite-drive P2b hook port: raise-on-timeout, and one-shot-vs-converge ordering), both root-caused
live, fixed forward under approval, and proven end-to-end — plus it surfaced two pre-existing
environment drifts (discourse upgrade-HC1, bluesky-pds upstream image) that the A/B discipline
kept from being misattributed to the restructure. The sweep-as-safety-net worked as designed.
40
JOURNAL-shot.md
Normal file
@ -0,0 +1,40 @@
# JOURNAL-shot.md — Builder journal, phase `shot`

## 2026-06-11 ~01:17–01:35Z — phase open, P1+P2 in one sweep

Read the phase plan + plan.md §6.1/§7/§9. Enumerated enrolled recipes (19). Pulled per-recipe
latest-run data off cc-ci (`results.json` screenshot field + PNG size for all ~190 run dirs),
scp'd 18 PNGs to /tmp/shot-audit/ and Read every one of them.

Findings vs the orchestrator pre-audit: all four 4801-2B suspects are indeed blank frames
(immich pure white, lasuite-meet white, n8n off-white, cryptpad grey). keycloak 8.7KB is a
"Loading the Administration Console" spinner — NOT a sparse login page as §2 guessed.
lasuite-docs/drive ~5.9KB are lone spinners. Two surprises: (1) mattermost-lts 242KB, classed
healthy by size, is actually the brand splash/loading screen, not the login form — size
heuristics lie in both directions; (2) mumble serves a real web page (mumble-web client per
compose.mumbleweb.yml, deployed since Phase 2 for HTTP health) showing its connecting spinner —
so mumble is fixable, not an N/A.

plausible root cause: traced via Drone sqlite (no python3 on host; ran alpine+sqlite3 against
the drone data volume). Build 357 log t=73s: capture failed, last status=500 after 45s. Cross-ref
tests/plausible/functional/test_health_check.py: `/` 500s via auth_controller under
DISABLE_AUTH=true — permanent, not an init race. So the default landing capture can never work;
plausible needs a SCREENSHOT hook to a path that renders (will probe /login, /sites on a live
deploy during P3).

bluesky-pds: null because install fails at level 0 (upstream image breakage, already in
DEFERRED.md from rcust) — capture gated on deploy_ok, correctly skipped. N/A while upstream broken.

custom-html nginx-welcome: verified no install-time seeding exists for this recipe (custom-html-tiny
has install_steps.sh; custom-html only seeds in pre_backup/pre_upgrade ops, after capture). The
nginx default page IS the honest fresh-install view. Leaving OK; flagged in matrix for Adversary.

Adversary opened REVIEW-shot.md with its own cold pre-audit (4f3a747) before my first push —
good: my visual reads agree with theirs on every overlapping row.

Design thinking for P3 (next iteration): default-path improvement = after goto(domcontentloaded),
try a bounded `wait_for_load_state("networkidle")` (~10-15s cap) and/or wait for a non-trivial
painted body, then screenshot; then a blank-detect (PNG < ~6KB or near-uniform) → one retry with
a longer settle. Keep total ≤ ~60s worst case, all inside the existing capture() try/except so R7
(cosmetics never block) is preserved. Unit tests: blank-detector pure function + retry logic with
a fake page. Per-recipe hooks only for plausible (500 root) + whatever the re-audit still shows.
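The blank-detector pure function mentioned above might look like the sketch below. It is an assumption-laden illustration: `is_blank_frame`, `max_bytes` and `uniform_ratio` are hypothetical names and values, and decoding the PNG into pixel values is left to the caller so the function stays dependency-free.

```python
from collections import Counter


def is_blank_frame(png_size, pixels=None, max_bytes=6_000, uniform_ratio=0.995):
    """Heuristic blank-frame check: tiny PNG, or one colour dominating the frame.

    png_size: size of the encoded PNG in bytes.
    pixels:   optional iterable of decoded pixel values (any hashable,
              e.g. (r, g, b) tuples).
    """
    if png_size < max_bytes:
        return True
    if pixels is None:
        return False
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        return True  # nothing decoded: treat as blank
    # Near-uniform: the most common colour covers almost the whole frame.
    return counts.most_common(1)[0][1] / total >= uniform_ratio
```

Neither signal is reliable alone: the 6022-byte lasuite-docs spinner sits just above a 6 KB cut-off, and the 242 KB mattermost-lts splash is a loading state that no size threshold would catch.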
|
||||
159
REVIEW-rcust.md
@ -380,3 +380,162 @@ need is m2p-discourse (PR=2, new main) vs ab-discourse-7ae7b0f-oldmain (PR=2, ol
184 (PR=2, old main, L4). I will cold-verify those three when they land; my L4→L1 concern is on
hold pending the PR=2 result, not yet a confirmed regression. Live lasu-f68b63 stack = active
lasuite-drive proof run (expected, not a leak).

### M2 fix-forward APPROVE: be2026a (services_converged completed-one-shot rule) @2026-06-11T00:31Z

Builder proposed a 2nd lasuite-drive P2b fix on branch `fix/converged-oneshot @ be2026a` and asked
approval before merging to main (M2 "trivial fix-forward w/ Adversary approval" path). Cold-verified
independently (fresh clone of be2026a at /root/adv-be2026a on cc-ci, NOT the Builder's working tree):

- **Diff** (`git diff origin/main..be2026a runner/harness/lifecycle.py`, read myself): in
  `services_converged`, a `cur != want` deficit now passes ONLY if `docker service ps <svc>` shows
  ALL task states == `Complete`. Conservative: any Running/Preparing/Pending (spinning up) or
  Failed/Rejected (broken) in the deficit still returns False; no-tasks-yet still False; plain N/N
  and 0/0 unchanged. Targeted addition, not a rewrite.
- **False-green analysis (my own):** only `restart_policy:none` one-shots ever show `Complete`; a
  normal crashed service shows Failed/Running(restarting), never Complete. Even if converge passed
  on a completed-but-ineffective one-shot, two INDEPENDENT gates still catch it — the generic
  `test_serving` HTTP floor and the custom-tier functional test (lasuite-drive
  `test_minio_storage.py` upload→list→download is the real bucket gate). Defense-in-depth holds; I
  could not construct a false-green path.
- **Tests** `tests/unit/test_converged_oneshot.py` (read + cold-ran): 7 cases pin exactly the
  non-vacuity criteria — completed→converged, Failed→NOT, mixed Complete+Failed→NOT (covers the
  `docker service ps` history concern), Preparing→NOT, no-tasks→NOT, N/N→converged, 0/0→converged.
- **Cold suite+lint from fresh be2026a checkout:** `cc-ci-run -m pytest tests/unit -q` → **199
  passed**; the 7 new tests pass alone; `nix develop .#lint --command scripts/lint.sh` → **lint:
  PASS**. Matches Builder's claim.
- **Root cause judged genuine P2b regression** (hook moved into ops.py pre_install runs BEFORE the
  install assert; the completed one-shot's 0/1 then burns DEPLOY_TIMEOUT in the converge poll). The
  fix accepts a genuinely-healthy deploy (HTTP 200, all other services 1/1) the old `cur!=want`
  wrongly rejected — correction, not masking.
- **Not on main** — confirmed `all(s == "Complete")` absent from origin/main; Builder held the gate.
- **Disclosed semantic delta** (a failing one-shot now blocks install convergence earlier vs later
  at custom-tier): ACCEPTED — both paths RED, no false-green, no enrolled recipe has a
  baseline-failing one-shot.

**VERDICT: fix-forward be2026a APPROVED, conditional on:**
1. Post-merge lasuite-drive proof re-run @ffa7d585afa2 PR=1 lands **L5** (binding end-to-end proof
   the fix resolves the converge hang — if it doesn't, the diagnosis was wrong and approval voids).
2. I re-verify the MERGED diff == be2026a diff (no extra change sneaks in at merge).
3. discourse PR=2 A/B pair (m2p-discourse / ab-discourse-7ae7b0f-oldmain — no one-shots, unaffected
   by this fix) completes and I cold-verify those levels too.
This APPROVE does NOT clear M2; M2 still needs all per-recipe levels reconciled + my independent
sample re-check + zero-leak teardown.

### be2026a merge cold-verify — condition #2 SATISFIED @2026-06-11T00:42Z

Builder merged be2026a as 6cabbe7 (build 350 green, origin/main now b4505ac). Independently checked:
`diff origin/main:runner/harness/lifecycle.py be2026a:...` → **IDENTICAL**; the merged
`tests/unit/test_converged_oneshot.py` → **IDENTICAL** to be2026a. Clean merge, no extra change
slipped in — approval condition #2 met. m2p-lasuite-drive (pre-fix) landed L0 (install/converge
timeout) = the diagnosed symptom (Builder disclosed in b4505ac that it SIGINT-shortcut the doomed burn;
binding proof is the post-fix m2p2 re-run). REMAINING be2026a conditions: #1 post-fix lasuite-drive
L5, #3 discourse PR=2 A/B cold-check — both pending (m2p-discourse running, then ab-oldmain, then
m2p2-lasuite-drive).

### be2026a conditions CLEARED + SSO-baseline staleness finding (independent) @2026-06-11T01:12Z

Reached the conclusions below COLD (own git archaeology + run-dir jq) BEFORE reading the Builder's
01:10Z inbox — which then concurred. Anti-anchoring preserved (no JOURNAL read; inbox read after my
own derivation).

**be2026a fix-forward — ALL 3 CONDITIONS SATISFIED → fix-forward FULLY CLEARED:**
1. **Post-fix lasuite-drive (m2p2, merged main 6cabbe7, ffa7d585afa2, PR=1): L4, rc=0, 3m19s.**
   Independently verified: flags clean_teardown=true + no_secret_leak=true; all 4 essential rungs
   pass; `test_minio_storage::...object_roundtrip` PASSED; `test_oidc_..._keycloak` PASSED. The
   install converge no longer hangs — both fix-forwards (1357544 best-effort poll + 6cabbe7
   completed-one-shot converge) exercised in one run. The literal "L5" in my condition is
   **unmeetable on current code and NOT an rcust effect** — see staleness finding below; I accept
   the L4-equivalence. Fix works end-to-end.
2. **Merged diff == branch diff** — verified earlier (4428e76): lifecycle.py + test file
   byte-identical to be2026a.
3. **discourse A/B — restructure-NEUTRAL.** m2p-discourse (NEW main, 7ae7b0f, PR=2) = L1 and
   ab-discourse-7ae7b0f-oldmain (OLD main, SAME ref, SAME PR=2) = L1, SAME stage (upgrade), SAME
   message (`eb96de94+U` HC1 re-checkout). old==new byte-identical → rcust did NOT regress discourse.
   The L4(184)→L1 vs baseline is pre-existing env drift since 06-05 (filed below), not rcust.

**FINDING [adversary] — M2 baseline matrix has 3 STALE L5 entries (lasuite-docs/drive/meet).**
Independently established: the level ladder dropped 6-rung(L5)→4-rung(max L4, integration &
recipe-local now OPTIONAL/non-laddered) in mainline PR#6 (c51cd84 "4-rung ladder", + 46e2cdb),
which `git merge-base --is-ancestor c51cd84 01e6d49^` confirms is an ANCESTOR OF PRE-RCUST MAIN.
The rcust merge touches level.py NOT AT ALL and results.py by +4 cosmetic P5 lines; compute_level
+ derive_rungs are byte-identical old-main↔merged-main. So NO current-code run (rcust or pre-rcust)
can produce L5; baselines 188/189/204 (L5, integration:pass) were recorded under the OLD schema
(run 204 ran 06-09 hours before the refactor deployed). **rcust is INNOCENT of L4≠L5.** Integration
coverage is NOT lost: the requires_deps OIDC tests EXECUTE and PASS (skip-count 0) on current code —
verified in m2p2 AND the sweep's m2r-lasuite-docs (`test_oidc_login_via_keycloak` +
`test_oidc_password_grant_...` PASSED) and m2r-lasuite-meet (`...password_grant...` PASSED).
ACCEPTED equivalence for the M2 matrix: **old L5 ≡ new L4 (all 4 essential rungs pass) + requires_deps
OIDC test PASSED (skip-count 0)**. Under this, lasuite-docs (m2r L4) / lasuite-meet (m2r L4) /
lasuite-drive (m2p2 L4) all MATCH. (Note: this validates — but corrects the basis of — the Builder's
first-sweep "lasuite-docs/meet matched baseline"; they are L4+OIDC, not numeric L5.) This is a
matrix-staleness correction, NOT a rcust regression; no VETO.
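The accepted equivalence can be stated as a small predicate, shown here as a sketch. `matches_old_l5` and its parameters are hypothetical, and how the OIDC pass and skip counts are read out of a run directory is deliberately left out.

```python
NEW_MAX_LEVEL = 4  # top of the 4-rung ladder described in the finding


def matches_old_l5(level, oidc_passed, oidc_skipped):
    """Old L5 is treated as: new L4 plus OIDC tests that ran, passed, and were not skipped."""
    return level == NEW_MAX_LEVEL and oidc_passed > 0 and oidc_skipped == 0
```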

**Still OPEN for the M2 verdict (my side):** (a) per-recipe levels reconciled vs the CORRECTED
baseline for all 21; (b) bluesky-pds is L0 on BOTH old & new main (upstream image
`Cannot find module index.js`) — restructure-neutral but also cannot match its L4-equiv baseline on
ANY current run → needs a DECISIONS/DEFERRED note as non-rcust upstream breakage, not a silent
mismatch; (c) the 2 drone-path !testme runs (immich#2/plausible#3); (d) zero-leak teardown sweep;
(e) my own independent re-check of ≥5 recipes' logs + ALL mismatches before any M2 PASS.

---

## M2 — merged-main real-CI regression sweep: **PASS** @2026-06-11T01:15Z

Cold-verified the M2 claim (STATUS gate "M2 CLAIMED ~01:30Z") from my own clone + direct on cc-ci,
re-running/re-parsing rather than trusting Builder logs. Every M2.0–M2.4 item holds.

**M2.2 canaries — cold RE-RAN myself** from a fresh `origin/main` checkout (/root/adv-be2026a @
origin/main): `cc-ci-run -m pytest tests/regression/ -m canary -v` → **7/7 passed (301s)**, incl.
`bad-false-green` (the false-green detector) + all four RED canaries (bad-install/upgrade/backup/
restore) caught at their designed tier. The level system is NOT inflating. (log /root/adv-canary.log)

**M2.3 per-recipe — all 21 reconciled (cold jq on each run dir):**
- 13 clean: cryptpad/custom-html/ghost/hedgedoc/keycloak/matrix-synapse/n8n/uptime-kuma = L4;
  mailu/custom-html-tiny = L2 (backup_restore N/A); mumble = L4 (deploy-count=1) — all == baseline,
  clean_teardown=true.
- 2 designed-bad canaries genuinely exercised: bkp-bad rungs backup_restore=**fail** (backup=fail);
  rst-bad backup_restore=**fail** (backup=pass→restore=fail). The L1 cap is upgrade-N/A ladder
  semantics; the designed failure is recorded in the rung (verified — NOT a coincidental
  level-match).
- immich/mattermost-lts/plausible: **L4 @ exact baseline refs** (m2b-*) — baseline REPRODUCED on the
  restructured harness (cold-verified earlier this session).
- discourse: m2p-discourse (NEW main) == ab-discourse-7ae7b0f-oldmain (OLD main) — SAME ref/PR=2,
  SAME stage, SAME upgrade-HC1 message (`eb96de94+U`), SAME L1. **old==new ⇒ rcust-neutral**; the
  L4(184)→L1 is pre-existing env drift since 06-05 (DEFERRED.md), NOT caused by the restructure.
- lasuite-docs/-meet/-drive: L4 all-rungs-pass + requires_deps OIDC test PASSED (skip-count 0)
  [lasuite-drive m2p2 also MinIO PASSED, post-both-fixes, rc=0]. Their "L5" baselines are STALE:
  the 6→4-rung ladder landed in mainline c51cd84 (PR#6), which `git merge-base --is-ancestor
  c51cd84 01e6d49^` confirms PREDATES the rcust merge; level.py untouched by the merge, derive_rungs
  byte-identical old↔new. **rcust-innocent; integration coverage preserved** (OIDC tests execute &
  pass). Accepted equivalence old L5 ≡ new L4-all-pass + OIDC-pass.
- bluesky-pds: EXCLUDED — `Cannot find module /app/index.js` crash-loop on BOTH old & new main at
  every ref → upstream image breakage, rcust-neutral. DEFERRED.md note present.

**M2.3 drone→harness path:** drone builds **356 (immich) + 357 (plausible)** = `build_event=custom`
(bridge-triggered; distinct from push builds 358-361), trigger=autonomic-bot, both **success**
(verified in drone sqlite DB); run dirs 356/357 = immich L4 pr=2 / plausible L4 pr=3, customization
manifest present, clean_teardown=true.

**M2.4 customizations actually executed (cold-grep):** manifest block **21/21** logs; mumble
`ready-probe OK (tcp 3x) 127.0.0.1:64738`; ghost `ccci-overlay: provided compose.ccci.yml ...
base deploy auto-chaos` (P2a first-class path live); cryptpad `EXTRA_ENV='<hook>'`; immich
`ops.py[pre_backup,pre_restore,pre_upgrade]` + `pre-op seed` lines (migrated ctx hooks run).

**Teardown:** `docker stack ls` = infra (backups/bridge/dashboard/reports/drone/traefik) +
warm-keycloak ONLY, **zero leaked app stacks** (checked after ALL runs incl. drone-path).

**Fix-forwards (both Adversary-approved, additive):** 1357544 (lasuite-drive best-effort poll, appr
57c66ad) + be2026a/6cabbe7 (services_converged completed-one-shot, appr a531746) — merged diff ==
branch diff, all 3 be2026a conditions cleared (24a203a). Cold unit suite on post-fix main = 199
passed, lint PASS.

**VERDICT: M2 PASS.** No regression CAUSED BY the restructure: every deviation from the baseline
matrix is proven rcust-neutral by same-ref old-vs-new A/B (discourse, bluesky) or is a pre-rcust
stale-schema artifact with coverage preserved (3 lasuite), all documented in DEFERRED.md — not a
silent mismatch. The false-green detector is green on my own cold canary run. No findings filed,
no VETO.

**M1 PASS (01f9f70) + M2 PASS (this entry) both stand** → the phase DoD handshake is satisfied; the
Builder may write `## DONE` to STATUS-rcust.md. (M1's unit+lint acceptance still holds on post-fix
main: 199 passed / lint PASS, the fix-forwards being additive + separately approved.)
113
REVIEW-shot.md
Normal file
@ -0,0 +1,113 @@
|
||||
# REVIEW-shot.md — Adversary verdicts, phase `shot` (recipe screenshot audit & repair)
|
||||
|
||||
Owner: Adversary loop. Append-only verdict log. Gates: M1 (audit+diagnosis), M2 (all working).
|
||||
SSOT: `/srv/cc-ci/cc-ci-plan/plan-phase-shot-screenshots.md`.
|
||||
|
||||
No gate CLAIMED yet (phase just opened; Builder has not bootstrapped STATUS-shot.md). Doing
|
||||
independent cold ground-truth prep below so M1/M2 cold-verify is fast and un-anchored.
|
||||
|
||||
---
|
||||
|
||||
## Independent cold pre-audit (Adversary, @2026-06-11T01:20Z)
|
||||
|
||||
Method: ssh cc-ci, scanned `/var/lib/cc-ci-runs/*/results.json` for recipe + `screenshot` field +
|
||||
on-disk `screenshot.png` size; scp'd suspect PNGs locally and **looked at them** (Read tool).
|
||||
This is MY ground truth, formed before any Builder claim — to compare against the Builder's matrix.
|
||||
|
||||
PNG sizes from latest representative runs (m2r-* sweep + numbered drone runs):
|
||||
|
||||
| recipe | PNG bytes | my visual read | class |
|---|---|---|---|
| immich | 4801 | pure blank white frame | **BLANK** |
| n8n | 4801 | blank near-white frame | **BLANK** |
| lasuite-meet | 4801 | (size-identical to immich/n8n 4801B — blank tell) | BLANK (to confirm visually) |
| cryptpad | 4802 | blank light-grey frame | **BLANK** |
| keycloak | 8764 | spinner + "Loading the Administration Console" — paint-race loading state, NOT a real login form | **BLANK/LOADING** (not the "genuine sparse login" §2 guessed) |
| lasuite-docs | 6022 | bare spinner on white | **BLANK/LOADING** |
| lasuite-drive | ~5.9K | (size sibling of lasuite-docs — likely same spinner) | BLANK (to confirm) |
| plausible | null / NO PNG | every run null (122→357 incl. 357); run dir has no screenshot.png; capture stdout not in run dir (goes to Drone build log) — root cause still to trace | **NULL** |
| ghost | 444183 | (reference healthy, §2) | OK (visual-confirm at M2) |
| mattermost-lts | 242139 | reference healthy | OK |
| hedgedoc | 131967 | reference healthy | OK |
| discourse | 66-67K | reference healthy | OK |
| custom-html | 35707 | reference healthy | OK |
| mailu | 33800 | reference healthy | OK |
| matrix-synapse | 33296 | reference healthy | OK |
| uptime-kuma | 30858 | reference healthy | OK |
| custom-html-tiny | 12950 | reference healthy | OK |
| mumble | 7913 | voice server — web-UI N/A candidate (confirm) | N/A? |

Confirmed defect classes match the orchestrator pre-audit (§2): SPA paint-race (domcontentloaded
|
||||
fires before JS paints) → immich/n8n/cryptpad fully blank, keycloak/lasuite-docs/-drive caught at
|
||||
loading spinner; plausible never captures (null on every run). **The 4801B byte-identical size is a
|
||||
reliable blank-frame fingerprint.**
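
The byte-identical-size observation can be turned into a quick pre-filter. The following is a minimal sketch, not part of the harness: the function name `blank_frame_candidates` and the 10,000-byte ceiling are assumptions made for illustration, and the sizes are copied from the table above. It only flags candidates for visual inspection; it does not classify them.

```python
from collections import defaultdict


def blank_frame_candidates(png_sizes, max_blank_bytes=10_000):
    """Return {size: [recipes]} for small PNG sizes shared by two or more recipes.

    png_sizes maps recipe name -> PNG size in bytes (None when no PNG exists).
    max_blank_bytes is an illustrative ceiling, not a harness constant.
    """
    by_size = defaultdict(list)
    for recipe, size in png_sizes.items():
        if size is not None and size < max_blank_bytes:
            by_size[size].append(recipe)
    # A size only counts as a fingerprint when different apps produce the same bytes.
    return {size: sorted(names) for size, names in by_size.items() if len(names) > 1}


# Sizes taken from the pre-audit table; plausible has no PNG at all.
sizes = {
    "immich": 4801,
    "n8n": 4801,
    "lasuite-meet": 4801,
    "cryptpad": 4802,
    "keycloak": 8764,
    "ghost": 444183,
    "plausible": None,
}
print(blank_frame_candidates(sizes))
```

Exact-size matching deliberately misses near-identical frames such as cryptpad's 4802 B, so a hit is a prompt to open the PNG, not a substitute for looking at it.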
|
||||
|
||||
Open items I must still resolve when verifying:
|
||||
- plausible NULL root cause — need the Drone build log for a plausible run (capture stdout: "capture
|
||||
failed" vs "produced no file" vs step never reached). Run dir alone doesn't have it.
|
||||
- lasuite-meet / lasuite-drive / mumble — visual confirm.
|
||||
- Authoritative enrolled-recipe set: every `tests/<recipe>/recipe_meta.py` minus fixtures
|
||||
(`_generic`, `regression`, `concurrency`, `custom-html-bkp-bad`, `custom-html-rst-bad`).
|
||||
|
||||
No verdict yet. Awaiting `claim(shot): M1`.
|
||||
|
||||
---
|
||||
|
||||
## M1: PASS @2026-06-11T01:38Z (audit + diagnosis complete)
|
||||
|
||||
Claim: `claim(shot): M1` commit e005897; matrix+diagnoses at 8978fa6. STATUS-shot.md "M1 claim".
|
||||
Verified COLD from my own clone + ssh cc-ci, **without reading JOURNAL-shot.md** (anti-anchoring).
|
||||
My independent pre-audit (commit 4f3a747, formed BEFORE reading the Builder's matrix) already
|
||||
agreed on every BLANK/LOADING/NULL read I had pre-formed — no anchoring.
|
||||
|
||||
**Enrolled set — complete, no omissions.** `ls tests/*/recipe_meta.py` = 21. Minus the two harness
|
||||
canaries `custom-html-bkp-bad`, `custom-html-rst-bad` (plan §2 explicitly excludes both) = **19**.
|
||||
The 19 matrix rows are *exactly* that set (diffed by hand) and exactly the plan §2 expected set.
|
||||
`_generic`/`regression`/`concurrency`/`unit` have no recipe_meta.py → correctly absent. ✓
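
The enrolled-set derivation above can be reproduced mechanically. This is a hedged sketch assuming the repository layout described in this entry (`tests/<recipe>/recipe_meta.py`); the helper name `enrolled_recipes` is hypothetical and is not a harness function.

```python
from pathlib import Path

# The two harness canaries that plan §2 excludes from the enrolled set.
HARNESS_CANARIES = {"custom-html-bkp-bad", "custom-html-rst-bad"}


def enrolled_recipes(tests_root):
    """List recipe dirs that carry a recipe_meta.py, minus the harness canaries."""
    found = {p.parent.name for p in Path(tests_root).glob("*/recipe_meta.py")}
    return sorted(found - HARNESS_CANARIES)
```

Directories without a `recipe_meta.py` (such as `_generic` or `unit`) drop out of the glob on their own, so only the two canaries need an explicit exclusion.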
|
||||
|
||||
**Every non-OK row has evidence-backed root cause (independently re-derived):**
|
||||
- plausible NULL — ran the Builder's drone-log command myself: build 357 step log shows
|
||||
`capture failed … page.goto(https://plau-…/) never returned a status in (200,301,302,303,401,403)
|
||||
after 15 attempts (45s); last status=500`. `/` 500s by design (DISABLE_AUTH) → default landing
|
||||
capture can never succeed; needs a SCREENSHOT hook to a rendering path. Confirmed. ✓
|
||||
- bluesky-pds NULL — capture is `if deploy_ok:`-gated, OUTSIDE the deploy try/except
|
||||
(runner/run_recipe_ci.py:1024, read it). install=fail level=0 → capture correctly skipped. Not a
|
||||
screenshot defect; upstream image breakage already in DEFERRED.md (rcust). ✓
|
||||
- BLANK/LOADING — screenshot.py:84-93 navigates `wait_until="domcontentloaded"` then screenshots
|
||||
immediately, no paint wait; accept_statuses excludes 500 (plausible mechanism). Read the code. ✓
|
||||
- mumble NOT N/A — tests/mumble/recipe_meta.py header: deploys `compose.mumbleweb.yml`, a mumble-web
|
||||
HTTP client routed through Traefik, HEALTH_PATH "/". A real web surface IS served → correctly the
|
||||
HARDER (non-N/A) call. ✓
|
||||
|
||||
**Independent visual spot-checks (Read tool) — 11 artifacts, matrix matched reality on every one:**
|
||||
immich 4801B = pure white; n8n 4801B = blank; cryptpad 4802B = blank grey; lasuite-meet 4801B =
|
||||
pure white; keycloak 8764B = "Loading the Administration Console" spinner (NOT a real login — the
|
||||
§2 "might be a genuine login" guess was wrong, Builder classed it LOADING correctly); lasuite-docs
|
||||
6022B = bare spinner; mumble 7913B = spinner ring on grey; mattermost-lts 242139B = blue brand
|
||||
splash + logo, NO login form (correctly LOADING despite large size — size alone is NOT a sufficient
|
||||
signal, good catch); n8n run 197 30256B = real "Set up owner account" form, empty fields,
|
||||
credential-free (flaky-pass + secret-safe, confirmed); custom-html 35707B = genuine "Welcome to
|
||||
nginx!" (honest fresh-install view for a bare static host — OK); plausible = NULL via drone log.
|
||||
Includes plausible ✓ and multiple 4801B cases ✓ (M1 minimum was ≥5 incl. those — exceeded).
|
||||
|
||||
**N/A arguments — agreed:**
|
||||
- bluesky-pds → justified N/A (deploy-gated: can't screenshot what can't deploy; upstream breakage
|
||||
is pre-existing/DEFERRED, not a screenshot defect). Agreed, contingent on the upstream image still
|
||||
being broken at M2 — if it becomes deployable, it re-enters as a real recipe.
|
||||
- mumble → NOT N/A. Agreed (real mumble-web surface, evidence above).
|
||||
|
||||
No omissions, no fabricated visual reads, diagnoses are causal not symptomatic. **M1 PASS.**
|
||||
|
||||
Watch-list for M2 (so the Builder has it early — NOT blocking M1):
|
||||
1. Harness default-wait fix must stay within NAV_DEADLINE_S=45 / step worst-case ≤~60s and must
|
||||
NEVER affect a verdict on screenshot failure (R7) — I will test the failure path has teeth but
|
||||
no verdict impact, and compare pre/post run durations.
|
||||
2. plausible SCREENSHOT hook must land on a credential-free *rendering* path (not /login showing a
|
||||
generated secret; not a 500 page).
|
||||
3. mattermost-lts proof: a bigger PNG is NOT acceptance — I will visually confirm the real login,
|
||||
not a brand splash.
|
||||
4. Secret-safety: every final PNG must show no generated credentials (install wizards, secrets
|
||||
pages). n8n's "Set up owner account" with EMPTY fields is the safe shape; a pre-filled one is not.
|
||||
5. M2 requires ≥2 proof runs via the drone `!testme` path + me Reading *every* final PNG.
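
The time budget in watch-list item 1 is simple arithmetic. A sketch, assuming the constant values that appear in the `screenshot.py` change included in this compare (the function name `worst_case_wait_seconds` is hypothetical, and the time spent writing the PNG itself is not counted):

```python
NAV_DEADLINE_S = 45            # navigation cap, spent only while the app is not serving yet
SETTLE_TIMEOUT_MS = 10_000     # network-idle cap after navigation
RENDER_GRACE_MS = 500          # short paint grace after each settle
BLANK_RETRY_SETTLE_MS = 4_000  # extra settle before the single blank-frame retry


def worst_case_wait_seconds():
    """Upper bound on bounded waiting for one capture, in seconds."""
    first_snap_ms = SETTLE_TIMEOUT_MS + RENDER_GRACE_MS
    retry_ms = BLANK_RETRY_SETTLE_MS + RENDER_GRACE_MS
    return NAV_DEADLINE_S + (first_snap_ms + retry_ms) / 1000


print(worst_case_wait_seconds())
```

With those values the bound comes to 60 seconds, which matches the "≤~60s" ceiling only if no further waits are added to the default path.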
|
||||
|
||||
Did not read JOURNAL-shot.md before this verdict. No finding filed (audit is accurate). No VETO.
|
||||
144
STATUS-rcust.md
@ -1,5 +1,14 @@
|
||||
# STATUS — sub-phase rcust (recipe-customization restructure)
|
||||
|
||||
## DONE
|
||||
|
||||
Phase complete 2026-06-11: M1 PASS (REVIEW-rcust.md 01f9f70, 2026-06-10) + M2 PASS (REVIEW-rcust.md
|
||||
3245150, 2026-06-11) — both fresh, Adversary-verified, no standing VETO. Restructure merged to main
|
||||
(01e6d49 + approved fix-forwards 1357544, 6cabbe7); all 21 recipes reconciled vs corrected
|
||||
baseline; canaries 7/7 (Adversary's own cold run); drone path covered; zero leaked apps.
|
||||
Non-rcust follow-ups filed in machine-docs/DEFERRED.md (discourse abra-stamp env drift,
|
||||
bluesky-pds upstream image breakage re-pin).
|
||||
|
||||
Plan: /srv/cc-ci/cc-ci-plan/recipe-custom-restructure-full-plan.md (SSOT for this phase).
|
||||
Reference spec: docs/recipe-customization.md @ 76a4b6b.
|
||||
Work branch: `restructure/recipe-custom` (one commit per phase P1–P6; merged to main only after M1 PASS).
|
||||
@ -77,7 +86,65 @@ sweep runs, not retroactively here.
|
||||
|
||||
## Gate
|
||||
|
||||
**Gate: M2 IN PROGRESS** — M1 PASS in REVIEW-rcust.md (01f9f70, 2026-06-10).
|
||||
**Gate: M2 CLAIMED 2026-06-11 ~01:30Z, awaiting Adversary.**
|
||||
|
||||
### M2 claim — WHAT / HOW / EXPECTED / WHERE
|
||||
|
||||
WHAT: plan M2.0–M2.4 complete on merged main. Merge 01e6d49 (build 326 green) + two
|
||||
Adversary-approved fix-forwards: 1357544 (lasuite-drive best-effort bucket poll, approval 57c66ad)
|
||||
and 6cabbe7 = merge of be2026a (services_converged completed-one-shot rule, approval a531746,
|
||||
build 350 green on 914c166, merged-diff==branch-diff verified 4428e76). Canaries 7/7. All 21
|
||||
recipe dirs reconciled vs the CORRECTED baseline (the Adversary-accepted L5≡L4+OIDC equivalence
|
||||
for the three stale lasuite-* rows; one justified exclusion: bluesky-pds, non-rcust upstream image
|
||||
breakage, DEFERRED.md). Drone→harness path covered (2 PR !testme runs green). Zero leaked apps.
|
||||
|
||||
RECONCILIATION (final evidence per recipe; run dirs under /var/lib/cc-ci-runs/):
|
||||
|
||||
| Recipe | Baseline | Final evidence | Match |
|---|---|---|---|
| bluesky-pds | full green (pre-results-era) | m2r L0 == m2rr L0 == ab-oldmain L0, all `Cannot find module /app/index.js` crash-loop | EXCLUDED: upstream image breakage, harness-neutral (DEFERRED.md) |
| cryptpad | L4 | m2r-cryptpad L4 | ✓ |
| custom-html | L4 | m2r-custom-html L4 | ✓ |
| custom-html-bkp-bad | designed backup fail, L1 | m2r: backup fail exactly | ✓ |
| custom-html-rst-bad | designed restore fail, L1 | m2r: backup pass → restore fail exactly | ✓ |
| custom-html-tiny | L2 (declared EXPECTED_NA) | m2r-custom-html-tiny L2 | ✓ |
| discourse | L4 (184, 06-05) | m2r/m2b/m2p + ab-oldmain×2: ALL deviations byte-identical old==new harness (restore race @default head: L2==L2; upgrade-HC1 @baseline ref PR=2: L1==L1, stamp eb96de94+U both) | env drift since 06-05, rcust-neutral (Adversary-verified, condition 3 of a531746) |
| ghost | L4 | m2r-ghost L4 | ✓ |
| hedgedoc | L4 | m2r-hedgedoc L4 | ✓ |
| immich | L4 | m2b-immich L4 @baseline ref + drone-path run 356 L4 | ✓ |
| keycloak | L4 | m2r-keycloak L4 | ✓ |
| lasuite-docs | L5 (stale schema) | m2r-lasuite-docs L4 all-pass + OIDC PASSED skip-0 | ✓ (accepted equivalence) |
| lasuite-drive | L5 (stale schema) | m2p2-lasuite-drive L4 all-pass + OIDC + MinIO PASSED, rc=0, post-both-fixes | ✓ (accepted equivalence) |
| lasuite-meet | L5 (stale schema) | m2r-lasuite-meet L4 all-pass + OIDC PASSED | ✓ (accepted equivalence) |
| mailu | L2 | m2r-mailu L2 | ✓ |
| matrix-synapse | L4 | m2r-matrix-synapse L4 | ✓ |
| mattermost-lts | L4 | m2b-mattermost-lts L4 @baseline ref | ✓ |
| mumble | all 5 tiers (pre-results-era) | m2r-mumble all tiers pass, deploy-count=1 | ✓ |
| n8n | L4 | m2r-n8n L4 | ✓ |
| plausible | L4 | m2b-plausible L4 @baseline ref + drone-path run 357 L4 | ✓ |
| uptime-kuma | L4 | m2r-uptime-kuma L4 | ✓ |

HOW (cold, from the Adversary's own clone / direct on cc-ci):
|
||||
- per-recipe: `jq '{recipe,level,rungs,flags}' /var/lib/cc-ci-runs/<id>/results.json` for every id
|
||||
above; logs in /root/m2-logs/, /root/m2-baseline-logs/, /root/m2-proof-logs/, /root/m2-ab-logs/.
|
||||
- canaries: /root/m2-canary.log (7/7, fresh clone of merged main).
|
||||
- drone path: builds 356 (immich#2) + 357 (plausible#3) `custom` events SUCCESS in drone DB
|
||||
(`docker cp <drone_cid>:/data/database.sqlite` + sqlite query, as documented above); run dirs
|
||||
356/357 carry `customization` manifest keys + clean flags; triggered by real `!testme` comments
|
||||
(gitea comment ids 14317/14318).
|
||||
- M2.4 spot-greps: section above (manifest 21/21, mumble tcp probe, ghost/discourse overlay+
|
||||
BACKUP_VERIFY, lasuite deps+OIDC, immich seeds, cryptpad EXTRA_ENV hook+playwright).
|
||||
- zero-leak: `docker stack ls` on cc-ci → infra (backups/bridge/dashboard/reports/drone/traefik)
|
||||
+ warm-keycloak ONLY (checked 01:27Z, after ALL runs incl. drone-path).
|
||||
- tree: origin/main, working tree clean, every claim-referenced commit pushed.
|
||||
|
||||
EXPECTED: every check above reproduces as stated; no recipe regresses vs the corrected baseline.
|
||||
|
||||
WHERE: origin/main @ (this commit); REVIEW-rcust.md holds M1 PASS (01f9f70), be2026a approval +
|
||||
all-conditions-cleared (a531746, 24a203a); DEFERRED.md holds the two non-rcust follow-ups
|
||||
(discourse abra-stamp mechanism, bluesky-pds upstream re-pin).
|
||||
|
||||
**Gate history: M2 IN PROGRESS** — M1 PASS in REVIEW-rcust.md (01f9f70, 2026-06-10).
|
||||
|
||||
- M2.0 merge: `restructure/recipe-custom` merged to main as 01e6d49 (merge commit, no force);
|
||||
push build green: drone build **326 success** on 01e6d49 (API-verified).
|
||||
@ -127,11 +194,72 @@ sweep runs, not retroactively here.
|
||||
- M2.3 in-flight proof runs (serial queue /root/m2-proof.sh + /root/m2-proof2.sh, logs
|
||||
/root/m2-proof-logs/, driver /root/m2-proof-logs/driver.log):
|
||||
1. **lasuite-drive @baseline ref ffa7d585afa2 PR=1 on merged main @5c0676b** (post-fix-forward
|
||||
1357544) → run id m2p-lasuite-drive; EXPECTED L5 (the Adversary approval condition).
|
||||
2. **discourse @7ae7b0f PR=2 on merged main** (exact baseline-184 invocation) → m2p-discourse;
|
||||
discriminates PR=0-artifact/race vs deterministic-at-ref.
|
||||
3. **discourse @7ae7b0f PR=2 on OLD main** (/root/m2-oldmain) → ab-discourse-7ae7b0f-oldmain;
|
||||
completes the same-ref A/B the upgrade-HC1 mode is missing.
|
||||
1357544) → run id m2p-lasuite-drive: **WILL LAND L0 — second P2b regression found via this
|
||||
run, root-caused LIVE.** The 1357544 best-effort path WORKED (`!!` warn + continue in the
|
||||
log); the one-shot task went **Complete** ~3min in (bucket created); but a completed
|
||||
restart_policy-none one-shot reports replicas 0/1 FOREVER, and services_converged requires
|
||||
cur==want → the install assert burned DEPLOY_TIMEOUT (1800s) and failed. Old world never saw
|
||||
this: setup_custom_tests.sh ran POST-install-assert (its own header: orchestrator runs it
|
||||
after the deploy is healthy); P2b moved the trigger to ops.py pre_install = PRE-assert.
|
||||
Verified live during the run: app HTTP 200, all other services 1/1,
|
||||
`docker service ps ..._minio-createbuckets` = Complete, pytest in converge loop 27+ min.
|
||||
**Fix-forward proposed, awaiting Adversary approval: branch `fix/converged-oneshot` @
|
||||
be2026a** — services_converged treats a replica deficit explained ENTIRELY by Complete tasks
|
||||
as converged (Failed/mixed/spinning-up/no-tasks still block; 0/0 + N/N unchanged); pinned by
|
||||
tests/unit/test_converged_oneshot.py (7 cases). Proof: working tree on cc-ci
|
||||
`cc-ci-run -m pytest tests/unit -q` → 199 passed; lint PASS.
|
||||
**APPROVED (REVIEW a531746) and MERGED to main as 6cabbe7** (merge commit, no force);
|
||||
merged diff == be2026a diff (`git diff be2026a..main -- runner/harness/lifecycle.py
|
||||
tests/unit/test_converged_oneshot.py` = empty). Push build green: drone build **350
|
||||
success** on 914c166 (branch head incl. the merge; verify on cc-ci:
|
||||
`docker cp <drone_cid>:/data/database.sqlite /tmp/d.sqlite && sqlite3 /tmp/d.sqlite
|
||||
"select build_number,build_status,build_after from builds order by build_id desc limit 5"`).
|
||||
Post-fix re-run QUEUED: /root/m2-proof3.sh waits for the discourse A/B pair to drain, then
|
||||
runs lasuite-drive @ffa7d585afa2 PR=1 from fresh clone /root/m2-postfix @6cabbe7 →
|
||||
CCCI_RUN_ID=m2p2-lasuite-drive, log /root/m2-proof-logs/lasuite-drive-postfix.log.
|
||||
EXPECTED **L5** (binding condition 1 of the approval).
|
||||
DISCLOSED INTERVENTION: in the doomed pre-fix m2p run, after the GENERIC install assert had
|
||||
already failed at the 1800s converge deadline, the OVERLAY install test entered a second
|
||||
identical 1800s converge burn — Builder sent it (pytest pid only) SIGINT at ~01:00Z to skip
|
||||
the redundant 20+ min wait. The log therefore shows `KeyboardInterrupt` at generic.py:97
|
||||
(the converge poll — the exact diagnosed line). The orchestrator's own exit paths/teardown
|
||||
untouched; run continued to upgrade/backup/restore/custom normally. The m2p result is
|
||||
diagnostic evidence of the bug, not a baseline data point — the binding proof is m2p2.
|
||||
2. **discourse @7ae7b0f PR=2 on merged main** (exact baseline-184 invocation) → m2p-discourse:
|
||||
**COMPLETE — L2, upgrade HC1 fail, chaos-version=eb96de94+U** (identical to m2b: stamp = the
|
||||
prev-base tag commit). Deterministic at this ref on new main; NOT a PR=0 artifact, NOT a race.
|
||||
install/backup/restore/custom all pass.
|
||||
3. **discourse @7ae7b0f PR=2 on OLD main** → ab-discourse-7ae7b0f-oldmain: **COMPLETE — L2,
|
||||
upgrade HC1 fail, chaos-version=eb96de94+U — BYTE-IDENTICAL failure to the new-main run.**
|
||||
**DISCOURSE A/B CLOSED: old harness == new harness at the baseline ref + baseline invocation
|
||||
(PR=2). The upgrade-HC1 mode is HARNESS-NEUTRAL — not an rcust regression.** Baseline 184's
|
||||
L4 (06-05) vs today's identical-both-worlds failure = environment/content drift since 06-05,
|
||||
outside both harnesses. Drift candidates checked and ELIMINATED: 7ae7b0f is still a live
|
||||
branch tip in the mirror (`refs/heads/upgrade-0.8.0+3.5.0` + `refs/pull/2/head` — git
|
||||
ls-remote), and upstream's latest release tag is unchanged (0.7.0+3.3.1 = eb96de94, no new
|
||||
tag since 06-05). flake.lock (abra pin) identical in both worlds. HC1 firing rather than
|
||||
false-greening is the guard working as designed.
|
||||
Cold-verify: results.json + full logs at /var/lib/cc-ci-runs/{m2p-discourse,
|
||||
ab-discourse-7ae7b0f-oldmain}/ + /root/m2-proof-logs/discourse{,-oldmain}.log.
|
||||
4. **lasuite-drive @ffa7d585afa2 PR=1 on merged main @6cabbe7 (post-converge-fix)** →
|
||||
m2p2-lasuite-drive: **COMPLETE in 3m19s, rc=0 — all 5 stages pass, deploy-count=1,
|
||||
`test_oidc_password_grant_against_dep_keycloak` PASSED (requires_deps skip-count 0),
|
||||
`test_minio_bucket_present_and_object_roundtrip` PASSED, clean_teardown+no_secret_leak
|
||||
flags true. NO converge burn: the one-shot again exceeded its 90s window (`!!` best-effort
|
||||
line), completed late, and the install assert passed straight through — both fix-forwards
|
||||
proven end-to-end.** results.json `level=4`, NOT 5 — see schema note below.
|
||||
- **BASELINE SCHEMA NOTE (affects lasuite-docs/-drive/-meet expected "L5")**: the 6-rung ladder
|
||||
(L5 integration / L6 recipe-local) was REMOVED from main by the deliberate mainline refactor
|
||||
46e2cdb + c51cd84 ("four essential rungs only — integration & recipe-local are optional",
|
||||
PR #6, 2026-06-09 ~03:00Z) — BEFORE the rcust merge and NOT part of it (merge diff
|
||||
01e6d49^1..01e6d49 touches level.py not at all and results.py by +4 lines; current
|
||||
derive_rungs/compute_level are byte-equal to the pre-merge main versions). Every post-06-09 run
|
||||
caps at L4 BY DESIGN; the integration (OIDC) test now counts inside the functional/custom rung.
|
||||
Timeline evidence: run 204 (lasuite-meet, 06-09 pre-deploy) = 6-rung level 5; all later runs =
|
||||
4-rung. EQUIVALENCE for the baseline matrix: old "L5 (integration pass)" ≡ new "L4 all-rungs
|
||||
pass + the requires_deps OIDC test PASSED (skip-count 0)". m2p2-lasuite-drive meets it; the
|
||||
m2r sweep's lasuite-docs + lasuite-meet L4-all-pass results (with their OIDC PASSED lines,
|
||||
already in M2.4 spot-greps) meet it identically.
|
||||
- M2.4 spot-greps (customizations actually executed — log evidence in /root/m2-logs/):
|
||||
manifest block present 21/21; mumble `ready-probe OK (tcp 3x): 127.0.0.1:64738`; ghost+discourse
|
||||
`ccci-overlay: provided compose.ccci.yml ... auto-chaos` (P2a first-class path live);
|
||||
@ -161,5 +289,5 @@ sweep runs, not retroactively here.
|
||||
|
||||
## Current
|
||||
|
||||
M2 in progress: merge done (01e6d49, build 326 green); canary suite running on cc-ci; 21-recipe
|
||||
sweep queued behind it. Evidence lands here as steps complete.
|
||||
M2 CLAIMED (see Gate above) — awaiting Adversary cold-verify. No other unblocked work in this
|
||||
phase; DONE follows the M2 PASS handshake.
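
The completed-one-shot convergence rule described in the M2.3 notes above can be sketched as follows. This is an illustration under stated assumptions, not the code in `runner/harness/lifecycle.py`: the function name `service_converged`, its signature, and the reading of "explained entirely by Complete tasks" (every non-running task is `complete`, and their count equals the replica deficit) are all assumptions.

```python
def service_converged(current, desired, task_states):
    """Decide convergence for one service.

    current/desired are the replica counts (the "cur" and "want" of "0/1");
    task_states lists the state of the service's current tasks, e.g. "running",
    "complete", "failed", "starting".
    """
    if current == desired:
        return True  # N/N and 0/0 behave as before
    deficit = desired - current
    if deficit < 0 or not task_states:
        return False  # over-replicated or no tasks yet: still blocking
    not_running = [state for state in task_states if state != "running"]
    # The deficit must be accounted for by completed tasks only; a failed or
    # still-starting task keeps the service unconverged.
    return len(not_running) == deficit and all(s == "complete" for s in not_running)
```

Under this reading a finished `restart_policy: none` one-shot at 0/1 counts as converged, while a failed or still-starting task at 0/1 continues to block the install assert.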
|
||||
|
||||
38
STATUS-shot.md
Normal file
@ -0,0 +1,38 @@
|
||||
# STATUS-shot.md — Builder status, phase `shot`
|
||||
|
||||
SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-shot-screenshots.md
|
||||
|
||||
## Current section
|
||||
|
||||
Gate: M1 CLAIMED, awaiting Adversary.
|
||||
P1 audit matrix COMPLETE (all 19 enrolled recipes, every PNG visually inspected).
|
||||
P2 diagnoses COMPLETE (see BACKLOG-shot.md P2 — each with evidence).
|
||||
Meanwhile working (unblocked, pre-M2): P3 harness default-wait improvement + unit tests.
|
||||
|
||||
## M1 claim — verification map (WHAT/HOW/EXPECTED/WHERE)
|
||||
|
||||
WHAT: M1 = full audit matrix (19/19 enrolled recipes, BACKLOG-shot.md "P1 — Audit matrix") +
|
||||
root-cause diagnosis with evidence for every non-OK row (BACKLOG-shot.md "P2") + N/A candidates
|
||||
argued (bluesky-pds: blocked-upstream N/A; mumble: explicitly NOT an N/A — real web UI).
|
||||
Claimed at commit 8978fa6 (matrix+diagnoses) — claim commit follows.
|
||||
|
||||
- Enrolled set (19): `ls tests/*/recipe_meta.py` minus fixtures `_generic, regression, concurrency,
|
||||
custom-html-bkp-bad, custom-html-rst-bad` (those first three have no recipe_meta.py; the two
|
||||
`-bad` ones do but are harness canaries).
|
||||
- Matrix: BACKLOG-shot.md "P1 — Audit matrix". Reproduce any row:
|
||||
`ssh cc-ci 'grep -o "\"screenshot\": *[^,}]*" /var/lib/cc-ci-runs/<run>/results.json; stat -c%s /var/lib/cc-ci-runs/<run>/screenshot.png'`
|
||||
then scp the PNG and Read it. Run ids are in the matrix "latest run" column.
|
||||
- plausible NULL evidence: Drone sqlite, build 357 ci step (step_id 947):
|
||||
`ssh cc-ci 'docker run --rm -v drone_ci_commoninternet_net_data:/data alpine sh -c "apk add -q sqlite; sqlite3 /data/database.sqlite \"select log_data from logs where log_id=947\"" | grep -o "screenshot[^\"]*"'`
|
||||
EXPECTED: `capture failed … last status=500` after 15 attempts/45s.
|
||||
- bluesky-pds NULL evidence: `grep '"install"' /var/lib/cc-ci-runs/m2rr-bluesky-pds/results.json`
|
||||
→ fail, level=0; capture is gated on deploy_ok (runner/run_recipe_ci.py:1024).
|
||||
- Default capture path under audit: runner/harness/screenshot.py:84-93 (domcontentloaded, no paint
|
||||
wait) — the BLANK/LOADING mechanism; accept_statuses excludes 500 — the plausible mechanism.
|
||||
- mumble web UI exists: tests/mumble/recipe_meta.py header (compose.mumbleweb.yml, HEALTH_PATH "/").
|
||||
- custom-html fresh install serves nginx default: no install_steps.sh in tests/custom-html/ (only
|
||||
pre_backup/pre_upgrade seeds in ops.py, which run AFTER the capture moment).
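
The per-row reproduction step in the list above (read the `screenshot` field, then stat the PNG) could also be scripted. A minimal sketch, assuming `screenshot` is a top-level key in `results.json` and that the PNG is named `screenshot.png` inside the run directory; the helper name `screenshot_evidence` is hypothetical.

```python
import json
import os


def screenshot_evidence(run_dir):
    """Collect the two facts the audit matrix records for one run directory."""
    with open(os.path.join(run_dir, "results.json")) as fh:
        results = json.load(fh)
    png_path = os.path.join(run_dir, "screenshot.png")
    png_bytes = os.path.getsize(png_path) if os.path.exists(png_path) else None
    # Assumption: the field sits at the top level of results.json.
    return {"screenshot": results.get("screenshot"), "png_bytes": png_bytes}
```

This covers only the recorded field and the byte size; the visual read of each PNG still has to be done by a person.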
|
||||
|
||||
## Blocked
|
||||
|
||||
(nothing)
|
||||
@ -1,22 +0,0 @@
|
||||
# Adversary inbox — from Builder @2026-06-11T00:20Z (re: your 23:53Z asks — both in flight + new facts)
|
||||
|
||||
Both asks are queued serially on cc-ci (driver log /root/m2-proof-logs/driver.log):
|
||||
1. **lasuite-drive @ffa7d585afa2 PR=1 on merged main @5c0676b** (post-1357544) — RUNNING now,
|
||||
run id m2p-lasuite-drive, log /root/m2-proof-logs/lasuite-drive.log. Expected L5.
|
||||
2. **discourse @7ae7b0f76efb PR=2 on merged main** (exact baseline-184 invocation, vs m2b's PR=0)
|
||||
— m2p-discourse, queued behind 1.
|
||||
3. **discourse @7ae7b0f76efb PR=2 on OLD main** (/root/m2-oldmain) — ab-discourse-7ae7b0f-oldmain,
|
||||
queued behind 2. This is your same-ref A/B.
|
||||
|
||||
New facts you'll want for your cold re-verify (details + paths in STATUS-rcust.md):
|
||||
- m2b-discourse: the per-run clone is PRESERVED at /var/lib/cc-ci-runs/m2b-discourse/abra/recipes/
|
||||
discourse with HEAD=7ae7b0f — the upgrade re-checkout executed and persisted; `eb96de94` (the
|
||||
stamped chaos commit) is the prev-base tag commit 0.7.0+3.3.1. So the failure is "chaos redeploy
|
||||
left the base stamp", not "re-checkout failed" (the HC1 message's wording is its generic guess).
|
||||
- The `service "sidekiq" depends on undefined service "discourse"` line in the m2b log is NOT the
|
||||
failure: it appears verbatim in the PASSING m2r/m2rr upgrade sections (dangling depends_on ships
|
||||
in the published compose; see tests/discourse/compose.ccci.yml NOTE).
|
||||
- bluesky-pds re-characterized: all three failures (m2r, m2rr, ab-oldmain) are the SAME app
|
||||
crash-loop `Cannot find module '/app/index.js'` — upstream image moved under the pinned tag;
|
||||
no harness can deploy it. Not a pull timeout (my earlier STATUS wording was wrong, now fixed).
|
||||
grep MODULE_NOT_FOUND in the runs' abra/logs/default/.
|
||||
@ -335,3 +335,15 @@ before the build is called done) — but does **not** force closure.
|
||||
- **Re-entry trigger:** Builder authors recipe-PR Q4.7b (cache tarball on a volume / wget
|
||||
retry+backoff / drop `2>/dev/null` / `set +e` w/ fallback), then runs plausible-full green + claims.
|
||||
- **Linked:** REVIEW-2 `e850281` (root-cause + DENY), `71af595` (§4.3 floor); DECISIONS 2026-05-30.
|
||||
- discourse upgrade-HC1 @7ae7b0f stamps prev-base tag commit (eb96de94+U) on BOTH old+new harness since ~06-10 (baseline 184 was L4 on 06-05); harness-neutral (rcust exonerated, M2-closed) but abra stamp-resolution mechanism UNATTRIBUTED — worth a standalone dig outside rcust. Evidence: /var/lib/cc-ci-runs/{m2p-discourse,ab-discourse-7ae7b0f-oldmain}, JOURNAL-rcust 2026-06-11.
|
||||
- bluesky-pds: UPSTREAM IMAGE BREAKAGE (non-rcust, M2-justified exclusion from baseline match).
|
||||
The app container crash-loops `Error: Cannot find module '/app/index.js'` (MODULE_NOT_FOUND,
|
||||
Node v24.15.0) under the recipe's pinned tag on EVERY current run — new main @ mirror head
|
||||
(m2r-bluesky-pds), new main serial re-run (m2rr-bluesky-pds), AND old pre-rcust main @ old
|
||||
default head b2d86ef (ab-bluesky-pds-oldmain): identical failure on both harnesses and both
|
||||
refs → upstream re-published/moved the image under the tag; NO harness change can make this
|
||||
recipe deploy until the recipe re-pins. Baseline ("full lifecycle green", pre-results-era
|
||||
Phase-2 evidence e45e0ee) is unreproducible on any current run for reasons outside this repo.
|
||||
Evidence: `grep -r MODULE_NOT_FOUND /var/lib/cc-ci-runs/{m2r,m2rr,ab}-bluesky-pds*/abra/logs/
|
||||
default/`; REVIEW-rcust.md 2026-06-11 entries. Follow-up (post-phase): file/propose a re-pin PR
|
||||
against the bluesky-pds recipe mirror.
|
||||
|
||||
@ -18,6 +18,7 @@ missing, app slow, navigation error) is swallowed and returns None so the run/ve
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import contextlib
|
||||
import os
|
||||
|
||||
from . import browser as harness_browser
|
||||
@ -28,6 +29,55 @@ VIEWPORT = {"width": 1280, "height": 800}
|
||||
# Hard cap so a wedged app can never hang the run on the screenshot step (R7 / Phase-1 timeouts).
|
||||
NAV_DEADLINE_S = 45
|
||||
|
||||
# ---- post-navigation settle (phase-shot fix, 2026-06-11) ----
# SPAs (immich, n8n, cryptpad, the keycloak admin console, lasuite-*, mumble-web, mattermost) fire
# `domcontentloaded` on their empty HTML shell and only paint after the JS bundle loads — snapping
# immediately produced solid blank frames (byte-stable 4801-2 B) or loading spinners. After nav,
# wait for network-idle up to SETTLE_TIMEOUT_MS (apps that never go idle — continuous polling —
# simply spend the cap; bounded, never raises), then RENDER_GRACE_MS for the final paint.
SETTLE_TIMEOUT_MS = 10_000
RENDER_GRACE_MS = 500
# A 1280x800 PNG below this is near-certainly a solid frame or a bare loading spinner (phase-shot
# audit: blank frames were 4801-2 B across three different apps, lone spinners 5.9-8.8 KB; the
# smallest real page was 12950 B). One bounded retry with an extra settle, then keep what we get —
# an honest late frame beats none, and the retry only ever replaces a tiny frame with a later one.
BLANK_SIZE_BYTES = 10_000
BLANK_RETRY_SETTLE_MS = 4_000
# Wait-budget arithmetic (plan-phase-shot §3 P3: step worst case ≤ ~60s): NAV_DEADLINE_S (45s,
# spent only while the app isn't serving yet) + SETTLE_TIMEOUT_MS + RENDER_GRACE_MS +
# BLANK_RETRY_SETTLE_MS + RENDER_GRACE_MS = 60s of bounded waiting; tested in unit tests.


def _settle(page, idle_timeout_ms: int) -> None:
    """Best-effort bounded settle: network-idle up to the cap, then a short render grace.
    Never raises (R7) — a timeout just means the page kept polling; we snap what's painted."""
    # cosmetic path (R7): a timeout on a never-idle app is expected — the cap IS the wait
    with contextlib.suppress(Exception):
        page.wait_for_load_state("networkidle", timeout=idle_timeout_ms)
    with contextlib.suppress(Exception):
        page.wait_for_timeout(RENDER_GRACE_MS)


def _snap_with_blank_retry(page, out_path: str) -> None:
    """Screenshot the page; if the PNG is blank/spinner-sized, retry ONCE after a longer settle.
    The retry overwrites the tiny frame with a strictly-later one (same page, more paint time)."""
    page.screenshot(path=out_path, full_page=False)
    try:
        first = os.path.getsize(out_path)
    except OSError:
        return
    if first >= BLANK_SIZE_BYTES:
        return
    print(
        f" screenshot: frame looks blank/loading ({first} B < {BLANK_SIZE_BYTES} B) — "
        "one retry after a longer settle",
        flush=True,
    )
    _settle(page, BLANK_RETRY_SETTLE_MS)
    page.screenshot(path=out_path, full_page=False)
    with contextlib.suppress(OSError):
        print(f" screenshot: retry frame {os.path.getsize(out_path)} B", flush=True)


def screenshot_path(run_artifact_dir: str) -> str:
|
||||
"""Canonical on-disk path for a run's app screenshot (pure)."""
|
||||
@@ -79,7 +129,7 @@ def capture(domain: str, out_path: str, *, recipe_meta: dict | None = None) -> s
                 # the uniform ctx convention (rcust P3).
                 hook(page, meta_mod.hook_ctx(domain, recipe_meta))
                 if not os.path.exists(out_path):
-                    page.screenshot(path=out_path, full_page=False)
+                    _snap_with_blank_retry(page, out_path)
             else:
                 # Default: landing page. Accept any rendered status (200 or an auth redirect to a
                 # login form) — both are credential-free and representative of "the app is up".
@@ -90,7 +140,9 @@ def capture(domain: str, out_path: str, *, recipe_meta: dict | None = None) -> s
                     deadline_seconds=NAV_DEADLINE_S,
                     wait_until="domcontentloaded",
                 )
-                page.screenshot(path=out_path, full_page=False)
+                # SPA paint race fix (phase-shot): settle before snapping, retry a blank frame.
+                _settle(page, SETTLE_TIMEOUT_MS)
+                _snap_with_blank_retry(page, out_path)
         finally:
             browser.close()
     if os.path.exists(out_path) and os.path.getsize(out_path) > 0:
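The size threshold used by `_snap_with_blank_retry` rests on the fact that a solid-colour frame compresses to a very small PNG. The sketch below illustrates that with a stdlib-only encoder. It is an assumption-laden illustration, not the project's code: `encode_rgb_png` and `looks_blank` are hypothetical helpers, the encoder is not the one the browser uses, and a random-noise frame is an extreme stand-in for "a page with content" (real pages fall somewhere in between, as the audited sizes show).

```python
import os
import struct
import zlib

WIDTH, HEIGHT = 1280, 800   # viewport size named in the diff's comment
BLANK_SIZE_BYTES = 10_000   # same value as the constant in the diff


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Serialise one PNG chunk: big-endian length, type, data, CRC-32 of type+data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def encode_rgb_png(rows: list[bytes]) -> bytes:
    """Encode HEIGHT rows of WIDTH*3 RGB bytes as an 8-bit truecolour PNG (filter type 0)."""
    ihdr = struct.pack(">IIBBBBB", WIDTH, HEIGHT, 8, 2, 0, 0, 0)
    # Every scanline is prefixed by its filter-type byte (0 = no filtering).
    raw = b"".join(b"\x00" + row for row in rows)
    idat = zlib.compress(raw, 9)
    signature = b"\x89PNG\r\n\x1a\n"
    return signature + _chunk(b"IHDR", ihdr) + _chunk(b"IDAT", idat) + _chunk(b"IEND", b"")


def looks_blank(png: bytes) -> bool:
    """Hypothetical helper: the inverse of the `first >= BLANK_SIZE_BYTES` check in the diff."""
    return len(png) < BLANK_SIZE_BYTES


# A solid light-grey frame versus a frame of random pixels.
solid_frame = encode_rgb_png([b"\xee\xee\xee" * WIDTH] * HEIGHT)
noisy_frame = encode_rgb_png([os.urandom(WIDTH * 3) for _ in range(HEIGHT)])

print(len(solid_frame), looks_blank(solid_frame))
print(len(noisy_frame), looks_blank(noisy_frame))
```

The solid frame lands well under the threshold and the noisy one far above it. The byte count is only a cheap proxy for "nothing painted yet", which is presumably why the diff retries once and then keeps whatever frame it has rather than failing the step.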
@@ -32,6 +32,90 @@ def test_hook_returned_when_callable():
    assert S._load_screenshot_hook({"SCREENSHOT": hook}) is hook


class _FakePage:
    """Minimal Playwright-page stand-in for the settle/blank-retry helpers (no browser needed)."""

    def __init__(self, shot_sizes, idle_raises=False):
        self._shot_sizes = list(shot_sizes)  # bytes written per successive screenshot() call
        self._idle_raises = idle_raises
        self.idle_waits = []  # (state, timeout) per wait_for_load_state call
        self.timeout_waits = []  # ms per wait_for_timeout call
        self.shots = 0

    def wait_for_load_state(self, state, timeout=None):
        self.idle_waits.append((state, timeout))
        if self._idle_raises:
            raise TimeoutError(f"page kept polling past {timeout}ms")

    def wait_for_timeout(self, ms):
        self.timeout_waits.append(ms)

    def screenshot(self, path, full_page=False):
        self.shots += 1
        with open(path, "wb") as f:
            f.write(b"\x89PNG" + b"\0" * (self._shot_sizes.pop(0) - 4))


def test_settle_swallows_never_idle_pages():
    """R7: an app that never reaches network-idle (continuous polling) must not raise — the
    timeout cap IS the wait."""
    page = _FakePage([], idle_raises=True)
    S._settle(page, 1234)  # must not raise
    assert page.idle_waits == [("networkidle", 1234)]
    assert page.timeout_waits == [S.RENDER_GRACE_MS]


def test_snap_retries_blank_frame(tmp_path):
    """A blank-sized first frame (audit fingerprint: 4801 B) triggers exactly one retry with a
    longer settle, overwriting the tiny frame with the later (painted) one."""
    out = str(tmp_path / "shot.png")
    page = _FakePage([4801, 30256])
    S._snap_with_blank_retry(page, out)
    assert page.shots == 2
    assert page.idle_waits == [("networkidle", S.BLANK_RETRY_SETTLE_MS)]
    assert os.path.getsize(out) == 30256


def test_snap_no_retry_for_real_frame(tmp_path):
    """A real-sized first frame is kept as-is — no second screenshot, no extra waiting."""
    out = str(tmp_path / "shot.png")
    page = _FakePage([35707])
    S._snap_with_blank_retry(page, out)
    assert page.shots == 1
    assert page.idle_waits == []
    assert os.path.getsize(out) == 35707


def test_snap_retry_keeps_late_frame_even_if_still_blank(tmp_path):
    """If the retry frame is still tiny we keep it (honest best-effort) — exactly one retry,
    never a loop."""
    out = str(tmp_path / "shot.png")
    page = _FakePage([4801, 4801])
    S._snap_with_blank_retry(page, out)
    assert page.shots == 2
    assert os.path.getsize(out) == 4801


def test_blank_threshold_brackets_observed_sizes():
    """Threshold sits between the audited defect sizes (blank 4801-2 B, lone spinners up to
    8764 B) and the smallest real page (custom-html-tiny, 12950 B)."""
    for defect in (4801, 4802, 5895, 6022, 7913, 8764):
        assert defect < S.BLANK_SIZE_BYTES
    assert S.BLANK_SIZE_BYTES < 12950


def test_wait_budget_within_step_cap():
    """plan-phase-shot §3 P3: the screenshot step's bounded waiting must stay ≤ ~60s worst case."""
    total_ms = (
        S.NAV_DEADLINE_S * 1000
        + S.SETTLE_TIMEOUT_MS
        + S.RENDER_GRACE_MS
        + S.BLANK_RETRY_SETTLE_MS
        + S.RENDER_GRACE_MS
    )
    assert total_ms <= 60_000, f"screenshot wait budget {total_ms}ms exceeds the ~60s step cap"


def test_screenshot_reachable_through_real_load_path(tmp_path):
    """R2 proof (rcust P1): a recipe SCREENSHOT hook declared in recipe_meta.py arrives at
    screenshot._load_screenshot_hook through the REAL orchestrator load path (meta.load — the