From aab77ea0f356bb975f8993070f2e856faae3ffc6 Mon Sep 17 00:00:00 2001
From: autonomic-bot <mfowler.email@protonmail.com>
Date: Fri, 29 May 2026 11:47:38 +0100
Subject: [PATCH] =?UTF-8?q?review(2):=20FAIL=20gate=20Q3.2=20lasuite-drive?=
 =?UTF-8?q?=20(claim=20911680f/code=204b38b66)=20=E2=80=94=20cold=20re-run?=
 =?UTF-8?q?=20upgrade=20tier=20FAILS=20(abra=20chaos-deploy=20FATA:=20new?=
 =?UTF-8?q?=20collabora=2025.04.9.4.1=20not=20converged;=20WOPI=20pre-gate?=
 =?UTF-8?q?=20DID=20work).=20install/backup/restore/custom+OIDC=20pass,=20?=
 =?UTF-8?q?deploy-count=3D1,=20teardown=20clean.=20Filed=20F2-12=20BLOCKIN?=
 =?UTF-8?q?G?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
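Note outside the commit message (text between `---` and the diff is dropped by `git am`): the
"wait for new-collabora health before asserting reconverge" direction in F2-12 below could be
sketched roughly as follows. This is a hypothetical sketch, not the runner's actual code:
`wait_until_ready`, `http_probe`, the 600 s budget, and the choice of Collabora's
`/hosting/discovery` endpoint as the health signal are all assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout_s: float = 5.0) -> bool:
    """Return True if a GET on `url` answers 200; any error counts as not ready."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    budget_s: float = 600.0,   # assumed budget, not a measured value
    interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns True or `budget_s` elapses."""
    deadline = clock() + budget_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller might run `wait_until_ready(lambda: http_probe("https://<collabora-host>/hosting/discovery"))`
after the redeploy and only then assert reconverge. The clock and sleep are injectable so the wait
can be unit-tested without real delays.
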
 machine-docs/BACKLOG-2.md | 22 +++++++++++++++
 machine-docs/REVIEW-2.md  | 58 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 80 insertions(+)
diff --git a/machine-docs/BACKLOG-2.md b/machine-docs/BACKLOG-2.md
index 3b7cd43..34019ca 100644
--- a/machine-docs/BACKLOG-2.md
+++ b/machine-docs/BACKLOG-2.md
@@ -536,3 +536,25 @@ Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase2-recipe-tests.md`
       gate-blocker, but Q0 cannot be considered "complete" in the broad sense of the §6 enumeration
       until those primitives ship in Q2/Q3. Recording so a future Q2/Q3 verdict checks them off.
       - Filed by Adversary @2026-05-28.
+
+- [ ] **F2-12 [adversary] — BLOCKS Q3.2 gate** — lasuite-drive **upgrade tier FAILS on cold re-run**,
+      contradicting the claim "full lifecycle 3× green". Cold-verified @2026-05-29 from `/root/adv-verify`
+      @ origin/main `911680f` (code `4b38b66`, git==host). `RECIPE=lasuite-drive PR=0 cc-ci-run
+      runner/run_recipe_ci.py` → RUN SUMMARY: install/backup/restore/custom **pass**, **upgrade FAIL**,
+      deploy-count=1.
+      - **Symptom:** the prev→PR-head chaos upgrade redeploy does not converge:
+        `!! upgrade op failed: abra app deploy lasu-<hex>… failed (1)` → `FATA deploy failed 🛑`
+        (abra log `/root/.abra/logs/default/lasu-…2026-05-29T103335Z`). Heavy crossover: collabora/code
+        25.04.9.1.1→25.04.9.4.1, drive-backend/-frontend v0.12.0→v0.18.0, onlyoffice 9.2→9.3.1.2.
+        The NEW collabora is still in jail/config init (`Kit core version…`, many `Linking file…`,
+        `etc/* needs to be updated`) when abra's convergence poll gives up.
+      - **NOT the WOPI pre-gate** — that fix worked: `pre_upgrade: collabora WOPI discovery ready (200)`.
+        The gap is NEW-collabora convergence within abra's upgrade poll window, not OLD-collabora readiness.
+      - **Repro steps:** `RECIPE=lasuite-drive PR=0 cc-ci-run runner/run_recipe_ci.py`; observe upgrade fail.
+      - **Likely fix direction (Builder's call):** raise the abra per-service convergence timeout for the
+        upgrade redeploy (recipe-internal TIMEOUT/`DEPLOY_TIMEOUT` covers the python subprocess, but abra's
+        own poll emitted FATA), and/or wait for new-collabora health before asserting reconverge.
+      - **Close condition (Adversary-owned):** upgrade tier GREEN on **my** cold re-run (repeat-green),
+        per my standing veto-eligible obligation (disk lifted; deferral void). Full verdict: REVIEW-2.md
+        "## Q3.2 lasuite-drive — FAIL @2026-05-29".
+      - Filed by Adversary @2026-05-29.
diff --git a/machine-docs/REVIEW-2.md b/machine-docs/REVIEW-2.md
index fb73dfe..8419971 100644
--- a/machine-docs/REVIEW-2.md
+++ b/machine-docs/REVIEW-2.md
@@ -821,3 +821,61 @@ ahead of the claim so my verdict is instant. Findings to carry into the gate (re
 (install = 1 deploy, no mid-run reconverge), the now-REQUIRED **upgrade tier GREEN** (disk lifted),
 repeat-green + my own cold re-run reading the assertions. This note is recon only — NO PASS/FAIL until
 the Builder claims the gate.
+
+## Q3.2 lasuite-drive — FAIL @2026-05-29 (cold-verify; gate claim 911680f / code 4b38b66)
+Cold-verified from my own clone `/root/adv-verify` synced to origin/main `911680f` (claim commit is
+**docs-only** — BACKLOG-2/DEFERRED/STATUS-2; verified *code* == `4b38b66`. git==host confirmed:
+Builder `/root/builder-clone` @ 4b38b66, deploy tree clean). Ran `RECIPE=lasuite-drive PR=0 cc-ci-run
+runner/run_recipe_ci.py` from /root/adv-verify (log `/root/adv-q32-102348.log`).
+
+**Result — RUN SUMMARY (verbatim):**
+```
+deploy-count = 1 (expect 1)
+  install : pass
+  upgrade : fail        <-- FAILS the gate (claim said full lifecycle 3x green)
+  backup  : pass
+  restore : pass
+  custom  : pass
+```
+
+**Root cause (from the actual log + abra deploy log — NOT the WOPI gate):** the collabora WOPI-discovery
+pre-upgrade gate **worked** — log line 43: `pre_upgrade: collabora WOPI discovery ready (200) on
+collabora-lasu-cbcdd6.ci.commoninternet.net`. The failure is the **chaos upgrade deploy itself not
+converging**: line 44 `!! upgrade op failed: abra app deploy lasu-cbcdd6.ci.commoninternet.net -o -n -C
+failed (1)` → `INFO polling deployment status` → `FATA deploy failed 🛑`
+(abra log `/root/.abra/logs/default/lasu-cbcdd6...2026-05-29T103335Z`). This was a real prev→PR-head
+crossover with heavy image bumps — collabora/code 25.04.9.1.1→**25.04.9.4.1**, drive-backend
+v0.12.0→**v0.18.0**, drive-frontend v0.12.0→**v0.18.0**, onlyoffice 9.2→**9.3.1.2**, nginx 1.29→1.30,
+redis 8→8.6.3. The abra deploy log shows the NEW collabora still doing lengthy jail/config init
+(`Kit core version …`, hundreds of `Linking file …` lines, `child-roots/.../etc/* needs to be updated`)
+when abra's convergence poll gave up. So the failure is the upgrade redeploy timing out while waiting
+for the new collabora to become healthy; the pre-deploy gate is not at fault.
+
+**Why FAIL, not a flake-to-retry:**
+- The claim is **"flakiness gone, full lifecycle 3× green"** (r2/r3/r4). My **first independent cold
+  run** does NOT reproduce green — the upgrade tier fails. That contradicts "reproducibly green."
+- Upgrade-tier GREEN is my **standing veto-eligible obligation** (disk lifted; deferral void). My
+  stated criteria required **repeat-green + my own cold re-run** of the upgrade tier. It failed on my run.
+- The new-collabora-convergence timeout is the *same class* of collabora-timing problem `4b38b66` set
+  out to fix; the WOPI pre-gate addresses readiness of the OLD collabora before redeploy, but does not
+  ensure the NEW collabora (heavier 25.04.9.4.1) converges within abra's upgrade poll window. The fix
+  is incomplete for the crossover it claims to make green.
+
+**What DID verify (fix is partial, not worthless):**
+- **Part A install-time OIDC — GREEN & real.** `deploy-count = 1` (single deploy, no post-deploy
+  `--chaos` reconverge); log: `using live-warm keycloak … per-run realm`, `install_steps: OIDC env wired
+  into .env (… no reconverge)`; `test_oidc_password_grant_against_dep_keycloak` **PASSED, not skipped**
+  (real password-grant JWT vs a per-run realm). **Real-abra-only confirmed** — no `docker service
+  update/scale` patching of app state (the lone `service scale …minio-createbuckets` triggers the
+  recipe's own `replicas:0` one-shot; established as acceptable in my pre-claim recon).
+- **install + backup + restore + custom all pass**; `test_minio_storage` (S3 round-trip) PASSED.
+- **Teardown sacred:** post-run NO `lasu` stacks, NO per-run `lasu` volumes; warm-keycloak + warm
+  custom-html canonical volumes intact (prune/teardown didn't touch the cache).
+
+**FILED: F2-12 [adversary] (BLOCKS the Q3.2 gate).** No phase `## VETO`. Q3.2 cannot PASS until the
+**upgrade tier runs GREEN on my own cold re-run** (repeat-green). Likely real fixes for the Builder to
+consider: raise the abra upgrade convergence timeout for the new-collabora crossover (the recipe-internal
+TIMEOUT/`DEPLOY_TIMEOUT` covers the python subprocess, but abra's own per-service convergence poll is
+what emitted `FATA deploy failed`), and/or a post-redeploy collabora-health wait before asserting
+reconverge. Anti-anchoring honored: verdict formed from the plan + code + my own run's observable log;
+I did NOT read JOURNAL-2 before writing this.