# Sub-plan: lasuite-drive recipe robustness PR (fix the root cause upstream)

**Status:** QUEUED. A **recipe-maintainer PR to the lasuite-drive recipe** (we maintain it). Picks up after the Q3.2 lasuite-drive test work settles. Complements, and largely **obsoletes**, the CI-side workarounds the harness currently uses for lasuite-drive's fragility.

**Owner:** Builder + Adversary loops.

**This file:** `/srv/cc-ci/cc-ci-plan/plan-lasuite-drive-recipe-pr.md`

**Relationship:** this is the **recipe-side** deliverable. The cc-ci **harness-side** OIDC-at-install work is `plan-lasuite-drive-oidc-robustness.md` Part A; this sub-plan is its Part B, broken out.

---

## 0. Why (CI surfaced real recipe bugs; fix them at the source)

cc-ci has surfaced genuine fragility in the lasuite-drive recipe that a **real operator would also hit**, currently papered over by CI-side workarounds:

- **Install-time:** backend comes up before collabora's WOPI discovery is ready → transient **WOPI-404** + a **gunicorn-perms** startup race. The flaky 12-service `--chaos` OIDC redeploy.
- **Upgrade-time (F2-12):** upgrading to the heavier new collabora (25.04.9.4.1) **does not converge within abra's monitor window** → abra FATAs. The harness currently works around this by skipping abra's convergence monitor (`-c`) and using its own collabora WOPI-200 `READY_PROBE`.

These are recipe defects. Fixing them upstream helps every lasuite-drive operator **and** lets cc-ci **go back to abra's native convergence** (per the guardrail "prefer abra convergence; custom probe only when necessary"), turning the harness `-c`/READY_PROBE from a *necessity* into a *backstop*.

## 1. The fixes (lasuite-drive recipe)

1. **Collabora healthcheck + start_period (the keystone).** Add a real Docker **healthcheck** to the collabora service (WOPI discovery endpoint returns 200) with a `start_period` generous enough for the heavy 25.04 image to boot.
   Effect: (a) swarm/abra see collabora as *unhealthy until WOPI is actually up*, so **abra's own convergence monitor waits correctly** (fixes F2-12 at the source; no `-c` skip needed); (b) the install-time WOPI-404 window closes because dependents can gate on collabora health.
2. **Backend tolerates / waits for collabora WOPI.** Make backend **retry WOPI discovery with backoff** (and/or order it behind collabora health) instead of failing on the transient 404.
3. **Fix the gunicorn-perms startup race.** Set the volume permissions in the backend entrypoint (or an init step) **before** exec'ing gunicorn, so there's no read/write race on a freshly-mounted volume at startup.
4. **Lazy / retrying OIDC discovery.** Backend resolves the OIDC provider **at first login with retry**, not eagerly at boot, so the app boots cleanly with OIDC env set even if the provider isn't reachable yet. (This is also what the harness-side OIDC-at-install pattern relies on, and what keeps the generic-first invariant safe.)

## 2. Mechanics: branch, PR, and the merge rule

- Make the change on a **lasuite-drive recipe branch** and open a PR via the **`recipe-create-pr` skill** (`/srv/recipe-maintainer/.opencode/skills/recipe-create-pr/SKILL.md`); mirror to `git.autonomic.zone/recipe-maintainers/lasuite-drive` as needed; upstream is `git.coopcloud.tech`.
- **The PR is "working" ONLY when cc-ci verifies it green** (operator rule): trigger cc-ci (`!testme` on the lasuite-drive PR) and require the **full suite incl. the UPGRADE tier** to pass **repeatedly-green** (e.g. 3 consecutive passes, not a one-off), **Adversary cold-verified**. **Only then does the operator merge.** This dogfoods cc-ci: the CI that found the bugs gates the fix.
- **SCOPE (operator, 2026-05-29):** this repeated-green / 3× bar is **specific to lasuite-drive because it was demonstrably FLAKY**: it's a *flakiness proof* (show the fix made it reliably green, not green-by-luck-once).
  It is **NOT the general testing standard.** Normal recipe gates remain **one Adversary cold-verified green** (`plan.md §6.1`); do not generalize 3× to other recipes/gates.

## 3. Definition of done

- [ ] Recipe branch with fixes #1–#4; PR opened (recipe-create-pr).
- [ ] **cc-ci runs the full suite (install + upgrade + backup + restore + custom/OIDC) on the PR, repeatedly green, Adversary cold-verified.**
- [ ] **Root-cause proof:** with the collabora healthcheck in place, demonstrate the upgrade tier passes under **abra's NATIVE convergence** (i.e. drop `-c` for lasuite-drive and it still converges + stays green), confirming the recipe fix resolved F2-12 at the source. If it still needs the harness backstop, say so honestly (record why).
- [ ] Operator merges the recipe PR. Then: cc-ci can **revert the lasuite-drive `-c`/READY_PROBE workaround to abra-native convergence** (per the guardrail), and close the lasuite-drive flaky items.

## 4. Guardrails

- **Don't weaken any test** to make the PR pass: the fixes must make the recipe genuinely robust, proven by repeated-green cc-ci runs, not by loosening assertions.
- **Real abra path** throughout (no docker-level bypass).
- **Bounded:** the four targeted robustness fixes; not a recipe rewrite. Bigger recipe improvements → upstream issues / IDEAS, not this PR.
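
For reference, fix #2 in §1 (backend retries WOPI discovery with backoff instead of failing on the transient 404) could look roughly like the sketch below. It is a minimal illustration, not the backend's actual code: the function names `wait_for_wopi_discovery` and `_http_status`, the attempt count, and the delays are all hypothetical, and the example URL in the usage note assumes Collabora's discovery path is `/hosting/discovery` on a service reachable as `collabora:9980`.

```python
import time
import urllib.error
import urllib.request


def _http_status(url, timeout=5.0):
    """Return the HTTP status for url, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. the transient 404).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout: the service is not up yet.
        return None


def wait_for_wopi_discovery(url, attempts=8, base_delay=1.0, max_delay=30.0,
                            fetch=None, sleep=time.sleep):
    """Poll url until it returns 200, doubling the delay after each failure.

    Returns True once discovery answers 200, False if all attempts are used.
    fetch and sleep are injectable so the loop can be tested without a network.
    """
    fetch = fetch or _http_status
    delay = base_delay
    for attempt in range(1, attempts + 1):
        if fetch(url) == 200:
            return True
        if attempt < attempts:
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the exponential backoff
    return False
```

A caller would run something like `wait_for_wopi_discovery("http://collabora:9980/hosting/discovery")` before treating WOPI as configured, and log or degrade gracefully on `False` rather than crash. The same probe condition (discovery returns 200) is what the collabora healthcheck in fix #1 would assert.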