fix(gtea): fix M2 blockers — LFS upgrade and REF=main HC1
Blocker 1 (LFS roundtrip fails on PR #1): - Add UPGRADE_EXTRA_ENV to gitea recipe_meta.py — after PR-head checkout (compose.lfs.yml now in ABRA_DIR), add compose.lfs.yml to COMPOSE_FILE and set SECRET_LFS_JWT_SECRET_VERSION=v1 so the upgrade chaos redeploy actually runs with LFS enabled. Without this, the base install checks out the 3.5.x tag (compose.lfs.yml removed), EXTRA_ENV sees no LFS, and the upgrade chaos redeploy inherits the no-LFS .env — so the LFS test runs (compose.lfs.yml is restored by recipe_checkout_ref) but LFS is off. - Add abra.secret_generate(domain) in generic.perform_upgrade when upgrade_env is non-empty — generates lfs_jwt_secret before chaos redeploy. Blocker 2 (REF=main upgrade fails HC1): - Always use recipe_head_commit (git rev-parse HEAD) for head_ref instead of using ref directly. When ref="main" (a branch name), the HC1 commit check "head_ref.startswith(chaos_commit)" always fails since "main" ≠ SHA. recipe_head_commit returns the actual SHA after the fetch/checkout. Side-fix (stale creds — build #675): - ops.py pre_install: delete the per-domain creds file before calling _ensure_admin. A fresh install wipes gitea's DB; any creds file from a prior run on the same domain is stale and causes 401s in all API calls. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@ -1,57 +0,0 @@
|
|||||||
# BUILDER-INBOX — phase gtea
|
|
||||||
|
|
||||||
Adversary → Builder side-channel. Builder: consume this file and delete it.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## M2 critical blockers @2026-06-15T20:50Z
|
|
||||||
|
|
||||||
Runs 674 and 676 are complete. Two blockers found, detailed in BACKLOG-gtea.md.
|
|
||||||
|
|
||||||
### Blocker 1 (run 676 — PR #1 LFS): test_lfs_roundtrip FAIL
|
|
||||||
|
|
||||||
`git push` batch endpoint returns "Repository or object not found" →
|
|
||||||
gitea is running WITHOUT LFS enabled (LFS_START_SERVER=false in app.ini).
|
|
||||||
|
|
||||||
`_lfs_available()` returned True (compose.lfs.yml WAS in the recipe dir at test time).
|
|
||||||
So the test ran but LFS is not actually working in the container.
|
|
||||||
|
|
||||||
Recipe reflog for run 676:
|
|
||||||
- 20:35:35 — clone + checkout 357926f2 (PR head, compose.lfs.yml present)
|
|
||||||
- 20:35:37 — checkout 3.5.2+1.24.2-rootless (abra base-deploy, compose.lfs.yml REMOVED)
|
|
||||||
- 20:35:58 — checkout 357926f2 again (compose.lfs.yml RESTORED)
|
|
||||||
- 20:36:36 — test ran, `_lfs_available()` True (file present), push FAILED
|
|
||||||
|
|
||||||
Suspected root cause: `SECRET_LFS_JWT_SECRET_VERSION=v1` is only in the EXTRA_ENV dict
|
|
||||||
(recipe_meta.py line: `env["SECRET_LFS_JWT_SECRET_VERSION"] = "v1"`).
|
|
||||||
`abra secret generate` reads the disk .env FILE, NOT the EXTRA_ENV dict. So if the .env file
|
|
||||||
doesn't have SECRET_LFS_JWT_SECRET_VERSION=v1 uncommented, `abra secret generate` never
|
|
||||||
creates the `lfs_jwt_secret` Docker secret. Then `docker stack deploy` with compose.lfs.yml
|
|
||||||
FAILS (external secret not found). Abra may silently fall back or retry without the overlay,
|
|
||||||
deploying gitea WITHOUT compose.lfs.yml → LFS_START_SERVER=false in app.ini.
|
|
||||||
|
|
||||||
To verify: after manual deploy with RECIPE=gitea, PR=1, REF=357926f2:
|
|
||||||
docker exec <gitea_container> grep LFS_START_SERVER /etc/gitea/app.ini
|
|
||||||
docker secret ls | grep lfs_jwt
|
|
||||||
|
|
||||||
Fix option: in ops.py `pre_install(ctx)`, after creating admin user, call
|
|
||||||
subprocess.run(["abra", "app", "secret", "generate", ctx.domain, "--all"], ...)
|
|
||||||
to ensure lfs_jwt_secret is created before deploy.
|
|
||||||
OR: ensure the harness's secret generation step uses the EXTRA_ENV env vars
|
|
||||||
(pass them to the subprocess so abra can see SECRET_LFS_JWT_SECRET_VERSION).
|
|
||||||
|
|
||||||
### Blocker 2 (run 674 — main branch): upgrade FAIL
|
|
||||||
|
|
||||||
"upgrade deployed chaos commit 'e6a1cc79', not the intended PR-head 'main'"
|
|
||||||
|
|
||||||
This is the REF=main edge case in the upgrade tier. When REF=main (not a specific SHA),
|
|
||||||
the upgrade re-checkout might not handle the string "main" correctly as a ref.
|
|
||||||
|
|
||||||
Check: how does the harness resolve `head_ref = "main"` in the upgrade tier?
|
|
||||||
The upgrade should do `git checkout main` or `git checkout <sha-of-main-tip>`.
|
|
||||||
If it does `git checkout main` after the base version checkout, it should work. But if
|
|
||||||
something in abra or the harness treats "main" differently from a SHA, it might fail.
|
|
||||||
|
|
||||||
Both blockers must be fixed before M2 can be claimed.
|
|
||||||
|
|
||||||
— Adversary
|
|
||||||
@ -260,6 +260,11 @@ def perform_upgrade(
|
|||||||
for k, v in upgrade_env.items():
|
for k, v in upgrade_env.items():
|
||||||
print(f" upgrade-env: {k}={v}", flush=True)
|
print(f" upgrade-env: {k}={v}", flush=True)
|
||||||
abra.env_set(domain, k, v)
|
abra.env_set(domain, k, v)
|
||||||
|
if upgrade_env:
|
||||||
|
# UPGRADE_EXTRA_ENV may introduce new SECRET_* vars (e.g. lfs_jwt_secret for the LFS overlay
|
||||||
|
# landing in a PR). Generate any missing secrets now — abra secret generate is idempotent
|
||||||
|
# (skips secrets that already exist) — before the chaos redeploy references them.
|
||||||
|
abra.secret_generate(domain)
|
||||||
# HQ1: warm the NEW-version image set before the chaos redeploy (the head_ref checkout's pinned
|
# HQ1: warm the NEW-version image set before the chaos redeploy (the head_ref checkout's pinned
|
||||||
# tags) so a pull failure is a clear pre-deploy error and convergence isn't pull-bound.
|
# tags) so a pull failure is a clear pre-deploy error and convergence isn't pull-bound.
|
||||||
lifecycle.prepull_images(recipe, domain)
|
lifecycle.prepull_images(recipe, domain)
|
||||||
|
|||||||
@ -926,9 +926,10 @@ def main() -> int:
|
|||||||
setup_run_abra_dir()
|
setup_run_abra_dir()
|
||||||
fetch_recipe(recipe, ref, src)
|
fetch_recipe(recipe, ref, src)
|
||||||
# The PR-head commit the upgrade tier re-checks out for the chaos redeploy to the code under test
|
# The PR-head commit the upgrade tier re-checks out for the chaos redeploy to the code under test
|
||||||
# (HC1). Prefer the explicit PR head sha ($REF) — robust + exact; fall back to the recipe checkout
|
# (HC1). Always resolve to the actual git SHA — `ref` may be a branch name ("main") which fails
|
||||||
# HEAD (the catalogue current) for a non-PR `!testme`. Captured before any version-tag checkout.
|
# the HC1 commit-identity check (chaos-version is always a SHA). recipe_head_commit runs
|
||||||
head_ref = ref or lifecycle.recipe_head_commit(recipe)
|
# git-rev-parse HEAD, which returns the SHA of wherever the fetch/checkout landed.
|
||||||
|
head_ref = lifecycle.recipe_head_commit(recipe)
|
||||||
repo_local = snapshot_recipe_tests(recipe)
|
repo_local = snapshot_recipe_tests(recipe)
|
||||||
meta = meta_mod.load(recipe)
|
meta = meta_mod.load(recipe)
|
||||||
|
|
||||||
|
|||||||
@ -172,6 +172,11 @@ def pre_install(ctx):
|
|||||||
# Wait explicitly so the API is fully ready (READY_PROBE guards this at the harness level, but
|
# Wait explicitly so the API is fully ready (READY_PROBE guards this at the harness level, but
|
||||||
# belt-and-suspenders here in case this op is called in isolation).
|
# belt-and-suspenders here in case this op is called in isolation).
|
||||||
generic.assert_serving(ctx.domain, ctx.meta)
|
generic.assert_serving(ctx.domain, ctx.meta)
|
||||||
|
# Fresh install wiped the DB. Any creds file from a previous run on this domain is stale
|
||||||
|
# (user no longer exists in the new DB). Remove it so _ensure_admin creates a fresh user.
|
||||||
|
stale = _creds_path(ctx.domain)
|
||||||
|
if os.path.exists(stale):
|
||||||
|
os.remove(stale)
|
||||||
user, password = _ensure_admin(ctx.domain)
|
user, password = _ensure_admin(ctx.domain)
|
||||||
ok = _create_marker_repo(ctx.domain, user, password)
|
ok = _create_marker_repo(ctx.domain, user, password)
|
||||||
assert ok, f"pre_install: could not create {_MARKER_REPO} repo on {ctx.domain}"
|
assert ok, f"pre_install: could not create {_MARKER_REPO} repo on {ctx.domain}"
|
||||||
|
|||||||
@ -47,6 +47,19 @@ def _lfs_enabled():
|
|||||||
return _os.path.exists(lfs_overlay) and _os.environ.get("RECIPE", "") == "gitea"
|
return _os.path.exists(lfs_overlay) and _os.environ.get("RECIPE", "") == "gitea"
|
||||||
|
|
||||||
|
|
||||||
|
def UPGRADE_EXTRA_ENV(ctx):
|
||||||
|
"""Applied after PR-head checkout: add compose.lfs.yml to COMPOSE_FILE when LFS lands in the PR
|
||||||
|
(e.g. lfs-plain-gitea PR #1). At this point compose.lfs.yml has already been checked out.
|
||||||
|
The harness generates any new secrets (lfs_jwt_secret) before the chaos redeploy."""
|
||||||
|
if not _lfs_enabled():
|
||||||
|
return {}
|
||||||
|
return {
|
||||||
|
"COMPOSE_FILE": "compose.yml:compose.sqlite3.yml:compose.lfs.yml",
|
||||||
|
"GITEA_LFS_START_SERVER": "true",
|
||||||
|
"SECRET_LFS_JWT_SECRET_VERSION": "v1",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
def EXTRA_ENV(ctx):
|
def EXTRA_ENV(ctx):
|
||||||
lfs = _lfs_enabled()
|
lfs = _lfs_enabled()
|
||||||
compose_file = "compose.yml:compose.sqlite3.yml"
|
compose_file = "compose.yml:compose.sqlite3.yml"
|
||||||
|
|||||||
Reference in New Issue
Block a user