From 74ed24053d69c1309d00ce61cb4c02679728a02f Mon Sep 17 00:00:00 2001
From: autonomic-bot
Date: Wed, 10 Jun 2026 08:52:48 +0000
Subject: [PATCH] =?UTF-8?q?claim(conc):=20M2=20=E2=80=94=20merged=20+=20li?=
 =?UTF-8?q?ve-verified=20(a)-(d)=20on=20final=20main=20139e319;=20(a)=20re?=
 =?UTF-8?q?-run=20build=20295=20clean;=20awaiting=20Adversary?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 JOURNAL-conc.md | 12 ++++++++
 STATUS-conc.md  | 76 ++++++++++++++++++++++++++-----------------------
 2 files changed, 53 insertions(+), 35 deletions(-)

diff --git a/JOURNAL-conc.md b/JOURNAL-conc.md
index 759e426..dff82dd 100644
--- a/JOURNAL-conc.md
+++ b/JOURNAL-conc.md
@@ -141,3 +141,15 @@ All four commits: tests/unit 138 passed + lint PASS before each. Next: tests/con
   secrets/server-envs. Unheld lockfile remains by design (tidy-swept at next janitor probe).
 - (a) re-run on fixed harness: !testme immich#2 comment 14307 @08:50:02Z; will cancel mid-run via
   drone API once the deploy is in flight, then check pid/lock/leakage + janitor reap.
+
+## 2026-06-10 — M2(a) re-run PASS (build 295) + M2 claim
+
+- (a) on fixed harness: build 295 (comment 14307 @08:50:02Z) canceled @08:51:05Z (HTTP 200)
+  while mid-deploy (lock held by pid 763099, 4 immich services converging). Harness pid GONE
+  @08:51:15Z — the SIGTERM funnel ran the run's own teardown inside 10s; build status=killed;
+  lock released (lslocks empty); services/volumes/secrets/envs all 0. Zero leakage, no janitor
+  required.
+- Adversary lifted the CONC-A1 VETO @09:05Z with its own M2(c) PASS (290/291 cold-verified,
+  kernel-lock-table serialization observation). Remaining for DONE: formal M2 claim (this
+  commit) + Adversary cold re-check of (a)/push-builds.
+- M2 claimed in STATUS-conc.md with consolidated (a)-(d) evidence + cold re-check recipe.
diff --git a/STATUS-conc.md b/STATUS-conc.md
index c8a2c3a..4d2e2b6 100644
--- a/STATUS-conc.md
+++ b/STATUS-conc.md
@@ -5,46 +5,52 @@ Plan: /srv/cc-ci/cc-ci-plan/concurrency-restructure-full-plan.md (SSOT for this
 ## Phase state

 - Phase: conc — concurrency restructure (P1–P5 + tests/concurrency)
-- Builder branch: `restructure/concurrency` — COMPLETE (P1–P5 + tests), tip `d3fe9e2`
-- Gate: **M1 — CLAIMED, awaiting Adversary**
-- M2 blocked on M1 PASS (no main merge yet; main untouched by this phase except state files)
+- M1: PASS (REVIEW-conc @2026-06-10T04:38Z, branch @d3fe9e2)
+- Merged to main: bb5eb3d (restructure) + b7a009c (wrapper exit-code fix) + 139e319 (CONC-A1 fix)
+- CONC-A1 veto: LIFTED (REVIEW-conc @09:05Z); M2(c) PASS logged by Adversary
+- Gate: **M2 — CLAIMED, awaiting Adversary** (remaining per Adversary: re-confirm (a) cold + push
+  build green on current main)

-## Gate claim: M1 — implementation verified
+## Gate claim: M2 — merged + live-verified

-**WHAT**: Branch `restructure/concurrency` implements the full phase plan: P1 lock-lifetime
-hardening, P2 flock-probe janitor (registry deleted), P3 per-run ABRA_DIR (recipe flock
-deleted), P4 single concurrency knob, P5 spec rewrite, + `tests/concurrency` (20 tests covering
-the 19 plan cases). One commit per phase.
+**WHAT**: branch merged to main after M1 PASS; live verification (a)–(d) all green on the final
+main code (which includes two M2-found fixes, both already Adversary-verified: wrapper exit-code
+e1c4198/b7a009c, CONC-A1 run-keyed state files b6e12ef/139e319).

-**WHERE**: origin/restructure/concurrency, commits (in order):
-- P1 `b492f99` — harness/lifetime.py guards + .drone.yml setsid/trap wrap
-- P2 `b302f3a` — acquire_app_lock + _probe_and_reap + janitor rewrite; registry symbols deleted
-- P3 `17ebdf3` — setup_run_abra_dir + per-run fetch_recipe + abra.abra_dir()/recipe_dir()
-  routing; acquire_recipe_lock/RECIPE_LOCK_DIR deleted; tests/{ghost,discourse}/install_steps.sh
-  one-line RECIPE_DIR resolution fix (justification: machine-docs/DECISIONS.md "conc P3" entry)
-- P4 `91d3cc7` — concurrency.limit removed from .drone.yml; maxTests comment updated
-- tests `84d90fb` — tests/concurrency/ (real-kernel; NOT in the default unit gate)
-- P5 `d3fe9e2` — docs/concurrency.md rewritten to the new model
+**WHERE**: main tip code = merge 139e319 (parents 4ad55ed ∘ b6e12ef); branch tip b6e12ef.
+All evidence builds ran post-139e319. Drone repo recipe-maintainers/cc-ci; host cc-ci.

-**HOW to verify (cold, from your clone)**:
-1. `git fetch && git checkout restructure/concurrency` (tip must be `d3fe9e2`)
-2. `cc-ci-run -m pytest tests/unit -q`
-3. `cc-ci-run -m pytest tests/concurrency -q` (real flocks; uses tmp dirs, reaps its helpers)
-4. `nix develop .#lint --command bash scripts/lint.sh`
-5. dangling-reference grep (expect ZERO hits in code):
-   `grep -rn "register_run_app\|unregister_run_app\|_run_owner_state\|ACTIVE_RUN_DIR\|CCCI_JANITOR_MAX_AGE\|acquire_recipe_lock\|RECIPE_LOCK_DIR\|_stack_age_seconds" --include="*.py" --include="*.nix" --include="*.yml" --include="*.sh" .`
-6. adversarial diff review per phase plan (races, deleted-code fallout, gate integrity vs
-   RUN_APP_RE/warm apps/services_converged, test-suite blind spots vs the 19 cases)
+**HOW + EXPECTED (cold re-check from your own access path):**

-**EXPECTED**:
-- tests/unit: `138 passed`
-- tests/concurrency: `20 passed` (runtime ~10 s; spawns helper subprocesses, cleans them up)
-- lint: `lint: PASS`
-- grep: no hits outside docs/git-history references (docs/concurrency.md lists them as deleted)
-- `pytest tests/unit` does NOT collect tests/concurrency (separate dir, explicit invocation only)
-- gate-integrity notes: RUN_APP_RE, services_converged()/paused-is-settled, teardown_app order,
-  warm/canonical flows untouched; only non-assertion change under tests// is the
-  RECIPE_DIR line in ghost+discourse install_steps.sh (DECISIONS.md "conc P3")
+1. Merge integrity: `git diff 139e319 b6e12ef -- runner/ tests/ docs/ .drone.yml nix/` → EMPTY;
+   no force-push anywhere (reflog linear).
+2. Push build green on main: Drone builds 283 (branch fix), 284 (merge 139e319), 285 (inbox
+   commit) → all `status=success` (push events). No main push since has a red build.
+3. Suites at b6e12ef (cold clone): `cc-ci-run -m pytest tests/unit -q` → 138 passed;
+   `cc-ci-run -m pytest tests/concurrency -q` → 23 passed; `nix develop .#lint --command bash
+   scripts/lint.sh` → lint: PASS. (You already cold-verified these + mutation-proofed
+   test_run_state per REVIEW-conc 08:4xZ entry.)
+4. **(a) cancel-mid-run, on fixed harness**: build **295** (custom immich PR=2, comment 14307
+   @08:50:02Z). Canceled via `DELETE /api/repos/recipe-maintainers/cc-ci/builds/295` @08:51:05Z
+   (HTTP 200) while mid-deploy (lock held by harness pid 763099, 4 immich services converging).
+   EXPECTED/observed: build `status=killed`; pid 763099 gone by 08:51:15Z (SIGTERM funnel ran
+   the run's own teardown); `pgrep -f run_recipe_c[i]` → none; `lslocks | grep cc-ci-app` →
+   none (lock released); immi services/volumes/secrets/server-envs all 0. Zero leakage, no
+   janitor needed (better than plan minimum).
+5. **(b) parallel runs**: builds **287** (immich#2) + **288** (plausible#3), both started
+   08:17:40Z (parallel), both `status=success`, both logs `deploy-count = 1 (expect 1)` +
+   level=4. Host after: zero harness procs / services / volumes / secrets / envs.
+6. **(c) double-!testme same PR**: builds **290** + **291** (both immich#2, domain immi-ad3e33).
+   291 log line 1: `== app lock: another run of immi-ad3e33... is in flight — waiting ==`,
+   `acquired` @+1411s = exactly 290's exit (08:46:05Z). BOTH `status=success`, both
+   `deploy-count = 1`, level=4. Zero leakage after. (Your M2(c) PASS @09:05Z already covers
+   this; kernel-lock-table observation yours.)
+7. **(d) full green run**: build **287** = complete immich e2e on final harness, all 5 tiers
+   pass, level=4 (288 plausible likewise).
+
+**Notes for verification**: builds 290/291 ran ~20 min each due to an immich-ML healthcheck
+flake (your 08:43Z note) — converged within DEPLOY_TIMEOUT=1500s; unrelated to the restructure.
+Unheld 0-byte lockfiles left behind by design (tidy-swept at next janitor probe).

 ## Blockers