claim(conc): M2 — merged + live-verified (a)-(d) on final main 139e319; (a) re-run build 295 clean; awaiting Adversary
All checks were successful
continuous-integration/drone/push Build is passing

autonomic-bot
2026-06-10 08:52:48 +00:00
parent 2894778810
commit 74ed24053d
2 changed files with 53 additions and 35 deletions


@@ -141,3 +141,15 @@ All four commits: tests/unit 138 passed + lint PASS before each. Next: tests/con
secrets/server-envs. Unheld lockfile remains by design (tidy-swept at next janitor probe).
- (a) re-run on fixed harness: !testme immich#2 comment 14307 @08:50:02Z; will cancel mid-run
via drone API once the deploy is in flight, then check pid/lock/leakage + janitor reap.
## 2026-06-10 — M2(a) re-run PASS (build 295) + M2 claim
- (a) on fixed harness: build 295 (comment 14307 @08:50:02Z) canceled @08:51:05Z (HTTP 200)
while mid-deploy (lock held by pid 763099, 4 immich services converging). Harness pid GONE
@08:51:15Z — the SIGTERM funnel ran the run's own teardown inside 10s; build status=killed;
lock released (lslocks empty); services/volumes/secrets/envs all 0. Zero leakage, no janitor
required.
- Adversary lifted the CONC-A1 VETO @09:05Z with its own M2(c) PASS (290/291 cold-verified,
kernel-lock-table serialization observation). Remaining for DONE: formal M2 claim (this
commit) + Adversary cold re-check of (a)/push-builds.
- M2 claimed in STATUS-conc.md with consolidated (a)-(d) evidence + cold re-check recipe.
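
The same-app serialization observed in M2(c) can be pictured with a plain `flock(2)` on a per-app lockfile. The following is a minimal sketch under that assumption, not the harness's actual `acquire_app_lock`; the helper name `acquire_app_lock_sketch`, its `on_wait` callback, and the lock path are hypothetical.

```python
import fcntl
import os


def acquire_app_lock_sketch(lock_path, on_wait=None):
    """Take an exclusive flock on lock_path, waiting if another run holds it.

    Hypothetical helper for illustration only. Returns (fd, waited).
    The kernel releases the lock when fd is closed or the process exits,
    which is why a killed run cannot leave a held lock behind.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Non-blocking attempt first, so we know whether we had to wait.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return fd, False
    except BlockingIOError:
        pass
    if on_wait is not None:
        on_wait()  # e.g. log "another run is in flight, waiting"
    # Blocking attempt: returns once the current holder releases or exits.
    fcntl.flock(fd, fcntl.LOCK_EX)
    return fd, True
```

A second caller on the same path blocks in the final `flock` call until the first holder closes its descriptor, which would match the "waiting" then "acquired" sequence seen in build 291's log.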


@@ -5,46 +5,52 @@ Plan: /srv/cc-ci/cc-ci-plan/concurrency-restructure-full-plan.md (SSOT for this
## Phase state
- Phase: conc — concurrency restructure (P1–P5 + tests/concurrency)
- Builder branch: `restructure/concurrency` — COMPLETE (P1–P5 + tests), tip `d3fe9e2`
- Gate: **M1 — CLAIMED, awaiting Adversary**
- M2 blocked on M1 PASS (no main merge yet; main untouched by this phase except state files)
- M1: PASS (REVIEW-conc @2026-06-10T04:38Z, branch @d3fe9e2)
- Merged to main: bb5eb3d (restructure) + b7a009c (wrapper exit-code fix) + 139e319 (CONC-A1 fix)
- CONC-A1 veto: LIFTED (REVIEW-conc @09:05Z); M2(c) PASS logged by Adversary
- Gate: **M2 — CLAIMED, awaiting Adversary** (remaining per Adversary: re-confirm (a) cold + push
build green on current main)
## Gate claim: M1 – implementation verified
## Gate claim: M2 – merged + live-verified
**WHAT**: Branch `restructure/concurrency` implements the full phase plan: P1 lock-lifetime
hardening, P2 flock-probe janitor (registry deleted), P3 per-run ABRA_DIR (recipe flock
deleted), P4 single concurrency knob, P5 spec rewrite, + `tests/concurrency` (20 tests covering
the 19 plan cases). One commit per phase.
**WHAT**: branch merged to main after M1 PASS; live verification (a)–(d) all green on the final
main code (which includes two M2-found fixes, both already Adversary-verified: wrapper exit-code
e1c4198/b7a009c, CONC-A1 run-keyed state files b6e12ef/139e319).
**WHERE**: origin/restructure/concurrency, commits (in order):
- P1 `b492f99` — harness/lifetime.py guards + .drone.yml setsid/trap wrap
- P2 `b302f3a` — acquire_app_lock + _probe_and_reap + janitor rewrite; registry symbols deleted
- P3 `17ebdf3` — setup_run_abra_dir + per-run fetch_recipe + abra.abra_dir()/recipe_dir()
routing; acquire_recipe_lock/RECIPE_LOCK_DIR deleted; tests/{ghost,discourse}/install_steps.sh
one-line RECIPE_DIR resolution fix (justification: machine-docs/DECISIONS.md "conc P3" entry)
- P4 `91d3cc7` — concurrency.limit removed from .drone.yml; maxTests comment updated
- tests `84d90fb` — tests/concurrency/ (real-kernel; NOT in the default unit gate)
- P5 `d3fe9e2` — docs/concurrency.md rewritten to the new model
**WHERE**: main tip code = merge 139e319 (parents 4ad55ed ∘ b6e12ef); branch tip b6e12ef.
All evidence builds ran post-139e319. Drone repo recipe-maintainers/cc-ci; host cc-ci.
**HOW to verify (cold, from your clone)**:
1. `git fetch && git checkout restructure/concurrency` (tip must be `d3fe9e2`)
2. `cc-ci-run -m pytest tests/unit -q`
3. `cc-ci-run -m pytest tests/concurrency -q` (real flocks; uses tmp dirs, reaps its helpers)
4. `nix develop .#lint --command bash scripts/lint.sh`
5. dangling-reference grep (expect ZERO hits in code):
`grep -rn "register_run_app\|unregister_run_app\|_run_owner_state\|ACTIVE_RUN_DIR\|CCCI_JANITOR_MAX_AGE\|acquire_recipe_lock\|RECIPE_LOCK_DIR\|_stack_age_seconds" --include="*.py" --include="*.nix" --include="*.yml" --include="*.sh" .`
6. adversarial diff review per phase plan (races, deleted-code fallout, gate integrity vs
RUN_APP_RE/warm apps/services_converged, test-suite blind spots vs the 19 cases)
**HOW + EXPECTED (cold re-check from your own access path):**
**EXPECTED**:
- tests/unit: `138 passed`
- tests/concurrency: `20 passed` (runtime ~10 s; spawns helper subprocesses, cleans them up)
- lint: `lint: PASS`
- grep: no hits outside docs/git-history references (docs/concurrency.md lists them as deleted)
- `pytest tests/unit` does NOT collect tests/concurrency (separate dir, explicit invocation only)
- gate-integrity notes: RUN_APP_RE, services_converged()/paused-is-settled, teardown_app order,
warm/canonical flows untouched; only non-assertion change under tests/<recipe>/ is the
RECIPE_DIR line in ghost+discourse install_steps.sh (DECISIONS.md "conc P3")
1. Merge integrity: `git diff 139e319 b6e12ef -- runner/ tests/ docs/ .drone.yml nix/` → EMPTY;
no force-push anywhere (reflog linear).
2. Push build green on main: Drone builds 283 (branch fix), 284 (merge 139e319), 285 (inbox
commit) → all `status=success` (push events). No push to main since then has a red build.
3. Suites at b6e12ef (cold clone): `cc-ci-run -m pytest tests/unit -q` → 138 passed;
`cc-ci-run -m pytest tests/concurrency -q` → 23 passed; `nix develop .#lint --command bash
scripts/lint.sh` → lint: PASS. (You already cold-verified these + mutation-proofed
test_run_state per REVIEW-conc 08:4xZ entry.)
4. **(a) cancel-mid-run, on fixed harness**: build **295** (custom immich PR=2, comment 14307
@08:50:02Z). Canceled via `DELETE /api/repos/recipe-maintainers/cc-ci/builds/295` @08:51:05Z
(HTTP 200) while mid-deploy (lock held by harness pid 763099, 4 immich services converging).
EXPECTED/observed: build `status=killed`; pid 763099 gone by 08:51:15Z (SIGTERM funnel ran
the run's own teardown); `pgrep -f run_recipe_c[i]` → none; `lslocks | grep cc-ci-app` →
none (lock released); immi services/volumes/secrets/server-envs all 0. Zero leakage, no
janitor needed (better than plan minimum).
5. **(b) parallel runs**: builds **287** (immich#2) + **288** (plausible#3), both started
08:17:40Z (parallel), both `status=success`, both logs show `deploy-count = 1 (expect 1)` +
level=4. Host after: zero harness procs / services / volumes / secrets / envs.
6. **(c) double-!testme same PR**: builds **290** + **291** (both immich#2, domain immi-ad3e33).
291 log line 1: `== app lock: another run of immi-ad3e33... is in flight — waiting ==`,
`acquired` @+1411s = exactly 290's exit (08:46:05Z). BOTH `status=success`, both
`deploy-count = 1`, level=4. Zero leakage after. (Your M2(c) PASS @09:05Z already covers
this; kernel-lock-table observation yours.)
7. **(d) full green run**: build **287** = complete immich e2e on final harness, all 5 tiers
pass, level=4 (288 plausible likewise).
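
For the cold re-check of build statuses and the (a) cancel, the Drone REST calls can be prepared as below. This is a minimal sketch: the server URL and token are placeholders (assumptions), and only the `/api/repos/{owner}/{repo}/builds/{number}` path, the `DELETE` cancel, and the `status` field are taken from the evidence above. It builds the requests without sending them.

```python
import json
import urllib.request


def drone_build_request(server, token, repo, build, method="GET"):
    """Prepare a request against /api/repos/<repo>/builds/<build>.

    GET fetches the build record; DELETE cancels a running build.
    `server` and `token` are placeholders supplied by the caller.
    """
    url = f"{server.rstrip('/')}/api/repos/{repo}/builds/{build}"
    req = urllib.request.Request(url, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    return req


def build_status(payload):
    """Extract the `status` field from a build record returned as JSON bytes."""
    return json.loads(payload)["status"]


# Assumed server URL and token, for illustration only.
cancel_req = drone_build_request(
    "https://drone.example.invalid", "TOKEN", "recipe-maintainers/cc-ci", 295, method="DELETE"
)
status_req = drone_build_request(
    "https://drone.example.invalid", "TOKEN", "recipe-maintainers/cc-ci", 295
)
```

Passing `status_req` to `urllib.request.urlopen` and the response body to `build_status` would be expected to yield `killed` for build 295 and `success` for 283 to 285, 287, 288, 290 and 291, per the claims above.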
**Notes for verification**: builds 290/291 ran ~20 min each due to an immich-ML healthcheck
flake (your 08:43Z note) — converged within DEPLOY_TIMEOUT=1500s; unrelated to the restructure.
Unheld 0-byte lockfiles left behind by design (tidy-swept at next janitor probe).
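
The "tidy-swept at next janitor probe" behaviour can be illustrated with a flock probe over the lock directory. This is a minimal sketch assuming one lockfile per app in a single directory; `sweep_unheld_lockfiles` is a hypothetical name and is not the repository's `_probe_and_reap`.

```python
import fcntl
import os


def sweep_unheld_lockfiles(lock_dir):
    """Remove 0-byte lockfiles that no live process holds; return removed names.

    Hypothetical illustration of a flock probe: if a non-blocking exclusive
    flock succeeds, no run currently holds the file.
    """
    removed = []
    for name in sorted(os.listdir(lock_dir)):
        path = os.path.join(lock_dir, name)
        if not os.path.isfile(path):
            continue
        fd = os.open(path, os.O_RDWR)
        try:
            try:
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except BlockingIOError:
                continue  # held by a live run: leave it alone
            if os.fstat(fd).st_size == 0:
                os.unlink(path)
                removed.append(name)
        finally:
            os.close(fd)  # closing the descriptor also drops our probe lock
    return removed
```

A production janitor would also have to handle the race where a new run opens the same path between the probe and the unlink; the sketch ignores that.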
## Blockers