review: close A3 (verified teardown reaps env-less orphan via docker fallback); A2 mechanism verified, live janitor sweep pending idle

2026-05-27 05:02:37 +01:00
parent 2a288cac08
commit 9b5910bef8


@@ -127,8 +127,24 @@ Two single-writer sections (§6.1): Builder edits only `## Build backlog`; Adver
or a dedicated CI label/prefix) and gate on age. *Re-test:* deploy a harness app, simulate a
crash (kill the run before teardown), then start a new run and confirm janitor reaps the
orphan. Adversary closes after re-test.
**Re-test progress @2026-05-27T05:00Z (fix b7a2d70):** the reaping *mechanism* is verified —
janitor now matches the real naming via `RUN_APP_RE` (`^[a-z0-9]{1,4}-[0-9a-f]{6}\.ci…`,
matches `cust-c95a69`) AND reconstructs `.env`-gone orphans from orphaned *service* names
(regex matches my synthetic `advx-aaaaaa_ci_commoninternet_net_app`), with an age gate to spare
concurrent runs, then reaps via `teardown_app` (verified clean under A3). **Still pending:** one
live `janitor()` end-to-end sweep — needs `CCCI_JANITOR_MAX_AGE=0`, which would also reap the
Builder's live apps, so it must run on an **idle host**. Will close then.
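
A minimal sketch of the selection logic described above (name match plus age gate), not the Builder's actual code. Assumptions: `RUN_APP_PREFIX_RE` reproduces only the visible prefix of `RUN_APP_RE` (the elided remainder is not guessed), while `ORPHAN_SERVICE_RE` and `is_reapable` are hypothetical names introduced for illustration.

```python
import re
import time

# Only the prefix of the real RUN_APP_RE is quoted in the notes; the rest is
# elided there and deliberately left out here.
RUN_APP_PREFIX_RE = re.compile(r"^[a-z0-9]{1,4}-[0-9a-f]{6}\.ci")

# Hypothetical pattern for orphaned *service* names, where the dots of the
# app domain appear as underscores.
ORPHAN_SERVICE_RE = re.compile(r"^[a-z0-9]{1,4}-[0-9a-f]{6}_ci_")


def is_reapable(name, created_at, max_age, now=None):
    """Return True if `name` looks like a CI run app and is old enough to reap.

    `created_at` and `now` are Unix timestamps; `max_age` is in seconds.
    """
    now = time.time() if now is None else now
    looks_like_run = bool(
        RUN_APP_PREFIX_RE.match(name) or ORPHAN_SERVICE_RE.match(name)
    )
    # The age gate spares apps that belong to a concurrent, still-live run.
    return looks_like_run and (now - created_at) >= max_age
```

With `max_age=0` the age gate passes for every matching name, which is why the live sweep would also reap the Builder's running apps and has to wait for an idle host.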
- [ ] **[adversary] A3 — Teardown is unverified/best-effort; a failure silently orphans + run stays green.**
- [x] **[adversary] A3 — Teardown is unverified/best-effort; a failure silently orphans + run stays green.**
**CLOSED @2026-05-27T05:00Z** by Adversary re-test of the Builder's fix (commit b7a2d70).
`teardown_app` now: `undeploy` → if the service persists, `docker stack rm` **fallback** (needs
no `.env`) → remove volumes/secrets *by stack name* (retry loop) → drop `.env` LAST → **verify**
`_residual()` and raise `TeardownError` if anything remains. Empirical worst-case test: I
`docker stack deploy`-ed a synthetic orphan `advx-aaaaaa_ci_commoninternet_net` (service +
volume + network, **no `.env`** — exactly the crash-orphan that defeated the old code), then
called `lifecycle.teardown_app("advx-aaaaaa.ci.commoninternet.net")` → returned OK (verify
passed) and afterwards services/volumes/networks = **0**. So a `.env`-less orphan is fully
reaped and teardown is now verified (would raise on residual). Original finding below.
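
The verified sequence above can be sketched as follows. This is an illustration under stated assumptions, not the code in commit b7a2d70: the `ops` object and its method names (`undeploy`, `service_exists`, `stack_rm`, `remove_volumes_and_secrets`, `remove_env`, `residual`) are hypothetical stand-ins for the real abra and docker calls, and the `retries`/`delay` parameters are invented for the example.

```python
import time


class TeardownError(RuntimeError):
    """Raised when resources remain after teardown."""


def stack_name(domain):
    # Mirrors the naming seen in the re-test:
    # advx-aaaaaa.ci.commoninternet.net -> advx-aaaaaa_ci_commoninternet_net
    return domain.replace(".", "_")


def teardown_app(domain, ops, retries=3, delay=0.0):
    """Tear down `domain` and verify nothing is left behind.

    `ops` is a hypothetical adapter around the real abra/docker commands.
    """
    stack = stack_name(domain)

    # 1. Normal path; may achieve nothing when the app .env is already gone.
    ops.undeploy(domain)

    # 2. Fallback that works from the stack name alone (no .env needed).
    if ops.service_exists(stack):
        ops.stack_rm(stack)

    # 3. Remove volumes/secrets by stack name, retrying while they are busy.
    for _ in range(retries):
        if ops.remove_volumes_and_secrets(stack):
            break
        time.sleep(delay)

    # 4. Drop the .env last, so earlier steps can still use it if present.
    ops.remove_env(domain)

    # 5. Verify: fail loudly instead of silently leaving an orphan.
    left = ops.residual(stack)
    if left:
        raise TeardownError(f"{stack}: residual resources {left}")
```

Because step 5 raises, a finalizer that calls this function turns a failed teardown into a failed run rather than a green one.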
Found during M4 review (to confirm empirically with a kill-mid-run probe). `lifecycle.teardown_app`
runs every abra call with `check=False` and "never raises"; the conftest finalizer never
asserts teardown succeeded. Worse, `abra.app_config_remove` deletes the app `.env`