# BACKLOG — Phase 1e (generic-harness corrections)

Phase-namespaced backlog. Builder edits `## Build backlog`; Adversary edits `## Adversary findings`.

## Build backlog
- [x] **E0 / HC2** — repo-local approval allowlist (`tests/repo-local-approved.txt`, default-deny);
gate `discovery.resolve_op`/`custom_tests`/`install_steps` behind `repo_local_approved(recipe)`;
update unit tests (`tests/unit/test_discovery.py`) for approved vs non-approved.
- [x] **E1 / HC3** — generic-by-default (additive); op/assertion split. Orchestrator performs each
mutating op once; runs the generic `test_<op>.py` (unless opted out) plus the overlay `test_<op>.py`. Opt-out:
`CCCI_SKIP_GENERIC` / `CCCI_SKIP_GENERIC_<OP>` / `recipe_meta.SKIP_GENERIC`. Pre-op seed via
optional `tests/<recipe>/ops.py`. Migrate generic + overlays to assertion-only. Keep deploy-count==1.
- [x] **E2 / HC1** — upgrade to PR head via `abra app deploy --chaos`: deploy prev, re-checkout PR
head, chaos redeploy in place; adapt moved-assertion (chaos label proof); reconcile deploy-count.
- [x] **E3 / HC4** — docs (docs/testing.md, enroll-recipe.md) + DECISIONS; claim gates; await Adversary
cold-verify of HC1–HC4; flip STATUS-1e → ## DONE on full PASS.
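
The default-deny gate from E0 can be sketched as follows. This is a minimal illustration, not the harness's actual code: the file format (one recipe name per line, `#` comments allowed) and the helper `load_approved` are assumptions. Only the allowlist path and the name `repo_local_approved(recipe)` come from the backlog.

```python
from pathlib import Path

# Path taken from the backlog; everything else here is illustrative.
ALLOWLIST = Path("tests/repo-local-approved.txt")


def load_approved(path: Path = ALLOWLIST) -> frozenset:
    """Return the approved recipe names; a missing file approves nothing."""
    try:
        lines = path.read_text(encoding="utf-8").splitlines()
    except FileNotFoundError:
        return frozenset()  # default-deny: no allowlist, no approvals
    # Assumed format: one recipe per line, '#' starts a comment, blanks ignored.
    names = (line.split("#", 1)[0].strip() for line in lines)
    return frozenset(name for name in names if name)


def repo_local_approved(recipe: str, path: Path = ALLOWLIST) -> bool:
    """True only if the recipe is explicitly listed in the allowlist."""
    return recipe in load_approved(path)
```

A caller would check `repo_local_approved(recipe)` before honouring any repo-local `custom_tests` or `install_steps`, and fall back to the generic behaviour otherwise.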
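
The E1 opt-out switches could resolve roughly like this. It is a sketch under assumptions: the truthy-value parsing, the upper-casing of `<OP>`, and passing `recipe_meta.SKIP_GENERIC` in as a plain boolean are illustrative choices, and `skip_generic` is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def _truthy(value: Optional[str]) -> bool:
    # Assumed convention for "set" env flags; the real harness may differ.
    return (value or "").strip().lower() in {"1", "true", "yes", "on"}


def skip_generic(
    op: str,
    recipe_skip: bool = False,
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Decide whether the generic test for `op` should be skipped.

    Any one of the three switches is enough to opt out:
    recipe-level flag, global env flag, or per-op env flag.
    """
    env = os.environ if env is None else env
    return (
        recipe_skip
        or _truthy(env.get("CCCI_SKIP_GENERIC"))
        or _truthy(env.get(f"CCCI_SKIP_GENERIC_{op.upper()}"))
    )
```

Because the op itself is performed once by the orchestrator, this decision only affects which assertion files run afterwards; it does not change the deploy count.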
## Adversary findings
- [x] **F1e-1 [adversary]** *(CLOSED @2026-05-28, fix-verified cold on commit 6eabfdc)* — *`lifecycle.exec_in_app` silently swallows a failed `docker exec`
(returns empty stdout, returncode ignored) → backup/restore data-continuity overlays go RED on a
healthy recipe when the post-op container cycle is slow.* Found cold-verifying E1/HC3 (commit
b7e6cbd) on custom-html: one opt-out run had backup=FAIL with `AssertionError: '' == 'original'`
from `tests/custom-html/test_backup.py::test_backup_captures_state` — the marker `cat` returned
empty. **CORRECTION (2026-05-28):** isolated, no-concurrency repro (3× opt-out + 1× default,
install,backup,restore) — **4/4 PASS**, deploy-count=1 each. So the opt-out flag is **NOT** the
trigger (my earlier "removes the ~1s generic-pytest timing buffer" theory is **withdrawn**); the
original symptom coincided with parallel Builder e2e runs loading the node. Real trigger: load /
concurrency slowing the post-backup container cycle into a window where `exec_in_app`'s
`docker exec` fails. The **static defect is the same** regardless of trigger.
**Root cause (static):** `exec_in_app` runs `docker exec <cid> …` and returns `proc.stdout`
**without checking `returncode`**; when backup-bot cycles the app container post-op, `docker exec`
can fail → empty stdout silently passed back as data. The backup/restore overlays read via
`exec_in_app` immediately after the cycling op with no readiness retry, despite docstrings
claiming immunity. (Secondary risk: a failed exec masquerading as `""` could also make a real
failure spuriously *pass* in a different assertion.)
**Repro (orig symptom):** under any concurrent same-recipe load, an opt-out
`STAGES=install,backup,restore` custom-html run can show `test_backup_captures_state` empty-string
AssertionError.
**Status:** Builder pushed fix at **commit 6eabfdc** — `exec_in_app` now polls (re-resolve
container + re-exec) until `rc==0` or 90s, then **raises** (never masks failed exec as empty).
No assertion weakened. Adversary fix-verification in flight on `/tmp/adv-fix`. **Closes when:**
cold-verified PASS under opt-out (and a reasonable concurrency probe), per Adversary close-rule.

- [ ] **F1e-2 [adversary]** — *Two concurrent same-recipe runs collide on `~/.abra/recipes/<recipe>`
(rm-rf + abra-fetch race).* Found during a controlled 2-concurrent custom-html test (PR=8001,
PR=8002): run-a died at `subprocess.CalledProcessError: 'abra recipe fetch custom-html -n' rc=1`;
run-b completed all-green. Cause: `runner/run_recipe_ci.py::fetch_recipe` does `rm -rf
~/.abra/recipes/<recipe>` then `abra recipe fetch <recipe> -n` — concurrent execution on the same
recipe races on the same directory. Domain/volume/secret isolation hold (different PRs ⇒ different
domains), but the shared recipe checkout is a serialisation point.
**Why it matters:** §6/D-gate requires "two concurrent !testme runs don't collide." Drone caps
`MAX_TESTS=1-2` today, so practical impact is bounded, but this will surface as breadth scales (D10).
Pre-existing in 1d; orthogonal to E1/HC3; not blocking E1.
**Fix direction:** one of: make the recipe checkout run-scoped (a per-run snapshot dir instead of
the shared `~/.abra/recipes/<recipe>`), take a flock around fetch+checkout, or move PR-head clones
out of the shared abra dir.
**Status:** Filed for HC4 / no-regression scope.
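
For F1e-1, the fix described in its status note (poll until `rc==0` or 90s, then raise) might look like the sketch below. The function name, the `resolve_cid` callback, the 2s interval, treating `LookupError` as retryable, and the injectable `run` parameter are all hypothetical; the real `lifecycle.exec_in_app` signature is not shown in this backlog.

```python
import subprocess
import time
from typing import Callable, Sequence


def exec_with_retry(
    resolve_cid: Callable[[], str],
    cmd: Sequence[str],
    timeout: float = 90.0,
    interval: float = 2.0,
    run=subprocess.run,  # injectable so the loop can be tested without docker
) -> str:
    """Run `docker exec <cid> cmd`, retrying until it exits 0 or time runs out.

    The container id is re-resolved on every attempt because the container
    may have been cycled. A failed exec is never returned as empty output.
    """
    deadline = time.monotonic() + timeout
    last = "no attempt made"
    while True:
        try:
            cid = resolve_cid()  # assumed to raise LookupError if none is found
            proc = run(["docker", "exec", cid, *cmd], capture_output=True, text=True)
            if proc.returncode == 0:
                return proc.stdout
            last = f"rc={proc.returncode}: {proc.stderr.strip()}"
        except LookupError as exc:
            last = f"container not resolved: {exc}"
        if time.monotonic() >= deadline:
            raise RuntimeError(f"docker exec did not succeed within {timeout:.0f}s ({last})")
        time.sleep(interval)
```

Raising on timeout is the important part: an overlay assertion then fails with the exec error rather than comparing `''` against the expected marker.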
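
For F1e-2, the flock option in the fix direction can be sketched as below. `recipe_lock`, the lock directory, and the lock-file naming are assumptions for illustration, and `fcntl.flock` is POSIX-only. This shows one of the three candidate directions, not a chosen fix.

```python
import fcntl
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def recipe_lock(recipe: str, lock_dir: Path):
    """Hold an exclusive per-recipe lock for the duration of the with-block."""
    lock_dir.mkdir(parents=True, exist_ok=True)
    fd = os.open(lock_dir / f"{recipe}.lock", os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until any other holder releases
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


# Hypothetical use around the racy section of fetch_recipe:
#
#     with recipe_lock(recipe, Path("/tmp/ccci-locks")):
#         ...remove the shared checkout, then fetch it again...
```

A lock serialises concurrent runs on the same recipe but keeps the shared directory, so a second run still overwrites the first run's checkout once the lock is released. If runs need the checkout for their whole lifetime, a run-scoped directory avoids that.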