Adversary M2 PASS (commit 7b4081c): all 6 verification steps passed, §7.1 signed off.
Phase drone DONE. PR recipe-maintainers/drone#1 open for operator merge.
- install+upgrade+custom+lint PASS, backup/restore intentional skip (PARITY.md)
- DG4.1: deploy-count=2/2; clean_teardown=true; no_secret_leak=true
- SCM test verified against per-run dep gitea (not production git.autonomic.zone)
- Build-creation gap accepted as proportionate deferral (Adversary §7.1 sign-off)
- DEFERRED.md updated by Adversary with MAXIMAL SUBSET COMPLETE
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
BACKLOG — phase drone (drone enrollment with gitea SCM dep)
Phase plan: /srv/cc-ci/cc-ci-plan/plan-phase-drone-enroll.md
Build backlog
(Builder's section — Adversary read-only)
M1 tasks
- Read plan + Adversary pre-probes
- Create phase state files (STATUS/JOURNAL/BACKLOG/REVIEW init)
- Implement setup_gitea_oauth() in runner/harness/sso.py
- Extend _enrich_deps_with_sso in runner/run_recipe_ci.py for gitea
- Create tests/gitea/recipe_meta.py
- Create tests/drone/recipe_meta.py
- Create tests/drone/install_steps.sh
- Create tests/drone/functional/test_scm_configured.py (ADV-drone-01 fixed in 7e7e84d)
- Create tests/drone/PARITY.md
- Write unit tests for new harness surface (10/10 pass)
- Harness run 5 GREEN — deploy-count 2/2 (DG4.1 PASS), level=5, install+upgrade+custom PASS
- Claim M1: Adversary PASS @2026-06-11T22:22Z (commit 3de5925)
M2 tasks (after M1 PASS)
- Mirror drone + gitea on git.autonomic.zone (for !testme CI path)
- Open !testme PR for drone recipe: PR #1 testme-1.9.0-cc-ci @ recipe-maintainers/drone
- CI run via !testme on drone PR: build #506, event=custom, level=5, all tiers PASS
- Screenshot real + visually verified: machine-docs/screenshots/drone-m2-build506.png
- Level recorded: level=5
- DEFERRED updated: Adversary §7.1 signed off in commit 7b4081c; MAXIMAL SUBSET COMPLETE entry in DEFERRED.md
- Operator summary written: see STATUS-drone.md ## DONE
- Claim M2: Adversary M2 PASS @2026-06-11T22:30Z (commit 7b4081c). Phase drone DONE.
Adversary findings
ADV-drone-01 [adversary] test_scm_configured follows all redirects — assertion always fails
Filed: 2026-06-11T21:37Z
Severity: CRITICAL — SCM-configured test is always failing, even for a correctly wired drone
Defect: tests/drone/functional/test_scm_configured.py::test_login_redirects_to_gitea_dep
uses urllib.request.urlopen(req, context=ctx) which follows ALL redirect hops. The redirect
chain for a correctly-wired drone is:
- GET /login → 303 → https://<gitea-dep>/login/oauth/authorize?client_id=...&...
- Gitea (unauthenticated user) → 302 → https://<gitea-dep>/user/login?redirect_to=...
- Final: https://<gitea-dep>/user/login (200 OK)
The test asserts parsed.path == "/login/oauth/authorize" but final_url is /user/login.
The assertion ALWAYS fails even when drone is correctly wired.
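The mismatch can be seen by running the same path check against each hop of the chain. This is a minimal sketch only: the helper name `is_gitea_authorize_redirect` and the `gitea.example.test` host are hypothetical, not taken from the test file.

```python
import urllib.parse


def is_gitea_authorize_redirect(url, gitea_domain):
    """Return True only for a URL on the gitea dep's OAuth authorize endpoint."""
    parsed = urllib.parse.urlparse(url)
    return parsed.netloc == gitea_domain and parsed.path == "/login/oauth/authorize"


# Hypothetical URLs mirroring the first hop and the final hop of the chain.
FIRST_HOP = "https://gitea.example.test/login/oauth/authorize?client_id=abc"
FINAL_HOP = "https://gitea.example.test/user/login"

first_ok = is_gitea_authorize_redirect(FIRST_HOP, "gitea.example.test")
final_ok = is_gitea_authorize_redirect(FINAL_HOP, "gitea.example.test")
```

Only the first hop satisfies the check, so a test that inspects the final URL cannot pass for a correctly wired drone.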
Verified: reproduced against the live drone.ci.commoninternet.net:
python3 -c "
import ssl, urllib.request, urllib.parse
ctx = ssl.create_default_context(); ctx.check_hostname = False; ctx.verify_mode = ssl.CERT_NONE
req = urllib.request.Request('https://drone.ci.commoninternet.net/login', method='GET')
with urllib.request.urlopen(req, timeout=30, context=ctx) as resp:
    print(resp.geturl())
# → https://git.autonomic.zone/user/login  (NOT /login/oauth/authorize)
"
Root cause: The test was designed around the first-redirect check (per REVIEW-drone.md
pre-probe) but implemented as a follow-all check. The pre-probe used curl --max-redirs 0 to
capture the Location header; the test must replicate this rather than call urlopen, which follows redirects by default.
Required fix: Capture ONLY drone's first redirect (the 303 → gitea OAuth authorize), stop before gitea's own redirects. One correct pattern:
import urllib.error, urllib.parse, urllib.request

class _CaptureOneRedirect(urllib.request.HTTPRedirectHandler):
    # Raise instead of following, so the caller sees the first redirect only.
    def http_error_302(self, req, fp, code, msg, headers):
        raise urllib.error.HTTPError(req.full_url, code, msg, headers, fp)
    http_error_303 = http_error_302

opener = urllib.request.build_opener(
    _CaptureOneRedirect(),
    urllib.request.HTTPSHandler(context=ctx),
)
try:
    opener.open(f"https://{live_app}/login", timeout=30)
    pytest.fail("Expected redirect from /login but got 200")
except urllib.error.HTTPError as e:
    if e.code not in (302, 303):
        raise AssertionError(f"Expected 302/303 from /login, got {e.code}")
    redirect_url = e.headers.get("Location") or e.headers.get("location", "")
    parsed = urllib.parse.urlparse(redirect_url)
    # now check parsed.netloc == gitea_domain and parsed.path == "/login/oauth/authorize"
Also note: The unit test test_scm_redirect_assertions tests the URL assertion logic
correctly (with pre-supplied URLs), but does NOT test the redirect-capture mechanism. A unit
test for _CaptureOneRedirect behavior against a mock HTTP server would be ideal, but at
minimum the integration test must use this pattern.
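One way such a mock-server unit test could look is sketched below. It is an illustration under stated assumptions, not the Builder's test: `_FakeDrone`, `capture_first_redirect` and `run_demo` are hypothetical names, and the fake server redirects to itself, so a follow-all client would record a second request.

```python
import http.server
import threading
import urllib.error
import urllib.parse
import urllib.request


class _CaptureOneRedirect(urllib.request.HTTPRedirectHandler):
    """Turn the first 302/303 into an HTTPError instead of following it."""

    def http_error_302(self, req, fp, code, msg, headers):
        raise urllib.error.HTTPError(req.full_url, code, msg, headers, fp)

    http_error_303 = http_error_302


class _FakeDrone(http.server.BaseHTTPRequestHandler):
    """Hypothetical stand-in: /login answers 303, every other path answers 200."""

    hits = []  # request paths seen by the server, in order

    def do_GET(self):
        _FakeDrone.hits.append(self.path)
        if self.path == "/login":
            port = self.server.server_port
            self.send_response(303)
            self.send_header(
                "Location",
                f"http://127.0.0.1:{port}/login/oauth/authorize?client_id=x",
            )
        else:
            self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep test output quiet


def capture_first_redirect(url):
    """Return (status, Location) of the first redirect without following it."""
    opener = urllib.request.build_opener(_CaptureOneRedirect())
    try:
        opener.open(url, timeout=5)
    except urllib.error.HTTPError as e:
        code, location = e.code, e.headers.get("Location", "")
        e.close()
        return code, location
    raise AssertionError("expected a redirect, got a non-redirect response")


def run_demo():
    """Start the fake server, capture the redirect, and report what was hit."""
    _FakeDrone.hits = []
    server = http.server.HTTPServer(("127.0.0.1", 0), _FakeDrone)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/login"
        code, location = capture_first_redirect(url)
    finally:
        server.shutdown()
        server.server_close()
    return code, urllib.parse.urlparse(location).path, list(_FakeDrone.hits)
```

If the handler followed redirects, the server would also record a request for the authorize path; asserting that only `/login` was hit covers the capture mechanism itself rather than just the URL assertion logic.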
Repro steps:
- Deploy a correctly-wired drone (with gitea dep, compose.gitea.yml, DRONE_GITEA_CLIENT_ID set)
- Run test_login_redirects_to_gitea_dep
- It will FAIL with AssertionError: Final URL path is '/user/login', expected '/login/oauth/authorize'
- This is a false failure: the assertion is about the URL AFTER gitea's own redirect, not drone's redirect
Resolution: Builder fixes test to use no-follow-first-redirect pattern. Adversary re-verifies by running the test against a live wired drone after fix.
- CLOSED @2026-06-11T21:52Z: Builder fixed in commit 7e7e84d (_CaptureOneRedirect no-follow pattern); Adversary independently verified: captures 303 Location from live drone, path == "/login/oauth/authorize" ✅; 10 unit tests PASS cold. [Note: Builder ticked this. Adversary owns Adversary findings per §6.1; recording explicit Adversary close here.]
ADV-drone-02 [adversary] Dep orphan on SSO-enrichment failure after successful deploy_deps
Filed: 2026-06-11T22:10Z
Severity: MEDIUM — teardown-sacred (§9) violated in failure path; orphaned gitea at deterministic domain corrupts next run with same (recipe, pr, ref, dep) hash
Defect: runner/run_recipe_ci.py::main() initialises deps_state = {} (line 1015). Inside
_provision_deps, deploy_deps is called first (deploys gitea, writes legacy-list shape to
$CCCI_DEPS_FILE), then _enrich_deps_with_sso is called. If _enrich_deps_with_sso raises
(e.g. setup_gitea_oauth API call fails after gitea is up and healthy), _provision_deps raises
and the assignment deps_state = _provision_deps(...) (line 1034) never completes. The outer
except Exception (line 1039) catches it and marks deps_ready = False, leaving deps_state = {}.
In the finally block (line 1196): if deps_state: → empty dict is falsy → the dep teardown
block is skipped entirely. The gitea container and its volumes are orphaned.
Failure path:
deploy_deps(...)               # gitea deployed + healthy; writes [{recipe:gitea, domain:gite-...}] to $CCCI_DEPS_FILE
  └─ write_run_state()         # CCCI_DEPS_FILE has content now
_enrich_deps_with_sso(...)
  └─ setup_gitea_oauth()       # RAISES (API failure, gitea not ready yet, etc.)
_provision_deps() raises
deps_state = {}                # assignment never completed
...
finally:
    if deps_state:             # {} is falsy → SKIPPED → gitea NOT torn down
Risk: The gitea dep domain is deterministic — dep_domain(parent_recipe, pr, ref, dep) hashes
the same inputs to the same 6-hex domain on every invocation. An orphaned gitea at that domain on
the next run with identical inputs would either: (a) cause abra app new to fail (app already
exists), or (b) succeed silently with a stale volume. setup_gitea_oauth handles the stale-volume
case via password reset, but the deploy step itself may error before reaching that point.
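To illustrate why an orphan collides with the next run, here is a sketch of a deterministic domain derivation. It is not the harness's dep_domain: the SHA-256 hash, the key layout, the four-character recipe prefix (inferred from the `gite-...` fragment in the failure path) and the `ci.example.test` base domain are all assumptions.

```python
import hashlib


def sketch_dep_domain(parent_recipe, pr, ref, dep, base="ci.example.test"):
    """Hypothetical stand-in for dep_domain: same inputs always give the same name."""
    key = "|".join([parent_recipe, str(pr), ref, dep])
    # Assumed scheme: first 6 hex digits of a SHA-256 over the joined inputs.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:6]
    return f"{dep[:4]}-{digest}.{base}"


first_run = sketch_dep_domain("drone", 1, "abc1234", "gitea")
second_run = sketch_dep_domain("drone", 1, "abc1234", "gitea")
```

Because nothing run-specific enters the key, `first_run` and `second_run` are identical, which is exactly how a leftover app from a failed run would occupy the domain a retry expects to be free.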
Note: deploy_deps (deps.py:104-109) tears down a dep immediately if its readiness check
fails. The gap is specifically when deploy_deps FULLY SUCCEEDS (dep deployed + healthy) but
the subsequent SSO enrichment step raises.
Partial mitigation: janitor() (called at run start) reaps orphaned apps from prior runs.
However, janitor only helps on the NEXT run, not the current one's clean state guarantee.
Required fix: Either:
- (A) In main(), read $CCCI_DEPS_FILE as fallback in the finally block when deps_state is empty; the file contains the deployed-but-unenriched deps. Tear those down via teardown_deps.
- (B) In _provision_deps, separate the deploy step from the enrichment step so main() can track which deps are deployed even when enrichment fails, and tear them down unconditionally.
- (C) Have _provision_deps return the partially-enriched list on failure (or a sentinel that includes the deployed deps so teardown can still proceed).
- CLOSED @2026-06-11T22:22Z: Builder fixed in commit 0aa46db (Option A: else-branch fallback in main() finally block reads $CCCI_DEPS_FILE via load_run_state() and calls teardown_deps on cold entries). Two new unit tests: test_load_run_state_provides_fallback_for_enrichment_failure + test_fallback_skips_warm_entries. 19/19 PASS. Adversary verified: fallback code correct; TeardownError suppressed in fallback (pragmatic: run already fails on deps-not-ready). Teardown-sacred §9 satisfied. CLOSED.
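The shape of the Option A fallback can be sketched in isolation. This is a simplified model, not the code in run_recipe_ci.py: `simulate_run`, the JSON file layout, the `warm` key and the `gite-000000.example.test` domain are hypothetical, and teardown is reduced to recording recipe names.

```python
import json
import os
import tempfile


def simulate_run(deps_file, enrich_fails):
    """Model main(): deploy, enrich, then tear down in finally with a file fallback."""
    torn_down = []
    deps_state = {}
    try:
        # Deploy step: persist what was deployed BEFORE enrichment starts.
        deployed = [
            {"recipe": "gitea", "domain": "gite-000000.example.test", "warm": False}
        ]
        with open(deps_file, "w") as fh:
            json.dump(deployed, fh)
        if enrich_fails:
            raise RuntimeError("SSO enrichment failed after deploy")
        deps_state = {"deps": deployed}
    except Exception:
        pass  # deps not ready; deps_state stays {}
    finally:
        if deps_state:
            targets = deps_state["deps"]
        else:
            # Fallback: recover the deployed-but-unenriched deps from the file.
            targets = []
            if os.path.exists(deps_file):
                with open(deps_file) as fh:
                    targets = json.load(fh)
        # Only cold (per-run) deps are torn down; warm entries are skipped.
        torn_down = [d["recipe"] for d in targets if not d.get("warm")]
    return torn_down


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "deps.json")
    after_failure = simulate_run(path, enrich_fails=True)
    after_success = simulate_run(path, enrich_fails=False)
```

Both paths end with gitea torn down; without the else-branch, the failure path would return an empty list and leave the dep running.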
ADV-drone-03 [adversary] DG4.1 counter mismatch — run always exits 1 when cold dep deployed (CRITICAL)
Filed: 2026-06-11T22:15Z
Severity: CRITICAL — every harness run with a cold gitea dep exits code 1 due to DG4.1
violation, even when all tiers pass and level=5 is achieved.
Observed in Builder's run 4 (PID 2105952, /tmp/drone-m1-run4.log):
!! deploy-count 1 != 2 (DG4.1 violation)
deploy-count = 1 (expect 2)
deps deployed: ['gitea']
results.json written: /var/lib/cc-ci-runs/manual/results.json (level=5 of 5)
All tiers passed (install, upgrade, custom green; L5), but DG4.1 sets overall = 1 → exit code 1 → CI FAIL.
Root cause: Internal contradiction between two parts of deps.py:
- Module docstring (lines 19-20): "Dep deploys DO count toward the DG4.1 deploy-count invariant. The formula in run_recipe_ci.py is expected_deploy_count = 1 + deps_deployed_count, so each dep deploy increments the counter."
- deploy_deps function (line 94): _count_deploy=False → dep deploys do NOT increment the counter.
The formula in run_recipe_ci.py (line 1252) uses expected = 1 + deps_deployed_count = 2.
But _count_deploy=False means the counter stays at 1 (only the recipe increments it).
Result: actual=1 != expected=2 → DG4.1 fires.
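The arithmetic can be reproduced with a toy counter. This is a sketch under assumptions: `DeployCounter` and `dg41_check` are hypothetical names; only the `_count_deploy` flag and the `1 + deps_deployed_count` formula come from the finding.

```python
class DeployCounter:
    """Toy model of the harness deploy counter."""

    def __init__(self):
        self.count = 0

    def deploy_app(self, name, _count_deploy=True):
        # A deploy increments the counter only when _count_deploy is true.
        if _count_deploy:
            self.count += 1


def dg41_check(dep_names, count_deps):
    """Return (actual, expected, ok) for one simulated run."""
    counter = DeployCounter()
    for dep in dep_names:
        counter.deploy_app(dep, _count_deploy=count_deps)
    counter.deploy_app("drone")  # the recipe under test always counts
    expected = 1 + len(dep_names)
    return counter.count, expected, counter.count == expected


stale = dg41_check(["gitea"], count_deps=False)  # → (1, 2, False)
fixed = dg41_check(["gitea"], count_deps=True)   # → (2, 2, True)
```

With the flag left at False the check fails for any run that deploys at least one cold dep, regardless of tier results, which matches the run 4 log.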
History: _count_deploy=False was added in commit 1adfbd7 as a quick fix when the expected
formula was expected = 1. Later the formula was generalized to 1 + deps_deployed_count (to
count all apps in a run), but _count_deploy=False was NOT reverted. The module docstring reflects
the generalized intent; the function code reflects the stale quick-fix.
Required fix: In deps.py:deploy_deps (line 94), remove or revert _count_deploy=False:
# Before (wrong):
lifecycle.deploy_app(dep, domain, ..., _count_deploy=False)
# After (correct — deps DO count per module docstring + expected formula):
lifecycle.deploy_app(dep, domain, ...) # _count_deploy defaults to True
Also fix the stale comment in deploy_deps at lines 83-86:
# Dep deploys do NOT count toward the DG4.1 "one deploy per run" invariant — that
# contract covers the recipe-under-test only; each dep is a supporting service, not the
# subject of the test. Pass _count_deploy=False so the main recipe's single-deploy
# assertion isn't distorted by the number of deps declared.
This is now wrong. Replace with: "Dep deploys DO count toward DG4.1 (see module docstring);
expected_deploy_count = 1 + n_cold_deps."
- CLOSED @2026-06-11T22:22Z: Builder fixed in commit 5384f5c (removed _count_deploy=False from deps.py:deploy_deps; dep deploys now count per module docstring + expected formula). Note: Builder fixed this before ADV-drone-03 was formally filed (fix commit 21:59:51 UTC; finding filed later). Run 5 confirms: deploy-count = 2 (expect 2) → no DG4.1 violation. CLOSED.