cc-ci/machine-docs/REVIEW-drone.md
autonomic-bot 3de5925614
review(drone): M1 PASS @2026-06-11T22:22Z — build run 5 L5; all DoD + ADV findings verified
Adversary M1 verdict: PASS. Evidence:

- results.json: level=5, install+upgrade+custom+lint PASS, backup_restore intentional skip,
  clean_teardown=True, no_secret_leak=True, no unintentional skips
- SCM test has teeth: ran against dep gitea @ gite-557a83 (not production); client_id
  2a4dfaba matches dep-provisioned app; wrong domain/path/client_id would fail
- DG4.1 satisfied: deploy-count=2 (expect 2)
- ADV-drone-02 CLOSED: fallback teardown from $CCCI_DEPS_FILE in finally else-branch;
  2 new unit tests; 19/19 pass; teardown-sacred §9 satisfied
- ADV-drone-03 CLOSED: _count_deploy=False reverted; run 5 confirms no violation
- All three adversary findings now closed; no open findings

Builder may proceed to M2: recipe mirrors + !testme CI run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-11 22:08:33 +00:00

REVIEW — phase drone (drone enrollment with gitea SCM dep)

Adversary: Adversary loop / Claude
Phase plan: /srv/cc-ci/cc-ci-plan/plan-phase-drone-enroll.md
Started: 2026-06-11T21:30Z


Verdicts

M1 PASS @2026-06-11T22:22Z

Build: manual run 5, host cc-ci, repo head 0aa46db
Evidence source: /tmp/drone-m1-run5.log + /var/lib/cc-ci-runs/manual/results.json on cc-ci
Level: 5 of 5

Adversary verification steps (all PASS):

  1. Results JSON independently read: level=5, install:pass, upgrade:pass, custom:pass, lint:pass, backup_restore:skip (intentional, reason="not backup-capable"), clean_teardown:True, no_secret_leak:True, skips.unintentional:[]

  2. SCM-configured test has teeth (ADV-drone-01 fix): Test ran against dep gitea at gite-557a83.ci.commoninternet.net (NOT production git.autonomic.zone). OAuth2 app client_id=2a4dfaba-f8d5-4641-b860-b56bee414c14 was created by dep provisioning, wired by install_steps.sh, and verified by the test assertion actual_client_id == expected_client_id. A drone without gitea wiring would redirect to GitHub or return 200, so the test would fail.

  3. DG4.1 satisfied: deploy-count = 2 (expect 2) — recipe + gitea dep both counted. No !! error lines in run summary.

  4. ADV-drone-02 CLOSED: Fallback teardown in finally else-branch (0aa46db) confirmed in code (lines 1224-1240). Two unit tests confirm data flow. TeardownError is suppressed in the fallback (pragmatic: the run already fails on deps-not-ready). Teardown-sacred §9 satisfied.

  5. ADV-drone-03 CLOSED: _count_deploy=False removed from deps.py:deploy_deps (5384f5c). Builder fixed before formal filing. Run 5 confirms DG4.1 passes.

  6. Unit tests 19/19 PASS cold: Independently verified on cc-ci. Covers gitea/drone recipe_meta loading, _enrich_deps_with_sso routing, SCM redirect assertions (4 scenarios), deps state fallback teardown.

  7. Backup structural skip: PARITY.md documents justification. Results.json confirms skips.intentional.backup_restore = "not backup-capable (no backupbot labels / declared)". No unintentional skips.

  8. No open adversary findings: ADV-drone-01 CLOSED (verified commit 7e7e84d), ADV-drone-02 CLOSED (verified commit 0aa46db), ADV-drone-03 CLOSED (verified commit 5384f5c).

M1 PASS. Builder may proceed to M2 (recipe mirrors + !testme CI run).


Pre-verification probes (Adversary-initiated, before any Builder claim)

P0 verification — /etc/timezone on cc-ci host

Verified: 2026-06-11T21:30Z

ssh cc-ci 'test -f /etc/timezone && cat /etc/timezone'
# → UTC
ssh cc-ci 'ls -la /etc/localtime /etc/timezone'
# → /etc/localtime -> /etc/zoneinfo/UTC
# → /etc/timezone -> /etc/static/timezone (content: UTC)

Result: P0 SATISFIED. Both /etc/timezone (content UTC) and /etc/localtime exist. The gitea recipe's bind mounts (/etc/timezone:ro and /etc/localtime:ro) will succeed. The host-config fix from commit 3bde76f is live.

Pre-probe: drone recipe versions

ssh cc-ci 'abra recipe versions drone --machine'
  • Latest: 1.9.0+2.26.0 (drone/drone:2.26.0)
  • Previous: 1.8.0+2.25.0 (drone/drone:2.25.0)
  • Upgrade tier: viable (2 published versions; upgrade 1.8 → 1.9 is the natural choice)

Pre-probe: gitea recipe versions

ssh cc-ci 'abra recipe versions gitea --machine'
  • Latest: 3.5.3+1.24.2-rootless (gitea + postgres)
  • Previous: 3.5.2+1.24.2-rootless
  • Gitea uses postgres by default (not sqlite3). The sqlite3 overlay exists but is non-default.
  • The compose.sqlite3.yml sets GITEA_DB_TYPE=sqlite3 — if gitea is used as a dep without postgres, sqlite3 is the right choice (simpler dep deploy, less resource overhead).
  • Upgrade tier: viable for gitea as a dep, but the phase plan scope only requires drone's upgrade tier. Gitea as a dep is deployed at the PR version; upgrade tier for the dep is out of scope per plan §1.

Pre-probe: drone recipe structure

The compose.gitea.yml overlay requires:

  • GITEA_CLIENT_ID in .env
  • GITEA_DOMAIN in .env
  • client_secret swarm secret

The drone.env.tmpl conditionally injects DRONE_GITEA_CLIENT_SECRET from secret "client_secret" when DRONE_GITEA_CLIENT_ID is set. So the install hook must:

  1. Create gitea admin user + admin token via API
  2. Create OAuth2 application via POST /api/v1/user/applications/oauth2
  3. Set GITEA_CLIENT_ID, GITEA_DOMAIN, COMPOSE_FILE (to include compose.gitea.yml) in drone's .env
  4. Insert client_secret into drone's swarm secrets

Pre-probe: SCM-configured test teeth

The drone health endpoint /healthz returns OK regardless of SCM connectivity. This means a drone deployed WITHOUT gitea wiring would also pass a health check.

Verified the correct approach by querying the live drone instance:

curl -ski --max-redirs 0 https://drone.ci.commoninternet.net/login | grep location
# → location: https://git.autonomic.zone/login/oauth/authorize?client_id=ab4cdb9d-...&redirect_uri=...

GET /login (no-follow) → 303 redirect to <gitea-domain>/login/oauth/authorize?client_id=<id>&...

The correct "SCM-configured" test:

  1. GET https://<drone-domain>/login with allow_redirects=False
  2. Assert response is 302/303
  3. Assert Location header starts with https://<gitea-domain>/login/oauth/authorize
  4. Assert client_id query param matches the OAuth2 app we created in gitea
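
Assertions 2 to 4 above can be expressed as one check over the status code and Location header. This is a sketch under assumptions: the function name scm_redirect_problems and the example.test domains are hypothetical, and how the first response is captured is left to the caller.

```python
from urllib.parse import parse_qs, urlsplit


def scm_redirect_problems(status, location, gitea_domain, expected_client_id):
    """Return a list of reasons the /login response does not prove gitea wiring."""
    problems = []
    if status not in (302, 303):
        problems.append(f"expected 302/303, got {status}")
    parts = urlsplit(location or "")
    if parts.scheme != "https" or parts.netloc != gitea_domain:
        problems.append(f"redirect target is {parts.netloc!r}, not the dep gitea")
    if parts.path != "/login/oauth/authorize":
        problems.append(f"unexpected path {parts.path!r}")
    actual_client_id = parse_qs(parts.query).get("client_id", [None])[0]
    if actual_client_id != expected_client_id:
        problems.append("client_id does not match the provisioned OAuth2 app")
    return problems
```

An empty list means the check passed. A redirect to a GitHub OAuth URL, a 200 response, or a stale client_id each produce at least one reason.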

Why this has teeth: a drone deployed WITHOUT DRONE_GITEA_CLIENT_ID + DRONE_GITEA_SERVER (i.e., just the base compose.yml without compose.gitea.yml) would NOT redirect to the gitea domain — it would either error or redirect to a GitHub OAuth URL. The test is falsified by a misconfigured drone.

Adversary position (pre-claim): the SCM-configured test MUST use the /login redirect mechanism (or equivalent API proof of gitea wiring). A bare /healthz check is INSUFFICIENT and will be flagged as a test without teeth. The redirect target must point to the TEST-RUN gitea instance (the dep deployed by the harness), NOT to git.autonomic.zone (that would prove nothing).

Pre-probe: recipe mirrors

# drone: NOT mirrored on git.autonomic.zone/recipe-maintainers/drone (404)
# gitea: NOT mirrored on git.autonomic.zone/recipe-maintainers/gitea (404)

Both need to be mirrored before !testme can be used. Builder must follow the recipe mirror+PR flow (plan §4.1 / recipe-create-pr.md). This is expected and not a blocker — it's in scope.


Pre-claim findings (before M1 is claimed)

ADV-drone-01 — test_scm_configured redirect bug (CRITICAL)

Filed: 2026-06-11T21:37Z — see BACKLOG-drone.md for full details.

test_login_redirects_to_gitea_dep uses urllib.request.urlopen (follow-all-redirects). The chain is: drone /login → 303 → gitea OAuth authorize → 302 → gitea /user/login (unauthenticated). final_url is /user/login, so parsed.path == "/login/oauth/authorize" is always False. The test always fails, even for a correctly wired drone.

Fix: capture only drone's first redirect (no-follow pattern; capture Location header from 303).

This must be fixed before M1 can be claimed. If M1 is claimed without this fix, I will VETO.

RESOLVED @2026-06-11T21:52Z: Builder fixed in commit 7e7e84d. _CaptureOneRedirect raises HTTPError on 303, and the test reads the Location header directly. Verified against live drone: captures the /login/oauth/authorize path. Unit tests 10/10 PASS cold. ADV-drone-01 CLOSED.
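
For reference, the no-follow pattern can be sketched with the standard library alone. This is not the Builder's _CaptureOneRedirect; the names NoFollowRedirect and first_redirect are hypothetical. It relies on documented urllib behaviour: when redirect_request returns None, the 3xx response falls through to the default error handler and surfaces as an HTTPError.

```python
import urllib.error
import urllib.request


class NoFollowRedirect(urllib.request.HTTPRedirectHandler):
    """Decline every redirect so the first 3xx reaches the caller."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx response


def first_redirect(url, timeout=10):
    """Return (status, Location header) for the first response only."""
    opener = urllib.request.build_opener(NoFollowRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, None  # 2xx: no redirect happened
    except urllib.error.HTTPError as err:
        # Any non-2xx lands here, including 4xx/5xx, where Location is None.
        return err.code, err.headers.get("Location")
```

The returned Location is the raw header value, so a relative redirect is reported as sent rather than resolved against the request URL.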

ADV-drone-02 — dep orphan on SSO-enrichment failure (MEDIUM)

Filed: 2026-06-11T22:10Z — see BACKLOG-drone.md for full details.

deps_state = {} is initialised empty in main(). _provision_deps calls deploy_deps first (gitea deployed + healthy, $CCCI_DEPS_FILE written), then _enrich_deps_with_sso. If the enrichment step raises (e.g. setup_gitea_oauth API call fails), _provision_deps re-raises and the deps_state = _provision_deps(...) assignment (line 1034) never completes. In the finally block, if deps_state: is falsy → dep teardown block is entirely skipped. The gitea container and volumes are orphaned at their deterministic domain.

Teardown-sacred (§9) violated in failure path.

Required fix before M1: option A (fallback teardown from $CCCI_DEPS_FILE in the finally block when deps_state is empty) or option B (separate deploy from enrichment tracking). See BACKLOG.

CLOSED @2026-06-11T22:22Z — commit 0aa46db; 19/19 unit tests pass; code verified. See BACKLOG-drone.md § ADV-drone-02.
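
The shape of option A can be sketched as follows. This is an illustration of the control flow only, not the code at commit 0aa46db: run_with_teardown, provision and teardown are hypothetical names, and the deps file is assumed to be JSON.

```python
import json
import os


def run_with_teardown(provision, teardown, deps_file):
    """Provision deps, then always tear them down, even if provisioning raised."""
    deps_state = {}
    try:
        # If provision() raises after writing deps_file, this assignment
        # never completes and deps_state stays empty.
        deps_state = provision()
    finally:
        if deps_state:
            teardown(deps_state)
        elif os.path.exists(deps_file):
            # Fallback: recover what was deployed from the on-disk record.
            with open(deps_file) as fh:
                recorded = json.load(fh)
            try:
                teardown(recorded)
            except Exception as exc:
                # Suppressed: the run is already failing on the original error.
                print(f"fallback teardown failed: {exc}")
    return deps_state
```

The original exception from provision() still propagates after the finally block, so the run fails for the real reason while the dep is cleaned up.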

ADV-drone-03 — DG4.1 counter mismatch; run always exits 1 with cold dep (CRITICAL)

Filed: 2026-06-11T22:15Z — see BACKLOG-drone.md for full details.

deps.py module docstring (lines 19-20) says "Dep deploys DO count toward DG4.1; expected = 1 + deps_deployed_count." But deploy_deps passes _count_deploy=False → dep deploys never increment the counter. With gitea as a cold dep: actual=1, expected=2 → DG4.1 fires → overall = 1 → CI FAIL, even when all tiers pass and level=5 is reached.

Confirmed in Builder's run 4 log (/tmp/drone-m1-run4.log): all tiers green, L5, but deploy-count 1 != 2 (DG4.1 violation).

Fix: remove _count_deploy=False from deploy_deps (deps SHOULD count per the docstring and the expected formula). Update the stale comment that contradicts the module docstring.

CLOSED @2026-06-11T22:22Z — commit 5384f5c; Builder fixed before formal filing. Run 5 confirms DG4.1 PASS. See BACKLOG-drone.md § ADV-drone-03.
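
The arithmetic behind the mismatch, as a small simulation. simulate_run is a hypothetical stand-in for the harness counter, assuming one deploy of the recipe under test plus one per dep.

```python
def simulate_run(dep_recipes, count_dep_deploys):
    """Return (actual, expected, dg41_ok) for one run."""
    actual = 0
    for _ in dep_recipes:
        if count_dep_deploys:  # corresponds to not passing _count_deploy=False
            actual += 1
    actual += 1  # the recipe under test is always counted
    expected = 1 + len(dep_recipes)  # formula from the deps.py docstring
    return actual, expected, actual == expected


print(simulate_run(["gitea"], count_dep_deploys=False))  # (1, 2, False): run 4
print(simulate_run(["gitea"], count_dep_deploys=True))   # (2, 2, True): run 5
```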


Standing break-it probes

  • Verify drone WITHOUT gitea wiring fails SCM-configured test (negative control) — defer to M2 CI run; requires live deploy; structural analysis confirms install_steps.sh no-ops on absent deps file and test detects wrong netloc/path in redirect URL
  • Verify gitea teardown doesn't orphan containers when drone test fails mid-run — structural PASS for normal test failures (finally block guaranteed); GAP filed as ADV-drone-02 for SSO-enrichment failure before deps_state populated
  • Verify no secrets (OAuth client secret, admin token) appear in drone logs/dashboard — defer to M2 CI run; structural review of sso.py + install_steps.sh shows client_secret not printed in happy path; _scrub() + D6 redaction in run_redacted() provide belt-and-suspenders
  • Verify two concurrent runs don't collide on gitea/drone domains or OAuth apps — structural PASS: domain is dep_domain(parent_recipe, pr, ref, dep_recipe) — hash of 4 inputs; two concurrent !testme runs on different PRs or refs produce distinct 6-hex domains; per-run ABRA_DIR isolation prevents recipe tree conflicts
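
A sketch of how a 4-input hash yields a per-run domain. This is not the harness's dep_domain: the choice of SHA-256, the separator, the 4-character recipe prefix (inferred from gite-557a83) and the base domain are all assumptions. With 6 hex characters there are 16^6 = 16,777,216 possible suffixes, so distinct inputs collide only with small probability; uniqueness is not guaranteed.

```python
import hashlib


def dep_domain_sketch(parent_recipe, pr, ref, dep_recipe, base="ci.example.test"):
    """Derive a deterministic dep domain from the four run inputs (hypothetical)."""
    # NUL separators keep ("ab", "c") and ("a", "bc") from hashing identically.
    key = "\0".join([parent_recipe, str(pr), ref, dep_recipe])
    suffix = hashlib.sha256(key.encode()).hexdigest()[:6]
    return f"{dep_recipe[:4]}-{suffix}.{base}"
```

The same four inputs always map to the same domain, which is what makes the fallback teardown and orphan checks above possible without extra bookkeeping.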