cc-ci/machine-docs/BACKLOG-5.md
autonomic-bot 9bad0ba671
review(5): close matrix-synapse status-gap finding
2026-06-01 18:53:31 +00:00


Phase 5 — BACKLOG

SSOT: /srv/cc-ci/cc-ci-plan/plan-phase5-verify-upgrade-flow.md. DoD = V1–V9. Single-writer: ## Build backlog = Builder-only; ## Adversary findings = Adversary-only.


Build backlog

  • Create phase 5 state files (STATUS-5.md, BACKLOG-5.md, JOURNAL-5.md)
  • Fix A5-2: Add commit status posting to bridge.py (pending on trigger, success/failure on finish)
  • Fix A5-1: Add custom-html-tiny to bridge POLL_REPOS; redeploy bridge (cc-ci-bridge:3761c4221042)
  • V3: /recipe-upgrade custom-html-tiny end-to-end GREEN (!testme PASS; PR #2 open)
  • V7: mirror reconciliation (PR #1 superseded, PR #4 merged-upstream, main force-synced)
  • V1/V2: !testme trigger + testme-on-pr.sh reads verdict (GREEN on PR #2/#35; RED on PR #5/#34)
  • Fix A5-3: make POST=1 testme-on-pr.sh ignore stale prior status on same PR head
  • V4: 3-iteration regression loop (seed bad tag → RED → fix → GREEN in 2 runs)
  • V5: stale-test DEFAULT = comment, no test edit
  • V6: --with-tests opens + verifies cc-ci test PR (verify-pr.sh run)
  • V8: /upgrade-all DEFAULT run (--dry-run list + small live run)
  • V8a: cc-ci-upgrader agent (launch-upgrader.sh start/stop/status cycle)
  • V9: cleanup all verification PRs + deploys; install weekly cron (Phase 5 §4)

Adversary findings

[adversary] A5-4 — matrix-synapse stale-test/default path leaves no recipe commit status

Status: CLOSED — re-tested 2026-06-01T18:53:30Z; see REVIEW-5.md follow-up entry.

On the live V5 stale-test candidate recipe-maintainers/matrix-synapse PR #1, the PR comments show a terminal failed !testme result for build #53 plus the default-mode explanatory stale-test comment, but the recipe PR head has no cc-ci/testme commit status at all. As a result, the helper cannot read the verdict back from the PR and poll-only returns PENDING even though the PR already shows the terminal outcome.

Cold repro:

  1. Use recipe-maintainers/matrix-synapse PR #1, head 21e5d84430bdc52f8fa8aa9a40fa5bda8adf06c0.
  2. Confirm PR comments include:
    • failure result comment for build #53 (#13872), and
    • explanatory stale-test comment (#13877).
  3. Run: POST=0 MAX_WAIT=20 INTERVAL=5 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh matrix-synapse 1
  4. Observe:
    • helper returns VERDICT=PENDING and BUILD=?;
    • GET /repos/recipe-maintainers/matrix-synapse/commits/21e5d84430bdc52f8fa8aa9a40fa5bda8adf06c0/status returns {"state":"","total_count":0,"statuses":null}.

Impact: this breaks the Phase-5 requirement that the upgrade tooling read the verdict back from the PR on the live stale-test/default path. The comment surface says the run is terminal; the status surface still says nothing.

Re-test result: no longer reproducible on rerun build #63. The recipe PR head now shows cc-ci/testme pending -> failure with target URL .../63, and poll-only returns VERDICT=PENDING BUILD=.../63 while in flight, then VERDICT=RED BUILD=.../63 after completion.
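
The status-to-verdict read-back this finding depends on can be sketched as below. This is a minimal illustration, not the helper's actual code: the function name `verdict_from_combined_status`, the success/failure to GREEN/RED mapping, and the per-status field names (`status` or `state`, `created_at`, `target_url`) are assumptions; only the empty payload shape and the `cc-ci/testme` context come from the repro above.

```python
from datetime import datetime, timezone

# Assumed mapping from commit-status state to the helper's verdict words.
_STATE_TO_VERDICT = {"success": "GREEN", "failure": "RED", "error": "RED", "pending": "PENDING"}


def _parse_ts(value):
    """Parse an ISO-8601 timestamp; missing values sort as oldest."""
    if not value:
        return datetime.min.replace(tzinfo=timezone.utc)
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def verdict_from_combined_status(payload, context="cc-ci/testme"):
    """Return (verdict, build_url) for one status context on a commit."""
    statuses = payload.get("statuses") or []  # the repro returned "statuses": null
    matching = [s for s in statuses if s.get("context") == context]
    if not matching:
        # No status for this context: exactly the A5-4 symptom.
        return "PENDING", "?"
    latest = max(matching, key=lambda s: _parse_ts(s.get("created_at")))
    state = latest.get("status") or latest.get("state") or ""
    return _STATE_TO_VERDICT.get(state, "PENDING"), latest.get("target_url") or "?"


# Payload observed in step 4 of the cold repro.
print(verdict_from_combined_status({"state": "", "total_count": 0, "statuses": None}))
# ('PENDING', '?')
```

With no `cc-ci/testme` entry in the payload the sketch cannot distinguish "never ran" from "still running", which is why the comment surface and the status surface disagreed before the fix.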

[adversary] A5-3 — POST=1 testme-on-pr.sh can return a stale prior GREEN on re-runs

Status: CLOSED — re-tested 2026-06-01T03:31:30Z; see REVIEW-5.md follow-up entry.

The helper currently posts a fresh !testme, then polls the recipe PR head's combined commit status. If that PR head SHA already has a previous successful cc-ci/testme status and the bridge has not yet processed the new comment, the helper exits immediately with the old GREEN/build URL instead of a fresh PENDING or the new run's URL.

This is a real Phase-5/V2 correctness bug because re-commenting !testme on the same PR head is a supported path, and the helper is meant to report the verdict for the run it just triggered.

Cold repro:

  1. Use an open PR whose current head SHA already has cc-ci/testme: success from an earlier run.
  2. Record the PR comment count.
  3. Run: POST=1 MAX_WAIT=40 INTERVAL=5 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 5
  4. Observe:
    • the PR comment count increases by exactly one (3 -> 4 in the reproducer), so one fresh !testme was posted;
    • the helper returns VERDICT=GREEN with the old build URL https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/37;
    • later, the live system shows a new run was actually triggered and reflected on the PR as build #41 (cc-ci/testme pending -> success, target URL /41).

Likely fix direction: after POST=1, do not trust a pre-existing terminal status on the same SHA. Poll for evidence that belongs to the newly-triggered run (e.g. a newer status timestamp, a pending status after the new comment, or a changed build URL/context generation marker) before returning.
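
One of the options named above, the newer status timestamp, could look roughly like the following. It is a sketch under assumptions: `verdict_for_new_run` is a hypothetical name, statuses are assumed to carry `created_at`, `context`, `target_url` and a `status` (or `state`) field, and the cutoff is assumed to be the creation time of the freshly posted !testme comment as reported by the same server, which avoids clock skew between the helper and the forge.

```python
from datetime import datetime


def _ts(value):
    """Parse an ISO-8601 timestamp, accepting a trailing 'Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def verdict_for_new_run(statuses, comment_created_at, context="cc-ci/testme"):
    """Return (verdict, build_url), ignoring statuses older than the new comment."""
    cutoff = _ts(comment_created_at)
    fresh = [
        s for s in (statuses or [])
        if s.get("context") == context and _ts(s["created_at"]) >= cutoff
    ]
    if not fresh:
        # Only stale evidence exists so far: keep polling instead of
        # returning the previous run's GREEN.
        return "PENDING", "?"
    latest = max(fresh, key=lambda s: _ts(s["created_at"]))
    state = latest.get("status") or latest.get("state")
    verdict = {"success": "GREEN", "failure": "RED", "error": "RED"}.get(state, "PENDING")
    return verdict, latest.get("target_url") or "?"
```

A timestamp cutoff is the simplest of the three options, but it assumes the bridge always writes a new status after the comment it reacts to; comparing build URLs would be a reasonable alternative if that ordering cannot be relied on.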

[adversary] A5-2 — CRITICAL: testme-on-pr.sh cannot read verdicts (commit status vs comment mismatch)

Status: CLOSED — re-tested 2026-05-31T19:41:12Z; see REVIEW-5.md follow-up entry.

testme-on-pr.sh reads Gitea commit statuses on the recipe PR's head SHA. But the bridge NEVER sets Gitea commit statuses on recipe repos — it only posts PR comments (the YunoHost card+badge). Drone posts commit statuses on the cc-ci repo (its own repo), not on recipe repos.

Evidence:

  • GET /repos/recipe-maintainers/custom-html/commits/db9a95024e9d.../status → state:'', statuses:0
  • POST=0 testme-on-pr.sh custom-html 2 → VERDICT=PENDING BUILD=? (always, on any known-green PR)
  • Bridge source bridge.py: no call to POST /repos/{owner}/{recipe}/statuses/{sha} anywhere

Required fix (one of):

  1. (Preferred) Bridge: after triggering a Drone build, POST state=pending on the recipe PR's head SHA; on build completion, POST state=success or state=failure with the build URL as target_url. This makes testme-on-pr.sh work unmodified and adds a native SCM status indicator.
  2. testme-on-pr.sh: scan the recipe PR's comments for the <!-- cc-ci:testme --> marker and parse the result from the comment body (fragile but avoids bridge changes).
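
For the preferred option, the request the bridge would send might be built along these lines. This is a sketch, not the code in bridge.py: `build_status_request` and its parameters are hypothetical, the `/api/v1` prefix and `Authorization: token ...` header assume a standard Gitea API setup, and the `description` text is invented for illustration. The block only constructs the request; nothing is sent.

```python
import json
import urllib.request

# Subset of commit-status states this sketch accepts.
ALLOWED_STATES = {"pending", "success", "failure", "error"}


def build_status_request(base_url, owner, repo, sha, state, build_url, token,
                         context="cc-ci/testme"):
    """Build (but do not send) a POST to /repos/{owner}/{repo}/statuses/{sha}."""
    if state not in ALLOWED_STATES:
        raise ValueError(f"unsupported state: {state!r}")
    url = f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}/statuses/{sha}"
    body = {
        "state": state,
        "target_url": build_url,   # Drone build URL, read back by the helper
        "context": context,
        "description": f"cc-ci !testme: {state}",  # illustrative wording
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"token {token}"},
        method="POST",
    )


# The bridge would call this once with "pending" when it triggers the build and
# once with "success" or "failure" when the build finishes, then send each
# request with urllib.request.urlopen(req).
```

Keeping the same `context` for the pending and terminal posts lets the terminal status supersede the pending one on the PR head rather than appearing as a second check.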

Repro: POST=0 MAX_WAIT=60 INTERVAL=5 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html 2 → always VERDICT=PENDING even after a green Drone build.

(Only Adversary closes this, after re-testing with a VERDICT=GREEN on a real green build.)

[adversary] A5-1 — custom-html-tiny not in bridge poll list

Status: CLOSED — re-tested 2026-05-31T19:41:12Z; see REVIEW-5.md follow-up entry.

The Phase 5 plan specifies using custom-html-tiny as the sandbox recipe for V3–V8 tests. However the bridge's poll list (from live container logs) does NOT include recipe-maintainers/custom-html-tiny:

poller (primary) watching ['recipe-maintainers/cc-ci', 'recipe-maintainers/custom-html',
'recipe-maintainers/keycloak', 'recipe-maintainers/cryptpad', 'recipe-maintainers/matrix-synapse',
'recipe-maintainers/lasuite-docs', 'recipe-maintainers/n8n', 'recipe-maintainers/hedgedoc'] every 30s

This means !testme on a custom-html-tiny PR will NOT trigger a Drone build. Either:

  1. The builder must add custom-html-tiny to the bridge's enrolled repos list (and enroll its tests), OR
  2. Use custom-html (which IS enrolled) as the sandbox recipe instead, OR
  3. The plan's V3–V8 tests must first enroll the sandbox recipe as part of Phase 5 setup

Repro: docker logs ccci-bridge_app.1.<id> 2>&1 | head -3 on cc-ci shows the poll list.
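
The enrollment check in this repro can be scripted against the captured log text. The sketch below assumes the poller keeps printing its watch list as a Python-style list after the word "watching", as in the excerpt above; `watched_repos` is a hypothetical helper, not part of the bridge.

```python
import ast
import re


def watched_repos(log_text):
    """Extract the repo list from a 'poller ... watching [...]' log line."""
    # DOTALL because the list may wrap across several lines in captured logs.
    match = re.search(r"watching\s+(\[.*?\])", log_text, re.DOTALL)
    if match is None:
        return []
    return list(ast.literal_eval(match.group(1)))


log_excerpt = """poller (primary) watching ['recipe-maintainers/cc-ci', 'recipe-maintainers/custom-html',
'recipe-maintainers/keycloak', 'recipe-maintainers/cryptpad', 'recipe-maintainers/matrix-synapse',
'recipe-maintainers/lasuite-docs', 'recipe-maintainers/n8n', 'recipe-maintainers/hedgedoc'] every 30s"""

repos = watched_repos(log_excerpt)
print("recipe-maintainers/custom-html-tiny" in repos)  # False
print("recipe-maintainers/custom-html" in repos)       # True
```

Running the same check before each verification step would turn the silent PENDING-forever failure into an explicit "sandbox recipe not enrolled" error.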

Impact: V3, V4, V5, V8 tests using custom-html-tiny as sandbox will fail silently (the !testme comment is posted but the bridge never sees it → VERDICT stays PENDING forever).

(Only Adversary closes this after re-test.)