cc-ci/STATUS-dstamp.md

STATUS — phase dstamp (discourse abra-stamp drift)

Builder. SSOT: cc-ci-plan/plan-phase-dstamp-discourse-drift.md. Gates M1, M2.

Phase state: INVESTIGATING (no gate claimed yet)

What is established (direct evidence, reproducible)

  • abra is CONSTANT, not the cause. abra binary bf6azhpi…-abra-0.13.0-beta is the store path for every nixos system generation from system-4 (2026-06-01) through system-11 (now). No abra change between 06-05 and 06-10. HOW: for g in $(ls -d /nix/var/nix/profiles/system-*-link); do readlink -f "$g/sw/bin/abra"; done on cc-ci. EXPECTED: all …bf6azhpi… from system-4 on.
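The generation check above can also be scripted. A minimal sketch, assuming the usual `system-N-link` profile layout; the helper names `abra_store_paths` and `is_constant_from` are hypothetical and not part of cc-ci:

```python
import glob
import os
import re

def abra_store_paths(profiles_dir="/nix/var/nix/profiles"):
    """Map each system-N-link generation number to the resolved abra path."""
    paths = {}
    for link in glob.glob(os.path.join(profiles_dir, "system-*-link")):
        match = re.fullmatch(r"system-(\d+)-link", os.path.basename(link))
        if match is None:
            continue  # skip anything that is not a numbered generation
        # os.path.realpath follows symlinks, similar to `readlink -f`
        abra = os.path.realpath(os.path.join(link, "sw", "bin", "abra"))
        paths[int(match.group(1))] = abra
    return dict(sorted(paths.items()))

def is_constant_from(paths, first_gen):
    """True if every generation >= first_gen resolves to a single path."""
    selected = {path for gen, path in paths.items() if gen >= first_gen}
    return len(selected) == 1

if __name__ == "__main__":
    resolved = abra_store_paths()
    for gen, path in resolved.items():
        print(gen, path)
    print("constant from system-4:", is_constant_from(resolved, 4))
```

Run on cc-ci, the last line should report `True` if the finding above still holds.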

  • abra's chaos-version = SmallSHA(git HEAD of the recipe checkout), with a +U suffix if the worktree is dirty. Source: abra@06a57de cli/app/deploy.go:106,168,365-373 (chaos → toDeployVersion = Recipe.ChaosVersion()), pkg/recipe/git.go:300-318 (ChaosVersion = SmallSHA(Head())), :483-495 (Head = go-git repo.Head()). In chaos mode Recipe.Ensure early-returns (pkg/recipe/git.go:41-43): NO env-version re-checkout.
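The stamping rule can be restated as a pure function. This is an illustration only, not abra's Go code; the 8-character length is inferred from the stamps quoted in this file (7ae7b0f7 for ref 7ae7b0f76efb), and `chaos_version` is a hypothetical name:

```python
def chaos_version(head_sha: str, worktree_dirty: bool, length: int = 8) -> str:
    """Model of the chaos stamp: short HEAD SHA, plus '+U' when dirty.

    Assumption: the short SHA is the first `length` characters of HEAD.
    """
    small = head_sha[:length]
    return small + "+U" if worktree_dirty else small

if __name__ == "__main__":
    print(chaos_version("7ae7b0f76efb", worktree_dirty=False))  # 7ae7b0f7
    print(chaos_version("eb96de94", worktree_dirty=True))       # eb96de94+U
```

Under this model the stamp depends only on the checkout's HEAD and dirtiness at deploy time, so a stamp of eb96de94+U implies the checkout was at a different HEAD and dirty at that moment.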

  • The isolated git/abra path stamps CORRECTLY now. Three faithful reproductions on cc-ci (scratch ABRA_DIR, fake domain; each deploy bails at "secret not generated", which happens AFTER the chaos version is computed) all log taking chaos version: 7ae7b0f7 (= PR head), NOT eb96de9:

    1. cp -a canonical recipe + manual tag/head checkout.
    2. real non-chaos base deploy (go-git EnsureVersion tag checkout) → CLI re-checkout head → chaos.
    3. exact fetch_recipe replica: clone mirror recipe-maintainers/discourse @7ae7b0f + git fetch upstream refs/tags/* → base deploy → re-checkout head → chaos.

    HOW (variant 3, re-runnable cold): see JOURNAL-dstamp 2026-06-11 "mirror-faithful repro". EXPECTED: DEBU app/deploy.go:372 version: taking chaos version: 7ae7b0f7.

  • Same ref, solo run was GREEN; clustered runs DRIFTED. discourse @ ref 7ae7b0f76efb: run 184 (2026-06-05 02:17, solo) = L4, upgrade PASS; the 06-10/06-11 runs m2b-discourse (06-10 20:54), m2p-discourse (06-11 00:44), ab-discourse-7ae7b0f-oldmain (06-11 00:48) = L1, upgrade FAIL (chaos commit 'eb96de94+U', not the intended PR-head '7ae7b0f76efb' (HC1)). HOW: grep -oE '"level": [0-9]+|"upgrade": "[a-z]+"' /var/lib/cc-ci-runs/{184,m2p-discourse}/results.json.
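The grep above matches on text; a structured read is less sensitive to JSON formatting. A minimal sketch, assuming only that results.json contains "level" and "upgrade" keys somewhere in its structure (the exact schema is not shown here, so the search is recursive); `find_values` and `summarize` are hypothetical helpers:

```python
import json

def find_values(node, key):
    """Yield every value stored under `key` anywhere in nested JSON data."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)

def summarize(path):
    """Collect all 'level' and 'upgrade' values from one results.json."""
    with open(path) as fh:
        data = json.load(fh)
    return {
        "level": list(find_values(data, "level")),
        "upgrade": list(find_values(data, "upgrade")),
    }

if __name__ == "__main__":
    for run in ("184", "m2p-discourse"):
        print(run, summarize(f"/var/lib/cc-ci-runs/{run}/results.json"))
```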

  • All same-ref discourse runs share ONE swarm stack. naming.app_domain(recipe,pr,ref) = <recipe[:4]>-<6hex(recipe|pr|ref)>.ci.commoninternet.net → identical for identical (recipe,pr,ref). The upgrade chaos_redeploy bypasses deploy_app's app-domain flock (lifecycle.chaos_redeploy / generic.perform_upgrade). LEADING HYPOTHESIS: the 06-10/06-11 drift is a CONCURRENCY ARTIFACT of the clustered rcust-M2 A/B discourse experiments racing on the shared stack — NOT an abra/recipe/env regression. Under test now.
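Why identical inputs collide can be seen from a sketch of the naming scheme. The hash function is an assumption (SHA-256 here; the real naming.app_domain may use another digest), and pr=0 is a placeholder value; only the shape <recipe[:4]>-<6hex>.ci.commoninternet.net comes from this file:

```python
import hashlib

def app_domain(recipe, pr, ref, suffix="ci.commoninternet.net"):
    """Deterministic app domain derived from (recipe, pr, ref).

    Assumption: the 6 hex characters are a prefix of SHA-256 over
    "recipe|pr|ref"; the actual cc-ci implementation may differ.
    """
    key = f"{recipe}|{pr}|{ref}".encode()
    digest = hashlib.sha256(key).hexdigest()[:6]
    return f"{recipe[:4]}-{digest}.{suffix}"

if __name__ == "__main__":
    a = app_domain("discourse", 0, "7ae7b0f76efb")  # pr=0 is a placeholder
    b = app_domain("discourse", 0, "7ae7b0f76efb")
    print(a, a == b)  # two runs on the same ref map to the same stack name
```

Because the run id is not part of the key, any two concurrent runs on the same (recipe, pr, ref) target one stack, and only a lock keyed on this domain and taken on every deploy path would serialize them.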

In flight

  • Isolated clean real run (CCCI_RUN_ID=dstamp-repro1, STAGES=install,upgrade, ref 7ae7b0f, no concurrent discourse run) with full console capture → decides: isolated real run GREEN (⇒ concurrency artifact) vs DRIFT (⇒ read exact console). Console: /var/lib/cc-ci-runs/dstamp-repro1.console.log on cc-ci.
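If the run drifts, the first thing to pull from the console is the stamp itself. A minimal sketch, assuming the console capture contains the same "taking chaos version: ..." debug line quoted in the reproductions; the MATCH / DRIFT / NO-STAMP labels and the function names are hypothetical:

```python
import re

# Matches the stamp in lines such as:
#   DEBU app/deploy.go:372 version: taking chaos version: 7ae7b0f7
CHAOS_RE = re.compile(r"taking chaos version: (\S+)")

def chaos_versions(log_text):
    """Return every chaos version stamped in the log, in order."""
    return CHAOS_RE.findall(log_text)

def stamp_verdict(log_text, expected):
    """Classify a console log against the expected chaos version."""
    seen = chaos_versions(log_text)
    if not seen:
        return "NO-STAMP"
    return "MATCH" if all(v == expected for v in seen) else "DRIFT"

if __name__ == "__main__":
    with open("/var/lib/cc-ci-runs/dstamp-repro1.console.log") as fh:
        text = fh.read()
    print(chaos_versions(text), stamp_verdict(text, "7ae7b0f7"))
```

Exact matching is deliberate: a stamp such as 7ae7b0f7+U (right commit, dirty worktree) is reported as DRIFT so it gets looked at rather than silently accepted.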

Blocked

  • (none)