cc-ci/machine-docs/JOURNAL-prevb.md
2026-06-17 00:37:23 +00:00

# JOURNAL — phase `prevb` (Builder reasoning; append-only)
## 2026-06-17 — Bootstrap + recon
Read SSOT (plan-phase-prevb), plan.md §6.1/§7/§9, Adversary's REVIEW-prevb (live, idle awaiting M1 claim).
**Mapped the harness upgrade flow** (`runner/run_recipe_ci.py`, `harness/lifecycle.py`,
`harness/generic.py`, `harness/meta.py`, `harness/canonical.py`):
- Base decision: `upgrade_base(stages, meta, recipe)` → `None` if upgrade ∉ stages or EXPECTED_NA[upgrade],
else `meta.UPGRADE_BASE_VERSION or lifecycle.previous_version(recipe)` (= `recipe_versions[-2]`).
`base = prev or target`; `prev` also gates whether the upgrade tier runs.
- Deploy: `deploy_app(version=base)` → pinned `recipe_checkout(version)` + (auto-chaos if overlay/lightweight tag);
`version=None` → chaos deploy of the current (head) checkout.
- Overlay `compose.ccci.yml`: copied into the checkout (`provide_ccci_overlay`), referenced by
`EXTRA_ENV.COMPOSE_FILE`, persists untracked across the head re-checkout → applies to ALL deploys.
- Upgrade op (`generic.perform_upgrade`): `recipe_checkout_ref(head_ref)` then chaos redeploy; the
ccci overlay persists → leaks version-specific pins onto the head. **That is the bug.**
- Last-green source: `canonical.read_registry(recipe)` → `{version, commit, status}` (promoted only on
GREEN LATEST cold runs for `WARM_CANONICAL` recipes). No separate "last-green" file.
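
The pre-change base decision mapped above can be sketched as a standalone function. This is a simplified illustration, not the harness code: `old_upgrade_base`, `deploy_base`, and their flattened parameters (`expected_na`, `override`, `recipe_versions`) are hypothetical stand-ins for the `meta` and `recipe` objects the real `upgrade_base` receives.

```python
from typing import Mapping, Optional, Sequence


def previous_version(recipe_versions: Sequence[str]) -> Optional[str]:
    """Second-newest published version, i.e. recipe_versions[-2], or None."""
    return recipe_versions[-2] if len(recipe_versions) >= 2 else None


def old_upgrade_base(
    stages: Sequence[str],
    expected_na: Mapping[str, bool],   # assumption: stands in for EXPECTED_NA
    override: Optional[str],           # assumption: meta.UPGRADE_BASE_VERSION
    recipe_versions: Sequence[str],    # assumption: oldest-to-newest tags
) -> Optional[str]:
    """Return the pinned base version, or None when no upgrade tier should run."""
    if "upgrade" not in stages or expected_na.get("upgrade", False):
        return None
    # An explicit override wins; otherwise fall back to the second-newest tag.
    return override or previous_version(recipe_versions)


def deploy_base(prev: Optional[str], target: str) -> str:
    """`base = prev or target`: with no predecessor, deploy the target itself."""
    return prev or target
```

The weakness is visible in the fallback: `recipe_versions[-2]` only looks at published tags, so it cannot see a `main` branch that has moved past the newest tag.
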
**Ground-truth discourse facts** (gitea API, verified — see STATUS for the table). Key correction vs
plan §3 prose: main is `bitnamilegacy/discourse:3.5.0` (not 3.3.1 — main advanced). Thesis holds: base
(last-green/main = bitnamilegacy 3.5.0, deployable) → head (PR #4 = official discourse/discourse:3.5.3,
sidekiq dropped). So discourse needs NO `previous/`; the env overlay shrinks to `order: stop-first`.
**Design decisions (WHY):**
- *Resolution order* last-green → main-tip → skip. main-tip = the recipe's `main` branch HEAD = the true
predecessor the PR merges onto (more faithful than the old `vers[-2]`, which could span 2 version jumps).
This intentionally changes EVERY recipe's default base from `vers[-2]` to main-tip — plan-mandated, not a
regression; M2 spot-check validates representative recipes still go green.
- *Keep `UPGRADE_BASE_VERSION` as an optional explicit override* (still wins when set), but remove it from
discourse and make the DEFAULT dynamic. Rationale: fully deleting the meta field would break `plausible`
(its meta sets it) and the documented "PR adds a version above newest tag" escape hatch, without a deploy
test — risk vs guardrail "don't regress other recipes". The plan's "UPGRADE_BASE_VERSION removed" is in the
discourse-migration context; the normal/discourse path is now hardcode-free. Recorded in DECISIONS.
- *`previous/` scoped to last-green (published-version) base only* — version-guarded by a declared target;
on a main-tip base or version mismatch it is skipped + flagged stale. Discourse ships none (base deploys clean).
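
A minimal sketch of the resolution order and the `previous/` guard described in the decisions above. The names `resolve_upgrade_base` and `BasePlan` appear in this journal, but the fields, the `source` labels, and the `previous_applies` helper are assumptions made for illustration; the real signatures may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BasePlan:
    kind: str                    # "version", "ref", or "skip" (assumed values)
    value: Optional[str] = None  # version string or git ref
    source: str = ""             # assumed label: which rule picked the base


def resolve_upgrade_base(
    override: Optional[str],     # explicit UPGRADE_BASE_VERSION, if set
    last_green: Optional[str],   # version from the canonical registry, if any
    main_tip: Optional[str],     # HEAD commit of the recipe's main branch
) -> BasePlan:
    """Explicit override first, then last-green, then main-tip, else skip."""
    if override:
        return BasePlan("version", override, "override")
    if last_green:
        return BasePlan("version", last_green, "last-green")
    if main_tip:
        return BasePlan("ref", main_tip, "main-tip")
    return BasePlan("skip", None, "unresolved")


def previous_applies(plan: BasePlan, declared_target: Optional[str]) -> bool:
    """Apply previous/ only on a last-green base whose version matches the
    version that previous/ declares; anything else is skipped as stale."""
    return plan.source == "last-green" and plan.value == declared_target
```

Under these assumptions, a recipe with no registry entry and no override resolves to `kind="ref"` on the main tip, and `previous_applies` is false for it regardless of what `previous/` declares.
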
## 2026-06-17T00:30Z — M1 code done (unit+lint green); discourse e2e launched
Implemented B1–B4 (commit bb2e3c6): resolve_upgrade_base/BasePlan, deploy_app base_ref+apply_previous,
previous/ surface in lifecycle, generic.perform_upgrade strip, discourse migration, unit tests.
Unit: 88 relevant pass (full suite 283 pass; 1 PRE-EXISTING unrelated fail
`test_warm_reconcile::test_traefik_spec_is_stateless_with_setup` KeyError 'health_domain' — fails on
clean HEAD, not mine; flagged for Adversary). Lint PASS.
B5 e2e launched on cc-ci (/root/prevb-deploy @ bb2e3c6), STAGES=install,upgrade, discourse PR#4
(REF=ae5a8180, SRC=recipe-maintainers/discourse). First log lines confirm the core mechanism:
`== upgrade base: kind=ref ref=f87c612d71b4 (target-branch (main) tip)` → base = main-tip chaos deploy
(bitnamilegacy:3.5.0), env overlay provided. Base now in slow Rails cold boot (15-25min). Polling ~5min.
(lint rung fail R011 = recipe-level, a rung not a gate; prepull skipped on the known sidekiq-depends-on
config rc=15 — non-fatal.)
## 2026-06-17T00:40Z — M1 GREEN locally; claiming
discourse install,upgrade e2e GREEN (2nd run, after the prune fix). Evidence in run-prevb-disc2.log on
cc-ci /root/prevb-deploy. The dynamic main-tip base worked first try (kind=ref f87c612d) — crucial,
because main (0.8.1+3.5.0) is AHEAD of the newest published tag (0.7.0+3.3.1), so the OLD vers[-2]
default (=0.6.3) would have been the wrong predecessor entirely. The upgrade moved
0.8.1+3.5.0 (bitnamilegacy, main-tip) → 1.0.0+3.5.3 (official, PR head), chaos-version=ae5a8180+U.
**The one real bug found+fixed (WHY):** first run, `test_head_runs_official_image` PASSED (head app =
official 3.5.3; the leak is gone) but `test_sidekiq_service_dropped` FAILED: `docker stack deploy`
(what `abra app deploy` runs) only adds/updates services; it does NOT prune ones the new compose dropped,
so the base's sidekiq was left orphaned on the old image. This is a swarm mechanic, not a head-deploy failure, but
it means the deployed stack didn't faithfully reflect the head. Fix = `prune_orphan_services` in
perform_upgrade: reconcile the live stack to the head compose's `config --services` set (remove orphans).
Faithful (deployed stack == head), no-op when service sets match / compose unresolvable, weakens nothing.
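
The reconcile step can be sketched as a pure set difference. This is an illustration under stated assumptions, not the `prune_orphan_services` implementation: `orphan_services` is a hypothetical helper, `live` is assumed to be the stack's running service names, and `head` the service names reported by the head compose's `config --services` (or `None` when the compose cannot be resolved). Swarm names stack services `<stack>_<service>`, which is what the prefixing relies on.

```python
from typing import Iterable, List, Optional


def orphan_services(
    live: Iterable[str],
    head: Optional[Iterable[str]],
    stack: str,
) -> List[str]:
    """Return live stack services that the head compose no longer declares."""
    if head is None:
        # Compose unresolvable: do nothing rather than guess.
        return []
    wanted = {f"{stack}_{name}" for name in head}
    # Sorted so the removal order (and any log output) is deterministic.
    return sorted(svc for svc in live if svc not in wanted)


if __name__ == "__main__":
    # Hypothetical stack name and services, for illustration only.
    live = ["disc_app", "disc_db", "disc_sidekiq"]
    print(orphan_services(live, ["app", "db"], "disc"))
```

Each returned name would then be removed from the swarm (for example with `docker service rm`), which leaves the deployed stack equal to the head's service set and is a no-op when the sets already match.
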
Decided to CLAIM with the e2e green + image/sidekiq proof and leave the deliberately-broken-head teeth
probe to the Adversary's cold acceptance (its explicit M1 check; I can't push a broken commit to the
recipe mirror per guardrails). STATUS spells out where the teeth hold so they know where to probe.