Files
cc-ci/machine-docs/JOURNAL-1d.md
autonomic-bot ef44d4658b feat(1d): G0 — generic install + deploy-once orchestrator (DG1 green on hedgedoc)
- harness/generic.py: recipe-agnostic assert_serving (converged + real HTTP, 404-excluded +
  not Traefik 404 body + CA-verified trusted wildcard cert), op helpers, backup_capable detect
- harness/discovery.py: per-op overlay resolution (repo-local > cc-ci > generic), custom + hook
- tests/_generic/: assertion-only tiers (install/upgrade/backup/restore) on the shared deployment
- run_recipe_ci.py: deploy-ONCE orchestrator, per-op summary, deploy-count guard (DG4.1)
- conftest live_app fixture; lifecycle deploy-count + install-steps hook + pin DOMAIN to run domain

DG1 cold-verified green on hedgedoc (pure generic, deploy-count=1, clean teardown). G0 CLAIMED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 23:27:55 +01:00

69 lines
4.5 KiB
Markdown

# JOURNAL — Phase 1d (append-only)
## 2026-05-27 — Bootstrap Phase 1d
Read SSOT `plan-phase1d-generic-test-suite.md` + plan.md §6.1/§7/§9. Studied the post-1b codebase:
`runner/run_recipe_ci.py` (per-stage pytest, currently deploy-per-stage), `tests/conftest.py`
(fixtures `deployed_app`/`deployed`/`old_app` each deploy+teardown), `runner/harness/{lifecycle,abra,naming}.py`,
and existing recipe tests (custom-html/keycloak/etc.).
Access re-verified (bootstrap, new phase):
```
$ ssh cc-ci 'hostname && whoami && nixos-version'
nixos / root / 24.11.20250630.50ab793 (Vicuna)
$ ssh cc-ci 'abra --version' -> abra version 0.13.0-beta-06a57de
$ ssh cc-ci 'docker stack ls' -> traefik, drone, ccci-bridge, ccci-dashboard, backups all up
$ ssh cc-ci 'grep -ri backupbot ~/.abra/recipes/custom-html/'
compose.yml: backupbot.backup=true ; backupbot.backup.path=/usr/share/nginx/html
$ curl -u bot ... /repos/recipe-maintainers/custom-html-tiny -> 200 (mirrored)
```
So: backup-capability is detectable by scanning compose for `backupbot.backup`; custom-html-tiny is
mirrored and has NO cc-ci tests dir → it's the DG1 pure-generic target.
**Design recorded in DECISIONS.md (Phase 1d section).** Key calls: tier model with the lifecycle OP
owned by the shared harness (test files = assertions only); OVERRIDE precedence repo-local > cc-ci >
generic with extend-by-composition; deploy-ONCE with a deploy-count guard; base version = previous
(when upgrade runs) else target; backup-capability auto-detect; install-steps shell hook.
Seeded STATUS-1d / BACKLOG-1d / JOURNAL-1d. Next: implement G0 (generic.py + discovery.py +
tests/_generic/ + deploy-once orchestrator), then verify generic install green on custom-html-tiny.
## 2026-05-27 — G0 generic install + deploy-once orchestrator: DG1 GREEN
Built the G0 machinery and proved DG1 end-to-end on the real server:
- `runner/harness/generic.py``assert_serving` (services converged + real HTTP in HEALTH_OK [excludes
404] + not Traefik's 404 body + **CA-verified TLS cert is the trusted wildcard**), op helpers
(`do_upgrade`/`do_backup`/`do_restore`), `backup_capable` (scan compose for backupbot.backup).
- `runner/harness/discovery.py` — per-op overlay resolution (repo-local > cc-ci > generic), custom
test discovery (both locations, additive), install-steps hook discovery.
- `tests/_generic/test_{install,upgrade,backup,restore}.py` — assertion-only tiers using `live_app`.
- `runner/run_recipe_ci.py` — deploy-ONCE orchestrator: base version (prev if upgrade+exists else
target), tiers run against the shared deployment, one teardown in finally, deploy-count guard +
per-op summary.
- `tests/conftest.py``live_app` fixture (reads CCCI_APP_DOMAIN; tiers never deploy).
- `lifecycle.deploy_app` — deploy-count recorder + install-steps hook + **pin DOMAIN to the run
domain** (fixes recipes whose .env.sample uses `{{ .Domain }}`, which this abra leaves unexpanded).
**Two real generic bugs found+fixed via live runs (not "should work"):**
1. custom-html-tiny deploy failed: `DOMAIN={{ .Domain }}` not auto-filled by `abra app new -D` on
0.13.0-beta → `can't evaluate field Domain`. Fix: `env_set(domain,"DOMAIN",domain)` in deploy_app.
2. `served_cert_subject` used `openssl s_client`, but **openssl is not on the host** (`cc-ci-run`
runtimeInputs has no openssl) → it silently returned None → the "not default cert" check was a
no-op (a DG7 can't-fail smell). Replaced with a pure-Python **CA-verified handshake** (`ssl`):
a publicly-trusted LE wildcard verifies + matches hostname; Traefik's self-signed default fails
verification → a genuine assertion. Verified the verify path on the host:
`ssl.create_default_context()` against ci.commoninternet.net → VERIFIED, CN=*.ci.commoninternet.net,
SAN=[*.ci.commoninternet.net, ci.commoninternet.net].
**DG1 evidence (cc-ci, final code):** custom-html-tiny is a static-web-server with an empty content
volume → genuinely serves 404 zero-config (not a serving demo), so picked **hedgedoc** (simple
category, NO cc-ci/repo-local tests → pure generic; backup-capable bonus):
```
$ RECIPE=hedgedoc STAGES=install cc-ci-run runner/run_recipe_ci.py
===== TIER: install (generic: tests/_generic/test_install.py) =====
tests/_generic/test_install.py::test_serving PASSED
===== RUN SUMMARY ===== deploy-count = 1 (expect 1) install : pass
$ docker stack ls | grep hedg -> (none — clean teardown)
```
Lint+format clean (`ruff check`/`ruff format --check` via `nix develop .#lint`). Claiming the G0 gate.