Reboot survival for the Pi orchestrator host:

- systemd unit cc-ci-plan/systemd/cc-ci-loops.service (installed + enabled): on boot records the reboot, starts loops+watchdog (RESUME_PHASE=1), and resumes the orchestrator session.
- reboot-log.sh: boot_id-gated reboot record -> REBOOTS.md (manual restarts don't count).
- launch-orchestrator.sh: injects an AGENTS.md startup nudge so an auto-resumed orchestrator announces itself (PushNotification) + reports reboots.
- AGENTS.md: on-startup notify routine documented.

Plans/tooling accumulated this session:

- plan-phase1d (generic suite), 1e (harness corrections), phase4 (final review), sso-dep-testing, orchestrator-migration (parked), test-e2e-testme-acceptance.
- launch.sh: 1d/1e/2/2b/3/4 phase sequence, machine-docs-aware state resolution, limit-stall re-nudge, INBOX side-channel detection.
- plan.md §6.1/§7: artifact-layer isolation, INBOX, 5-min long-run polling, DEFERRED.
- prompts: isolation discipline + INBOX + pacing.
- .gitignore: harden (.sops/, cc-ci-secrets/, .claude/, *.tmp.*).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
43 lines
4.3 KiB
Markdown
# cc-ci-plan

Self-contained handoff package for building the **cc-ci** Co-op Cloud recipe CI server with two
autonomous Claude loops (a Builder and an adversarial Reviewer) running over days.
## Start here

1. Read **`plan.md`**: the full plan and single source of truth (mission, Definition of Done,
   architecture, milestones, the two-agent coordination protocol, loop discipline).
2. Read **`kickoff.md`**: how to launch and supervise the loops.
3. Run **`./launch.sh start`** to bring up both loops + the watchdog.
## Files

| File | Purpose |
|---|---|
| `plan.md` | The Phase-1 plan (build the CI server). Agents treat it as their single source of truth. |
| `plan-phase1c-full-reproducibility.md` | **Phase 1c** (runs first): make the VM fully reproducible from git (all secrets incl. the wildcard cert in sops, in a separate private `cc-ci-secrets` repo as a flake input; base stays well-parameterized) and do the **genuine throwaway-VM live rebuild** to close D8 honestly (the "infeasible by design" was overstated). |
| `plan-phase1b-review-lint.md` | **Phase 1b** (after 1c): deterministic linting/formatting in CI + a white-box review checklist (real tests, DRY harness, idempotent Nix, no footguns/secrets), ending in a full cold re-verification of all D1–D10, now covering 1c's refactor. |
| `plan-phase1d-generic-test-suite.md` | **Phase 1d** (after 1b, before 2): a **generic install/upgrade/backup/restore** suite that runs on *any* recipe with zero config, with a recipe's own `test_<op>.py` **overriding or extending** the generic (Builder's call) and **reusing the generic's deployment (no redeploy)**, plus optional custom install-steps; recipes needing special setup fail the generic form gracefully. The test-architecture foundation Phase 2 builds on. |
| `plan-phase1e-harness-corrections.md` | **Phase 1e** (after 1d, before 2): three operator-review corrections to the shared generic harness: (HC1) upgrade goes previous-release → **PR head** via `deploy --chaos`; (HC2) **repo-local PR code runs only for approved recipes** (default = cc-ci overlays + generic only); (HC3) the **generic runs by default** alongside an overlay, skipped only via explicit opt-out. |
| `plan-phase2-recipe-tests.md` | **Phase 2** (after Phase 1e): build on the corrected generic suite by authoring the recipe overlays (port recipe-maintainer tests as `test_*.py`) + defining custom install steps where a recipe fails generically. |
| `plan-phase2b-test-performance.md` | **Phase 2b** (after Phase 2, before Phase 3): empirically measure where test time goes and reduce it (image cache, readiness tuning, dedup deploys, warm infra, concurrency), with no weakened tests. |
| `plan-phase3-results-ux.md` | **Phase 3** (after Phase 2b): beautiful YunoHost-style results: per-run **level**, image-forward PR comment (badge + summary card + app screenshot), polished dashboard. |
| `IDEAS.md` | Deferred/future ideas, parked out of current scope. |
| `brief.md` | The original one-page brief (context only; `plan.md` supersedes it). |
| `kickoff.md` | Launch & supervision guide. |
| `launch.sh` | Starts both loops + a watchdog; restarts dead loops; stops on `## DONE`. |
| `prompts/builder.md` | Builder loop prompt (fed to `claude` by the script). |
| `prompts/adversary.md` | Adversary loop prompt. |
## Before launching

- Set the org in `plan.md` (`git.autonomic.zone/recipe-maintainers/cc-ci`) and lock the six proof recipes (§8).
- Ensure the launching shell has: SSH+sudo to `cc-ci`, the Gitea token, `git.autonomic.zone` access.
- Preconfigure test-app DNS + TLS (plan §4.0): point a wildcard `*.ci.commoninternet.net` record at a gateway that passes TLS through to cc-ci, and **pre-issue the wildcard cert** (`*.ci.commoninternet.net` + `ci.commoninternet.net`, via Gandi DNS-01) into `/var/lib/ci-certs/live/` on cc-ci. The agent handles everything else on cc-ci (Traefik file provider pointing at that cert, swarm, routing) and does **no ACME**; renewal (~90 days) is an out-of-band operator task, so the DNS token never goes to the agent.
- `export CC_CI_REPO=https://git.autonomic.zone/recipe-maintainers/cc-ci.git` so the watchdog can detect `## DONE`.
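The shell prerequisites above could be checked before launch with a small preflight helper. This is only a sketch under stated assumptions: the function names `check_repo_env` and `check_ssh_sudo` are hypothetical and not part of `launch.sh`, and the SSH check assumes that `cc-ci` resolves through your SSH configuration and that sudo on the host works without a password prompt.

```shell
#!/bin/sh
# Hypothetical preflight helpers (illustrative only; not part of launch.sh).

# Succeeds only if CC_CI_REPO is set to a non-empty value, and echoes it.
check_repo_env() {
  if [ -z "${CC_CI_REPO:-}" ]; then
    echo "CC_CI_REPO is not set; the watchdog cannot detect '## DONE'" >&2
    return 1
  fi
  echo "CC_CI_REPO=${CC_CI_REPO}"
}

# Succeeds only if non-interactive SSH to the host works and sudo there
# runs without prompting. Assumption: passwordless sudo on the host.
check_ssh_sudo() {
  host="${1:-cc-ci}"
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" 'sudo -n true'
}
```

One possible use is to chain the checks in front of the launch command: `check_repo_env && check_ssh_sudo cc-ci && ./launch.sh start`.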
## What "done" means

The loops stop only when all of `plan.md` §2 (D1–D10) hold **and** the Adversary has independently
re-verified each within 24h. The watchdog then tears the loops down automatically.
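How the watchdog actually detects `## DONE` is defined by `launch.sh`, which is not reproduced here. As a rough sketch of the idea only, assuming the marker appears on a line of its own in some status file (the file to inspect and the function name `has_done_marker` are hypothetical):

```shell
#!/bin/sh
# Hypothetical sketch; the real detection logic lives in launch.sh.

# Succeeds if the given file exists and contains a line that is exactly
# "## DONE" (trailing whitespace tolerated). Fails otherwise.
has_done_marker() {
  [ -f "$1" ] && grep -Eq '^## DONE[[:space:]]*$' "$1"
}
```

Matching the whole line keeps a passing mention such as "not `## DONE` yet" from counting as the stop signal. A watchdog could run a check like this against a fresh checkout of `$CC_CI_REPO` on each polling cycle.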