M0: flake + base NixOS config, rebuilt from repo on cc-ci
Pins nixpkgs to the rev cc-ci already ran (no-op-then-base); deploy via switch --flake on-host. System healthy (gen 3) post-switch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -6,10 +6,11 @@ Two single-writer sections (§6.1): Builder edits only `## Build backlog`; Adver
 ## Build backlog
 
 ### M0 — Foundations
-- [ ] Author flake.nix (NixOS host cc-ci) + hosts/cc-ci/{configuration,hardware}.nix from baseline
-- [ ] Deploy mechanism decision + first rebuild from repo (DECISIONS.md)
+- [x] Author flake.nix (NixOS host cc-ci) + hosts/cc-ci/{configuration,hardware}.nix from baseline
+- [x] Deploy mechanism decision + first rebuild from repo (DECISIONS.md) — switch --flake on host
 - [ ] sops-nix wiring: host age key, secrets/secrets.yaml, decrypt a test secret on host
-- [ ] Gate: M0 — `ssh cc-ci 'systemctl is-system-running'` healthy after rebuild from repo
+- [ ] Gate: M0 — `ssh cc-ci 'systemctl is-system-running'` healthy after rebuild from repo (base
+rebuild verified healthy 2026-05-26; will CLAIM gate once sops test-secret also lands)
 
 ### M1 — Swarm + abra target
 - [ ] Docker + single-node swarm via Nix
23
DECISIONS.md
@@ -12,10 +12,17 @@ Architecture decisions and dead-ends. One line of rationale each. (§0, §8)
 
 ## Open (defaults from §8, to confirm as reality lands)
 
-- **Deploy mechanism:** TBD in M0. Leaning `nixos-rebuild switch --flake` run *on cc-ci itself*
-(repo cloned on host) rather than `--target-host`/deploy-rs from the sandbox, to avoid copying
-large Nix closures over the userspace-tailscaled SOCKS proxy. Atomic-rollback is preserved by
-Nix generations. Will record final choice + rationale when M0 lands.
+- **Deploy mechanism — SETTLED (M0):** `nixos-rebuild switch --flake /root/cc-ci#cc-ci` run *on
+cc-ci itself*, with the repo materialised on the host at `/root/cc-ci`. Chosen over
+`--target-host`/deploy-rs to avoid pushing large closures over the userspace-tailscaled SOCKS
+proxy (slow/fragile). Atomic rollback preserved by Nix generations (`nixos-rebuild --rollback`).
+The switch is launched as a **detached transient systemd unit** (`systemd-run --unit=ccci-rebuild
+--collect`) so it survives a momentary ssh-over-tailscale drop during activation. For the build
+loop the host copy is synced from the sandbox clone via `tar | ssh` (rsync absent on host);
+source of truth stays the git repo. D8/install.md will document the from-scratch path (clone repo
+on a fresh host, then `nixos-rebuild switch --flake .#cc-ci`).
+- **nixpkgs pin:** flake pins the exact rev cc-ci already ran (`50ab793…`) so the first rebuild
+is a true no-op-then-base. Bump deliberately, never drift.
 - **Webhook scope:** default per-repo via enroll script.
 - **Drone runner type:** default exec (must drive host abra).
 - **Secret tool:** default sops-nix.
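The settled deploy flow above (sync the repo with `tar | ssh`, then start the switch as a transient unit) could be scripted roughly as follows. This is a minimal sketch, not the project's actual tooling: the variable names `CI_HOST`, `REMOTE_DIR`, `FLAKE_ATTR`, the helper function names, the `--apply` guard, and the use of `git archive` to produce the tar stream are all illustrative assumptions. Only the `systemd-run` and `nixos-rebuild` invocations are taken from the decision record.

```shell
#!/bin/sh
# Sketch of the on-host deploy. Helper and variable names are hypothetical.
set -eu

CI_HOST="${CI_HOST:-cc-ci}"              # ssh alias used in the journal
REMOTE_DIR="${REMOTE_DIR:-/root/cc-ci}"  # host copy of the repo
FLAKE_ATTR="${FLAKE_ATTR:-cc-ci}"        # nixosConfigurations attribute

# Remote command that unpacks a tar stream from stdin into the host copy.
remote_unpack_cmd() {
  printf 'mkdir -p %s && tar -xf - -C %s' "$REMOTE_DIR" "$REMOTE_DIR"
}

# Remote command that runs the switch inside a transient systemd unit, so the
# rebuild is owned by systemd rather than by the ssh session.
remote_switch_cmd() {
  printf 'systemd-run --unit=ccci-rebuild --collect --property=Type=oneshot nixos-rebuild switch --flake %s#%s' \
    "$REMOTE_DIR" "$FLAKE_ATTR"
}

deploy() {
  # Assumption: git archive supplies the tar stream (tracked files at HEAD only).
  git archive --format=tar HEAD | ssh "$CI_HOST" "$(remote_unpack_cmd)"
  ssh "$CI_HOST" "$(remote_switch_cmd)"
}

# Touch the host only when explicitly asked to.
if [ "${1:-}" = "--apply" ]; then
  deploy
fi
```

Run without arguments the script does nothing, which makes it safe to source while iterating; `sh deploy.sh --apply` (script name hypothetical) performs the sync and the switch. Note that unpacking over an existing directory does not delete files that were removed from the repo.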
@@ -25,9 +32,11 @@ Architecture decisions and dead-ends. One line of rationale each. (§0, §8)
 
 ## Risks
 
-- **Disk:** cc-ci has only ~3.8 GiB free on an 8.9 GiB root. Multiple recipe images + volumes may
-exhaust it during M6.5 breadth. Mitigation: aggressive teardown + image prune; if insufficient,
-request operator grow the VM disk (Incus, recreatable per the incus skill). Not yet blocking.
+- **Disk — RESOLVED 2026-05-26.** Original 8.9 GiB root had only ~3.8 GiB free *and* a hard
+**inode** ceiling (586k total, ~6k free) — the flake's nixpkgs fetch (~50k files) hit ENOSPC on
+inodes before bytes. Operator grew the VM to **28 GiB** (22 GiB free, 1.78M inodes / 1.21M free);
+the ext4 fs auto-resized (new block groups carry proportional inodes). Keep aggressive teardown +
+periodic `docker image prune` to avoid regressing during M6.5 breadth.
 
 ## Dead-ends
 - (none yet)
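Because the failure described above was inode exhaustion rather than byte exhaustion, a pre-flight check on free inodes may catch a regression earlier than `df -h` would. The sketch below is an assumption-laden illustration: `low_on_inodes` and `check_inodes` are hypothetical helpers, the 10% threshold is arbitrary, and `df --output=...` assumes GNU coreutils.

```shell
#!/bin/sh
# Sketch of an inode pre-flight check. Names and threshold are hypothetical.

# low_on_inodes TOTAL FREE [MIN_PCT]
# Exit 0 when FREE is below MIN_PCT percent of TOTAL (default 10).
low_on_inodes() {
  total=$1
  free=$2
  min_pct=${3:-10}
  [ $(( free * 100 )) -lt $(( total * min_pct )) ]
}

# check_inodes [PATH]
# Reads inode counts for PATH (default /) and warns when they run low.
check_inodes() {
  path=${1:-/}
  # Assumption: GNU df, which supports --output with itotal/iavail columns.
  itotal=$(df --output=itotal "$path" | tail -n 1 | tr -d ' ')
  iavail=$(df --output=iavail "$path" | tail -n 1 | tr -d ' ')
  if low_on_inodes "$itotal" "$iavail"; then
    echo "WARN: $path has $iavail of $itotal inodes free" >&2
    return 1
  fi
  echo "ok: $path has $iavail of $itotal inodes free"
}
```

With the figures recorded in the journal, `low_on_inodes 586336 6005` succeeds (about 1% free), while the post-resize numbers of roughly 1.21M free out of 1.78M do not trip the check.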
36
JOURNAL.md
@@ -22,3 +22,39 @@
 - Seeded skeleton layout (§3) + loop-state files + docs/baseline.md.
 
 **Next:** commit + push bootstrap, then M0 (flake + base config + sops test secret).
+
+## 2026-05-26 — M0: flake + base config rebuilt from repo
+
+**Authored** `flake.nix` (pins nixpkgs rev `50ab793786d9…`, the exact rev cc-ci ran),
+`hosts/cc-ci/hardware.nix` (incus VM module + cloud-init + DHCP/nameservers) and
+`hosts/cc-ci/configuration.nix` (faithful baseline repro: tailscale w/ hardcoded `--hostname=
+cc-nix-test` since `builtins.readFile /etc/ts-hostname` is impure under flakes; sshd root; firewall
+trust tailscale0 + tcp/22; base pkgs).
+
+**Disk/inode hiccup → resolved:** first `nix flake lock`/build hit `No space left on device` —
+diagnosed as **inode** exhaustion (`df -i` → 6005 free of 586336; old 8.9 GiB fs). Operator grew
+the VM to 28 GiB while I was measuring; ext4 auto-resized → 22 GiB free, 1.21M inodes free. Retried.
+
+**Build + switch (commands + output):**
+- `ssh cc-ci 'cd /root/cc-ci && nix flake lock && nixos-rebuild build --flake .#cc-ci'` → `BUILD EXIT 0`,
+produced `nixos-system-nixos-24.11.20250630.50ab793`.
+- `ssh cc-ci 'systemd-run --unit=ccci-rebuild --collect --property=Type=oneshot nixos-rebuild switch
+--flake /root/cc-ci#cc-ci'` (detached so it survives ssh drop) → unit `Result=success
+ExecMainStatus=0`.
+
+**Gate verification:**
+- `systemctl is-system-running` → `running`
+- `readlink /run/current-system` → `…-nixos-system-nixos-24.11.20250630.50ab793` (gen 3, from flake)
+- `systemctl is-active tailscaled` → `active`; `sshd.socket` → `active` (sshd is socket-activated, so
+`sshd.service` reads inactive — live ssh proves it works)
+- `systemctl --failed` → none
+- `nixos-rebuild list-generations` → gen 3 current @20:23, prior channel gen 2 retained for rollback.
+
+**Known warning (tracked, non-blocking):** incus module enables `systemd.network` while we keep
+`networking.useDHCP=true` (scripted dhcpcd); Nix warns both may manage interfaces. Inherited from
+baseline; networking is up. Clean up by choosing one stack later.
+
+**Deploy mechanism settled** (DECISIONS.md): `switch --flake` on-host, repo synced via `tar | ssh`.
+
+**Next:** sops-nix wiring (host age key from ssh host key + a decrypt-a-test-secret proof), then
+CLAIM the M0 gate for the Adversary.
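The gate checks listed in the journal entry above were run by hand; they could be folded into one repeatable check along these lines. This is only a sketch: `gate_passes` and `check_gate` are hypothetical helper names, and the pass criterion (state is `running` and no failed units) is an assumption about what the M0 gate requires, covering two of the five checks recorded.

```shell
#!/bin/sh
# Sketch of an automated M0 health gate. Helper names are hypothetical.

# gate_passes STATE FAILED_COUNT
# Exit 0 only when systemd reports "running" and no unit has failed.
gate_passes() {
  state=$1
  failed_count=$2
  [ "$state" = "running" ] && [ "$failed_count" -eq 0 ]
}

# check_gate [HOST]
# Queries the host over ssh and reports pass or fail.
check_gate() {
  host=${1:-cc-ci}
  # is-system-running exits non-zero for any state other than "running",
  # so keep its output and ignore the exit status.
  sys_state=$(ssh "$host" 'systemctl is-system-running' || true)
  failed_units=$(ssh "$host" 'systemctl --failed --no-legend --plain | wc -l')
  if gate_passes "$sys_state" "$failed_units"; then
    echo "gate: pass (state=$sys_state, failed units=$failed_units)"
  else
    echo "gate: FAIL (state=$sys_state, failed units=$failed_units)" >&2
    return 1
  fi
}
```

Keeping the decision in `gate_passes`, separate from the ssh calls, lets the pass criterion be exercised without a live host.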
19
STATUS.md
@@ -1,17 +1,22 @@
 # STATUS — cc-ci Builder
 
 **Phase:** M0 — Foundations
-**In-flight:** Bootstrap complete; starting M0 (flake + base config + sops test secret).
-**Last updated:** 2026-05-26 (bootstrap)
+**In-flight:** Base flake config deployed + verified. Next M0 task: sops-nix + decrypt a test secret.
+**Last updated:** 2026-05-26 (M0 base config live)
 
 ## Gates
-- (none claimed yet)
+- (none claimed yet — M0 gate pends sops wiring)
 
 ## Blocked
 - (none)
 
 ## Notes
-- cc-ci baseline: Incus VM, 2 vCPU, 3.5 GiB RAM, **3.8 GiB free disk** — tight for multi-recipe
-docker deploys; watch disk pressure, may need operator to grow the VM disk before M6.5 breadth.
-- Server config is currently channel-based `/etc/nixos/configuration.nix` (no flake). M0 converts
-to a flake checked out from this repo on the host.
+- **Disk RESOLVED:** operator grew the VM 8.9→**28 GiB** (22 GiB free) on 2026-05-26. Inodes
+1.78M total / 1.21M free (was ~6k free — old 8.9 GiB fs had only 586k inodes, which the flake's
+nixpkgs fetch exhausted). Both byte + inode pressure gone.
+- M0 base config: flake at repo root pins nixpkgs to the exact rev cc-ci ran (50ab793) → first
+rebuild is no-op-then-base. Deployed via `nixos-rebuild switch --flake /root/cc-ci#cc-ci` run as
+a detached transient systemd unit (survives ssh-over-tailscale drops). Gen 3 current, healthy.
+- Open warning: incus module enables `systemd.network` while we set `networking.useDHCP=true`
+(scripted dhcpcd) — Nix warns both may manage interfaces. Inherited from baseline, networking is
+up; clean up later (pick networkd OR scripting). Tracked, non-blocking.
27
flake.lock
generated
Normal file
@@ -0,0 +1,27 @@
+{
+  "nodes": {
+    "nixpkgs": {
+      "locked": {
+        "lastModified": 1751274312,
+        "narHash": "sha256-/bVBlRpECLVzjV19t5KMdMFWSwKLtb5RyXdjz3LJT+g=",
+        "owner": "NixOS",
+        "repo": "nixpkgs",
+        "rev": "50ab793786d9de88ee30ec4e4c24fb4236fc2674",
+        "type": "github"
+      },
+      "original": {
+        "owner": "NixOS",
+        "repo": "nixpkgs",
+        "rev": "50ab793786d9de88ee30ec4e4c24fb4236fc2674",
+        "type": "github"
+      }
+    },
+    "root": {
+      "inputs": {
+        "nixpkgs": "nixpkgs"
+      }
+    }
+  },
+  "root": "root",
+  "version": 7
+}
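The "bump deliberately, never drift" rule for the nixpkgs pin could be checked mechanically by reading the locked revision out of `flake.lock`. The sketch below assumes `jq` is available (it is in the base package list and the devshell); `locked_rev` and `pin_matches` are hypothetical helper names, not part of the repo.

```shell
#!/bin/sh
# Sketch of a pin-drift check against flake.lock. Helper names are hypothetical.

# locked_rev FILE
# Prints the nixpkgs revision recorded in the given flake.lock.
locked_rev() {
  jq -r '.nodes.nixpkgs.locked.rev' "$1"
}

# pin_matches FILE EXPECTED_REV
# Exit 0 only when the lock file pins exactly EXPECTED_REV.
pin_matches() {
  [ "$(locked_rev "$1")" = "$2" ]
}
```

For this commit, `pin_matches flake.lock 50ab793786d9de88ee30ec4e4c24fb4236fc2674` would be expected to succeed; a later unintended `nix flake update` would make it fail.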
28
flake.nix
Normal file
@@ -0,0 +1,28 @@
+{
+  description = "cc-ci — Co-op Cloud recipe CI server (NixOS)";
+
+  inputs = {
+    # Pinned to the exact revision cc-ci already runs, so the first rebuild from
+    # this repo is a true no-op-then-base (M0). Bump deliberately, not drift.
+    nixpkgs.url = "github:NixOS/nixpkgs/50ab793786d9de88ee30ec4e4c24fb4236fc2674";
+  };
+
+  outputs = { self, nixpkgs }:
+    let
+      system = "x86_64-linux";
+      pkgs = nixpkgs.legacyPackages.${system};
+    in
+    {
+      nixosConfigurations.cc-ci = nixpkgs.lib.nixosSystem {
+        inherit system;
+        modules = [ ./hosts/cc-ci/configuration.nix ];
+      };
+
+      # Devshell for working on the harness/bridge locally.
+      devShells.${system}.default = pkgs.mkShell {
+        packages = with pkgs; [ git jq curl nixpkgs-fmt ];
+      };
+
+      formatter.${system} = pkgs.nixpkgs-fmt;
+    };
+}
42
hosts/cc-ci/configuration.nix
Normal file
@@ -0,0 +1,42 @@
+# cc-ci machine config. M0 = faithful reproduction of the baseline (docs/baseline.md)
+# so the first flake rebuild is a no-op-then-base. Services (swarm/Traefik/Drone/
+# bridge/dashboard) are layered in via ./modules/* in later milestones.
+{ pkgs, lib, ... }:
+{
+  imports = [
+    ./hardware.nix
+  ];
+
+  # --- Tailscale (ACCESS-CRITICAL: do not break, this is the only route in) ---
+  # Baseline read the hostname from /etc/ts-hostname at eval time; that is impure
+  # under flakes, so we pin the known hostname. The reusable auth-key file persists.
+  services.tailscale = {
+    enable = true;
+    authKeyFile = "/etc/ts-auth-key";
+    extraUpFlags = [ "--hostname=cc-nix-test" ];
+  };
+
+  # --- SSH (root login over tailscale) ---
+  services.openssh = {
+    enable = true;
+    settings.PermitRootLogin = "yes";
+  };
+
+  # --- Firewall: trust tailscale, allow SSH ---
+  networking.firewall = {
+    enable = true;
+    trustedInterfaces = [ "tailscale0" ];
+    allowedTCPPorts = [ 22 ];
+  };
+
+  environment.systemPackages = with pkgs; [
+    curl
+    git
+    jq
+    openssh
+  ];
+
+  nix.settings.experimental-features = [ "nix-command" "flakes" ];
+
+  system.stateVersion = "24.11";
+}
21
hosts/cc-ci/hardware.nix
Normal file
@@ -0,0 +1,21 @@
+# Hardware / platform for cc-ci: an Incus VM (x86_64) on the autonomic infra.
+# Mirrors the pre-flake baseline (docs/baseline.md).
+{ modulesPath, ... }:
+{
+  imports = [
+    "${modulesPath}/virtualisation/incus-virtual-machine.nix"
+  ];
+
+  # incus-agent for `incus exec`
+  virtualisation.incus.agent.enable = true;
+
+  # cloud-init seeded the VM (network + /etc/ts-* files); keep it enabled.
+  services.cloud-init = {
+    enable = true;
+    network.enable = true;
+  };
+
+  # DHCP from the incus bridge; bridge provides no resolver, so set our own.
+  networking.useDHCP = true;
+  networking.nameservers = [ "1.1.1.1" "8.8.8.8" ];
+}