feat: project-orchestrator — engine@v0.1.0 submodule, PO config, fleet.toml registry, mgmt scripts, docs, Nix

The PO is itself a project using the agent-orchestrator harness (engine/ submodule pinned at
v0.1.0). Adds: agents.toml (one persistent fleet-management agent) + prompts/; fleet.toml (the
sole project<->harness<->ref registry) + docs/fleet-registry.md; scripts/ (fleet.py +
create/start/stop/update-project.sh); docs/manage-projects.md + docs/bootstrap.md; flake.nix/.lock
devShell (python311+tmux+git); README.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 19:15:47 +00:00
commit 346ed31acb
19 changed files with 849 additions and 0 deletions

docs/bootstrap.md

@@ -0,0 +1,66 @@
# Bootstrap — how the *first* project-orchestrator is hand-scaffolded
The PO creates projects — but nothing creates the *first* PO. It is hand-scaffolded once. Thereafter
the PO can create projects, and can even re-scaffold itself. This doc records that one-time setup so a
fresh PO can be stood up from scratch (it is exactly how *this* repo was made).
## What a PO is
A PO is **just a project** that uses the `agent-orchestrator` harness, plus two things no ordinary
project has: a **fleet registry** (`fleet.toml`) and **fleet-management agents/runbooks**
(`prompts/`, `scripts/`, `docs/`). Its architecture is identical to any project; only its *job*
differs. So bootstrapping a PO = scaffolding a normal agent-orchestrator project, then adding the
fleet pieces.
## Prerequisites
- `git` (with submodule support), `python3 >= 3.11` (stdlib `tomllib`), `tmux` — all provided by
`nix develop` (see the README's Nix section).
- The agent CLI the PO agent uses (`claude`) on `PATH` — external, not from Nix.
- A git host for the PO repo (this PO lives at `recipe-maintainers/project-orchestrator`).
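The prerequisites above can be checked before starting. The following is a minimal sketch, not part of this repo: the function name `missing_prerequisites` and its injectable `which`/`version` parameters are assumptions made for illustration, and it only checks that the tools are on `PATH`, not that they work.

```python
import shutil
import sys

# Tools named in the prerequisites list; "claude" is the external agent CLI.
REQUIRED_TOOLS = ("git", "tmux", "claude")
MIN_PYTHON = (3, 11)  # tomllib entered the stdlib in 3.11


def missing_prerequisites(which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means ready."""
    version = tuple(sys.version_info[:2]) if version is None else version
    problems = [
        f"{tool} not found on PATH"
        for tool in REQUIRED_TOOLS
        if which(tool) is None
    ]
    if version < MIN_PYTHON:
        problems.append("python3 >= 3.11 required (stdlib tomllib)")
    return problems


# Report against the current environment.
print(missing_prerequisites() or "all prerequisites found")
```

Inside `nix develop` only `claude` should ever be reported missing, since that CLI is external to the devShell.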
## Steps (one time)
1. **Create the PO repo.** Initialise it locally (or create an empty repo on your git host and clone that instead):
```bash
git init -b main project-orchestrator && cd project-orchestrator
```
2. **Vendor the harness as `engine/`**, pinned at a release tag (the submodule pin IS the engine
version):
```bash
git submodule add https://git.autonomic.zone/recipe-maintainers/agent-orchestrator.git engine
git -C engine fetch --tags && git -C engine checkout v0.1.0
git add .gitmodules engine
```
3. **Write the harness config** — `agents.toml` declaring the PO's own agent(s). A single
`kind = "persistent"` `project-orchestrator` agent (backend `claude`) is enough to start; its
startup prompt is `prompts/orchestrator.md` and it gets an hourly `wake` →
`prompts/supervise.md`. (You can scaffold a starter with `python3 engine/agents.py init .` and
then edit it, or copy this repo's `agents.toml`.)
4. **Add the fleet pieces** (what makes this project a PO):
- `fleet.toml` — the registry (schema: `docs/fleet-registry.md`), starting empty or with a sample.
- `prompts/orchestrator.md` + `prompts/supervise.md` — the PO agent's role and periodic sweep.
- `scripts/` — `fleet.py` (read/validate the registry) and `create/start/stop/update-project.sh`.
- `docs/` — these runbooks.
5. **Add Nix + `.gitignore`** (`.ao-state/` is runtime state, never committed). Commit and push
`main` with the submodule pinned.
6. **Bring the PO up** — exactly like any project:
```bash
nix develop -c python3 engine/agents.py status # sanity: config parses, agent listed
nix develop -c python3 engine/agents.py up # start the PO agent + its watchdog
```
That's it — the PO is live. From here it manages the fleet via the runbooks in
`docs/manage-projects.md`, and can scaffold further projects (including a replacement PO) with
`scripts/create-project.sh`.
## Re-scaffolding the PO later
Because a PO is just a project, an existing PO can create a new PO by running the create-a-project
flow with this repo's URL as the engine consumer and then adding the fleet pieces — or, more simply,
by cloning this repo recursively. The bootstrap above is only needed when there is no PO at all.

docs/fleet-registry.md

@@ -0,0 +1,55 @@
# The fleet registry — `fleet.toml`
`fleet.toml` is the project-orchestrator's **authoritative, PO-only** record of every project in the
fleet. It is the *single* place where project ↔ harness ↔ ref ↔ location is recorded. Projects
themselves carry none of this: knowledge is one-directional (**PO → projects, never the reverse**), so
a project repo never contains a `fleet.toml`, a project id, or any mention of the PO.
Validate / inspect it with the helper:
```bash
python3 scripts/fleet.py validate # parse + schema-check; exit 1 on any error
python3 scripts/fleet.py list # one line per project
python3 scripts/fleet.py status # list + a summary count
python3 scripts/fleet.py get <name> # dump one project's full entry
```
## Schema
### `[fleet]` — registry metadata (optional)
| key | type | meaning |
|---|---|---|
| `version` | integer | registry schema version; bump on a breaking change to this format |
### `[[project]]` — one block per project
| key | required | type | meaning |
|---|---|---|---|
| `name` | **yes** | string | unique fleet id, kebab-case. Used by the helper scripts to address the project. |
| `location` | **yes** | string | where the project lives: a git URL (remote) **or** a local filesystem path (a working clone the PO can drive directly). |
| `harness` | **yes** | string | which harness runs the project (e.g. `agent-orchestrator`). The PO reads that harness's docs to know how to drive it. |
| `ref` | **yes** | string | the harness ref the project pins (its `engine/` submodule pin — a tag/branch/SHA). The in-repo submodule pin is the source of truth; this mirrors it for at-a-glance fleet inventory. |
| `enabled` | **yes** | bool | whether this project is part of the active fleet (a bare fleet "start everything" would skip `enabled = false`). |
| `secrets` | **yes** | string | where the project's creds live (a path like `.env`, or a secrets-manager ref). Never the secret values themselves; secrets are never in git. |
| `config` | no | string | the project's harness config file, relative to its root (default `agents.toml`). |
| `notes` | no | string | free-form operator notes. |
Unknown fields are rejected by `fleet.py validate` (typo guard). Duplicate `name`s are rejected.
## Why the registry, and only the registry, holds this
If a project repo recorded its own harness/ref/fleet-membership, that knowledge would be
bidirectional and a project would "know" it belongs to a fleet — breaking isolation and making a
project un-runnable standalone. Instead:
- **what harness + what ref** a project uses is captured *in the project* only as the `engine/`
submodule pin (a plain git fact, no fleet semantics), and *in the PO* as this registry's
`harness`/`ref`. The project never names a PO.
- **which projects exist, where, enabled** lives *only* here. Remove a project from the fleet by
deleting its `[[project]]` block; the project repo is unaffected and still runnable by hand.
## Adding an entry
Either let `scripts/create-project.sh --register` append one when it scaffolds a project, or add a
block by hand and run `python3 scripts/fleet.py validate`. See `docs/manage-projects.md`.

docs/manage-projects.md

@@ -0,0 +1,95 @@
# Managing projects — the PO runbooks
These are the flows the project-orchestrator (the AI in `prompts/orchestrator.md`, or a human
operator) follows. The PO is **AI-driven, not contract-bound**: for any harness it doesn't already
know, it *reads that harness's docs* and works out how to drive it. The helper scripts here cover the
common `agent-orchestrator` case; treat them as conveniences, not a rigid interface.
The one invariant across every flow: **knowledge is one-directional (PO → project)**. Nothing about
the PO or the fleet is ever written into a project repo.
---
## Create a project
**Goal:** a new, self-contained project repo — the chosen harness vendored as `engine/` at a pinned
ref, a harness config scaffolded by that harness, and **no** PO/fleet metadata — plus (separately) a
`fleet.toml` entry on the PO side.
```bash
scripts/create-project.sh <name> \
[--dir <parent>] # where to create it (default: ./projects)
[--engine-url <url>] # harness repo (default: the agent-orchestrator repo)
[--ref <ref>] # harness ref to pin (default: v0.1.0)
[--prefix <session-prefix>] # tmux namespace in the config (default: <name>-)
[--register] # also append a [[project]] entry to fleet.toml
```
What it does, step by step (so the PO can also do it by hand for a non-default harness):
1. `git init -b main <parent>/<name>`.
2. Vendor the harness: `git submodule add <engine-url> engine`, then `git -C engine checkout <ref>`.
The submodule pin **is** the recorded harness version inside the project (no other metadata).
3. Scaffold the harness config with the harness's own initializer
(`python3 engine/agents.py init .`), and stamp a unique `session_prefix`. This writes
`agents.toml` + `prompts/`, i.e. **harness config only**.
4. Add a `.gitignore` for runtime state (`.ao-state/`), and an initial commit.
5. **Separately** (only with `--register`, or by hand): append a `[[project]]` block to the PO's
`fleet.toml`. This is PO-side; it never lands in the project.
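For doing steps 1 to 4 by hand, it can help to see them as a plain command plan. This is a hedged sketch rather than what `scripts/create-project.sh` actually runs: `create_project_plan`, the commit message, and the example URL are hypothetical, and writing `.gitignore` and stamping `session_prefix` are file edits that the plan leaves out.

```python
from pathlib import Path

DEFAULT_REF = "v0.1.0"  # default from the option list above


def create_project_plan(name, engine_url, parent="projects", ref=DEFAULT_REF):
    """Return (cwd, argv) pairs for steps 1-4, in order, without running anything."""
    project = Path(parent) / name
    return [
        (None, ["git", "init", "-b", "main", str(project)]),             # step 1
        (project, ["git", "submodule", "add", engine_url, "engine"]),    # step 2
        (project, ["git", "-C", "engine", "checkout", ref]),             # step 2: pin
        (project, ["python3", "engine/agents.py", "init", "."]),         # step 3
        (project, ["git", "add", "-A"]),                                 # step 4
        (project, ["git", "commit", "-m", "initial scaffold"]),          # step 4
    ]


# Dry run: print the plan for a hypothetical project and harness URL.
for cwd, argv in create_project_plan("demo", "https://example.invalid/harness.git"):
    print(f"[{cwd or '.'}] {' '.join(argv)}")
```

Each pair could then be executed with `subprocess.run(argv, cwd=cwd, check=True)`. Step 5 is deliberately absent, because the registry entry belongs to the PO and never to the project.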
Verify the new project standalone — it must work with no knowledge of any PO:
```bash
( cd <parent>/<name> && python3 engine/agents.py status )
```
And confirm isolation — the project must contain **no** PO/fleet metadata:
```bash
cd <parent>/<name>
grep -ril -e 'fleet' -e 'project-orchestrator' -e 'project orchestrator' . --exclude-dir=engine --exclude-dir=.git || echo "clean: no PO/fleet metadata"
```
(Exclude `engine/` — that's the upstream harness, whose own docs legitimately *describe* being
driven by a PO; that is the harness documenting a consumer, not this project knowing about a fleet.)
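The same isolation check can be expressed in Python, for example to call it from a test. This is a sketch under the assumption that it should mirror the `grep` invocation above (case-insensitive, file contents only, `engine/` and `.git` pruned at any depth); the name `find_fleet_mentions` is hypothetical.

```python
import os

# Same three patterns as the grep command above, matched case-insensitively.
MARKERS = (b"fleet", b"project-orchestrator", b"project orchestrator")
SKIP_DIRS = {"engine", ".git"}


def find_fleet_mentions(root):
    """Return sorted relative paths of files under root that mention a marker."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "rb") as handle:
                    content = handle.read().lower()
            except OSError:
                continue  # unreadable file: skip it rather than fail the sweep
            if any(marker in content for marker in MARKERS):
                hits.append(os.path.relpath(path, root))
    return sorted(hits)
```

An empty list corresponds to the `clean: no PO/fleet metadata` outcome; any returned path is a file to inspect.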
---
## Start / stop a project
Drive the project's harness. For an `agent-orchestrator` project:
```bash
scripts/start-project.sh <name> [agent...] # → engine/agents.py up (in the project dir)
scripts/stop-project.sh <name> [agent...] # → engine/agents.py down
```
Both resolve `<name>` → `location` via `fleet.toml`. For a remote-only `location`, clone it locally
first. For a non-`agent-orchestrator` harness, the wrapper bows out — read that harness's docs and
drive it directly.
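The resolution logic described above might look roughly like this. It is a sketch and not the wrapper's real code: `is_remote` is a heuristic of my own, `start_command` is a hypothetical name, and passing agent names straight through to `engine/agents.py up` is an assumption based on the wrapper's `[agent...]` argument.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_remote(location):
    """Heuristic: scheme URLs and scp-style user@host:path count as remote."""
    if urlparse(location).scheme in {"http", "https", "ssh", "git"}:
        return True
    head, sep, _ = location.partition(":")
    return bool(sep) and "@" in head


def start_command(projects, name, agents=()):
    """Resolve name -> location and return (cwd, argv) for starting the project."""
    entry = next((p for p in projects if p["name"] == name), None)
    if entry is None:
        raise KeyError(f"no project named {name!r} in the registry")
    if entry["harness"] != "agent-orchestrator":
        # The wrapper bows out: read that harness's docs and drive it directly.
        raise NotImplementedError(f"unsupported harness {entry['harness']!r}")
    if is_remote(entry["location"]):
        raise ValueError("remote-only location: clone it locally first")
    return Path(entry["location"]), ["python3", "engine/agents.py", "up", *agents]
```

Here `projects` is the list of `[[project]]` tables from the parsed registry. A stop variant would differ only in using `down` instead of `up`.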
---
## Update a project's harness
Bumping the engine is **per-project and opt-in** — it touches only that project's repo; every other
project keeps its own pin, so one bump can't break another.
```bash
scripts/update-project.sh <name> <new-ref> # checkout new ref in the project's engine/ + commit
```
Then update the project's `ref` in `fleet.toml` and run `python3 scripts/fleet.py validate`.
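Because the registry `ref` only mirrors the submodule pin, the two can drift if the `fleet.toml` edit is forgotten. A minimal sketch of a drift check follows; it is not part of the repo's scripts, the names `ref_matches_pin` and `_git_rev_parse` are hypothetical, and it compares against the `engine/` checkout's `HEAD`, which matches the committed pin once the bump has been committed.

```python
import subprocess
from pathlib import Path


def _git_rev_parse(engine_dir, rev):
    """Resolve rev (tag, branch or SHA) to a commit SHA inside the engine/ checkout."""
    result = subprocess.run(
        ["git", "-C", str(engine_dir), "rev-parse", "--verify", f"{rev}^{{commit}}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def ref_matches_pin(project_dir, registry_ref, resolve=_git_rev_parse):
    """True if the registry's ref and the project's engine/ checkout agree."""
    engine_dir = Path(project_dir) / "engine"
    return resolve(engine_dir, registry_ref) == resolve(engine_dir, "HEAD")
```

The `^{commit}` suffix peels annotated tags, so a tag name and the commit it points at compare equal.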
---
## List / status the fleet
```bash
python3 scripts/fleet.py list # one line per project: name, enabled, harness@ref, location
python3 scripts/fleet.py status # + a total/enabled/disabled summary
```
This reads only `fleet.toml`. To also check live state, drive each enabled project's harness
(`engine/agents.py status --config <project>/agents.toml`) — `prompts/supervise.md` does this on the
PO's periodic wake.
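As an illustration of what `list` and `status` report, the sketch below formats already-parsed `[[project]]` entries. The column layout and summary wording are assumptions; the real `scripts/fleet.py` output may differ.

```python
def list_lines(projects):
    """One line per project: name, enabled, harness@ref, location."""
    return [
        "  ".join([
            p["name"],
            "enabled" if p["enabled"] else "disabled",
            f"{p['harness']}@{p['ref']}",
            p["location"],
        ])
        for p in projects
    ]


def summary(projects):
    """Total/enabled/disabled counts, as appended by the status view."""
    enabled = sum(1 for p in projects if p["enabled"])
    return f"{len(projects)} total, {enabled} enabled, {len(projects) - enabled} disabled"
```

Both functions read only registry data, which is why live state still has to come from each project's harness.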