feat: project-orchestrator — engine@v0.1.0 submodule, PO config, fleet.toml registry, mgmt scripts, docs, Nix

The PO is itself a project using the agent-orchestrator harness (engine/ submodule pinned at
v0.1.0). Adds: agents.toml (one persistent fleet-management agent) + prompts/; fleet.toml (the
sole project<->harness<->ref registry) + docs/fleet-registry.md; scripts/ (fleet.py +
create/start/stop/update-project.sh); docs/manage-projects.md + docs/bootstrap.md; flake.nix/.lock
devShell (python311+tmux+git); README.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 19:15:47 +00:00
commit 346ed31acb
19 changed files with 849 additions and 0 deletions

9
.gitignore vendored Normal file
@@ -0,0 +1,9 @@
# runtime state + logs (never committed)
.ao-state/
*.log
__pycache__/
*.pyc
result
# projects scaffolded locally by scripts/create-project.sh are their own repos — not tracked here
/projects/

3
.gitmodules vendored Normal file
@@ -0,0 +1,3 @@
[submodule "engine"]
path = engine
url = https://git.autonomic.zone/recipe-maintainers/agent-orchestrator.git

96
README.md Normal file
@@ -0,0 +1,96 @@
# project-orchestrator (PO)
The **project-orchestrator** manages a *fleet* of independent projects — creating, starting,
stopping, updating, and monitoring them. It is itself **just a project** that uses the
[`agent-orchestrator`](https://git.autonomic.zone/recipe-maintainers/agent-orchestrator) harness
(vendored as the `engine/` submodule). What makes it the PO is its *job*, not its architecture: there
is no special "control-plane" code path.
```
project-orchestrator/
  agents.toml       this project's harness config (one persistent fleet-management agent)
  engine/           the agent-orchestrator harness, pinned as a submodule @ v0.1.0
  prompts/          the PO agent's role (orchestrator.md) + periodic sweep (supervise.md)
  fleet.toml        THE fleet registry — the only record of project ↔ harness ↔ ref ↔ location
  scripts/          fleet.py + create/start/stop/update-project.sh — the management helpers
  docs/             runbooks: manage-projects.md, fleet-registry.md, bootstrap.md
  flake.nix/.lock   a Nix devShell (python311 + tmux + git)
  .ao-state/        runtime state + logs (gitignored)
```
## The one rule: knowledge is one-directional (PO → projects, never the reverse)
A project repo contains **nothing** about the PO or the fleet — no fleet metadata, no `fleet.toml`,
no mention of a PO. A project can be run and inspected entirely by hand and would have no idea a PO
exists. The *only* record of which projects exist, where, which harness, what ref, and whether
they're enabled is this repo's **`fleet.toml`**. The PO knows the projects; the projects never know
the PO.
What a project *does* carry is its `engine/` submodule pin — a plain git fact (this harness, this
ref) with no fleet semantics. The PO's `fleet.toml` mirrors that for fleet inventory.
## Quick start
```bash
nix develop # python311 + tmux + git on PATH (see "Nix")
python3 engine/agents.py status # the PO's own agents (reads agents.toml)
python3 scripts/fleet.py status # the fleet registry: projects, enabled, harness@ref
python3 engine/agents.py up # start the PO agent + its watchdog (needs claude on PATH)
python3 engine/agents.py down # stop everything
```
The PO agent itself is one persistent `claude` session (`prompts/orchestrator.md`) with an hourly
fleet sweep (`prompts/supervise.md`), supervised by the harness watchdog. See `engine/README.md`
for the full harness reference.
## Managing the fleet
Full runbooks in [`docs/manage-projects.md`](docs/manage-projects.md). In short:
```bash
# create a new project: scaffolds engine/ submodule + harness config, NO PO/fleet metadata inside it
scripts/create-project.sh my-new-project --ref v0.1.0 [--register]
# drive an existing fleet project's harness (resolves location via fleet.toml)
scripts/start-project.sh <name>
scripts/stop-project.sh <name>
scripts/update-project.sh <name> <new-ref> # bump that project's engine pin (per-project, opt-in)
# inspect the fleet
python3 scripts/fleet.py list | status | validate | get <name>
```
The fleet registry schema is documented in [`docs/fleet-registry.md`](docs/fleet-registry.md).
## Nix
A `flake.nix` provides a reproducible devShell with the runtime deps (`python311` for stdlib
`tomllib`, plus `tmux` and `git` incl. submodule support):
```bash
nix develop # enter the shell
nix develop -c python3 -c 'import tomllib' # sanity: the runtime python has tomllib
nix develop -c python3 engine/agents.py status # run a command in the shell
nix flake check # evaluate + build the devShell
```
The agent CLI the PO uses (`claude`) is an **external, non-Nix tool** — install it per its own docs
and ensure it is on `PATH` before launching the live agent.
## First-time setup / re-scaffolding
Nothing creates the *first* PO — it is hand-scaffolded once.
[`docs/bootstrap.md`](docs/bootstrap.md) records exactly how (it is how this repo was made), and how
an existing PO can later re-scaffold itself.
## Cloning
Because the harness is a submodule, clone recursively:
```bash
git clone --recurse-submodules https://git.autonomic.zone/recipe-maintainers/project-orchestrator.git
# or, after a plain clone:
git submodule update --init --recursive
```

58
agents.toml Normal file
@@ -0,0 +1,58 @@
# project-orchestrator — agent-orchestrator harness config (this project's ONLY config).
#
# The PO is just a project that uses the agent-orchestrator harness (vendored as the `engine/`
# submodule). What makes it the project-orchestrator is its *job* — fleet management — not its
# architecture. Run it by hand exactly like any other project:
#
# nix develop -c python3 engine/agents.py status
# nix develop -c python3 engine/agents.py up # start the PO agent + watchdog
# nix develop -c python3 engine/agents.py down
#
# Runtime state (resume ids, limit windows) lives under <log_dir>/state/, NOT here. There is NO
# fleet/PO metadata in any project's config — the fleet lives only in this repo's fleet.toml.
# ─────────────────────────── global watchdog cadence ───────────────────────────
[watchdog]
signal_interval = 30 # s between handoff / stall / limit checks (light)
heavy_interval = 300 # s between heal / phase-advance checks
limit_probe_fallback = 300 # flat probe cadence when a reset time can't be parsed
limit_reset_slack = 45 # s past a parsed reset before probing
stall_grace = 180 # s of slack past a WAITING-UNTIL marker before a stall reboot
# ─────────────────────────── defaults inherited by every agent ───────────────────────────
[defaults]
session_prefix = "po-" # REQUIRED — tmux namespace for the project-orchestrator project
log_dir = ".ao-state" # REQUIRED — logs + state/, resolved relative to this file
backend = "claude"
model = "claude-opus-4-8"
watch = "heal" # none | heal | heal+stall
# ─────────────────────────── backends (declared as data) ───────────────────────────
[backend.claude]
bin = "claude"
flags = "--dangerously-skip-permissions"
remote_control = true
supports_resume = true
prompt_delivery = "arg"
process_name = "claude"
submit_key = "Enter"
stall_idle = 300
active_re = "esc to interrupt|Running tool|⠇|⠙|· \\d+"
limit_re = "spend limit|usage limit|limit reached|reached your .*limit|out of (credits|tokens)"
fatal_re = "redacted_thinking|blocks cannot be modified|cannot be modified"
# ─────────────────────────── the PO agent ───────────────────────────
# A single persistent fleet-management agent is enough to start (the plan: "add a loop only if
# useful"). It is NOT a build loop — it manages a fleet of *other* projects: create / start / stop
# / update / list / status, reading each project's harness docs to work out how to drive it. Its
# startup prompt + periodic wake nudge live in prompts/.
[[agent]]
name = "project-orchestrator" # tmux session: po-project-orchestrator
kind = "persistent"
backend = "claude"
model = "claude-opus-4-8"
resume = true # resume its session across restarts (--resume <state id>)
watch = "heal" # keep it alive/healed; never reboot just for being idle
prompt_file = "prompts/orchestrator.md" # startup prompt: read your role + fleet, then report
wake = { interval = 3600, prompt_file = "prompts/supervise.md" } # hourly fleet sweep

66
docs/bootstrap.md Normal file
@@ -0,0 +1,66 @@
# Bootstrap — how the *first* project-orchestrator is hand-scaffolded
The PO creates projects — but nothing creates the *first* PO. It is hand-scaffolded once. Thereafter
the PO can create projects, and can even re-scaffold itself. This doc records that one-time setup so a
fresh PO can be stood up from scratch (it is exactly how *this* repo was made).
## What a PO is
A PO is **just a project** that uses the `agent-orchestrator` harness, plus two things no ordinary
project has: a **fleet registry** (`fleet.toml`) and **fleet-management agents/runbooks**
(`prompts/`, `scripts/`, `docs/`). Its architecture is identical to any project; only its *job*
differs. So bootstrapping a PO = scaffolding a normal agent-orchestrator project, then adding the
fleet pieces.
## Prerequisites
- `git` (with submodule support), `python3 >= 3.11` (stdlib `tomllib`), `tmux` — all provided by
`nix develop` (see the README's Nix section).
- The agent CLI the PO agent uses (`claude`) on `PATH` — external, not from Nix.
- A git host for the PO repo (this PO lives at `recipe-maintainers/project-orchestrator`).
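The prerequisites above can be checked before bootstrapping. The following is a minimal sketch, not part of this commit: `missing_prerequisites` and `REQUIRED_TOOLS` are hypothetical names, and the tool list simply mirrors the bullets above.

```python
import shutil
import sys

# Tools named in the prerequisites list; `claude` is external to the Nix shell.
REQUIRED_TOOLS = ("git", "tmux", "claude")


def missing_prerequisites(which=shutil.which, version=sys.version_info):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # tomllib entered the standard library in Python 3.11.
    if tuple(version[:2]) < (3, 11):
        problems.append("python >= 3.11 is required for stdlib tomllib")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems
```

Calling `missing_prerequisites()` with no arguments inspects the current interpreter and `PATH`; the `which` and `version` parameters exist only so the check can be exercised without touching the real environment.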
## Steps (one time)
1. **Create the PO repo** and clone it.
```bash
git init -b main project-orchestrator && cd project-orchestrator
```
2. **Vendor the harness as `engine/`**, pinned at a release tag (the submodule pin IS the engine
version):
```bash
git submodule add https://git.autonomic.zone/recipe-maintainers/agent-orchestrator.git engine
git -C engine fetch --tags && git -C engine checkout v0.1.0
git add .gitmodules engine
```
3. **Write the harness config** — `agents.toml` declaring the PO's own agent(s). A single
`kind = "persistent"` `project-orchestrator` agent (backend `claude`) is enough to start; its
startup prompt is `prompts/orchestrator.md` and it gets an hourly `wake` →
`prompts/supervise.md`. (You can scaffold a starter with `python3 engine/agents.py init .` and
then edit it, or copy this repo's `agents.toml`.)
4. **Add the fleet pieces** (what makes this project a PO):
- `fleet.toml` — the registry (schema: `docs/fleet-registry.md`), starting empty or with a sample.
- `prompts/orchestrator.md` + `prompts/supervise.md` — the PO agent's role and periodic sweep.
- `scripts/` — `fleet.py` (read/validate the registry) and `create/start/stop/update-project.sh`.
- `docs/` — these runbooks.
5. **Add Nix + `.gitignore`** (`.ao-state/` is runtime state, never committed). Commit and push
`main` with the submodule pinned.
6. **Bring the PO up** — exactly like any project:
```bash
nix develop -c python3 engine/agents.py status # sanity: config parses, agent listed
nix develop -c python3 engine/agents.py up # start the PO agent + its watchdog
```
That's it — the PO is live. From here it manages the fleet via the runbooks in
`docs/manage-projects.md`, and can scaffold further projects (including a replacement PO) with
`scripts/create-project.sh`.
## Re-scaffolding the PO later
Because a PO is just a project, an existing PO can create a new PO by running the create-a-project
flow with this repo's URL as the engine consumer and then adding the fleet pieces — or, more simply,
by cloning this repo recursively. The bootstrap above is only needed when there is no PO at all.

55
docs/fleet-registry.md Normal file
@@ -0,0 +1,55 @@
# The fleet registry — `fleet.toml`
`fleet.toml` is the project-orchestrator's **authoritative, PO-only** record of every project in the
fleet. It is the *single* place where project ↔ harness ↔ ref ↔ location is recorded. Projects
themselves carry none of this: knowledge is one-directional (**PO → projects, never the reverse**), so
a project repo never contains a `fleet.toml`, a project id, or any mention of the PO.
Validate / inspect it with the helper:
```bash
python3 scripts/fleet.py validate # parse + schema-check; exit 1 on any error
python3 scripts/fleet.py list # one line per project
python3 scripts/fleet.py status # list + a summary count
python3 scripts/fleet.py get <name> # dump one project's full entry
```
## Schema
### `[fleet]` — registry metadata (optional)
| key | type | meaning |
|---|---|---|
| `version` | integer | registry schema version; bump on a breaking change to this format |
### `[[project]]` — one block per project
| key | required | type | meaning |
|---|---|---|---|
| `name` | **yes** | string | unique fleet id, kebab-case. Used by the helper scripts to address the project. |
| `location` | **yes** | string | where the project lives: a git URL (remote) **or** a local filesystem path (a working clone the PO can drive directly). |
| `harness` | **yes** | string | which harness runs the project (e.g. `agent-orchestrator`). The PO reads that harness's docs to know how to drive it. |
| `ref` | **yes** | string | the harness ref the project pins (its `engine/` submodule pin — a tag/branch/SHA). The in-repo submodule pin is the source of truth; this mirrors it for at-a-glance fleet inventory. |
| `enabled` | **yes** | bool | whether this project is part of the active fleet (a bare fleet "start everything" would skip `enabled = false`). |
| `secrets` | **yes** | string | where the project's creds live (a path like `.env`, or a secrets-manager ref). Never the secret values themselves; secrets are never in git. |
| `config` | no | string | the project's harness config file, relative to its root (default `agents.toml`). |
| `notes` | no | string | free-form operator notes. |
Unknown fields are rejected by `fleet.py validate` (typo guard). Duplicate `name`s are rejected.
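The schema tables above also list a type for every key. As a hedged illustration (the `fleet.py` in this commit type-checks only `enabled`), a stricter per-entry check could look like the sketch below; `check_entry` and the two dictionaries are hypothetical names, not the helper's actual API.

```python
# Expected type per key, taken from the schema tables above.
REQUIRED = {"name": str, "location": str, "harness": str, "ref": str,
            "enabled": bool, "secrets": str}
OPTIONAL = {"config": str, "notes": str}


def check_entry(entry):
    """Return a list of schema problems for one [[project]] table (a dict)."""
    schema = {**REQUIRED, **OPTIONAL}
    problems = []
    for key in REQUIRED:
        if key not in entry:
            problems.append(f"missing required field '{key}'")
    for key, value in entry.items():
        if key not in schema:
            # Typo guard: anything outside the schema is rejected.
            problems.append(f"unknown field '{key}'")
        elif not isinstance(value, schema[key]):
            problems.append(f"'{key}' must be {schema[key].__name__}")
    return problems
```

Duplicate-name detection is deliberately left out here because it needs the whole list of entries rather than a single one.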
## Why the registry, and only the registry, holds this
If a project repo recorded its own harness/ref/fleet-membership, that knowledge would be
bidirectional and a project would "know" it belongs to a fleet — breaking isolation and making a
project un-runnable standalone. Instead:
- **what harness + what ref** a project uses is captured *in the project* only as the `engine/`
submodule pin (a plain git fact, no fleet semantics), and *in the PO* as this registry's
`harness`/`ref`. The project never names a PO.
- **which projects exist, where, enabled** lives *only* here. Remove a project from the fleet by
deleting its `[[project]]` block; the project repo is unaffected and still runnable by hand.
## Adding an entry
Either let `scripts/create-project.sh --register` append one when it scaffolds a project, or add a
block by hand and run `python3 scripts/fleet.py validate`. See `docs/manage-projects.md`.

95
docs/manage-projects.md Normal file
@@ -0,0 +1,95 @@
# Managing projects — the PO runbooks
These are the flows the project-orchestrator (the AI in `prompts/orchestrator.md`, or a human
operator) follows. The PO is **AI-driven, not contract-bound**: for any harness it doesn't already
know, it *reads that harness's docs* and works out how to drive it. The helper scripts here cover the
common `agent-orchestrator` case; treat them as conveniences, not a rigid interface.
The one invariant across every flow: **knowledge is one-directional (PO → project)**. Nothing about
the PO or the fleet is ever written into a project repo.
---
## Create a project
**Goal:** a new, self-contained project repo — the chosen harness vendored as `engine/` at a pinned
ref, a harness config scaffolded by that harness, and **no** PO/fleet metadata — plus (separately) a
`fleet.toml` entry on the PO side.
```bash
scripts/create-project.sh <name> \
[--dir <parent>] # where to create it (default: ./projects)
[--engine-url <url>] # harness repo (default: the agent-orchestrator repo)
[--ref <ref>] # harness ref to pin (default: v0.1.0)
[--prefix <session-prefix>] # tmux namespace in the config (default: <name>-)
[--register] # also append a [[project]] entry to fleet.toml
```
What it does, step by step (so the PO can also do it by hand for a non-default harness):
1. `git init -b main <parent>/<name>`.
2. Vendor the harness: `git submodule add <engine-url> engine`, then `git -C engine checkout <ref>`.
The submodule pin **is** the recorded harness version inside the project (no other metadata).
3. Scaffold the harness config with the harness's own initializer
(`python3 engine/agents.py init .`), and stamp a unique `session_prefix`. This writes
`agents.toml` + `prompts/` (**harness config only**).
4. Add a `.gitignore` for runtime state (`.ao-state/`), and an initial commit.
5. **Separately** (only with `--register`, or by hand): append a `[[project]]` block to the PO's
`fleet.toml`. This is PO-side; it never lands in the project.
Verify the new project standalone — it must work with no knowledge of any PO:
```bash
( cd <parent>/<name> && python3 engine/agents.py status )
```
And confirm isolation — the project must contain **no** PO/fleet metadata:
```bash
cd <parent>/<name>
grep -ril -e 'fleet' -e 'project-orchestrator' -e 'project orchestrator' . --exclude-dir=engine --exclude-dir=.git || echo "clean: no PO/fleet metadata"
```
(Exclude `engine/` — that's the upstream harness, whose own docs legitimately *describe* being
driven by a PO; that is the harness documenting a consumer, not this project knowing about a fleet.)
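For callers that prefer Python over `grep`, the same isolation check can be sketched as below. This is an illustrative equivalent under the same exclusions, not a script in this repo; `find_fleet_mentions` is a hypothetical name.

```python
import os
import re

# Same patterns as the grep above, matched case-insensitively.
PATTERN = re.compile(r"fleet|project-orchestrator|project orchestrator", re.IGNORECASE)
SKIP_DIRS = {"engine", ".git"}


def find_fleet_mentions(root):
    """Return sorted relative paths of files under root that mention the PO or a fleet."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    text = handle.read()
            except OSError:
                continue  # unreadable file: skip rather than fail the scan
            if PATTERN.search(text):
                hits.append(os.path.relpath(path, root))
    return sorted(hits)
```

An empty result corresponds to the "clean" message printed by the shell check.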
---
## Start / stop a project
Drive the project's harness. For an `agent-orchestrator` project:
```bash
scripts/start-project.sh <name> [agent...] # → engine/agents.py up (in the project dir)
scripts/stop-project.sh <name> [agent...] # → engine/agents.py down
```
Both resolve `<name>` to its `location` via `fleet.toml`. For a remote-only `location`, clone it locally
first. For a non-`agent-orchestrator` harness, the wrapper bows out — read that harness's docs and
drive it directly.
---
## Update a project's harness
Bumping the engine is **per-project and opt-in** — it touches only that project's repo; every other
project keeps its own pin, so one bump can't break another.
```bash
scripts/update-project.sh <name> <new-ref> # checkout new ref in the project's engine/ + commit
```
Then update the project's `ref` in `fleet.toml` and run `python3 scripts/fleet.py validate`.
---
## List / status the fleet
```bash
python3 scripts/fleet.py list # one line per project: name, enabled, harness@ref, location
python3 scripts/fleet.py status # + a total/enabled/disabled summary
```
This reads only `fleet.toml`. To also check live state, drive each enabled project's harness
(`engine/agents.py status --config <project>/agents.toml`) — `prompts/supervise.md` does this on the
PO's periodic wake.
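A sweep along these lines might first compute the summary counts and then build one status command per enabled, locally reachable project. The sketch below only constructs the commands and does not run them; `summarize` and `status_commands` are hypothetical names, and treating any location containing `://` as remote is an assumption.

```python
def summarize(projects):
    """Count total / enabled / disabled entries from parsed [[project]] tables."""
    enabled = sum(1 for p in projects if p.get("enabled"))
    return {"total": len(projects), "enabled": enabled,
            "disabled": len(projects) - enabled}


def status_commands(projects):
    """Return (name, argv) pairs for enabled projects with a local location."""
    commands = []
    for p in projects:
        location = p.get("location", "")
        if not p.get("enabled") or "://" in location:
            continue  # disabled, or remote-only: nothing to drive locally
        config = f"{location.rstrip('/')}/{p.get('config', 'agents.toml')}"
        commands.append(
            (p["name"], ["python3", "engine/agents.py", "status", "--config", config])
        )
    return commands
```

Each `argv` can then be handed to `subprocess.run` from the PO root, where `engine/agents.py` lives.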

1
engine Submodule

Submodule engine added at 289ef07df4

27
flake.lock generated Normal file
@@ -0,0 +1,27 @@
{
"nodes": {
"nixpkgs": {
"locked": {
"lastModified": 1751274312,
"narHash": "sha256-/bVBlRpECLVzjV19t5KMdMFWSwKLtb5RyXdjz3LJT+g=",
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "50ab793786d9de88ee30ec4e4c24fb4236fc2674",
"type": "github"
},
"original": {
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "50ab793786d9de88ee30ec4e4c24fb4236fc2674",
"type": "github"
}
},
"root": {
"inputs": {
"nixpkgs": "nixpkgs"
}
}
},
"root": "root",
"version": 7
}

41
flake.nix Normal file
@@ -0,0 +1,41 @@
{
description = "project-orchestrator: a project that uses the agent-orchestrator harness to manage a fleet of projects";
inputs = {
nixpkgs.url = "github:NixOS/nixpkgs/50ab793786d9de88ee30ec4e4c24fb4236fc2674";
};
outputs = { self, nixpkgs }:
let
systems = [ "x86_64-linux" "aarch64-linux" "x86_64-darwin" "aarch64-darwin" ];
forAllSystems = f: nixpkgs.lib.genAttrs systems (system: f nixpkgs.legacyPackages.${system});
in
{
# Reproducible devShell for the PO. Everything the PO needs to run itself + drive the fleet:
# - python311 : stdlib tomllib (>=3.11) — engine/agents.py and scripts/fleet.py import it
# - tmux : the harness runs every agent/watchdog in tmux
# - git : the engine submodule + create/update flows need git (incl. submodule support)
# The agent CLI the PO agent uses (claude) is an external, non-Nix tool — install it per its
# own docs and put it on PATH before `engine/agents.py up`.
devShells = forAllSystems (pkgs: {
default = pkgs.mkShell {
packages = [
pkgs.python311 # tomllib (>=3.11)
pkgs.tmux # harness sessions + watchdog
pkgs.git # engine submodule + project create/update flows
pkgs.coreutils
pkgs.bash
];
shellHook = ''
echo "project-orchestrator devShell: $(python3 --version), $(tmux -V), $(git --version)"
echo "try: python3 engine/agents.py status | python3 scripts/fleet.py status"
'';
};
});
# `nix flake check` evaluates this — a cheap smoke that the devShell builds.
checks = forAllSystems (pkgs: {
devshell-builds = self.devShells.${pkgs.system}.default;
});
};
}

26
fleet.toml Normal file
@@ -0,0 +1,26 @@
# fleet.toml — the project-orchestrator's fleet registry.
#
# THE authoritative, PO-only record of every project the fleet manages. This is the ONLY place where
# project ↔ harness ↔ ref ↔ location lives. Projects themselves carry none of this (one-directional
# knowledge: PO → projects, never the reverse). Full schema + field reference: docs/fleet-registry.md
#
# Validate / inspect this file with: python3 scripts/fleet.py list (or `status`).
# ─────────────────────────── registry metadata (optional) ───────────────────────────
[fleet]
version = 1 # registry schema version (integer; bump on breaking changes)
# ─────────────────────────── one [[project]] block per project ───────────────────────────
# Required fields: name, location, harness, ref, enabled, secrets.
# See docs/fleet-registry.md for the meaning and allowed values of each.
# Sample entry (a representative project; not started by the PO unless an operator asks).
[[project]]
name = "example-recipe-ci" # unique fleet id (kebab-case)
location = "https://git.autonomic.zone/recipe-maintainers/example-recipe-ci.git" # git url or local path
harness = "agent-orchestrator" # which harness runs it
ref = "v0.1.0" # pinned harness ref (submodule pin)
enabled = true # is this project part of the active fleet?
secrets = ".env" # where the project's creds live (path/ref; never in git)
config = "agents.toml" # the project's harness config file (relative to the project root)
notes = "Sample fleet entry — replace with a real project, or remove."

47
prompts/orchestrator.md Normal file
@@ -0,0 +1,47 @@
# Role: the project-orchestrator (PO)
You are the **project-orchestrator** — an AI that manages a *fleet* of independent projects. You are
yourself just a project that uses the `agent-orchestrator` harness (vendored at `engine/`); what is
special about you is your **job**, not your architecture.
## The one rule that governs everything: knowledge is one-directional
**PO → projects, never the reverse.** A project repo contains *nothing* about you or the fleet — no
fleet metadata, no `fleet.toml`, no mention of the PO. A project can be run and inspected entirely by
hand and would have no idea a PO exists. The *only* place that records which projects exist, where
they live, which harness they use, at what ref, and whether they're enabled is **this repo's
`fleet.toml`**. Never write PO/fleet metadata into a project repo.
## What you know about (your inputs)
- **`fleet.toml`** — the authoritative registry of every project (schema documented in
`docs/fleet-registry.md`). This is your source of truth for the fleet.
- **`engine/README.md`** — how the `agent-orchestrator` harness works (the harness most projects
use). Other projects may use a *different* harness; there is no rigid contract — you **read** each
project's harness docs and work out how to drive it.
- **`docs/`** — your runbooks:
- `docs/manage-projects.md` — the create / start / stop / update / list / status flows.
- `docs/fleet-registry.md` — the `fleet.toml` schema.
- `docs/bootstrap.md` — how the first PO (you) is hand-scaffolded.
## What you do (your job)
For each flow, follow the runbook in `docs/manage-projects.md`. In short:
- **create** a project — scaffold a new repo, add the chosen harness as a submodule at a ref, write
the project's harness config (and **no** PO/fleet metadata), then add a `fleet.toml` entry.
Helper: `scripts/create-project.sh`.
- **start / stop / update** a project — drive that project's harness by reading its docs (for an
`agent-orchestrator` project: `engine/agents.py up|down`, bump the submodule to update).
Helpers: `scripts/start-project.sh`, `scripts/stop-project.sh`, `scripts/update-project.sh`.
- **list / status** — read `fleet.toml` and report. Helper: `scripts/fleet.py list|status`.
## On startup (now)
1. Read `fleet.toml` and `docs/manage-projects.md` so you know the current fleet and your runbooks.
2. Run `python3 scripts/fleet.py status` to see the fleet's declared state.
3. Report a short summary: how many projects, which are enabled, anything that looks wrong. Then idle
until your next wake or an operator instruction.
Do not invent work. You act when an operator asks you to create/start/stop/update a project, or when
your periodic wake (`prompts/supervise.md`) tells you to sweep the fleet.

17
prompts/supervise.md Normal file
@@ -0,0 +1,17 @@
# Periodic fleet sweep
A scheduled wake. Do a light, read-only sweep of the fleet — do not start work unless something is
clearly wrong and a runbook covers the fix.
1. `python3 scripts/fleet.py status` — list every project in `fleet.toml` with its location, harness,
pinned ref, and enabled flag.
2. For each **enabled** project whose location is reachable from this host, optionally check whether
its harness reports it running (for an `agent-orchestrator` project:
`engine/agents.py status --config <project>/agents.toml`). Read its harness docs first if the
harness is unfamiliar.
3. Report a one-paragraph summary: total / enabled / disabled, anything unreachable or stopped that
should be running. If a fix is needed and `docs/manage-projects.md` covers it, you may apply it;
otherwise just flag it.
Remember the one-directional rule: never write fleet/PO state into a project repo. The fleet's truth
is `fleet.toml` here.

19
scripts/_resolve.sh Executable file
@@ -0,0 +1,19 @@
#!/usr/bin/env bash
# _resolve.sh — shared helper: given a fleet project name, echo "<location>\t<config>\t<harness>".
# Sourced by start/stop/update scripts. Reads the PO's fleet.toml (the only project↔location record).
set -euo pipefail
PO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
resolve_project() {
  local name="$1"
  python3 - "$PO_ROOT/fleet.toml" "$name" <<'PY'
import sys, tomllib
path, name = sys.argv[1], sys.argv[2]
with open(path, "rb") as f:
    raw = tomllib.load(f)
for p in raw.get("project", []):
    if p.get("name") == name:
        print("\t".join([p.get("location", ""), p.get("config", "agents.toml"), p.get("harness", "")]))
        sys.exit(0)
sys.exit(f"_resolve: no project named {name!r} in {path}")
PY
}

117
scripts/create-project.sh Executable file
@@ -0,0 +1,117 @@
#!/usr/bin/env bash
# create-project.sh — scaffold a NEW project that uses a harness.
#
# Produces a self-contained project repo: the chosen harness vendored as the `engine/` submodule at
# a pinned ref, plus a harness config scaffolded by the harness's own `init`. The project contains
# NO project-orchestrator / fleet metadata — knowledge is one-directional (PO → project). Registering
# the project in the PO's fleet.toml is a SEPARATE, PO-side step (use --register, or edit fleet.toml
# by hand); nothing about the fleet ever lands inside the project repo.
#
# Usage:
# scripts/create-project.sh <name> [options]
#
# Options:
# --dir <parent> parent directory to create the project under (default: ./projects)
# --engine-url <url> harness repo to vendor as engine/ (default: the agent-orchestrator repo)
# --ref <ref> harness ref to pin the submodule at (default: v0.1.0)
# --prefix <prefix> session_prefix to write into the project's config (default: <name>-)
# --register also append a [[project]] entry to this PO's fleet.toml (PO-side only)
# --no-commit leave the project tree uncommitted (default: make an initial commit)
#
# Drive it by hand afterwards, exactly like any project:
# cd <parent>/<name> && python3 engine/agents.py status
set -euo pipefail
die() { echo "create-project: $*" >&2; exit 1; }
[ $# -ge 1 ] || die "usage: create-project.sh <name> [--dir P] [--engine-url U] [--ref R] [--prefix X] [--register] [--no-commit]"
NAME="$1"; shift
[[ "$NAME" =~ ^[a-z0-9][a-z0-9-]*$ ]] || die "name must be kebab-case (got: $NAME)"
PO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
PARENT="$PO_ROOT/projects"
ENGINE_URL="https://git.autonomic.zone/recipe-maintainers/agent-orchestrator.git"
REF="v0.1.0"
PREFIX=""
REGISTER=0
COMMIT=1
while [ $# -gt 0 ]; do
case "$1" in
--dir) PARENT="$2"; shift 2;;
--engine-url) ENGINE_URL="$2"; shift 2;;
--ref) REF="$2"; shift 2;;
--prefix) PREFIX="$2"; shift 2;;
--register) REGISTER=1; shift;;
--no-commit) COMMIT=0; shift;;
*) die "unknown option: $1";;
esac
done
PREFIX="${PREFIX:-${NAME}-}"
command -v git >/dev/null || die "git not on PATH"
command -v python3 >/dev/null || die "python3 not on PATH"
DEST="$PARENT/$NAME"
[ -e "$DEST" ] && die "$DEST already exists — refusing to overwrite"
mkdir -p "$PARENT"
echo "create-project: scaffolding '$NAME' at $DEST (engine $ENGINE_URL @ $REF)"
git init -q -b main "$DEST"
cd "$DEST"
# 1) vendor the harness as a pinned submodule under engine/
git -c protocol.version=2 submodule add -q "$ENGINE_URL" engine
( cd engine && git fetch -q --tags origin && git checkout -q "$REF" )
git add .gitmodules engine
# 2) scaffold the harness config + prompts via the harness's OWN init (no PO/fleet metadata)
python3 engine/agents.py init . >/dev/null
# stamp the chosen session_prefix into the scaffolded config (keeps namespaces unique per project)
python3 - "$PREFIX" <<'PY'
import re, sys, pathlib
prefix = sys.argv[1]
p = pathlib.Path("agents.toml")
txt = p.read_text()
txt = re.sub(r'session_prefix\s*=\s*"[^"]*"', f'session_prefix = "{prefix}"', txt, count=1)
p.write_text(txt)
PY
# 3) ignore runtime state; the project knows nothing about any PO
cat > .gitignore <<'EOF'
# runtime state + logs (never committed)
.ao-state/
*.log
__pycache__/
*.pyc
result
EOF
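# stage each scaffolded path that exists; one missing path must not block the others
for path in agents.toml prompts .gitignore; do
  if [ -e "$path" ]; then git add "$path"; fi
done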
if [ "$COMMIT" -eq 1 ]; then
git -c user.name="project-orchestrator" -c user.email="po@localhost" \
commit -q -m "init: scaffold $NAME (engine @ $REF)"
fi
echo "create-project: done — $DEST"
echo " engine pinned at: $(cd engine && git rev-parse HEAD) ($REF)"
echo " config: agents.toml (session_prefix = $PREFIX)"
echo " verify it: ( cd $DEST && python3 engine/agents.py status )"
if [ "$REGISTER" -eq 1 ]; then
echo "create-project: registering '$NAME' in $PO_ROOT/fleet.toml"
cat >> "$PO_ROOT/fleet.toml" <<EOF
[[project]]
name = "$NAME"
location = "$DEST"
harness = "agent-orchestrator"
ref = "$REF"
enabled = false
secrets = ".env"
config = "agents.toml"
notes = "Created by scripts/create-project.sh"
EOF
python3 "$PO_ROOT/scripts/fleet.py" --file "$PO_ROOT/fleet.toml" validate
fi

117
scripts/fleet.py Executable file
@@ -0,0 +1,117 @@
#!/usr/bin/env python3
"""fleet.py — read, validate, and report the PO fleet registry (fleet.toml).

This is the PO's view of the fleet. It NEVER touches a project's repo and carries no project-side
state — the registry here is the single source of truth for project ↔ harness ↔ ref ↔ location.

Usage:
  python3 scripts/fleet.py list              # one line per project (also validates the file)
  python3 scripts/fleet.py status            # same, plus a one-line summary count
  python3 scripts/fleet.py validate          # parse + schema-check only; exit 1 on any error
  python3 scripts/fleet.py get <name>        # dump one project's full entry
  python3 scripts/fleet.py --file PATH ...   # use a non-default registry (default: ./fleet.toml)

Needs only the Python stdlib (tomllib → python >= 3.11).
"""
import sys
import tomllib
from pathlib import Path

REQUIRED = ["name", "location", "harness", "ref", "enabled", "secrets"]
OPTIONAL = ["config", "notes"]


def _registry_path(argv):
    path = Path("fleet.toml")
    out = []
    i = 0
    while i < len(argv):
        if argv[i] == "--file" and i + 1 < len(argv):
            path = Path(argv[i + 1]); i += 2; continue
        out.append(argv[i]); i += 1
    return path, out


def load(path):
    if not path.exists():
        sys.exit(f"fleet: registry not found: {path}")
    with open(path, "rb") as f:
        raw = tomllib.load(f)
    return raw


def validate(raw, path):
    errors = []
    projects = raw.get("project", [])
    if not isinstance(projects, list):
        errors.append("[[project]] must be an array of tables")
        projects = []
    seen = set()
    for i, p in enumerate(projects):
        tag = p.get("name", f"#{i}")
        for k in REQUIRED:
            if k not in p:
                errors.append(f"project {tag}: missing required field '{k}'")
        if not isinstance(p.get("enabled", False), bool):
            errors.append(f"project {tag}: 'enabled' must be a boolean")
        name = p.get("name")
        if name in seen:
            errors.append(f"duplicate project name: {name}")
        if name:
            seen.add(name)
        for k in p:
            if k not in REQUIRED + OPTIONAL:
                errors.append(f"project {tag}: unknown field '{k}'")
    return projects, errors


def cmd_list(projects):
    if not projects:
        print("(no projects registered)")
        return
    w = max(len(p.get("name", "?")) for p in projects)
    for p in projects:
        flag = "enabled " if p.get("enabled") else "disabled"
        print(f" {p.get('name','?'):<{w}} [{flag}] {p.get('harness','?')}@{p.get('ref','?')}"
              f" {p.get('location','?')}")


def main():
    path, argv = _registry_path(sys.argv[1:])
    cmd = argv[0] if argv else "list"
    raw = load(path)
    projects, errors = validate(raw, path)
    if cmd == "validate":
        if errors:
            for e in errors:
                print(f" ERROR: {e}")
            sys.exit(1)
        print(f"fleet: OK — {len(projects)} project(s), schema v{raw.get('fleet', {}).get('version', '?')}")
        return
    # for list/status/get we still surface errors but don't hard-fail the read
    for e in errors:
        print(f" ERROR: {e}", file=sys.stderr)
    if cmd == "list":
        cmd_list(projects)
    elif cmd == "status":
        cmd_list(projects)
        en = sum(1 for p in projects if p.get("enabled"))
        print(f"\n total={len(projects)} enabled={en} disabled={len(projects) - en}"
              f" (registry schema v{raw.get('fleet', {}).get('version', '?')})")
    elif cmd == "get" and len(argv) > 1:
        target = argv[1]
        for p in projects:
            if p.get("name") == target:
                for k in REQUIRED + OPTIONAL:
                    if k in p:
                        print(f" {k:<9} = {p[k]!r}")
                return
        sys.exit(f"fleet: no project named {target!r}")
    else:
        sys.exit(__doc__)
    if errors:
        sys.exit(1)


if __name__ == "__main__":
    main()

19
scripts/start-project.sh Executable file

@ -0,0 +1,19 @@
#!/usr/bin/env bash
# start-project.sh <name> [agent...] — start a fleet project's agents via its harness.
#
# Resolves <name> in the PO's fleet.toml, then drives that project's harness. For an
# agent-orchestrator project that is `engine/agents.py up`. For an unfamiliar harness, READ that
# project's harness docs first — there is no rigid contract. This wrapper handles the common
# agent-orchestrator case; extend it (or act by hand) for other harnesses.
set -euo pipefail
[ $# -ge 1 ] || { echo "usage: start-project.sh <name> [agent...]" >&2; exit 1; }
NAME="$1"; shift || true
source "$(dirname "$0")/_resolve.sh"
IFS=$'\t' read -r LOCATION CONFIG HARNESS < <(resolve_project "$NAME")
[ -d "$LOCATION" ] || { echo "start-project: location not a local dir: $LOCATION (clone it first)" >&2; exit 1; }
case "$HARNESS" in
  agent-orchestrator)
    echo "start-project: $NAME → python3 engine/agents.py up $* (in $LOCATION)"
    ( cd "$LOCATION" && python3 engine/agents.py --config "$CONFIG" up "$@" );;
  *) echo "start-project: harness '$HARNESS' is not agent-orchestrator — read its docs and drive it by hand." >&2; exit 2;;
esac

15
scripts/stop-project.sh Executable file

@ -0,0 +1,15 @@
#!/usr/bin/env bash
# stop-project.sh <name> [agent...] — stop a fleet project's agents via its harness.
# Mirror of start-project.sh; for an agent-orchestrator project that is `engine/agents.py down`.
set -euo pipefail
[ $# -ge 1 ] || { echo "usage: stop-project.sh <name> [agent...]" >&2; exit 1; }
NAME="$1"; shift || true
source "$(dirname "$0")/_resolve.sh"
IFS=$'\t' read -r LOCATION CONFIG HARNESS < <(resolve_project "$NAME")
[ -d "$LOCATION" ] || { echo "stop-project: location not a local dir: $LOCATION" >&2; exit 1; }
case "$HARNESS" in
  agent-orchestrator)
    echo "stop-project: $NAME → python3 engine/agents.py down $* (in $LOCATION)"
    ( cd "$LOCATION" && python3 engine/agents.py --config "$CONFIG" down "$@" );;
  *) echo "stop-project: harness '$HARNESS' is not agent-orchestrator — read its docs and drive it by hand." >&2; exit 2;;
esac

21
scripts/update-project.sh Executable file

@ -0,0 +1,21 @@
#!/usr/bin/env bash
# update-project.sh <name> <new-ref> — bump a fleet project's harness submodule to a new ref.
#
# Updating the engine for a project = checkout a new ref of its `engine/` submodule + commit, IN
# THAT PROJECT'S REPO ONLY. It touches no other project (each pins its own copy). Afterwards, update
# the project's `ref` in this PO's fleet.toml so the registry stays accurate.
set -euo pipefail
[ $# -ge 2 ] || { echo "usage: update-project.sh <name> <new-ref>" >&2; exit 1; }
NAME="$1"; NEWREF="$2"
source "$(dirname "$0")/_resolve.sh"
IFS=$'\t' read -r LOCATION CONFIG HARNESS < <(resolve_project "$NAME")
[ -d "$LOCATION" ] || { echo "update-project: location not a local dir: $LOCATION" >&2; exit 1; }
[ "$HARNESS" = "agent-orchestrator" ] || { echo "update-project: harness '$HARNESS' not agent-orchestrator — update by hand per its docs." >&2; exit 2; }
echo "update-project: $NAME engine → $NEWREF (in $LOCATION)"
( cd "$LOCATION/engine" && git fetch -q --tags origin && git checkout -q "$NEWREF" )
( cd "$LOCATION" && git add engine \
&& git -c user.name="project-orchestrator" -c user.email="po@localhost" \
commit -q -m "chore: bump engine to $NEWREF" )
echo "update-project: project committed. Now update fleet.toml: set ref = \"$NEWREF\" for '$NAME'."
echo " (edit $(cd "$(dirname "$0")/.." && pwd)/fleet.toml, then: python3 scripts/fleet.py validate)"