mfowler 0456837444 feat(po): drop periodic fleet sweep — operator-driven, recover-if-dead only
The PO's job is to manage projects on request, not watch them live. Remove the
hourly wake/sweep entirely:

- agents.toml: watch="heal" (recover-if-dead), no `wake` field
- prompts/supervise.md: deleted
- prompts/orchestrator.md, README.md, docs/bootstrap.md, docs/manage-projects.md:
  drop sweep/wake references; document operator-driven, no periodic sweep

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 15:04:08 +00:00

project-orchestrator (PO)

The project-orchestrator manages a fleet of independent projects — creating, starting, stopping, updating, and monitoring them. It is itself just a project that uses the agent-orchestrator harness (vendored as the engine/ submodule). What makes it the PO is its job, not its architecture: there is no special "control-plane" code path.

project-orchestrator/
  agents.toml          this project's harness config (one persistent fleet-management agent)
  engine/              the agent-orchestrator harness, pinned as a submodule @ v0.1.0
  prompts/             the PO agent's role (orchestrator.md)
  fleet.toml           THE fleet registry — the only record of project ↔ harness ↔ ref ↔ location
  scripts/             fleet.py + create/start/stop/update-project.sh — the management helpers
  docs/                runbooks: manage-projects.md, fleet-registry.md, bootstrap.md
  flake.nix/.lock      a Nix devShell (python311 + tmux + git)
  .ao-state/           runtime state + logs (gitignored)

The one rule: knowledge is one-directional (PO → projects, never the reverse)

A project repo contains nothing about the PO or the fleet — no fleet metadata, no fleet.toml, no mention of a PO. A project can be run and inspected entirely by hand and would have no idea a PO exists. The only record of which projects exist, where, which harness, what ref, and whether they're enabled is this repo's fleet.toml. The PO knows the projects; the projects never know the PO.

What a project does carry is its engine/ submodule pin — a plain git fact (this harness, this ref) with no fleet semantics. The PO's fleet.toml mirrors that for fleet inventory.

Quick start

nix develop                                   # python311 + tmux + git on PATH (see "Nix")

python3 engine/agents.py status               # the PO's own agents (reads agents.toml)
python3 scripts/fleet.py status               # the fleet registry: projects, enabled, harness@ref

python3 engine/agents.py up                   # start the PO agent + its watchdog (needs claude on PATH)
python3 engine/agents.py down                 # stop everything

The PO agent itself is one persistent claude session (prompts/orchestrator.md), kept alive by the harness watchdog (recover-if-dead) but never woken on a timer: it is operator-driven and manages projects on request rather than watching them live. See engine/README.md for the full harness reference.
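The recover-if-dead behaviour can be pictured as a check that restarts the session only when it is gone and otherwise leaves it alone. The sketch below is not the harness's watchdog code; `heal`, `is_alive`, and `restart` are hypothetical names, and the liveness check is stubbed so the example runs without tmux or claude.

```python
from typing import Callable


def heal(is_alive: Callable[[], bool], restart: Callable[[], None]) -> bool:
    """Restart the agent only if it is dead; return True if a restart happened."""
    if is_alive():
        return False  # healthy: do nothing, and never prompt or wake the agent
    restart()
    return True


# Stand-in state and callables; a real check might ask tmux whether the
# agent's session still exists.
state = {"alive": False, "restarts": 0}


def fake_is_alive() -> bool:
    return state["alive"]


def fake_restart() -> None:
    state["alive"] = True
    state["restarts"] += 1


first = heal(fake_is_alive, fake_restart)   # session was dead, so it is restarted
second = heal(fake_is_alive, fake_restart)  # session is alive, so it is left alone
print(first, second, state["restarts"])
```

The point of the design is what is absent: there is no branch that sends work to a live agent, so all activity starts with an operator request.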

Managing the fleet

Full runbooks are in docs/manage-projects.md. In short:

# create a new project: scaffolds engine/ submodule + harness config, NO PO/fleet metadata inside it
scripts/create-project.sh my-new-project --ref v0.1.0 [--register]

# drive an existing fleet project's harness (resolves location via fleet.toml)
scripts/start-project.sh  <name>
scripts/stop-project.sh   <name>
scripts/update-project.sh <name> <new-ref>     # bump that project's engine pin (per-project, opt-in)

# inspect the fleet
python3 scripts/fleet.py list
python3 scripts/fleet.py status
python3 scripts/fleet.py validate
python3 scripts/fleet.py get <name>

The fleet registry schema is documented in docs/fleet-registry.md.

Nix

A flake.nix provides a reproducible devShell with the runtime dependencies: python311 (for the standard-library tomllib), tmux, and git with submodule support.

nix develop                                    # enter the shell
nix develop -c python3 -c 'import tomllib'      # sanity: the runtime python has tomllib
nix develop -c python3 engine/agents.py status  # run a command in the shell
nix flake check                                 # evaluate + build the devShell

The agent CLI the PO uses (claude) is an external, non-Nix tool — install it per its own docs and ensure it is on PATH before launching the live agent.
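Since claude comes from outside the devShell, a quick preflight check before launching the live agent can save a failed start. The sketch below is one possible way to do that with the standard library; `missing_tools` is a hypothetical helper, not something shipped in scripts/.

```python
import shutil
from typing import Callable, Iterable, Optional


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


# tmux and git come from the devShell; claude has to be installed separately.
missing = missing_tools(["tmux", "git", "claude"])
if missing:
    print("not on PATH:", ", ".join(missing))
else:
    print("all tools found")
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is enough.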

First-time setup / re-scaffolding

Nothing creates the first PO — it is hand-scaffolded once. docs/bootstrap.md records exactly how (it is how this repo was made), and how an existing PO can later re-scaffold itself.

Cloning

Because the harness is a submodule, clone recursively:

git clone --recurse-submodules https://git.autonomic.zone/recipe-maintainers/project-orchestrator.git
# or, after a plain clone:
git submodule update --init --recursive