The PO's job is to manage projects on request, not watch them live. Remove the hourly wake/sweep entirely:

- agents.toml: watch="heal" (recover-if-dead), no `wake` field
- prompts/supervise.md: deleted
- prompts/orchestrator.md, README.md, docs/bootstrap.md, docs/manage-projects.md: drop sweep/wake references; document operator-driven, no periodic sweep

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

# project-orchestrator (PO)

The **project-orchestrator** manages a *fleet* of independent projects: creating, starting,
stopping, updating, and monitoring them. It is itself **just a project** that uses the
[`agent-orchestrator`](https://git.autonomic.zone/recipe-maintainers/agent-orchestrator) harness
(vendored as the `engine/` submodule). What makes it the PO is its *job*, not its architecture: there
is no special "control-plane" code path.

```
project-orchestrator/
  agents.toml      this project's harness config (one persistent fleet-management agent)
  engine/          the agent-orchestrator harness, pinned as a submodule @ v0.1.0
  prompts/         the PO agent's role (orchestrator.md)
  fleet.toml       THE fleet registry: the only record of project ↔ harness ↔ ref ↔ location
  scripts/         fleet.py + create/start/stop/update-project.sh (the management helpers)
  docs/            runbooks: manage-projects.md, fleet-registry.md, bootstrap.md
  flake.nix/.lock  a Nix devShell (python311 + tmux + git)
  .ao-state/       runtime state + logs (gitignored)
```

## The one rule: knowledge is one-directional (PO → projects, never the reverse)

A project repo contains **nothing** about the PO or the fleet: no fleet metadata, no `fleet.toml`,
no mention of a PO. A project can be run and inspected entirely by hand and would have no idea a PO
exists. The *only* record of which projects exist, where, which harness, what ref, and whether
they're enabled is this repo's **`fleet.toml`**. The PO knows the projects; the projects never know
the PO.

What a project *does* carry is its `engine/` submodule pin, a plain git fact (this harness, this
ref) with no fleet semantics. The PO's `fleet.toml` mirrors that for fleet inventory.

## Quick start

```bash
nix develop                        # python311 + tmux + git on PATH (see "Nix")

python3 engine/agents.py status    # the PO's own agents (reads agents.toml)
python3 scripts/fleet.py status    # the fleet registry: projects, enabled, harness@ref

python3 engine/agents.py up        # start the PO agent + its watchdog (needs claude on PATH)
python3 engine/agents.py down      # stop everything
```

The PO agent itself is one persistent `claude` session (`prompts/orchestrator.md`), kept alive by the
harness watchdog (recover-if-dead) but **never** woken on a timer: it is operator-driven and manages
projects on request rather than watching them live. See `engine/README.md` for the full harness
reference.

## Managing the fleet

Full runbooks in [`docs/manage-projects.md`](docs/manage-projects.md). In short:

```bash
# create a new project (--register is optional): scaffolds engine/ submodule + harness config,
# NO PO/fleet metadata inside it
scripts/create-project.sh my-new-project --ref v0.1.0 [--register]

# drive an existing fleet project's harness (resolves location via fleet.toml)
scripts/start-project.sh <name>
scripts/stop-project.sh <name>
scripts/update-project.sh <name> <new-ref>   # bump that project's engine pin (per-project, opt-in)

# inspect the fleet (one subcommand per invocation)
python3 scripts/fleet.py list
python3 scripts/fleet.py status
python3 scripts/fleet.py validate
python3 scripts/fleet.py get <name>
```

The fleet registry schema is documented in [`docs/fleet-registry.md`](docs/fleet-registry.md).

## Nix

A `flake.nix` provides a reproducible devShell with the runtime deps (`python311` for stdlib
`tomllib`, plus `tmux` and `git` incl. submodule support):

```bash
nix develop                                      # enter the shell
nix develop -c python3 -c 'import tomllib'       # sanity: the runtime python has tomllib
nix develop -c python3 engine/agents.py status   # run a command in the shell
nix flake check                                  # evaluate + build the devShell
```

The agent CLI the PO uses (`claude`) is an **external, non-Nix tool**: install it per its own docs
and ensure it is on `PATH` before launching the live agent.

## First-time setup / re-scaffolding

Nothing creates the *first* PO: it is hand-scaffolded once.
[`docs/bootstrap.md`](docs/bootstrap.md) records exactly how (it is how this repo was made), and how
an existing PO can later re-scaffold itself.

## Cloning

Because the harness is a submodule, clone recursively:

```bash
git clone --recurse-submodules https://git.autonomic.zone/recipe-maintainers/project-orchestrator.git
# or, after a plain clone:
git submodule update --init --recursive
```