project-orchestrator/README.md
mfowler 0456837444 feat(po): drop periodic fleet sweep — operator-driven, recover-if-dead only
The PO's job is to manage projects on request, not watch them live. Remove the
hourly wake/sweep entirely:

- agents.toml: watch="heal" (recover-if-dead), no `wake` field
- prompts/supervise.md: deleted
- prompts/orchestrator.md, README.md, docs/bootstrap.md, docs/manage-projects.md:
  drop sweep/wake references; document operator-driven, no periodic sweep

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 15:04:08 +00:00


# project-orchestrator (PO)
The **project-orchestrator** manages a *fleet* of independent projects — creating, starting,
stopping, updating, and monitoring them. It is itself **just a project** that uses the
[`agent-orchestrator`](https://git.autonomic.zone/recipe-maintainers/agent-orchestrator) harness
(vendored as the `engine/` submodule). What makes it the PO is its *job*, not its architecture: there
is no special "control-plane" code path.
```
project-orchestrator/
  agents.toml       this project's harness config (one persistent fleet-management agent)
  engine/           the agent-orchestrator harness, pinned as a submodule @ v0.1.0
  prompts/          the PO agent's role (orchestrator.md)
  fleet.toml        THE fleet registry: the only record of project ↔ harness ↔ ref ↔ location
  scripts/          fleet.py + create/start/stop/update-project.sh: the management helpers
  docs/             runbooks: manage-projects.md, fleet-registry.md, bootstrap.md
  flake.nix/.lock   a Nix devShell (python311 + tmux + git)
  .ao-state/        runtime state + logs (gitignored)
```
## The one rule: knowledge is one-directional (PO → projects, never the reverse)
A project repo contains **nothing** about the PO or the fleet — no fleet metadata, no `fleet.toml`,
no mention of a PO. A project can be run and inspected entirely by hand and would have no idea a PO
exists. The *only* record of which projects exist, where, which harness, what ref, and whether
they're enabled is this repo's **`fleet.toml`**. The PO knows the projects; the projects never know
the PO.
What a project *does* carry is its `engine/` submodule pin — a plain git fact (this harness, this
ref) with no fleet semantics. The PO's `fleet.toml` mirrors that for fleet inventory.
## Quick start
```bash
nix develop # python311 + tmux + git on PATH (see "Nix")
python3 engine/agents.py status # the PO's own agents (reads agents.toml)
python3 scripts/fleet.py status # the fleet registry: projects, enabled, harness@ref
python3 engine/agents.py up # start the PO agent + its watchdog (needs claude on PATH)
python3 engine/agents.py down # stop everything
```
The PO agent itself is one persistent `claude` session (`prompts/orchestrator.md`), kept alive by the
harness watchdog (recover-if-dead) but **never** woken on a timer: it is operator-driven and manages
projects on request rather than watching them live. See `engine/README.md` for the full harness
reference.
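
The recover-if-dead behaviour can be pictured as a check that restarts the agent only when it is found dead and otherwise leaves it alone. The sketch below is illustrative and is not the harness's code: `heal_once`, `is_alive`, and `restart` are hypothetical names, and how the real watchdog detects a dead session is covered in `engine/README.md`.

```python
from typing import Callable


def heal_once(is_alive: Callable[[], bool], restart: Callable[[], None]) -> bool:
    """Restart the agent only if it is dead; return True if a restart happened.

    A live agent is left untouched: nothing is sent to it and no work is scheduled.
    """
    if is_alive():
        return False
    restart()
    return True
```

The point of the shape is what is absent: there is no branch that wakes a healthy agent, so any work it does comes from an operator request.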
## Managing the fleet
Full runbooks in [`docs/manage-projects.md`](docs/manage-projects.md). In short:
```bash
# create a new project: scaffolds engine/ submodule + harness config, NO PO/fleet metadata inside it
scripts/create-project.sh my-new-project --ref v0.1.0 [--register]
# drive an existing fleet project's harness (resolves location via fleet.toml)
scripts/start-project.sh <name>
scripts/stop-project.sh <name>
scripts/update-project.sh <name> <new-ref> # bump that project's engine pin (per-project, opt-in)
# inspect the fleet
python3 scripts/fleet.py list
python3 scripts/fleet.py status
python3 scripts/fleet.py validate
python3 scripts/fleet.py get <name>
```
The fleet registry schema is documented in [`docs/fleet-registry.md`](docs/fleet-registry.md).
## Nix
A `flake.nix` provides a reproducible devShell with the runtime dependencies: `python311` (for the
stdlib `tomllib`), `tmux`, and `git` with submodule support:
```bash
nix develop # enter the shell
nix develop -c python3 -c 'import tomllib' # sanity: the runtime python has tomllib
nix develop -c python3 engine/agents.py status # run a command in the shell
nix flake check # evaluate + build the devShell
```
The agent CLI the PO uses (`claude`) is an **external, non-Nix tool** — install it per its own docs
and ensure it is on `PATH` before launching the live agent.
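
A quick pre-flight check for this can be written with the stdlib `shutil.which`. The helper name `require_cli` is hypothetical and is not part of the harness; it is only a sketch of how one might fail early with a clear message.

```python
import shutil


def require_cli(name: str = "claude") -> str:
    """Return the absolute path of `name` on PATH, or raise if it is missing."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"{name} not found on PATH; install it before starting the agent")
    return path
```

Calling `require_cli()` before `python3 engine/agents.py up` would surface a missing CLI as an explicit error instead of a failed agent start.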
## First-time setup / re-scaffolding
Nothing creates the *first* PO — it is hand-scaffolded once.
[`docs/bootstrap.md`](docs/bootstrap.md) records exactly how (it is how this repo was made), and how
an existing PO can later re-scaffold itself.
## Cloning
Because the harness is a submodule, clone recursively:
```bash
git clone --recurse-submodules https://git.autonomic.zone/recipe-maintainers/project-orchestrator.git
# or, after a plain clone:
git submodule update --init --recursive
```