feat: optional log_tokens — per-phase token + time accounting

When [watchdog].log_tokens (or [loop].log_tokens) is true, the watchdog records
for each phase how many tokens each agent used (and the total) and how long the
phase took, appended to <log_dir>/token-log.jsonl. Tokens are summed from each
agent's session transcript, attributed by working dir. View with `agents.py
tokens`. Baseline snapshot at phase start + delta at phase advance/complete;
robust across watchdog restarts. Validated: the transcript sum matches an
independent external collector exactly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 21:48:17 +00:00
parent e0425e6108
commit 924874aafa
2 changed files with 149 additions and 0 deletions


@@ -63,6 +63,23 @@ heavy_interval = 300 # seconds between heal + phase-advance checks
limit_probe_fallback = 300 # re-probe cadence for a usage-limited agent when reset time is unparsable
limit_reset_slack = 45 # seconds to wait past a parsed reset before probing
stall_grace = 180 # seconds of slack past a WAITING-UNTIL marker before a stall reboot
log_tokens = false # opt-in: record per-phase token + time usage (see below)
```
**Per-phase token + time logging (`log_tokens`).** Set `log_tokens = true` (under `[watchdog]` or
`[loop]`) and the watchdog records, for **each phase**, how many tokens **each agent** used and how
long the phase took — appended as one JSON object per phase to `<log_dir>/token-log.jsonl`. Tokens
are summed from each agent's Claude Code session transcript and attributed **by working dir**, so
give each agent its own `dir` (the Builder/Adversary loop pair already uses separate clones) for
accurate per-agent numbers. The watchdog snapshots a baseline when a phase starts and writes the
delta (per agent, and the total) when the phase advances or the sequence completes. The accounting
is robust across watchdog restarts. Pretty-print the log with `agents.py tokens`:
```
phase       dur(s)     builder   adversary       TOTAL
-----------------------------------------------------
lex          372.0   3,910,118   3,221,447   7,131,565
parse        410.5   ...
```
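The baseline-plus-delta bookkeeping described above can be sketched roughly as follows. This is an illustrative sketch, not the code in `agents.py`: the helper names (`phase_delta`, `append_record`, `read_records`) and the record fields (`phase`, `duration_s`, `agents`, `total`) are assumptions made for the example, and the actual layout of `token-log.jsonl` may differ.

```python
import json


def phase_delta(baseline, current):
    """Tokens each agent used since the baseline snapshot, plus the total.

    Both arguments map agent name -> cumulative token count (hypothetical shape).
    An agent missing from the baseline is treated as starting from zero.
    """
    per_agent = {name: current[name] - baseline.get(name, 0) for name in current}
    return {"agents": per_agent, "total": sum(per_agent.values())}


def append_record(log_path, phase, duration_s, baseline, current):
    """Append one JSON object for a finished phase to a JSONL file."""
    record = {"phase": phase, "duration_s": duration_s}
    record.update(phase_delta(baseline, current))
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def read_records(log_path):
    """Read every non-empty line of a JSONL file back as a dict."""
    with open(log_path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Under these assumptions, calling `append_record("token-log.jsonl", "lex", 372.0, baseline, current)` when a phase ends appends one line to the log, and `read_records` returns the same records for reporting. Appending one self-contained line per phase means a reader never has to parse a partially rewritten file.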
### `[defaults]` — inherited by every agent
@@ -240,6 +257,7 @@ agents.py status table of every agent: kind, backend, model, w
agents.py watchdog                 the supervisor loop (what the <prefix>watchdog session runs)
agents.py logs <name>              tail that session's log
agents.py phase [show|next|set N]  inspect / move the loop phase index
agents.py tokens                   per-phase token + time report (when [watchdog].log_tokens = true)
agents.py selftest                 regression-test the backend activity detector (needs no config)
agents.py init [dir]               scaffold a starter agents.toml + prompts/ in a project dir
--config PATH                      use a specific config (default: ./agents.toml)