feat: campaign mode — repeat each variant N times, aggregate distributions

run-harness-bench.sh now loops VARIANTS × BENCH_REPEATS (default 5), writes each
run's row to RESULTS-campaign.md.data immediately (survives interruption), and
aggregates per-variant median/mean/min/max/stdev + median duration into
RESULTS-campaign.md. Frees each run's repo/transcripts after tallying.
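The loop-then-aggregate flow described above could look roughly like the sketch below. It is not the code from run-harness-bench.sh: `run_variant`, the variant names, the space-separated row layout (variant, repeat, score, duration), and the use of the sample standard deviation are all assumptions made for illustration. Only `VARIANTS`, `BENCH_REPEATS` (default 5), and the `RESULTS-campaign.md.data` file name come from the commit message.

```shell
#!/bin/sh
# Illustrative sketch only; helper names and row layout are hypothetical.
set -eu

VARIANTS="${VARIANTS:-baseline tuned}"       # variant names are made up
BENCH_REPEATS="${BENCH_REPEATS:-5}"          # default of 5 is from the commit message
DATA="${DATA:-RESULTS-campaign.md.data}"

# Hypothetical stand-in for one harness run; prints "score duration_seconds".
run_variant() {
  echo "$(( ${#1} * 10 + $2 )) $(( 60 + $2 ))"
}

# Loop VARIANTS x BENCH_REPEATS, appending each row as soon as its run ends,
# so rows from completed runs survive an interrupted campaign.
campaign() {
  : > "$DATA"
  for variant in $VARIANTS; do
    rep=1
    while [ "$rep" -le "$BENCH_REPEATS" ]; do
      row=$(run_variant "$variant" "$rep")
      printf '%s %s %s\n' "$variant" "$rep" "$row" >> "$DATA"
      rep=$((rep + 1))
    done
  done
}

# Print one line per variant:
# variant runs median mean min max stdev median_duration
aggregate() {
  sort -k1,1 "$DATA" | awk '
    # Insertion-sort a[1..n] in place, then return the median.
    function median(a, n,    i, j, t) {
      for (i = 2; i <= n; i++) {
        t = a[i]
        for (j = i - 1; j >= 1 && a[j] > t; j--) a[j + 1] = a[j]
        a[j + 1] = t
      }
      return (n % 2) ? a[(n + 1) / 2] : (a[n / 2] + a[n / 2 + 1]) / 2
    }
    # Emit statistics for the variant collected so far, then reset.
    function emit(    i, sum, sq, mean, sd, medd, meds) {
      if (n == 0) return
      sum = 0; sq = 0
      for (i = 1; i <= n; i++) sum += s[i]
      mean = sum / n
      for (i = 1; i <= n; i++) sq += (s[i] - mean) ^ 2
      sd = (n > 1) ? sqrt(sq / (n - 1)) : 0   # sample stdev (assumption)
      medd = median(d, n)
      meds = median(s, n)                     # leaves s sorted: s[1]=min, s[n]=max
      printf "%s %d %g %g %g %g %.2f %g\n", cur, n, meds, mean, s[1], s[n], sd, medd
      n = 0
    }
    $1 != cur { emit(); cur = $1 }
    { n++; s[n] = $3 + 0; d[n] = $4 + 0 }
    END { emit() }
  '
}
```

Calling `campaign` and then `aggregate` prints the per-variant summary. Because `aggregate` reads only the data file, it can also be run on its own against whatever rows an interrupted campaign left behind.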
2026-06-14 22:19:10 +00:00
parent b46dca003c
commit 37032ee363
2 changed files with 95 additions and 97 deletions

.gitignore (2 lines changed)

@@ -4,3 +4,5 @@ __pycache__/
 *.pyc
 *.tmp
 RESULTS-harness.md.tmp
+RESULTS-campaign.md.data
+RESULTS-campaign.md.data.hdr