Revert "feat(lvl5): P1 — 5-rung ladder (L5=abra recipe lint) + de-capped level semantics"
All checks were successful
continuous-integration/drone/push Build is passing
This reverts commit e219a7891d.
@@ -1,22 +1,20 @@
-"""Structured run results + results.json (Phase 3 §4.2 R1/R3; level semantics: phase lvl5).
+"""Phase 3 — structured run results + results.json (plan-phase3-results-ux.md §4.2, R1/R3).
 
-Turns a run's per-tier pytest outcomes into a single `results.json` artifact carrying:
+Turns a run's per-tier pytest outcomes into a single `results.json` artifact carrying, per the plan:
   { recipe, version, pr, ref, run_id, finished, stages:[{name,status,tests:[{name,status,ms}]}],
-    level, rungs, lint:{status,detail,rules_failed},
+    level, level_cap_reason, level_cap_rung, rungs,
     skips:{intentional:{rung:reason}, unintentional:[rung]},
     flags:{clean_teardown,no_secret_leak}, screenshot, summary_card }
 
-Rung statuses (phase lvl5, operator-decided — see harness.level + DECISIONS.md): every rung is
-"pass" | "fail" | "skip" (INTENTIONAL — a declared/structural fact says the rung does not apply)
-| "unver" (UNINTENTIONAL — the rung should have run and wasn't verified; blocks the level like a
-fail). `derive_rungs` is the single place every N/A source is classified; anything it cannot
-attribute to a declared/structural fact defaults to "unver" (conservative). `skips` mirrors that
-split into results.json: intentional {rung: reason} / unintentional [rung] (= the unver rungs).
+`skips` splits the N/A (skipped) rungs by a simple rule: a skip is INTENTIONAL iff the recipe lists
+it (with a reason) in `recipe_meta.EXPECTED_NA = {rung: reason}`; any rung skipped but not listed is
+UNINTENTIONAL (a coverage gap to fill or declare). Skips still cap the level either way — the harness
+never claims a rung it did not verify; this only labels *why* a skip happened.
 
 The per-test breakdown comes from JUnit XML emitted by each tier's pytest invocation (`--junitxml`),
 parsed here with the stdlib (no new dep). The integer **level** is computed by harness.level from a
-rung-status dict derived here (`derive_rungs`) from the tier results + structural signals the
-orchestrator holds; the classification table is in DECISIONS.md (phase lvl5).
+rung-status dict derived here (`derive_rungs`) from the tier results + deps/SSO signals the
+orchestrator holds; that mapping is documented in DECISIONS.md (Phase 3).
 
 This module is import-pure (no side effects at import). `write_results` is the only writer; the
 orchestrator calls the build/write path inside a try/except so a results failure NEVER changes the
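The module docstring above says the per-test breakdown is parsed from each tier's `--junitxml` output using only the standard library. The parser itself is not part of this diff, so what follows is only a minimal sketch of that idea, assuming pytest's usual JUnit layout (`<testcase>` elements with optional `<failure>`, `<error>` or `<skipped>` children). `parse_junit` and `SAMPLE` are hypothetical names, not the module's API.

```python
import xml.etree.ElementTree as ET


def parse_junit(xml_text: str) -> list:
    """Return [{name, status, ms}] for every <testcase> in a JUnit XML document.

    Hypothetical helper: the real parser in this module is not shown in the diff.
    """
    root = ET.fromstring(xml_text)
    tests = []
    for case in root.iter("testcase"):
        # A failure or error child means the test did not pass.
        if case.find("failure") is not None or case.find("error") is not None:
            status = "fail"
        elif case.find("skipped") is not None:
            status = "skip"
        else:
            status = "pass"
        # JUnit reports seconds as a decimal string; results.json carries ms.
        ms = int(round(float(case.get("time", "0")) * 1000))
        tests.append({"name": case.get("name", ""), "status": status, "ms": ms})
    return tests


# Invented sample input, shaped like a small pytest --junitxml report.
SAMPLE = """<testsuites><testsuite name="pytest" tests="3">
  <testcase classname="t" name="test_ok" time="0.012"/>
  <testcase classname="t" name="test_bad" time="0.5"><failure message="boom"/></testcase>
  <testcase classname="t" name="test_skip" time="0"><skipped message="n/a"/></testcase>
</testsuite></testsuites>"""

parsed = parse_junit(SAMPLE)
```

Each entry matches the `tests:[{name,status,ms}]` shape named in the docstring; how the real module folds these into a per-stage status is not visible here.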
@@ -140,90 +138,53 @@ def derive_rungs(
     results: dict[str, str],
     *,
     backup_capable: bool,
-    has_upgrade_target: bool,
-    expected_na: dict | None = None,
-    lint_status: str | None = None,
+    has_custom: bool,
 ) -> dict[str, str]:
-    """Translate the orchestrator's tier results + structural signals into the rung-status dict
-    harness.level consumes — the FIVE essential rungs. This is the SINGLE place every N/A source
-    is classified intentional ("skip") vs unintentional ("unver"); the table lives in DECISIONS.md
-    (phase lvl5). Conservative by design: never reports "pass" it can't substantiate, and any
-    rung that did not produce a pass/fail and has NO declared/structural reason is "unver".
+    """Translate the orchestrator's tier results into the rung-status dict harness.level consumes —
+    the FOUR essential rungs only. Conservative by design — never reports a rung 'pass' it can't
+    substantiate (cardinal guardrail: presentation never inflates).
 
-      L1 install    : install tier pass. Always applies — never "skip" (non-run → unver).
-      L2 upgrade    : upgrade tier. Tier skipped + no upgrade target (only one published
-                      version, structural) → "skip"; declared in EXPECTED_NA → "skip";
-                      anything else non-pass/fail (prior-stage abort, tier excluded) → "unver".
-      L3 backup/res : backup AND restore tiers pass. Not backup-capable (declared/structural)
-                      → "skip"; EXPECTED_NA → "skip"; unverified-while-capable → "unver".
-      L4 functional : the custom tier. No custom tests / tier skipped → EXPECTED_NA-declared
-                      "skip", else "unver" (absent functional coverage is a gap, not an
-                      intentional property of the recipe).
-      L5 lint       : from the lint executor (harness.lint). pass/fail only — every recipe can
-                      be linted, so there is NO intentional-skip escape hatch: a lint that
-                      could not run (timeout, abra missing, executor error) is "unver".
+      L1 install    : install tier pass.
+      L2 upgrade    : upgrade tier (skip → N/A: only one published version).
+      L3 backup/res : backup AND restore tiers pass (N/A if not backup-capable).
+      L4 functional : recipe-specific functional tests pass — the custom tier. N/A if none ran.
 
     Integration (SSO/OIDC) and recipe-local are OPTIONAL and intentionally NOT rungs here — they
-    never affect the level (SSO is still enforced for the run VERDICT in run_recipe_ci.py).
+    never cap the level (SSO is still enforced for the run VERDICT in run_recipe_ci.py).
     """
-    expected = set((expected_na or {}).keys())
     rungs: dict[str, str] = {}
     rungs["install"] = level_mod.tier_to_rung(results.get("install"))
-
-    up = results.get("upgrade")
-    if up in ("pass", "fail"):
-        rungs["upgrade"] = up
-    elif up == "skip" and not has_upgrade_target:
-        # The orchestrator skipped the tier for the structural reason: nothing to upgrade from.
-        rungs["upgrade"] = "skip"
-    elif "upgrade" in expected:
-        rungs["upgrade"] = "skip"
-    else:
-        rungs["upgrade"] = "unver"
-
-    br = level_mod.backup_restore_status(
+    rungs["upgrade"] = level_mod.tier_to_rung(results.get("upgrade"))
+    rungs["backup_restore"] = level_mod.backup_restore_status(
         results.get("backup"), results.get("restore"), backup_capable
     )
-    if br == "unver" and "backup_restore" in expected:
-        br = "skip"
-    rungs["backup_restore"] = br
 
     custom = results.get("custom")
-    if custom in ("pass", "fail"):
-        rungs["functional"] = custom
-    elif "functional" in expected:
-        rungs["functional"] = "skip"
-    else:
-        rungs["functional"] = "unver"
-
-    rungs["lint"] = lint_status if lint_status in ("pass", "fail") else "unver"
+    if not has_custom or custom == "skip" or custom is None:
+        rungs["functional"] = "na"
+    elif custom == "fail":
+        rungs["functional"] = "fail"
+    else:  # custom == "pass"
+        rungs["functional"] = "pass"
     return rungs
 
 
-# Reasons attached to STRUCTURAL intentional skips (no EXPECTED_NA declaration needed — the
-# fact is read off the recipe itself).
-_STRUCTURAL_REASON = {
-    "upgrade": "only one published version — no upgrade target",
-    "backup_restore": "not backup-capable (no backupbot labels / declared)",
-}
+def skips(rungs: dict[str, str], expected_na: dict | None) -> dict:
+    """Split the SKIPPED (N/A) rungs into intentional vs unintentional (operator model).
 
-
-def skips(
-    rungs: dict[str, str],
-    expected_na: dict | None,
-) -> dict:
-    """Mirror the rung classification into results.json's `skips` block:
-      { "intentional": {rung: reason, ...},   # status "skip" — declared/structural, with why
-        "unintentional": [rung, ...] }        # status "unver" — should have run, wasn't verified
-    The reason is the recipe's EXPECTED_NA declaration when present, else the structural fact
-    derive_rungs skipped on. Purely descriptive — the level math lives in harness.level."""
+    A recipe lists the rungs it intentionally skips, each with a reason, in
+    `recipe_meta.EXPECTED_NA = {rung: reason}`. The rule is dead simple: a skipped rung is
+    **intentional** iff it is in that list; any rung that is skipped and NOT in the list is
+    **unintentional** (a coverage gap someone should either fill or declare). N/A still caps the
+    level either way — the harness never claims a rung it did not verify — this only labels *why* a
+    skip happened. Returns:
+      { "intentional": {rung: reason, ...},   # skipped AND declared in EXPECTED_NA
+        "unintentional": [rung, ...] }        # skipped but NOT declared
+    """
     expected = {str(k): str(v) for k, v in (expected_na or {}).items()}
-    intentional = {
-        r: expected.get(r) or _STRUCTURAL_REASON.get(r, "declared intentional")
-        for r, st in rungs.items()
-        if st == "skip"
-    }
-    unintentional = sorted(r for r, st in rungs.items() if st == "unver")
+    na = [r for r, st in rungs.items() if st == "na"]
+    intentional = {r: expected[r] for r in na if r in expected}
+    unintentional = sorted(r for r in na if r not in expected)
     return {"intentional": intentional, "unintentional": unintentional}
 
 
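To make the restored rule concrete, the sketch below copies the reverted-to `skips` body from the hunk above and runs it on a made-up rung dict. `functional_status` is a hypothetical helper that isolates the restored `functional` branch of `derive_rungs`; the sample rungs and the `EXPECTED_NA` entry are invented for illustration.

```python
from __future__ import annotations


def functional_status(has_custom: bool, custom: str | None) -> str:
    """Restored L4 rule: N/A when no custom tests ran, else pass/fail.

    Hypothetical wrapper around the branch shown in the diff.
    """
    if not has_custom or custom == "skip" or custom is None:
        return "na"
    return "fail" if custom == "fail" else "pass"


def skips(rungs: dict[str, str], expected_na: dict | None) -> dict:
    """Restored rule: an N/A rung is intentional iff the recipe declared it."""
    expected = {str(k): str(v) for k, v in (expected_na or {}).items()}
    na = [r for r, st in rungs.items() if st == "na"]
    intentional = {r: expected[r] for r in na if r in expected}
    unintentional = sorted(r for r in na if r not in expected)
    return {"intentional": intentional, "unintentional": unintentional}


# Invented example: two N/A rungs, only one of them declared by the recipe.
rungs = {
    "install": "pass",
    "upgrade": "na",
    "backup_restore": "na",
    "functional": functional_status(has_custom=True, custom="pass"),
}
expected_na = {"upgrade": "only one published version"}
split = skips(rungs, expected_na)
```

Here `upgrade` lands under `intentional` with its declared reason, while `backup_restore` is reported as `unintentional` because nothing in `EXPECTED_NA` accounts for it. Both still count as N/A for the level.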
@@ -239,8 +200,6 @@ def build_results(
     clean_teardown: bool,
     no_secret_leak: bool,
     finished_ts: float | None,
-    has_upgrade_target: bool = True,
-    lint: dict | None = None,
     screenshot: str | None = None,
     summary_card: str | None = None,
     expected_na: dict | None = None,
@@ -248,41 +207,17 @@ def build_results(
 ) -> dict:
     """Assemble the full results.json dict (no I/O). `finished_ts` is passed in (the orchestrator
     stamps it) so this stays pure and deterministic for unit tests. `expected_na` is the recipe's
-    declared intentional-skip map (recipe_meta.EXPECTED_NA); `has_upgrade_target` is the structural
-    "a previous published version exists" fact; `lint` is harness.lint.run_lint's result dict
-    (None — e.g. an old caller — derives the lint rung as "unver": never a silent pass)."""
+    declared intentional-skip map (recipe_meta.EXPECTED_NA) used to distinguish a deliberate skip from
+    accidentally-missing coverage."""
     stages = collect_stages(records)
-    lint = lint or {}
-    lint_status = lint.get("status")
-    rungs = derive_rungs(
-        results,
-        backup_capable=backup_capable,
-        has_upgrade_target=has_upgrade_target,
-        expected_na=expected_na,
-        lint_status=lint_status,
-    )
-    # Surface lint in the per-stage table too (it has no pytest/JUnit tier), so the card's
-    # stage breakdown carries all five rungs.
-    if rungs["lint"] != "skip":  # lint is never "skip", but stay defensive
-        stages.append(
-            {
-                "name": "lint",
-                "status": rungs["lint"],
-                "tests": [
-                    {
-                        "name": "abra recipe lint",
-                        "classname": "lint",
-                        "source": "harness",
-                        "status": rungs["lint"],
-                        "ms": 0,
-                        "message": str(lint.get("detail") or ""),
-                    }
-                ],
-            }
-        )
-    lvl = level_mod.compute_level(rungs)
+    has_custom = any(r["tier"] == "custom" for r in records)
+    rungs = derive_rungs(results, backup_capable=backup_capable, has_custom=has_custom)
+    lvl, cap_reason = level_mod.compute_level(rungs)
+    # The rung that capped the climb (lowest non-pass), or None on a full climb — lets a consumer
+    # (card/badge) tell whether the cap was an intentional skip, an unintentional one, or a failure.
+    capped = level_mod.RUNGS[lvl] if cap_reason else None
     return {
-        "schema": 2,
+        "schema": 1,
         "run_id": run_id(),
         "recipe": recipe,
         "version": version,
@@ -290,12 +225,9 @@ def build_results(
         "ref": (ref or "")[:12],
         "finished": finished_ts,
         "level": lvl,
+        "level_cap_reason": cap_reason,
+        "level_cap_rung": capped,
         "rungs": rungs,
-        "lint": {
-            "status": rungs["lint"],
-            "detail": str(lint.get("detail") or ""),
-            "rules_failed": list(lint.get("rules_failed") or []),
-        },
         "skips": skips(rungs, expected_na),
         "stages": stages,
         "results": results,
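The restored `build_results` unpacks `compute_level` into `(lvl, cap_reason)` and then reads the capping rung as `RUNGS[lvl]`. `harness.level` is not part of this diff, so the sketch below is only one way those two pieces could fit together. It assumes that `RUNGS` is the four-rung ladder in docstring order, that the level counts consecutive passing rungs from the bottom, and that the reason is a plain string. All three are assumptions, as is the helper name `cap_rung`.

```python
from __future__ import annotations

# Assumed ladder order, taken from the L1..L4 list in the derive_rungs docstring.
RUNGS = ("install", "upgrade", "backup_restore", "functional")


def compute_level(rungs: dict[str, str]) -> tuple[int, str | None]:
    """Hypothetical stand-in for harness.level.compute_level.

    Climbs the ladder until the first rung that is not "pass" and returns
    (rungs climbed, why the climb stopped), or (len(RUNGS), None) on a full climb.
    """
    level = 0
    for name in RUNGS:
        status = rungs.get(name, "na")
        if status != "pass":
            # Any non-pass (fail or N/A) caps the level at the rungs below it.
            return level, f"{name} is {status}"
        level += 1
    return level, None


def cap_rung(rungs: dict[str, str]) -> str | None:
    """Mirror of the restored build_results lines: index RUNGS by the level."""
    lvl, cap_reason = compute_level(rungs)
    # Guarding on cap_reason avoids indexing past the end on a full climb.
    return RUNGS[lvl] if cap_reason else None


# Invented example: the climb stops at backup_restore even though functional passed.
capped_example = {"install": "pass", "upgrade": "pass", "backup_restore": "na", "functional": "pass"}
full_example = {name: "pass" for name in RUNGS}
```

Under these assumptions the level equals the index of the first non-pass rung, which is why `RUNGS[lvl]` names the rung that capped the climb, and why the `if cap_reason` guard is needed when every rung passes.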