fix(mumble): widen handshake readiness budget 60s->180s (load flake stabilization)

The TCP READY_PROBE only proves that port 64738 is listening; the murmur control channel needs
further warmup before it can complete a full TLS+ServerSync handshake. Under concurrent sweep load
that warmup exceeded the 60s budget (green in isolation, red under load). Raising attempts from 12
to 36 at a 5s interval widens the budget to 180s and absorbs the delay. The assertions are
unchanged, so a dead server still fails after all retries.
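
For reference, a minimal sketch of the bounded-retry pattern described above. The names `retry_until_ready`, `nominal_budget`, and `probe` are hypothetical and chosen for illustration; this is not the real `_mumble_proto.retry_handshake`, whose implementation is not part of this commit. It assumes the probe returns a dict with an `ok` flag.

```python
import time


def nominal_budget(attempts, interval):
    """Budget as the commit computes it: attempts x interval (probe time not counted)."""
    return attempts * interval


def retry_until_ready(probe, attempts, interval, sleep=time.sleep):
    """Call probe() up to `attempts` times, pausing `interval` seconds between tries.

    Returns the last probe result. A probe that never succeeds still yields
    ok=False, so widening the budget does not turn a dead server into a pass.
    """
    result = {"ok": False, "error": "no attempts made"}
    for attempt in range(1, attempts + 1):
        try:
            result = probe()
        except OSError as exc:  # e.g. connection refused while the server warms up
            result = {"ok": False, "error": str(exc)}
        if result.get("ok"):
            result["attempts_used"] = attempt
            return result
        if attempt < attempts:  # no pause after the final attempt
            sleep(interval)
    result["attempts_used"] = attempts
    return result
```

With a probe that only succeeds on its 20th call, 12 attempts run out while 36 attempts succeed. A probe that never succeeds fails under both settings, which is why the test assertions did not need to change.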
2026-06-18 01:58:16 +00:00
parent 61211dba70
commit 07fc6d4af5

@@ -19,7 +19,14 @@ import _mumble_proto  # noqa: E402
 def test_handshake_completes_with_channel_presence(live_app):
-    r = _mumble_proto.retry_handshake(attempts=12, interval=5.0)
+    # Readiness budget: 36×5s = 180s. The TCP READY_PROBE (recipe_meta) only proves port 64738 is
+    # LISTENING; the murmur control channel needs additional warmup before it completes a full
+    # TLS+Version+ServerSync handshake. Under concurrent node load (the canon sweep) that warmup
+    # exceeded the old 60s budget and flaked this test RED, while it is reliably GREEN in isolation
+    # (phase redfix M1: 3× isolation green, 0 isolation reds). The longer budget absorbs the
+    # load-induced readiness delay WITHOUT weakening the assertion — a genuinely non-responsive
+    # server still exhausts all retries and FAILs (the asserts below are unchanged).
+    r = _mumble_proto.retry_handshake(attempts=36, interval=5.0)
     assert r["tls_connect"], f"TLS connection to 127.0.0.1:64738 failed — {r.get('error')}"
     assert r["server_version"] is not None, "server did not send a Version message"