Biostatistics

Win Ratio's Stress Test: Four Papers, Three SAP Fixes

A concentrated burst of methodology work shows the win ratio can contradict its own component results, is noncollapsible, and needs RMST as a co-report.

The win ratio arrived with a compelling pitch: it respects clinical hierarchy, handles composite endpoints sensibly, and offers an interpretable pairwise contrast. Cardiovascular outcomes trials bought in. So, naturally, the methodology literature chose this moment to stress-test it in public.

Four papers published in a single cycle, in Statistics in Medicine and the Journal of Biopharmaceutical Statistics, arrive at findings that should prompt a quiet review of any statistical analysis plan (SAP) that currently pre-specifies the win ratio as a primary endpoint.

The most arresting result is structural. A May 2026 paper in Statistics in Medicine proves, using nothing more elaborate than a 2×2 frequency table, that treatment can have higher marginal success probabilities on both component endpoints and still return a win ratio below 1. The mechanism is baked into the hierarchical structure: the secondary endpoint is evaluated only within primary-tie strata, inducing a reweighting that can amplify strata where treatment looks worse. The paper proposes “minimal reporting diagnostics” — decomposed component-level summaries and tie-stratum breakdowns — that should be pre-specified, not reconstructed post hoc.
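To see how such a reversal can arise, here is a minimal sketch in Python. The counts are hypothetical, built for this illustration rather than taken from the paper: 100 patients per arm, two binary endpoints, and a hierarchy in which the secondary endpoint only breaks ties on the primary. The names treatment, control, marginal, and win_loss_tie are illustrative.

```python
from itertools import product

# Hypothetical joint counts per arm, keyed by (primary, secondary); 1 = success.
# These numbers are an assumption for illustration, not data from the paper.
treatment = {(1, 1): 32, (1, 0): 50, (0, 1): 17, (0, 0): 1}
control = {(1, 1): 40, (1, 0): 40, (0, 1): 1, (0, 0): 19}


def marginal(arm, idx):
    """Marginal success probability on endpoint idx (0 = primary, 1 = secondary)."""
    n = sum(arm.values())
    return sum(count for key, count in arm.items() if key[idx] == 1) / n


def win_loss_tie(trt, ctl):
    """Count wins, losses and ties over all treatment-control pairs."""
    wins = losses = ties = 0
    for (t, n_t), (c, n_c) in product(trt.items(), ctl.items()):
        pairs = n_t * n_c
        # Tuple comparison is lexicographic: the primary decides first,
        # and the secondary is consulted only when the primary is tied.
        if t > c:
            wins += pairs
        elif t < c:
            losses += pairs
        else:
            ties += pairs
    return wins, losses, ties


wins, losses, ties = win_loss_tie(treatment, control)
win_ratio = wins / losses

print("primary marginals (trt, ctl):", marginal(treatment, 0), marginal(control, 0))
print("secondary marginals (trt, ctl):", marginal(treatment, 1), marginal(control, 1))
print("wins, losses, ties:", wins, losses, ties)
print("win ratio:", round(win_ratio, 3))
```

With these counts, treatment leads on both marginals (82% vs 80% on the primary, 49% vs 41% on the secondary), yet the 10,000 pairs split into 3,243 wins, 3,441 losses, and 3,316 ties, for a win ratio of about 0.94. The secondary comparison is dominated by pairs in which both patients succeed on the primary, which is exactly the stratum where the hypothetical treatment does worse on the secondary.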

Three further findings sharpen the picture. Win statistics are noncollapsible: stratum-specific and marginal estimates diverge even absent confounding, the same structural problem that has already cost sponsors credibility in hazard ratio (HR)-based submissions. The win odds now has a formal covariate adjustment framework via the marginal probabilistic index (MPI), validated on CANTOS and HF-ACTION data, but the method inflates type I error in small samples and the paper offers no correction. And head-to-head simulations confirm that the win ratio outperforms restricted mean survival time (RMST) when effects emerge early, while RMST dominates for late-onset benefits, which makes dual reporting the honest pre-specification choice under non-proportional hazards (NPH).
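The noncollapsibility point can be illustrated with a minimal sketch under a simplifying assumption: for a single binary endpoint, the win ratio reduces to the odds ratio, a measure well known to be noncollapsible. The strata and probabilities below are hypothetical, not drawn from any of the four papers.

```python
# Hypothetical success probabilities (treatment, control) in two equal-sized strata.
strata = {"low_risk": (0.8, 0.5), "high_risk": (0.5, 0.2)}


def win_ratio_binary(p_trt, p_ctl):
    """Win ratio for one binary endpoint.

    A treated patient wins a pair when they succeed and the control fails,
    so the ratio of win to loss probability equals the odds ratio.
    """
    return (p_trt * (1 - p_ctl)) / ((1 - p_trt) * p_ctl)


stratum_wr = {name: win_ratio_binary(pt, pc) for name, (pt, pc) in strata.items()}

# Strata are the same size in both arms (as randomization would give on average),
# so the stratum variable is not a confounder.
p_trt_marginal = sum(pt for pt, _ in strata.values()) / len(strata)
p_ctl_marginal = sum(pc for _, pc in strata.values()) / len(strata)
marginal_wr = win_ratio_binary(p_trt_marginal, p_ctl_marginal)

for name, wr in stratum_wr.items():
    print(f"{name}: win ratio = {wr:.2f}")
print(f"marginal: win ratio = {marginal_wr:.2f}")
```

Both strata give a win ratio of 4, but the pooled comparison gives roughly 3.45, even though the strata are perfectly balanced across arms. Neither number is wrong; they answer different questions, which is why the adjusted and unadjusted estimates both need to be pre-specified.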

The actionable read: add component-level decomposition diagnostics to the SAP, pre-specify covariate-adjusted win odds with an unadjusted sensitivity, and co-report RMST difference for programs where late effects are plausible. None of these requires a protocol change. The win ratio is not broken — it just has more fine print than its early advocates advertised.

Protocol read: The win ratio remains a defensible primary endpoint for hierarchical composites, but the fine print has caught up — pre-specifying component diagnostics, adjusted-and-unadjusted estimates, and RMST co-reports is now where credibility lives.

What to do now:

Add component-level decomposition diagnostics and tie-stratum breakdowns to any SAP currently pre-specifying win ratio as primary.
Pre-specify covariate-adjusted win odds via the MPI framework with an unadjusted sensitivity; document the small-sample type I error caveat.
Co-report RMST difference in programs where late-onset benefits are plausible — the WR/RMST trade-off is now too well-documented to leave to reviewer discretion.
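For readers less familiar with the RMST difference in the last item, here is a minimal sketch of what it measures: the gap in area under the two survival curves up to a chosen horizon. The hazards, the 12-month delay, and the 36-month horizon are hypothetical values chosen for illustration, and the curves are assumed known; in a real analysis they would be estimated from trial data, for example by Kaplan-Meier.

```python
import math

# Hypothetical inputs (assumptions for illustration only).
LAMBDA_CTL = 0.05        # control hazard per month
LAMBDA_TRT_LATE = 0.025  # treatment hazard after the delay
DELAY = 12.0             # months before the treatment effect appears
TAU = 36.0               # RMST truncation horizon in months


def surv_control(t):
    """Exponential survival in the control arm."""
    return math.exp(-LAMBDA_CTL * t)


def surv_treatment(t):
    """Same hazard as control until DELAY, then a halved hazard (late-onset benefit)."""
    if t <= DELAY:
        return math.exp(-LAMBDA_CTL * t)
    return math.exp(-LAMBDA_CTL * DELAY - LAMBDA_TRT_LATE * (t - DELAY))


def rmst(surv, tau, steps=10_000):
    """Area under the survival function on [0, tau] by the trapezoid rule."""
    h = tau / steps
    total = 0.5 * (surv(0.0) + surv(tau))
    total += sum(surv(i * h) for i in range(1, steps))
    return total * h


rmst_diff = rmst(surv_treatment, TAU) - rmst(surv_control, TAU)
print(f"RMST difference at {TAU:.0f} months: {rmst_diff:.2f} months")
```

Under these assumptions the treatment arm gains a little over two months of event-free time by month 36, all of it accrued after the curves separate at month 12. The result depends on the horizon, so the SAP should pre-specify it.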
