Biostatistics

AstraZeneca tests the "subgroups cannot save a failed trial" rule

A 62% survival benefit in a prespecified kappa-isotype cohort of a failed Phase 3 is now the basis of AstraZeneca's filing strategy, and a live test of how much subgroup evidence the agencies will actually accept.

AstraZeneca missed the primary endpoint of the Phase 3 CARES program in AL amyloidosis and is filing anyway, on the strength of a prespecified subgroup: the roughly 20% of patients with a kappa-predominant light-chain isotype, a 72-patient cohort in which 31.3% of anselamimab recipients died versus 58.3% on placebo (Fierce Biotech; NCT04512235). The pitch echoes Prothena’s birtamimab playbook, in which VITAL failed overall, a subgroup (Mayo stage IV) looked striking, and a confirmatory subgroup-restricted trial (AFFIRM-AL) followed. AZ’s salvage cohort, however, is defined by light-chain isotype rather than disease stage, a biologically distinct rationale. What is unusual is AstraZeneca’s apparent willingness to skip the confirmatory step and ask regulators to accept the isotype subgroup as the pivotal result.
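
For readers who want to sanity-check the headline numbers, the short Python sketch below derives crude effect measures from the two reported death proportions. It uses only the percentages quoted above; the function name is illustrative. A crude risk ratio ignores follow-up time, so it should not be expected to reproduce the 62% figure, which presumably comes from a time-to-event analysis (an assumption, since the definition is not given here).

```python
def crude_effect_measures(p_treated: float, p_control: float) -> dict:
    """Crude effect measures from two event proportions (no time-to-event information)."""
    risk_difference = p_control - p_treated   # absolute reduction in risk of death
    risk_ratio = p_treated / p_control        # treated risk relative to control risk
    return {
        "risk_difference": risk_difference,
        "risk_ratio": risk_ratio,
        "relative_risk_reduction": 1.0 - risk_ratio,
        "number_needed_to_treat": 1.0 / risk_difference,
    }


# Death proportions as reported for the kappa-isotype cohort.
measures = crude_effect_measures(p_treated=0.313, p_control=0.583)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

On these inputs the absolute difference is 27 percentage points and the crude relative risk reduction is about 46%, a reminder that the size of the "benefit" depends on which effect measure is being quoted.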

The FDA Working Group on Subgroup Analyses has stated for years that subgroup analyses “could inflate type 1 error” and “cannot save failed trials,” a position consistent with EMA’s subgroup guideline. Prespecification matters, but it does not dissolve the multiplicity problem: the subgroup p-value comes from a trial whose alpha was already spent on the failed primary test, and no adjustment for that ordering has been described publicly. Rare disease and unmet need will buy some regulatory latitude. They will not buy an exemption from the inferential question of which population the effect estimate actually describes.
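
The multiplicity concern can be made concrete with a back-of-envelope calculation. The sketch below assumes k independent subgroup tests, each run at a nominal level alpha under a global null. That is a simplification: real subgroups overlap and correlate, and the number of subgroups examined in CARES is not stated here.

```python
def familywise_error(alpha: float, n_tests: int) -> float:
    """P(at least one false positive) across independent tests under the global null."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Illustrative subgroup counts; none of these is taken from the CARES analysis plan.
for k in (1, 5, 10, 20):
    fwer = familywise_error(0.05, k)
    print(f"{k:>2} subgroup tests at 0.05 -> family-wise error {fwer:.3f}")
```

Under these assumptions, ten unadjusted subgroup looks carry roughly a 40% chance of at least one nominally significant result when no subgroup truly benefits.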

The methods literature picked an awkward week to converge

The same week’s journals offer a parallel commentary the sponsor would probably prefer readers ignore. A new Statistics in Biopharmaceutical Research paper makes the case for Bayesian hierarchical models (BHMs) as the principled middle path for confirmatory subgroup analyses: partial pooling shrinks extreme estimates toward the overall mean while letting genuine heterogeneity surface. Applied honestly to CARES, a BHM would pull a 62% subgroup effect closer to the overall estimate that missed the primary endpoint and move its credible interval toward no effect. That is not a refutation of AstraZeneca’s biology; it is a calibration of how much the isotype estimate should be trusted on its own.
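
A minimal sketch of the shrinkage idea, assuming the simplest normal-normal setup with a known overall mean and a fixed between-subgroup standard deviation tau. A full BHM would estimate both jointly from all subgroups, so this is not the method of the cited paper. Every input is hypothetical and none is CARES data: the subgroup effect is written as log(0.38) on the assumption that the 62% figure is a hazard-ratio reduction, the standard error of 0.35 is invented, and the overall effect is set to exactly zero for illustration.

```python
import math


def partial_pool(estimate, se, overall_mean, tau):
    """Posterior mean and SD for one subgroup effect under a normal-normal model
    with a known overall mean and a fixed between-subgroup SD tau."""
    w = tau**2 / (tau**2 + se**2)      # weight placed on the subgroup's own data
    post_mean = w * estimate + (1 - w) * overall_mean
    post_sd = math.sqrt(w) * se        # equals sqrt(1 / (1/se**2 + 1/tau**2))
    return post_mean, post_sd


# Hypothetical inputs on the log hazard ratio scale (assumptions, not trial data).
subgroup_log_hr = math.log(0.38)
subgroup_se = 0.35
overall_log_hr = 0.0

for tau in (0.1, 0.3, 1.0):
    mean, sd = partial_pool(subgroup_log_hr, subgroup_se, overall_log_hr, tau)
    lower, upper = mean - 1.96 * sd, mean + 1.96 * sd
    print(f"tau={tau}: HR {math.exp(mean):.2f} "
          f"(95% interval {math.exp(lower):.2f} to {math.exp(upper):.2f})")
```

A smaller tau encodes a stronger prior belief that subgroups resemble one another and therefore produces heavier shrinkage, which is why the choice of tau (or its prior) carries so much of the argument.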

Reinforcing the skeptic frame, the first empirical audit of “personalized,” “individualized,” and “precision” RCTs, published in the Journal of Clinical Epidemiology, finds the label is applied to wildly heterogeneous designs with low transparency and high risk of bias. The terminology that sponsors lean on to justify subgroup-driven filings turns out to be, as a body of evidence, considerably weaker than its rhetorical weight suggests. Several frontier methods published in the same cluster (latent-class umbrella designs, calibrated spike-and-slab marker-stratified Phase II designs, and value-guided boosting for subgroup identification) share an honest premise the AZ filing implicitly avoids: subgroup claims are most defensible when the design was built for them from the start.

What the CARES decision will actually settle

The agencies’ response will set a useful precedent regardless of direction. If FDA or EMA accepts the kappa-isotype cohort as pivotal evidence on the strength of biological rationale, prespecification, and the rare-disease context, the implicit threshold for subgroup salvage drops, and sponsors with a single failed Phase 3 under the one-pivotal-trial policy will rationally calibrate toward larger, more granular prespecified subgroup grids. If the agencies require AFFIRM-AL-style confirmation, the Working Group’s stated position holds, and the operative rule becomes: prespecification buys you a hypothesis, not a label.

Biometrics teams should not wait for the verdict to act. Anyone writing an SAP with a credible heterogeneity story should be specifying the subgroup analysis as a Bayesian hierarchical or otherwise multiplicity-aware procedure now, with the prior on between-subgroup variance defended in writing rather than chosen at lock. Anyone running a program where a stratum-restricted confirmatory trial is plausibly the fallback should design the Phase 3 so that fallback is statistically coherent: adaptive enrichment with valid post-selection inference, not a conventional two-arm trial with retrospective hopes.
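
One simple example of a multiplicity-aware procedure is the fallback procedure, which splits alpha in advance between the overall population and a single prespecified subgroup. The sketch below is illustrative only: the 0.04/0.01 split and the p-values are hypothetical, and nothing here describes the CARES analysis plan.

```python
def fallback_test(p_overall, p_subgroup, alpha=0.05, alpha_overall=0.04):
    """Fallback procedure for two ordered hypotheses: overall first, then subgroup.

    The subgroup keeps a reserved share of alpha even if the overall test fails,
    and inherits the overall share if the overall test succeeds.
    """
    alpha_subgroup = alpha - alpha_overall
    reject_overall = p_overall <= alpha_overall
    # Alpha from a rejected overall hypothesis is passed on to the subgroup test.
    subgroup_level = alpha if reject_overall else alpha_subgroup
    reject_subgroup = p_subgroup <= subgroup_level
    return {
        "reject_overall": reject_overall,
        "subgroup_level": subgroup_level,
        "reject_subgroup": reject_subgroup,
    }


# Hypothetical p-values: the overall test fails, the subgroup is tested at its reserved level.
print(fallback_test(p_overall=0.20, p_subgroup=0.004))
print(fallback_test(p_overall=0.20, p_subgroup=0.03))
```

The design point is that the subgroup's evidentiary threshold is fixed before unblinding, so a subgroup result after a failed primary test is judged against a stricter, prespecified level rather than a nominal 0.05.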

Protocol read: Prespecification is necessary, not sufficient; the CARES filing is the cleanest test in years of whether regulators still mean the second half of that sentence. Plan your subgroup strategy on the assumption they do.

What to do now:

Prespecify confirmatory subgroup analyses as BHMs or equivalent partial-pooling procedures, with the between-subgroup variance prior justified in the SAP, not negotiated at unblinding.
For programs where stratum-restricted salvage is a realistic contingency, build adaptive enrichment with valid post-selection inference into the Phase 3 from the start.
Watch the CARES regulatory interactions and any AdCom: the operative precedent for subgroup-based filing under the one-pivotal-trial policy is being set in this cycle.
