Time-to-event data is one of the most common data types in clinical trials. Traditionally, the log-rank test is used to compare the survival curves of two treatment groups; the Kaplan-Meier survival plot is used to illustrate the totality of time-to-event kinetics, including the estimated median survival time; and the Cox proportional hazards model is employed to provide the estimated relative effect (i.e., the hazard ratio) between treatment arms. The performance of these analyses largely depends on the proportional hazards (PH) assumption: that the hazard ratio is constant over time. In other words, a single hazard ratio summarizes the relative treatment effect over the entire follow-up period.
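For readers who prefer notation, the PH assumption can be written out as below. The symbols are introduced here for illustration and are not from the original post: h_0 and h_1 are the hazard functions in the control and experimental arms, S_0 and S_1 the corresponding survival functions, and Z a binary treatment indicator.

```latex
% PH assumption: the hazards in the two arms differ by a constant factor \theta
h_1(t) = \theta \, h_0(t) \quad \text{for all } t \ge 0
\qquad\Longleftrightarrow\qquad
S_1(t) = S_0(t)^{\theta}

% Cox model with a binary treatment indicator Z (1 = experimental, 0 = control)
h(t \mid Z) = h_0(t) \exp(\beta Z), \qquad
\mathrm{HR} = \frac{h(t \mid Z = 1)}{h(t \mid Z = 0)} = e^{\beta}
```

When the ratio h_1(t)/h_0(t) changes with t (for example with a delayed treatment effect or crossing hazards), no single value of θ describes the treatment effect, which is the situation the rest of this post is concerned with.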
Before time-to-event data are analyzed, statisticians typically check the proportional hazards assumption. Various methods can be used for this check; see a previous post, "Visual Inspection and Statistical Tests for Proportional Hazard Assumption".
Recently we have seen more examples of time-to-event data that do not follow the proportional hazards assumption, particularly in immuno-oncology clinical trials.
It is not the end of the world if the proportional hazards assumption is violated: various approaches have been proposed to handle time-to-event data with non-proportional hazards.
In practice, it is common for the statistical analysis plan to prespecify the log-rank test for the p-value and the Cox proportional hazards regression model for the hazard ratio, its 95% confidence interval, and its own p-value. I call this "splitting the p-value and the estimate of the treatment difference". Two different p-values are therefore calculated: one from the log-rank test and one from the Cox regression. If the proportional hazards assumption is met, it is better to use the p-value from the Cox regression, since the estimates and the p-value then all come from the same model. However, when the proportional hazards assumption is violated, the Cox proportional hazards model may no longer be the optimal approach for determining the treatment effect, and the Kaplan-Meier estimate of median survival may not be the most valid measure for summarizing the results.
A website post, "Testing equality of two survival distributions: log-rank/Cox versus RMST", stated: "One thing to note is that the log-rank test does not assume proportional hazards per se. It is a valid test of the null hypothesis of equality of the survival functions without any assumptions (save assumptions regarding censoring). It is however most powerful for detecting alternative hypotheses in which the hazards are proportional." It is true that the log-rank test does not depend on the proportional hazards assumption. It remains a valid test of the null hypothesis of equality of the survival functions, even though it may not be the most powerful test under non-proportional hazards.
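To see why the log-rank test needs no PH assumption, it helps to look at how the statistic is built: at each event time it only compares the observed number of events in one arm with the number expected under equal survival. The following is a minimal from-scratch sketch in Python, not a validated implementation; the function name `logrank_test` and the toy data are made up for illustration, and real analyses should use established software.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi-square statistic, p-value).

    times_*  : follow-up times
    events_* : 1 if the event was observed, 0 if censored
    """
    # Pool both arms, tagging each subject with a group label (0 = A, 1 = B).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for ti, _, _ in data if ti >= t)               # at risk, both arms
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, arm A
        d = sum(1 for ti, e, _ in data if ti == t and e == 1)    # events, both arms
        d_a = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n  # expected events in A if survival is equal
        if n > 1:                  # hypergeometric variance term
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("variance is zero; both arms need subjects at risk")

    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data (hypothetical): arm A fails early, arm B fails late.
chi2, p = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```

Nothing in the calculation refers to a hazard ratio. Every event time receives the same weight, which is why the test is most powerful when the hazards are proportional and can lose power when the effect is delayed or the curves cross.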
- Roychoudhury et al (2021) Robust Design and Analysis of Clinical Trials With Nonproportional Hazards: A Straw Man Guidance From a Cross-Pharma Working Group. Statistics in Biopharmaceutical Research
- Lin et al (2020) Alternative Analysis Methods for Time to Event Endpoints Under Nonproportional Hazards: A Comparative Analysis. Statistics in Biopharmaceutical Research
- Anderson & Roychoudhury (2018) Design and Analysis of Clinical Trials in the Presence of Non-Proportional Hazards. JSM 2018
- Roychoudhury & Anderson (2020) Robust Design and Analysis of Clinical Trials with Non-proportional Hazards: Methodology and Implementation with R. RISW 2020
- RMST (restricted mean survival time): according to a presentation by Lawrence et al from the FDA, the idea of restricted mean survival time (RMST) goes back to Irwin (1949) and is further implemented in survival analysis by Uno et al. (2014). RMST is defined as the area under the survival curve up to t*, which should be pre-specified for a randomized trial. It may be loosely described as the event-free expectancy over the restricted period between randomization and a defined, clinically relevant time horizon, t*. RMST analyses are now built into SAS through Proc Lifetest and Proc RMSTREG; see the paper by Guo and Liang (2019), "Analyzing Restricted Mean Survival Time Using SAS/STAT®".
- Piecewise exponential regression allows separate early and late treatment effects to be estimated and compared. It is especially useful when the non-proportional hazards pattern is a crossover. Piecewise exponential regression can be fitted with SAS Proc MCMC and the R package pch.
- Estimation via the average hazard ratio (AHR) method of Schemper et al (2009) and the average regression effects (ARE) method of Xu and O'Quigley (2000): both can be implemented using the coxphw package in R, which is described as:
This package implements weighted estimation in Cox regression as proposed by Schemper, Wakounig and Heinze (Statistics in Medicine, 2009, doi: 10.1002/sim.3623). Weighted Cox regression provides unbiased average hazard ratio estimates also in case of non-proportional hazards. The package provides options to estimate time-dependent effects conveniently by including interactions of covariates with arbitrary functions of time, with or without making use of the weighting option. For more details we refer to Dunkler, Ploner, Schemper and Heinze (Journal of Statistical Software, 2018, doi: 10.18637/jss.v084.i02).
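Of the methods listed above, RMST is the easiest to make concrete, because it is simply the area under the Kaplan-Meier curve from 0 to t*. The sketch below is a minimal pure-Python illustration under stated assumptions: the function names `kaplan_meier` and `rmst` and the toy data are hypothetical, no variance or confidence interval is computed, and it is not a substitute for Proc Lifetest, Proc RMSTREG, or an equivalent validated tool.

```python
def kaplan_meier(times, events):
    """Return [(event_time, S(event_time))] for the Kaplan-Meier estimator.

    times  : follow-up times
    events : 1 if the event was observed, 0 if censored
    """
    surv, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        surv *= 1 - deaths / at_risk  # product-limit update
        curve.append((t, surv))
    return curve


def rmst(times, events, t_star):
    """Area under the Kaplan-Meier step function from 0 to t_star.

    If t_star lies beyond the last event time, the curve is carried
    forward flat at its last value (an assumption of this sketch).
    """
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= t_star:
            break
        area += prev_s * (t - prev_t)  # rectangle up to the next drop
        prev_t, prev_s = t, s
    return area + prev_s * (t_star - prev_t)  # final piece up to t_star


# Toy data (hypothetical), with t* = 7 pre-specified.
control_times, control_events = [1, 3, 4, 6, 8], [1, 1, 1, 0, 1]
treated_times, treated_events = [2, 4, 6, 8, 9], [1, 1, 0, 1, 0]
t_star = 7

diff = (rmst(treated_times, treated_events, t_star)
        - rmst(control_times, control_events, t_star))
print(f"RMST difference (treated - control) up to t* = {t_star}: {diff:.2f}")
```

The difference in RMST between arms is interpreted on the time scale (for example, months of event-free time gained up to t*) and does not require the hazards to be proportional, which is why t* has to be chosen and justified in advance.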
A presentation by Kaur et al, "Analytical Methods Under Non-Proportional Hazards: A Dilemma of Choice", also describes a range of such methods.
Earlier this year, Mehrotra and West published a paper describing their proposed method (5-STAR) for handling heterogeneity in the patient population and potential non-proportional hazards ("Survival Analysis Using a 5-Step Stratified Testing and Amalgamation Routine (5-STAR) in Randomized Clinical Trials"):
"The power of the ubiquitous logrank test for a between-treatment comparison of survival times in randomized clinical trials can be notably less than desired if the treatment hazard functions are non-proportional, and the accompanying hazard ratio estimate from a Cox proportional hazards model can be hard to interpret. Increasingly popular approaches to guard against the statistical adverse effects of non-proportional hazards include the MaxCombo test (based on a versatile combination of weighted logrank statistics) and a test based on a between-treatment comparison of restricted mean survival time (RMST). Unfortunately, neither the logrank test nor the latter two approaches are designed to leverage what we refer to as structured patient heterogeneity in clinical trial populations, and this can contribute to suboptimal power for detecting a between-treatment difference in the distribution of survival times. Stratified versions of the logrank test and the corresponding Cox proportional hazards model based on pre-specified stratification factors represent steps in the right direction. However, they carry unnecessary risks associated with both a potential suboptimal choice of stratification factors and with potentially implausible dual assumptions of proportional hazards within each stratum and a constant hazard ratio across strata.
We have developed and described a novel alternative to the aforementioned current approaches for survival analysis in randomized clinical trials. Our approach envisions the overall patient population as being a finite mixture of subpopulations (risk strata), with higher to lower ordered risk strata comprised of patients having shorter to longer expected survival regardless of treatment assignment. Patients within a given risk stratum are deemed prognostically homogeneous in that they have in common certain pre-treatment characteristics that jointly strongly associate with survival time. Given this conceptualization and motivated by a reasonable expectation that detection of a true treatment difference should get easier as the patient population gets prognostically more homogeneous, our proposed method follows naturally. Starting with a pre-specified set of baseline covariates (Step 1), elastic net Cox regression (Step 2) and a subsequent conditional inference tree algorithm (Step 3) are used to segment the trial patients into ordered risk strata; importantly, both steps are blinded to patient-level treatment assignment. After unblinding, a treatment comparison is done within each formed risk stratum (Step 4) and stratum-level results are combined for overall estimation and inference (Step 5)."