On Biostatistics and Clinical Trials: Handling of Missing Data: Comparison of MMRM (mixed model repeated measures) versus MI (multiple imputation)

The longitudinal study has become one of the most commonly adopted designs in clinical trials. Because the outcome is measured at multiple visits, it is usually the case that for some subjects the outcome measure will not be available at some visits (for example, after subjects drop out of the study or are lost to follow-up). This is where the missing data issue arises. If the outcome measure is a continuous variable, missing data can be handled implicitly through a mixed model for repeated measures (MMRM) or explicitly through multiple imputation (MI).

Both MMRM and MI are model-based approaches that rely on the missing at random (MAR) assumption, and both are suggested by EMA's Guideline on Missing Data in Confirmatory Clinical Trials and by the US National Research Council report The Prevention and Treatment of Missing Data in Clinical Trials. The US FDA has not issued its own guidance on handling missing data in clinical trials, but generally follows the recommendations of the National Research Council.
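For reference, the MAR assumption can be written formally next to the two other standard missingness mechanisms. The notation is an assumption made here for illustration and is not taken from either guideline: Y_obs and Y_mis denote the observed and missing parts of a subject's outcome vector, X the covariates, and R the vector of missingness indicators.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, X) = P(R \mid X) \\
\text{MAR:}\quad  & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, X) = P(R \mid Y_{\text{obs}}, X) \\
\text{MNAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, X) \text{ depends on } Y_{\text{mis}}
\end{align*}
```

Under MAR, missingness may depend on what was observed (for example, earlier visits and baseline) but not on the unobserved values themselves. This is what lets a likelihood-based analysis of the observed data, or an imputation model built from the observed data, remain valid.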

Between MMRM and MI, which should be the primary method for handling missing data? For a long time, MMRM seems to have been the preferred method in the US for handling missing data and analyzing longitudinal data with continuous outcome measures. MI methods are generally used as sensitivity analyses to check the robustness of the primary analysis to departures from the MAR assumption. This can be seen in the article by Dr. Siddiqui of the FDA, "MMRM versus MI in Dealing with Missing Data - a Comparison Based on 25 NDA data sets", and in many NDA / BLA reviews (listed below).

FDA Statistical Review for NDA 210655 in the indication of Schizophrenia:

"The primary analysis was conducted on the change from baseline in the total PANSS score at Day 57 (primary time point) based on the ITT population. A mixed-effects model for repeated measures (MMRM) was used with treatment, visit, interaction of treatment and visit as fixed effects and the baseline total PANSS score as a covariate. Data from Days 15, 29, 43, and 57 were used. The unstructured covariance matrix was be used to model the within-subject variance-covariance errors."

"In addition to the model-based missing data approach of the MMRM model, the primary efficacy analysis was also analyzed using a pattern mixture model (PMM) and a multiple imputation approach as sensitivity analyses. "
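The MMRM described in this review can be written out as an equation. This is only a sketch of the model as quoted; the symbols are assumptions introduced here: Y_ij is the change from baseline in total PANSS score for subject i at visit j (j = 1, ..., 4 for Days 15, 29, 43, and 57), g(i) is the treatment group of subject i, and base_i is the baseline total PANSS score.

```latex
\begin{align*}
Y_{ij} &= \beta_0 + \beta_1\,\text{base}_i + \tau_{g(i)} + \gamma_j
          + (\tau\gamma)_{g(i)j} + \varepsilon_{ij}, \\
\boldsymbol{\varepsilon}_i &= (\varepsilon_{i1}, \dots, \varepsilon_{i4})^{\top}
          \sim N(\mathbf{0}, \boldsymbol{\Sigma}),
          \qquad \boldsymbol{\Sigma}\ \text{unstructured } (4 \times 4).
\end{align*}
```

With four visits, the unstructured matrix has 10 free parameters (4 variances and 6 covariances). A subject with missing visits contributes only the observed components of the outcome vector to the likelihood, which is how the MMRM handles missing data implicitly under MAR. The primary comparison is the treatment contrast at the last visit (Day 57).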

FDA BLA 761037 Kevzara (sarilumab) in Treatment of rheumatoid arthritis

"The continuous HAQ-DI change from baseline at Week 16 was analyzed with a mixed model for repeated measures (MMRM). The repeated-measures analysis was based on the restricted maximum likelihood method assuming an unstructured covariance structure to model the within-subject errors. The model, including treatment, region, prior biologic use, visit (all visits from week 2 to week 16), and treatment-by-visit interaction as fixed effects and baseline as a covariate, was used to test the difference between each active treatment group versus placebo in the change from baseline in HAQ-DI at Week 16. The data collected after treatment discontinuation or rescue were set to missing. Therefore, the MMRM analysis assumed a missing-at-random (MAR) mechanism for missing data due to dropout and post-rescue data."

FDA NDA 203313/203314 S-2/S-3, Tresiba; Ryzodeg 70/30, Glycemic Control in Patients with Diabetes

The applicant used a mixed effect model for repeated measure (MMRM) to assess the efficacy of IDegAsp compared with IDet. The MMRM model included treatment, sex, region, age group and visits as factors and baseline as covariate, and interactions between visits and all factors and covariate. An unstructured covariance matrix was utilized for model fitting.

Multiple imputation was performed as a sensitivity analysis.

sNDA for Merck's Dulera in the treatment of asthma (2019)

"Missing Data Handling and Sensitivity Analyses: The primary analysis incorporated a control-based multiple imputation of missing data. Missing data for subjects who discontinued treatment early were estimated using the MF group; that is, the change from baseline AM post-dose ppFEV1 in patients who discontinued treatment and missed study visits was assumed to be similar to the change from baseline in patients who continued study visits through Week 12 in the MF treatment group. The dataset was first multiply imputed to have monotone missing patterns, then for each visit, a regression method was used to impute for missing data on both study drug arm and the control arm based on trend from the control arm. After applying the control-based multiple imputation, the cLDA analysis was performed. MF/F 100/10 mcg BID was considered superior to MF 100 mcg BID with a p-value less than 0.05."
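The idea behind control-based imputation can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the applicant's method: it uses a single post-baseline visit instead of a sequential visit-by-visit regression, regresses change on baseline only, and propagates parameter uncertainty with a bootstrap of control-arm completers in place of Bayesian posterior draws. All names (`fit_line`, `control_based_impute`, the `trial` records) and all simulated numbers are hypothetical.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, residual SD)."""
    n, x_bar, y_bar = len(x), mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if n < 3 or sxx == 0:
        raise ValueError("need at least 3 donors with varying baseline")
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, (rss / (n - 2)) ** 0.5


def control_based_impute(records, m, seed, control="control"):
    """Return m completed copies of records; missing 'chg' values in BOTH
    arms are imputed from a model fitted to control-arm completers only."""
    rng = random.Random(seed)
    donors = [r for r in records if r["arm"] == control and r["chg"] is not None]
    completed = []
    for _ in range(m):
        # Bootstrap the donors so each imputation uses different parameters.
        boot = [rng.choice(donors) for _ in donors]
        a, b, sd = fit_line([r["base"] for r in boot], [r["chg"] for r in boot])
        filled = []
        for r in records:
            chg = r["chg"]
            if chg is None:
                chg = a + b * r["base"] + rng.gauss(0, sd)
            filled.append({**r, "chg": chg})
        completed.append(filled)
    return completed


# Hypothetical toy trial: 80 subjects, alternating arms, simulated outcomes.
sim = random.Random(2020)
trial = []
for i in range(80):
    arm = "active" if i % 2 else "control"
    base = sim.gauss(70, 8)
    chg = 2.0 + 0.05 * (base - 70) + (3.0 if arm == "active" else 0.0) + sim.gauss(0, 4)
    # Assumed dropout pattern: every 5th subject has no final value.
    trial.append({"arm": arm, "base": base, "chg": None if i % 5 == 0 else chg})

imputed = control_based_impute(trial, m=20, seed=1)

# Treatment difference (active minus control) in each completed data set.
diffs = []
for ds in imputed:
    act = mean(r["chg"] for r in ds if r["arm"] == "active")
    ctl = mean(r["chg"] for r in ds if r["arm"] == "control")
    diffs.append(act - ctl)
print(f"mean treatment difference over {len(imputed)} imputations: {mean(diffs):.2f}")
```

Because dropouts in the active arm are imputed as if they had behaved like control subjects, the estimated treatment difference is pulled towards zero relative to an MAR analysis. That is the sense in which this kind of analysis is conservative. In a real analysis, each completed data set would be analyzed with the prespecified model and the results pooled for inference.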

EMA seems to have a different opinion about handling missing data with MMRM or MI. On several occasions, we have heard that EMA prefers the MI approach, especially reference-based multiple imputation, and is moving towards making reference-based multiple imputation the new standard approach to missing data.

Here is a table summarizing some comparisons between the MMRM and MI in handling the missing data.

	MMRM	MI
Missing data mechanism	MAR (missing at random)	MAR (missing at random)
Missing data imputation	Individual missing values are not imputed, but missing data are implicitly imputed through the model	Individual missing values are explicitly imputed
# of steps for calculations	One step	At least three steps: (1) an imputation model creates multiple data sets with the missing values filled in; (2) an analysis model is fitted to each imputed data set; (3) Rubin's rules combine the results for inference
Analysis model	Mixed model fitted by a likelihood-based method (maximum likelihood or restricted maximum likelihood)	Analysis of covariance (ANCOVA), or a mixed model fitted by a likelihood-based method
Data points used in analyses	Uses all observed data points from all visits	Usually, with ANCOVA, only the data points (observed or imputed) for the visit of interest are used
SAS procedure(s)	PROC MIXED	Imputation model: PROC MI; analysis model: PROC MIXED, PROC GLM, PROC GENMOD, …; Rubin's rules: PROC MIANALYZE
Results	The two approaches will be approximately equivalent, provided the variables used in the imputation model are the same as those included in the analysis model, and conditionals are accommodated by a single joint model. In such settings, MI essentially provides an approximation to the observed-data likelihood analysis. With an infinite number of imputations, the two approaches would be equivalent. In practice, the level of equivalence depends on the number of imputations, because of the Monte Carlo (simulation) sampling variability of the imputation process, and is therefore stronger for a larger number of imputations.
Auxiliary variables	Cannot be used	Can be used in the imputation model to improve the accuracy of the missing data prediction
Information observed post-randomization	Cannot be included in the MMRM model	Can be included in the imputation model to improve the accuracy of the missing data prediction, without being included in the analysis model (the MI approach allows the covariates in the imputation model to differ from those in the analysis model)
Justification of MAR assumption	Not available through the MMRM model	Sensitivity to the MAR assumption can be assessed through the tipping point approach or delta-based imputation
Handling the MNAR (missing not at random)	Not directly available through MMRM	Can be performed through PMM (pattern-mixture model), reference-based or control-based multiple imputation
For studies with only one post-baseline measure	Not appropriate	Appropriate to use MI to impute the missing data and then run analysis of covariance model as the analysis model
For outcome measures that are not continuous variables	Like MMRM, there are statistical approaches that handle missing data without employing explicit imputation. As mentioned in the EMA guideline: “For categorical responses and count data, the so-called marginal (e.g. generalized estimating equations (GEE)) and random-effects (e.g. generalized linear mixed models (GLMM)) approaches are in use. Likelihood-based methods (MMRM and GLMM) and some extended GEE (i.e. weighted GEE) models are applicable under MCAR and MAR assumptions.”	The MI approach can be easily applied to categorical responses or count data with missing values. The analysis model may need to be fitted with PROC LOGISTIC, PROC GLIMMIX, PROC NLMIXED, or PROC GENMOD.
Preferred by regulatory agencies	US FDA, with multiple imputation approaches (for example, reference-based MI, PMM, tipping point) as sensitivity analyses	EMA
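The combining step listed for MI in the table (Rubin's rules, the step handled by PROC MIANALYZE in SAS) is simple enough to write out. The following is a minimal Python sketch using only the standard library. The function name `rubin_combine` and the five example estimates are hypothetical, and the degrees of freedom use the classical large-sample formula with no small-sample adjustment.

```python
import math
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Pool m point estimates and their squared standard errors
    (one pair per imputed data set) using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least 2 imputations with matching variances")
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    if between == 0:
        df = math.inf                  # no imputation-to-imputation variability
    else:
        r = (1 + 1 / m) * between / within   # relative increase in variance
        df = (m - 1) * (1 + 1 / r) ** 2
    return {"estimate": q_bar, "within": within, "between": between,
            "total_var": total, "se": math.sqrt(total), "df": df}


# Hypothetical treatment differences and squared SEs from 5 imputed data sets.
est = [-3.1, -2.8, -3.4, -2.9, -3.3]
var = [1.00, 1.10, 0.95, 1.05, 0.90]
pooled = rubin_combine(est, var)
print(f"estimate = {pooled['estimate']:.3f}, SE = {pooled['se']:.3f}, df = {pooled['df']:.1f}")
```

The between-imputation term is what separates MI from single imputation: it adds the uncertainty due to the missing values to the pooled standard error. As the number of imputations grows, the factor (1 + 1/m) approaches 1 and the Monte Carlo noise in the pooled result shrinks, which is the sense in which MI approaches the likelihood-based answer in the "Results" row.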

REFERENCES:

Cro et al. (2020) Sensitivity analysis for clinical trials with missing continuous outcome data using controlled multiple imputation: A practical guide
Kenward (2013) The handling of missing data in clinical trials
Little et al. (NEJM, 2012) The Prevention and Treatment of Missing Data in Clinical Trials
Ware et al. (NEJM, 2012) Missing Data
Permutt (Statistics in Medicine, 2015) Sensitivity analysis for missing data in regulatory submissions
LaVange and Permutt (Statistics in Medicine, 2015) A regulatory perspective on missing data in the aftermath of the NRC report
Liu and Pang (Statistics in Biopharmaceutical Research, 2017) Control-Based Imputation and Delta-Adjustment Stress Test for Missing Data Analysis in Longitudinal Clinical Trials
Tang (Statistics in Biopharmaceutical Research, 2017) An Efficient Multiple Imputation Algorithm for Control-Based and Delta-Adjusted Pattern Mixture Models using SAS
Berglund and Heeringa (2014) Multiple Imputation of Missing Data Using SAS. Chapter 7 gives examples of missing data imputation for dichotomous outcome variables and count data.

2 comments:

Anonymous said...: Very informative. Thanks for sharing!; 8:24 PM
Joakim Englund said...: Thank you Dr Deng for a very informative blog post (as always)! It very well mimics my own understanding of the topic as well.

However, I do wonder if it is reasonable to (as FDA might prefer) first do MMRM as the primary analysis and then MI with MMRM as the analysis model (now with imputed values) as an extra sensitivity analysis. The table you provide seems to indicate that this is an option (if I read it correctly). But would it be more logical/informative to compare a primary MMRM analysis with an MI-ANCOVA (for a specific timepoint)? What is your opinion on this?; 4:24 PM


Sunday, November 29, 2020
