Missing data is one of the classical issues in clinical trials and biostatistics. Since the National Research Council issued its report on missing data in 2010, the emphasis has shifted toward preventing missing data. Even with this emphasis on prevention, missing data remains unavoidable in almost every clinical trial. When a trial with missing data is analyzed, various sensitivity analyses are usually needed to show how robust the study result is to the way the missing data is handled. How missing data should be handled depends on the assumptions made about the missingness mechanism.
Missing Data Assumptions and the Corresponding Imputation Methods
No assumption
Methods:
LOCF (last observation carried forward)
BOCF (baseline observation carried forward)
WOCF (worst observation carried forward)
Imputation based on logical rules

MCAR: Missing Completely at Random
Definition: The missingness is independent of both the observed and the unobserved data. The probability of missingness is the same for all units.
Ignorability: Ignorable
Methods:
CC (complete-case analysis), also called listwise deletion
Pairwise deletion
Available-case analysis
Single-value imputation, for example mean replacement, regression prediction (conditional mean imputation), or regression prediction plus error (stochastic regression imputation)
Note: Under MCAR, throwing out cases with missing data does not bias your inferences. However, there are many drawbacks, notably the smaller sample size and the resulting loss of power.

MAR: Missing at Random (ignorability assumption)
Definition: Conditional on the observed data, the missingness is independent of the unobserved measurements. The probability that a variable is missing depends only on available information.
Ignorability: Ignorable
Methods:
Maximum likelihood using the EM algorithm, FIML (full information maximum likelihood)
MMRM (mixed model for repeated measures), REML (restricted maximum likelihood)
Multiple imputation
Note: Two assumptions are made: the joint distribution of the data is multivariate normal, and the missing data mechanism is ignorable.
Note: Under MAR, it is acceptable to exclude the missing cases, as long as the regression controls for all the variables that affect the probability of missingness.

MNAR: Missing Not at Random
Definition: Neither MCAR nor MAR. The missingness depends on unobserved predictors (information that has not been recorded and that also predicts the missing values) or on the missing value itself.
Ignorability: Nonignorable
Methods:
PMM (pattern-mixture modeling)
Jump to Reference
Last Mean Carried Forward
Copy Differences in Reference
Copy Reference
Tipping point approach
Selection model (Heckman)
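
To make the three mechanisms concrete, the following is a minimal Python sketch, not taken from any of the materials referenced in this post. It simulates an always-observed covariate x and an outcome y, then deletes y under one hypothetical MCAR rule, one MAR rule, and one MNAR rule. The function names (ols, simulate), the missingness probabilities, and the data-generating model y = x + noise are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def ols(xs, ys):
    """Return (intercept, slope) of a simple least-squares fit of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def simulate(n=20000, seed=0):
    """Compare complete-case results under hypothetical MCAR, MAR and MNAR rules."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)        # covariate, always observed
        y = x + rng.gauss(0, 1)    # outcome, subject to missingness
        u = rng.random()           # uniform draw that decides missingness
        rows.append((x, y, u))

    # Probability that y is missing under each (assumed) mechanism.
    mechanisms = {
        "MCAR": lambda x, y: 0.3,                    # same for every unit
        "MAR": lambda x, y: 0.6 if x > 0 else 0.1,   # depends on observed x only
        "MNAR": lambda x, y: 0.6 if y > 0 else 0.1,  # depends on y itself
    }

    result = {"full_mean": mean(y for _, y, _ in rows)}
    for name, p_missing in mechanisms.items():
        # Complete-case analysis: keep only the rows where y is observed.
        kept = [(x, y) for x, y, u in rows if u >= p_missing(x, y)]
        xs, ys = zip(*kept)
        intercept, slope = ols(xs, ys)
        result[name] = {"cc_mean": mean(ys), "intercept": intercept, "slope": slope}
    return result


if __name__ == "__main__":
    for key, value in simulate().items():
        print(key, value)
```

In this setup the complete-case mean of y should stay close to the full-data mean under MCAR and drift away from it under MAR and MNAR. Under MAR, the regression of y on x should still recover a slope near 1, because x is the only variable driving the missingness and the regression conditions on it.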

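The carry-forward rules listed under "No assumption" are simple enough to sketch in a few lines of Python. This is only an illustration: the function names are hypothetical, each subject is assumed to be a list of visit values with the baseline first and None marking a missing visit, and WOCF is assumed here to mean "carry forward the worst value observed so far", which is one of several conventions in use.

```python
from typing import List, Optional

Visits = List[Optional[float]]  # index 0 is baseline; None marks a missing visit


def locf(values: Visits) -> Visits:
    """Last observation carried forward."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)  # stays None if nothing has been observed yet
    return filled


def bocf(values: Visits) -> Visits:
    """Baseline observation carried forward: missing visits get the baseline value."""
    baseline = values[0]
    return [baseline if v is None else v for v in values]


def wocf(values: Visits, higher_is_worse: bool = True) -> Visits:
    """Worst observation carried forward (assumed: worst value observed so far)."""
    pick = max if higher_is_worse else min
    filled, worst = [], None
    for v in values:
        if v is not None:
            worst = v if worst is None else pick(worst, v)
            filled.append(v)
        else:
            filled.append(worst)
    return filled


if __name__ == "__main__":
    # Hypothetical pain scores (higher is worse), last two visits missing.
    scores = [6.0, 7.0, 4.0, None, None]
    print(locf(scores))  # [6.0, 7.0, 4.0, 4.0, 4.0]
    print(bocf(scores))  # [6.0, 7.0, 4.0, 6.0, 6.0]
    print(wocf(scores))  # [6.0, 7.0, 4.0, 7.0, 7.0]
```

The three rules give different answers for the same subject, which is one reason they are typically run side by side as sensitivity analyses rather than treated as interchangeable.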
Many web resources discuss missing data and how to handle it; some of the recent materials are listed below. For SAS users, the MI and MIANALYZE procedures are handy for performing multiple imputation and pattern-mixture modeling:
Missing Data Issues in Regulatory Clinical Trials, by Lisa LaVange, FDA, 2015
Guideline on Missing Data in Confirmatory Clinical Trials, EMA, 2011
EU regulatory guidance on multiplicity issues and missing data
Recent Advances in Missing Data Methods: Imputation and Weighting, by Elizabeth Stuart, YouTube video, 2014
Treatment of Missing Data in Randomized Clinical Trials
Pattern Mixture Models for Missing Data, by Mike Kenward, 2012
Missing Data Techniques with SAS, by UCLA IDRE Statistical Consulting Group
SAS/STAT® 14.1 User’s Guide: The MI Procedure (HTML), 2015
SAS/STAT® 14.1 User’s Guide: The MIANALYZE Procedure (HTML), 2015
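
For readers who want to see what the pooling step of multiple imputation does, here is a minimal Python sketch of Rubin's rules, the combining rules that the MIANALYZE procedure applies to the per-imputation results. It is not SAS code and not a substitute for the procedure; the function name pool_rubin and the input numbers are hypothetical, and the inputs are assumed to be one point estimate and one squared standard error per imputed data set.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m per-imputation estimates and variances with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)        # pooled point estimate
    w = mean(variances)            # within-imputation variance
    b = variance(estimates)        # between-imputation variance (m - 1 denominator)
    t = w + (1 + 1 / m) * b        # total variance

    if b > 0:
        r = (1 + 1 / m) * b / w    # relative increase in variance due to missingness
        df = (m - 1) * (1 + 1 / r) ** 2
    else:
        df = float("inf")          # no between-imputation variability

    return {"estimate": q_bar, "std_err": sqrt(t), "total_var": t, "df": df}


if __name__ == "__main__":
    # Hypothetical treatment-effect estimates from three imputed data sets.
    pooled = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(pooled)
```

The between-imputation term is what separates multiple imputation from single-value imputation: it feeds the uncertainty about the imputed values into the pooled standard error.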