Sunday, November 29, 2020

Handling of Missing Data: Comparison of MMRM (mixed model repeated measures) versus MI (multiple imputation)

The longitudinal study has become one of the most commonly adopted designs in clinical trials. Because the outcome is measured at multiple visits, some subjects will usually have missing outcome measures at some visits (for example, after they drop out of the study or are lost to follow-up); this is where the missing data issue arises. If the outcome measure is a continuous variable, missing data can be handled implicitly through a mixed model for repeated measures (MMRM) or explicitly through multiple imputation (MI).

Both MMRM and MI are model-based approaches that rely on the missing at random (MAR) assumption, and both are suggested by the EMA's Guideline on Missing Data in Confirmatory Clinical Trials and by the US National Research Council report The Prevention and Treatment of Missing Data in Clinical Trials. The US FDA has not issued its own guidance on handling missing data in clinical trials, but it generally follows the recommendations of the National Research Council.

In terms of MMRM and MI, which one should be the primary method for handling missing data? For a long time, MMRM seems to have been the preferred method in the US for handling missing data when analyzing longitudinal data with continuous outcome measures. MI methods are generally used as sensitivity analyses to check the robustness of the primary analysis against deviations from the MAR assumption. This can be seen in the article by Dr. Siddiqui of the FDA, "MMRM versus MI in Dealing with Missing Data - a Comparison Based on 25 NDA data sets", and in many NDA / BLA reviews (listed below).

FDA Statistical Review for NDA 210655 in the indication of Schizophrenia:
"The primary analysis was conducted on the change from baseline in the total PANSS score at Day 57 (primary time point) based on the ITT population. A mixed-effects model for repeated measures (MMRM) was used with treatment, visit, interaction of treatment and visit as fixed effects and the baseline total PANSS score as a covariate. Data from Days 15, 29, 43, and 57 were used. The unstructured covariance matrix was used to model the within-subject variance-covariance errors."

"In addition to the model-based missing data approach of the MMRM model, the primary efficacy analysis was also analyzed using a pattern mixture model (PMM) and a multiple imputation approach as sensitivity analyses. "

FDA BLA 761037 Kevzara (sarilumab) in Treatment of rheumatoid arthritis:
"The continuous HAQ-DI change from baseline at Week 16 was analyzed with a mixed model for repeated measures (MMRM). The repeated-measures analysis was based on the restricted maximum likelihood method assuming an unstructured covariance structure to model the within-subject errors. The model, including treatment, region, prior biologic use, visit (all visits from week 2 to week 16), and treatment-by-visit interaction as fixed effects and baseline as a covariate, was used to test the difference between each active treatment group versus placebo in the change from baseline in HAQ-DI at Week 16. The data collected after treatment discontinuation or rescue were set to missing. Therefore, the MMRM analysis assumed a missing-at-random (MAR) mechanism for missing data due to dropout and post-rescue data."
FDA NDA 203313/203314 S-2/S-3, Tresiba; Ryzodeg 70/30, Glycemic Control in Patients with Diabetes:
The applicant used a mixed effect model for repeated measure (MMRM) to assess the efficacy of IDegAsp compared with IDet. The MMRM model included treatment, sex, region, age group and visits as factors and baseline as covariate, and interactions between visits and all factors and covariate. An unstructured covariance matrix was utilized for model fitting.

Multiple imputation was performed as a sensitivity analysis.

sNDA for Merck's Dulera in the treatment of asthma (2019):

"Missing Data Handling and Sensitivity Analyses The primary analysis incorporated a control-based multiple imputation of missing data. Missing data for subjects who discontinued treatment early were estimated using the MF group; that is, the change from baseline AM post-dose ppFEV1 in patients who discontinued treatment and missed study visits was assumed to be similar to the change from baseline in patients who continued study visits through Week 12 in the MF treatment group. The dataset was first multiply imputed to have monotone missing patterns, then for each visit, a regression method was used to impute for missing data on both study drug arm and the control arm based on trend from the control arm. After applying the control-based multiple imputation, the cLDA analysis was performed. MF/F 100/10 mcg BID was considered superior to MF 100 mcg BID with a p-value less than 0.05. "

EMA seems to have a different opinion about handling missing data with MMRM or MI. On several occasions, we have heard that EMA prefers the MI approach for handling missing data, especially reference-based multiple imputation. EMA appears to be moving towards making reference-based multiple imputation the new standard approach for missing data.

Here is a table summarizing some comparisons between the MMRM and MI in handling the missing data. 




Missing data mechanism
  • MMRM and MI: MAR (missing at random)

Missing data imputation
  • MMRM: Individual missing values are not imputed, but missing data are implicitly imputed
  • MI: Individual missing values are explicitly imputed

Number of steps for calculations
  • MMRM: One step
  • MI: At least three steps: (1) an imputation model to create multiple data sets with the missing values filled in; (2) an analysis model to analyze each imputed data set; (3) Rubin's rules to combine the results for inference

Analysis model
  • MMRM: Mixed model with a maximum likelihood-based method
  • MI: Analysis of covariance (ANCOVA) or mixed model using a maximum likelihood-based method

Data points used in analyses
  • MMRM: All observed data points from all visits are used
  • MI: Usually, with ANCOVA, only the data points for the corresponding visit (with imputed values) are used

SAS procedure(s)
  • MMRM: PROC MIXED
  • MI: Imputation model: PROC MI; analysis model: PROC MIXED, PROC GLM, PROC GENMOD,…; Rubin's rules: PROC MIANALYZE

The two approaches will be approximately equivalent, provided the variables used in the imputation model are the same as those included in the analysis model, and conditionals are accommodated by a single joint model. In such settings, MI essentially provides an approximation to the observed likelihood analysis. If an infinite number of imputations could be performed, then the two approaches would be equivalent. In practice, the level of equivalence will depend on the number of imputations because of the Monte Carlo (simulation) sampling variability of the imputation process (described in more detail below), and thus will be stronger for a larger number of imputations.

Auxiliary variables
  • MMRM: Cannot be used
  • MI: Auxiliary variables can be used in the imputation model to improve the accuracy of the missing data prediction

Information observed post-randomization
  • MMRM: Cannot be included in the MMRM model
  • MI: Can be included in the imputation model to improve the accuracy of the missing data prediction, even though it can't be included in the analysis model (the MI approach allows the covariates used in the imputation model to differ from those in the analysis model)

Justification of MAR assumption
  • MMRM: Not available through the MMRM model
  • MI: The MAR assumption can be assessed through the tipping point approach or delta-based imputation

Handling the MNAR (missing not at random)
  • MMRM: Not directly available through MMRM
  • MI: Can be performed through PMM (pattern mixture model), reference-based or control-based multiple imputation

For studies with only one post-baseline measure
  • MMRM: Not appropriate
  • MI: Appropriate to use MI to impute the missing data and then run an analysis of covariance model as the analysis model

For outcome measures that are not continuous variables
  • MMRM: Like MMRM, there are statistical approaches that handle missing data without employing explicit imputation. As mentioned in the EMA guideline, “For categorical responses and count data, the so-called marginal (e.g. generalized estimating equations (GEE)) and random-effects (e.g. generalized linear mixed models (GLMM)) approaches are in use. Likelihood-based methods (MMRM and GLMM) and some extended GEE (i.e. weighted GEE) models are applicable under MCAR and MAR assumptions.”
  • MI: The MI approach can be easily applied to outcome measures with missing data that are categorical responses or count data. The analysis model may need to be PROC LOGISTIC, PROC GLIMMIX, PROC NLMIXED, or

Preferred by regulatory agencies
  • MMRM: FDA, but with multiple imputation approaches as sensitivity analyses (for example, reference-based MI, PMM, tipping point)
  • MI: EMA
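
The third MI step in the comparison above, combining the results from the imputed data sets (the step that PROC MIANALYZE performs in SAS), can be illustrated with a minimal Python sketch of Rubin's rules. The function name `pool_rubin` and the input numbers are hypothetical and only for illustration; the sketch assumes each imputed data set has already been analyzed and has yielded a treatment-effect estimate and its squared standard error.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine results from m imputed data sets with Rubin's rules.

    estimates: one point estimate per imputed data set
    variances: the matching squared standard errors
    (Hypothetical helper for illustration; not a SAS or library API.)
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")

    q_bar = mean(estimates)        # pooled point estimate
    w = mean(variances)            # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b        # total variance

    # Classical Rubin (1987) degrees of freedom; infinite when the
    # imputed data sets all give the same estimate.
    if b == 0:
        df = math.inf
    else:
        df = (m - 1) * (1 + w / ((1 + 1 / m) * b)) ** 2

    return {"estimate": q_bar, "total_variance": t, "se": math.sqrt(t), "df": df}


# Made-up estimates and squared standard errors from five imputed data sets
pooled = pool_rubin([-2.1, -1.8, -2.4, -2.0, -1.9],
                    [0.36, 0.34, 0.38, 0.35, 0.37])
print(pooled)
```

The between-imputation term is what carries the extra uncertainty due to the missing data; with more imputations the factor (1 + 1/m) shrinks towards 1 and the Monte Carlo noise in the pooled estimate decreases.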



Saturday, November 14, 2020

Words Ending in '-demic': Pandemic, Epidemic, Endemic, Twindemic, and Infodemic

This year, during the COVID-19 pandemic, we have heard plenty of words ending in '-demic'; some are old and some are new. A common saying is that "the epidemic (or outbreak) is inevitable, but the pandemic is optional".

An epidemic is defined as “an outbreak of disease that spreads quickly and affects many individuals at the same time.” According to the CDC, "Epidemic refers to an increase, often sudden, in the number of cases of a disease above what is normally expected in that population in that area."

Epidemics occur when an agent and susceptible hosts are present in adequate numbers, and the agent can be effectively conveyed from a source to the susceptible hosts. More specifically, an epidemic may result from:
  • A recent increase in amount or virulence of the agent,
  • The recent introduction of the agent into a setting where it has not been before,
  • An enhanced mode of transmission so that more susceptible persons are exposed,
  • A change in the susceptibility of the host response to the agent, and/or
  • Factors that increase host exposure or involve introduction through new portals of entry.
A pandemic is a type of epidemic (one with greater range and coverage): an outbreak of a disease that occurs over a wide geographic area and affects an exceptionally high proportion of the population. Pandemic refers to an epidemic that has spread over several countries or continents, usually affecting a large number of people. WHO defines it simply: "A pandemic is the worldwide spread of a new disease."

While a pandemic may be characterized as a type of epidemic, you would not say that an epidemic is a type of pandemic.

WHO classifies a pandemic into six phases plus a post-peak period and a post-pandemic period, and each phase calls for different actions.

The WHO is responsible for announcing the emergence of a new pandemic based on how the spread of the disease fits into the following 6 phases:
  • Phase 1. Viruses circulating among animal populations haven’t been shown to transmit to human beings. They’re not considered a threat and there’s little risk of a pandemic.
  • Phase 2. A new animal virus circulating among animal populations has been shown to transmit to human beings. This new virus is considered a threat and signals the potential risk of a pandemic.
  • Phase 3. The animal virus has caused disease in a small cluster of human beings through animal to human transmission. However, human to human transmission is too low to cause community outbreaks. This means that the virus places humans at risk but is unlikely to cause a pandemic.
  • Phase 4. There has been human-to-human transmission of the new virus in considerable enough numbers to lead to community outbreaks. This kind of transmission among humans signals a high risk of a pandemic developing.
  • Phase 5. There has been transmission of the new virus in at least two countries within one WHO region. Even though only two countries have been affected by the new virus at this point, a global pandemic is inevitable.
  • Phase 6. There has been transmission of the new virus in at least one additional country in a different WHO region. This is known as the pandemic phase and signals that a global pandemic is currently occurring.

Endemic refers to the constant presence and/or usual prevalence of a disease or infectious agent in a population within a geographic area. Endemic is a characteristic of a particular population, environment, or region. Examples of endemic diseases include chicken pox, which occurs at a predictable rate among young school children in the United States, and malaria in some areas of Africa. The disease is present in a community at all times but at a relatively low frequency. An endemic disease may become pandemic; for example, HIV infection/AIDS used to be endemic in the sub-Saharan Africa region and grew into a pandemic in the 1980s.

Hyperendemic refers to persistent, high levels of disease occurrence.

A twindemic is a new word proposed this year; it refers to the possibility of a severe flu season coinciding with a surge in COVID-19 cases. Even a mild flu season is concerning, given that the inevitable serious cases of the flu tax the medical system each year.

An infodemic is an overabundance of information, both online and offline. It includes deliberate attempts to disseminate wrong information to undermine the public health response and advance alternative agendas of groups or individuals. Mis- and disinformation can be harmful to people’s physical and mental health; increase stigmatization; threaten precious health gains; and lead to poor observance of public health measures, thus reducing their effectiveness and endangering countries’ ability to stop the pandemic.

Misinformation costs lives. Without the appropriate trust and correct information, diagnostic tests go unused, immunization campaigns (or campaigns to promote effective vaccines) will not meet their targets, and the virus will continue to thrive.


Saturday, November 07, 2020

The Saga of Biogen’s Alzheimer Drug Aducanumab

Aducanumab is an investigational compound being studied for the treatment of early Alzheimer’s disease co-developed by Biogen and Eisai. Aducanumab is a human immunoglobulin gamma 1 (IgG1) anti‐amyloid beta monoclonal antibody (mAb) targeting aggregated forms of amyloid beta - a fundamental pathological hallmark of the disease.

After the ‘successful’ phase I study (PRIME trial) demonstrated that aducanumab had an acceptable safety and tolerability profile and reduced brain amyloid-beta, accompanied by a slowing of clinical decline as measured by Clinical Dementia Rating-Sum of Boxes (CDR-SB) and Mini-Mental State Examination (MMSE) scores, Biogen designed two pivotal studies (ENGAGE and EMERGE) to evaluate the efficacy and safety of aducanumab in patients with mild cognitive impairment due to Alzheimer’s disease and mild Alzheimer’s disease dementia.

Then the saga began,… a roller-coaster year in 2019.

In March 2019, Biogen and Eisai announced the discontinuation of the Phase 3 ENGAGE and EMERGE trials of aducanumab in Alzheimer’s disease after interim analyses found that the futility boundaries had been crossed. The independent data monitoring committee advised that aducanumab would be unlikely to meet the primary endpoints even if the studies were continued to completion.

While everybody thought that aducanumab was dead in the water, Biogen unexpectedly announced plans for a regulatory filing for aducanumab in Alzheimer’s disease based on a new analysis of a larger data set from the Phase 3 studies.

On July 08, 2020, Biogen announced that they had completed the submission of a Biologics License Application (BLA) to the U.S. Food and Drug Administration (FDA) for the approval of aducanumab, an investigational treatment for Alzheimer's disease.

Given that their BLA submission was based on the re-analyses from studies that had been discontinued due to futility, the common understanding would be that the FDA would reject their BLA and require them to do another Phase 3 study.

Then came last week,… a roller-coaster week.

For a drug application with controversies, the FDA will usually organize an advisory committee meeting to seek opinions from outside experts, including representatives from patient organizations, patient advocacy groups, and consumer groups. The BLA for aducanumab was no exception. A Peripheral and Central Nervous System (PCNS) Drugs Advisory Committee meeting was scheduled for November 6, 2020, and the meeting materials were posted online two days earlier, on November 4, 2020.

The documents released by the FDA came as a shock to outsiders. With one negative trial and one trial that was very positive in one of the dose groups, one would expect the FDA to strike a negative tone and demand a third trial to confirm the efficacy. Usually, the sponsor tries everything to convince the FDA that the drug is efficacious and safe, while the FDA is conservative and pokes and probes the data to identify issues that discredit the sponsor’s claims about efficacy and safety.

However, this time for aducanumab, FDA is on the sponsor’s side. The briefing book from the FDA (actually combined FDA and Biogen Briefing Information) depicted a very rosy picture for aducanumab’s efficacy and safety.

With FDA’s backing, one would think that aducanumab is on its way to get a positive opinion from advisory committee members and eventually to be the first FDA approved novel medication for the treatment of Alzheimer’s disease since 2004.

Then the shocking news continued,…

At Friday’s advisory committee meeting, committee members resoundingly concluded that the clinical data did not support the approval of Biogen’s much-watched Alzheimer’s drug, aducanumab, while delivering a rebuke to the Food and Drug Administration, whose reviewers had given the medicine a glowing appraisal.

Here are the voting results:

FDA’s Questions to the Advisory Committee

Voting Results




Does Study 302, viewed independently and without regard for Study 301, provide strong evidence that supports the effectiveness of aducanumab for the treatment of AD?




Does Study 103 provide supportive evidence of the effectiveness of aducanumab for the treatment of AD?




Has the applicant presented strong evidence of a pharmacodynamic effect on AD pathophysiology?




In light of the understanding provided by the exploratory analyses of Study 301 and Study 302, along with the results of Study 103 and evidence of a pharmacodynamic effect on AD pathophysiology, is it reasonable to consider Study 302 as primary evidence of effectiveness of aducanumab for the treatment of AD?




Note: Study 301 was the pivotal study (ENGAGE trial) with the negative outcome; Study 302 was the pivotal study (EMERGE trial) with a positive outcome in high dose group; Study 103 was the phase I proof-of-concept study (PRIME trial).


It is not clear where aducanumab will go from here. It will be another shock if the FDA approves aducanumab for the treatment of Alzheimer’s disease given the extremely negative voting outcome from the advisory committee panel, even though everybody understands there is a huge, urgent, unmet need for a new AD drug. As one of the experts said, "with FDA's reputation already in a precarious position, it could be difficult -- maybe impossible -- to go against an expert panel at this time no matter how badly they want this". The best path forward would be to conduct a third pivotal study if Biogen has confidence in aducanumab. As the Chinese idiom goes, true gold fears no fire.

The combined FDA and Biogen briefing book revealed discordant views among the FDA reviewers about aducanumab's efficacy and the study issues. The conclusions from the clinical reviewer and the statistical reviewers were dramatically different. As one of the panel members commented, “It feels like the audio and video on TV are out of sync”. Unfortunately, in the overall conclusion and the FDA’s presentation, the clinical reviewer’s opinion trumped the statistical reviewer’s opinion. The statistical reviewer wasn’t even given an opportunity to present at the advisory committee meeting.

Here is the conclusion from the clinical reviewer:

“……the applicant has provided substantial evidence of effectiveness to support approval. Study 302 provided the primary evidence of effectiveness as robust and exceptionally persuasive study demonstrating a treatment effect on a clinically meaningful endpoint and reinforced by effects on secondary endpoints, biomarkers, and in relevant subgroups. Study 103 was an adequate and well-controlled study that included design components consistent with Study 302 and demonstrated a persuasive treatment effect on both clinical endpoints. The dose-response relationship for Aβ reduction provides support for the positive finding in the 10 mg/kg treatment arm to the apparently dose-related effects observed on clinical outcomes in Studies 103 and 302. Study 301 does not contribute to the evidence of effectiveness. The results of exploratory analyses, however, contribute to the overall understanding of Study 301 and together do not meaningfully detract from the persuasiveness of Study 302."

Here is the conclusion from the statistical reviewers:

“In summary, the totality of the data does not seem to support the efficacy of the high dose. There is only one positive study at best and a second study which directly conflicts with the positive study. Both studies were not fully completed as they were terminated early for futility and had sporadic unblinding for dose management of ARIA cases which was much higher in the drug group(s). The Amyloid PET sub-study data suggested a larger effect in APOE- (non-carriers) which is the opposite of what was observed for the clinical outcome data. Within the high dose group at the patient level, there is no correlation between the Week 78 change in the primary biomarker Ab in the cerebellum and the Week 78 Change from baseline in CDR-SB. In study 302, the on-face positive study, the raw correlation had the wrong +/- sign to support a realistic link between biomarker and long-term clinical change in cognition/function as measured by CDR-SB. For these reasons, the reviewer believes there is no compelling substantial evidence of treatment effect or disease slowing and that another study is needed to confirm or deny the positive study and the negative study. " 

Monday, November 02, 2020

Cluster Randomized Clinical Trials (CRTs) and Group Randomized Trials (GRTs)

Cluster randomized trials are experiments in which intact social units or clusters of individuals rather than independent individuals are randomly allocated to intervention groups. In cluster randomized trials, the unit of randomization is different from the unit of observation or unit of analysis. In typical clinical trials, the unit of randomization is the individual subject (or patient/participant) while in cluster randomized clinical trials, the unit of randomization is a cluster. The cluster can be a community, a hospital, a school,...


  • Medical practices selected as the randomization unit in trials evaluating the efficacy of disease screening programs 
  • Communities selected as the randomization unit in trials evaluating the effectiveness of new vaccines in developing countries 
  • Hospitals selected as the randomization unit in trials evaluating educational guidelines directed at physicians and/or administrators
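
The distinction between the unit of randomization (the cluster) and the unit of observation (the individual) can be illustrated with a minimal Python sketch. The function names, cluster labels, and arm names are hypothetical and chosen only for illustration; the sketch assumes a simple 1:1 allocation with no stratification or matching, which real trials often add.

```python
import random


def randomize_clusters(cluster_ids, arms=("intervention", "control"), seed=None):
    """Randomly allocate whole clusters to arms in equal proportions.

    Hypothetical helper: shuffles the clusters and deals them out to the
    arms in turn, so arm sizes differ by at most one cluster.
    """
    rng = random.Random(seed)
    shuffled = list(cluster_ids)
    rng.shuffle(shuffled)
    return {cluster: arms[i % len(arms)] for i, cluster in enumerate(shuffled)}


def assign_participants(participants, cluster_arm):
    """Give each participant the arm of the cluster they belong to.

    participants: iterable of (participant_id, cluster_id) pairs
    """
    return {pid: cluster_arm[cid] for pid, cid in participants}


# Four hospitals are the units of randomization ...
hospital_arm = randomize_clusters(["H1", "H2", "H3", "H4"], seed=2020)

# ... while the patients inside them are the units of observation.
patients = [("p01", "H1"), ("p02", "H1"), ("p03", "H2"), ("p04", "H3"), ("p05", "H4")]
patient_arm = assign_participants(patients, hospital_arm)

print(hospital_arm)
print(patient_arm)
```

No patient is randomized individually: every patient in the same hospital receives the same condition, which is why outcomes within a cluster cannot be treated as independent in the analysis.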

Cluster randomized trials may also be called group-randomized trials, and the two terms can be used interchangeably. Dr. David M. Murray from NIH prefers the term 'group-randomized trials' (GRTs for short). He has a 7-part online course to help researchers design and analyze GRTs. It includes video presentations, slide sets, suggested reading materials, guided activities, and a list of course references. He provides the definition and the distinguishing characteristics of CRTs or GRTs:
  • Groups randomized to study conditions with some connection among participants before and after randomization. 
  • Many trials conducted in communities, worksites, schools, etc.
  • The unit of assignment is an identifiable group. 
  • Different groups are allocated to each condition. 
  • The units of observation are members of the groups. 
  • The number of groups allocated to each condition is usually limited.
Cluster randomized trials are not a new concept and are probably used more in the public health field than in the drug development field. A cluster randomized trial is a type of pragmatic trial that focuses on evaluating the effectiveness of different interventions (treatment policy, regimen,...), whereas typical clinical trials focus on evaluating efficacy.

Back in the 1990s, we used a cluster randomized trial in an iron-deficiency anemia prevention and treatment study in children in China. We selected several model counties and, within each model county, randomly selected townships to implement three different intervention strategies: health education; drug treatment for children with severe anemia; and drug treatment for children with moderate and severe anemia.

While cluster randomized trials are under-used, we have started to see more discussion of them and more published results from studies using cluster randomization.

In a paper by Mitja et al (NEJM 2021) "A Cluster-Randomized Trial of Hydroxychloroquine for Prevention of Covid-19", the cluster is the ring of the contacts of the index case (Covid-19 patients). "In the ring vaccination trial, a person newly diagnosed with the disease becomes the index case around whom an epidemiologically defined ring is formed. This ring is then randomized to either immediate vaccination (intervention) or delayed vaccination (control) in a 1:1 ratio on an open-label basis." The inclusion criteria specified the eligible subjects are "asymptomatic individuals exposed to a PCR confirmed COVID19 case within 5 days as either a healthcare worker or household contact".

In a paper by Victor et al (NEJM 2018) "A Cluster-Randomized Trial of Blood-Pressure Reduction in Black Barbershops", "the barbershops were assigned to a pharmacist-led intervention (in which barbers encouraged meetings in barbershops with specialty-trained pharmacists who prescribed drug therapy under a collaborative practice agreement with the participants’ doctors) or to an active control approach (in which barbers encouraged lifestyle modification and doctor appointments). The primary outcome was reduction in systolic blood pressure at 6 months."

In a paper by Katzmarzyk et al (NEJM 2020) "Weight Loss in Underserved Patients — A Cluster-Randomized Trial", they "randomly assigned 18 clinics to provide patients with either an intensive lifestyle intervention, which focused on reduced caloric intake and increased physical activity, or usual care."

Arrossi et al (Lancet Global Health 2015) described a study about "Effect of self-collection of HPV DNA offered by community health workers at home visits on uptake of screening for cervical cancer (the EMA study): a population-based cluster-randomised trial" where 200 community health workers were randomly allocated in a 1:1 ratio to either the intervention group (offered women the chance to self-collect a sample for cervical screening during a home visit) or the control group (advised women to attend a health clinic for cervical screening).

In a recent paper by Mitja et al (NEJM 2020), "A Cluster-Randomized Trial of Hydroxychloroquine for Prevention of Covid-19", the cluster was also defined as the ring from contact tracing. "We defined trial clusters (called rings) of healthy persons (contacts) who were epidemiologically linked to a PCR-positive case patient with Covid-19 (index case patient). All the contacts in a ring simultaneously underwent cluster randomization (in a 1:1 ratio) to either the hydroxychloroquine group or the usual-care group."

Cluster randomized clinical trials may be a good fit for some vaccine trials. Henao-Restrepo et al (Lancet 2016) published a vaccine study against Ebola virus, "Efficacy and effectiveness of an rVSV-vectored vaccine in preventing Ebola virus disease: final results from the Guinea ring vaccination, open-label, cluster-randomised trial (Ebola Ça Suffit!)". Based on the results from this cluster-randomized trial, the FDA approved the vaccine for the prevention of disease caused by Zaire ebolavirus in individuals 18 years of age and older. In this phase 3 open-label, cluster-randomized study comparing immediate versus delayed vaccination against EVD, index cases (persons newly diagnosed with EVD) were identified by the Guinean national surveillance system. A cluster (or ring) definition team defined the cluster population by creating a list of all contacts and contacts of contacts (CCC), relative to the index case, regardless of eligibility for vaccination, including absent CCCs.

"Cluster randomized clinical trials" is the topic of the next UPENN annual conference on statistical issues in clinical trials. I hope that the COVID-19 situation will be under control and that the conference can be held as planned.

References and Further Readings: