Monday, December 28, 2020

120-Day Safety Update or 4-Month Safety Update - The Requirement for NDA/BLA

After a new drug application (NDA) or biologics license application (BLA) is submitted by the sponsor and accepted by FDA, FDA reviewers take 10 months (standard review) or 6 months (priority review) to review the submission package and issue a decision on or before the decision date (PDUFA date). FDA reviewers evaluate marketing applications for efficacy and safety, weigh benefit against risk, and expect to receive a complete application at the time of filing (exclusive of the 120-day safety update).

It is very possible that during the 6-10 month review time, the sponsor will have additional data to supplement the already submitted NDA/BLA package. The regulatory pathway for providing additional data to the FDA is the so-called ‘120-Day Safety Update’, also referred to as the ‘4-Month Safety Update’ (4MSU).

The ‘120-Day Safety Update’ or ‘4-Month Safety Update’ is specified as a requirement in the Code of Federal Regulations, 21 CFR 314.50.


The 120-Day Safety Update contains any new safety information learned about the drug that may reasonably affect the statement of contraindications, warnings, precautions, and adverse reactions in the draft drug labeling.

The report must be received by the FDA within 120 days of submission of the marketing application (that is, receipt by the FDA of the New Drug Application (NDA), comprising the CTD/Integrated Summary Report) to avoid triggering an extension of the review clock.

The 120-Day Safety Update Report is mandated for submission to the FDA 120 days after submission of the NDA/BLA, and is intended to provide a summary update of any new safety data gathered by the sponsor since the data cut-off for the NDA submission documents, which could have been as far back as 6 months prior to the NDA submission date. In effect, the 120-Day Safety Update report could represent almost 1 year’s worth of new safety data, which needs to be reviewed by the authorities to ensure there has been no change in the product’s recorded safety profile. This is particularly important for medications intended for long-term treatment.
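The timing described above is simple date arithmetic. The sketch below is only an illustration: the dates are hypothetical, the function names are made up for this example, and it treats the 120-day due date as the latest point the update could cover (in practice the update has its own, earlier data cut-off).

```python
from datetime import date, timedelta

def safety_update_due(submission_date: date, window_days: int = 120) -> date:
    """Calendar date that falls window_days after the NDA/BLA submission."""
    return submission_date + timedelta(days=window_days)

def approx_months_between(start: date, end: date) -> float:
    """Approximate number of months between two dates (average month length)."""
    return (end - start).days / 30.4375

# Hypothetical dates, for illustration only
data_cutoff = date(2020, 1, 15)      # data cut-off used for the NDA documents
nda_submission = date(2020, 7, 15)   # NDA submitted about 6 months later

due = safety_update_due(nda_submission)
span = approx_months_between(data_cutoff, due)

print("120-day safety update due by:", due.isoformat())
print("Months from NDA data cut-off to due date:", round(span, 1))
```

With a 6-month gap between the data cut-off and the submission, the period from the original cut-off to the due date is about 10 months, which is where the "almost 1 year" of new safety data comes from.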

The 120-Day Safety Update is focused on additional safety data. If additional data are collected for efficacy variables, the efficacy information can also be included, but in general it is for summary purposes only and no inferential statistics are needed.

The data to be included in the 120 Day Safety Update can include:

  • The long-term follow-up data from the on-going clinical trials
  • Open-label extension studies with patients rolled over from the pivotal studies (usually the double-blinded controlled studies)
  • Additional data from later time points and from newly enrolled patients
  • Newly initiated clinical trials

Depending on the type of data to be included in the 120 Day Safety Update, the submission package could be just a written report or a full submission package (including the report; post-text tables, listings, figures; the data sets; define documents; SDRG/ADRG, etc.).

There are many examples of 120-Day Safety Updates in marketing applications. Here are some of them:

Briefing Document for Advisory Committee Meeting on Novo Nordisk’s Insulin degludec/liraglutide (IDegLira) for Treatment to Improve Glycemic Control in Adults with Type 2 Diabetes Mellitus. The NDA submission was based on two pivotal trials (Trial 3697 in patients inadequately controlled on OAD treatment and Trial 3912 in patients inadequately controlled on basal insulin treatment), which were designed to assess the contribution of the individual components of the combination to its primary efficacy effect (i.e., overall glycemic control). Additional data from other ongoing studies and from studies initiated after the data cut for the NDA submission were submitted to the NDA as a ‘120 Day Safety Update’:

The NDA submitted to the FDA had a data cut-off of 31 March 2015. Additional blinded safety data from two phase 3 trials that were ongoing at the time of NDA submission (Trials 4119 and 4056) as well as from a trial that was subsequently initiated (Trial 4185) was submitted to the FDA in a 120 Day Safety Update with a cut-off date of 30 September 2015. A brief overview of the ongoing trials included in the 120-day safety update is provided in Table 1–1. The 120-day safety update included available blinded safety data from these trials on deaths, other serious adverse events, pregnancies (including updates for pregnancies reported as ongoing in the NDA) and adverse events leading to withdrawal. 

Sunovion Pharmaceuticals NDA of Latuda for treatment of major depressive episodes associated with bipolar I disorder in pediatric patients aged 10 and older. The NDA submission was mainly based on the pivotal study (Study D1050326). Subjects who completed Study D1050326 were recruited into an open-label study (D1050302). The data from the open-label study were submitted as a 120-day safety update to support the NDA.

Study D1050302 is a 104-week open-label trial designed to assess the long-term safety profile of lurasidone (dosed 20-80 mg per day) in pediatric patients recruited from the pediatric schizophrenia (Study D1050301), bipolar depression (Study D1050326), and autism trials. This study was scheduled for completion last December, 2017. The Applicant submitted preliminary data for 619 patients participating in this trial with a cutoff date of October, 2016. Additionally, the 120-day safety update submitted with this application focused on the available safety data from 305 patients recruited from Study D1050326 with a cutoff date of May, 2017. It should be noted that although the final report for Study D1050302 has not been submitted for review, the Applicant presented acceptable long-term data to make an approval determination for this sNDA, including lurasidone exposure of 153 patients for ≥ 52 weeks.

Clinical Review for BLA for Mepolizumab for Add-on maintenance treatment of severe asthma. The data from long-term open-label studies were not available at the time of BLA preparation but were submitted to FDA as a 120-Day Safety Update.

This safety review primarily relies on data from three placebo-controlled studies in a severe asthma population: MEA112997 (Study 97), MEA115588 (Study 88) and MEA115575 (Study 75) as these studies most closely approximate the patient population to receive mepolizumab in the clinical practice. Within this review, the pooled database for these studies is referred to as the Placebo-Controlled Severe Asthma Studies (PCSA). Longer term safety data are provided by two open-label studies, MEA115666 (Study 66), MEA115661 (Study 61). These studies were ongoing at the time of the BLA submission with updated data provided to the Division in a 120-day safety update. The data from this safety update used a cutoff date of October 27, 2014 and provides cumulative review of the data for the studies ongoing at the time of BLA submission.

Tuesday, December 08, 2020

COA (Clinical Outcome Assessment): PRO, ClinRO, PerfRO, ObsRO, eCOA, and TECOA

Clinical Outcome Assessment (COA) has given rise to multiple acronyms: PRO, ClinRO, PerfRO, and ObsRO.

According to FDA's website, these acronyms are defined as the following: 

PRO - patient-reported outcome
A type of clinical outcome assessment. A measurement based on a report that comes directly from the patient (i.e., study subject) about the status of a patient’s health condition without amendment or interpretation of the patient’s response by a clinician or anyone else. A PRO can be measured by self-report or by interview provided that the interviewer records only the patient’s response. Symptoms or other unobservable concepts known only to the patient can only be measured by PRO measures. PROs can also assess the patient perspective on functioning or activities that may also be observable by others. PRO measures include:
  • Rating scales (e.g., numeric rating scale of pain intensity or Minnesota Living with Heart Failure Questionnaire for assessing heart failure)
  • Counts of events (e.g., patient-completed log of emesis episodes or micturition episodes)
Specifically for PRO, FDA has a guidance "Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims". PRO can be further separated into generic (such as SF-36, EQ-5D, ...) and disease-specific PROs (SGRQ for COPD, PAH-SYMPACT for pulmonary arterial hypertension, ...). 

ObsRO - observer-reported outcome

A type of clinical outcome assessment. A measurement based on a report of observable signs, events or behaviors related to a patient’s health condition by someone other than the patient or a health professional. Generally, ObsROs are reported by a parent, caregiver, or someone who observes the patient in daily life and are particularly useful for patients who cannot report for themselves (e.g., infants or individuals who are cognitively impaired). An ObsRO measure does not include medical judgment or interpretation. ObsRO measures include:

  • Rating scales, such as:
    • Acute Otitis Media Severity of Symptoms scale (AOM-SOS), a measure used to assess signs and behaviors related to acute otitis media in infants
    • Face, Legs, Activity, Cry, Consolability scale (FLACC), a measure used to assess signs and behaviors related to pain
  • Counts of events (e.g., observer-completed log of seizure episodes)

ObsRO is often used in rare diseases, in pediatric diseases, or in diseases in which patients may lose self-control and cannot detect the signs/symptoms on their own (such as seizure or stroke). For patients who cannot respond for themselves (e.g., infants or the cognitively impaired), observer reports should include only those events or behaviors that can be observed. As an example, observers cannot validly report an infant’s pain intensity (a symptom) but can report infant behavior thought to be caused by pain (e.g., crying). Similarly, in the assessment of a child’s functioning in the classroom, the teacher is the most appropriate observer. Examples of ObsROs include a parent report of a child’s vomiting episodes or a report of wincing thought to be the result of pain in patients who are unable to report for themselves.

Additional examples are the ObsRO-Celiac Disease Daily Symptom Diary (ObsRO-CDSD©), the Pediatric Quality of Life Inventory™, and the Edmonton Symptom Assessment System Revised (ESAS-r).

ClinRO - clinician-reported outcome

A type of clinical outcome assessment. A measurement based on a report that comes from a trained health-care professional after observation of a patient’s health condition. Most ClinRO measures involve a clinical judgment or interpretation of the observable signs, behaviors, or other manifestations related to a disease or condition. ClinRO measures cannot directly assess symptoms that are known only to the patient. ClinRO measures include:

  • Reports of particular clinical findings (e.g., presence of a skin lesion or swollen lymph nodes) or clinical events (stroke, heart attack, death, hospitalization for a particular cause), which can be based on clinical observations together with biomarker data, such as electrocardiogram (ECG) and creatine phosphokinase (CPK) results supporting a myocardial infarction
  • Rating scales, such as:
    • Psoriasis Area and Severity Index (PASI) for measurement of severity and extent of a patient’s psoriasis
    • Hamilton Depression Rating Scale (HAM-D) for assessment of depression

The majority of neurological assessment tools fall into this category. Additional examples are INCAT (Inflammatory Neuropathy Cause and Treatment), the Guillain-Barré Syndrome disability score, and the MRC sum score.

PerfO (also written PerfRO) - performance outcome

A type of clinical outcome assessment. A measurement based on standardized task(s) actively undertaken by a patient according to a set of instructions. A PerfO assessment may be administered by an appropriately trained individual or completed by the patient independently. PerfO assessments include:

  • Measures of gait speed (e.g., timed 25 foot walk test using a stopwatch or using sensors on ankles)
  • Measures of memory (e.g., word recall test) 
For example, frequently used outcome measures such as the six-minute walk test (6MWT), the cardiopulmonary exercise test (CPET), and grip strength fall into the PerfO category.

FDA created a division, the "Division of Clinical Outcome Assessment (DCOA)", with a mission of "integrating the patient voice into drug development through COA endpoints that are meaningful to patients, valid, reliable and responsive to treatment."

For a scale that has not been validated before and is intended to be used as the primary efficacy outcome measure in a clinical development program, CDER has two pathways for reviewing COAs:
  • The CDER COA Qualification Program or
  • Under an individual drug development program
With new technology development, we are now coming up with new terms: eCOA - electronic Clinical Outcome Assessment and TECOA - Technology Enabled COA. 

eCOA uses electronic means, for example electronic diaries, to collect the COA data. TECOA is a measurement that comes directly but passively from the patient using technology.

eCOA has the following benefits:
  • Improved protocol compliance
  • Improved data integrity (real-time, timestamped entries, encrypted, protected data, and ability to integrate sources)
  • Decreased hidden costs of paper (monitoring & querying data entry, recruitment due to non-compliance of paper, regulatory submission risk)
  • Better patient experience (intuitive user interface, ability to work offline/online, can be used on patients' devices to fit their lifestyle)
  • Improved operational efficiencies (shortened study timelines, reduced manual efforts, easily make mid-study changes, gather real-time insights)
  • Better regulatory guidance (regulators support and encourage eCOA use, willing to include outcomes in product labels). 


Sunday, December 06, 2020

Multiple Imputation: Imputation Model versus Analysis Model

Multiple imputation has become more and more popular in handling the missing data in clinical trials. Multiple imputation inference involves three distinct phases:

  • The missing data are filled in m times to generate m complete data sets. This step is through the imputation model and can be implemented using SAS Proc MI
  • The m complete data sets are analyzed by using standard procedures. This step is through the analysis model – depending on nature of the outcome variable, the analysis model can be ANCOVA (analysis of covariance), MMRM (mixed model repeated measures), Logistic regression, GEE (generalized estimating equation), GENMOD (generalized linear model),… The analysis model is also the primary model for analyzing the corresponding outcome variable.
  • The results from the m complete data sets are combined for the inference. This step uses Rubin’s rules and can be implemented with SAS Proc MIANALYZE
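The combining step can be illustrated with a minimal Python sketch of Rubin's rules for a single parameter. This is only an illustration of the pooling arithmetic, not a replacement for Proc MIANALYZE; the function name and the input numbers are hypothetical, and the degrees of freedom use Rubin's (1987) large-sample formula without any small-sample adjustment.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Combine m completed-data estimates of one parameter with Rubin's rules.

    Returns (pooled estimate, pooled standard error, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need m >= 2 estimates with matching standard errors")

    q_bar = mean(estimates)                       # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # average within-imputation variance
    between = variance(estimates)                 # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between        # total variance

    if between > 0:
        df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2
    else:
        df = math.inf                             # no between-imputation variability
    return q_bar, math.sqrt(total), df

# Hypothetical treatment-effect estimates from m = 5 imputed data sets
est = [-1.92, -2.10, -1.85, -2.03, -1.97]
se = [0.61, 0.63, 0.60, 0.62, 0.61]

pooled_est, pooled_se, df = pool_rubin(est, se)
print(round(pooled_est, 3), round(pooled_se, 3), round(df, 1))
```

The pooled standard error is always at least as large as the average completed-data standard error, because the between-imputation term carries the extra uncertainty due to the missing values.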

Both the imputation model and the analysis model need to include a list of explanatory or independent variables, but for different purposes. The explanatory or independent variables in the imputation model are used to impute the missing values; the explanatory or independent variables in the analysis model are covariates that form part of the standard statistical model. Here are some comparisons of the variables used in the imputation model and the analysis model:

  • The covariates included in the analysis model must also be included in imputation model
  • The imputation model can include additional auxiliary variables including those variables that are not used as covariates in the analysis model
  • The number of variables used in imputation model is greater than or equal to the number of variables in analysis model
  • The imputation model can include variables measured after the randomization (such as secondary outcomes, concomitant medication use, compliance data). However, for analysis model, “variables measured after randomisation and so potentially affected by the treatment should not be included as covariates in the primary analysis.”
  • For longitudinal data or repeated measures, the outcome measures at early time points will be included in the imputation model.
  • If the variables used in the analysis model are transformed, the transformed variable should also be used in the imputation model
  • If the interaction term is used in the analysis model, it should also be included in the imputation model - this can make the imputation model pretty complicated though. 
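The first few rules in the list above amount to a subset check: every term in the analysis model should also appear in the imputation model, while the reverse is not required. A small sketch of such a check follows; the variable names and the helper function are hypothetical and only illustrate the idea.

```python
def missing_from_imputation(analysis_terms, imputation_terms):
    """Return analysis-model terms that are absent from the imputation model."""
    return sorted(set(analysis_terms) - set(imputation_terms))

# Hypothetical model specifications (names are illustrative only)
analysis_model = [
    "treatment",
    "baseline_score",
    "center",
    "treatment*baseline_score",   # interaction term used in the analysis
]
imputation_model = [
    "treatment",
    "baseline_score",
    "center",
    "week4_score",                # earlier time points of the outcome
    "week8_score",
    "compliance",                 # post-randomization auxiliary variable
]

gaps = missing_from_imputation(analysis_model, imputation_model)
print(gaps)  # ['treatment*baseline_score']
```

Here the imputation model has more variables than the analysis model, yet it still omits the interaction term, so counting variables alone is not a sufficient check.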

In many publications, multiple imputation is stated as the method for handling the missing data; however, the details of the imputation model (i.e., which variables are included in it) are usually not described.

While there is no clear guidance about which variables to include in the imputation model, it is important to pre-specify the list of variables, especially if the imputation model contains auxiliary variables, i.e., variables not included in the analysis model.

Below are some excerpts from the literature about the imputation model and analysis model.


Imputation Model, Analytic Model and Compatibility:

When developing your imputation model, it is important to assess if your imputation model is “congenial” or consistent with your analytic model. Consistency means that your imputation model includes (at the very least) the same variables that are in your analytic or estimation model. This includes any transformations to variables that will be needed to assess your hypothesis of interest. This can include log transformations, interaction terms, or recodes of a continuous variable into a categorical form, if that is how it will be used in later analysis. The reason for this relates back to the earlier comments about the purpose of multiple imputation. Since we are trying to reproduce the proper variance/covariance matrix for estimation, all relationships between our analytic variables should be represented and estimated simultaneously. Otherwise, you are imputing values assuming they have a correlation of zero with the variables you did not include in your imputation model. This would result in underestimating the association between parameters of interest in your analysis and a loss of power to detect properties of your data that may be of interest such as non-linearities and statistical interactions. 

Auxiliary variables are variables in your data set that are either correlated with a missing variable(s) (the recommendation is r > 0.4) or are believed to be associated with missingness. These are factors that are not of particular interest in your analytic model, but they are added to the imputation model to increase power and/or to help make the assumption of MAR more plausible. These variables have been found to improve the quality of imputed values generated from multiple imputation. Moreover, research has demonstrated their particular importance when imputing a dependent variable and/or when you have variables with a high proportion of missing information (Johnson and Young, 2011; Young and Johnson, 2010; Enders, 2010).

You may a priori know of several variables you believe would make good auxiliary variables based on your knowledge of the data and subject matter. Additionally, a good review of the literature can often help identify them as well. However, if you're not sure what variables in the data would be potential candidates (this is often the case when conducting secondary data analysis), you can use some simple methods to help identify potential candidates.
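One simple screening approach is sketched below in Python, under stated assumptions: the data are hypothetical, missing values are coded as None, and the r > 0.4 rule of thumb quoted above is also applied to the correlation with a missingness indicator, which is a choice made for this illustration rather than something the excerpt prescribes.

```python
import math

def pearson(x, y):
    """Pearson correlation using only pairs where both values are observed."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return float("nan")
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def screen_auxiliary(target, candidates, threshold=0.4):
    """Keep candidates correlated with the target values or with its missingness."""
    missing_flag = [1.0 if v is None else 0.0 for v in target]
    selected = {}
    for name, values in candidates.items():
        r_value = pearson(target, values)        # correlation with observed values
        r_miss = pearson(missing_flag, values)   # correlation with missingness
        if abs(r_value) > threshold or abs(r_miss) > threshold:
            selected[name] = (r_value, r_miss)
    return selected

# Hypothetical data: week-12 outcome with two missing values (None)
week12_score = [10, 12, None, 15, 18, None, 20, 22]
candidates = {
    "week8_score": [9, 11, 13, 14, 17, 19, 21, 23],
    "unrelated_lab": [5, 1, 4, 2, 5, 3, 1, 4],
}

selected = screen_auxiliary(week12_score, candidates)
print(sorted(selected))
```

In this made-up example only the earlier outcome measurement passes the screen; the unrelated laboratory value is weakly correlated with both the outcome and its missingness and is dropped.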

In a presentation of “multiple imputations” by Adrienne D. Woods

Which variables should you include as predictors in the imputation model?

  • Any variables you plan to use in later analyses (including controls)
  • General advice: use as many as possible (could get unwieldy!)
  • Although, some (i.e., Kline, 2005; Hardt, Herke, & Leonhart, 2012) believe that this introduces more imprecision, especially if the auxiliary variable explains less than 10% of the variance in missingness on Y… thoughts?
  • Know your analysis model beforehand and include at least all analysis variables in imputation model (including interaction terms)

FDA’s Statistical Review for Vantrela (hydrocodone bitartrate) extended-release tablets for management of severe pain

Analysis model:

"The primary efficacy endpoint of trial 3103 was change from baseline to week 12 in the weekly average of worst pain intensity (WPI). The primary analysis was ANCOVA model with baseline WPI, randomized treatment, opioid status, and center as covariates. The intent-to-treat analysis population, defined as all randomized patients, was used for the primary efficacy analysis."

Imputation model:

"The applicant performed multiple imputation on the week 12 missing data for the primary analysis. The imputation model included randomized treatment, opioid status, baseline and postbaseline WPI values while subjects in the active-drug treatment group who discontinued study drug because of an adverse event, were treated as if they were in the placebo group and their missing data were imputed based on the observed placebo subjects' data."

FDA's Statistical Review for EUCRISA™ (crisaborole) topical ointment, 2% for Atopic Dermatitis mentioned the imputation model for missing dichotomized outcome variable. 

The protocol specified the primary imputation method to be the multiple imputation (MI) approach. For each treatment arm separately, missing data was imputed using the Markov Chain Monte Carlo (MCMC) method. The protocol specified the following two sensitivity analyses for the handling of missing data:

· Repeated-measures logistic regression model (GEE), with dichotomized ISGA success as the dependent variable and treatment, analysis center, and visit (i.e., Days 8, 15, 22, and 29) as independent factors. In this analysis, data from all post-baseline visits will be included with no imputation for missing data.

· Model-based multiple imputation method to impute missing data for the dichotomized ISGA data. The imputation model (i.e., logistic regression) will include treatment and analysis center.

Kaifeng Lu et al (2010) Multiple Imputation Approaches for the Analysis of Dichotomized Responses in Longitudinal Studies with Missing Data pointed out the issue if the analysis model is different from the imputation model. 

Despite its conceptual simplicity and flexibility, the above MI procedure is not valid for the analysis of dichotomized responses because Rubin’s variance estimator is biased when the analysis model is different from the imputation model (Meng, 1994; Robins and Wang, 2000). This is true even when the imputation and analysis models are compatible, e.g. when the treatment is the only effect in the logistic regression model.

Ian R. White  et al (2012) Including all individuals is not enough: lessons for intention-to-treat analysis

In some cases, an MI procedure can be improved by including in the imputation model ‘auxiliary variables’ that are not in the analysis model [36, Chapter 4]: auxiliary variables in a randomised trial might be secondary outcomes or compliance summaries. MI then produces estimates of the treatment effect that are genuinely different from a likelihood-based analysis, by incorporating information on individuals with missing outcome but observed values of auxiliary variables. However, in our experience, the contribution to such an analysis of individuals missing the outcome of interest is moderate unless correlations between the outcome and one or more auxiliary variables are substantial [37].

Michael Spratt et al (2010) Strategies for Multiple Imputation in Longitudinal Studies

Where there are nontrivial amounts of missing data in covariates, both preliminary analyses and imputation models will become more complex. An MAR assumption may often become more plausible after the inclusion in the imputation model of additional variables that are not in our analysis model (because they are on the causal pathway, for example). Thus, multiple imputation models should typically be more complex than the analysis model. Including variables that are not related to the variable being imputed in the imputation models may slightly decrease efficiency but should not cause bias (29, 31). Model diagnostics should be used to highlight any implausibility in the imputed values. For example, the distributions of observed and imputed data should be compared and the plausibility of any differences examined. Imputation models should also preserve the structure of the analysis model (32). For example, where the substantive analysis exploits the hierarchical nature of longitudinal data (e.g., using a multilevel model), the imputation model should be similarly structured. Here, the longitudinal nature of the data allowed us to include variables (previous wheezing) that predicted the values of the variable with the most missing data (wheeze at 81 months) in imputation models.

Jochen Hardt et al (2012) Auxiliary variables in multiple imputation in regression with missing X: a warning against including too many in small sample research

  • An additional advantage of MI over CC (complete-case analysis) is the possibility of including information from auxiliary variables into the imputation model. Auxiliary variables are variables within the original data that are not included in the analysis, but are correlated to the variables of interest or help to keep the missing process random [MAR: 1]. Little [6] has calculated the amount of decrease in variance of a regression coefficient Y on X1 when a covariate X2 is added that has no missing data. White and Carlin [7] have extended this proof to more than one covariate. In practice however, it is likely that auxiliary variables themselves will have missing data.

EMA's Guideline on Missing Data in Confirmatory Clinical Trials mentions multiple imputation as an approach for handling missing data under the MAR assumption; however, it does not say anything about the imputation model.

Panel on Handling Missing Data in Clinical Trials; National Research Council  (2010) The Prevention and Treatment of Missing Data in Clinical Trials

Multiple imputation methods address concerns about (b) “simple imputation is generally not true because the methods do not always yield conservative effect estimators, and standard errors and confidence interval widths can be underestimated when uncertainty about the imputation process is neglected.”  and enable the use of large amounts of auxiliary information.

An important advantage of multiple imputation in the clinical trial setting is that auxiliary variables that are not included in the final analysis model can be used in the imputation model. For example, consider a longitudinal study of HIV, for which the primary outcome Y is longitudinal CD4 count and that some CD4 counts are missing. Further, assume the presence of auxiliary information V in the form of longitudinal viral load. If V is not included in the model, the MAR condition requires the analysis to assume that, conditional on observed CD4 history, missing outcome data are unrelated to the CD4 count that would have been measured; this assumption may be unrealistic. However, if the investigator can confidently specify the relationship between CD4 count and viral load (e.g., based on knowledge of disease progression dynamics) and if viral load values are observed for all cases, then MAR implies that the predictive distribution of missing CD4 counts given the observed CD4 counts and viral load values is the same for cases with CD4 missing as for cases with CD4 observed, which may be a much more acceptable assumption.

Meyer et al (2020) Statistical Issues and Recommendations for Clinical Trials Conducted During the COVID-19 Pandemic

Multiple imputation (MI) methodology (Rubin, 1987) may be helpful in this respect as it allows inclusion of auxiliary variables (both pre- and post-randomization) in the imputation model while utilizing the previously planned analysis model. Multiple imputation with auxiliary variables may be used for various types of endpoints, including continuous, binary, count, and time-to-event and coupled with various inferential methods in the analysis step.

Thomas R Sullivan et al (2018) Should multiple imputation be the method of choice for handling missing data in randomized trials?

In the first stage of MI, multiple values (m > 1) for each missing observation are independently simulated from an imputation model. For missing data restricted to the outcome, the imputation model would typically regress observed values of Y on X and T. Additional auxiliary variables that are not in the analysis model can also be added to the imputation model to improve the prediction of missing values.

In applying MI, the repeated measurements of the outcome are usually treated as distinct variables in the imputation model. Where interest lies in the treatment effect at the final time point, the analysis model need not include the intermediate outcome measures; following imputation a comparison of final time point results is sufficient. In this case, the intermediate measures operate as auxiliary variables, assisting with the prediction of missing values at the final time point and making the MAR assumption more plausible. Other auxiliary variables, for instance measures of compliance or related outcomes, can also be added to the imputation model as required. If data are collected but more likely to be missing following treatment discontinuation, an indicator variable for discontinuation may also be valuable as an auxiliary variable. The ability to incorporate auxiliary variables, both for univariate and multivariate outcomes, is considered one of the key strengths of MI.

Thus in settings where MI is adopted, we recommend imputing by randomized group; compared to MI overall, this approach offers greater robustness at little cost. The approach is also consistent with general recommendations for over- rather than under-specifying imputation models. It should be noted that imputing by group only protects against bias in estimating the ATE if effect modifiers are included in the imputation model.

One of the strengths of MI is its ability to easily incorporate variables of different types (e.g. continuous, binary) in the imputation model, whether for univariate or multivariate data. An added benefit of including all outcomes in a single imputation model is that associations between related outcomes can aid imputation. Another appealing feature of MI is its ability to be implemented under an assumption that data are MNAR. This property makes MI well suited to undertaking sensitivity analyses around a primary assumption that data are MAR, and as a primary method of analysis in settings where data are believed to be MNAR. One such setting is RCTs where participants cannot be followed up after discontinuing treatment. If all observed data are ‘on-treatment’, a MAR assumption entails estimating the effect of treatment had all participants remained on their assigned treatment. However, for a de facto type estimand (such as ITT), it may be more appropriate to assume that data are MNAR. In this situation, reference based sensitivity analyses have been proposed, which at present require the use of MI.

Interaction terms are not suggested.

Although the bias of MI overall could be eliminated by including the interaction term in the imputation model (results not shown), this may not be an obvious strategy if subgroup analyses are not of interest.

Simon Grund et al (2018) Multiple Imputation of Missing Data for Multilevel Models: Simulations and Recommendations

A crucial point in the application of MI to multilevel data is that the imputation model not only includes all relevant variables, but also that it “matches” the model of interest (i.e., the substantive analysis model; see Meng, 1994; Schafer, 2003). In other words, the imputation model must capture the relevant aspects of the analysis model, making the imputation model at least as general as (or more general than) the analysis model. If the imputation model is more restrictive than the analysis model, then imputations are generated under a simplified set of assumptions, and the results of subsequent analyses may be misleading.

Protocol for: Hatemi G, Mahr A, Ishigatsubo Y, et al. Trial of apremilast for oral ulcers in Behçet’s syndrome. N Engl J Med 2019;381:1918-28. DOI: 10.1056/NEJMoa1816594


Sunday, November 29, 2020

Handling of Missing Data: Comparison of MMRM (mixed model repeated measures) versus MI (multiple imputation)

The longitudinal study has become one of the most commonly adopted designs in clinical trials. Because the outcome measures are collected at multiple visits, some subjects will usually have no outcome measure at some visits (for example, after they drop out of the study or are lost to follow-up); this is where the missing data issue arises. If the outcome measure is a continuous variable, missing data can be handled implicitly through a mixed model for repeated measures (MMRM) or explicitly through multiple imputation (MI).

Both MMRM and MI are model-based approaches that rely on the missing at random (MAR) assumption, and both are suggested by EMA's Guideline on Missing Data in Confirmatory Clinical Trials and by the US National Research Council report The Prevention and Treatment of Missing Data in Clinical Trials. US FDA has not issued any guidance on handling missing data in clinical trials, but generally follows the recommendations of the National Research Council.

Which of the two, MMRM or MI, should be the primary method for handling missing data? In the US, MMRM appears to have long been the preferred method for handling missing data and analyzing longitudinal data with continuous outcome measures. MI methods are generally used as sensitivity analyses to check the robustness of the primary analysis to departures from the MAR assumption. This can be seen in the article by Dr. Siddiqui of the FDA, "MMRM versus MI in Dealing with Missing Data - a Comparison Based on 25 NDA data sets", and in many NDA / BLA reviews (listed below).

FDA Statistical Review for NDA 210655 in the indication of Schizophrenia:
"The primary analysis was conducted on the change from baseline in the total PANSS score at Day 57 (primary time point) based on the ITT population. A mixed-effects model for repeated measures (MMRM) was used with treatment, visit, interaction of treatment and visit as fixed effects and the baseline total PANSS score as a covariate. Data from Days 15, 29, 43, and 57 were used. The unstructured covariance matrix was used to model the within-subject variance-covariance errors."

"In addition to the model-based missing data approach of the MMRM model, the primary efficacy analysis was also analyzed using a pattern mixture model (PMM) and a multiple imputation approach as sensitivity analyses. "

FDA BLA 761037 Kevzara (sarilumab) in Treatment of rheumatoid arthritis
"The continuous HAQ-DI change from baseline at Week 16 was analyzed with a mixed model for repeated measures (MMRM). The repeated-measures analysis was based on the restricted maximum likelihood method assuming an unstructured covariance structure to model the within-subject errors. The model, including treatment, region, prior biologic use, visit (all visits from week 2 to week 16), and treatment-by-visit interaction as fixed effects and baseline as a covariate, was used to test the difference between each active treatment group versus placebo in the change from baseline in HAQ-DI at Week 16. The data collected after treatment discontinuation or rescue were set to missing. Therefore, the MMRM analysis assumed a missing-at-random (MAR) mechanism for missing data due to dropout and post-rescue data."
FDA NDA 203313/203314 S-2/S-3, Tresiba; Ryzodeg 70/30, Glycemic Control in Patients with Diabetes:
The applicant used a mixed effect model for repeated measure (MMRM) to assess the efficacy of IDegAsp compared with IDet. The MMRM model included treatment, sex, region, age group and visits as factors and baseline as covariate, and interactions between visits and all factors and covariate. An unstructured covariance matrix was utilized for model fitting.

Multiple imputation was performed as a sensitivity analysis.

sNDA for Merck's Dulera in the treatment of asthma (2019):

"Missing Data Handling and Sensitivity Analyses The primary analysis incorporated a control-based multiple imputation of missing data. Missing data for subjects who discontinued treatment early were estimated using the MF group; that is, the change from baseline AM post-dose ppFEV1 in patients who discontinued treatment and missed study visits was assumed to be similar to the change from baseline in patients who continued study visits through Week 12 in the MF treatment group. The dataset was first multiply imputed to have monotone missing patterns, then for each visit, a regression method was used to impute for missing data on both study drug arm and the control arm based on trend from the control arm. After applying the control-based multiple imputation, the cLDA analysis was performed. MF/F 100/10 mcg BID was considered superior to MF 100 mcg BID with a p-value less than 0.05. "

EMA seems to have a different opinion about handling missing data with MMRM or MI. On several occasions, we have heard that EMA prefers the MI approach for handling missing data, especially reference-based multiple imputation, and that it is moving towards making reference-based multiple imputation the new standard missing data approach.

Here is a table summarizing some comparisons between the MMRM and MI in handling the missing data. 




Missing data mechanism
  • MMRM and MI: MAR (missing at random)

Missing data imputation
  • MMRM: Individual missing values are not imputed, but missing data are implicitly imputed
  • MI: Individual missing values are explicitly imputed

Number of steps for calculations
  • MMRM: One step
  • MI: At least three steps: (1) an imputation model to create multiple data sets with missing values filled in; (2) an analysis model to analyze each imputed data set; (3) Rubin's rules to combine the results for inference

Analysis model
  • MMRM: Mixed model with maximum likelihood-based method
  • MI: Analysis of covariance or mixed model using maximum likelihood-based method

Data points used in analyses
  • MMRM: All observed data points from all visits are used
  • MI: Usually, with ANCOVA, only the data points for the corresponding visit (with imputed values) are used

SAS procedure(s)
  • MMRM: Proc Mixed
  • MI: Imputation model: Proc MI; analysis model: Proc Mixed, Proc GLM, Proc Genmod,…; Rubin's rules: Proc MIANALYZE

The two approaches will be approximately equivalent, provided the variables used in the imputation model are the same as those included in the analysis model, and conditionals are accommodated by a single joint model. In such settings, MI essentially provides an approximation to the observed likelihood analysis. If an infinite number of imputations could be performed, then the two approaches would be equivalent. In practice, the level of equivalence will depend on the number of imputations due to the Monte Carlo (simulation) sampling variability of the imputation process (described in more detail below), thus will be stronger for a larger number of imputations.

Auxiliary variables
  • MMRM: Cannot be used
  • MI: Auxiliary variables can be used in the imputation model to improve the accuracy of the missing data prediction

Information observed post-randomization
  • MMRM: Cannot be included in the MMRM model
  • MI: Can be included in the imputation model to improve the accuracy of the missing data prediction, but cannot be included in the analysis model (the MI approach allows the covariates used in the imputation model to differ from those used in the analysis model)

Justification of MAR assumption
  • MMRM: Not available through the MMRM model
  • MI: Can be assessed through the tipping point approach or delta-based imputation

Handling MNAR (missing not at random)
  • MMRM: Not directly available through MMRM
  • MI: Can be performed through PMM (pattern mixture model), reference-based or control-based multiple imputation

For studies with only one post-baseline measure
  • MMRM: Not appropriate
  • MI: Appropriate; MI is used to impute the missing data, and an analysis of covariance model is then run as the analysis model

For outcome measures that are not continuous variables
  • MMRM: Like MMRM, there are statistical approaches that handle missing data without employing explicit imputation. As mentioned in the EMA guideline, “For categorical responses and count data, the so-called marginal (e.g. generalized estimating equations (GEE)) and random-effects (e.g. generalized linear mixed models (GLMM)) approaches are in use. Likelihood-based methods (MMRM and GLMM) and some extended GEE (i.e. weighted GEE) models are applicable under MCAR and MAR assumptions.”
  • MI: The MI approach can be easily applied to outcome measures that are categorical responses or count data with missing data. The analysis model may need to be PROC LOGISTIC, PROC GLIMMIX, PROC NLMIXED, or …

Preferred by regulatory agencies

but with multiple imputation approaches as sensitivity analyses (for example, reference-based MI, PMM, tipping point)
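
To make the last of the three MI steps concrete, here is a minimal Python sketch of Rubin's rules, the combining step that Proc MIANALYZE carries out in SAS. It is an illustration only, not the SAS implementation: the function name pool_rubin and the five treatment differences and standard errors are hypothetical, and the degrees of freedom use Rubin's original large-sample formula rather than any small-sample adjustment.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m estimates and their squared standard errors with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b    # total variance
    # Rubin's large-sample degrees of freedom; infinite if the estimates do not vary
    df = math.inf if b == 0 else (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2
    return q_bar, math.sqrt(t), df


# Hypothetical treatment differences and standard errors from m = 5 imputed data sets
estimates = [-4.0, -5.0, -6.0, -5.0, -5.0]
std_errors = [2.0, 2.0, 2.0, 2.0, 2.0]

pooled, pooled_se, df = pool_rubin(estimates, [se ** 2 for se in std_errors])
print(f"pooled estimate = {pooled:.2f}, SE = {pooled_se:.3f}, df = {df:.1f}")
```

The pooled standard error is larger than any single-imputation standard error because the between-imputation variance is added on top of the average within-imputation variance; that extra term is how MI carries the uncertainty due to the missing values into the final inference.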



Saturday, November 14, 2020

Words Ended with '-demic': Pandemic, Epidemic, Endemic, Twindemic, and Infodemic

This year, during the COVID-19 pandemic, we have heard plenty of words ending with '-demic'; some are old and some are new. A common saying is that "the epidemic (or outbreak) is inevitable, but the pandemic is optional".

An epidemic is defined as “an outbreak of disease that spreads quickly and affects many individuals at the same time.” According to the CDC, "Epidemic refers to an increase, often sudden, in the number of cases of a disease above what is normally expected in that population in that area."

Epidemics occur when an agent and susceptible hosts are present in adequate numbers, and the agent can be effectively conveyed from a source to the susceptible hosts. More specifically, an epidemic may result from:
  • A recent increase in amount or virulence of the agent,
  • The recent introduction of the agent into a setting where it has not been before,
  • An enhanced mode of transmission so that more susceptible persons are exposed,
  • A change in the susceptibility of the host response to the agent, and/or
  • Factors that increase host exposure or involve introduction through new portals of entry.

A pandemic is a type of epidemic (one with greater range and coverage), an outbreak of a disease that occurs over a wide geographic area and affects an exceptionally high proportion of the population. Pandemic refers to an epidemic that has spread over several countries or continents, usually affecting a large number of people. WHO defines it simply: "A pandemic is the worldwide spread of a new disease."

While a pandemic may be characterized as a type of epidemic, you would not say that an epidemic is a type of pandemic.

WHO classifies a pandemic into six phases, followed by a post-peak period and a post-pandemic period, and each phase calls for different actions.

The WHO is responsible for announcing the emergence of a new pandemic based on how the spread of the disease fits into the following 6 phases:
  • Phase 1. Viruses circulating among animal populations haven’t been shown to transmit to human beings. They’re not considered a threat and there’s little risk of a pandemic.
  • Phase 2. A new animal virus circulating among animal populations has been shown to transmit to human beings. This new virus is considered a threat and signals the potential risk of a pandemic.
  • Phase 3. The animal virus has caused disease in a small cluster of human beings through animal to human transmission. However, human to human transmission is too low to cause community outbreaks. This means that the virus places humans at risk but is unlikely to cause a pandemic.
  • Phase 4. There has been human-to-human transmission of the new virus in considerable enough numbers to lead to community outbreaks. This kind of transmission among humans signals a high risk of a pandemic developing.
  • Phase 5. There has been transmission of the new virus in at least two countries within one WHO region. Even though only two countries have been affected by the new virus at this point, a global pandemic is inevitable.
  • Phase 6. There has been transmission of the new virus in at least one additional country in a different WHO region. This is known as the pandemic phase and signals that a global pandemic is currently occurring.

Endemic refers to the constant presence and/or usual prevalence of a disease or infectious agent in a population within a geographic area. Endemic is a characteristic of a particular population, environment, or region. Examples of endemic diseases include chicken pox, which occurs at a predictable rate among young school children in the United States, and malaria in some areas of Africa. The disease is present in a community at all times but at a relatively low frequency. An endemic disease may become pandemic - for example, HIV infection/AIDS used to be endemic in the sub-Saharan Africa region and grew into a pandemic in the 1980s.

Hyperendemic refers to persistent, high levels of disease occurrence.

A twindemic is a new word coined this year; it refers to the possibility of a severe flu season coinciding with a surge in COVID-19 cases. Even a mild flu season is concerning, given that the inevitable serious cases of the flu tax the medical system each year.

An infodemic is an overabundance of information, both online and offline. It includes deliberate attempts to disseminate wrong information to undermine the public health response and advance alternative agendas of groups or individuals. Mis- and disinformation can be harmful to people’s physical and mental health; increase stigmatization; threaten precious health gains; and lead to poor observance of public health measures, thus reducing their effectiveness and endangering countries’ ability to stop the pandemic.

Misinformation costs lives. Without the appropriate trust and correct information, diagnostic tests go unused, immunization campaigns (or campaigns to promote effective vaccines) will not meet their targets, and the virus will continue to thrive.


Saturday, November 07, 2020

The Saga of Biogen’s Alzheimer Drug Aducanumab

Aducanumab is an investigational compound being studied for the treatment of early Alzheimer’s disease co-developed by Biogen and Eisai. Aducanumab is a human immunoglobulin gamma 1 (IgG1) anti‐amyloid beta monoclonal antibody (mAb) targeting aggregated forms of amyloid beta - a fundamental pathological hallmark of the disease.

After the ‘successful’ phase I study (PRIME trial) demonstrated that aducanumab had an acceptable safety and tolerability profile and reduced brain amyloid-beta, accompanied by a slowing of clinical decline as measured by Clinical Dementia Rating-Sum of Boxes (CDR-SB) and Mini-Mental State Examination (MMSE) scores, Biogen designed two pivotal studies (ENGAGE and EMERGE) to evaluate the efficacy and safety of aducanumab in patients with mild cognitive impairment due to Alzheimer’s disease and mild Alzheimer’s disease dementia.

Then the saga began,… a roller-coaster year in 2019.

In March 2019, Biogen and Eisai announced the discontinuation of the Phase 3 ENGAGE and EMERGE trials of aducanumab in Alzheimer’s disease after the interim analyses found that the futility boundaries had been crossed. The independent data monitoring committee advised that aducanumab would be unlikely to meet the primary endpoints even if the studies were continued to completion.

While everybody thought that aducanumab was dead in the water, Biogen unexpectedly announced that it planned a regulatory filing for aducanumab in Alzheimer’s disease based on a new analysis of a larger data set from the Phase 3 studies.

On July 08, 2020, Biogen announced that they had completed the submission of a Biologics License Application (BLA) to the U.S. Food and Drug Administration (FDA) for the approval of aducanumab, an investigational treatment for Alzheimer's disease.

Given that their BLA submission was based on the re-analyses from studies that had been discontinued due to futility, the common understanding would be that the FDA would reject their BLA and require them to do another Phase 3 study.

Then came last week,… a roller-coaster week.

For a controversial drug application, FDA will usually convene an advisory committee meeting to seek opinions from outside experts, including representatives from patient organizations, patient advocacy groups, and consumer groups. The BLA for aducanumab was no exception. A Peripheral and Central Nervous System (PCNS) Drugs Advisory Committee meeting was scheduled for November 6, 2020, and the meeting materials were posted online two days before the meeting, on November 4, 2020.

The documents released by FDA came as a shock to outsiders. With one negative trial and one trial that was very positive in one of the dose groups, one would expect the FDA to take a negative tone and demand a third trial to confirm the efficacy. Usually, the sponsor tries everything to convince FDA that the drug is efficacious and safe, while FDA is conservative and pokes and probes the data to identify issues that undercut the sponsor’s claims about efficacy and safety.

However, this time for aducanumab, FDA is on the sponsor’s side. The briefing book from the FDA (actually combined FDA and Biogen Briefing Information) depicted a very rosy picture for aducanumab’s efficacy and safety.

With FDA’s backing, one would think that aducanumab is on its way to get a positive opinion from advisory committee members and eventually to be the first FDA approved novel medication for the treatment of Alzheimer’s disease since 2004.

Then the shocking news continued,…

At Friday’s advisory committee meeting, committee members resoundingly concluded that the clinical data did not support the approval of Biogen’s much-watched Alzheimer’s drug, aducanumab, while delivering a rebuke to the Food and Drug Administration, whose reviewers had given the medicine a glowing appraisal.

Here are the voting results:

FDA’s Questions to the Advisory Committee

Voting Results




Does Study 302, viewed independently and without regard for Study 301, provide strong evidence that supports the effectiveness of aducanumab for the treatment of AD?




Does Study 103 provide supportive evidence of the effectiveness of aducanumab for the treatment of AD?




Has the applicant presented strong evidence of a pharmacodynamic effect on AD pathophysiology?




In light of the understanding provided by the exploratory analyses of Study 301 and Study 302, along with the results of Study 103 and evidence of a pharmacodynamic effect on AD pathophysiology, is it reasonable to consider Study 302 as primary evidence of effectiveness of aducanumab for the treatment of AD?




Note: Study 301 was the pivotal study (ENGAGE trial) with the negative outcome; Study 302 was the pivotal study (EMERGE trial) with a positive outcome in high dose group; Study 103 was the phase I proof-of-concept study (PRIME trial).


It is not clear where aducanumab will go from here. It would be another shock if FDA approved aducanumab for the treatment of Alzheimer’s disease given the extremely negative views and voting outcome from the advisory committee panel, even though everybody understands there is a huge, urgent, unmet need for a new AD drug. As one of the experts said, "with FDA's reputation already in a precarious position, it could be difficult -- maybe impossible -- to go against an expert panel at this time no matter how badly they want this". The best path forward would be to conduct a third pivotal study if Biogen has confidence in aducanumab. As the Chinese idiom goes, true gold fears no fire.

The combined FDA and Biogen briefing book revealed discordant views among FDA reviewers about aducanumab's efficacy and the study issues. The conclusions from the clinical reviewer and the statistical reviewers were dramatically different. As one of the panel members commented, “It feels like the audio and video on TV are out of sync”. Unfortunately, in the overall conclusion and the FDA’s presentation, the clinical reviewer’s opinion trumped the statistical reviewer’s opinion. The statistical reviewer wasn’t even given an opportunity to present at the advisory committee meeting.

Here is the conclusion from the clinical reviewer:

“……the applicant has provided substantial evidence of effectiveness to support approval. Study 302 provided the primary evidence of effectiveness as robust and exceptionally persuasive study demonstrating a treatment effect on a clinically meaningful endpoint and reinforced by effects on secondary endpoints, biomarkers, and in relevant subgroups. Study 103 was an adequate and well-controlled study that included design components consistent with Study 302 and demonstrated a persuasive treatment effect on both clinical endpoints. The dose-response relationship for Aβ reduction provides support for the positive finding in the 10 mg/kg treatment arm to the apparently dose-related effects observed on clinical outcomes in Studies 103 and 302. Study 301 does not contribute to the evidence of effectiveness. The results of exploratory analyses, however, contribute to the overall understanding of Study 301 and together do not meaningfully detract from the persuasiveness of Study 302.”

 Here is the conclusion from the statistical reviewers:

“In summary, the totality of the data does not seem to support the efficacy of the high dose. There is only one positive study at best and a second study which directly conflicts with the positive study. Both studies were not fully completed as they were terminated early for futility and had sporadic unblinding for dose management of ARIA cases which was much higher in the drug group(s). The Amyloid PET sub-study data suggested a larger effect in APOE- (non-carriers) which is the opposite of what was observed for the clinical outcome data. Within the high dose group at the patient level, there is no correlation between the Week 78 change in the primary biomarker Ab in the cerebellum and the Week 78 Change from baseline in CDR-SB. In study 302, the on-face positive study, the raw correlation had the wrong +/- sign to support a realistic link between biomarker and long-term clinical change in cognition/function as measured by CDR-SB. For these reasons, the reviewer believes there is no compelling substantial evidence of treatment effect or disease slowing and that another study is needed to confirm or deny the positive study and the negative study. "