Monday, January 11, 2021

Single Imputation Methods for Missing Data: LOCF, BOCF, LRCF (Last Rank Carried Forward), and NOCB (Next Observation Carried Backward)

Missing data is always an issue when analyzing data from clinical trials. Missing data handling has moved toward model-based approaches such as multiple imputation and mixed model repeated measures (MMRM). Single imputation methods, while heavily criticized and largely cast out, remain practical approaches for handling missing data, especially in sensitivity analyses.

Single imputation methods replace a missing data point with a single value, and analyses are then conducted as if all the data were observed. The single value used to fill in the missing observation usually comes from the observed values of the same subject: Last Observation Carried Forward (LOCF), Baseline Observation Carried Forward (BOCF), and Next Observation Carried Backward (NOCB, the focus of this post). The single value can also be derived from other sources: Last Rank Carried Forward (LRCF), best or worst case imputation (assigning the worst possible value of the outcome to subjects who drop out for a negative reason such as treatment failure, and the best possible value to subjects who drop out for a positive reason such as cure), mean value imputation, and trimmed mean. Single imputation approaches also include regression imputation, which imputes the predictions from a regression of the missing variables on the observed variables, and hot deck imputation, which matches the case with missing values to a case with observed values that is similar with respect to the observed variables and then imputes the observed values of that respondent.

In this post, we discuss the single imputation methods LOCF, BOCF, LRCF, and NOCB, with NOCB as the main focus.

Last Observation Carried Forward (LOCF): A single imputation technique that imputes the last measured outcome value for participants who either drop out of a clinical trial or for whom the final outcome measurement is missing. LOCF is usually used in longitudinal study designs where the outcome is measured repeatedly at pre-specified intervals. LOCF usually requires at least one post-baseline measure. LOCF is a widely used single imputation method.
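To make the LOCF rule concrete, here is a minimal pandas sketch. The data set, the column names (subject, visit, score), and the helper locf() are hypothetical and only for illustration. To respect the requirement of at least one post-baseline measure, this sketch carries forward only post-baseline values and leaves a visit missing if nothing post-baseline precedes it; that choice is an assumption, not a rule taken from any specific protocol.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format data: one row per subject per scheduled visit.
df = pd.DataFrame({
    "subject": [1, 1, 1, 1, 2, 2, 2, 2],
    "visit":   [0, 1, 2, 3, 0, 1, 2, 3],   # visit 0 = baseline
    "score":   [10.0, 12.0, np.nan, np.nan, 8.0, np.nan, 9.0, 11.0],
})

def locf(data, baseline_visit=0):
    """Carry the last observed post-baseline value forward within each subject."""
    out = data.sort_values(["subject", "visit"]).copy()
    # Hide baseline values so that they are never carried forward.
    post_baseline = out["score"].where(out["visit"] != baseline_visit)
    carried = post_baseline.groupby(out["subject"]).ffill()
    # Keep observed values; fill gaps with the carried-forward value.
    out["score_locf"] = out["score"].fillna(carried)
    return out

result = locf(df)
print(result)
```

In this toy data, subject 1 dropped out after visit 1, so visits 2 and 3 receive the visit 1 value. Subject 2 missed visit 1 and has no earlier post-baseline value, so that visit stays missing.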

Baseline Observation Carried Forward (BOCF): A single imputation technique that imputes the baseline outcome value for participants who either drop out of a clinical trial or for whom the final outcome measurement is missing. BOCF is usually used in a study design with perhaps only one post-baseline measure (i.e., the outcome is only measured at the baseline and at the end of the study).
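For the simple two-time-point design described above, BOCF reduces to a single fill operation. The sketch below uses a hypothetical wide-format data set with made-up column names (baseline, final).

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format data: baseline and end-of-study measurements only.
df = pd.DataFrame({
    "subject":  [1, 2, 3],
    "baseline": [50.0, 42.0, 61.0],
    "final":    [44.0, np.nan, 58.0],   # subject 2 has no end-of-study value
})

# BOCF: replace a missing final value with the subject's own baseline value.
df["final_bocf"] = df["final"].fillna(df["baseline"])

# Change from baseline after imputation.
df["change_bocf"] = df["final_bocf"] - df["baseline"]
print(df)
```

Note that an imputed subject always ends up with a change from baseline of exactly zero, which is the implicit assumption behind BOCF.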

Last Rank Carried Forward (LRCF): The LRCF method carries the rank of a subject's last observed value, computed at the visit where it was observed, forward to the last visit. It is the non-parametric version of LOCF. However, unlike LOCF, which relies only on observations from the same subject, the ranks in LRCF are derived from all subjects with non-missing observations at a specific visit. Because the number of missing values differs from the early visits to the later visits, the ranking, carrying forward, and re-ranking steps must be repeated from visit to visit. Here are some good references for LRCF:
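The rank, carry forward, and re-rank steps of LRCF can be sketched as follows. This is only one plausible reading of the procedure: the standardization rank / (n + 1), where n is the number of subjects observed at that visit, is an assumption made here so that ranks from visits with different numbers of observed subjects are comparable. The data and visit names (v1, v2, v3) are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format data: rows are subjects, columns are visits.
wide = pd.DataFrame(
    {
        "v1": [300.0, 350.0, 400.0, 280.0],
        "v2": [320.0, 340.0, np.nan, 290.0],
        "v3": [330.0, np.nan, np.nan, 285.0],
    },
    index=pd.Index([1, 2, 3, 4], name="subject"),
)

# Step 1: rank the observed values within each visit (missing values stay missing)
# and standardize by the number of subjects observed at that visit (assumption).
n_observed = wide.notna().sum(axis=0)
visit_ranks = wide.rank(axis=0) / (n_observed + 1)

# Step 2: carry each subject's last available standardized rank forward.
carried = visit_ranks.ffill(axis=1)

# Step 3: re-rank all subjects at the final visit using the carried ranks.
final_rank = carried["v3"].rank()
print(final_rank)
```

The resulting final-visit ranks could then be fed into a rank-based test, which is where the non-parametric character of the method comes from.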

LRCF is thought to have the following features:

In a paper by Jing et al, the LRCF was used for missing data imputation: 

"...The last rank carried forward or last observation carried forward was assigned to patients who withdrew prematurely from the study or study drug for other reasons or who did not perform the 6-minute walk test for any reason not mentioned above (eg, missed visit), provided that the patient performed at least 1 postbaseline 6-minute walk test."

Next Observation Carried Backward (NOCB): NOCB is a similar approach to LOCF but works in the opposite direction by taking the first observation after the missing value and carrying it backward. NOCB may also be called Next Value Carried Backward (NVCB) or Last Observation Carried Backward (LOCB).
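A minimal pandas sketch of NOCB on scheduled visits is shown below, using a hypothetical weekly score with made-up column names. A missing value is filled with the subject's next observed value; a missing value with no later observation is left missing.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format data: weekly scores for two subjects.
df = pd.DataFrame({
    "subject": [1, 1, 1, 1, 2, 2, 2, 2],
    "week":    [1, 2, 3, 4, 1, 2, 3, 4],
    "score":   [np.nan, np.nan, 5.0, 4.0, 6.0, np.nan, 3.0, np.nan],
})

df = df.sort_values(["subject", "week"])

# NOCB: within each subject, fill a gap with the next observed value.
df["score_nocb"] = df.groupby("subject")["score"].bfill()
print(df)
```

Subject 1's missing weeks 1 and 2 take the week 3 value. Subject 2's week 4 has no later observation and therefore remains missing.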

NOCB may be useful in handling missing data arising from external control groups, Real-World Data (RWD), and electronic health records (EHRs), where outcome data collection is usually unstructured and does not follow a pre-specified visit schedule.

I can foresee that NOCB may also be an approach for handling missing data due to the COVID-19 pandemic. Because of the pandemic, subjects may not be able to come to the clinic for the outcome measure at the end of the study, and the outcome measure may instead be performed at a later time, beyond the allowed visit window. Instead of treating the end-of-study visit as missing, the NOCB approach can be applied to carry the next available outcome measure backward.
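When measurements are date-based rather than tied to scheduled visits, the same idea can be sketched with pandas.merge_asof using direction="forward", which picks the first measurement on or after a target date. Everything in this example (the data, the column names, the target dates) is hypothetical, and the targets table is assumed to contain only subjects whose in-window end-of-study measurement is missing.

```python
import pandas as pd

# Hypothetical planned end-of-study dates for subjects with no in-window visit.
targets = pd.DataFrame({
    "subject": [1, 2, 3],
    "target_date": pd.to_datetime(["2020-04-01", "2020-04-01", "2020-04-15"]),
})

# Hypothetical measurements actually obtained, on whatever dates they occurred.
obs = pd.DataFrame({
    "subject": [1, 1, 2, 3],
    "obs_date": pd.to_datetime(
        ["2020-01-10", "2020-06-20", "2020-05-18", "2020-02-01"]
    ),
    "value": [410.0, 395.0, 360.0, 500.0],
})

# For each target date, take the subject's first measurement on or after it.
# Both frames must be sorted by their date keys for merge_asof.
nocb = pd.merge_asof(
    targets.sort_values("target_date"),
    obs.sort_values("obs_date"),
    left_on="target_date",
    right_on="obs_date",
    by="subject",
    direction="forward",
)
print(nocb)
```

Subject 3 has no measurement after the target date, so no value is imputed. merge_asof also accepts a tolerance argument (for example a pd.Timedelta) if the carried-back measurement should not be too far from the planned date; how far is acceptable would be a study-specific decision.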

The NOCB approach, while not popular, can be found in some publications and regulatory approval documents. Here are some examples: 


In the article by Wyles et al (2015, NEJM), Daclatasvir plus Sofosbuvir for HCV in Patients Coinfected with HIV-1: "Missing response data at post-treatment week 12 were inferred from the next available HCV RNA measurement with the use of a next-value-carried-backward approach."

In BLA 761052 for Brineura (cerliponase alfa) injection, indicated for late-infantile neuronal ceroid lipofuscinosis type 2 (CLN2), a form of Batten disease, NOCB was used to handle the missing data for comparison with data from a natural history study:

Because intervals between clinical visits vary a lot in Study 901, the agency recommended performing analyses using both the last available Motor score and next observation carried backward (NOCB) for the intermediate data points although the former one is determined as the primary. 

In the FDA Briefing Document for the Endocrinologic and Metabolic Drugs Advisory Committee Meeting for NDA 210645, Waylivra (volanesorsen) injection for the treatment of familial chylomicronemia syndrome, NOCB was used as one of the sensitivity analyses:

Similar planned (prespecified) analyses using different variables, such as slightly different endpoint definitions (e.g. worst maximum pain intensity versus average maximum pain intensity), or imputation methods for missing data (next observation carried backward versus imputation of zero for missing values) did not demonstrate treatment differences.

 Missing values were pre-specified to be imputed using Next Observation Carried Back (NOCB); i.e., if a patient did not complete the questionnaire for several weeks, the next value entered was assumed to have occurred during all intervening (missing) weeks.

 Missing data for any post-baseline visit will be imputed by using Next Observation Carried Back (NOCB) if there is a subsequent score available. Missing data after the last available score of each patient will not be imputed.

In NDA 212157 for Celecoxib Oral Solution for the treatment of acute migraine, NOCB was used for a sensitivity analysis:

Headache Pain Freedom at 2 hours - Sensitivity Analysis

To analyze the missing data for the primary endpoint, Dr. Ling performed an analysis analyzing patients who took rescue medications as nonresponders and then also imputing missing data at the 2-hour time point using the next available time point of information (Next Observation Carried Backward (NOCB)) or a worst-case type of imputation (latter not shown in table).

Single imputation methods are generally not recommended for the primary analysis because of the following disadvantages (issues): 

  • Single imputation usually does not provide an unbiased estimate.
  • Inferences (tests and confidence intervals) based on the filled-in data can be distorted by bias if the assumptions underlying the imputation method are invalid
  • Statistical precision is overstated because the imputed values are assumed to be true.
  • Single imputation methods risk biasing the standard error downwards by ignoring the uncertainty of imputed values. Therefore, the confidence intervals for the treatment effect calculated using single imputation methods may be too narrow and give an artificial impression of precision that does not really exist.  
  • Single imputation methods such as LOCF, NOCB, and BOCF do not reflect MAR (missing at random) data mechanisms.
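The overstated precision mentioned in the list above can be illustrated with a small numerical sketch. It uses mean value imputation because the arithmetic is easy to follow; the numbers and the helper se_mean() are made up for illustration, and the size of the effect will differ by method and data.

```python
import numpy as np

def se_mean(x):
    """Naive standard error of the mean: sample SD / sqrt(n)."""
    return x.std(ddof=1) / np.sqrt(len(x))

# Hypothetical outcome values for the 6 subjects who were actually observed.
observed = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
n_missing = 4  # subjects with no outcome value

# Single (mean) imputation: fill every missing value with the observed mean.
completed = np.concatenate([observed, np.full(n_missing, observed.mean())])

se_observed = se_mean(observed)    # based on the 6 real data points
se_imputed = se_mean(completed)    # treats the 4 imputed values as real data

print(f"SE from observed data only: {se_observed:.3f}")
print(f"SE after single imputation: {se_imputed:.3f}")
```

The standard error after imputation is smaller even though no new information was added: the imputed values increase n and, in this case, also shrink the sample variance. Multiple imputation addresses this by propagating the uncertainty of the imputed values into the standard error.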

