Sunday, August 29, 2010
I recently noticed that for new drug approvals (NDAs) and biological product approvals (BLAs), information about the approval process is now published on FDA's website in a very timely fashion. Just a year ago, the FDA review/approval process for a new product was still not transparent to the public. We might be able to find some information in the label, the approval letter, and the SBA (summary basis of approval); however, these typically became available only months or years after the approval.
Now, for new approvals, not only the label, approval letter, and SBA, but also reviews from different perspectives (medical, statistical, pharmacology, environmental, CMC, ...) as well as the administrative documents and correspondence between the FDA and the sponsor may be posted on FDA's website. Also published is the list of FDA officers who participated in the review and the decision making. The individuals on the sponsor's side may also be named in some documents or correspondence.
This is clearly an outcome of FDA's Transparency Initiative: "In June 2009, Food and Drug Administration (FDA) Commissioner Dr. Margaret Hamburg launched FDA's Transparency Initiative and formed an internal task force to develop recommendations for making useful and understandable information about FDA activities and decision-making more readily available to the public, in a timely manner and in a user-friendly format."
To see these changes, we can simply take a look at two products recently approved by FDA: one by CDER and one by CBER. Don't forget to visit "Administrative Document(s) and Correspondence" or "Approval History, Letters, Reviews, and Related Documents".
This is a good sign that FDA's drug approval process is being demystified and is moving toward transparency.
Tuesday, August 17, 2010
LOCF, BOCF, WOCF, and MVTF
In clinical trials, subjects are usually followed for a period of time, with multiple measurements/assessments at various time points. It is very common for some subjects to discontinue from the study early for reasons such as 'lost to follow-up', 'withdrawal of consent', or 'adverse events'.
With an intention-to-treat population, imputation techniques are needed to deal with the early-termination subjects. While fancier techniques such as multiple imputation may be more statistically sound, some practical imputation techniques remain more popular. Here are some of them that I have used.
LOCF (last observation carried forward): this is probably the most common technique used in practice for handling missing data (especially for continuous measures). It is also the technique mentioned in ICH E9, "Statistical principles for clinical trials", which states: "...Imputation techniques, ranging from the carrying forward of the last observation to the use of complex mathematical models, may also be used in an attempt to compensate for missing data..."
LOCF can be easily implemented in SAS; see the SUGI paper titled "The DOW (not that DOW!!!) and the LOCF in Clinical Trials".
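For illustration, here is a minimal retain-based sketch (simpler than the DOW-loop approach in the paper). The dataset LONG and the variables SUBJID, VISIT, and SCORE are hypothetical names; LONG is assumed to contain one record per scheduled visit, sorted by SUBJID and VISIT, with SCORE missing after early termination.

data locf;
   set long;
   by subjid;
   retain score_locf;
   if first.subjid then score_locf = .;             /* reset the carried value for each new subject */
   if not missing(score) then score_locf = score;   /* otherwise the last observed value stays carried forward */
run;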
BOCF (baseline observation carried forward): this approach may be more conservative when symptoms gradually improve over the course of the study. I used this technique in several clinical trials testing an analgesic drug (pain killer) in dental surgery patients. At baseline, right after the dental surgery, the pain scale is at its worst; over time, the pain intensity is expected to decrease. In this situation, the BOCF technique is more conservative than LOCF. There is a web article examining the features of BOCF. BOCF, along with LOCF and a modified BOCF, is discussed in a recent FDA advisory committee meeting on Cymbalta for the treatment of chronic pain.
WOCF (worst observation carried forward): this approach is the most conservative compared to LOCF and BOCF. It has been used in analgesic drug trials as well as in trials with laboratory results as endpoints. For example, the WOCF technique is mentioned in the FDA Summary on Durolane.
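Both BOCF and WOCF can be sketched in one data step. In this hypothetical sketch, PAIN is a score where higher means worse, VISIT = 0 is the baseline, and the dataset LONG (sorted by SUBJID and VISIT, one record per scheduled visit) is an assumed name, not taken from the documents cited above.

data bocf_wocf;
   set long;
   by subjid;
   retain base worst;
   if first.subjid then do;
      base = .;
      worst = .;
   end;
   if visit = 0 then base = pain;            /* remember the baseline value */
   if not missing(pain) and (missing(worst) or pain > worst) then
      worst = pain;                          /* track the worst (highest) score so far */
   pain_bocf = coalesce(pain, base);         /* BOCF: impute the baseline value when missing */
   pain_wocf = coalesce(pain, worst);        /* WOCF: impute the worst value when missing */
run;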
LOCF, BOCF, and WOCF are handy techniques for continuous measures. For a trial with a dichotomous endpoint (success vs. failure; responder vs. non-responder), a technique called MVTF can be used. MVTF stands for 'missing value treated as failure'. For example, this technique is mentioned in the Statistical Review of NDA 21-385 in a dermatology indication. In one of the studies I participated in, we employed the same technique (even though we did not use the term MVTF), treating all subjects who discontinued from the study early as non-responders. This is a very conservative approach; the treatment effect may be diluted a little bit when this technique is applied.
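A minimal MVTF sketch, assuming a one-record-per-subject dataset RESP with a variable RESPONSE coded 1 for responder, 0 for non-responder, and missing for subjects who discontinued before the assessment (all names are hypothetical):

data mvtf;
   set resp;
   if missing(response) then response = 0;   /* missing value treated as failure (non-responder) */
run;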
There are many other techniques used in practice; some of them may just be different terms for the same technique. In the FDA Executive Summary prepared for the July 30, 2010 meeting of the Ophthalmic Devices Panel (P080030), the following imputation techniques are mentioned:
- Last Observation Carried Forward (LOCF) analysis
- Best Reasonable Case analysis
- Worst Reasonable Case analysis
- Non-Responder analysis
- Best Case analysis
- Worst Case analysis
It must be pointed out that these practical missing-data handling techniques have no rigorous statistical basis and have been criticized by many professionals, especially in academic settings. Techniques that seem very conservative may not be conservative in some situations.
Since LOCF is the most frequently used technique, the criticism usually centers on comparisons of LOCF with model-based techniques (for example, the mixed-effect model repeated measures (MMRM) model; a sketch follows the links below). Some of the comparisons and discussions can be found at:
- MMRM vs. LOCF: A Comprehensive Comparison Based on Simulation Study and 25 NDA Datasets
- Recommendations for the Primary Analysis of Continuous Endpoints in Longitudinal Clinical Trials
- LOCF and MMRM: Thoughts on Comparisons
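As a reference point for these discussions, here is a minimal MMRM sketch in SAS (my own illustration, not code from the references above); the dataset ONE and the variables SUBJID, TRT, VISIT, BASE (baseline value), and CHG (change from baseline) are hypothetical:

proc mixed data=one;
   class trt visit subjid;
   model chg = base trt visit trt*visit / ddfm=kr;   /* visit treated as categorical */
   repeated visit / subject=subjid type=un;          /* unstructured within-subject covariance */
   lsmeans trt*visit / diff cl;                      /* treatment comparisons at each visit */
run;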
Saturday, August 07, 2010
R-square for regression without intercept?
Sometimes, simple linear regression may not be so simple. One of the issues is deciding whether to fit the regression with or without an intercept. For regression without an intercept, the regression line goes through the origin; for regression with an intercept, it generally does not.
In clinical trials, we may need to fit regression models such as drug concentration vs. dose or AUC vs. trough concentration. Whether to include an intercept should rely on the scientific background, not purely on statistics. Using drug concentration vs. dose as an example: if there is no endogenous drug concentration, a regression model without an intercept makes sense; if there is an endogenous drug concentration, a regression model with an intercept is more appropriate, since the drug concentration is not zero when no dose is given.
In some situations, regression models are purely data-driven or empirical, and choosing a model with or without an intercept may not be easy. We recently had a real experience with this. With the same set of data, we fitted models with and without an intercept. We thought we could judge which model was better by comparing the R-square values, an indicator of goodness of fit. Surprisingly, the models without an intercept were always much better than the models with an intercept when judged by R-square. However, when we thought twice about it, we realized that in this situation R-square was no longer a good indicator of goodness of fit.
The problem is that a regression model without an intercept will almost always give a very high R-square. This is related to how the sums of squares are calculated: without an intercept, the total sum of squares is computed about zero rather than about the mean of the response, so the denominator of R-square is inflated and R-square is pushed toward 1. There are two excellent articles discussing this issue.
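The inflation can be demonstrated directly in SAS: with the NOINT option, PROC REG reports an R-square based on the uncentered total sum of squares (the sum of the squared y values) rather than the centered one (the sum of squared deviations from the mean), so the two R-square values are not comparable. A minimal sketch, with hypothetical dataset PK and variables CONC and DOSE:

proc reg data=pk;
   model conc = dose;           /* with intercept: R-square uses the centered total SS */
run;
proc reg data=pk;
   model conc = dose / noint;   /* without intercept: R-square uses the uncentered total SS */
run;
quit;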
Comparing treatment difference in slopes?
In a regulatory setting, can we show a treatment difference by comparing the slopes between two treatment groups?
In a COPD study (e.g., a two-arm, parallel-group design with the primary efficacy variable measured at baseline and every 6 months thereafter), one can fit a random coefficient model and compare the difference between the two treatment groups' slopes. Alternatively, we can compare the treatment difference in terms of change from baseline to the endpoint (the last measurement).
To test the difference in slopes, we need to test whether the treatment*time interaction term is statistically significant. The assumption is that at the beginning of the trial the intercepts for both groups are the same, i.e., both groups start at the same level. If the treatment slows the disease progression, the treatment group should then show a flatter slope than the placebo group. If all patients are followed to the end of the study and the slopes are different, the endpoint (change from baseline) analysis should also be statistically significant. However, if the sample size is not sufficiently large, the slope-comparison approach and the endpoint-analysis approach could give inconsistent results. For a given study, a decision has to be made as to which approach is the primary analysis; if we analyze the data using both approaches, we then need to deal with the adjustment for multiplicity.
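A minimal sketch of such a random coefficient model in SAS, assuming a longitudinal dataset ONE with hypothetical variables SUBJID, TRT, TIME (months since baseline, continuous), and CHG (change from baseline); the fixed-effect TRT*TIME term carries the test of the slope difference:

proc mixed data=one;
   class trt subjid;
   model chg = trt time trt*time / solution ddfm=kr;   /* trt*time tests the difference in slopes */
   random intercept time / subject=subjid type=un;     /* subject-specific intercepts and slopes */
run;

If the two groups are assumed to start at exactly the same level, the TRT main effect could be dropped from the model; keeping it is the more common, safer specification.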
I once commented that "some regulatory authorities may prefer the simpler endpoint analysis" and was then asked to provide references to support this statement. I did quite extensive research but could not find any directly relevant reference. However, in reviewing the 'statistical reviews' for BLAs and NDAs in the US, it is very rare to see a product approval based on the comparison of slopes; many product approvals are based on the comparison of 'change from baseline'.
Every indication has its own accepted endpoints, so tradition takes precedence. For example, in Alzheimer's disease there is a movement to look at differences in slopes, but this is driven by attempts to claim disease modification. Similarly, in the COPD area, some products pursue disease modification, and the treatment differences can be shown by comparing the slopes between treatment groups.
It seems true that the slope model (random coefficient model) may be preferred in academic settings, while the endpoint approach, change from baseline (with the last value carried forward), may be more practical in industry settings.
From the statistical point of view, the slope approach makes a lot of sense; however, we need to be cautious about some potential issues: 1. For some efficacy measures there may be a plateau; if the plateau is reached before the end of the study, there will be a loss of power when comparing slopes. 2. If the slope comparison is the primary efficacy analysis, the number of measurements per year on the primary efficacy variable is relevant. One may think that more frequent measurements will increase the power to show a treatment difference in slopes, so the question arises when designing the study: should we choose a shorter trial with more frequent measurements, or a longer trial with less frequent measurements?