Saturday, August 07, 2010

Comparing treatment difference in slopes?

In a regulatory setting, can we show a treatment difference by comparing the slopes of two treatment groups?

In a COPD study (e.g., a two-arm, parallel-group study with the primary efficacy variable measured at baseline and every 6 months thereafter), one can fit a random coefficient model and compare the slopes of the two treatment groups. Alternatively, we can compare the treatments in terms of change from baseline to the endpoint (the last measurement).
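For readers who want the model written out, one common form of the random coefficient model is sketched below. The notation is an assumption made here for illustration and is not taken from any specific protocol: Y_ij is the efficacy measure for patient i at visit j, t_ij is time since baseline, and G_i is 1 for the active arm and 0 for placebo.

```latex
Y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij}
         + \beta_2 G_i + \beta_3 G_i t_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix} \sim N(\mathbf{0}, D),
\quad
\varepsilon_{ij} \sim N(0, \sigma^2)
```

In this parameterization the placebo slope is \beta_1, the active-arm slope is \beta_1 + \beta_3, and the treatment difference in slopes is \beta_3, so the slope comparison amounts to testing H_0: \beta_3 = 0. The change-from-baseline analysis instead compares Y at the last visit minus Y at baseline between the two arms.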

To test the difference in slopes, we test whether the treatment*time interaction term is statistically significant. The assumption is that the intercepts of the two groups are the same at the beginning of the trial, i.e., both groups start at the same level. If the treatment slows disease progression, the treatment group should then show a smaller slope than the placebo group. If all patients are followed to the end of the study and the slopes are different, the endpoint (change from baseline) analysis should also show a difference. However, if the sample size is not sufficiently large, the slope comparison and the endpoint analysis can give inconsistent results. For a given study, a decision has to be made about which approach defines the primary endpoint. If we analyze the data using both approaches, we then need to adjust for multiplicity.
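The idea of comparing slopes can be illustrated with a small simulation. Note that this is not the random coefficient model itself: it is a simpler two-stage (summary measures) approximation, in which each patient's slope is estimated by least squares and the per-patient slopes are then compared between arms with a Welch t statistic. All numbers (visit schedule, decline rates, variances, sample size) and all function names are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import mean, variance


def ols_slope(times, values):
    """Least-squares slope of values on times for a single patient."""
    t_bar = mean(times)
    y_bar = mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    return sxy / sxx


def welch_t(a, b):
    """Welch t statistic for mean(a) - mean(b), unequal variances allowed."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def simulate_patient(rng, mean_slope, times):
    """Simulate one patient's trajectory (hypothetical parameter values)."""
    intercept = rng.gauss(1.2, 0.3)      # same baseline distribution in both arms
    slope = rng.gauss(mean_slope, 0.02)  # patient-specific rate of decline per year
    return [intercept + slope * t + rng.gauss(0, 0.05) for t in times]


rng = random.Random(2010)
visit_times = [0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]  # years: baseline, then every 6 months

# Stage 1: one estimated slope per patient.
placebo_slopes = [ols_slope(visit_times, simulate_patient(rng, -0.060, visit_times))
                  for _ in range(200)]
active_slopes = [ols_slope(visit_times, simulate_patient(rng, -0.045, visit_times))
                 for _ in range(200)]

# Stage 2: compare the two sets of slopes.
slope_difference = mean(active_slopes) - mean(placebo_slopes)
t_statistic = welch_t(active_slopes, placebo_slopes)
print(f"difference in mean slopes: {slope_difference:.4f} per year")
print(f"Welch t statistic: {t_statistic:.2f}")
```

With complete follow-up, this two-stage comparison and the interaction test from a random coefficient model address the same question. They can differ when patients drop out or have unequal numbers of visits, because the mixed model weights patients by how precisely their slopes are estimated.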

I once commented that "some regulatory authorities may prefer the simpler endpoint analysis" and was then asked to provide references to support this statement. I searched quite extensively but could not find any truly relevant reference. However, judging from the statistical reviews of BLAs and NDAs in the US, it is very rare to see a product approved on the basis of a comparison of slopes. Many product approvals are based on the comparison of change from baseline.

Every indication has its own accepted endpoints, so tradition takes precedence. For example, in Alzheimer's disease there is a movement toward looking at differences in slopes, but this is driven by the attempt to claim disease modification. Similarly, in the COPD area, for products aiming at disease modification, the treatment difference can be shown by comparing the slopes between treatment groups.

It seems to be true that the slope model (random coefficient model) may be preferred in the academic setting, but the endpoint approach, change from baseline (with last value carried forward), may be more practical in the industry setting.

From the statistical point of view, the slope approach makes a lot of sense. However, we need to be cautious about some potential issues:

  1. Some efficacy measures may reach a plateau. If the plateau is reached before the end of the study, there will be a loss of power in comparing slopes.
  2. If the slope comparison is used as the primary efficacy measure, the number of measurements per year on the primary efficacy variable is relevant. One may think that more frequent measurements will increase the power to show the treatment difference in slopes. The question arises when designing the study: should we choose a shorter trial with more frequent measurements, or a longer trial with less frequent measurements?
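The design question above can be explored with a small calculation. For a single patient with independent residual errors of variance sigma^2, the variance of the least-squares slope is sigma^2 divided by the sum of squared deviations of the visit times from their mean. The sketch below compares that sum for three hypothetical visit schedules. It is a simplification: it ignores between-patient variation in slopes, dropout, and departures from linearity, all of which matter in a real trial.

```python
def time_sum_of_squares(times):
    """Sum of squared deviations of visit times from their mean (Sxx).

    Under the simplifying assumptions stated above, the variance of one
    patient's least-squares slope is sigma^2 / Sxx, so a larger value
    means a more precisely estimated slope.
    """
    t_bar = sum(times) / len(times)
    return sum((t - t_bar) ** 2 for t in times)


# Hypothetical schedules (times in years, baseline visit included).
two_years_quarterly = [i * 0.25 for i in range(9)]    # 9 visits over 2 years
two_years_monthly = [i / 12 for i in range(25)]       # 25 visits over 2 years
four_years_semiannual = [i * 0.5 for i in range(9)]   # 9 visits over 4 years

for label, schedule in [
    ("2 years, every 3 months", two_years_quarterly),
    ("2 years, every month", two_years_monthly),
    ("4 years, every 6 months", four_years_semiannual),
]:
    print(f"{label}: {len(schedule)} visits, Sxx = {time_sum_of_squares(schedule):.2f}")
```

Under these assumptions the sums are 3.75, about 9.03, and 15.00 respectively. Doubling the trial duration with the same 9 visits cuts the slope variance to a quarter, whereas nearly tripling the number of visits within 2 years does not catch up with the 4-year schedule. Between-patient slope variability, which is not reduced by either choice, limits how much of this gain carries over to the treatment comparison.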

3 comments:

  1. For depression, FDA was very specific that sponsors should treat time as a categorical variable when applying the repeated measures/MMRM analysis.

  2. It's interesting that you mentioned COPD. Both UPLIFT and TORCH, two mega trials, failed to show a statistically significant difference in slopes. When we design a new mega trial, should we not use it as a primary endpoint, or even as a key secondary endpoint? Any thoughts?

  3. Comparing the difference in slopes as the primary endpoint may not be a good strategy. If the measurements over time deviate from a linear trend, the comparison of two slopes may be insensitive. Mixed Models for Repeated Measures (MMRM) may be better.