Saturday, May 27, 2017

Clinical Trial with Insufficient Sample Size: underpowered or detecting a trend?

When planning a clinical trial, an important step is to estimate the sample size (the number of patients needed to detect the treatment difference) for the study. In calculating the sample size, it is conventional to set the significance level at 0.05 and the statistical power at 80% or above. Sometimes, we need to design a clinical trial with an insufficient sample size. This occurs fairly often in early phase clinical trials, in investigator-initiated trials (IITs), and in rare disease drug development, because of constraints on resources, budget, and the number of patients available to participate in the study. We could design a study without a formal sample size calculation and simply state that the sample size of xxx is based on ‘clinical considerations’, even though we don’t know exactly what ‘clinical considerations’ means.

If there are biomarkers or surrogate endpoints, and treatment effects on them are easier to detect than effects on the clinical endpoints, we could design a proof-of-concept or early phase study using the biomarkers or surrogate endpoints. The sample size can then be formally calculated based on the treatment effect on the biomarkers or surrogate endpoints. For example, in solid tumor clinical trials, we could design a study with a smaller sample size based on the effect in shrinking the tumor size. In studies of inhaled antibiotics in non-CF bronchiectasis, the early phase study could use the bacterial density in sputum as the endpoint, so that a smaller sample size is required to demonstrate the effect before the late stage study, where a clinically meaningful endpoint such as exacerbations should be used.

We can run into the situation where there are no good or reliable biomarkers or surrogate endpoints and the clinical endpoint is the only one available. The endpoint for the early phase study and the late phase study is then the same. In order to design an early phase study with a smaller sample size, we will need to do one of the following:
  • Increase the significance level (alpha level) to allow a greater type I error. Instead of testing the hypothesis at the conventional alpha = 0.05, we can test the hypothesis at alpha = 0.10 or 0.20 – we would say that we are trying to detect a trend.
  • Lower the statistical power to allow a greater type II error – design an underpowered study.

While both approaches have been used in the literature, I would prefer increasing the significance level to detect a trend. Intentionally designing an underpowered study seems to raise an ethical concern.
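
To get a feel for how much each option reduces the sample size, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a calculation from any of the studies quoted in this post: a two-arm parallel study with equal allocation, a continuous endpoint, the usual normal-approximation formula n = 2 (z_alpha + z_beta)^2 / d^2 per group, and a hypothetical standardized effect size d = 0.5. The function name n_per_group is made up for this example.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha, power, one_sided=False):
    """Approximate per-group sample size for comparing two means.

    effect_size is the standardized difference (difference in means / common SD).
    Uses the normal approximation, so a t-test based calculation would give
    slightly larger numbers.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha) if one_sided else z(1 - alpha / 2)
    z_beta = z(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Hypothetical standardized effect size, chosen only for illustration.
d = 0.5

scenarios = [
    ("conventional (alpha=0.05, power=80%)", 0.05, 0.80),
    ("detect a trend (alpha=0.20, power=80%)", 0.20, 0.80),
    ("underpowered (alpha=0.05, power=70%)", 0.05, 0.70),
]

for label, alpha, power in scenarios:
    print(f"{label}: {n_per_group(d, alpha, power)} per group")
```

Under these assumptions, relaxing the two-sided alpha to 0.20 saves more patients than dropping the power to 70%. A one-sided alpha of 0.10 gives the same number as a two-sided alpha of 0.20, because both use the same critical value.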

Here are some examples of clinical trials designed to detect a trend using alpha = 0.20 (or one-sided alpha = 0.10):

“…It was estimated that for the study to have 90% power to test the hypothesis at a one-sided 0.10 significance level, the per-protocol population would need to include 153 participants in each group. The failure rate was estimated with binomial proportion and 95% confidence intervals. One-sided 90% exact confidence intervals were used to estimate the difference in the failure rates between the two treatments, which is appropriate for a noninferiority study and which is consistent with the one-sided significance level of 0.10 that was used for the determination of the sample size.”

 “A three-outcome (promising, inconclusive, not promising), one-stage modified Simon optimal phase II clinical trial study design with an interim analysis was chosen so that there would be a 90% chance of detecting a tumor response rate of at least 20% when the true tumor response rate was at least 5% at a 0.10 significance level, deeming that a RECIST response rate of less than 20% would be of little clinical importance in ATC.”

 “Assuming a mPFS of 3.5 months for GP and 5.7 months for GV (HR=0.61), a sample size of 106 subjects (53 per group) provided 85% power to detect this difference, using a one-sided test at the 0.10 significance level.”

“The primary null hypothesis was that CoQ10 reduces the mean ALSFRSr decline over 9 months by at least 20% compared to placebo—in short, that CoQ10 is “promising.” It was tested against the alternative that CoQ10 reduces the mean ALSFRSr decline by less than 20% over 9 months compared to placebo, at one-sided alpha = 0.10.”

Here are some studies with insufficient power (less than 80% power). Notice that these studies still have 70% power. I can't imagine people's reaction if we designed a study with 50% power.

"A detailed calculation of sample size was difficult, since few studies have evaluated medications intended to augment local osseous repair in periodontal therapy. However, in one study of a selective cyclooxygenase-2 inhibitor in periodontal therapy, a sample of 22 patients per group was sufficient for the study to have 70% power to detect a 1-mm difference between the groups in the gain in clinical attachment level and reduction in probing depth, with a type I error rate of 5%."

“We estimated that a sample size of 600 would provide at least 70% power to detect a 33% reduction in the rate of the composite of the following serious adverse fetal or neonatal outcomes”

“With the sample of 99 patients, the study would have 70% power at a two-sided significance level of 0.05” 
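
To see the same trade-off from the other direction, the sketch below fixes the sample size and asks what power it buys at different significance levels. It is again only an illustration under assumptions that do not come from the quoted studies: a two-arm study with a continuous endpoint, the normal approximation, a hypothetical standardized effect size of 0.5, and a hypothetical 30 patients per group. The function name approx_power is made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha, one_sided=False):
    """Approximate power of a two-group comparison of means.

    Normal approximation; for the two-sided case it ignores the negligible
    chance of rejecting in the wrong direction.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha) if one_sided else nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(effect_size * sqrt(n_per_group / 2) - z_alpha)


# Hypothetical design values, chosen only for illustration.
d = 0.5
n = 30

for alpha in (0.05, 0.10, 0.20):
    print(f"two-sided alpha={alpha:.2f}: power ~ {approx_power(d, n, alpha):.0%}")
```

With these made-up numbers, the study has roughly a coin-flip chance of success at the conventional alpha = 0.05, and the power climbs into the 70% range only when alpha is relaxed to 0.20. The small study is the same in both cases; what changes is which error rate it gives up.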
