On Biostatistics and Clinical Trials: October 2015

Sometimes we need to calculate the sample size required to estimate a population proportion or a population mean with a given precision or margin of error. Here we use the terms ‘precision’ and ‘margin of error’ interchangeably. Depending on the sample size calculation software, the precision may also be referred to as “half of the confidence interval”, “half of the width of CI”, or “distance from mean to limit”.

A statistician may need to estimate the sample size in situations such as the following:

Example 1: A survey estimated that 20% of all Americans aged 16 to 20 drove under the influence of drugs or alcohol. A similar survey is planned for New Zealand. The researchers want to estimate a sample size for the survey and they want a 95% confidence interval to have a margin of error of 0.04.

Example 2: An immunogenicity study is planned to investigate the occurrence of antibodies to a therapeutic protein. There is no prior information about the percentage of patients who may develop antibodies to the therapeutic protein. How many patients are needed for the study with a 95% confidence interval and a precision of 10%?

Example 3: A tax assessor wants to assess the mean property tax bill for all homeowners in Madison, Wisconsin. A survey ten years ago got a sample mean and standard deviation of $1400 and $1000. How many tax records should be sampled for a 95% confidence interval to have a margin of error of $100?

These are situations where the sample size estimation is based on the confidence interval and the margin of error. Examples #1 and #2 deal with a one-sample proportion: we would like to find the sample size needed to estimate a population proportion with a certain precision. Example #3 deals with a one-sample mean: we would like to find the sample size needed to estimate a population mean with a certain precision.

Sample Size to Estimate A Proportion With a Precision

The usual formula is:

N = z^2 p(1-p) / d^2

where p is the expected proportion (which may be obtained from a previous study), d is the precision or margin of error, and z is the Z-score for the chosen confidence level: 1.645 for a 90% confidence interval, 1.96 for a 95% confidence interval, and 2.58 for a 99% confidence interval.

For example #1, the sample size will be calculated as:
N = 1.96^2 x 0.2 x 0.8 / 0.04^2 = 384.2, rounded up to 385
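
The same arithmetic can be checked with a few lines of Python. This is only a minimal sketch of the formula above; the function name n_for_proportion is made up for illustration, and the Z-score is taken from the standard normal distribution rather than typed in as 1.96.

```python
import math
from statistics import NormalDist


def n_for_proportion(p, d, conf_level=0.95):
    """Smallest n for which z * sqrt(p(1-p)/n) is at most d (hypothetical helper)."""
    # Two-sided critical value, about 1.96 for a 95% confidence level
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # N = z^2 p(1-p) / d^2, rounded up to the next whole subject
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)


# Example #1: p = 0.2 from the earlier survey, margin of error 0.04
n_example_1 = n_for_proportion(p=0.2, d=0.04)
print(n_example_1)  # 385
```

Always rounding up (rather than to the nearest integer) keeps the achieved margin of error at or below the target.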

Similarly, if we use PASS, the input parameters will be

Confidence Interval: Simple Asymptotic

Interval Type: Two-sided

Confidence level (1-alpha): 0.95

Confidence Interval Width (two-sided): 0.08 (note: 0.04 x 2)

P (Proportion): 0.2

For example #2, since there is no prior information about the proportion, the practical approach is to assume p = 0.50. The product p(1-p) is largest at p = 0.50, so this gives a sample size that is big enough to ensure the precision whatever the true proportion turns out to be.

If we use the formula, the sample size will be calculated as:

N = 1.96^2 x 0.5 x 0.5 / 0.1^2 = 96.04, rounded up to 97
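
To see why p = 0.50 is the conservative choice, the short sketch below evaluates the same formula over several assumed proportions at a margin of error of 0.10. The grid of p values is arbitrary and chosen only for illustration.

```python
import math
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96
d = 0.10                         # desired margin of error

# Required n for each assumed proportion; p(1-p) peaks at p = 0.5
n_by_p = {
    p: math.ceil(z ** 2 * p * (1 - p) / d ** 2)
    for p in (0.1, 0.3, 0.5, 0.7, 0.9)
}

for p, n in n_by_p.items():
    print(f"p = {p:.1f}: n = {n}")
```

The required sample size is symmetric around p = 0.5 and is largest there (the unrounded value is about 96.04, which rounds up to 97), so planning with p = 0.5 can only over-deliver on precision.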

Similarly, if we use PASS, the input parameters will be

Confidence Interval: Simple Asymptotic

Interval Type: Two-sided

Confidence level (1-alpha): 0.95

Confidence Interval Width (two-sided): 0.2 (note: 0.1 x 2)

P (Proportion): 0.5

Sample Size to Estimate A Mean With a Precision

The usual formula is:

N = (s t/d)^2

where s is the standard deviation, t is the t-score (which can be approximated by the Z-score when the sample is reasonably large) and d is the precision or margin of error.

For example #3:

N = (1000 x 1.96 / 100)^2 = 384.2, rounded up to 385
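
A minimal Python sketch of this calculation follows. It assumes the Z-score approximation to the t-score, and the function name n_for_mean is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_mean(s, d, conf_level=0.95):
    """Smallest n for which z * s / sqrt(n) is at most d (hypothetical helper)."""
    # Z-score used in place of the t-score (normal approximation)
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # N = (s z / d)^2, rounded up
    return math.ceil((s * z / d) ** 2)


# Example #3: standard deviation $1000, margin of error $100
n_example_3 = n_for_mean(s=1000, d=100)
print(n_example_3)  # 385
```

Using the t-score instead would need an iterative search, because its degrees of freedom depend on n; at a sample size of this order the result would be only slightly larger.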

Similarly, if we use PASS, the input parameters will be:
Solved for: Sample size
Interval type: two-sided
Population size: infinite
Confidence level (1-alpha): 0.95
Distance from mean to limits: 100
S (standard deviation): 1000

Sample size calculation based on precision is popular in surveys in epidemiology studies and in polling in political science. In clinical trials, it seems to be common in immunogenicity studies. In immunogenicity studies, it is not limited to the one-sample situation; it may also be used in the two-sample situation. In the book “Biosimilars: Design and Analysis of Follow-on Biologics” by Dr Chow, the sample size section describes the calculation based on precision:

In immunogenicity studies, the incidence rate of immune response is expected to be low. In this case, the usual pre-study power analysis for sample size calculation for detecting a clinically meaningful difference may not be feasible. Alternatively, we may consider selecting an appropriate sample size based on precision analysis rather than power analysis to provide some statistical inference.

Denote half of the width of the CI by w = Z(1-alpha/2) x sigma-hat, which is usually referred to as the maximum error margin allowed for a given sample size n. In practice, the maximum error margin allowed represents the precision that one would expect for the selected sample size. The precision analysis for sample size determination is to consider the maximum error margin allowed. In other words, we are confident that the true difference delta = pR - pT would fall within the margin of w = Z(1-alpha/2) x sigma-hat for a given sample size of n. Thus, the sample size required for achieving the desired precision can be chosen.

This approach, based on the interest in only the type I error, is to specify precision while estimating the true delta for selecting n.
Under a fixed power and significance level, the sample size based on power analysis is much larger than the sample size based on precision analysis with extremely low infection rate difference or large allowed error margin.
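
The two-sample precision calculation can be sketched in Python as follows. This is an illustration under stated assumptions, not the book's worked example: it assumes equal group sizes and the simple asymptotic (Wald) standard error for a difference of two proportions, sigma-hat = sqrt((pR(1-pR) + pT(1-pT)) / n), and the rates 2% and 3% and the margin 0.02 are invented inputs. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_for_difference(p_ref, p_test, w, conf_level=0.95):
    """Per-group n so that the CI half-width for p_ref - p_test is at most w."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Sum of the binomial variance terms for the two groups (equal n assumed)
    variance_sum = p_ref * (1 - p_ref) + p_test * (1 - p_test)
    # Solve w = z * sqrt(variance_sum / n) for n and round up
    return math.ceil(z ** 2 * variance_sum / w ** 2)


# Invented inputs: immune response rates of 2% and 3%, margin of 0.02
n_per_group = n_per_group_for_difference(p_ref=0.02, p_test=0.03, w=0.02)
print(n_per_group)  # 468
```

Only the confidence level enters the calculation; no power is specified, which is why this approach can give a much smaller sample size than a power analysis when the rates are very low.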

SAS Proc Power can also calculate the sample size. The exact method is used for sample size calculation in SAS. The obtained sample size is usually greater than the ones calculated by hand (formula) or using PASS.

For the confidence interval in the one-sample proportion situation, the SAS code will look something like this:

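proc power;
   /* ntotal is fixed at 70; probwidth = . marks the quantity to solve for: */
   /* the probability that the Wilson interval half-width is at most 0.1    */
   onesamplefreq ci=wilson
      halfwidth = 0.1
      proportion = 0.3
      ntotal = 70
      probwidth = .;
run;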
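
For a rough feel of the numbers in this call, the Python sketch below computes the Wilson score interval half-width when the observed proportion is exactly 0.3 with n = 70. This is not what Proc Power computes (it evaluates the probability of achieving the half-width over the possible outcomes); it is only a single-point check, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def wilson_half_width(p_hat, n, conf_level=0.95):
    """Half-width of the Wilson score interval for an observed proportion p_hat."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Standard Wilson score interval: half-width around the adjusted center
    spread = math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2))
    return z * spread / (1 + z ** 2 / n)


hw = wilson_half_width(p_hat=0.3, n=70)
print(round(hw, 3))
```

At this single point the half-width comes out slightly above 0.1, which suggests that 70 subjects would be on the low side for a target half-width of 0.1 when the proportion is near 0.3.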

For the confidence interval for a one-sample mean, refer to the example provided in the SAS online documentation: SAS 9.22 User’s Guide, Example 68.7, Confidence Interval Precision.

The development of orphan drugs has been red hot in recent years due to the favorable regulatory environment and government incentives. It is well known that it is challenging to conduct clinical trials in orphan diseases. It may not be feasible to follow rigorous statistical rules and principles in drug trials in rare diseases. Sample sizes in rare diseases are usually small, but how low in sample size can we go? A recent FDA approval gives an answer.

Recently, FDA approved a new orphan drug to treat a rare autosomal recessive disorder. The drug, called Xuriden, was approved for treating patients with hereditary orotic aciduria, an ultra-orphan disease. According to FDA’s announcement:

“The safety and effectiveness of Xuriden were evaluated in a single arm, six-week, open-label trial in four patients with hereditary orotic aciduria, ranging in age from three to 19 years of age, and in a six-month extension phase of the trial. The study assessed changes in the patients’ pre-specified hematologic parameters during the trial period. At both the six-week and six-month assessments, Xuriden treatment resulted in stability of the hematologic parameters in all four clinical trial patients.”

It looks like the company (Wellstat Therapeutics) initially planned to enroll 10 subjects. I guess that due to the enrollment challenge, only 4 subjects were enrolled. Wellstat presented the data from 4 subjects and got the green light from FDA. According to clinicaltrials.gov, the study protocol is titled "Open-Label Study of Uridine Triacetate in Pediatric Patients With Hereditary Orotic Aciduria" and the primary and secondary outcome measures are biomarkers or PD markers:

Primary Outcome Measures:

Stability of predetermined principal hematologic parameters

Secondary Outcome Measures:

Levels of orotic acid and orotidine in urine
Levels of uridine in the plasma

With the results from four patients, it is not possible to perform any inferential statistical analysis. The approval is essentially based on the summary results or data listings from the outcome measures in 4 patients. See the Clinical Studies section of the Xuriden package insert.

As part of the approval, the manufacturer of Xuriden (Wellstat Therapeutics) was also granted a rare pediatric disease priority review voucher. The priority review voucher itself is worth millions of dollars and will probably be more than enough to recoup all the costs that the company has invested in the development of Xuriden.

With the emphasis on precision medicine and personalized medicine, we will see a trend toward conducting more small trials. In the future, a drug may be so specifically targeted that it applies to only a few people worldwide. If we can really achieve precision medicine, more and more medicines will be classified as orphan drugs.

Some background on orphan drugs is summarized in one of my previous articles, "Drug for Treating Rare Diseases: Orphan Drug, Orphan Disease, Orphan Subset". A presentation by Dr Lawrence J. Lesko from FDA also gives a good introduction to rare diseases and orphan drugs. Rare and ultra-rare diseases, often referred to as orphan and ultra-orphan diseases, affect very small numbers of patients. In the United States, a disease is defined as rare if it affects fewer than 650 patients per million of population, and the European definition of a rare disease is one that affects fewer than 500 patients per million of population. In contrast, a disease is generally considered to be ultra-rare if it affects fewer than 20 patients per million of population (one patient per 50,000 people), and most ultra-rare diseases affect far fewer than this, as few as one per million or less.

Monday, October 12, 2015

Sample Size Estimation Based on Precision for Survey and Clinical Studies such as Immunogenicity Studies

Saturday, October 03, 2015

How Low in Sample Size Can We Go? FDA approves ultra-orphan drug on a 4-patient trial