Tuesday, September 13, 2011

Confidence Interval for Difference in Two Proportions

In many clinical trials, the outcome is binomial and a 2 x 2 table can be constructed. The analysis can then be based on the difference in two proportions (treatment group vs. control group). SAS PROC FREQ can be used to obtain the difference between the proportions together with its asymptotic confidence interval. The formula is (p1 - p2) +/- Z(alpha/2) * sqrt(p1*q1/n1 + p2*q2/n2), where q1 = 1 - p1 and q2 = 1 - p2.
However, the asymptotic confidence interval produced by PROC FREQ requires a somewhat large sample size (say, cell counts of at least 12); this is the case at least for SAS versions up to 9.2. For moderately small sample sizes, it is better to use the formula given in Fleiss (1981, page 29) and Stokes et al. (2000, pages 29-30), where the half-width of the interval is increased by 0.5*(1/n1 + 1/n2), which makes the interval a little wider. The confidence interval taken directly from PROC FREQ is therefore a little narrower than the one from this formula. In practice, the statistician needs to choose which interval to use for the difference in proportions depending on the sample size.
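
Written out in mathematical notation, the two intervals described above look as follows. This is only a restatement of the formulas in the text, with q_i = 1 - p_i; the subscript notation is mine and is not taken from either book.

```latex
% Asymptotic (Wald) interval, as reported by PROC FREQ with the RISKDIFF option
\[
(p_1 - p_2) \;\pm\; z_{\alpha/2}\,
\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}
\]

% Interval with the 0.5*(1/n1 + 1/n2) adjustment added to the half-width
\[
(p_1 - p_2) \;\pm\; \left[\, z_{\alpha/2}\,
\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}
\;+\; \frac{1}{2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right) \right]
\]
```

The adjustment term depends only on the two group sizes, so it shrinks toward zero as n1 and n2 grow and the two intervals converge.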

Fleiss, JL (1981) Statistical Methods for Rates and Proportions. New York: John Wiley & Sons, Inc.
Stokes, ME, Davis, CS, and Koch, GG (2000) Categorical Data Analysis Using the SAS System, 2nd edition. Cary, NC: SAS Institute Inc.
The FDA draft guidance on tazarotene details the calculation of the 90% confidence interval for establishing bioequivalence on the clinical endpoint using the second approach mentioned above (the interval widened by 0.5*(1/n1 + 1/n2)).
The example from the book by Stokes et al. can be implemented in SAS with the following code:

data respire2;
  input treat $ outcome $ count @@;
  datalines;
test    f 40
test    u 20
placebo f 16
placebo u 48
;
run;

*** the confidence interval directly from SAS PROC FREQ;
proc freq data=respire2 order=data;
  weight count;
  tables treat*outcome / riskdiff;
run;

*** the confidence interval calculated from the formula (see Section 2.4, Difference in Proportions,
     in Stokes et al., Categorical Data Analysis Using the SAS System, 2nd edition);
proc freq data=respire2 order=data;
    weight count;
    tables treat/noprint out=tots (drop=percent rename=(count=bign));
proc freq data=respire2;
    weight count;
    tables treat*outcome/noprint out=outcome (drop=percent);
proc sort data=tots;
  by treat;
proc sort data=outcome;
    by treat;
data prop;
    merge outcome tots;
    by treat;
    if treat='test' then p1=count/bign;
    if treat='placebo' then p2=count/bign;

data prop1(rename=(count=count1 bign=bign1)) prop2(rename=(count=count2 bign=bign2));
     set prop;
     if treat='test' then output prop1 ;
     if treat='placebo' then output prop2;

data proportion;
  merge prop1(drop= p2 treat) prop2(drop = p1 treat);

***Calculate the difference in proportions, SE, and 95% confidence interval using formula by Fleiss;
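data cal;
  set proportion;
    /* proportion has one row per outcome level; keep the favorable ('f') row */
    where outcome='f';
    /* diff and se were used below but never computed in the original step */
    diff=p1-p2;
    variance=(p1*(1-p1)/(bign1)) + (p2*(1-p2)/(bign2));
    se=sqrt(variance);
    lower=(diff - ((1.96*se + .5*(1/bign1 + 1/bign2))));
    upper=(diff + ((1.96*se + .5*(1/bign1 + 1/bign2))));
run;

proc print data=cal;
  format p1 p2 variance diff lower upper se 5.3;
run;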
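
As a cross-check on the merge-based program above, the same two intervals can be computed in a single data step directly from the cell counts. This is only a sketch: the dataset name ci_check and all variable names are my own choices, and the counts are typed in by hand from the respire2 example rather than read from the data set.

```sas
data ci_check;
  /* favorable responses and group sizes, copied by hand from respire2 */
  x1 = 40; n1 = 60;          /* test:    40 favorable out of 40 + 20 */
  x2 = 16; n2 = 64;          /* placebo: 16 favorable out of 16 + 48 */

  alpha = 0.05;              /* use 0.10 for a 90% interval */
  z     = probit(1 - alpha/2);

  p1   = x1 / n1;
  p2   = x2 / n2;
  diff = p1 - p2;
  se   = sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2);
  adj  = 0.5*(1/n1 + 1/n2);  /* adjustment added to the half-width */

  /* asymptotic interval, comparable to the RISKDIFF output */
  wald_lower = diff - z*se;
  wald_upper = diff + z*se;

  /* adjusted interval from the formula in Fleiss / Stokes et al. */
  adj_lower = diff - (z*se + adj);
  adj_upper = diff + (z*se + adj);
run;

proc print data=ci_check noobs;
  var diff se wald_lower wald_upper adj_lower adj_upper;
  format diff se wald_lower wald_upper adj_lower adj_upper 6.4;
run;
```

For these counts the difference is about 0.417, the asymptotic interval is roughly (0.257, 0.576), and the adjusted interval is roughly (0.241, 0.592), so the adjustment widens each side by about 0.016.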

Friday, September 09, 2011

Is it time to change the clinical monitoring practice in clinical trials?

In industry, current monitoring practice relies on ‘on-site monitoring’ and 100% source data verification (of all data fields). This process is very costly and is one of the main reasons clinical trials have become so expensive. It reflects the most conservative interpretation of the ICH E6 Guideline for Good Clinical Practice and the FDA’s 1988 “Guidance for the Monitoring of Clinical Investigations”. These guidance documents only require that “the sponsor should ensure that the trials are adequately monitored” and leave the door open regarding the frequency of and approach to clinical monitoring. The conduct of clinical trials in industry is highly regulated, and sponsors usually take the most conservative approach no matter how costly it is.

Is ‘on-site monitoring’ really effective? Is 100% source data verification really needed? Should we look for new, cost-effective ways to conduct clinical monitoring?

Last month, FDA withdrew its 1988 “Guidance for the Monitoring of Clinical Investigations” and issued a draft guidance, “Oversight of Clinical Investigations - A Risk-based Approach to Monitoring”. The new draft guidance suggests that alternative approaches (such as remote monitoring, centralized monitoring, and risk-based monitoring) are acceptable. It also suggests that source data verification should focus on critical fields (key efficacy and safety variables) and that less than 100% source data verification of less important fields may be acceptable. The guidance gives a clear signal that sponsors are encouraged to explore cost-effective ways to conduct clinical monitoring instead of relying solely on on-site monitoring.

If this guidance is implemented, we may expect statisticians to play an increasing role in clinical monitoring, especially in centralized monitoring. Currently, statisticians typically identify data issues only late in a trial, when they or the statistical programmers start to perform the data analyses. The new guidance says:

“…notably, the advancement in EDC systems enabling centralized access to both trial and source data and the growing appreciation of the ability of statistical assessments to identify clinical sites that require additional training and/or monitoring.”

“Centralized monitoring is a remote evaluation carried out by sponsor personnel or representatives (e.g., data management personnel, statisticians, or clinical monitors) at a location other than the site(s) at which the clinical investigation is being conducted.”