On Biostatistics and Clinical Trials: 2011

Friday, December 02, 2011

Serious Adverse Events (SAE) vs Severe Adverse Events

Professionals who are new to the clinical trial field are often confused by the concepts of 'Serious Adverse Events (SAEs)' and 'Severe Adverse Events'. Severity is not synonymous with seriousness. Seriousness is based on patient/event outcome or action criteria usually associated with events that pose a threat to a patient's life or functioning. Seriousness (not severity) serves as a guide for defining regulatory reporting obligations. In other words, SAEs need to fulfill additional reporting requirements (reporting to the corporate global drug safety or pharmacovigilance group, regulatory authorities, and ECs/IRBs). A severe AE, by contrast, is simply an AE whose severity (older term: intensity) is classified as 'severe'. Severity is one of several AE classifications; another is the relationship to study treatment (causality).

The FDA defines a serious adverse event (SAE) as one in which the patient outcome is one of the following:

Death
Life-threatening
Hospitalization (initial or prolonged)
Disability - significant, persistent, or permanent change, impairment, damage or disruption in the patient's body function/structure, physical activities or quality of life.
Congenital anomaly
Requires intervention to prevent permanent impairment or damage

On the other hand, the severity of an AE is a point on an arbitrary scale of intensity of the adverse event in question. The terms "severe" and "serious" when applied to adverse events are technically very different. They are easily confused but cannot be used interchangeably, so they require care in usage.

A headache is severe if it causes intense pain. Scales such as the visual analog scale help us assess severity. On the other hand, a headache is not usually serious unless it also satisfies one of the criteria for seriousness listed above (as it may in the case of subarachnoid haemorrhage or subdural bleed; even a migraine may temporarily fit the criteria). Similarly, a severe rash is not likely to be an SAE. However, mild chest pain may result in a day’s hospitalization and thus is an SAE.
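The distinction can be sketched as two independent attributes of an AE record. The following Python sketch is illustrative only: the class name, field names, and outcome flags are hypothetical, and the criteria are paraphrased from the FDA list above rather than taken from any real safety database.

```python
from dataclasses import dataclass

# Seriousness criteria paraphrased from the FDA list above (flag names are hypothetical).
SERIOUS_OUTCOMES = {
    "death",
    "life_threatening",
    "hospitalization",
    "disability",
    "congenital_anomaly",
    "intervention_required",
}


@dataclass
class AdverseEvent:
    term: str
    severity: str                    # intensity scale: "mild", "moderate", or "severe"
    outcomes: frozenset = frozenset()

    def is_serious(self) -> bool:
        # Seriousness depends only on the outcome criteria, never on severity.
        return bool(self.outcomes & SERIOUS_OUTCOMES)


headache = AdverseEvent("headache", "severe")
chest_pain = AdverseEvent("chest pain", "mild", frozenset({"hospitalization"}))

for ae in (headache, chest_pain):
    label = "SAE" if ae.is_serious() else "non-serious AE"
    print(f"{ae.term}: severity={ae.severity}, {label}")
```

The severe headache is not an SAE, while the mild chest pain with a hospitalization is, which mirrors the two examples in the paragraph above.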

Classifications of AE severity often include the following:

Mild: Awareness of signs or symptoms, but easily tolerated and are of minor irritant type causing no loss of time from normal activities. Symptoms do not require therapy or a medical evaluation; signs and symptoms are transient.
Moderate: Events introduce a low level of inconvenience or concern to the participant and may interfere with daily activities, but are usually improved by simple therapeutic measures; moderate experiences may cause some interference with functioning
Severe: Events interrupt the participant’s normal daily activities and generally require systemic drug therapy or other treatment; they are usually incapacitating

Sunday, November 27, 2011

Reporting/recording the Serious Adverse Events (SAE) vs. Adverse Event (AE) Outcomes

In clinical trials, serious adverse event reporting is critical to the safety assessment and to fulfilling the regulatory requirements. The criteria for defining an SAE have been documented in many regulatory guidelines. However, during clinical trial implementation, confusion can arise over whether something should be reported as an SAE or as the outcome of an SAE. Misinterpretation of the regulatory guidelines could result in inappropriate reporting of SAEs.

According to ICH E2A “CLINICAL SAFETY DATA MANAGEMENT: DEFINITIONS AND STANDARDS FOR EXPEDITED REPORTING”:

                  A serious adverse event (experience) or reaction is any untoward medical occurrence that at any dose:
                         * results in death,
                         * is life-threatening,
                             NOTE: The term "life-threatening" in the definition of "serious" refers to an event in which the patient was at
                             risk of death at the time of the event; it does not refer to an event which hypothetically might have caused death
                             if it were more severe.
                         * requires inpatient hospitalisation or prolongation of existing hospitalisation,
                         * results in persistent or significant disability/incapacity, or
                         * is a congenital anomaly/birth defect.
The FDA website provides somewhat more detailed descriptions of an SAE:
"What is a Serious Adverse Event?

An adverse event is any undesirable experience associated with the use of a medical product in a patient. The event is serious and should be reported to FDA when the patient outcome is:
Death

Report if you suspect that the death was an outcome of the adverse event, and include the date if known.
Life-threatening

Report if suspected that the patient was at substantial risk of dying at the time of the adverse event, or use or continued use of the device or other medical product might have resulted in the death of the patient.
Hospitalization (initial or prolonged)

Report if admission to the hospital or prolongation of hospitalization was a result of the adverse event.

Emergency room visits that do not result in admission to the hospital should be evaluated for one of the other serious outcomes (e.g., life-threatening; required intervention to prevent permanent impairment or damage; other serious medically important event).
Disability or Permanent Damage

Report if the adverse event resulted in a substantial disruption of a person's ability to conduct normal life functions, i.e., the adverse event resulted in a significant, persistent or permanent change, impairment, damage or disruption in the patient's body function/structure, physical activities and/or quality of life.
Congenital Anomaly/Birth Defect

Report if you suspect that exposure to a medical product prior to conception or during pregnancy may have resulted in an adverse outcome in the child.
Required Intervention to Prevent Permanent Impairment or Damage (Devices)

Report if you believe that medical or surgical intervention was necessary to preclude permanent impairment of a body function, or prevent permanent damage to a body structure, either situation suspected to be due to the use of a medical product.
Other Serious (Important Medical Events)

Report when the event does not fit the other outcomes, but the event may jeopardize the patient and may require medical or surgical intervention (treatment) to prevent one of the other outcomes. Examples include allergic bronchospasm (a serious problem with breathing) requiring treatment in an emergency room, serious blood dyscrasias (blood disorders) or seizures/convulsions that do not result in hospitalization. The development of drug dependence or drug abuse would also be examples of important medical events."

The standard coding dictionary for adverse events is MedDRA (Medical Dictionary for Regulatory Activities). The guidance document MedDRA® TERM SELECTION: POINTS TO CONSIDER gives a clear explanation of how death and other patient outcomes should be handled.

3.2 – Death and Other Patient Outcomes

Death, disability, and hospitalization are considered outcomes in the context of safety reporting and not usually considered ARs/AEs. Outcomes are typically recorded in a separate manner (data field) from AR/AE information. A term for the outcome should be selected if it is the only information reported or provides significant clinical information.

(For reports of suicide and self-harm, see Section 3.3).

3.2.1 Death with ARs/AEs

Death is an outcome and not usually considered an AR/AE. If ARs/AEs are reported along with death, select terms for the ARs/AEs. Record the fatal outcome in an appropriate data field.

3.2.4 Other patient outcomes (non-fatal)

Hospitalization, disability and other patient outcomes are not generally considered ARs/AEs.

There are many other examples of recording something as an outcome rather than as an AE/SAE. Adverse events represent the untoward medical event, not the intervention to treat that event. For example, if a subject has an appendectomy, the AE is appendicitis, not the surgical procedure; if a subject has a limb amputation, the AE is the cause of the amputation (perhaps worsening ischemia in the peripheral artery), and the limb amputation should be reported as the outcome of the AE/SAE; if a patient is hospitalized due to congestive heart failure, congestive heart failure should be reported as the SAE and hospitalization as its outcome.
We should also be aware that not every hospitalization will have an associated SAE to be reported. A hospitalization or prolongation of hospitalization that meets ONE of the following conditions should not be reported as an SAE:
The hospital admission is pre-planned (ie, elective or scheduled surgery arranged prior to the start of the study). The European Commission’s guidelines on medical devices, “CLINICAL INVESTIGATIONS: SERIOUS ADVERSE EVENT REPORTING”, indicate that a planned hospitalization for a pre-existing condition, or a procedure required by the Clinical Investigation Plan, without a serious deterioration in health, is not considered to be a serious adverse event.

The hospital admission is clearly not associated with an AE (eg, social hospitalization for purposes of respite care). If a patient wants to stay in the hospital during the drug treatment out of fear that something bad could happen, and nothing else happens, the hospital stay alone should not be reported as an SAE.

According to these definitions, events with an outcome of death, hospitalization, disability or permanent damage, congenital anomaly/birth defect, … should be reported as SAEs, while the death, hospitalization, disability or permanent damage, congenital anomaly/birth defect, … itself should be reported as the outcome of the corresponding SAE. To be crystal clear: as a rule, death and hospitalization should not themselves be reported as SAEs; the events leading to the death or hospitalization should be.

Thursday, November 24, 2011

Studentized residual for detecting outliers

Last time, I discussed outliers and a simple approach, Dixon’s Q test, for detecting a single outlier. When there are multiple outliers, we can detect them using the standard deviation (for data that are normally distributed) or using percentiles (for skewed data). A box plot may be useful to visually check the data for potential outliers.

In the regression setting, there are several approaches to detecting outliers. One of them is to use the ‘standardized residual’ or ‘studentized residual’. In linear regression, an outlier is an observation with a large residual. In other words, it is an observation whose dependent-variable value is unusual given its values on the predictor variables.

The studentized residual is the quotient resulting from division of a residual by an estimate of its standard deviation. Because it is expressed in standard deviation units, it can be used in the same way as a distance from the mean measured in standard deviations. For raw data, values more than 3, 4, or 5 standard deviations from the mean give us reasonable doubt that they are outliers. In the regression setting, observations whose studentized residual exceeds 3, 4, or 5 in absolute value are the candidates for outliers.

In SAS, two regression procedures, PROC REG and PROC GLM, can easily be used to compute studentized residuals for detecting outliers. In the OUTPUT statement, the keyword STUDENT= requests the studentized residual, and RSTUDENT= requests the studentized residual computed with the current observation deleted. Other regression procedures (such as PROC MIXED) also compute studentized residuals as part of their influence diagnostics.

output out=newdata rstudent=xxx;
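
To make the quantity concrete, here is a minimal Python sketch for a simple linear regression (one predictor plus intercept). It is not the SAS implementation; the function name and the data are made up for illustration. It computes both the studentized residual (residual divided by its estimated standard deviation) and the deleted version (the observation's own influence removed from the variance estimate).

```python
import math


def studentized_residuals(x, y):
    """Return (internal, external) studentized residuals for y = a + b*x."""
    n = len(x)
    p = 2  # number of parameters: intercept and slope
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar

    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in resid) / (n - p)
    # Leverage of each observation in simple linear regression.
    lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]

    internal = [e / math.sqrt(mse * (1 - h)) for e, h in zip(resid, lev)]
    # Deleted (external) version via the standard identity; no refitting needed.
    external = [r * math.sqrt((n - p - 1) / (n - p - r * r)) for r in internal]
    return internal, external


# Hypothetical data: roughly y = 2x, with one aberrant value at x = 8.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 25.0, 18.1, 19.9]

internal, external = studentized_residuals(x, y)
for xi, r, t in zip(x, internal, external):
    flag = "  <-- candidate outlier" if abs(t) > 3 else ""
    print(f"x={xi:2d}  studentized={r:7.2f}  deleted={t:8.2f}{flag}")
```

In this made-up data set the aberrant point inflates the overall variance estimate, so its ordinary studentized residual stays just under 3 while the deleted version is far larger. This is one reason the deleted version is often preferred for outlier screening.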
Further readings:

Regression with SAS - Regression Diagnostics
SAS version 9.3 PROC REG
SAS version 9.3 PROC GLM

Saturday, November 12, 2011

Outliers in clinical trial, Dixon's Q test for a single outlier

In clinical trials, we deal with the outlier issue differently from other fields. During the clinical trial, for suspected ‘outliers’, every effort should be made to query the investigator sites, to repeat measures, or to re-test the samples in order to get the correct information. Typically those suspected ‘outliers’ can be clarified during the data cleaning process. It is just not very common to throw away data (even if they are suspected to be ‘outliers’) in clinical trials. In one pharmacokinetics study, I did have to deal with suspected outliers (we used the term ‘exceptional value’ instead of ‘outlier’). After the sample re-test, we still had one very high value. Instead of throwing away this exceptional value, we had to perform the analysis with and without it.

In a presentation by an FDA officer, the terms ‘outlier’ and ‘anomalous result’ were distinguished:

Outlier subjects may be “real” results and are therefore very valuable in making a correct BE conclusion
Anomalous results are data that are not correct due to some flaw in study conduct or analysis

In many situations, it is very difficult to know for sure whether an exceptional value is an outlier or an anomalous result.

In ICH E9 "Statistical Principles for Clinical Trials", the handling of outliers was discussed in the section of "missing values and outliers".

5.3 Missing Values and Outliers

Missing values represent a potential source of bias in a clinical trial. Hence, every effort should be undertaken to fulfil all the requirements of the protocol concerning the collection and management of data. In reality, however, there will almost always be some missing data. A trial may be regarded as valid, nonetheless, provided the methods of dealing with missing values are sensible, and particularly if those methods are pre-defined in the protocol. Definition of methods may be refined by updating this aspect in the statistical analysis plan during the blind review. Unfortunately, no universally applicable methods of handling missing values can be recommended. An investigation should be made concerning the sensitivity of the results of analysis to the method of handling missing values, especially if the number of missing values is substantial.

A similar approach should be adopted to exploring the influence of outliers, the statistical definition of which is, to some extent, arbitrary. Clear identification of a particular value as an outlier is most convincing when justified medically as well as statistically, and the medical context will then often define the appropriate action. Any outlier procedure set out in the protocol or the statistical analysis plan should be such as not to favour any treatment group a priori. Once again, this aspect of the analysis can be usefully updated during blind review. If no procedure for dealing with outliers was foreseen in the trial protocol, one analysis with the actual values and at least one other analysis eliminating or reducing the outlier effect should be performed and differences between their results discussed.

I was recently asked for help to test an outlier for the data from a lab experiment (not a clinical trial).

The titer for the same sample was measured 20 times. The titer was 25 on 7 occasions and 125 on 12 occasions. However, on one occasion the titer was 625. Is there any way to test (statistically) whether the titer of 625 is an outlier?

Titer	25	125	625
N	7	12	1

There is a simple test for an outlier called Dixon's Q-test. Dixon’s Q-test calculates the Q value, which is the ratio of the Gap (the difference between the extreme value and the value immediately adjacent to it) to the Range (the difference between the maximum and minimum values).

In the case above, the titer values need to be log-transformed first. With a log10 transformation, the data are listed as follows (in order):

1.39794 1.39794 1.39794 1.39794 1.39794 1.39794 1.39794 2.09691 2.09691 2.09691
2.09691 2.09691 2.09691 2.09691 2.09691 2.09691 2.09691 2.09691 2.09691 2.79588

The gap = 2.79588 - 2.09691 = 0.69897
The range = 2.79588 - 1.39794 = 1.39794
The Q value = 0.69897 / 1.39794 = 0.5

The Q value is then compared with the critical value. The critical value can be found from various web sources or from the original paper. The critical value for N=(7+12+1) = 20 is 0.342.
Since the Q value is larger than 0.342, we can reject 2.79588 and conclude that the original value 625 (log-transformed value of 2.79588) is an outlier.
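
The arithmetic above can be reproduced with a short Python sketch. The function name is my own, and the critical value is simply the 0.342 quoted above for N = 20, not looked up by the code.

```python
import math


def dixon_q_high(values):
    """Dixon's Q for the largest value: gap / range."""
    s = sorted(values)
    gap = s[-1] - s[-2]   # extreme value minus its nearest neighbour
    rng = s[-1] - s[0]    # maximum minus minimum
    return gap / rng


# Titer data from the post: 25 (7 times), 125 (12 times), 625 (once).
titers = [25] * 7 + [125] * 12 + [625]

q_log10 = dixon_q_high([math.log10(t) for t in titers])
q_log5 = dixon_q_high([math.log(t, 5) for t in titers])

Q_CRIT_N20 = 0.342  # critical value quoted in the post for N = 20

print(f"Q (log10 scale) = {q_log10:.3f}")
print(f"Q (log5 scale)  = {q_log5:.3f}")
print("Flag 625 as outlier:", q_log10 > Q_CRIT_N20)
```

Because Q is a ratio of differences, it does not depend on the base of the logarithm, so both transformations give the same Q of 0.5.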

If we use a log base 5 transformation, the calculation is easier and the conclusion is the same: the values become 2, 3, and 4, so Q = (4 - 3) / (4 - 2) = 0.5.

This approach can only be used for detecting a single outlier. If more than one value falls in the 625 titer group, Dixon's Q test is not an appropriate approach.

Typically, outliers are identified in a continuous variable (ie, the data are continuous). The data above contain many ties (due to the design). Therefore, the results from Dixon’s Q-test need to be interpreted with caution. The determination of outliers should always be based on an understanding of the experimental data.

Saturday, October 29, 2011

Story of Xigris (Protein C) for Sepsis

Xigris, also called drotrecogin alfa (activated), a recombinant form of activated Protein C, was the only approved drug for the severe sepsis indication, and it was withdrawn from the market last week (Oct 25, 2011). In a recently completed clinical trial (the PROWESS-SHOCK trial), Xigris failed to show a survival benefit. Due to the early controversies over Xigris’s approval and the continuing debate on its risk-benefit, the PROWESS-SHOCK trial had been under watch since its start. The study design, statistical analysis plan, and unblinding plan were all published well before the completion of the trial.

A decade ago, prior to the approval of Xigris for the sepsis indication, its risk-benefit profile was debated quite a bit. Xigris was known to be linked to an increased risk of serious bleeding. "Controversy surrounds both the drug study itself and the FDA approval," wrote NEJM editor-at-large Richard P. Wenzel, MD, in 2002. The FDA held an anti-infective advisory committee meeting on Xigris for treating sepsis and approved the drug despite the committee's split vote (10 to 10), which reflected concerns about the validity of the claimed efficacy and safety findings on the basis of a single trial. At that time, Xigris was approved based on a single pivotal trial (the PROWESS trial) that had also been stopped early for efficacy. The FDA reviewers certainly believed that Xigris was beneficial and could save a lot of lives.

PROWESS trial has been the model for other clinical trials in Sepsis even though the PROWESS trial itself has been criticized for changes in the protocol during the trial. According to the NEJM article by H. Shaw Warren, MD, from Massachusetts General Hospital in Boston, and fellow consultants to the FDA, the study protocol changed during the PROWESS trial, shifting the study population composition toward patients with less severe underlying disease and more acute infectious illnesses. Other changes included use of a different placebo and elimination of protein C deficiency status as a primary variable. Around the same time, Lilly began producing the drug using a new master cell bank. Cumulative mortality curves suggest an improvement in protective efficacy of Xigris after these changes were made.

Subsequent trials have now shown that Xigris has no benefit and has an unfavorable risk-benefit profile. The ADDRESS trial (published in 2005) showed the absence of a beneficial treatment effect, coupled with an increased incidence of serious bleeding complications. The result indicates that Xigris should not be used in patients with severe sepsis who are at low risk for death, such as those with single-organ failure or an APACHE II score less than 25. Now the PROWESS-SHOCK trial has further confirmed that the risk of bleeding outweighs the benefit in reducing mortality. Unfortunately, this comes a decade after Xigris came on the market.

The marketing practice for Xigris has also been criticized. Several years ago, there was a lot of talk about Lilly’s influence on a committee defining the sepsis treatment guidelines, which were in favor of using Xigris.

In retrospect, there are some lessons to learn from the Xigris story: 1) a single pivotal trial may be insufficient to confirm the treatment benefit; 2) changing the protocol during the trial could bias the trial results; 3) stopping a trial early for efficacy may be risky.

New drugs for life-threatening diseases such as sepsis are desperately needed. However, demonstrating the benefit of any drug in the complicated setting of sepsis treatment is a challenging task. The diversity of sepsis treatment across institutions makes clinical trials in sepsis very difficult, and the sample size for sepsis trials needs to be sufficiently large to show the benefit.

Thursday, October 27, 2011

A medical joke to share

Not sure where the origin is. It is circulated quite a bit.

Best friends graduated from medical school at the same time and decided that, in spite of two different specialties, they would open a practice together to share office space and personnel.

Dr. Smith was the psychiatrist and Dr. Jones was the proctologist; they put up a sign reading: "Dr. Smith and Dr. Jones: Hysterias and Posteriors". The town council was livid and insisted they change it.

So, the docs changed it to read: "Schizoids and Hemorrhoids". This was also not acceptable, so they again changed the sign. "Catatonics and High Colonics" - No go.

Next, they tried "Manic Depressives and Anal Retentives" - thumbs down again. Then came "Minds and Behinds" - still no good. Another attempt resulted in "Lost Souls and Butt Holes" - unacceptable again! So they tried "Analysis and Anal Cysts" - not a chance. "Nuts and Butts" - no way. "Freaks and Cheeks" - still no good. "Loons and Moons" - forget it.

Almost at their wit's end, the docs finally came up with: "Dr. Smith and Dr. Jones - Specializing in Odds and Ends". Everyone loved it.

Saturday, October 15, 2011

Will Electronic Data Capture be always better than Paper-CRF?

The traditional way to do the clinical trial data management is to use the paper based case report forms (CRFs). The blank paper CRFs are distributed to the investigator sites. The investigator or study coordinator fills out the CRFs. CRFs will then be monitored and collected from the investigator sites. CRFs will subsequently be handled by a centralized group - data management group where the activities include the clinical database building, data entry, data cleaning, data clarification,...

The industry has been gradually moving away from paper-based CRFs and toward electronic data capture (EDC). In the EDC world, the database is built prior to the study start (a significantly longer lead time before the study start is needed). The data are entered directly into the database by the investigator site (investigator or study coordinator). EDC has been touted by many vendors as the preferred way of conducting clinical trials: getting the data fast, shortening timelines, saving cost, minimizing data transcription errors... While this is generally true, it is not universal.

In some situations, a trial using traditional paper-based CRFs is a better choice than EDC. For example, in a clinical trial for a rare disease, there are many investigator sites and each site may enroll only very few subjects or none at all. EDC will not be an efficient way to collect the data. Many site staff will be trained on EDC and never have a chance to enroll any patient into the study or to use the EDC system. When a site finally has a chance to enroll a subject, the initial training on using EDC may be a distant memory.

An EDC trial is not always cheap. With an EDC trial, significant cost may be spent on EDC system hosting and help desk support. Imagine a slowly enrolling trial running for 7-8 years: the cost of hosting the EDC system and providing help desk support over that period may be too high compared to a paper-based study.

While EDC is a trend, the adoption of EDC is not universal. In some situations, the traditional paper CRFs may be better.

Tuesday, September 13, 2011

Confidence Interval for Difference in Two Proportions

In many clinical trials, the outcome is binomial and a 2 x 2 table can be constructed. The analysis can be based on the difference in two proportions (treatment group vs. control group). SAS Proc Freq can be used to obtain the difference between the proportions and the asymptotic confidence interval for that difference. The formula is (p1 - p2) +/- Z(alpha/2) * sqrt(p1*q1/n1 + p2*q2/n2), where q1 = 1 - p1 and q2 = 1 - p2.
However, the asymptotic confidence interval produced by PROC FREQ requires a somewhat large sample size (say, cell counts of at least 12); this is the case at least for SAS versions up to 9.2. For moderately small sample sizes, it is better to use the formula provided in Fleiss (1981, page 29) and Stokes et al. (2000, pages 29-30), where the half-width of the confidence interval is increased by 0.5*(1/n1 + 1/n2), making the interval a little wider. The confidence interval directly from SAS Proc FREQ is a little narrower than the one from this formula. In practice, the statistician needs to choose which one to use for the difference in proportions depending on the sample size.

Fleiss, JL (1981) Statistical Methods for Rates and Proportions. New York: John Wiley & Sons, Inc.
Stokes, Davis, and Koch (2000) Categorical Data Analysis using the SAS System, 2nd edition
FDA Draft Guidance on Tazarotene detailed the calculation of the 90% confidence interval for establishing the bioequivalence for the clinical endpoint using the second approach mentioned above.

The example from the book by Stokes et al. can be implemented in SAS using the following code:

data respire2;
input treat $ outcome $ count @@;
datalines;
test    f 40
test    u 20
placebo f 16
placebo u 48
;

*** the confidence interval directly from SAS PROC FREQ;
proc freq order=data;
weight count;
tables treat*outcome / riskdiff;
run;

*** the confidence interval calculated from the formula (see section 2.4 Difference in Proportions
     in Stokes et al 'Categorical Data Analysis Using the SAS System' 2nd edition);
proc freq data=respire2 order=data;
    weight count;
    tables treat/noprint out=tots (drop=percent rename=(count=bign));
run;

proc freq data=respire2;
    weight count;
    tables treat*outcome/noprint out=outcome (drop=percent);
    run;

proc sort data=tots;
by treat;
run;

proc sort data=outcome;
    by treat;
run;

data prop;
    merge outcome tots;
    by treat;
    if treat='test' then p1=count/bign;
    if treat='placebo' then p2=count/bign;
run;

data prop1(rename=(count=count1 bign=bign1)) prop2(rename=(count=count2 bign=bign2));
     set prop;
     if treat='test' then output prop1 ;
     if treat='placebo' then output prop2;
run;

data proportion;
merge prop1(drop= p2 treat) prop2(drop = p1 treat);
run;

***Calculate the difference in proportions, SE, and 95% confidence interval using formula by Fleiss;
data cal;
set proportion;
    variance=(p1*(1-p1)/(bign1)) + (p2*(1-p2)/(bign2));
    diff=(p1-p2);
    lower=(diff - ((1.96*(sqrt(variance)) + .5*(1/bign1 + 1/bign2))));
    upper=(diff + ((1.96*(sqrt(variance)) + .5*(1/bign1 + 1/bign2))));
    se=(sqrt(variance));

run;

proc print;
format p1 p2 variance diff lower upper se 5.3;
run;
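
As a quick cross-check of the two intervals outside SAS, the same arithmetic can be sketched in Python. The function name is my own, and the counts are those of the data step above (test: 40 of 60 favorable; placebo: 16 of 64 favorable).

```python
import math


def diff_ci(x1, n1, x2, n2, z=1.96, continuity=False):
    """Difference in proportions with an asymptotic (Wald) confidence interval.

    With continuity=True the half-width is widened by 0.5*(1/n1 + 1/n2),
    as in the formula described above.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    half = z * se
    if continuity:
        half += 0.5 * (1 / n1 + 1 / n2)
    return diff, diff - half, diff + half


wald = diff_ci(40, 60, 16, 64)
adjusted = diff_ci(40, 60, 16, 64, continuity=True)

print("Wald:     diff=%.3f  95%% CI=(%.3f, %.3f)" % wald)
print("Adjusted: diff=%.3f  95%% CI=(%.3f, %.3f)" % adjusted)
```

The adjusted interval is wider than the unadjusted one by 0.5*(1/60 + 1/64) on each side, about 0.016 here.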

Friday, September 09, 2011

Is it time to change the clinical monitoring practice in clinical trials?

In industry, the current monitoring practice relies on ‘on-site monitoring’ and 100% source data verification (on all data fields). This process is very costly and is one of the main reasons that clinical trials have become so expensive. It is really the most conservative interpretation of the ICH E6 Guideline for Good Clinical Practice and the FDA’s 1988 “Guidance for the Monitoring of Clinical Investigations”. These guidances only require that “the sponsor should ensure that the trials are adequately monitored” and leave the door open in terms of the frequency of monitoring and the approaches to clinical monitoring. In industry, the conduct of clinical trials is highly regulated. Sponsors usually take the most conservative approaches no matter how costly these approaches are.

Is ‘on-site monitoring’ really effective? Is 100% source data verification really needed? Should we identify new ways to conduct cost-effective clinical monitoring?

Last month, FDA withdrew its 1988 guidance on “Guidance for the Monitoring of Clinical Investigations” and issued its draft guidance “Oversight of Clinical Investigations - A Risk-based Approach to Monitoring”. The newly issued guidance suggested it is acceptable to use alternative approaches (such as remote monitoring, centralized monitoring, risk-based monitoring). The guidance also suggested that the source data verification should be focused on critical fields (key efficacy and safety variables) and less than 100% source data verification on less important fields may be acceptable. The guidance gives a clear signal that the Sponsors are encouraged to explore the cost-effective ways to conduct the clinical monitoring instead of solely relying on the on-site monitoring.

If this guidance gets implemented, we may expect the increasing role of statisticians in clinical monitoring, especially the centralized monitoring. Currently, statisticians will identify the issues in the late stage of the clinical trials when statisticians or statistical programmers start to perform the data analyses. The new guidance says:

“…notably, the advancement in EDC systems enabling centralized access to both trial and source data and the growing appreciation of the ability of statistical assessments to identify clinical sites that require additional training and/or monitoring.”

“Centralized monitoring is a remote evaluation carried out by sponsor personnel or representatives (e.g., data management personnel, statisticians, or clinical monitors) at a location other than the site(s) at which the clinical investigation is being conducted.”

Sunday, August 21, 2011

Odds ratio and risk ratio in clinical trials #2

In my previous article, I discussed the odds ratio and risk ratio (or relative risk ratio). In clinical trials with a binary outcome, both the odds ratio and the relative risk ratio are used. Since clinical trials are similar to cohort studies in the epidemiology field, it seems more reasonable to use the relative risk ratio in clinical trials. However, the odds ratio may be more commonly used in practice. This may be due to the fact that the odds ratio can be easily modeled using logistic regression. It could also be due to the fact that the odds ratio is further from 1 than the relative risk ratio (noticeably so when the outcome is common), which may be desired by the researcher.

For non-inferiority or equivalence trials with a binary outcome, one may desire a smaller standard error and therefore a narrower confidence interval. In this case, the relative risk ratio may be better than the odds ratio.

Wikipedia gives an excellent comparison between relative risk ratio and odds ratio.

In the article “How can I estimate relative risk in SAS using proc genmod for common outcomes in cohort studies?”, the calculation of the odds ratio, the relative risk ratio, and their confidence intervals is illustrated.

Using SAS Proc Genmod, the odds ratio, the relative risk ratio, and their confidence intervals can all be easily calculated:

For odds ratio:

proc genmod data = xxx descending;
    class treatment;
    model outcomevariable = treatment
                              / dist = binomial link = logit;
    estimate 'Beta' treatment 1 -1/ exp;
run;

Here, the “link=logit” can be omitted since the logit link function is default when distribution is binomial.

For relative risk ratio:

proc genmod data = xxx descending;
    class treatment;
    model outcome = treatment
                              / dist = binomial link = log;
    estimate 'Beta' treatment 1 -1/ exp;
run;

Here, the “link=log” can NOT be omitted since the log link function is NOT default when distribution is binomial.

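The relative risk ratio can also be estimated using Poisson regression, especially when the event rate is small. In the code below, the REPEATED statement requests robust (sandwich) standard errors, since a Poisson model applied to binary data otherwise overstates the variance.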

proc genmod data = eyestudy;
    class treatment id;
    model outcome = treatment
                    / dist = poisson link = log;
    repeated subject = id / type = unstr;
    estimate 'Beta' treatment 1 -1 / exp;
run;

Here, the “link=log” can be omitted since the log link function is the default when the distribution is Poisson.

There are several advantages of using Proc Genmod to calculate the odds ratio and risk ratio. Adjusted odds ratio and adjusted relative risk ratio can be easily calculated when there are continuous or categorical covariates. The model can be easily modified to fit the longitudinal data.

Proc Logistic can be used for calculating the odds ratio (and the confidence interval) and can adjust for continuous or categorical covariates. However, Proc Logistic can not be used for calculating the relative risk ratio.

proc logistic;
     model outcome = treatment;
run;

Proc FREQ can be used for calculating the odds ratio and relative risk ratio (and asymptotic confidence intervals) using the /cmh option. For an adjusted odds ratio or risk ratio, only categorical covariates can be used, since the covariate enters as a stratification variable.

proc freq order=data;
    tables covariate*treatment*response / CMH;
run;

In the output, the odds ratio will be explicitly indicated while relative risk ratio will be labeled as “col1 risk” or “col2 risk”.

There is no regulatory guidance forcing the use of the odds ratio or the relative risk ratio. However, in clinical trials, if we compare the ratio of two proportions (e.g., the proportion of success in the treated group vs. the proportion of success in the control group), the relative risk ratio seems to be better. The relative risk ratio resembles the hazard ratio in many aspects.

In FDA’s guidance “Diabetes Mellitus — Evaluating Cardiovascular Risk in New Antidiabetic Therapies to Treat Type 2 Diabetes”, the calculation of the risk ratio is suggested. The guidance states:
“Sponsors should compare the incidence of important cardiovascular events occurring with the investigational agent to the incidence of the same types of events occurring with the control group to show that the upper bound of the two-sided 95 percent confidence interval for the estimated risk ratio is less than 1.8. This can be accomplished in several ways. The integrated analysis (meta-analysis) of the phase 2 and phase 3 clinical trials described above can be used. Or, if the data from all the studies that are part of the meta-analysis will not by itself be able to show that the upper bound of the two-sided 95 percent confidence interval for the estimated risk ratio is less than 1.8, then an additional single, large safety trial should be conducted that alone, or added to other trials, would be able to satisfy this upper bound before NDA/BLA submission. Regardless of the method used, sponsors should consider the entire range of possible increased risk consistent with the confidence interval and the point estimate of the risk increase. For example, it would not be reassuring to find a point estimate of 1.5 (a nominally significant increase) even if the 95 percent upper bound was less than 1.8.”

In a presentation by Dr Bob O’Neill, “Non-Inferiority Clinical Trials: Some key statistical issues and concepts”, he suggested that log(hazard ratio) or log(relative risk) is preferred when determining the non-inferiority margin.
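To make the 1.8 criterion from the diabetes guidance concrete, here is a small Python sketch that builds the two-sided 95% confidence interval on the log(relative risk) scale and compares its upper bound with the margin. The event counts are invented for illustration, the function name is hypothetical, and the simple Wald interval is an assumption; an actual submission would follow the analysis method pre-specified in the protocol.

```python
import math

def rr_upper_bound(events_new, n_new, events_ctl, n_ctl, z=1.96):
    """Risk ratio and the upper limit of its two-sided 95% Wald CI."""
    rr = (events_new / n_new) / (events_ctl / n_ctl)
    # Standard error of log(RR); the interval is built on the log scale
    se_log_rr = math.sqrt(1 / events_new - 1 / n_new
                          + 1 / events_ctl - 1 / n_ctl)
    return rr, rr * math.exp(z * se_log_rr)

# Hypothetical cardiovascular event counts: 120/4000 vs 110/4000
rr, upper = rr_upper_bound(120, 4000, 110, 4000)
meets_margin = upper < 1.8   # margin quoted in the guidance
print("RR = %.2f, upper 95%% bound = %.2f, below 1.8: %s"
      % (rr, upper, meets_margin))
```

As the guidance notes, passing the upper-bound check is not the whole story: the point estimate itself still has to be considered.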

In the statistical review for the Maxipime (cefepime hydrochloride) NDA, the risk ratio and 95% confidence interval are used. In an article “Relative risks of reported serious injury and death associated with hemostasis devices by gender”, risk ratios were reported.

However, there are also many cases of using odds ratio instead of risk ratio in clinical trials. Some examples are:

In summary, while both the odds ratio and the risk ratio can be used in clinical trials, the risk ratio should be given adequate emphasis when comparing the ratio of two proportions (between two treatment groups). In non-inferiority clinical trials, the risk ratio and its confidence interval are preferred.

Thursday, August 18, 2011

FDA's new guidance on device approval process and device clinical trial

For a long time, device makers have been complaining about the FDA's device approval process. At the same time, I have personally heard many people say that the device approval process is easier than the approval process for drugs and biological products. The requirements for clinical trials in device approval are lower than those for clinical trials of drugs and biological products, and the FDA device center (CDRH) is seen as having looser criteria for product approval. In response to the critics, FDA released two new draft guidances on August 15, 2011 and is seeking to educate industry on device approval.

The first guidance “Factors to Consider when Making Benefit-Risk Determinations in Medical Device Premarket Review” explains the agency's approval process for diagnostic and therapeutic devices, specifically:

How the agency weighs the benefits and risks of a device
How the agency assesses the seriousness of a disease or condition
How many people would use the device if approved
The availability of other devices approved to treat the same condition

Interestingly, in the examples used in this guidance, several clinical trials were described as flawed or as having unreliable data, yet FDA would approve the device anyway. This is just another indication that CDRH indeed has a lower data quality standard than CDER has for drugs and CBER has for biological products.

In the second draft document “Design Considerations for Pivotal Clinical Investigations for Medical Devices”, the FDA laid out its expectations for clinical trials for medical devices. The agency stated that it looks for a study to provide reasonable assurance that the device is safe and effective.

Back in July 2011, FDA issued a guidance on “In Vitro Companion Diagnostic Devices”, where the in vitro companion diagnostic device (IVD companion diagnostic device) is defined as an in vitro diagnostic device that provides information that is essential for the safe and effective use of a corresponding therapeutic product. The guidance is intended to accomplish the following:

Define in vitro companion diagnostic device
Explain the need for FDA oversight of IVD companion diagnostic devices
Clarify that, in most circumstances, if use of an IVD companion diagnostic device is essential for the safe and effective use of a therapeutic product, the IVD companion diagnostic device and therapeutic product should be approved or cleared contemporaneously by FDA for the use indicated in the therapeutic product labeling
Provide guidance for industry and FDA staff on possible premarket regulatory pathways and FDA’s regulatory enforcement policy
Describe certain statutory and regulatory approval requirements relevant to therapeutic product labeling that stipulates concomitant use of an IVD companion diagnostic device to ensure safety and effectiveness of the therapeutic product

No matter what the device is, if clinical trials are required, the statistical analyses can be based on the FDA guidance “Statistical Guidance on Reporting Results from Studies Evaluating Diagnostic Tests” that was issued in 2007.

In the device field, not all approvals require clinical trials. If clinical trials are required, they do not have to be interventional. If clinical trials are interventional, they do not have to be randomized and controlled. If clinical trials are randomized and controlled, they do not have to be double-blinded. Some clinical trials in the device field may use samples (for example, blood samples) from patients with no intervention performed. The blood samples could be historical retains or prospectively collected. The use of the blood samples for a device clinical trial still requires informed consent from the patient.

Saturday, August 13, 2011

Equipoise and Lack of Equipoise in Randomized Clinical Trials

According to the Merriam-Webster dictionary, the word "equipoise" means a state of equilibrium. In clinical trials, the concept of 'clinical equipoise' means that there is genuine uncertainty over whether a treatment will be beneficial. In other words, in randomized controlled clinical trials, there should be substantial uncertainty, with no clear evidence that one treatment arm is particularly better or worse. Clinical equipoise provides the ethical basis for medical research involving patients assigned to different treatment arms of a clinical trial: it is unethical to assign a subject to an inferior arm when equipoise is lacking, that is, when there is substantial evidence that one treatment is better or worse than another.

In the real world, when we plan a randomized, controlled clinical trial, 'lack of equipoise' may often exist. This is especially true in late-stage clinical trials, where there is typically some evidence about the benefit and treatment effect of the experimental drug from early phase I or phase II clinical trials. Our sample size calculation is based on the assumed treatment effect of the experimental drug.

We recently published a paper in the Journal of Neurology, "Challenges of clinical trial design when there is lack of clinical equipoise: use of a response-conditional crossover design", where we discussed a situation of 'lack of equipoise' and the use of a response-conditional crossover design to ease the concern about the lack of equipoise in clinical trial design. Several small trials had suggested that IVIg is beneficial in treating the disease CIDP, which created a lack of equipoise. However, in the absence of an approved treatment for this indication, gaining regulatory approval for the use of IVIg in this indication required the conduct of large-scale, placebo-controlled confirmatory trials. Using the response-conditional crossover design, we eased the concern about subjects being exposed to the 'perceived' inferior arm (placebo in this case). The results indicated that we minimized subjects' exposure to the inferior treatment arm.

Interpretation of 'clinical equipoise' may differ among clinicians, investigators, patients, clinical trial sponsors, and regulatory agencies. Evidence of treatment effects from small-scale clinical trials may be regarded as real evidence by clinicians and patients, but not by regulatory agencies.

When we bring the overall benefit/risk into the picture for assessing 'clinical equipoise', it may be difficult to determine whether or not clinical equipoise exists. A new treatment may have been demonstrated to be beneficial in efficacy, but with great uncertainty in safety.

Thursday, July 14, 2011

Data Query on Diary and Patient Reported Outcome?

We keep hearing that for patient-reported outcomes (PRO), including patient diary data, no query or data clarification is required, and that diary data is source data and cannot be queried. For example, a paper titled “How to clean up dirty data in Patient reported outcomes” said “The investigator is not allowed to query any of the patient's answers which leads in general to a lot of dirty data.” Is it true that diary or patient-reported outcome data cannot be queried under any circumstance, no matter how poor the data quality is? This popular perception is not true.

Data clarification or data query is an essential process in clinical data management to ensure that the questionable data are corrected. According to EMA reflection paper ON EXPECTATIONS FOR ELECTRONIC SOURCE DOCUMENTS USED IN CLINICAL TRIALS, “Data clarification is part of the process to ensure complete data and the clarification process is one of the processes underlining the need for the maintenance of the audit trail.”

In a clinical study with paper-based case report form, the data clarification is typically issued by the data managers to the investigation sites. The investigator then reviews the issues to provide the responses to the data query. Data managers or clinical monitors can not directly make the changes to the data without issuing query and getting approval from the investigation site. In a clinical study with electronic data capture (EDC), the data clarification/query process is built into the EDC system. The data is entered at the investigation site. The query is issued by data managers or study monitors within EDC system. The investigator will then provide the responses to the query also within EDC system and make the data corrections.

The process becomes vague for patient-reported outcomes (PRO) or patient diaries (no matter whether they are on paper or electronic). The key difference for PRO data is that the data is directly recorded or entered by the subject or patient. There is a perception that no matter how poor the data is, there is no data clarification or query process for PRO data or diary data.

In many clinical trials, the study endpoints rely on the collection of information provided by the patient (daily symptoms, daily activities, quality of life,…). For example, for clinical studies on urinary incontinence, the primary efficacy endpoint may be the frequency of urinary incontinence episodes (UIE) per week as determined from the patient's daily diary. For clinical studies on female sexual dysfunction, the clinical endpoint may be the sexual events or encounters recorded daily by the study subjects using diaries.

I used to work on a clinical trial in irritable bowel syndrome where the study endpoints were collected by a touch-tone telephone system (IVR, interactive voice response system). Efficacy parameters were symptom relief, abdominal discomfort or pain, bloating, stool frequency, stool consistency, straining, and urgency, recorded through IVR on a daily basis by the study subjects. For stool frequency, the subjects were asked “how many bowel movements did you have today?” As a statistician, I had to check for outliers before analyzing the data. I found some entries with an extremely high number of bowel movements (55, 66). When I discussed these impossible numbers with the study manager, I was told that the patient diary could not be queried. I pointed out that if these obvious data errors could not be corrected, the study data would be severely compromised. Later, we identified many more entries with repeated digits (such as 11, 22, 33, 44, 55, 66). After inquiring with the study subjects, we found that subjects had pressed the telephone number key twice for 1, 2, 3, 4, 5, and 6. This is just one example showing that diary data can easily be recorded wrongly in the database and that querying for accuracy is necessary.
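A check like the one described above is easy to automate. The Python sketch below flags daily counts that look like a double key press (two identical digits) or that exceed a plausibility limit; the function name, the sample data, and the limit of 20 are all hypothetical choices for illustration. Flagged entries would be sent for query, not corrected automatically.

```python
def flag_suspect_counts(counts, max_plausible=20):
    """Return the positions of entries that should be queried."""
    flagged = []
    for position, value in enumerate(counts):
        digits = str(value)
        # 11, 22, ..., 99: possibly the same key pressed twice
        doubled = len(digits) == 2 and digits[0] == digits[1]
        if doubled or value > max_plausible:
            flagged.append(position)
    return flagged

# Hypothetical daily stool-frequency entries for one subject
daily_counts = [2, 3, 55, 1, 11, 4, 66, 0]
suspects = flag_suspect_counts(daily_counts)
print(suspects)  # positions of 55, 11 and 66
```

Note that 11 is flagged even though it is below the plausibility limit, because only the subject can say whether 11 or 1 was intended.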

FDA’s guidance “Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims” touches on ensuring the quality of PRO data. In its discussion of ePRO, the guidance raises the concern about the clinical investigator's inability to maintain and confirm electronic PRO data accuracy. It says “the data maintained by the clinical investigator should include an audit trail to capture any changes made to the electronic PRO data at any point in time after it leaves the patient’s electronic device.” Furthermore, the guidance raises the concern about the “ability of any entity other than the investigator (and/or site staff designated by the investigator) to modify the source data”. These statements implicitly indicate that the diary data can be queried and modified by the clinical investigator (and/or site staff designated by the investigator) and that the clinical investigator has the responsibility to ensure the completeness and accuracy of the diary data.

Thursday, July 07, 2011

Adaptive Licensing for Drug Approval

At this year’s Drug Information Association (DIA) annual conference in Chicago, I learned about a new concept, “Adaptive Licensing”. According to http://www.bnid.org/node/6921, “in the past five years, a wave of proposals for reform of drug licensing has emerged in the EU, US, and Canada under the labels of staggered approval, adaptive licensing, managed entry and progressive authorization. Through iterative phases of information gathering followed by regulatory evaluation and correction, these approaches seek to align licensing decisions on market access of drugs with emerging information on benefits and harms of drugs as actual used. It is hoped that this approach will provide patients with earlier access to innovative drugs to address unmet medical needs, better management of known risks, and improved detection of unanticipated adverse effects that emerge in use.”

The current drug licensing process is called the ‘phased approach’, which requires sponsors to conduct a series of clinical trials from phase I to phase III to establish safety/tolerability and to confirm efficacy before the regulatory agencies can consider approving marketing authorization of a product. This phased approach and the purpose of each phase of the clinical trial are discussed in this free web article. Recently, with the adaptive design concept, we are trying to break the traditional phased approach. Seamless phase II/III studies and seamless phase I/II studies have been much discussed and debated. Even with adaptive design, a series of clinical trials is still required before a drug can be approved. Let’s call this “learning and confirming”: from early trials for learning to late-stage trials for confirmation of safety and efficacy with adequate and well-controlled (A&WC) clinical study(ies).

With the phased drug approval approach, there is a magic moment during the drug approval process. This magic moment is the regulatory reviewer’s action date (or decision date), the date on which a regulatory action, such as an original or supplemental approval, takes place. The regulatory agencies base their decision to approve or not approve a product for marketing authorization on the review of data from pre-marketing clinical studies. If the efficacy and safety have been demonstrated, the product is approved; if the efficacy and safety have not been sufficiently demonstrated, additional clinical studies may be requested; if the efficacy and safety are not demonstrated, the application for marketing authorization will be denied.

The adaptive licensing is trying to remove this magic moment from the drug approval process and instead considers the drug licensing as a continuous process. Whether or not a product should be authorized for marketing depends on the risk-benefit ratio. When the benefit outweighs the risk, the product should be approved; when the risk outweighs the benefit, the product should not be approved. For an approved product, if the benefit/risk ratio becomes unfavorable, the product should be withdrawn from the market. I liken this continuous adaptive drug licensing process to the p-value assessment in statistics. The magic moment is like the magic number of p=0.05 (p-value should be a continuous number and p=0.051 (not significant) may not be different from p=0.049 (significant)).

It looks like the EU is pioneering the implementation of adaptive licensing. EMA has started to work on Adaptive Licensing to Reach Roadmap Goals (Roadmap to 2015). In the US, there is a similar program called “Accelerated Approval of New Drugs for Serious or Life-Threatening Illnesses”. With this program, “FDA may grant marketing approval for a new drug product on the basis of adequate and well-controlled clinical trials establishing that the drug product has an effect on a surrogate endpoint that is reasonably likely, based on epidemiologic, therapeutic, pathophysiologic, or other evidence, to predict clinical benefit or on the basis of an effect on a clinical endpoint other than survival or irreversible morbidity. Approval under this section will be subject to the requirement that the applicant study the drug further, to verify and describe its clinical benefit, where there is uncertainty as to the relation of the surrogate endpoint to clinical benefit, or of the observed clinical benefit to ultimate outcome. Postmarketing studies would usually be studies already underway. When required to be conducted, such studies must also be adequate and well-controlled. The applicant shall carry out any such studies with due diligence.”

However, the recent debates on whether or not Avastin (bevacizumab) should be withdrawn from the market for breast cancer demonstrated how difficult it is to implement this program. Based on the established rule, if a drug is approved through ‘accelerated approval’, the approval is conditional and the drug can be withdrawn from the market if it is later shown to be ineffective or to have an unfavorable benefit/risk ratio. The Avastin breast cancer case showed how difficult it is to withdraw a product due to ineffectiveness. In an even worse situation, drug companies may not fulfill the commitment to finish the follow-up studies. There have been several cases of marketing withdrawal due to safety concerns (for example, Vioxx), but it seems to be more difficult to withdraw a product from the market due to efficacy concerns.

Adaptive licensing may be the future direction for drug approval process; however, many issues need to be considered and resolved before this new process can be implemented.

What is the impact of adaptive licensing on patent expiration?
What is the tipping point at which conditional approval can be granted?
How can the benefit/risk ratio be assessed with sound scientific / statistical approaches (i.e., avoiding subjective assessment of the benefit/risk ratio)?
What if the committed follow-up studies are never completed?
How should we deal with the difficulty of withdrawing a product when other factors (for example, the emotional effect in the Avastin case) come into play?

Further readings:

Adaptive Licensing: Taking the Next Step in the Evolution of Drug Approval
Commentary by Janet Woodcock

Saturday, June 18, 2011

Is blinded study really blinded? - assessment of blinding / unblinding in clinical trials

Randomization and blinding are critical components of a clinical trial from the start (design) to the end. The randomized, controlled, double-blinded trial (RCT) has been the ideal clinical trial design. Inappropriate randomization and blinding (or potential unblinding) affect the integrity of the clinical trial. If the patient or investigator is aware of the treatment assignment, there will be conscious or unconscious biases in assessing efficacy, safety, or patient-reported outcomes. With available software and computer programs, generating a randomization schedule is relatively easy. Ten years ago, I wrote a paper on “Generating randomization schedule using SAS programming” to show how easily a randomization schedule can be generated. With interactive response technologies (IRT), including interactive voice response systems (IVRS) and interactive web response systems (IWRS), implementation of the randomization can also be easily managed. However, maintaining the blinding during the study may not be as easy as we thought.
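To illustrate how little code a basic schedule needs, here is a minimal Python sketch of permuted-block randomization. It is not the SAS program from the paper mentioned above; the function name, block size, arm labels, and seed are hypothetical choices, and a production schedule would add stratification, documentation, and controlled access.

```python
import random

def permuted_block_schedule(n_subjects, block_size=4, arms=("A", "B"), seed=2011):
    """Build a 1:1 (or equal-allocation) schedule from shuffled blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed so the schedule is reproducible
    schedule = []
    while len(schedule) < n_subjects:
        # Each block holds every arm equally often, in random order
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_subjects]

schedule = permuted_block_schedule(12)
for subject_number, arm in enumerate(schedule, start=1):
    print(subject_number, arm)
```

Blocking keeps the arms balanced after every completed block, which is one reason small block sizes can themselves make the next assignment easier to guess.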

I still remember the time when I was one of the randomization team members at PPD. After we generated the randomization schedule, we had to put it into an envelope and seal the envelope with signatures. Then we had to put the envelope into a locked security box in a secured randomization room. To retrieve the randomization schedule, at least two statisticians had to be present to open the security box.

While the actual randomization schedule is locked and secured, the randomization information or treatment assignment concealment can still be compromised by what happened at the site, how the patient and investigator guess the treatment assignment, and how the unblinded personnel communicate with the blinded team members.

There are many factors that can cause the potential unblinding. Here are some examples:

Guess treatment assignment by the experience of adverse events and side effects. Suppose an intravenously administered drug causes more headaches than placebo; a patient with a headache may guess that he/she is in the treatment group and not on placebo. While this guess may not be 100% accurate, the majority of patients may guess their treatment assignment correctly. In the book by Chow et al, ‘Design and analysis of clinical trials: concepts and methodologies’, an example of the challenge in maintaining blinding is described: “beta-blocker (e.g., propranolol) have specific pharmacologic effects such as lowering blood pressure and the heart rate and distinct adverse effects such as fatigue, nightmares, and depression. Since blood pressure and heart rate are vital signs routinely evaluated at every visit in clinical trials, if a drug such as propranolol is known to lower blood pressure and the heart rate, then preservation of blindness is a huge challenge and seems almost impossible.” In a large-scale study (the BHAT study), at the conclusion of the trial, patients, investigators, and clinic coordinators were asked to guess each patient’s treatment assignment. For patients on propranolol, 79.9%, 69.6%, and 67% of patients, investigators, and clinic coordinators, respectively, guessed correctly; for patients on placebo, 42.8%, 58.6%, and 70.6%, respectively, guessed correctly.

Guess treatment assignment by improvement or no improvement in efficacy. If there is prior knowledge that a treatment is effective (lack of equipoise), the investigator or patient can guess which treatment the patient is on based on the presence or lack of effect.

Guess treatment assignment by knowing the blood concentration of the drug or analytes. If a treatment is for augmentation purposes, a patient could have his/her blood sample tested to learn whether the concentration of the augmented drug has increased, and then guess which treatment group he/she is in.

In double-blinded studies, there are always some unblinded groups. These groups could include global drug safety for safety monitoring, laboratories that measure drug concentration or biomarkers, study drug supplies, site unblinded pharmacist… all of these groups could potentially reveal the treatment assignment to other study team unintentionally.

For a study with a DMC that involves a third party preparing the unblinded data for the DMC, treatment concealment could potentially be compromised during the information exchange with the blinded study team. This is critical for studies with adaptive designs, where the patient data needs to be constantly reviewed and analyzed. An interesting example was discussed by Janet Wittes regarding an awkward situation in an adaptive design where the DMC knew the event rate by treatment assignment and the sponsor didn’t.

There is a dilemma when we develop the informed consent form. On the one hand, we are required to put into the informed consent form as much information as we can. On the other hand, the more information we put into the informed consent form, the more likely we enable the patients to guess their treatment assignment (based on their experience of side effects or perceived efficacy).

Ideally, in a double-blind trial, it is a good practice to evaluate for both the subjects and investigators whether or not blinding / masking has been preserved. However, in the real world, it is rare in double-blinded clinical trials to include a formal assessment of how well the blinding has been preserved. If the assessment of blinding becomes a routine, I think that many studies will show that subjects/investigators guessed correctly more frequently than they should have done by chance alone. Part of the reason this assessment has not been done often is perhaps the difficulty to explain the study results if the blinding is found to be compromised. It will be extremely difficult to assess the magnitude of the impact on the safety and efficacy evaluation if the blinding/treatment assignment concealment is compromised.
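One very simple way to ask whether guesses are correct more often than chance is an exact one-sided binomial test against a 50% chance rate. The Python sketch below uses made-up numbers (70 correct guesses out of 100), and the function name is hypothetical. It is only a rough screen: it ignores "don't know" answers and differences between arms, which formal blinding indices are designed to handle.

```python
from math import comb

def binomial_upper_tail(correct, total, p_chance=0.5):
    """P(X >= correct) when each guess is right with probability p_chance."""
    return sum(
        comb(total, k) * p_chance**k * (1 - p_chance)**(total - k)
        for k in range(correct, total + 1)
    )

# Hypothetical end-of-study questionnaire: 70 of 100 subjects guess correctly
p_value = binomial_upper_tail(70, 100)
print("one-sided p-value: %.6f" % p_value)
```

A small p-value suggests that the guesses are better than coin flips, but it does not say why, nor how much the efficacy or safety results were affected.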

Further readings:

Assuring that double-blind is blind

Blinded trials taken to the test: an analysis of randomized clinical trials that report tests for the success of blinding

Blinding, unblinding, and the placebo effect: An analysis of patient’s guesses of treatment assignment in a double-blind clinical trial

Can keeping clinical trial participants blind to their study treatment adversely affect subsequent care?

Assessment of blinding in clinical trials

Concealing treatment allocation in randomised trials

Wednesday, June 15, 2011

Bland-Altman Plot for Assessing Agreement

Bland-Altman plot is a scatter plot of variable means plotted on the horizontal axis and the differences plotted on the vertical axis which shows the amount of disagreement between the two measures (via the differences) and lets you see how this disagreement relates to the magnitude of the measurements.

When I was in graduate school, the statistical analysis of microarray data was just starting to become a hot topic. In collaboration with Dr Rick Song, we looked at microarray data and wrote a manuscript titled “On Graphical Presentation and Quantitative Analysis of cDNA Microarray Data”, which we presented at JSM. In this manuscript, we proposed using the Bland-Altman plot. In clinical trials, I have not had a chance to apply this approach, but I do often see articles using the Bland-Altman plot, for example, an article titled “Using the Bland–Altman method to measure agreement with repeated measures” from the British Journal of Anaesthesia.

When data is appropriate, Bland-Altman plot can be a handy tool to use. It is worth relaying the paragraphs from our original paper on graphical presentation of micro-array data using Bland-Altman plot.

“Graphical presentation is usually the first step for data analysis of microarray data. In the case without duplication (this is typical in microarray experiment), scatter plots will be drawn and then a regression line drawn through the data. This helps the eye in gauging the degree of agreement between two measurements and also may help us to identify the "outliers" that represent the differentially expressed genes in microarray experiment.

In clinical medicine, to assess agreement between two methods of clinical measurement, Bland and Altman proposed to plot the difference between the methods (A-B) against the mean (A+B)/2[12,13,14,15]. This approach has been extensively used in medical research for assessing measurement error and comparing different measurements for the same quantity. Bland and Altman’s method can be also applied to the microarray data. We can plot (Rm-Gm) against (Rm+Gm)/2 (figure2 above).

Calculating or plotting a regression line is not our focus as we are not concerned with the estimated prediction of one color intensity by another but with the theoretical relationship of equality and deviations from it.

There are several advantages for presenting the microarray data using Bland and Altman’s approach:

The plot of difference against mean allows us to investigate any possible relationship between the discrepancies and the true value. The plot will also show clearly any extreme or outlying observations. If two different samples are used in the experiment, these extreme or outlying observations could indicate the differentially expressed genes. It is often helpful to use the same scale for both axes when plotting differences against mean values. This feature helps to show the discrepancies in relation to the size of the measurement.

Bland and Altman's method makes it easier for us to estimate the precision of the estimated limits of agreement between two color intensities. We want a measure of the agreement that is easy to estimate and to interpret for a measurement on the color intensity of an individual gene. An obvious starting point is the difference between measurements by the two channels on the same gene. There may be a consistent tendency for one channel to exceed the other. This is called the calibration factor and can be estimated by the mean difference. There will also be variation about this mean, which we can estimate by the standard deviation of the differences. These estimates are meaningful only if we can assume that the calibration factor and variability are uniform throughout all genes.”
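The quantities described in the quoted paragraphs can be sketched in a few lines of Python. This is only an illustration, not the code used in our paper: the function name `bland_altman`, the toy `red` and `green` values, and the choice of 1.96 standard deviations for the limits of agreement are all assumptions made for the example. The standard error of each limit uses the approximation sqrt(3s²/n) given by Bland and Altman (1986).

```python
import math
import statistics


def bland_altman(a, b):
    """Summarize agreement between paired measurements a and b.

    Hypothetical helper for illustration: returns the per-pair means and
    differences, the mean difference (the calibration factor), the SD of the
    differences, and approximate 95% limits of agreement.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")

    diffs = [x - y for x, y in zip(a, b)]        # A - B (vertical axis)
    means = [(x + y) / 2 for x, y in zip(a, b)]  # (A + B) / 2 (horizontal axis)

    n = len(diffs)
    bias = statistics.mean(diffs)                # estimated calibration factor
    sd = statistics.stdev(diffs)                 # sample SD (n - 1 denominator)

    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "sd": sd,
        "lower": bias - 1.96 * sd,               # lower limit of agreement
        "upper": bias + 1.96 * sd,               # upper limit of agreement
        "se_bias": sd / math.sqrt(n),            # precision of the mean difference
        "se_limit": sd * math.sqrt(3 / n),       # approximate precision of each limit
    }


# Made-up two-channel values for five genes (not real microarray data).
red = [8.0, 9.0, 10.0, 11.0, 12.0]
green = [7.5, 9.5, 9.0, 11.5, 11.0]

result = bland_altman(red, green)
print("bias:", round(result["bias"], 3))
print("limits of agreement:", round(result["lower"], 3), round(result["upper"], 3))
```

Plotting `result["diffs"]` against `result["means"]`, with horizontal lines at the bias and at the two limits, gives the Bland-Altman plot itself. Genes falling outside the limits would be the candidates for differential expression mentioned above, provided the calibration factor and variability are roughly uniform across genes.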

Friday, June 03, 2011

Restructure FDA's Drug Review Process?

Last week, I had a chance to listen to a speech by Dr Scott Gottlieb. While he touched on several topics related to health care reform, I was specifically interested in his discussion of restructuring FDA’s drug approval process.

Dr Gottlieb offered many insights into the inner workings of FDA and analyzed the root causes of the very long and inefficient FDA drug review process.

Following his speech, I located his paper “SHOULD FDA RESTRUCTURE ITS DRUG REVIEW PROCESS?” on FDLI’s website. Many of his analyses are accurate and to the point.

For example, he explained why FDA adopted a matrix management structure for its review program.

“Prior to FDA’s adoption of a matrix management structure for its drug review program, agency scientists were organized largely around the clinical areas in which they worked (oncology, cardio-renal, antiviral, etc). This therapeutically focused structure had some advantages, but also led to some of its own challenges. FDA’s adoption of a matrix structure was aimed at solving some of these problems.

For one thing, sponsors complained that the advice they received about disciplines like biostatistics or clinical pharmacology varied (sometimes significantly) across different therapeutic divisions. Statisticians in one clinical division would be interpreting certain principles of statistics or evaluating a particular protocol design in a manner different than statisticians inside another therapeutic division.

These discrepancies still occur. But it is believed that the matrix organizational structure cuts down on this sort of conflict.

Grouping all of the statisticians or pharmacologists inside the same office increases opportunities for comparable training and cross-calibration on key principles. CDER management pulled the first group of the review divisions—the chemists—in 1995. The impetus was differences in pharmaceutical quality requirements being maintained among the different clinical divisions. Ultimately, having the chemists organized as a single group fostered the development of consistent standards. It also enabled FDA to negotiate the standards established by the International Conference on Harmonization (ICH).

Another reason for establishing the matrix was to improve morale. FDA remains a very “physician centered” culture, but was much more so prior to adoption of the matrix. Staff who lacked medical degrees or who weren’t the clinical reviewers on an application sometimes complained that they felt marginalized in the review process. As one statistician told me, “we were treated like second-class citizens.” Specialists from non-clinical disciplines like statistics also complained that remaining immersed in a single therapeutic area didn’t give them the breadth of experience that they needed for their own professional development.

Similarly, the organization of scientific personnel by therapeutic area was also seen as an impediment to their continued training in their chosen disciplines. For example, statisticians came together for the equivalent of grand rounds or other kinds of shared learning experiences. But these kinds of cross-training opportunities were challenging because staff were ultimately accountable to their divisions. The shared training opportunities weren’t prioritized. Efforts were also made to rotate non-clinical experts across different therapeutic areas. But the challenges endured.”

He then went on to discuss the issues with FDA’s weak matrix management structure.

“But in practical terms, the weak matrix means that FDA project managers have limited dominion over key aspects of the review. Key disciplines involved in the review aren’t accountable to the project manager, or the division director. It is a system where there are few management carrots and no sticks. This weak structure also makes it harder to organize collaborative projects or even team meetings. The project manager doesn’t have strong authority when it comes to managing the collaboration between the different scientists involved in a drug’s review.”

He also mentioned the quality of FDA review scientists and issues with FDA’s policy of allowing very flexible working hours and work-from-home schedules. The current FDA drug review team is loosely organized and inefficient in many respects. However, it is not easy to make big changes to the current process.