Saturday, April 28, 2012

Cookbook SAS Codes for Bioequivalence Test in 2x2x2 Crossover Design

In Clinical Pharmacology, inferential statistics are used to show bioequivalence in terms of the Area Under the Curve (AUC) and the Maximum Concentration (Cmax) obtained from the concentration-time data. The typical clinical trial design is the 2x2x2 crossover design, which contains two treatment sequences (Test followed by Reference vs. Reference followed by Test), two treatment periods (period 1 vs. period 2), and two treatments (Test vs. Reference).
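Schematically, the layout is (T = Test, R = Reference):

Sequence    Period 1    Period 2
TR          T           R
RT          R           T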

According to FDA’s guidance “Statistical Approaches to Establishing Bioequivalence”, the following assumptions can be made for the test of bioequivalence:

1. AUC and Cmax follow a log-normal distribution

2. Bioequivalence is shown if the 90% confidence interval for the geometric least squares mean ratio of Test/Reference falls within 0.80 and 1.25

The statistical tests follow the so-called two one-sided tests (TOST) procedure, which can be implemented using the following cookbook SAS code.
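For reference, on the log scale the TOST procedure tests the two one-sided null hypotheses below, each at the 5% level; rejecting both is operationally equivalent to the 90% confidence interval for the treatment difference falling within [ln 0.80, ln 1.25] (a standard formulation, stated here for completeness):

$$H_{01}:\ \mu_T - \mu_R \le \ln(0.80) \quad \text{vs.} \quad H_{a1}:\ \mu_T - \mu_R > \ln(0.80)$$
$$H_{02}:\ \mu_T - \mu_R \ge \ln(1.25) \quad \text{vs.} \quad H_{a2}:\ \mu_T - \mu_R < \ln(1.25)$$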

* Prepare the data and log-transform the AUC and Cmax values;
data pkparm;
    set pdkparm; *source PK parameter dataset;
    lauc=log(auc);   *natural-log transformed AUC;
    lcmax=log(cmax); *natural-log transformed Cmax;
    keep subno sequence treat period auc cmax lauc lcmax;
run;

*** Fit the ANOVA model (run once for LAUC, then repeat with LCMAX);
ods output LSMeans=lsmean;
ods output Estimates=est;
proc mixed data=pkparm;
      class sequence period treat subno;
      model lauc=sequence period treat; *replace lauc with lcmax for the Cmax analysis;
      random subno(sequence);
      lsmeans treat/pdiff cl alpha=0.1;
      estimate 'T/R' treat -1 1 / cl alpha=0.1; *-1 1 gives Test minus Reference when R sorts before T;
     * make 'LSMEANS' out=lsmean; *used in old SAS versions;
     * make 'estimate' out=est; *used in old SAS versions;
run;

* Anti-log transformation to obtain the Geometric Means;
data lsmean;
      set lsmean;
      gmean=exp(estimate); *Geometric means;
run;

proc print data=lsmean;
run;

* Anti-log transformation to obtain the ratio of Geometric Means (point estimate) and its 90% confidence interval (lower and upper bounds);
data diffs;
     set EST;
     ratio=exp(estimate); ** Ratio of geometric mean;
     lower=exp(lower); ** 90% CI lower bound;
     upper=exp(upper); ** 90% CI upper bound;
run;

proc print data=diffs;
run;

Some Notes:

1. The p-value for the Test/Reference comparison can also be obtained from the model (in the EST data set above). However, the p-value is not the criterion for declaring bioequivalence and must be interpreted appropriately. We could have a significant p-value (p<0.05) and still show bioequivalence as long as the 90% confidence interval of the geometric mean ratio falls within the 0.80 to 1.25 range. For example, with a 90% confidence interval of [0.85, 0.95] or [1.05, 1.15], bioequivalence is shown even though the p-value is significant.

2. In the SAS PROC MIXED model, the subject within sequence is coded as subno(sequence), not sequence(subno). However, if you use sequence(subno), the results will be the same.

3. For the log transformation, it does not matter which base is used (base 10, base 5, or base e (natural log)) as long as the final results from the model are correctly anti-log transformed back using the same base.

4. While we typically say ‘ratio of geometric means’, it is actually the ‘ratio of geometric least squares means’ from the model.

5. FDA guidance "Statistical Approaches to Establishing Bioequivalence", Appendix E "SAS Program Statements for Average BE Analysis of Replicated Crossover Studies", provides detailed SAS code with PROC MIXED. While it is stated for 'replicated crossover studies', the 2x2x2 crossover design is the simplest case of the replicated crossover studies.

The following illustrates an example of program statements to run the average BE analysis using PROC MIXED in SAS version 6.12, with SEQ, SUBJ, PER, and TRT identifying sequence, subject, period, and treatment variables, respectively, and Y denoting the response measure (e.g., log(AUC), log(Cmax)) being analyzed:

PROC MIXED;
CLASSES SEQ SUBJ PER TRT;
MODEL Y = SEQ PER TRT/ DDFM=SATTERTH;
RANDOM TRT/TYPE=FA0(2) SUB=SUBJ G;
REPEATED/GRP=TRT SUB=SUBJ;
ESTIMATE 'T vs. R' TRT 1 -1/CL ALPHA=0.1;

The Estimate statement assumes that the code for the T formulation precedes the code for the R formulation in sort order (this would be the case, for example, if T were coded as 1 and R were coded as 2). If the R code precedes the T code in sort order, the coefficients in the Estimate statement would be changed to -1 1.

In the random statement, TYPE=FA0(2) could possibly be replaced by TYPE=CSH. This guidance recommends that TYPE=UN not be used, as it could result in an invalid (i.e., not non-negative definite) estimated covariance matrix.
Additions and modifications to these statements can be made if the study is carried out in more than one group of subjects.

Thursday, April 19, 2012

The "PATIENTS' FDA" Act - Sens. Richard Burr and Tom Coburn Introduce a New Plan to Reform the FDA

In my previous article "Should the design and conduct of clinical trials be simplified?", I discussed several FDA guidance documents suggesting that, in several areas, data collection may be reduced and that clinical trial monitoring may need to switch to a risk-based approach instead of the current frequent on-site visits and 100% source data verification.

Coincidentally, yesterday, Sens. Richard Burr and Tom Coburn introduced a new plan to reform the FDA - the "PATIENTS' FDA" Act. The PATIENTS' FDA Act (if approved) would force the FDA to be more transparent and to be mindful about requesting too much data from pharmaceutical companies. Over the last several years, after several high-profile drug withdrawals (Vioxx and Avandia, for example), the FDA has swung to the other extreme and become very conservative, which has made clinical trials more difficult to execute and drug development more costly. Perhaps it is not really the FDA's intention; however, many of its staff/reviewers have become too conservative. Instead of working with the industry to bring new medications to patients at reduced cost and within a reasonable timeframe, some reviewers ask sponsors to collect data with no real justification and to implement things that may serve only the reviewer's own interest or opinion.

Forbes published a good article as a companion to this bill. Here are some of the paragraphs from this article.

More accountability for meeting drug-review deadlines. The FDA has been increasingly failing to meet its PDUFA-mandated deadlines for giving companies approval decisions on new drug applications. The PATIENTS’ FDA Act would require the FDA to “report [to Congress] on a deeper level detail with respect to the performance goals agreed to in the prescription drug, generic drug, and biosimilar user fee agreements,” and hold individual reviewers accountable for their speed in reviewing applications.
Stop forcing companies to do unnecessary and expensive busywork. The bill’s summary notes that “some FDA reviewers request reams of additional information about a drug or device that is beyond the scope of data needed to meet the FDA’s approval standard.” The FDA will be required, under the bill, to “document the scientific and regulatory rationale” for such decisions, and review within one year “the costs and adoption of the least burdensome approaches to regulation.” The bill would also codify the FDA’s “commitment to improve on patient risk-benefit considerations…to ensure accountability for fulfilling…the user fee agreements.”
Take more advantage of clinical trials in other countries. The bill would require FDA to work with “other specific regulatory authorities of similar standing” to encourage uniform standards for clinical trials. (The Geneva-based International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, or ICH, performs many of these functions.) FDA will also be instructed to help sponsors “minimize the need for duplication of clinical studies, preclinical studies or non-clinical studies.”
 

Sunday, April 15, 2012

Should the design and conduct of clinical trials be simplified?

As mentioned in one of FDA’s guidance documents, “in the past two decades, the number and complexity of clinical trials have grown dramatically. These changes create new challenges in clinical trial oversight such as increased variability in investigator experience, ethical oversight, site infrastructure, treatment choices, standards of health care, and geographic dispersion.” According to an article by Mr. Getz titled “The Heavy Burden of Protocol Design: More complex and demanding protocols are hurting clinical trial performance and success”, companies sponsoring clinical research have openly acknowledged that protocol design negatively impacts clinical trial performance and may well be the single largest source of delays in getting studies completed. When designing a clinical trial, sponsors often try to include too many endpoints and too many measures, hoping that all of these endpoints (if the results are good) will contribute to the overall evidence of clinical efficacy. Sponsors attempt to collect a lot of information that is not must-have but nice-to-have (for future data dredging, marketing, publications, …). We want to do it all in one clinical study. This obviously increases the complexity of the study protocol, which in turn increases the length and cost of the clinical trial and degrades the quality of the clinical trial data (more protocol noncompliance).
In terms of the conduct of clinical trials, the emphasis on compliance with good clinical practice has resulted in perceptions that clinical trial data must be 100% monitored and source-verified, that all data programming and analysis must be independently validated, and that over-reporting of adverse events is a requirement of GCP compliance…
Over the last two years, FDA has issued several guidance documents in an attempt to change these perceptions.
In FDA’s new draft guidance "Determining the extent of safety data collection needed in late stage premarket and post-approval clinical investigations", it states that its intention is “to assist clinical trial sponsors in determining the amount and types of safety data to collect in late-stage premarket and post-market clinical investigations for drugs or biological products, based on existing information about a product’s safety profile.” This new guidance addresses the circumstances in which it may be acceptable to collect a reduced amount of safety information during clinical trials. In some situations, excessive data collection may be unnecessary and unhelpful.
In the Guidance for Industry “Oversight of Clinical Investigations — A Risk-Based Approach to Monitoring”, FDA encourages sponsors to implement alternative clinical monitoring approaches instead of relying solely on on-site monitoring. The webinar and presentation slides can be found at http://www.fda.gov/Training/GuidanceWebinars/ucm277044.htm
“For major efficacy trials, companies typically conduct on-site monitoring visits at approximately four- to eight-week intervals, at least partly because of the perception that the frequent on-site monitoring visit model, with 100% verification of all data, is FDA’s preferred way for sponsors to meet their monitoring obligations. In contrast, academic coordinating centers, cooperative groups, and government organizations use on-site monitoring less extensively. For example, some government agencies and oncology cooperative groups typically visit sites only once every two or three years to qualify/certify clinical study sites to ensure they have the resources, training, and safeguards to conduct clinical trials. FDA also recognizes that data from critical outcome studies (e.g., many National Institutes of Health-sponsored trials, Medical Research Council-sponsored trials in the United Kingdom, International Study of Infarct Survival, and GISSI), which had no regular on-site monitoring and relied largely on centralized and other alternative monitoring methods, have been relied on by regulators and practitioners. These examples demonstrate that use of alternative monitoring approaches should be considered by all sponsors, including commercial sponsors when developing risk-based monitoring strategies and plans”
Many sponsors may be reluctant to adopt this guidance and stick with the status quo approach of frequent on-site visits with 100% verification of all data. They may be worried that loosening the clinical monitoring could lead to noncompliance.


In the Guidance for Clinical Investigators, Sponsors, and IRBs “Adverse Event Reporting to IRBs — Improving Human Subject Protection”, FDA advised sponsors to report an AE to the IRB (and presumably FDA) only if it is unexpected, serious, and would have implications for the conduct of the study, rather than reporting all unanticipated AEs. The sponsor should analyze the unanticipated AEs before reporting them.
“the practice of local investigators reporting individual, unanalyzed events to IRBs, including reports of events from other study sites that the investigator receives from the sponsor of a multi-center study—often with limited information and no explanation of how the event represents an unanticipated problem—has led to the submission of large numbers of reports to IRBs that are uninformative. IRBs have expressed concern that the way in which investigators and sponsors of IND studies typically interpret the regulatory requirement to inform IRBs of all "unanticipated problems" does not yield information about adverse events that is useful to IRBs and thus hinders their ability to ensure the protection of human subjects.”
 
There are many other areas of clinical trial practice that can and should be simplified. For example, some protocols instruct investigators to record and report all untoward events that occur during a study as AEs/SAEs, which could include common symptoms of the disease under study and/or other expected clinical outcomes that are not study endpoints. Over-reporting of AEs/SAEs can incur additional burdens and can dilute or obscure signal identification. Another example is spending too much effort on screening-failure subjects. It is true that the recording of adverse events starts once the informed consent form is signed. However, it is unnecessary to write a full-blown SAE narrative for a screening-failure subject that has nothing to do with the assessment of the safety of the experimental product.

Sunday, March 11, 2012

Standardized study data for electronic submission - CDISC compliance

FDA has recently issued a series of draft guidance documents on “providing regulatory submissions in electronic format”. The most recent one is about “providing regulatory submissions in electronic format – standardized study data”. We have heard a lot of discussion about CDISC, SDTM, ADaM, … While these data standards are not yet mandated, FDA is encouraging the submission of electronic data in CDISC-compliant format. The draft guidance about ‘standardized study data’ is another sign that the industry should move toward compliance with the CDISC standards for submissions.
 
In terms of what kinds of data standards are to be used in electronic submissions, please see FDA’s web pages on study data standards.
CBER may have slightly different requirements from other divisions; CBER has its own resource page for submission of data in CDISC format to CBER.
 
While CDISC standards may be good for FDA reviewers and may accelerate the review process, they will indeed add burdens for the industry. For the original clinical datasets, additional mapping must be implemented to prepare the datasets in SDTM-compliant formats. The SDTM format is not user-friendly, and sometimes it takes several steps to link back to the CRFs/eCRFs.
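As a rough illustration of that mapping step, here is a minimal SAS sketch. The raw dataset raw_ae and the variables subjid, ae_text, and ae_start are hypothetical; STUDYID, DOMAIN, USUBJID, AETERM, and AESTDTC are standard SDTM AE-domain variables:

* A minimal sketch of raw-to-SDTM mapping (hypothetical raw dataset and raw variable names);
data sdtm_ae;
   set raw_ae;                           * hypothetical raw adverse event dataset;
   length studyid $12 domain $2 usubjid $30 aeterm $200;
   studyid = "STUDY01";                  * hypothetical study identifier;
   domain  = "AE";
   usubjid = catx("-", studyid, subjid); * unique subject ID built from hypothetical SUBJID;
   aeterm  = ae_text;                    * verbatim term from hypothetical raw variable;
   aestdtc = put(ae_start, yymmdd10.);   * ISO 8601 start date;
   keep studyid domain usubjid aeterm aestdtc;
run;

A real mapping would of course cover many more variables (severity, seriousness, coding terms, etc.); the point is only that each SDTM variable must be derived from the raw data one way or another.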
 
 

Friday, March 09, 2012

Futility Analysis in Clinical Trials - Stop the trial for futility

A colleague of mine asked me to explain the concept of “futility analysis” in plain language. The question was triggered by recent news such as “Solanezumab, Gammagard Trials Survive Futility Analysis”. In an Alzheimer trial, it even says that from the futility analysis, “there is greater than a 20% statistical probability of success in achieving the primary outcome measure of cognitive function preservation.”

During a clinical trial, we can perform an interim analysis (or a DMC/DSMB review) for three different reasons:
  1. The interim analysis for safety:
     1) with a pre-specified stopping rule (for example, stop the trial if we see a certain number of cases of Serious Adverse Events)
     2) without a pre-specified stopping rule (rely on DMC members to review the overall safety)
  2. The interim analysis for efficacy: to see if the new treatment is overwhelmingly better than control - then stop the trial for efficacy.
  3. The interim analysis for futility: to see if the new treatment is unlikely to beat the control - then stop the trial for futility - this is called ‘futility analysis’.

In situations 2 and 3, the criteria for the efficacy stopping rule could be different from those for the futility stopping rule, but both need to be pre-specified.

In situation #2 (stopping the study for efficacy), there is a penalty in alpha spending - if the overall alpha is 0.05, a portion of the alpha will be allocated to the interim analysis, and the alpha for the final analysis will be less than 0.05. In situation #3 (stopping for futility), there is no penalty in alpha spending.
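For example (standard group-sequential numbers, quoted from memory rather than from any specific guidance), with an O'Brien-Fleming-type boundary and a single interim look at half the information, the efficacy boundary is roughly z = 2.80 at the interim and z = 1.98 at the final analysis, instead of the usual 1.96.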

An example of a futility analysis: at the beginning of the trial, we assumed a 65% success rate for the new treatment group and a 50% success rate for the control group, and we would like to establish superiority. In the middle of the study, we perform an interim analysis, which shows a 55% success rate for the new treatment group and a 50% success rate for the control group. Based on the interim results, we can calculate the conditional power: if we continue to finish the study, what is the probability that the new treatment will be shown to be better than control? If this probability is too small and meets the pre-specified criteria, we would stop the trial for futility. If this probability is reasonable, we can continue the trial as planned, or continue with a sample size adjustment (typically an increase, due to the smaller effect size).
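A minimal SAS sketch of this calculation, under the common "current trend" assumption and with hypothetical numbers (50 subjects per arm at a 50% information fraction, one-sided alpha of 0.025); the sample sizes are mine, not from any referenced trial:

* Conditional power under the current-trend assumption (a sketch, hypothetical numbers);
data cp;
   alpha = 0.025;             * one-sided significance level;
   t     = 0.5;               * information fraction at the interim;
   p1    = 0.55; p2 = 0.50;   * interim success rates (treatment, control);
   n     = 50;                * subjects per arm at the interim (hypothetical);
   pbar  = (p1 + p2) / 2;
   se    = sqrt(2 * pbar * (1 - pbar) / n);
   zt    = (p1 - p2) / se;    * interim z-statistic;
   theta = zt / sqrt(t);      * drift estimated from the current trend;
   zcrit = probit(1 - alpha); * final-analysis critical value (1.96);
   cp    = probnorm((zt*sqrt(t) + theta*(1 - t) - zcrit) / sqrt(1 - t));
   put cp=;                   * about 0.04 here - a typical futility signal;
run;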

In a paper by Miller et al., “Paclitaxel plus Bevacizumab versus Paclitaxel Alone for Metastatic Breast Cancer”, one pre-planned interim analysis and two additional interim analyses were performed, and three stopping rules (for safety, for efficacy, and for futility) were pre-specified and evaluated. It is reasonable to assume that none of these stopping rules was triggered, since the study was not stopped.

Futility analysis, or stopping the trial for futility, is not without controversy. An article by Schoenfeld and Meade discusses this issue; see “Pro/con clinical debate: It is acceptable to stop large multicentre randomized controlled trials at interim analysis for futility”.


Sunday, February 12, 2012

How to interpret odds ratios that are smaller than 1?

Here is a question posted on the web about the interpretation of odds ratios that are less than 1.
"I know that OR estimates= 1 mean that both groups/categories have the same odds. I also know that if OR estimates are greater than 1, e.g, 1.24 for Young vs. Old persons, then I can say: Young people have 24% increase in the odds of living in an apartment than older people. Or, I also know I can say, for example, for an OR of 0.322 Non-White vs. White, that the odds of Whites are 1/.322 = about 3 times higher than those of Non-Whites, to live in a house they own. Now, how would I say the odds are of a NON-White person in the example above, to live in a house they own? Is is 1-.322=.678 less likely, with respect to odds, to live in a house they own? Or, similarly, they have 67.8% lower odds to live in a house they own? "
If we have to say the odds for a Non-White person, we may say "Non Whites have odds .322 times as great as those of Whites".

"the odds of an event is the number of those who experience the event divided by the number of those who do not. It is expressed as a number from zero (event will never happen) to infinity (event is certain to happen). Odds are fairly easy to visualise when they are greater than one, but are less easily grasped when the value is less than one. Thus odds of six (that is, six to one) mean that six people will experience the event for every one that does not (a risk of six out of seven or 86%). An odds of 0.2 however seems less intuitive: 0.2 people will experience the event for every one that does not. This translates to one event for every five non-events (a risk of one in six or 17%). "
Another weblog describes the issue of interpreting an odds ratio that is less than one.
"When you are interpreting an odds ratio (or any ratio for that matter), it is often helpful to look at how much it deviates from 1. So, for example, an odds ratio of 0.75 means that in one group the outcome is 25% less likely. An odds ratio of 1.33 means that in one group the outcome is 33% more likely."
In the article "The odds ratio: calculation, usage, and interpretation" in Biochemia Medica, the author clearly suggests converting the odds ratio to be greater than 1, by putting the group with the higher odds of the event first, to avoid the difficulties of interpreting an odds ratio that is less than 1.
“An OR of less than 1 means that the first group was less likely to experience the event. However, an OR value below 1.00 is not directly interpretable. The degree to which the first group is less likely to experience the event is not the OR result. It is important to put the group expected to have higher odds of the event in the first column. It is not valid to try to determine how much less the first group’s odds of the event was than the second group’s. When the odds of the first group experiencing the event is less than the odds of the second group, one must reverse the two columns so that the second group becomes the first and the first group becomes the second. Then it will be possible to interpret the difference because that reversal will calculate how many more times the second group experienced the event than the first. If we reverse the columns in the example above, the odds ratio is: (5/22)/(45/28) = (0.2273/1.607) = 0.14 and as can be seen, that does not tell us that the new drug group died 0.14 times less than the standard treatment group. In fact, this arrangement produces a result that can only be interpreted as “the odds of the first group experiencing the event is less than the odds of the second group experiencing the event”. The degree to which the first group’s odds are lower than that of the second group is not known.”
In practice, when dealing with an odds ratio less than 1, I almost always try, when possible, to reverse the columns or recode the response variable to get an odds ratio larger than 1 before interpreting it. It is easier for people (especially non-statisticians) to understand an odds ratio with a value greater than 1.

In the example below, the treatment group is actually less effective in terms of the response.

Treatment    Failure (0)    Success (1)
No (0)           21             30
Yes (1)          32             17
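As a quick check of the SAS output below (my arithmetic), the odds ratio for success can be computed directly from the table:

$$\mathrm{OR} = \frac{17/32}{30/21} = \frac{0.531}{1.429} \approx 0.372$$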


The following SAS code can be easily used to calculate the odds ratio:
data test;
   input trt resp count;
   datalines;
1 1 17
1 0 32
0 1 30
0 0 21
;

proc logistic data=test descending;
   weight count;
   model resp=trt;
run;

From the SAS output, we get an odds ratio of 0.372, which indicates that the odds of success in the treatment group are 0.372 times the odds in the non-treatment group. This interpretation is somewhat difficult to understand.

The program can easily be revised to calculate the odds ratio for failure, which gives an odds ratio of 1/0.372 = 2.689. This odds ratio can be interpreted as "the odds of achieving success in the non-treatment group are 2.689 times the odds in the treatment group".

proc logistic data=test;
   weight count;
   model resp=trt;
run;

In SAS PROC LOGISTIC, with the DESCENDING option, the probability modeled is response=1 (success); without the DESCENDING option, the probability modeled is response=0 (failure).

Sunday, February 05, 2012

Design and Analysis of Bioequivalence Studies for Highly Variable Drugs (HVD) or Highly Variable Drug Products (HVDP)

For bioequivalence studies, we typically show average bioequivalence by declaring bioequivalence if the 90% confidence interval of the geometric least squares mean ratio is within 80-125%. The associated study design is typically the 2x2x2 crossover design with a reasonable sample size (for example, 12 subjects, 24 subjects, …) if the within-subject variability is not too big. This approach has been outlined in several of FDA’s guidelines.


Recently, there has been a lot of discussion about bioequivalence studies for products with high variability (highly variable drugs). Highly variable drugs are drugs with high within-subject variability, defined as those for which the root mean square error (RMSE) from the ANOVA bioequivalence analysis is > 0.3 for either AUC or Cmax.
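For orientation, the 0.3 cutoff on the log scale corresponds to a within-subject coefficient of variation of about 30% via the standard log-normal identity (my note, not from the original post):

$$CV_W = \sqrt{e^{\sigma_W^2} - 1}, \qquad \sqrt{e^{0.3^2} - 1} \approx 0.307$$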

For highly variable drugs, if we employ the common study design, the required sample size will be very large, which raises ethical concerns about implementing such studies.


FDA has held several advisory committee meetings to discuss this issue; the most recent meetings were in 2004 and 2009. At the 2009 meeting, the slide presentation by Dr. Conner from FDA summarized the developments on this issue and FDA’s position (see the slide presentation “Bioequivalence Methods for Highly Variable Drugs and Drug Products”).

Among the various approaches to addressing the bioequivalence issue for highly variable drugs, the reference-scaled average BE approach has been suggested. This approach requires fewer subjects, but uses a replicated treatment design, such as a three-period, reference-replicated crossover design with sequences TRR, RTR, and RRT, or a four-period design with sequences TRTR and RTRT. Replicated crossover designs were also discussed in the FDA guidance “Statistical Approaches to Establishing Bioequivalence”, but there they were for dealing with carryover effects; here, the replicated crossover designs are for dealing with highly variable drugs.
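As commonly summarized in the literature and in the progesterone guidance cited below (my paraphrase of the criterion, stated here for orientation), the reference-scaled approach replaces the fixed 80-125% limits with limits that widen with the reference product's within-subject variability:

$$(\mu_T - \mu_R)^2 \le \left(\frac{\ln 1.25}{\sigma_{W0}}\right)^2 \sigma_{WR}^2, \qquad \sigma_{W0} = 0.25,$$

where $\sigma_{WR}$ is the within-subject standard deviation of the reference product; scaling is applied only when the observed $s_{WR}$ is at least about 0.294, otherwise the usual unscaled limits apply.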

The implementation of the reference-scaled average BE approach has been detailed and discussed in FDA draft guidance and many publications. The most relevant ones are:
-         FDA Guidance on Progesterone (2011)
-         Sample Sizes for Designing Bioequivalence Studies for Highly Variable Drugs by Endrenyi and Tothfalusi (2012)

The European Medicines Agency also recognizes certain drugs as highly variable drug products (HVDP) and is willing to accept a wider difference (i.e., a wider 90% confidence interval) in Cmax for bioequivalence evaluation. In its guidance "Guideline on the Investigation of Bioequivalence", section 4.1.10 specifically discusses HVDPs.



Other Readings:

-         Generic Drug Bioequivalence by Dr Aaron Sigler

Saturday, January 28, 2012

Recording the outcome for AE/SAE when multiple events contribute to Death

In clinical trials, death events may occur. According to the CDISC CDASH standards, death should not be recorded as an adverse event (AE) or serious AE, but as the outcome of the event; the condition that resulted in the death should be recorded as the AE/SAE.

In the case of multiple AEs/SAEs contributing to the fatal (death) outcome, there seem to be two different ways of recording the AE outcome. There is no detailed regulatory guideline for this situation. The most closely related guideline may be ICH E2B, which states:

B.2.i.8 Outcome of reaction/event at the time of last observation
  •  recovered/resolved
  •  recovering/resolving
  •  not recovered/not resolved
  •  recovered/resolved with sequelae
  •  fatal
  •  unknown

User Guidance:
In case of irreversible congenital anomalies the choice, not recovered/not resolved should be used.
Fatal should be used when death is possibly related to the reaction/event. Considering the difficulty of deciding between "reaction/event caused death" and "reaction/event contributed significantly to death", both were grouped in a single category. Where the death is unrelated, according to both the reporter and the sender, to the reaction/event, death should not be selected here, but should be reported only under section B.1.9.
In practice, one way is to record multiple SAEs and record 'fatal' as the outcome for each of them. The drawback of this approach is having multiple SAEs with a 'fatal' outcome for the same subject, while a subject can only die once.

Another way is to identify one SAE as the principal cause of death. In this case, only one SAE will have the outcome recorded as 'fatal'. The subject can only die once, so it makes sense to record ‘fatal’ as the outcome for the principal event. The question is what the appropriate outcome should be for the other SAEs that may also have contributed to the death. If there is an 'ongoing' option in the list of AE outcomes, that may be the appropriate choice, indicating that the SAE was ongoing at the time of death. If there is no 'ongoing' choice in the AE outcome list (as specified in E2B above), the most appropriate choice seems to be "not recovered/not resolved", indicating that the SAE was still not resolved at the time of death. As for the AE stop time, the principal SAE with the fatal outcome should have the time of death as its stop time. For the other SAEs contributing to the death, the stop time may be more appropriately recorded as 'ongoing' instead of recording the death time as the SAE stop time.

Friday, December 02, 2011

Serious Adverse Events (SAE) vs Severe Adverse Events

Professionals who are new to the clinical trial field are often confused by the concepts of 'Serious Adverse Events (SAEs)' and 'Severe Adverse Events'. Severity is not synonymous with seriousness. An SAE is based on patient/event outcome or action criteria usually associated with events that pose a threat to a patient's life or functioning. Seriousness (not severity) serves as a guide for defining regulatory reporting obligations. In other words, SAEs need to go through an additional reporting process (reported to the corporate global drug safety or pharmacovigilance group, regulatory authorities, and ECs/IRBs). A severe AE, by contrast, is simply an AE whose severity (the old term is intensity) is classified as 'severe'; severity is one of the AE classifications (other classifications include relationship/causality).

The FDA defines a serious adverse event (SAE) as one in which the patient outcome is one of the following:
  • Death
  • Life-threatening
  • Hospitalization (initial or prolonged)
  • Disability - significant, persistent, or permanent change, impairment, damage or disruption in the patient's body function/structure, physical activities or quality of life.
  • Congenital anomaly
  • Requires intervention to prevent permanent impairment or damage
On the other hand, the severity of an AE is a point on an arbitrary scale of intensity of the adverse event in question. The terms "severe" and "serious", when applied to adverse events, are technically very different. They are easily confused, cannot be used interchangeably, and require care in usage.
 
A headache is severe if it causes intense pain; there are scales like the "visual analog scale" that help us assess the severity. On the other hand, a headache is not usually serious (though it may be in the case of a subarachnoid haemorrhage or a subdural bleed, and even a migraine may occasionally fit the criteria) unless it also satisfies the criteria for seriousness listed above. Similarly, a severe rash is not likely to be an SAE. However, mild chest pain may result in a day’s hospitalization and thus is an SAE.
 
Classifications of AE severity often include the following:
  • Mild: Awareness of signs or symptoms, but easily tolerated and are of minor irritant type causing no loss of time from normal activities. Symptoms do not require therapy or a medical evaluation; signs and symptoms are transient.
  • Moderate: Events introduce a low level of inconvenience or concern to the participant and may interfere with daily activities, but are usually improved by simple therapeutic measures; moderate experiences may cause some interference with functioning.
  • Severe: Events interrupt the participant’s normal daily activities and generally require systemic drug therapy or other treatment; they are usually incapacitating.

Sunday, November 27, 2011

Reporting/recording the Serious Adverse Events (SAE) vs. Adverse Event (AE) Outcomes

In clinical trials, serious adverse event reporting is critical to the safety assessment and to fulfilling regulatory requirements. The criteria for defining an SAE have been documented in many regulatory guidelines. However, in clinical trial implementation, confusion can arise over whether an event should be reported as an SAE or as the outcome of an SAE. Misinterpretation of the regulatory guidelines could result in inappropriate reporting of SAEs.

According to ICH E2A “CLINICAL SAFETY DATA MANAGEMENT: DEFINITIONS AND STANDARDS FOR EXPEDITED REPORTING”:

A serious adverse event (experience) or reaction is any untoward medical occurrence that at any dose:
   * results in death,
   * is life-threatening,
     NOTE: The term "life-threatening" in the definition of "serious" refers to an event in which the patient was at risk of death at the time of the event; it does not refer to an event which hypothetically might have caused death if it were more severe.
   * requires inpatient hospitalisation or prolongation of existing hospitalisation,
   * results in persistent or significant disability/incapacity, or
   * is a congenital anomaly/birth defect.
The FDA website provides somewhat more detailed descriptions of SAEs:
"What is a Serious Adverse Event?

An adverse event is any undesirable experience associated with the use of a medical product in a patient. The event is serious and should be reported to FDA when the patient outcome is:
Death

Report if you suspect that the death was an outcome of the adverse event, and include the date if known.
Life-threatening

Report if suspected that the patient was at substantial risk of dying at the time of the adverse event, or use or continued use of the device or other medical product might have resulted in the death of the patient.
Hospitalization (initial or prolonged)

Report if admission to the hospital or prolongation of hospitalization was a result of the adverse event.

Emergency room visits that do not result in admission to the hospital should be evaluated for one of the other serious outcomes (e.g., life-threatening; required intervention to prevent permanent impairment or damage; other serious medically important event).
Disability or Permanent Damage

Report if the adverse event resulted in a substantial disruption of a person's ability to conduct normal life functions, i.e., the adverse event resulted in a significant, persistent or permanent change, impairment, damage or disruption in the patient's body function/structure, physical activities and/or quality of life.
Congenital Anomaly/Birth Defect

Report if you suspect that exposure to a medical product prior to conception or during pregnancy may have resulted in an adverse outcome in the child.
Required Intervention to Prevent Permanent Impairment or Damage (Devices)

Report if you believe that medical or surgical intervention was necessary to preclude permanent impairment of a body function, or prevent permanent damage to a body structure, either situation suspected to be due to the use of a medical product.
Other Serious (Important Medical Events)

Report when the event does not fit the other outcomes, but the event may jeopardize the patient and may require medical or surgical intervention (treatment) to prevent one of the other outcomes. Examples include allergic bronchospasm (a serious problem with breathing) requiring treatment in an emergency room, serious blood dyscrasias (blood disorders) or seizures/convulsions that do not result in hospitalization. The development of drug dependence or drug abuse would also be examples of important medical events."


The standard coding dictionary for adverse events is MedDRA (Medical Dictionary for Regulatory Activities). The guidance document MedDRA® TERM SELECTION: POINTS TO CONSIDER gives clear explanation how death and other patient outcomes should be handled.

3.2 – Death and Other Patient Outcomes

Death, disability, and hospitalization are considered outcomes in the context of safety reporting and not usually considered ARs/AEs. Outcomes are typically recorded in a separate manner (data field) from AR/AE information. A term for the outcome should be selected if it is the only information reported or provides significant clinical information.

(For reports of suicide and self-harm, see Section 3.3).

3.2.1 Death with ARs/AEs

Death is an outcome and not usually considered an AR/AE. If ARs/AEs are reported along with death, select terms for the ARs/AEs. Record the fatal outcome in an appropriate data field.


3.2.4 Other patient outcomes (non-fatal)

Hospitalization, disability and other patient outcomes are not generally considered ARs/AEs.

There are many other examples of recording the outcome instead of the AE/SAE. Adverse events represent the untoward medical event, not the intervention to treat that event. For example, if a subject has an appendectomy, the AE is appendicitis, not the surgical procedure. If a subject has a limb amputation, the AE is the cause of the amputation (perhaps the worsening of ischemia in a peripheral artery), and the limb amputation should be reported as the outcome of the AE/SAE. If a patient is hospitalized due to congestive heart failure, congestive heart failure should be reported as the SAE and the hospitalization should be reported as the outcome of the congestive heart failure.
We should also be aware that not every hospitalization will have an associated SAE to be reported. An AE leading to hospitalization or prolongation of hospitalization should not be reported as an SAE if the hospitalization meets ONE of the following:
  • The hospital admission is pre-planned (i.e., elective or scheduled surgery arranged prior to the start of the study). The European Commission’s guideline on medical devices, “CLINICAL INVESTIGATIONS: SERIOUS ADVERSE EVENT REPORTING”, indicates that a planned hospitalization for a pre-existing condition, or a procedure required by the Clinical Investigation Plan, without a serious deterioration in health, is not considered to be a serious adverse event.
  • The hospital admission is clearly not associated with an AE (e.g., social hospitalization for purposes of respite care). If a patient wants to stay in the hospital during the drug treatment because of fear that something bad could happen, this should not be reported as an SAE just because of the hospital stay, if nothing else happens.
According to these definitions, the events whose outcome is death, hospitalization, disability or permanent damage, congenital anomaly/birth defect, etc. should be reported as SAEs, while the death, hospitalization, disability or permanent damage, congenital anomaly/birth defect, etc. should be reported as the outcome of the corresponding SAE. To be crystal clear: death and hospitalization should not themselves be reported as SAEs; the causes leading to the death or hospitalization should be reported as the SAEs.

Thursday, November 24, 2011

Studentized residual for detecting outliers

Last time, I discussed outliers and a simple approach, Dixon’s Q test, for detecting a single outlier. When there are multiple outliers, we can detect them using the standard deviation (for normally distributed data) or using percentiles (for skewed data). A box plot may be useful to visually check the data for potential outliers.

In a regression setting, there are several approaches to detecting outliers. One approach is to utilize the ‘standardized residual’ or ‘studentized residual’. In linear regression, an outlier is an observation with a large residual; in other words, it is an observation whose dependent-variable value is unusual given its values on the predictor variables.

The studentized residual is the quotient resulting from the division of a residual by an estimate of its standard deviation. Just like the standard deviation, the studentized residual is very useful for detecting outliers. Values beyond 3, 4, or 5 standard deviations raise reasonable doubt of being outliers; likewise, in a regression setting, observations with studentized residuals beyond 3, 4, or 5 are the targets for outlier investigation.

In SAS, two regression procedures, PROC REG and PROC GLM, can easily be used to compute the studentized residual for detecting outliers. The studentized residual is requested with the RSTUDENT keyword in the OUTPUT statement. Other regression procedures (such as PROC MIXED) also compute studentized residuals as part of their influence diagnostics.

               output out=newdata rstudent=xxx;
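For example, here is a minimal sketch, assuming a hypothetical dataset TRIAL with response Y and a single predictor X:

* Compute studentized residuals with PROC REG (hypothetical dataset and variable names);
proc reg data=trial;
   model y = x;
   output out=resid rstudent=rstud;  * externally studentized residuals;
run;
quit;

* Flag observations whose studentized residual exceeds 3 in absolute value;
data outliers;
   set resid;
   if abs(rstud) > 3;
run;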

Saturday, November 12, 2011

Outliers in clinical trial, Dixon's Q test for a single outlier


In clinical trials, we deal with the outlier issue differently from other fields. During the clinical trial, for suspected ‘outliers’, every effort should be taken to query the investigator sites, to repeat measurements, or to re-test the samples in order to get the correct information. Typically, those suspected ‘outliers’ can be clarified during the data cleaning process. It is just not very common to throw away data (even data suspected to be ‘outliers’) in clinical trials. In one of my pharmacokinetic studies, I did have to deal with suspected outliers (we used the term ‘exceptional values’ instead of ‘outliers’). After the sample re-test, we still had one very high value. Instead of throwing away this exceptional value, we performed the analysis with and without it.

In one of the presentations by an FDA officer, the terms ‘outlier’ and ‘anomalous result’ are distinguished:
  • Outlier subjects may be “real” results and are therefore very valuable in making a correct BE conclusion
  • Anomalous results are data that are not correct due to some flaw in study conduct or analysis
In many situations, it is very difficult to know for sure whether an exceptional value is an outlier or an anomalous result.

In ICH E9 "Statistical Principles for Clinical Trials", the handling of outliers is discussed in the section on "missing values and outliers".
5.3 Missing Values and Outliers

Missing values represent a potential source of bias in a clinical trial. Hence, every effort should be undertaken to fulfil all the requirements of the protocol concerning the collection and management of data. In reality, however, there will almost always be some missing data. A trial may be regarded as valid, nonetheless, provided the methods of dealing with missing values are sensible, and particularly if those methods are pre-defined in the protocol. Definition of methods may be refined by updating this aspect in the statistical analysis plan during the blind review. Unfortunately, no universally applicable methods of handling missing values can be recommended. An investigation should be made concerning the sensitivity of the results of analysis to the method of handling missing values, especially if the number of missing values is substantial.

A similar approach should be adopted to exploring the influence of outliers, the statistical definition of which is, to some extent, arbitrary. Clear identification of a particular value as an outlier is most convincing when justified medically as well as statistically, and the medical context will then often define the appropriate action. Any outlier procedure set out in the protocol or the statistical analysis plan should be such as not to favour any treatment group a priori. Once again, this aspect of the analysis can be usefully updated during blind review. If no procedure for dealing with outliers was foreseen in the trial protocol, one analysis with the actual values and at least one other analysis eliminating or reducing the outlier effect should be performed and differences between their results discussed.


I was recently asked to help test for an outlier in data from a lab experiment (not a clinical trial).
The titer for the same sample was measured 20 times: the titer was 25 in 7 runs and 125 in 12 runs; in one run, however, the titer was 625. Is there a way to test (statistically) whether the titer of 625 is an outlier?
Titer    25    125    625
N         7     12      1

There is a simple test for outliers called Dixon's Q-test. Dixon’s Q-test calculates the Q value, which is the ratio of the Gap (the difference between the extreme value and the immediately adjacent value) to the Range (the difference between the maximum and minimum of the whole sample).

In the case above, the titer values need to be log-transformed first. After a log10 transformation, the data are listed as follows (in order):

1.39794 1.39794 1.39794 1.39794 1.39794 1.39794 1.39794 2.09691 2.09691 2.09691
2.09691 2.09691 2.09691 2.09691 2.09691 2.09691 2.09691 2.09691 2.09691 2.79588

The gap = 2.79588 - 2.09691 = 0.69897
The range = 2.79588 - 1.39794 = 1.39794
The Q value = 0.69897 / 1.39794 = 0.5

The Q value is then compared with the critical value. Critical values can be found at various web sources or in the original paper. The critical value for N = (7+12+1) = 20 is 0.342.
Since the Q value (0.5) is larger than 0.342, we can reject 2.79588 and conclude that the original value 625 (log-transformed value 2.79588) is an outlier.

If we use a log base 5 transformation, the calculation is even easier (the values become 2, 3, and 4, so Q = (4-3)/(4-2) = 0.5), and the conclusion is the same.
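A minimal SAS sketch of the computation above (the constants are the log10 values from the worked example):

* Dixon's Q statistic for the suspected high value;
data dixon;
   gap   = 2.79588 - 2.09691;  * suspect value minus its nearest neighbor;
   range = 2.79588 - 1.39794;  * maximum minus minimum of the whole sample;
   q     = gap / range;        * Q = 0.5; compare with the critical value 0.342 for n=20;
   put q=;
run;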

This approach can only be used for detecting a single outlier. If there is more than one value in the 625 titer group, Dixon's Q test will not be an appropriate approach.

Typically, outlier identification assumes a continuous variable (i.e., continuous data). The data above contain many ties (due to the design); therefore, the results from Dixon’s Q-test need to be interpreted with caution. The determination of outliers should always be based on an understanding of the experimental data.




Saturday, October 29, 2011

Story of Xigris (Protein C) for Sepsis

Xigris, also called Drotrecogin Alfa (Activated) or activated Protein C, was the only approved drug for the severe sepsis indication, and it was withdrawn from the market last week (Oct 25, 2011). In a recently completed clinical trial (the PROWESS-SHOCK trial), Xigris failed to show a survival benefit. Due to the early controversies over Xigris’s approval and the continuous debate over Xigris’s risk-benefit, the PROWESS-SHOCK trial had been under watch since its start; the study design, statistical analysis plan, and unblinding plan were all published well before the completion of the trial.

A decade ago, prior to the approval of Xigris for the sepsis indication, its risk-benefit had been debated quite a bit. Xigris was known to be linked to an increased risk of serious bleeding in patients. As NEJM editor-at-large Richard P. Wenzel, MD, wrote in 2002, "controversy surrounds both the drug study itself and the FDA approval." FDA held an anti-infective advisory committee meeting for Xigris in treating sepsis. The FDA approved the drug despite the advisory committee's split vote (10 to 10) over concerns about the validity of the claimed efficacy and safety findings on the basis of a single trial. At that time, Xigris was approved based on a single pivotal trial (the PROWESS trial), which was also stopped early for efficacy, and the FDA reviewers certainly believed that Xigris was beneficial and could save a lot of lives.
The PROWESS trial has been the model for other clinical trials in sepsis, even though the PROWESS trial itself was criticized for changes in the protocol during the trial. According to the NEJM article by H. Shaw Warren, MD, from Massachusetts General Hospital in Boston, and fellow consultants to the FDA, the study protocol changed during the PROWESS trial, shifting the study population composition toward patients with less severe underlying disease and more acute infectious illnesses. Other changes included the use of a different placebo and the elimination of protein C deficiency status as a primary variable. Around the same time, Lilly began producing the drug using a new master cell bank. Cumulative mortality curves suggest an improvement in the protective efficacy of Xigris after these changes were made.
Subsequent trials have now shown that Xigris has no benefit and has an unfavorable risk-benefit profile. The ADDRESS trial (published in 2005) showed the absence of a beneficial treatment effect, coupled with an increased incidence of serious bleeding complications; the results indicated that Xigris should not be used in patients with severe sepsis who are at low risk for death, such as those with single-organ failure or an APACHE II score less than 25. Now the PROWESS-SHOCK trial has further confirmed that the risk of bleeding outweighs the benefit in reducing mortality - unfortunately, a decade after Xigris came onto the market.

The marketing practices for Xigris have also been criticized. Several years ago, there was a lot of talk about Lilly’s influence on a committee that defined sepsis treatment guidelines favoring the use of Xigris.

In retrospect, there are things to learn from the Xigris story: 1) a single pivotal trial may be insufficient to confirm a treatment benefit; 2) changing the protocol during the trial could bias the trial results; 3) stopping a trial early for efficacy may be risky.

New drugs for life-threatening diseases such as sepsis are desperately needed; however, demonstrating the benefit of any drug amid complicated sepsis treatment is a challenging task. The diversity of sepsis treatment across institutions makes clinical trials in sepsis very difficult, and the sample size for sepsis trials needs to be sufficiently large to show the benefit.

Thursday, October 27, 2011

A medical joke to share

Not sure where the origin is; it has been circulated quite a bit.

Best friends graduated from medical school at the same time and decided that, in spite of two different specialties, they would open a practice together to share office space and personnel.

Dr. Smith was the psychiatrist and Dr. Jones was the proctologist; they put up a sign reading: "Dr. Smith and Dr. Jones: Hysterias and Posteriors". The town council was livid and insisted they change it.

So, the docs changed it to read: "Schizoids and Hemorrhoids". This was also not acceptable, so they again changed the sign. "Catatonics and High Colonics" - No go.

Next, they tried "Manic Depressives and Anal Retentives" - thumbs down again. Then came "Minds and Behinds" - still no good. Another attempt resulted in "Lost Souls and Butt Holes" - unacceptable again! So they tried "Analysis and Anal Cysts" - not a chance. "Nuts and Butts" - no way. "Freaks and Cheeks" - still no good. "Loons and Moons" - forget it.

Almost at their wit's end, the docs finally came up with: "Dr. Smith and Dr. Jones - Specializing in Odds and Ends". Everyone loved it.

Saturday, October 15, 2011

Will Electronic Data Capture be always better than Paper-CRF?

The traditional way to do clinical trial data management is to use paper-based case report forms (CRFs). Blank paper CRFs are distributed to the investigator sites, and the investigator or study coordinator fills them out. The CRFs are then monitored and collected from the investigator sites and subsequently handled by a centralized data management group, whose activities include clinical database building, data entry, data cleaning, data clarification, ...

The industry trend has been gradually moving away from paper-based CRFs and toward electronic data capture (EDC). In the EDC world, the database is built prior to study start (a significantly longer lead time before study start is needed). The data are entered directly into the database by the investigator site (investigator or study coordinator). EDC has been touted by many vendors as the preferred way of conducting clinical trials: getting the data fast, saving timelines, saving costs, minimizing data transcription errors... While this is generally true, it is not universal.


In some situations, a trial using traditional paper-based CRFs is a better way than EDC. For example, in a clinical trial for a rare disease, there may be many investigator sites, each enrolling very few subjects or none at all. EDC will not be an efficient way of collecting data: many site staff will be trained on EDC and never have a chance to enroll a patient into the study or to use the system, and when a site finally does enroll a subject, the initial EDC training may be a distant memory.

An EDC trial is not always cheap. With an EDC trial, significant cost can go to EDC system hosting and help desk support. Imagine a slow-enrolling trial running for 7-8 years: the cost of hosting the EDC system and providing help desk support will be very high compared to a paper-based study.

While EDC is the trend, its adoption is not universal. In some situations, the traditional paper CRF may be better.