Tuesday, January 20, 2009

Multiple Comparisons

In statistics or biostatistics, the multiple comparisons problem occurs when one considers a set, or family, of statistical inferences simultaneously. Errors in inference, including confidence intervals that fail to include their corresponding population parameters, or hypothesis tests that incorrectly reject the null hypothesis, are more likely when one considers the family as a whole.
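
To make the inflation concrete, here is a minimal sketch of the family-wise error rate under simplifying assumptions that are mine, not the source's: the m tests are independent, every null hypothesis is true, and each test is run at the same level alpha. Under those assumptions the chance of at least one false rejection is 1 - (1 - alpha)^m.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false rejection among m independent tests,
    each run at level alpha, when every null hypothesis is true."""
    if not 0.0 <= alpha <= 1.0 or m < 1:
        raise ValueError("alpha must be in [0, 1] and m must be >= 1")
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    # The error rate climbs quickly as the family of tests grows.
    for m in (1, 5, 10, 20):
        print(m, round(familywise_error_rate(0.05, m), 3))
```

With alpha = 0.05, ten independent tests already carry roughly a 40% chance of at least one false positive, which is why the family as a whole needs an adjustment.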

Multiple comparison issues were nicely summarized in EMEA's guidance titled "Points to consider on multiplicity issues in clinical trials". This guidance also discussed the situations where the adjustment for multiplicity is not needed.

Adjustment for multiplicity is also mentioned in many regulatory guidance documents (for example, the FDA guidance on ISE), and its importance has been recognized in the review process of many medical journals.

SAMSI held a workshop in 2005 to discuss multiplicity issues, including multiple testing, reproducibility, and subgroup analysis.

For an introduction to multiple comparisons, see the Wikipedia article at http://en.wikipedia.org/wiki/Multiple_comparisons

SAS PROC MULTTEST is an easy tool for computing adjusted p-values (by several methods) when the raw p-values from multiple tests are available. For example, the following program produces a set of adjusted p-values.

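/* Raw p-values, one per test. PROC MULTTEST expects the variable Raw_P. */
/* The .xxx on the method4 row is a placeholder for the fourth raw p-value. */
data integrated;
  input Method $ Raw_P;
  datalines;
method1 .331
method2 .090
method3 .105
method4 .xxx
;
run;

/* Holm, Hochberg, false discovery rate, and Bonferroni adjustments */
proc multtest pdata=integrated holm hoc fdr bon;
run;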
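
For readers who want to see what these adjustments do, here is a rough Python sketch of the four methods, written by hand rather than taken from SAS. It assumes that the FDR option corresponds to the Benjamini-Hochberg step-up adjustment, and it uses only the three raw p-values that are given above. The fourth value is not specified, so it is left out, and the numbers will differ from a four-test family. The function names are mine.

```python
def _adjust(p_values, factor, step_down):
    """Scale sorted p-values by factor(rank, m), then enforce monotonicity."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    scaled = [min(1.0, factor(rank, m) * p_values[i])
              for rank, i in enumerate(order, start=1)]
    if step_down:
        # Step-down: running maximum from the smallest p-value upward.
        for k in range(1, m):
            scaled[k] = max(scaled[k], scaled[k - 1])
    else:
        # Step-up: running minimum from the largest p-value downward.
        for k in range(m - 2, -1, -1):
            scaled[k] = min(scaled[k], scaled[k + 1])
    adjusted = [0.0] * m
    for k, i in enumerate(order):
        adjusted[i] = scaled[k]  # restore the original order
    return adjusted


def bonferroni(p_values):
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def holm(p_values):
    return _adjust(p_values, lambda rank, m: m - rank + 1, step_down=True)


def hochberg(p_values):
    return _adjust(p_values, lambda rank, m: m - rank + 1, step_down=False)


def benjamini_hochberg(p_values):
    return _adjust(p_values, lambda rank, m: m / rank, step_down=False)


if __name__ == "__main__":
    raw_p = [0.331, 0.090, 0.105]  # method1..method3 only
    for name, fn in [("bon", bonferroni), ("holm", holm),
                     ("hoc", hochberg), ("fdr", benjamini_hochberg)]:
        print(name, [round(x, 4) for x in fn(raw_p)])
```

Bonferroni multiplies every p-value by the number of tests; Holm and Hochberg use the same shrinking multipliers but differ in the direction in which monotonicity is enforced; the FDR adjustment controls a different error rate and is typically the least conservative of the four.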

Monday, January 19, 2009

Trial Biomarker Analysis More Than Data Dredging

Members of the FDA's Oncologic Drugs Advisory Committee cautioned sponsors against treating retrospective clinical trial biomarker analysis like an exercise in data dredging.

The committee met last month to consider the adequacy of retrospectively mined data in determining whether a biomarker is truly predictive of patient response. The discussion stemmed from a retrospective data analysis conducted to show that the KRAS biomarker status of patient tumors helps predict responses to the cancer drugs Vectibix (panitumumab), from Amgen, and Erbitux (cetuximab), from ImClone and Bristol-Myers Squibb.

See the meeting transcripts here or the slides.

In other news, the US FDA encourages the integration of biomarkers in drug development and their appropriate use in clinical practice.

Data dredging vs. Data mining; Post-hoc vs. Ad-hoc

Data dredging (data fishing, data snooping) is the inappropriate (sometimes deliberately so) search for 'statistically significant' relationships in large quantities of data. This activity was formerly known in the statistical community as data mining, but that term is now in widespread use with an essentially positive meaning, so the pejorative term data dredging is now used instead.

Data mining is the process of extracting hidden patterns from data. As more data is gathered, with the amount of data doubling every three years, data mining is becoming an increasingly important tool to transform this data into information. It is commonly used in a wide range of applications, such as marketing, fraud detection and scientific discovery. Data mining can be applied to data sets of any size. However, while it can be used to uncover hidden patterns in data that has been collected, obviously it can neither uncover patterns which are not already present in the data, nor can it uncover patterns in data that has not been collected.

Post hoc: In or of the form of an argument in which one event is asserted to be the cause of a later event simply by virtue of having happened earlier: coming to conclusions post hoc; post hoc reasoning.
[Latin, short for post hoc, ergō propter hoc, after this, therefore because of this : post, after + hoc, neuter of hic, this.]

Ad hoc (adv.): For the specific purpose, case, or situation at hand and for no other: a committee formed ad hoc to address the issue of salaries.
Ad hoc (adj.): Formed for or concerned with one specific purpose: an ad hoc compensation committee.
Improvised and often impromptu: “On an ad hoc basis, Congress has . . . placed . . . ceilings on military aid to specific countries” (New York Times).
[Latin : ad, to + hoc, neuter accusative of hic, this.]

While both post-hoc and ad-hoc analyses may be performed on data or results we have already seen, an ad-hoc analysis typically takes place alongside the project, whereas a post-hoc analysis takes place strictly after the project, after the unblinding of the study, or after the results of the pre-specified analyses have been reviewed. In this sense, ad-hoc analysis is better than post-hoc analysis.

Sunday, January 11, 2009


EQ-5D is a standardised instrument for use as a measure of health outcome. Applicable to a wide range of health conditions and treatments, it provides a simple descriptive profile and a single index value for health status. EQ-5D was originally designed to complement other instruments but is now increasingly used as a 'stand alone' measure.

An EQ-5D health state (or profile) is a set of observations about a person defined by a descriptive system. An EQ-5D health state may be converted to a single summary index by applying a formula that essentially attaches weights to each of the levels in each dimension. This formula is based on the valuation of EQ-5D health states from general population samples.
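
As a rough illustration of how such a formula attaches weights to levels, the sketch below scores a three-level profile by subtracting a decrement for each dimension from full health (1.0). The decrements are hypothetical numbers made up for this example and are not a published value set; real value sets come from population valuation studies and often include extra terms that this purely additive form ignores. The five dimension names are the standard EQ-5D ones.

```python
DIMENSIONS = ("mobility", "self_care", "usual_activities",
              "pain_discomfort", "anxiety_depression")

# HYPOTHETICAL decrements per level (1 = no problems, 3 = extreme problems).
# These are illustrative only and NOT taken from any real EQ-5D value set.
HYPOTHETICAL_DECREMENTS = {
    "mobility":           {1: 0.0, 2: 0.05, 3: 0.30},
    "self_care":          {1: 0.0, 2: 0.05, 3: 0.20},
    "usual_activities":   {1: 0.0, 2: 0.05, 3: 0.10},
    "pain_discomfort":    {1: 0.0, 2: 0.10, 3: 0.35},
    "anxiety_depression": {1: 0.0, 2: 0.05, 3: 0.25},
}


def eq5d_index(profile: str) -> float:
    """Convert a profile such as '21232' (one digit per dimension) to an index."""
    if len(profile) != len(DIMENSIONS) or any(ch not in "123" for ch in profile):
        raise ValueError("profile must be five digits, each 1, 2, or 3")
    index = 1.0  # full health
    for dimension, level in zip(DIMENSIONS, profile):
        index -= HYPOTHETICAL_DECREMENTS[dimension][int(level)]
    return index


if __name__ == "__main__":
    for profile in ("11111", "21232", "33333"):
        print(profile, round(eq5d_index(profile), 3))
```

Note that with these made-up weights the worst state scores below zero, which mirrors the fact that real value sets can rate some states as worse than death.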

EQ-5D was created and subsequently developed by the EuroQol Group, which was established in 1987. The aim of the group was to test the feasibility of jointly developing a standardized, non-disease-specific instrument for describing and valuing health-related quality of life.

EQ-5D is becoming popular and may one day replace the SF-36 as the most widely used generic health-related quality of life instrument. The main advantages of EQ-5D may be:
  • It is preference-based and suitable for cost-utility analysis.
  • EQ-5D index values can be easily converted to QALYs, which form the denominator in cost-utility analysis.
  • It has fewer questions and can be administered in a short time.
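
The conversion to QALYs mentioned above can be sketched as follows. This is a minimal illustration that assumes a constant utility within each period and ignores discounting; the utilities, durations, and costs are hypothetical, and the function names are mine.

```python
def qalys(periods):
    """Sum of utility x years over (utility, years) periods.

    Assumes utility is constant within each period and applies no discounting.
    """
    return sum(utility * years for utility, years in periods)


def cost_per_qaly(delta_cost, delta_qalys):
    """Incremental cost divided by incremental QALYs (QALYs in the denominator)."""
    if delta_qalys == 0:
        raise ValueError("incremental QALYs must be non-zero")
    return delta_cost / delta_qalys


if __name__ == "__main__":
    # Hypothetical index values and durations for two treatment arms.
    treatment = [(0.8, 2.0), (0.6, 1.0)]
    control = [(0.7, 2.0), (0.5, 1.0)]
    gain = qalys(treatment) - qalys(control)
    # Hypothetical incremental cost of 6000 currency units.
    print(round(gain, 3), round(cost_per_qaly(6000.0, gain), 1))
```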

In one of my studies, the SF-36 was administered as a quality of life measure. However, in order to perform the cost-utility analysis, the SF-36 scores have to be converted into a measure similar to EQ-5D: the SF-6D. There are also other discussions about mapping SF-36 to EQ-5D. QualityMetric, the company that develops the SF-36, also provides the mapping for SF-6D.

However, the analysis of EQ-5D is not as easy as the questions presented in the instrument. According to a book titled "EQ-5D value sets: inventory, comparative review and user guide" (see UNC catalog), two terms seem to be important, but I may need to do a complete study using EQ-5D to figure out how to use these value sets.

  • Time Trade-off (TTO) value sets
  • Visual Analogue Scale (VAS) value sets

9 ways to stay alive when the worst happens

The following items are copied from PARADE (Jan 11, 2009). I am not sure if these arguments (or suggestions) have any scientific merit, but I copy them here just for fun.

  1. Escape a plane crash
    The safest seats on a plane are within five rows of any exit. The No. 1 safest seats are in an exit row or one row away.
  2. Get out of a hotel fire
    Most fire departments use ladders that, at their maximum, can extend around 80 feet into the air. That means that, in order to climb out of your building's window and onto a truck's ladder, you should be on or below the seventh floor.
  3. Leave the hospital alive
    If you need to go to the hospital, weekdays are much safer than weekends. Possible explanations are that, during weekends, staffing levels are lower and more of the workers present are less experienced and less familiar with procedures and patients.
  4. Don't go back to the hospital
    Beware of checking out of the hospital on a Friday. Friday is the most common hospital discharge day, but individuals released on a Friday also have an increased rate of readmission.
  5. Get an initial boost
    In one intriguing study, California researchers analyzed death records to find out whether there was any correlation between people's initials and how long they lived. They divided their subjects' initials into positive and negative groups. The good-initial group included ACE, WIN, WOW, and VIP; the bad contained RAT, BUM, SAD, and DUD. They matched up initials with lifespans and looked for any correlation. The results were stunning (and also hotly debated): a person's initials actually may influence the time and cause of his or her death. "A symbol as simple as one's initials can add four years to life or subtract three years."
    In related news, last names that begin with letters occurring later in the alphabet can be associated with a phenomenon that Scottish researchers call "alphabetical prejudice." They found that when medical teams in a brain-injury rehabilitation center met to discuss patients, people with surnames that came early in the alphabet tended to receive three to four minutes' more attention than people with names later in the alphabet.
  6. Outlive a heart attack
    One of the best places to be is in a casino in Las Vegas. The heart-attack survival rate in Las Vegas is 53%. Compare that to rates of 16% in Seattle (which has some of the nation's best response systems) or 2% in Chicago.
  7. Walk away from a car accident
    The rear middle seat was 16% safer than any other place in the vehicle. Overall, riding in the back is 59% to 86% safer than riding in the front, and riding on the hump is 25% safer than riding in the rear window seats.
    Compared with white cars in daylight hours, black cars had a 12% higher crash risk; gray, 11%; silver, 10%; blue and red, 7%. At dawn or dusk, black cars had a 47% higher crash risk than white cars; gray, 25%; silver, 15%.
  8. Cross the street safely
    The three deadliest days for pedestrians are Jan 1, Dec 23, and Oct 31.
  9. Beware of your birthday
    Women are more likely to die in the week after their birthdays than in any other week of the year, while men's deaths peak before their birthdays.

Tuesday, January 06, 2009

FDAAA and clinical trial data bases

The recently enacted FDA Amendments Act (“FDAAA”) has many requirements that may affect statistical analysis and programming. It is not new that study information needs to be registered in the ClinicalTrials.gov database. However, a major change to the clinical trial database is that, not later than December 25, 2007, the database must include links to information on clinical trial results. The term “results” is used fairly broadly in FDAAA to include summary information FDA has posted from an advisory committee meeting that considered a particular study, FDA public health advisories, FDA’s application review documents, Medline citations to any publications focused on the results of the trial, and the drug entry in the National Library of Medicine database of structured product labels (if available).

The results requirements include demographic and baseline characteristics of the study participants, results values for each of the primary and secondary outcomes for each arm of the study, point of contact for scientific queries, and information on sponsor agreements with investigators that could restrict their ability to discuss or publish trial results.

What makes the results requirement most complicated is the format in which the results must be submitted: rather than uploading study results that have already been compiled (into a clinical study report, for example), sponsors must use ClinicalTrials.gov's online Protocol Registration System (PRS) to first create results tables and then enter the data and statistical analyses.

This requirement means that the statistician needs to step in to make sure the study information is entered correctly.

Some of the statements in the amendment act are worth attention. The act states that only applicable drug clinical trials are required to have results published. "IN GENERAL.—The term ‘applicable drug clinical trial’ means a controlled clinical investigation, other than a phase I clinical investigation, of a drug subject..." This seems to imply that phase I studies are exempt from this requirement. However, the assignment of study phases is sometimes arbitrary, especially when a phase I study is conducted in patients rather than in healthy volunteers.

The requirement to present results for all primary and secondary outcomes could mislead readers who do not know how to interpret them. Not everybody can read the results of scientifically appropriate tests of statistical significance. The statement says: “(ii) PRIMARY AND SECONDARY OUTCOMES.—The primary and secondary outcome measures as submitted under paragraph (2)(A)(ii)(I)(ll), and a table of values for each of the primary and secondary outcome measures for each arm of the clinical trial, including the results of scientifically appropriate tests of the statistical significance of such outcome measures.”

Regarding AE and SAE reporting, the statement says: “(I) SERIOUS ADVERSE EVENTS.—A table of anticipated and unanticipated serious adverse events grouped by organ system, with number and frequency of such event in each arm of the clinical trial.
(II) FREQUENT ADVERSE EVENTS.—A table of anticipated and unanticipated adverse events that are not included in the table described in subclause (I) that exceed a frequency of 5 percent within any arm of the clinical trial, grouped by organ system, with number and frequency of such event in each arm of the clinical trial.”

The confusion here is that there is no clear definition of anticipated and unanticipated SAEs and AEs. Perhaps in future study protocols, the anticipated SAEs and AEs need to be listed in the protocol. The summary tables of SAEs and AEs would then need to be separated into anticipated and unanticipated events.
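
From a programming point of view, the frequent adverse event table amounts to a threshold filter applied per arm. The sketch below is one possible reading of the 5 percent rule, not an official algorithm: the data layout, function name, and counts are hypothetical, "exceed" is taken as strictly greater than 5 percent, and the exclusion of events already reported in the serious adverse event table is assumed to have been done upstream.

```python
def frequent_adverse_events(counts, arm_sizes, threshold=0.05):
    """Return rows for events whose frequency exceeds the threshold in ANY arm.

    counts:    {(organ_system, event): {arm: number of subjects with the event}}
    arm_sizes: {arm: number of subjects in the arm}
    Each row is (organ_system, event, {arm: (number, frequency)}),
    sorted by organ system and then by event.
    """
    rows = []
    for (organ_system, event), per_arm in sorted(counts.items()):
        summary = {}
        for arm, n_subjects in arm_sizes.items():
            n_events = per_arm.get(arm, 0)
            summary[arm] = (n_events, n_events / n_subjects)
        # "Exceed" is read as strictly greater than the threshold.
        if any(freq > threshold for _, freq in summary.values()):
            rows.append((organ_system, event, summary))
    return rows


if __name__ == "__main__":
    # Hypothetical counts for a two-arm trial with 100 subjects per arm.
    arm_sizes = {"active": 100, "placebo": 100}
    counts = {
        ("Nervous system", "Headache"): {"active": 6, "placebo": 3},
        ("Gastrointestinal", "Nausea"): {"active": 5, "placebo": 4},
        ("Skin", "Rash"): {"active": 1, "placebo": 8},
    }
    for organ_system, event, summary in frequent_adverse_events(counts, arm_sizes):
        print(organ_system, event, summary)
```

In this made-up example, nausea at exactly 5 percent in one arm is not reported, while rash is reported because it exceeds the threshold in the placebo arm alone. Splitting the rows into anticipated and unanticipated events would need an extra flag taken from the protocol.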

Further readings on this topic can be found from: