On Biostatistics and Clinical Trials: June 2011

Saturday, June 18, 2011

Is blinded study really blinded? - assessment of blinding / unblinding in clinical trials

Randomization and blinding are critical components of a clinical trial from start (design) to finish. The randomized, controlled, double-blinded trial (RCT) has long been the ideal clinical trial design. Inappropriate randomization and blinding (or potential unblinding) affect the integrity of the clinical trial. If the patient or investigator is aware of the treatment assignment, there will be conscious or unconscious biases in assessing efficacy, safety, or patient-reported outcomes. With available software and computer programs, generating a randomization schedule is relatively easy. Ten years ago, I wrote a paper on “Generating randomization schedule using SAS programming” to show how easily a randomization schedule can be generated. With interactive response technologies (IRT), including interactive voice response systems (IVRS) and interactive web response systems (IWRS), implementation of the randomization can also be easily managed. However, maintaining the blinding during the study may not be as easy as we thought.
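To illustrate how little code a randomization schedule takes, here is a minimal sketch of permuted-block randomization in Python. It is not the SAS program from the paper mentioned above; the function name, arm labels, block size, and seed are all assumptions chosen for illustration.

```python
import random


def permuted_block_schedule(n_subjects, arms=("A", "B"), block_size=4, seed=2011):
    """Return a list of treatment assignments using permuted blocks.

    All names and defaults here are hypothetical. Each block contains
    every arm equally often, so the arms stay balanced after each block.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")

    rng = random.Random(seed)  # fixed seed so the schedule can be reproduced
    per_arm = block_size // len(arms)
    schedule = []
    while len(schedule) < n_subjects:
        block = list(arms) * per_arm  # e.g. A, B, A, B for block_size=4
        rng.shuffle(block)            # random order within the block
        schedule.extend(block)
    return schedule[:n_subjects]


if __name__ == "__main__":
    for subject, arm in enumerate(permuted_block_schedule(8), start=1):
        print(f"Subject {subject:03d}: {arm}")
```

Generating the list is the easy part. Who can see the list, and when, is what the rest of this post is about.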

I still remember the time when I was one of the randomization team members at PPD. After we generated the randomization schedule, we had to put it into an envelope and seal the envelope with signatures. Then we had to put the envelope into a locked security box in a secured randomization room. To retrieve the randomization schedule, at least two statisticians had to be present to open the security box.

While the actual randomization schedule is locked and secured, the concealment of the treatment assignment can still be compromised by what happens at the site, by how the patient and investigator guess the treatment assignment, and by how the unblinded personnel communicate with the blinded team members.

Many factors can cause potential unblinding. Here are some examples:

Guess treatment assignment by the experience of adverse events and side effects. Suppose an intravenously administered drug causes more headaches than placebo; a patient with a headache may guess that he/she is in the treatment group and not on placebo. While this guess may not be 100% accurate, the majority of patients may guess their treatment assignment correctly. In the book by Chow et al., ‘Design and analysis of clinical trials: concepts and methodologies’, an example of the challenge in maintaining blinding is described: “beta-blocker (e.g., propranolol) have specific pharmacologic effects such as lowering blood pressure and the heart rate and distinct adverse effects such as fatigue, nightmares, and depression. Since blood pressure and heart rate are vital signs routinely evaluated at every visit in clinical trials, if a drug such as propranolol is known to lower blood pressure and the heart rate, then preservation of blindness is a huge challenge and seems almost impossible.” In a large-scale study (the BHAT study), patients, investigators, and clinic coordinators were asked at the conclusion of the trial to guess each patient’s treatment assignment. For patients on propranolol, 79.9% of patients, 69.6% of investigators, and 67% of clinic coordinators guessed correctly. For patients on placebo, 42.8% of patients, 58.6% of investigators, and 70.6% of clinic coordinators guessed correctly.

Guess treatment assignment by improvement or no improvement in efficacy. If there is prior knowledge that a treatment is effective (lack of equipoise), the investigator or patient can guess which treatment the patient is on based on the presence or lack of effect.

Guess treatment assignment by knowing the blood concentration of the drug or analytes. If a treatment is for augmentation purposes, a patient could have his/her blood sample tested to learn whether the concentration of the augmented drug has increased, and then guess which treatment group he/she is in.

In double-blinded studies, there are always some unblinded groups. These groups could include global drug safety for safety monitoring, laboratories that measure drug concentration or biomarkers, study drug supply, and the unblinded site pharmacist. All of these groups could unintentionally reveal the treatment assignment to the rest of the study team.

For a study with a data monitoring committee (DMC) that involves a third party to prepare the unblinded data for the DMC, treatment concealment could be compromised during the information exchange with the blinded study team. This is critical for studies with adaptive designs, where the patient data needs to be constantly reviewed and analyzed. An interesting example was discussed by Janet Wittes regarding an awkward situation in an adaptive design where the DMC knew the event rate by treatment assignment and the sponsor didn’t.

There is a dilemma when we develop the informed consent form. On the one hand, we are required to put into the informed consent form as much information as we can. On the other hand, the more information we put into the informed consent form, the more likely we enable the patients to guess their treatment assignment (based on their experience of side effects or perceived efficacy).

Ideally, in a double-blind trial, it is good practice to evaluate, for both the subjects and the investigators, whether blinding / masking has been preserved. However, in the real world, it is rare for double-blinded clinical trials to include a formal assessment of how well the blinding has been preserved. If the assessment of blinding becomes routine, I think that many studies will show that subjects/investigators guessed correctly more frequently than they would have by chance alone. Part of the reason this assessment has not been done often is perhaps the difficulty of explaining the study results if the blinding is found to be compromised. It will be extremely difficult to assess the magnitude of the impact on the safety and efficacy evaluation if the blinding/treatment assignment concealment is compromised.
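As a rough illustration of what "more frequently than by chance alone" could mean in practice, here is a minimal sketch of a one-sided exact binomial test of the proportion of correct guesses against 50%. It assumes a two-arm trial with 1:1 allocation in which every respondent is forced to guess (no "don't know" answers). The function name and the counts are hypothetical and are not taken from any study discussed here. It is only a simple chance comparison, not a formal blinding assessment method.

```python
from math import comb


def p_value_correct_guesses(n_correct, n_guesses, chance=0.5):
    """One-sided exact binomial p-value: P(X >= n_correct) under pure guessing.

    Hypothetical helper. `chance` is the probability of a correct guess
    if blinding is fully preserved (0.5 assumes two arms, 1:1 allocation).
    """
    return sum(
        comb(n_guesses, k) * chance**k * (1 - chance) ** (n_guesses - k)
        for k in range(n_correct, n_guesses + 1)
    )


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    n_guesses = 100
    n_correct = 65
    p = p_value_correct_guesses(n_correct, n_guesses)
    print(f"{n_correct}/{n_guesses} correct guesses, one-sided p = {p:.4f}")
```

A small p-value would suggest that the guesses are better than chance, but it would not say how much bias the unblinding introduced into the efficacy or safety results, which is the harder question raised above.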

Further readings:

Assuring that double-blind is blind

Blinded trials taken to the test: an analysis of randomized clinical trials that report tests for the success of blinding

Blinding, unblinding, and the placebo effect: An analysis of patient’s guesses of treatment assignment in a double-blind clinical trial

Can keeping clinical trial participants blind to their study treatment adversely affect subsequent care?

Assessment of blinding in clinical trials

Concealing treatment allocation in randomised trials

Wednesday, June 15, 2011

Bland-Altman Plot for Assessing Agreement

A Bland-Altman plot is a scatter plot with the mean of the two measurements on the horizontal axis and the difference between them on the vertical axis. It shows the amount of disagreement between the two measures (via the differences) and lets you see how this disagreement relates to the magnitude of the measurements.

When I was in graduate school, the statistical analysis of microarray data was just becoming a hot topic. In collaboration with Dr Rick Song, we looked at microarray data and wrote a manuscript titled “On Graphical Presentation and Quantitative Analysis of cDNA Microarray Data”, which we presented at JSM. In this manuscript, we proposed using the Bland-Altman plot. I have not yet had a chance to apply this approach in clinical trials, but I often see articles that use the Bland-Altman plot, for example, an article titled “Using the Bland–Altman method to measure agreement with repeated measures” from the British Journal of Anaesthesia.

When the data are appropriate, the Bland-Altman plot can be a handy tool. It is worth relaying the paragraphs from our original paper on graphical presentation of microarray data using the Bland-Altman plot.

“Graphical presentation is usually the first step for data analysis of microarray data. In the case without duplication (this is typical in microarray experiment), scatter plots will be drawn and then a regression line drawn through the data. This helps the eye in gauging the degree of agreement between two measurements and also may help us to identify the "outliers" that represent the differentially expressed genes in microarray experiment.

In clinical medicine, to assess agreement between two methods of clinical measurement, Bland and Altman proposed to plot the difference between the methods (A-B) against the mean (A+B)/2[12,13,14,15]. This approach has been extensively used in medical research for assessing measurement error and comparing different measurements for the same quantity. Bland and Altman’s method can be also applied to the microarray data. We can plot (Rm-Gm) against (Rm+Gm)/2 (figure2 above).

Calculating or plotting a regression line is not our focus as we are not concerned with the estimated prediction of one color intensity by another but with the theoretical relationship of equality and deviations from it.

There are several advantages for presenting the microarray data using Bland and Altman’s approach:

The plot of difference against mean allows us to investigate any possible relationship between the discrepancies and the true value. The plot will also show clearly any extreme or outlying observations. If two different samples are used in the experiment, these extreme or outlying observations could indicate the differentially expressed genes. It is often helpful to use the same scale for both axes when plotting differences against mean values. This feature helps to show the discrepancies in relation to the size of the measurement.

Bland and Altman's method makes it easier for us to estimate the precision of the estimated limits of agreement between two color intensities. We want a measure of the agreement that is easy to estimate and to interpret for a measurement on the color intensity of an individual gene. An obvious starting point is the difference between measurements by the two channels on the same gene. There may be a consistent tendency for one channel to exceed the other. This is called calibration factor and can be estimated by the mean difference. There will also be variation about this mean, which we can estimate by the standard deviation of the differences. These estimates are meaningful only if we can assume that calibration factor and variability are uniform throughout all genes.”
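The quantities described above (the mean difference, the standard deviation of the differences, and the limits of agreement) can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the original manuscript. The function name and the sample intensities are hypothetical, and the multiplier 1.96 is the conventional choice for approximate 95% limits of agreement, which assumes the differences are roughly normally distributed.

```python
from statistics import mean, stdev


def bland_altman(a, b, z=1.96):
    """Compute the points and summary statistics for a Bland-Altman plot.

    a, b: paired measurements of the same items (e.g. two channels per gene).
    Returns the per-item means (x-axis), differences a - b (y-axis),
    the mean difference, its sample standard deviation, and the
    limits of agreement (mean difference +/- z * SD).
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("a and b must be paired and contain at least 2 items")

    means = [(x + y) / 2 for x, y in zip(a, b)]
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)   # the "calibration factor" in the text above
    sd = stdev(diffs)    # variation about the mean difference
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "sd": sd,
        "lower": bias - z * sd,
        "upper": bias + z * sd,
    }


if __name__ == "__main__":
    # Hypothetical log-intensities for two channels (Rm, Gm) on five genes.
    rm = [8.1, 9.4, 7.6, 10.2, 8.8]
    gm = [7.9, 9.1, 7.8, 9.6, 8.9]
    result = bland_altman(rm, gm)
    print(f"mean difference = {result['bias']:.3f}")
    print(f"limits of agreement = ({result['lower']:.3f}, {result['upper']:.3f})")
```

Plotting `diffs` against `means`, with horizontal lines at the mean difference and the two limits, gives the Bland-Altman plot. Points outside the limits are the candidates for extreme or outlying observations.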


Friday, June 03, 2011

Restructure FDA's Drug Review Process?

Last week, I had a chance to listen to a speech by Dr Scott Gottlieb. While he touched on several topics related to health care reform, I was specifically interested in his discussion of restructuring FDA’s drug approval process.

Dr Gottlieb offered a lot of insight into the inner workings of FDA and analyzed the root causes of the very long and inefficient FDA drug review process.

Following his speech, I located his paper “SHOULD FDA RESTRUCTURE ITS DRUG REVIEW PROCESS?” on FDLI’s website. Many of his analyses ring true and are to the point.

For example, he explained why FDA adopted a matrix management structure for its review program.

“Prior to FDA’s adoption of a matrix management structure for its drug review program, agency scientists were organized largely around the clinical areas in which they worked (oncology, cardio-renal, antiviral, etc). This therapeutically focused structure had some advantages, but also led to some of its own challenges. FDA’s adoption of a matrix structure was aimed at solving some of these problems.

For one thing, sponsors complained that the advice they received about disciplines like biostatistics or clinical pharmacology varied (sometimes significantly) across different therapeutic divisions. Statisticians in one clinical division would be interpreting certain principles of statistics or evaluating a particular protocol design in a manner different than statisticians inside another therapeutic division.

These discrepancies still occur. But it is believed that the matrix organizational structure cuts down on this sort of conflict.

Grouping all of the statisticians or pharmacologists inside the same office increases opportunities for comparable training and cross-calibration on key principles. CDER management pulled the first group of the review divisions—the chemists—in 1995. The impetus was differences in pharmaceutical quality requirements being maintained among the different clinical divisions. Ultimately, having the chemists organized as a single group fostered the development of consistent standards. It also enabled FDA to negotiate the standards established by the International Conference on Harmonization (ICH).

Another reason for establishing the matrix was to improve morale. FDA remains a very “physician centered” culture, but was much more so prior to adoption of the matrix. Staff who lacked medical degrees or who weren’t the clinical reviewers on an application sometimes complained that they felt marginalized in the review process. As one statistician told me, “we were treated like second-class citizens.” Specialists from non-clinical disciplines like statistics also complained that remaining immersed in a single therapeutic area didn’t give them the breadth of experience that they needed for their own professional development.

Similarly, the organization of scientific personnel by therapeutic area was also seen as an impediment to their continued training in their chosen disciplines. For example, statisticians came together for the equivalent of grand rounds or other kinds of shared learning experiences. But these kinds of cross-training opportunities were challenging because staff were ultimately accountable to their divisions. The shared training opportunities weren’t prioritized. Efforts were also made to rotate non-clinical experts across different therapeutic areas. But the challenges endured.”

He then went on to discuss the issues with FDA’s weak matrix management structure.

“But in practical terms, the weak matrix means that FDA project managers have limited dominion over key aspects of the review. Key disciplines involved in the review aren’t accountable to the project manager, or the division director. It is a system where there are few management carrots and no sticks. This weak structure also makes it harder to organize collaborative projects or even team meetings. The project manager doesn’t have strong authority when it comes to managing the collaboration between the different scientists involved in a drug’s review.”

He also mentioned the quality of the FDA review scientists and issues with FDA’s policy of allowing very flexible working hours and work-from-home schedules. The current FDA drug review team is loosely organized and inefficient in many respects. However, it is not easy to make big changes to the current process.
