Monday, August 01, 2016

Should hypothesis tests be performed and p-values be provided for safety variables in efficacy evaluation clinical trials?

A p-value is the probability of observing a test statistic at least as extreme as the one calculated, assuming the null hypothesis is true. In many situations, p-values from hypothesis testing have been over-used, mis-used, or mis-interpreted. The American Statistical Association seems to be fed up with the mis-use of p-values and has formally issued a statement about the p-value (see AMERICAN STATISTICAL ASSOCIATION RELEASES STATEMENT ON STATISTICAL SIGNIFICANCE AND P-VALUES). It also provides the following six principles to improve the conduct and interpretation of quantitative science.
  • P-values can indicate how incompatible the data are with a specified statistical model.
  • P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
  • Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
  • Proper inference requires full reporting and transparency.
  • A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
  • By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.

However, we continue to see cases where p-values are over-used, mis-used, mis-interpreted, or used for the wrong purpose. One area of p-value misuse is the analysis of safety endpoints such as adverse events and laboratory parameters in clinical trials. One of the ASA's principles is that “a p-value, or statistical significance, does not measure the size of an effect or the importance of a result” – ironically, this is also the reason people present tens or hundreds of p-values, hoping that a p-value will measure the size of an effect or the importance of a result.

In a recent article in the New England Journal of Medicine, p-values were provided for each adverse event even though hypothesis testing to compare the incidence of each adverse event was not the intention of the study. In Marso et al (2016) Liraglutide and Cardiovascular Outcomes in Type 2 Diabetes, the following summary table was presented with p-values for individual adverse events.

The study protocol did not mention any inferential analysis for adverse events. It is clear that the p-values presented in the article are post-hoc and unplanned. Here is the analysis plan for AEs in the protocol:

“AEs are summarised descriptively. The summaries of AEs are made displaying the number of subjects with at least one event, the percentage of subjects with at least one event, the number of events and the event rate per 100 years. These summaries are done by seriousness, severity, relation to treatment, MESI, withdrawal due to AEs and outcome.”

In this same article, the appendix also presented p-values for cardiovascular and anti-diabetes medications at baseline and during the trial. However, it could be misleading to interpret the results based on these p-values. For example, for statins introduced during the trial, the rates are 379 / 4668 = 8.1% in the Liraglutide group and 450 / 4672 = 9.6% in the Placebo group, with a p-value of 0.01. While the p-value is statistically significant, the difference in rates (8.1% versus 9.6%) is not really meaningful clinically.
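As a minimal sketch (the article does not state which test was used; a standard two-proportion z-test is assumed here), the statin comparison can be reproduced to show how a large sample size turns a modest 1.5 percentage-point difference into a "significant" p-value:

```python
import math

# Statins introduced during the trial (counts from the Marso et al. 2016 appendix)
x1, n1 = 379, 4668   # Liraglutide group: events, subjects
x2, n2 = 450, 4672   # Placebo group: events, subjects

p1, p2 = x1 / n1, x2 / n2
pooled = (x1 + x2) / (n1 + n2)                      # pooled proportion under H0
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p2 - p1) / se
p_value = math.erfc(abs(z) / math.sqrt(2))          # two-sided p-value

print(f"rates: {p1:.1%} vs {p2:.1%}; difference = {p2 - p1:.3f}; p = {p_value:.3f}")
```

The p-value comes out near 0.01, matching the article, yet the absolute risk difference is only about 1.5 percentage points – the p-value reflects the sample size as much as the size of the effect.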

Similarly, in another NEJM article, Goss et al (2016) Extending Aromatase-Inhibitor Adjuvant Therapy to 10 Years, p-values were provided for individual adverse events.

Usually, clinical trials are designed to assess the treatment effect for efficacy endpoints, not for safety endpoints such as adverse events and laboratory test results. For a clinical trial, there could be many different adverse events reported. Providing a p-value for each adverse event could be mis-interpreted as testing for a statistically significant difference for each event between treatment groups. Uniformly and indiscriminately applying hypothesis testing to tens or hundreds of different adverse event terms is against statistical principles.
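A small simulation (hypothetical numbers, for illustration only) shows why: even when the true AE rates are identical in both arms, screening many AE terms with unadjusted tests will flag some at p < 0.05 purely by chance – about 5 out of every 100 comparisons on average.

```python
import math
import random

random.seed(2016)

def two_prop_p(x1, n1, x2, n2):
    """Two-sided two-proportion z-test with pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return 1.0 if se == 0 else math.erfc(abs(p2 - p1) / se / math.sqrt(2))

n = 500        # subjects per arm (made-up trial size)
n_terms = 100  # distinct AE terms, all with NO true treatment difference
rate = 0.10    # same true incidence in both arms

false_positives = 0
for _ in range(n_terms):
    x_drug = sum(random.random() < rate for _ in range(n))
    x_placebo = sum(random.random() < rate for _ in range(n))
    if two_prop_p(x_drug, n, x_placebo, n) < 0.05:
        false_positives += 1

print(f"{false_positives} of {n_terms} AE comparisons 'significant' at 0.05 by chance alone")
```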

The FDA Center for Drug Evaluation and Research (CDER) has a reviewer guidance, Conducting a Clinical Safety Review of a New Product Application and Preparing a Report on the Review. It has the following statements about hypothesis testing for safety endpoints.
Approaches to evaluation of the safety of a drug generally differ substantially from methods used to evaluate effectiveness. Most of the studies in phases 2-3 of a drug development program are directed toward establishing effectiveness. In designing these trials, critical efficacy endpoints are identified in advance, sample sizes are estimated to permit an adequate assessment of effectiveness, and serious efforts are made, in planning interim looks at data or in controlling multiplicity, to preserve the type 1 error (alpha error) for the main end point. It is also common to devote particular attention to examining critical endpoints by defining them with great care and, in many cases, by using blinded committees to adjudicate them. In contrast, with few exceptions, phase 2-3 trials are not designed to test specified hypotheses about safety nor to measure or identify adverse reactions with any pre-specified level of sensitivity. The exceptions occur when a particular concern related to the drug or drug class has arisen and when there is a specific safety advantage being studied. In these cases, there will often be safety studies with primary safety endpoints that have all the features of hypothesis testing, including blinding, control groups, and pre-specified statistical plans.
In the usual case, however, any apparent finding emerges from an assessment of dozens of potential endpoints (adverse events) of interest, making description of the statistical uncertainty of the finding using conventional significance levels very difficult. The approach taken is therefore best described as one of exploration and estimation of event rates, with particular attention to comparing results of individual studies and pooled data. It should be appreciated that exploratory analyses (e.g., subset analyses, to which a great caution is applied in a hypothesis testing setting) are a critical and essential part of a safety evaluation. These analyses can, of course, lead to false conclusions, but need to be carried out nonetheless, with attention to consistency across studies and prior knowledge. The approach typically followed is to screen broadly for adverse events and to expect that this will reveal the common adverse reaction profile of a new drug and will detect some of the less common and more serious adverse reactions associated with drug use.
7.1.5.5 Identifying Common and Drug-Related Adverse Events
For common adverse events, the reviewer should attempt to identify those events that can reasonably be considered drug related. Although it is tempting to use hypothesis-testing methods, any reasonable correction for multiplicity would make a finding almost impossible, and studies are almost invariably underpowered for statistically valid detection of small differences. The most persuasive evidence for causality is a consistent difference from control across studies, and evidence of dose response. The reviewer may also consider specifying criteria for the minimum rate and the difference between drug and placebo rate that would be considered sufficient to establish that an event is drug related (e.g., for a given dataset, events occurring at an incidence of at least 5 percent and for which the incidence is at least twice, or some other percentage greater than, the placebo incidence would be considered common and drug related). The reviewer should be mindful that such criteria are inevitably arbitrary and sensitive to sample size.
7.1.7.3 Standard Analyses and Explorations of Laboratory Data
This review should generally include three standard approaches to the analysis of laboratory data. The first two analyses are based on comparative trial data. The third analysis should focus on all patients in the phase 2 to 3 experience. Analyses are intended to be descriptive and should not be thought of as hypothesis testing. P-values or confidence intervals can provide some evidence of the strength of the finding, but unless the trials are designed for hypothesis testing (rarely the case), these should be thought of as descriptive. Generally, the magnitude of change is more important than the p-value for the difference.
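The "minimum rate and ratio" screen described in section 7.1.5.5 above can be sketched as follows. The thresholds are the guidance's illustrative values, and the AE counts are made-up numbers for illustration only:

```python
def flag_common_drug_related(aes, n_drug, n_placebo, min_rate=0.05, ratio=2.0):
    """Flag AE terms whose drug-arm incidence is at least min_rate and at
    least `ratio` times the placebo incidence (the reviewer guidance's
    illustrative criteria for 'common and drug related' events)."""
    flagged = []
    for term, (x_drug, x_placebo) in aes.items():
        r_drug, r_placebo = x_drug / n_drug, x_placebo / n_placebo
        if r_drug >= min_rate and r_drug >= ratio * r_placebo:
            flagged.append(term)
    return flagged

# Hypothetical counts with 500 subjects per arm, for illustration only
aes = {"nausea": (60, 20), "headache": (30, 28), "dizziness": (12, 2)}
print(flag_common_drug_related(aes, n_drug=500, n_placebo=500))
```

As the guidance itself notes, any such criteria are inevitably arbitrary and sensitive to sample size; the screen is a triage tool for description and exploration, not a hypothesis test.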
PhUSE is an independent, not-for-profit organisation run by volunteers. Since its inception, PhUSE has expanded from its roots as a conference for European Statistical Programmers to a global platform for the discussion of topics encompassing the work of Data Managers, Biostatisticians, Statistical Programmers and eClinical IT professionals. PhUSE is run by statistical programmers, but it is attempting to put together some guidelines about how statistical tables should be presented. I guess that statisticians may not agree with all of their proposals.

PhUSE has published a draft proposal, “Analyses and Displays Associated with Adverse Events – Focus on Adverse Events in Phase 2-4 Clinical Trials and Integrated Summary Documents”. The proposal has a specific section about the presentation of p-values in adverse event summary tables.
6.2. P-values and Confidence Intervals
There has been ongoing debate on the value or lack of value for the inclusion of p-values and/or confidence intervals in safety assessments (Crowe et al. 2009). This white paper does not attempt to resolve this debate. As noted in the Reviewer Guidance, p-values or confidence intervals can provide some evidence of the strength of the finding, but unless the trials are designed for hypothesis testing, these should be thought of as descriptive. Throughout this white paper, p-values and measures of spread are included in several places. Where these are included, they should not be considered as hypothesis testing. If a company or compound team decides that these are not helpful as a tool for reviewing the data, they can be excluded from the display.
Some teams may find p-values and/or confidence intervals useful to facilitate focus, but have concerns that lack of “statistical significance” provides unwarranted dismissal of a potential signal. Conversely, there are concerns that due to multiplicity issues, there could be over-interpretation of p-values adding potential concern for too many outcomes. Similarly, there are concerns that the lower- or upper-bound of confidence intervals will be over-interpreted. It is important for the users of these TFLs to be educated on these issues.
Similarly, PhUSE also has a white paper, “Analyses and Displays Associated with Demographics, Disposition, and Medications in Phase 2-4 Clinical Trials and Integrated Summary Documents”, where p-values in summary tables for demographics and concomitant medications are also discussed.
6.1.1. P-values
There has been ongoing debate on the value or lack of value of the inclusion of p-values in assessments of demographics, disposition, and medications. This white paper does not attempt to resolve this debate. Using p-values for the purpose of describing a population is generally considered to have no added value. The controversy usually pertains to safety assessments. Throughout this white paper, p-values have not been included. If a company or compound team decides that these will be helpful as a tool for reviewing the data, they can be included in the display.
It is very common that p-values are provided for demographic and baseline characteristics to make sure that there is balance (no difference between treatment groups) in key demographic and baseline characteristics. These demographic and baseline characteristics are usually the factors for performing sub-group analyses.

It is also very common that p-values are not provided for safety and ancillary variables such as adverse events, laboratory parameters, concomitant medications, and medical histories. The obvious concerns are about multiplicity, lack of pre-specification, the interpretation of these p-values, and mis-interpretation of a p-value as a measure of the importance of a result. The safety analyses are still mainly on a summary basis unless specific safety variables are pre-specified for hypothesis testing. The safety assessment is sometimes based on qualitative rather than quantitative analysis – this is why the narratives for serious adverse events (SAEs) play a critical role in safety assessment. For example, it is now well known that the drug Tysabri is effective in treating relapsing-remitting multiple sclerosis, but increases the risk of progressive multifocal leukoencephalopathy (PML), an opportunistic viral infection of the brain that usually leads to death or severe disability. PML is very rare and is not supposed to be seen in clinical trial subjects. If any PML case is reported in the Tysabri treatment group, it will be considered significant even if the p-value is not.
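For rare events like PML, the relevant arithmetic is not a comparison p-value but the "rule of three": if zero events are observed among n subjects, an approximate one-sided 95% upper confidence bound for the true event rate is 3/n. A quick sketch (n = 2000 is a made-up trial size):

```python
def rule_of_three_upper(n):
    """Approximate one-sided 95% upper bound on the event rate
    when 0 of n subjects have the event."""
    return 3.0 / n

def exact_upper(n, alpha=0.05):
    """Exact binomial bound: the rate p solving (1 - p)**n = alpha."""
    return 1.0 - alpha ** (1.0 / n)

n = 2000
print(f"0 events in {n} subjects -> upper 95% bound "
      f"~{rule_of_three_upper(n):.2%} (exact {exact_upper(n):.2%})")
```

So a single case of an event whose background rate is thought to be essentially zero is a striking qualitative signal in a trial of a couple of thousand patients, regardless of what a treatment-group comparison p-value would say.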


Friday, July 15, 2016

Protocol amendment in clinical trials

For every clinical trial, the study protocol is the centerpiece: it dictates how the study should be conducted, what data will be collected, and how the data will be analyzed. Usually, after the IND (Investigational New Drug) application including the study protocol is filed and FDA does not provide any comments (or place it on clinical hold) within 30 days, the sponsor will consider the study protocol approved to proceed. However, it is very common that during the study conduct, some aspects of the study protocol need to be changed or amended.

Protocol amendments are guided by the Code of Federal Regulations (CFR) section 312.30 Protocol amendments. The CFR states the following regarding protocol amendments:
(b) Changes in a protocol. (1) A sponsor shall submit a protocol amendment describing any change in a Phase 1 protocol that significantly affects the safety of subjects or any change in a Phase 2 or 3 protocol that significantly affects the safety of subjects, the scope of the investigation, or the scientific quality of the study. Examples of changes requiring an amendment under this paragraph include:
(i) Any increase in drug dosage or duration of exposure of individual subjects to the drug beyond that in the current protocol, or any significant increase in the number of subjects under study.
(ii) Any significant change in the design of a protocol (such as the addition or dropping of a control group).
(iii) The addition of a new test or procedure that is intended to improve monitoring for, or reduce the risk of, a side effect or adverse event; or the dropping of a test intended to monitor safety.
One question that is often asked is whether the protocol needs to be amended if the only changes are to the statistical analyses or the sample size. Changes in the statistical analysis or the sample size can be considered changes “that significantly affect the scope of the investigation, or the scientific quality of the study”, and are therefore subject to a protocol amendment. Notice the word ‘significantly’, which implies that some changes may be allowed without amending the protocol, for example a small increase (usually less than 10% of total subjects) in sample size, or changes in the statistical analysis methods for secondary or exploratory endpoints.

In the latest issue of Therapeutic Innovation and Regulatory Science, Getz et al published a paper, “The Impact of Protocol Amendments on Clinical Trial Performance and Cost”. They found that 57% of protocols had at least one substantial amendment, and nearly half (45%) of these amendments were deemed ‘avoidable’. Phase II and III protocols had a mean number of 2.2 and 2.3 global amendments, respectively. My experience with clinical trials in rare diseases indicates an even higher percentage of protocols with amendments (almost every protocol) and a higher number of amendments per protocol.
Protocol amendments have a great impact on the conduct of clinical trials:
  • Significant impact on the cost
  • Significant impact on the timeline
  • Significant impact on the resources
  • May have significant impact on the credibility of the study results
  • May result in more protocol deviations

“Unplanned delays, disruptions, and costs associated with implementing protocol amendments have long challenged drug development companies and their contract research partners. Despite a rigorous and extensive internal review and approval process, the majority of finalized protocols are amended multiple times – particularly those directing later-stage phase III studies.
 The frequency of protocol amendments varies by therapeutic area and is highly correlated with more scientifically and operationally complex protocols. Increased amendment frequency per protocol is associated with protocols that have a higher relative number of protocol procedures and eligibility criteria, and more investigative sites dispersed across more countries.
Amendments are implemented for a wide variety of reasons, including the introduction of new standards of care, changes to medications permitted before and during the clinical trial, the availability of new safety data, and requests from regulatory agencies and other oversight organizations (eg, ethical review boards). The top reason for amending a protocol is to modify study volunteer eligibility criteria due to changes in study design strategy and difficulties recruiting patients.”
Some large pharmaceutical companies have started to look at the impact of protocol amendments on the overall cost and timeline. I even heard (unconfirmed) that GSK used the number of protocol amendments as one of its performance evaluation criteria: the fewer the protocol amendments, the better the performance.

The number of protocol amendments may also be the result of a specific study design. A recent discussion noted that the phase I dose cohort expansion design can result in an almost unlimited number of protocol amendments. This specific design has been discussed in my previous posting. As an example, a dose cohort expansion study by Merck has 50 protocol amendments and still counting.

Adaptive design has been a hot topic in the clinical trial field for the last ten years. The enthusiasm about adaptive design has died down a little bit except in the oncology area. One of the key advantages of adaptive design is the ability to implement changes based on pre-specified criteria and therefore avoid protocol amendments. For example, if all criteria for pruning treatment arms or for increasing the sample size are pre-specified, then when the criteria are met, there is no need for a protocol amendment to implement the changes. This can be good for saving time and cost, but may be bad due to the loss of learning opportunities.

For traditional study designs, it is always desirable to minimize the number of protocol amendments. In reality, protocol amendments are ubiquitous. The reasons for protocol amendments may include the following (not intended to be an exhaustive list):
  • Lack of internal expertise in the therapeutic area
  • Lack of consultation with external experts
  • Lack of engagement with the steering committee
  • Lack of engagement with the CRO, which may have first-hand experience with the sites
  • Lack of experience from other countries – standard of care may be very different in other countries
  • Engaging the statisticians too late – the statistician should be engaged in the study design including endpoint selection, not just calculating the sample size
  • Signing off the final protocol too early, for example before the pre-IND meeting with FDA
  • Submitting the final protocol when a concept protocol, protocol synopsis, or draft protocol would suffice
  • Inadequate or unrealistic inclusion/exclusion criteria
  • Lack of quality control in the protocol review/approval process
It is often the case that a protocol amendment is triggered by one of the following – the protocol may go through several rounds of amendments before the first patient is enrolled into the study:
  • Protocol amendment after FDA pre-IND meeting
  • Protocol amendment per external committees requests – such as Data Monitoring Committee (DMC), Steering Committee
  • Protocol amendment after the investigator meeting
  • Protocol amendment after FDA’s IND comments
  • Protocol amendment due to the difficulties in patient enrollment
  • Protocol amendment after blinded interim analysis
  • Protocol amendment due to expansion in the number of countries
For multi-national clinical trials, there may be situations in which a particular country's regulatory authority requires a slight deviation from an IND study protocol. This may be implemented through country-specific protocol amendments. However, for country-specific protocol amendments in international studies, if the data will support a marketing application, FDA will want to know what was done differently in those countries, so the amendments would need to be submitted to FDA. If the study is under an IND at the non-U.S. sites, these amendments would need to be submitted as specified under 21 CFR 312.30. If the international sites are not officially under the IND, this information would need to accompany the data in the marketing application at the very least.

It is not a good idea to have a country-specific protocol amendment with a significant deviation from the IND study protocol. For example, I used to work on a randomized, placebo-controlled study where the regulatory authority in one of the targeted countries did not approve the inclusion of the placebo arm. I was asked if a country-specific protocol could be used so that the placebo arm could be dropped for that specific country. In this case, the deviation from the IND study protocol was too big. A country-specific protocol is not a good solution, and that country may need to be excluded from study participation.

A small tip for CRF/eCRF revisions due to a protocol amendment:
When inclusion/exclusion criteria are revised in a protocol amendment, to avoid the potential impact on CRF data collection and downstream activities, it is better to:
  • Keep the inclusion/exclusion criterion numbering intact (i.e., skip the removed number) if one or more criteria are removed; for example, if inclusion criterion #3 is removed, the amended protocol will have inclusion criteria 1, 2, 4, 5 (i.e., #3 is skipped).
  • Add new inclusion/exclusion criteria after the last existing inclusion or exclusion criterion if additional criteria need to be added.

Tuesday, July 05, 2016

Some Blinding Techniques in Clinical Trials

In randomized controlled clinical trials, blinding is one of the key components. The purpose of blinding to the treatment assignment is to avoid conscious or unconscious biases in assessing the efficacy and safety endpoints and therefore to maintain the integrity of the study. How important is blinding? Look at investigator-initiated studies and early-phase trials – many of them had positive results that were later demonstrated to be untrue.

In terms of blinding technique, researchers should look for three qualities: it must successfully conceal the group allocation; it must not impair the ability to accurately assess outcomes; and it must be acceptable to the individuals who will be assessing outcomes. In some clinical trials, not all three qualities can be met.

Based on how the blinding is maintained, clinical trials can be categorized as open-label, single-blind, or double-blind studies.

The open-label study (may be called an 'open study' in EU countries) is a study in which both the investigator and the subject know the treatment the subject is receiving. An open-label study can be a study without any control group, or a randomized, controlled, open-label study.

The single-blind study is a study in which the investigator knows the treatment assignment but the subject does not know which treatment he/she is receiving.

The double-blind study is a study in which neither the investigator nor the subject knows which treatment the subject is receiving.

  • The blinding is defined based on whether the investigator and subject know the treatment assignment; however, in industry, additional parties involved in managing and conducting the clinical trial may also be blinded to the treatment assignment. For example, in double-blind studies, the study team on the sponsor side, the CRO, and vendors are usually also blinded.

  • There is an extended term, ‘triple-blind study’, defined as a double-blind study in which, in addition, the identities of those enrolled in the study and control groups and/or the details about the nature of the interventions (experimental medications) are withheld from the statistician(s) who conduct the analysis of the data. Since the study statisticians (with the exception of the DMC statisticians) are part of the study team and usually remain blinded to the treatment assignments during the study, the ‘double-blind study’ is usually operated as a triple-blind study in practice. This is why we rarely see the term ‘triple-blind study’ actually used in clinical trials.

  • In practice, for a single-blind study, it is usually better to be conservative and treat it as if it were a double-blind study for the study team.
Blinding is usually easy to operate if the investigational products are pills: the pills for the investigational product and the control product can be manufactured to be identical in size, color, smell, etc. There are clinical trials where the comparison involves different routes of drug administration, different types of surgical procedures, or different devices. In these situations, blinding of the treatment assignments seems to be impossible. However, this may not be entirely true. There are still some techniques or approaches that can be employed to achieve certain levels of blinding to minimize the biases in assessing the efficacy and safety endpoints.

Using sham treatment: A sham treatment is an inactive treatment or procedure (usually a medical procedure) that is intended to mimic as closely as possible a therapy in a clinical trial. The sham treatment is given to the subjects in the control group to mimic the investigational treatment group. With the use of a sham treatment, a seemingly impossible-to-blind study can now be blinded. Here are some examples where sham treatment has been used.

Separating the treating physician from the evaluating or examining physician: in some clinical trials, a separate physician other than the treating physician or investigator is employed to assess efficacy or safety. The treating physician and the examining physician are separate and do not communicate about the assessment results. The treating physician may be unblinded to the treatment assignment, but the examining physician is blinded; therefore, the blinding is maintained. This type of arrangement is useful and necessary in neurological trials, especially in multiple sclerosis trials where many subjective scales are used. In EMA (2006) GUIDELINE ON CLINICAL INVESTIGATION OF MEDICINAL PRODUCTS FOR THE TREATMENT OF MULTIPLE SCLEROSIS, the following is described:
“As several subjective decisions and assessments will have to be performed, with a considerable risk of bias, all possible efforts should be done to keep the design double blind. In cases where double blind is not possible (some active comparator trials, some easily unblinded treatments,...) a blind observer design with a blinded examining physician different than the treating physician may be used. All measures to ensure reliable single blind evaluation should be guaranteed (i.e. patches that cover injection sites to hide reddening or swellings, education of examining physicians,…).”
Similarly, FDA's guidance for industry "Rare Diseases: Common Issues in Drug Development" states:

...As another example, effective blinding of treatments can reduce concern about bias in the subjective aspects of an assessment, as can conduct of endpoint evaluation by people not involved in other aspects of the trial (e.g., radiologists, exercise testers).

Some of the examples are: 

Central reader with blinded clinical data: using a central reader for imaging endpoints, where the central reader is blinded to the clinical data to avoid biases. Additional blinding can also be employed by blinding the sequence of the baseline image and the subsequent images, so that the changes (for example, in tumor size) assessed by the central reader can be more reliable.

"In unblinded clinical trials, clinical information may bias a site-based image interpretation because the expected relation of clinical features to outcome is known and, therefore, local reading will raise concern about potential unblinding. A centralized image interpretation process, fully blinded, may greatly enhance the credibility of image assessments and better ensure consistency of image assessments. Some imaging modalities also may prove vulnerable to site-specific image quality problems, and a centralized imaging interpretation process may help minimize these problems. For example, the National Lung Screening Trial’s experience with computed tomography of the chest suggested that centralized image quality monitoring was important to the reduction of imaging defects (Gierada, Garg, et al. 2009). Hence, a centralized image interpretation process may be used to help control image quality as well as to provide the actual imaging-based endpoint measurements."
"In a time-sequential presentation, a subject’s complete image set (from baseline through the follow-up evaluations) is shown in the order in which the images were obtained. In this process (unless prespecified and justified in the charter), the reader does not initially know the total number of time points in each subject’s image set.
 In a hybrid, randomized image presentation, a subject’s complete image set (or only the postbaseline images) are shown fully randomized. After the read results have been locked for each time point, the images are shown again in known chronological order for re-read. Changes in any of the randomized assessments are tracked and highlighted in the final assessment. In within-subject-control trials (e.g., comparative imaging), images obtained before and after the investigational drug should be presented in fully randomized unpaired fashion and in randomized paired fashion in two separate image evaluations. The minimum number of images in each randomized block necessary to minimize recall should be considered."
Firewall to prevent the sponsor from performing aggregate analyses: Building a firewall between the sponsor and the Data Monitoring Committee is a technique necessary to ensure that study integrity is maintained. A firewall between the sponsor and the investigator/clinical research organization can also be implemented in an open-label or single-blind study so that the sponsor is prevented from accessing the cumulative data for the primary efficacy endpoint. The primary efficacy endpoint information is accessible to the investigator and the CROs, but withheld from the sponsor. While the investigator and CRO may have some biases due to knowing the treatment assignment, the biases from the sponsor side can be prevented.

Additional reading:

1.      Kenneth F Schulz, David A Grimes (2002) Blinding in randomised trials: hiding who got what
2.      Karanicolas et al (2009) Blinding: Who, what, when, why, how?

Monday, June 13, 2016

Basket (Bucket) Trial, Umbrella Trial, and Master Protocol


In response to precision medicine (or, to use the older term, 'personalized medicine'), clinical trial designs are also evolving. Over the last several years, two clinical trial designs have been proposed and implemented in many oncology trials: basket (or bucket) trials and umbrella trials. Both generally fall under the enrichment trial design, with the purpose of avoiding over-treatment and saving valuable resources by matching the right drug to the right subgroup of patients through genetic biomarkers. While basket and umbrella trial designs are used almost exclusively in oncology trials, we can assume that such designs can be generalized to therapeutic areas beyond oncology.

In general, here is a comparison of the basket and umbrella trial designs, with some example trials:

Targeted therapy: biomarker analysis to identify patients likely to respond.

| Basket (or Bucket) Trial | Umbrella Trial |
| --- | --- |
| One molecular abnormality targeted across multiple tumor types | One tumor type, multiple molecular targets |
| Cancer is defined by its genetic aberration or biomarker signature (cancer with a positive xxx biomarker) | Cancer is defined by body location or histology (lung cancer, liver cancer, ...) |
| Many diseases (or many cancer types), a single subgroup, one drug | One disease (or one cancer type), multiple subgroups (identified by the biomarkers), many drugs |
| Tests the effect of one or more drugs on one or more single mutations in a variety of cancer types | Tests the impact of different drugs on different mutations in a single type of cancer |
| Biomarkers are usually measured locally | Biomarkers are usually measured centrally |
| Allows patients with multiple diseases and one or more targets to be enrolled in cohorts or groups in one trial (the basket). Researchers can separately analyse the responses of patients, since each tumour type can be put in one cohort, and can also assess the impact of the drug on all of the patients as one group. If one group shows a good response, that group is expanded to immediately assess whether others could benefit from the new therapy. If another group does not show evidence of effectiveness, that group may be closed while the other cohorts continue recruitment. | Designed to test the impact of different drugs on different mutations in a single type of cancer, on the basis of a centralised molecular portrait performed after obtaining informed consent: one disease, several molecular subtypes, several therapies. This design allows validation of a strategy based on a mixture of biomarkers and drugs. |
| Cancers of different types are tested to see if they have a particular molecular abnormality. If they do, the patients with that abnormality are eligible to be treated with a new drug that targets that abnormality. The advantage of this approach is that it allows new treatments to be tested across cancer types. On the other hand, many patients often have to be tested to find the handful who have the abnormality targeted by the new treatment. It can be incredibly frustrating for a patient who agrees to be tested, only to be told he/she is not eligible for the study because his/her cancer does not have the appropriate target. | Patients with a given type of cancer are assigned to a specific treatment arm based on the molecular makeup of their cancer; umbrella trials have many different arms under the umbrella of a single trial. Because the treatment assignment/stratification is based on molecular biomarkers, an umbrella trial requires a complicated process of centralized screening tests for multiple biomarkers. Genotyping to identify the molecular biomarkers, if done locally, could yield less reproducible results. |
| Examples: NCI MATCH (Targeted Therapy Directed by Genetic Testing in Treating Patients With Advanced Refractory Solid Tumors or Lymphomas); NCI MPACT (Molecular Profiling-Based Assignment of Cancer Therapy for Patients With Advanced Solid Tumors); John Hainsworth, et al (2016) MyPathway trial | Example: FOCUS4 Trials, a molecularly stratified, multi-site randomised trial programme for patients with colorectal cancer |
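The cohort logic described above for basket trials (expand a cohort that responds well, close one that shows no evidence of effectiveness) can be sketched in a few lines. This is a hypothetical illustration only: the function, the thresholds, and the cohort data below are invented for the example and are not taken from any of the cited trials, which use formal interim-analysis rules.

```python
# Toy "go / no-go" rule for one interim look at a basket trial.
# All thresholds and cohort counts are made up for illustration.

def cohort_decision(responders, enrolled, min_rate=0.2, min_n=10):
    """Close a cohort if, after at least min_n patients, the observed
    response rate falls below min_rate; otherwise keep it open."""
    if enrolled < min_n:
        return "continue"            # too early to judge this cohort
    rate = responders / enrolled
    return "expand" if rate >= min_rate else "close"

# Each tumor-type cohort is evaluated separately under the one basket protocol.
cohorts = {"lung": (5, 12), "colon": (1, 11), "thyroid": (3, 8)}
decisions = {name: cohort_decision(r, n) for name, (r, n) in cohorts.items()}
print(decisions)
# -> {'lung': 'expand', 'colon': 'close', 'thyroid': 'continue'}
```

The point of the sketch is only that each basket gets its own decision, while the protocol, screening, and infrastructure are shared across all of them.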


I don’t know exactly when the terms ‘basket trial’ and ‘umbrella trial’ were first proposed, but the following articles used these terms and provided very good explanations of ‘basket trial’ and ‘umbrella trial’.



In clinicaltrials.gov, we can see that many cancer trials are designed as basket or umbrella trials, and the terms basket and umbrella trial may be explicitly indicated.


Basket and umbrella trial designs require a master protocol. A master protocol refers to one overarching protocol that includes one or more of the following:

  • Multiple diseases
  • Multiple therapies
  • Multiple biomarkers



Launching a clinical trial typically takes a long time to go through the administrative steps (for example, contract negotiation) and regulatory approvals (health authority and IRB/EC approvals). A master protocol creates an experimental plan to test several candidate drugs, or to conduct the trial in multiple diseases, under one protocol, so that a fresh protocol approval is not needed each time and the delay in launching each sub-study is minimized. It’s like plug and play. Here is a slide presentation and a YouTube video by Friends of Cancer Research on the Lung Cancer Master Protocol Trial, and a video by Dr. Fred Hirsch, who talked about the development of a master protocol.




To see what a master protocol looks like, the FOCUS4 clinical trial programme provides an excellent opportunity: all protocols are available, including the master protocol and each of the umbrella protocols. The master protocol describes the overall study design and all procedures common across the umbrella protocols. Each umbrella protocol provides information about procedures for entering patients into its sub-study (for example, FOCUS4-A) after a patient has been registered through the procedures described in the FOCUS4 Master Protocol.









There are statistical considerations when we come to the master protocol, the basket trial, and the umbrella trial. The operation of these trials is more challenging than the usual single-disease, single-drug trial. Statisticians need to embrace these new trial designs and adapt to the new paradigm of clinical trial design in the era of precision medicine. To this end, it is helpful to read Dr. LaVange’s discussion on “Statistical Considerations in Designing Master Protocols”.
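One concrete statistical consideration, easy to show numerically, is multiplicity: when many cohorts or arms are tested under one master protocol and each is tested at the nominal significance level, the chance of at least one false-positive finding grows quickly. The calculation below is a generic textbook illustration (assuming independent tests), not an analysis prescribed by any of the trials above.

```python
# Family-wise error rate when k independent cohorts are each tested at
# level alpha with no multiplicity adjustment: 1 - (1 - alpha)^k.

alpha = 0.05
for k in (1, 5, 10):
    fwer = 1 - (1 - alpha) ** k
    print(f"{k:2d} independent cohorts: P(at least one false positive) = {fwer:.3f}")
# ->  1 independent cohorts: P(at least one false positive) = 0.050
# ->  5 independent cohorts: P(at least one false positive) = 0.226
# -> 10 independent cohorts: P(at least one false positive) = 0.401
```

This is one reason master protocols spell out in advance which comparisons are confirmatory and how (or whether) the type I error is controlled across baskets or arms.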


Further reading: