Saturday, June 22, 2019

Historical Control vs. External Control in Clinical Trials

Last week, I had an opportunity to attend the annual ICSA Applied Statistics Symposium in Raleigh, North Carolina. The symposium had many good sessions on contemporary statistical issues. Representing the DIA NEED group, we presented a session about “historical control in clinical trials”.

What is the historical control?

(v) Historical Control. The results of treatment with the test drug are compared with experience historically derived from the adequately documented natural history of the disease or condition, or from the results of active treatment, in comparable patients or population. Because historical control populations usually cannot be as well assessed with respect to pertinent variables as can concurrent control populations, historical control designs are usually reserved for special circumstances. Examples include studies of diseases with high and predictable mortality (for example, certain malignancies) and studies in which the effect of the drug is self-evident (general anesthetics, drug metabolism).

“The external control can be a group of patients treated at an earlier time (historical control),…”

In a recent FDA guidance (2019), “Rare Diseases: Common Issues in Drug Development”, the terms historical control and external control are used interchangeably.
1. Historical (external) controls
For serious rare diseases with unmet medical need, interest is frequently expressed in using an external, historical, control in which all enrolled patients receive the investigational drug, and there is no randomization to a concurrent comparator group (e.g., placebo/standard of care). The inability to eliminate systematic differences between nonconcurrent treatment groups, however, is a major problem with that design. This situation generally restricts use of historical control designs to assessment of serious disease when (1) there is an unmet medical need; (2) there is a well-documented, highly predictable disease course that can be objectively measured and verified, such as high and temporally predictable mortality; and (3) there is an expected drug effect that is large, self-evident, and temporally closely associated with the intervention. However, even diseases with a highly predictable clinical course and an objectively verifiable outcome measure may have important prognostic covariates that are either unknown or unrecorded in the historical data.
What is the difference between historical control and external control?

Historical control used to be one type of external control: the type with a time component (patients treated at an earlier time). In more recent guidelines, the terms historical control and external control are used interchangeably, so the concept of historical control now has a broader meaning. In terms of clinical trial design and statistical analysis, the same issues apply whether a study uses a historical control or an external control.

Examples of Clinical Trials with Successful Use of Historical or External Control

Randomized controlled trials (RCTs) are still the gold standard, but a study with a historical or external control can be used when concurrent controls are impractical or unethical. Many drugs, biological products, and medical devices have been approved or cleared by regulatory agencies for marketing authorization using evidence generated from clinical trials with historical or external controls.

Here are some examples:

Brineura for Batten Disease
Brineura for Batten disease was approved by FDA based on a non-randomized, single-arm study in 22 subjects and a comparison with 42 subjects from a natural history cohort (a historical control group).

Venetoclax for Relapsed/Refractory Chronic Lymphocytic Leukemia

Venetoclax for R/R CLL was approved by FDA based on a single-arm study in 106 subjects, with the overall response rate compared to a 40% response rate that was considered clinically meaningful.
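A comparison of a single-arm response rate against a fixed threshold is often framed as a one-sample exact binomial test. The sketch below only illustrates that idea and is not the analysis from the Venetoclax review: the responder count is hypothetical, and `binomial_upper_tail` is a helper name made up for this post. It computes the one-sided probability of seeing at least that many responders among 106 subjects if the true response rate were 40%.

```python
from math import comb

def binomial_upper_tail(k: int, n: int, p0: float) -> float:
    """One-sided exact p-value: P(X >= k) when X ~ Binomial(n, p0)."""
    return sum(comb(n, x) * p0**x * (1 - p0)**(n - x) for x in range(k, n + 1))

n_subjects = 106    # sample size quoted in the text
threshold = 0.40    # benchmark response rate quoted in the text
responders = 60     # HYPOTHETICAL responder count, for illustration only

p_value = binomial_upper_tail(responders, n_subjects, threshold)
print(f"observed rate = {responders / n_subjects:.3f}, one-sided p = {p_value:.5f}")
```

A small p-value here would only say that the observed rate is unlikely under a 40% true rate; it does not address the comparability issues that come with any externally derived benchmark.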

Multiple IGIV products were approved based on FDA guidance. The guidance suggested measuring the rate of serious bacterial infections during regularly repeated administration of the investigational IGIV product in adult and pediatric subjects for 12 months (to avoid seasonal biases) and comparing the observed infection rate to a relevant historical standard: a statistical demonstration of a serious infection rate per person-year less than 1.0.
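One way to picture a "statistical demonstration" that a rate is below 1.0 per person-year is an exact one-sided Poisson test. This is a rough sketch under my own assumptions, not the procedure prescribed in the guidance: the infection count and person-years are hypothetical, the one-sided level of 0.01 is an assumption, and `poisson_lower_tail` is a helper written for this post.

```python
from math import exp

def poisson_lower_tail(k: int, mu: float) -> float:
    """P(X <= k) when X ~ Poisson(mu)."""
    term = exp(-mu)          # P(X = 0)
    total = term
    for x in range(1, k + 1):
        term *= mu / x       # P(X = x) obtained from P(X = x - 1)
        total += term
    return total

infections = 2               # HYPOTHETICAL number of serious bacterial infections
person_years = 50.0          # HYPOTHETICAL total follow-up
benchmark_rate = 1.0         # historical standard quoted in the text
alpha = 0.01                 # ASSUMED one-sided significance level

# Probability of seeing this few infections if the true rate were 1.0 per person-year
p_value = poisson_lower_tail(infections, benchmark_rate * person_years)
print(f"observed rate = {infections / person_years:.3f} per person-year, p = {p_value:.3g}")
print("rate shown to be below benchmark" if p_value < alpha else "not demonstrated")
```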

FDA recently approved the XVIVO XPS EVLP device to help increase access to more lungs for transplant. According to the Summary of Safety and Effectiveness, the PMA approval was based on a single-arm study with a matched control to demonstrate that lung transplants with EVLP lungs were not inferior to those in the matched control group (all other lungs transplanted at that transplant center during the same time period). The one-year survival rate was compared with that of the matched control group and also with the large database from UNOS (United Network for Organ Sharing). This is a good example of a study using an ‘external control’.

Monday, June 03, 2019

Six-Minute Walk Test (6MWT), 2-Minute Walk Test (2MWT), 12-Minute Walk Test (12MWT), and Timed Walk (T25FW)

The Six-Minute Walk Test (6MWT) measures the distance walked in a fixed duration (6 minutes). It has been used as a clinical trial endpoint to measure functional capacity in many therapeutic areas, especially pulmonary diseases (such as COPD and pulmonary hypertension), neurological diseases (such as Duchenne muscular dystrophy), and others (such as Mucopolysaccharidosis type VII (MPS VII, Sly syndrome)). The distance measured in the 6MWT is the 'Six-Minute Walk Distance' (6MWD).

Guidelines for Performing Standardized 6MWT

There are several guidelines for performing a standardized 6MWT. The guideline from the American Thoracic Society is the one we usually follow.

The 6MWT is one of the approaches to measuring 'exercise capacity' and is considered a simulated test of function. FDA has a long-standing position that a clinical trial endpoint needs to measure how patients feel, function, or survive. The 6MWT measures patients' function.

In FDA's guidance "Chronic Obstructive Pulmonary Disease: Developing Drugs for Treatment", the 6MWT, along with other exercise capacity measures, was described as follows:
"Exercise capacity. Reduced capacity for exercise is a typical consequence of airflow obstruction in COPD patients, particularly because of dynamic hyperinflation occurring during exercise. Assessment of exercise capacity by treadmill or cycle ergometry combined with lung volume assessment potentially can be a tool to assess efficacy of a drug. Alternate assessments of exercise capacity, such as the Six Minute Walk or Shuttle Walk, also can be used. However, all these assessments have limitations. For instance, the Six Minute Walk test reflects not only physiological capacity for exercise, but also psychological motivation. Some of these assessments are not rigorously precise and may prove difficult in standardizing and garnering consistent results over time. These factors may limit the sensitivity of these measures and, therefore, limit their utility as efficacy endpoints, since true, but small, clinical benefits may be obscured by measurement noise."
History of the Six-Minute Walk Test: 

The "ATS Statement: Guidelines for the Six-Minute Walk Test" contains the following description of the history of the 6MWT:
Assessment of functional capacity has traditionally been done by merely asking patients the following: “How many flights of stairs can you climb or how many blocks can you walk?” However, patients vary in their recollection and may report overestimations or underestimations of their true functional capacity. Objective measurements are usually better than self-reports. In the early 1960s, Balke developed a simple test to evaluate the functional capacity by measuring the distance walked during a defined period of time. A 12-minute field performance test was then developed to evaluate the level of physical fitness of healthy individuals. The walking test was also adapted to assess disability in patients with chronic bronchitis. In an attempt to accommodate patients with respiratory disease for whom walking 12 minutes was too exhausting, a 6-minute walk was found to perform as well as the 12-minute walk. A recent review of functional walking tests concluded that “the 6MWT is easy to administer, better tolerated, and more reflective of activities of daily living than the other walk tests”.
History of the Six-Minute Walk Test in Pulmonary Arterial Hypertension:

6MWD has been accepted by the FDA as the primary efficacy endpoint in drug development for pulmonary arterial hypertension (PAH). According to a presentation by Dr. Barbara LeVarge, "Exercise physiology and noninvasive assessment in PAH", the use of the 6MWT in PAH started with the clinical development program of Epoprostenol.

6MWT versus Timed Walk
I am curious why the 6MWT is a popular measure, but not the timed walk. To assess functional capacity, we can either fix the time and then measure the distance (such as the 6MWT), or fix the distance and then measure the time (such as the Timed 25-Foot Walk [T25FW]). In sports, for running and swimming events, we typically fix the distance and then measure the time.

In terms of the measurement accuracy, timed walk (such as T25FW) seems to be more accurate than 6MWT. For the timed walk, we need to make sure the recording of the time is accurate because the distance is fixed. For 6MWT, we need to make sure the recordings of both time and distance are accurate - while time is fixed, it usually needs to be measured as well.

The timed walk is actually used in clinical trials in the neurology area and is accepted by the FDA as a clinical trial endpoint. For example, the timed walk is used to measure the improvement of walking ability in multiple sclerosis patients.

2MWT, 6MWT, and 12MWT 

While the 6MWT is the most commonly used, the 12-Minute Walk Test (12MWT) was initially used by Balke to measure functional capacity, and the 2-Minute Walk Test (2MWT) has also been used in some clinical trials.

Leung et al (2006) did a study to validate the 2MWT in severe COPD, "Reliability, Validity, and Responsiveness of a 2-Min Walk Test To Assess Exercise Capacity of COPD Patients", and they concluded:
The 2MWT was shown to be a reliable and valid test for the assessment of exercise capacity and responsive following rehabilitation in patients with moderate-to-severe COPD. It is practical, simple, and well-tolerated by patients with severe COPD symptoms.
Grifols is currently conducting a pivotal FORCE study "Study of the Efficacy and Safety of Immune Globulin Intravenous (Human) Flebogamma 5% DIF in Patients with Post-Polio Syndrome" where 2MWD is the primary efficacy endpoint.

Monday, May 13, 2019

Pediatric Extrapolation for Pediatric Indication

In a previous post "Pediatric Study Plan (PSP) and Paediatric Investigation Plan (PIP)", we discussed the requirements for PSP and PIP and the importance of incorporating the pediatric investigation plan into the overall clinical development program.
Doing clinical trials in the pediatric population is always challenging. It is not feasible to have a pediatric investigation plan that is too big to implement. Ethically, it is also not a wise decision to expose too many children in clinical trials (especially the placebo-controlled trials).
Regulatory agencies (such as FDA and EMA) realized the challenges in the clinical development program in children and have issued guidelines that encourage the sponsors to use an approach called 'pediatric extrapolation'.

We have already seen that some sponsors use the pediatric extrapolation to obtain the pediatric indication successfully.


Thursday, May 02, 2019

FDA and EMA Guidance on Adjusting for Covariates in Randomized Clinical Trials

Last week, FDA issued its draft guidance for industry titled 'Adjusting for Covariates in Randomized Clinical Trials for Drugs and Biologics with Continuous Outcomes'. The guidance is short and sweet and gives five recommendations:

  • Sponsors can use ANCOVA to adjust for differences between treatment groups in relevant baseline variables to improve the power of significance tests and the precision of estimates of treatment effect. 
  • Sponsors should not use ANCOVA to adjust for variables that might be affected by treatment. 
  • The sponsor should prospectively specify the covariates and the mathematical form of the model in the protocol or statistical analysis plan. 
  • Interaction of the treatment with covariates is important, but the presence of an interaction does not invalidate ANCOVA as a method of estimating and testing for an overall treatment effect, even if the interaction is not accounted for in the model. The prespecified primary model can include interaction terms if appropriate. 
  • Many clinical trials use a change from baseline as the primary outcome measure. Even when the outcome is measured as a change from baseline, the baseline value can still be used advantageously as a covariate. 
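The last recommendation can be checked numerically. The sketch below is not from the guidance: it simulates a hypothetical two-arm trial (all numbers are made up) and fits ANCOVA by ordinary least squares with a small hand-written solver (`ols` is a helper written for this illustration). It shows that, once the baseline value is in the model, the estimated treatment effect is the same whether the outcome is the raw follow-up value or the change from baseline; only the baseline slope shifts by exactly 1.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(p):
            if k != col:
                factor = a[k][col] / a[col][col]
                a[k] = [v - factor * w for v, w in zip(a[k], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]

# HYPOTHETICAL simulated trial: 200 subjects, alternating assignment
random.seed(2019)
n = 200
treat = [i % 2 for i in range(n)]                     # 1 = active, 0 = control
baseline = [random.gauss(350, 60) for _ in range(n)]
followup = [0.8 * b + 25 * t + random.gauss(0, 30) for b, t in zip(baseline, treat)]
change = [f - b for f, b in zip(followup, baseline)]

# Design matrix: intercept, treatment, baseline (centered for numerical stability)
mean_b = sum(baseline) / n
design = [[1.0, float(t), b - mean_b] for t, b in zip(treat, baseline)]

fit_raw = ols(design, followup)     # outcome = raw follow-up value
fit_change = ols(design, change)    # outcome = change from baseline
print(f"treatment effect, raw outcome:          {fit_raw[1]:.4f}")
print(f"treatment effect, change from baseline: {fit_change[1]:.4f}")
print(f"baseline slope, raw vs change:          {fit_raw[2]:.4f} vs {fit_change[2]:.4f}")
```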
I am not sure why the guidance covers only clinical trials 'with continuous outcomes' and ANCOVA. Covariate adjustment also applies to studies with other types of outcomes, such as time-to-event outcomes analyzed using Cox regression and categorical outcomes analyzed using logistic regression. It is important to follow the same rules when dealing with covariates in these studies.

The second recommendation essentially said that the post-randomization or post-treatment variables should not be used as covariates in analyses - which is consistent with what was laid out in EMA's guidelines (see below). If there is a time-dependent covariate in longitudinal studies, the best way is to come up with a pre-adjustment formula instead of using the time-dependent covariate as a covariate in the model. There is no discussion of whether or not the post-treatment covariates can be used in the imputation model when multiple imputation method is used to impute the missing data. There was a misperception that the post-treatment variables could be used in the imputation model, but not in the analysis model. 

EMA had a similar guidance, "Guideline on adjustment for baseline covariates in clinical trials". It came into effect more than three years earlier than FDA's guidance and is much longer (11 pages versus 3 pages in FDA's guidance), with more details about its recommendations. It also emphasizes pre-specifying the covariates to be used and avoiding post-randomization variables as covariates. Here is the executive summary:
  • Stratification may be used to ensure balance of treatments across covariates; it may also be used for administrative reasons (e.g. block in the case of block randomisation). The factors that are the basis of stratification should normally be included as covariates or stratification variables in the primary outcome model, except where stratification was done purely for an administrative reason.
  • Variables known a priori to be strongly, or at least moderately, associated with the primary outcome and/or variables for which there is a strong clinical rationale for such an association should also be considered as covariates in the primary analysis. The variables selected on this basis should be pre-specified in the protocol. 
  • Baseline imbalance observed post hoc should not be considered an appropriate reason for including a variable as a covariate in the primary analysis. However, conducting exploratory analyses including such variables when large baseline imbalances are observed might be helpful to assess the robustness of the primary analysis.
  • Variables measured after randomisation and so potentially affected by the treatment should not be included as covariates in the primary analysis. 
  • If a baseline value of a continuous primary outcome measure is available, then this should usually be included as a covariate. This applies whether the primary outcome variable is defined as the ‘raw outcome’ or as the ‘change from baseline’.
  • Covariates to be included in the primary analysis must be pre-specified in the protocol. 
  • Only a few covariates should be included in a primary analysis. Although larger data sets may support more covariates than smaller ones, justification for including each of the covariates should be provided. 
  • In the absence of prior knowledge, a simple functional form (usually either linearity or categorising a continuous scale) should be assumed for the relationship between a continuous covariate and the outcome variable. 
  • The validity of model assumptions must be checked when assessing the results. This is particularly important for generalised linear or non-linear models where mis-specification could lead to incorrect estimates of the treatment effect. Even under ordinary linear models, some attention should be paid to the possible influence of extreme outlying values. 
  • Whenever adjusted analyses are presented, results of the treatment effect in subgroups formed by the covariates (appropriately categorised, if relevant) should be presented to enable an assessment of the model assumptions.
  • Sensitivity analyses should be pre-planned and presented to investigate the robustness of the primary analysis. Discrepancies should be discussed and explained. In the presence of important differences that cannot be logically explained – for example, between the results of adjusted and unadjusted analyses – the interpretation of the trial could be seriously affected. 
  • The primary model should not include treatment by covariate interactions. If substantial interactions are expected a priori, the trial should be designed to allow separate estimates of the treatment effects in specific subgroups.
  • Exploratory analyses may be carried out to improve the understanding of covariates not included in the primary analysis, and to help the sponsor with the ongoing development of the drug.
  • In case of missing values in baseline covariates the principles for dealing with missing values as outlined e.g. in the Guideline on missing data in confirmatory clinical trials (EMA/CPMP/EWP/1776/99 Rev. 1) applies.
  • A primary analysis, unambiguously pre-specified in the protocol, correctly carried out and interpreted, should support the conclusions which are drawn from the trial. Since there may be a number of alternative valid analyses, results based on pre-specified analyses will carry most credibility.

Saturday, April 20, 2019

Hodges-Lehmann estimator of location shift: Median of Differences versus Difference in Medians or Median Difference

The Hodges-Lehmann estimator has been used to estimate the treatment effect when the data are not normally distributed. See my previous posts:
Many journal articles have used the Hodges-Lehmann estimator to estimate the difference between two medians.
In a study by Perkins et al "A Randomized Trial of Epinephrine in Out-of-Hospital Cardiac Arrest",
"The Hodges–Lehmann method was used to estimate median differences with 95% confidence intervals for length-of-stay outcomes"
In a study by Devinsky et al "Trial of Cannabidiol for Drug-Resistant Seizures in the Dravet Syndrome"
"Analysis of the primary end point was performed with the use of a Wilcoxon rank-sum test. An estimate of the median difference between cannabidiol and placebo, together with the 95% confidence interval, was calculated with the use of the Hodges–Lehmann approach. Sensitivity analyses of this primary end point were prespecified in the trial protocol and statistical analysis plan"
Similarly, the Hodges-Lehmann estimator was used to estimate the treatment effect in licensure trials:

FDA Clinical/Statistical Review for Vascepa (icosapent ethyl) for reduction of triglycerides in patients with very high triglycerides
The median differences between the treatment groups and 95% CIs were estimated with the Hodges-Lehmann method. P-value is from the Wilcoxon rank-sum test.
FDA Statistical review for RLY5016 for Oral Suspension (Veltassa) for Hyperkalemia
To compare Veltassa with placebo, the difference between the mean ranks was tested using a two-sided t-test. The difference and 95% CI between the treatment groups in median change from baseline was estimated using a Hodges-Lehmann estimator.
FDA Medical Review of Oral Treprostinil for Pulmonary Arterial Hypertension
The magnitude of the treatment effects was defined by the Hodges-Lehmann method to estimate the median difference between treatment groups for the change from baseline in 6MWD.
It sounds like we have found a way to estimate the difference in medians when the data are not normally distributed. However, if we look at how the Hodges-Lehmann estimator is calculated, we will see that it is not accurate to say that it estimates the difference in medians. It is actually an estimator of the location shift (the term originally used by the authors), that is, the median of differences (further explained below).
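In symbols, the distinction can be written as follows. The notation is my own shorthand (observations X_1, ..., X_m in Group A and Y_1, ..., Y_n in Group B), not taken from the papers cited above.

```latex
% Hodges-Lehmann estimator of location shift: median of all m*n pairwise differences
\hat{\Delta}_{HL} = \operatorname{median}\,\{\, X_i - Y_j : 1 \le i \le m,\ 1 \le j \le n \,\}

% Difference in medians: each group is summarized first, then the summaries are subtracted
\hat{\Delta}_{med} = \operatorname{median}\{X_1,\dots,X_m\} - \operatorname{median}\{Y_1,\dots,Y_n\}
```

The two quantities coincide in some data sets but not in general, because taking the median does not commute with taking differences.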

Let's check how medians are calculated using a very simple example: 

Median and the difference in Medians:

Original measures:
  Group A: 4, 7, 5, 3, 6
  Group B: 3, 2, 5, 1, 4
Measures ranked in order:
  Group A: 3, 4, 5, 6, 7 (median 5)
  Group B: 1, 2, 3, 4, 5 (median 3)
Difference in medians (A - B): 5 - 3 = 2

Hodges-Lehmann Estimator of Location Shift (median of differences)

Original measures:
  Group A: 4, 7, 5, 3, 6
  Group B: 3, 2, 5, 1, 4
Measures ranked in order:
  Group A: 3, 4, 5, 6, 7
  Group B: 1, 2, 3, 4, 5
Each number in Group A is compared to each number in Group B (A - B):
  3 compared to Group B: 2, 1, 0, -1, -2
  4 compared to Group B: 3, 2, 1, 0, -1
  5 compared to Group B: 4, 3, 2, 1, 0
  6 compared to Group B: 5, 4, 3, 2, 1
  7 compared to Group B: 6, 5, 4, 3, 2
Differences from these pairwise comparisons, ranked in order:
  -2, -1, -1, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6
Hodges-Lehmann estimator of location shift:
  Median of all these differences (the 13th of the 25 ordered values); in this case, the Hodges-Lehmann estimator is 2

The calculations of the medians can be implemented in the following SAS code:

data HodgesLehmann;
  input group $ number @@;   /* @@ reads several observations per line */
  datalines;
A 3 A 4 A 5 A 6 A 7
B 1 B 2 B 3 B 4 B 5
;
run;

/* Median of each group */
proc means data=hodgeslehmann median maxdec=0;
   class group;
   var number;
run;

/* Hodges-Lehmann estimate of location shift */
proc npar1way data=hodgeslehmann hl;
   class group;
   var number;
run;

The Hodges-Lehmann estimate of the location shift is confirmed to be 2. In this example, the Hodges-Lehmann estimate of the location shift (2) is exactly the same as the difference between the two medians (5 - 3 = 2).

However, in many situations, the Hodges-Lehmann estimate of the location shift will differ from the difference between the two medians. The Hodges-Lehmann estimator should really be called the median of differences between the two groups or the location shift (as the original authors called it).

The example below shows that the Hodges-Lehmann estimate of the location shift can be very different from the difference between the two medians.

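Original measures:
  Group A: 50.6, 39.2, 35.2, 17.0, 11.2, 14.2, 24.2, 37.4, 35.2
  Group B: 38.0, 18.6, 23.2, 19.0, 6.6, 16.4, 14.4, 37.6, 24.4
Measures ranked in order:
  Group A: 11.2, 14.2, 17.0, 24.2, 35.2, 35.2, 37.4, 39.2, 50.6
  Group B: 6.6, 14.4, 16.4, 18.6, 19.0, 23.2, 24.4, 37.6, 38.0
Difference in medians (A - B): 35.2 - 19.0 = 16.2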

data HodgesLehmann2;
   input Group $ number @@;
   datalines;
A 50.6
A 39.2
A 35.2
A 17.0
A 11.2
A 14.2
A 24.2
A 37.4
A 35.2
B 38.0
B 18.6
B 23.2
B 19.0
B 6.6
B 16.4
B 14.4
B 37.6
B 24.4
;
run;

/* Median of each group */
proc means data=hodgeslehmann2 median maxdec=1;
  class group;
  var number;
run;

/* Hodges-Lehmann estimate of location shift */
proc npar1way data=hodgeslehmann2 hl;
  class group;
  var number;
run;

As illustrated above, the Hodges-Lehmann estimate of the location shift is 7.8. However, the difference between the two medians is 35.2 - 19.0 = 16.2 (the median for Group A is 35.2 and the median for Group B is 19.0).
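For readers without SAS, the same arithmetic can be cross-checked with a few lines of Python. This is only a sketch of the point estimate (no confidence interval); the function names are my own, and the differences are taken as Group A minus Group B, so the sign would flip if the groups were swapped.

```python
from itertools import product
from statistics import median

def hodges_lehmann_shift(a, b):
    """Median of all pairwise differences a_i - b_j."""
    return median(x - y for x, y in product(a, b))

def difference_in_medians(a, b):
    """Median of group A minus median of group B."""
    return median(a) - median(b)

example_1 = ([4, 7, 5, 3, 6], [3, 2, 5, 1, 4])
example_2 = ([50.6, 39.2, 35.2, 17.0, 11.2, 14.2, 24.2, 37.4, 35.2],
             [38.0, 18.6, 23.2, 19.0, 6.6, 16.4, 14.4, 37.6, 24.4])

for name, (a, b) in (("Example 1", example_1), ("Example 2", example_2)):
    hl = hodges_lehmann_shift(a, b)
    dm = difference_in_medians(a, b)
    print(f"{name}: Hodges-Lehmann = {hl:.1f}, difference in medians = {dm:.1f}")
# Example 1: Hodges-Lehmann = 2.0, difference in medians = 2.0
# Example 2: Hodges-Lehmann = 7.8, difference in medians = 16.2
```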

While the Hodges-Lehmann estimator is often used to measure the treatment difference when the data are not normally distributed, we need to understand how it is calculated and how it can be very different from the simple difference between two medians.

Monday, April 08, 2019

The Use of Restricted Mean Survival Time (RMST) Method When Proportional Hazards Assumption is in Doubt

In a recent article from AJRCCM (American Journal of Respiratory and Critical Care Medicine), Harhay et al discussed "An Alternative Approach for the Analysis of Time-to-Event and Survival Outcomes in Pulmonary Medicine". The alternative approach discussed in the paper is called 'restricted mean survival time', or RMST for short.

In analyzing time-to-event data, the most common approach is to draw Kaplan-Meier plots and then use a non-parametric method (log-rank test or Wilcoxon test) to compare two survival curves, or use a semi-parametric method (the proportional hazards model) to perform regression-type analyses to estimate the magnitude of the treatment difference. A key assumption, as the name of the method suggests, is that the hazards are proportional. What it essentially means is that the ratio of the hazards for any two individuals or for any two groups is constant over time. However, in a lot of situations, the proportional hazards assumption may not hold; we call this non-proportional hazards. If we look at the Kaplan-Meier plots and see the two curves cross, non-proportional hazards are likely.

In a presentation by FDA statisticians (John Lawrence, Junshan Qiu, Steven Bai, and Jim Hung), "Comparison of Hazard Ratio and Restricted Mean Survival Analysis for Cardiorenal Drug Trials", several examples of survival data with non-proportional hazards were presented. With non-proportional hazards, the common approach, such as the Cox proportional hazards model, will give a biased estimate.

There are various methods to test the proportional hazard assumption. Please see the link below for details "testing the proportional hazard assumption in Cox models"

When the proportional hazards assumption is violated, an alternative approach should be explored. One handy approach is the Restricted Mean Survival Time (RMST) method.

The RMST is the area under the survival curve from time 0 to a specific follow-up time point. If X is the time until the event, the expectation of X (the mean survival time) is the area under the survival function from 0 to infinity; the RMST is called 'restricted' because the area is taken only up to the chosen time point. The RMST can be interpreted as the average event-free time during a defined period ranging from time 0 to a specific follow-up time point.
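To make the "area under the curve" idea concrete, here is a minimal sketch that computes the RMST as the area under a Kaplan-Meier step function up to a truncation time `tau`. It is an illustration only, not the code behind any of the packages or macros mentioned in this post: the function name `km_rmst` and the two small data sets are hypothetical, there is no variance or confidence interval, and `tau` is assumed not to exceed the longest follow-up time.

```python
def km_rmst(times, events, tau):
    """Area under the Kaplan-Meier curve from 0 to tau.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0        # Kaplan-Meier estimate just after the last processed time
    last_time = 0.0
    area = 0.0
    for t in sorted(set(times)):
        if t > tau:
            break
        area += survival * (t - last_time)          # rectangle up to this time
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censorings at t
        if deaths:
            survival *= 1 - deaths / at_risk
        at_risk -= leaving
        last_time = t
    area += survival * (tau - last_time)             # final rectangle up to tau
    return area

# HYPOTHETICAL data (months): follow-up times and event indicators per arm
arm_a = ([3, 5, 6, 8, 10, 12, 12, 12], [1, 1, 0, 1, 0, 0, 0, 0])
arm_b = ([1, 2, 2, 4, 5, 7, 9, 12], [1, 1, 1, 1, 0, 1, 1, 0])
tau = 12

rmst_a = km_rmst(*arm_a, tau)
rmst_b = km_rmst(*arm_b, tau)
print(f"RMST A = {rmst_a:.2f}, RMST B = {rmst_b:.2f}, difference = {rmst_a - rmst_b:.2f}")
```

With no censoring, this reduces to the average of min(time, tau) across subjects, which is a convenient sanity check.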

In the FDA's presentation above, there were final remarks about the RMST method: 
  • RMST which is directly related to patient’s survival/event-free time, is viable for quantifying treatment effect.
  • RMST can give better clinical interpretation of treatment effect.
  • The results came from an R function. Yesterday, I found that someone at Lilly actually developed a SAS macro for RMST.
In the article by Harhay et al, there were also final comments:
"As shown in these examples, the RMST offers several inferential advantages over other time-to-event statistics. Though we examined survival, any time-to-event endpoint can be assessed using the RMST approach. Statistical inference (i.e., estimation and hypothesis testing) using the RMST, including p-values, confidence intervals, and covariate-adjustment, can be performed in most popular statistical software packages, such as R and STATA. Study group comparisons using the RMST estimate also confer comparable statistical power to the log-rank test and test for the HR in many situations, thereby providing an alternative and clinically meaningful measure of time gained or lost to inform research and patient care."
Programs have been developed to calculate the RMST.

There is an R package developed by Uno H:

In SAS, there was a SAS macro available:

Monday, March 18, 2019

Adjudication Committee (AC), Endpoint Adjudication Committee (EAC), Clinical Endpoint Committee (CEC)

The clinical trials are getting bigger and more complicated these days. For typical multi-national, multi-center pivotal clinical trials, there will be a lot of committees formed: steering committee (SC), data monitoring committee (DMC), central reader, and adjudication committee, each with specific responsibilities.

An adjudication committee (AC) may also be called an endpoint adjudication committee (EAC) or clinical endpoint committee (CEC) and is usually needed when the study endpoints are subjective measures.

What is the Adjudication Committee?

An adjudication committee is an independent group of experts that reviews clinical trial data in order to give expert opinions about clinical safety or efficacy events of interest.

According to the FDA’s Guidance for Industry “Establishment and Operation of Clinical Trial Data Monitoring Committees”, the adjudication committee is described as follows:
3.3. Endpoint Assessment/Adjudication Committees
Sponsors may also choose to establish an endpoint assessment/adjudication committee (these may also be known as clinical events committees) in certain trials to review important endpoints reported by trial investigators to determine whether the endpoints meet protocol-specified criteria. Information reviewed on each presumptive endpoint may include laboratory, pathology and/or imaging data, autopsy reports, physical descriptions, and any other data deemed relevant. These committees are typically masked to the assigned study arm when performing their assessments regardless of whether the trial itself is conducted in a blinded manner. Such committees are particularly valuable when endpoints are subjective and/or require the application of a complex definition, and when the intervention is not delivered in a blinded fashion. Although such committees do not share responsibility with DMCs for evaluating interim comparisons, their assessments (if performed at frequent intervals throughout the trial with results incorporated into the database in a timely manner) help to ensure that the data reviewed by DMCs are as accurate and free of bias as possible.
Which Clinical Trials Need a Clinical Endpoint Adjudication Committee?

Increasingly, regulatory authorities are placing significant focus on clinical trial processes that ensure consistent, standardized, objective and unbiased reporting of safety and efficacy results, given that the definitions for many endpoint events include subjective components and subjective assessments may differ from investigator to investigator. Moreover, an increasing number of trials are now conducted in multiple geographies, and clinical practices across these settings can vary substantially. The likelihood of discrepant interpretations of safety and efficacy endpoints by investigators is thus increased.

Throughout a clinical trial, therefore, it is expected by regulatory agencies that certain events that form safety or efficacy endpoints for the study undergo centralized adjudication by a clinical endpoint adjudication committee (CEC).

A CEC consists of a panel of independent experts who have the relevant therapeutic area expertise, are experienced in clinical trials and have been trained on the specific study protocol. The CEC centrally reviews subject/event data and classifies efficacy and/or safety endpoints in a blinded and unbiased manner. The centralized adjudication process should be designed to both preserve the independence of the CEC and prevent any undue bias that could impact its decision-making processes.

A CEC can be used in any therapeutic area where there is a need for an independent, accurate, consistent and standardized assessment of important study events. CECs are most commonly used in cardiovascular outcome / safety studies; however, they are also frequently used in peripheral vascular disease, neurovascular, respiratory and oncology studies.

In some disease areas, the adjudication of the clinical endpoint is expected or even required by the regulatory agencies.

A study endpoint adjudicated by a central committee is more reliable and is viewed as more credible.

Examples of Clinical Endpoints that Require Adjudication

In cardiovascular outcomes studies, a Major Adverse Cardiac Events (MACE) composite endpoint is often used as the primary endpoint for evaluating efficacy and/or safety. MACE comprises non-fatal myocardial infarction, non-fatal stroke and cardiovascular death. Once events are confirmed through centralized adjudication to meet protocol endpoint criteria, the endpoint data are analyzed for the number of occurrences of the composite endpoint in the respective treatment groups.
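As a rough illustration of that last step, the sketch below tallies the composite endpoint by treatment group from adjudicated event records. The record layout (`subject_id`, `arm`, `event_type`, `confirmed`) and the function name are hypothetical, and counting each subject at most once is an assumption made here; an actual protocol would define its own counting rules.

```python
from collections import defaultdict

# The three MACE components named above (labels are illustrative).
MACE_COMPONENTS = {"nonfatal_mi", "nonfatal_stroke", "cv_death"}


def count_mace_subjects(events):
    """Count subjects per arm with at least one CEC-confirmed MACE component.

    Assumption: a subject with several confirmed components is counted once.
    """
    subjects_by_arm = defaultdict(set)
    for ev in events:
        # Only events confirmed by the adjudication committee contribute.
        if ev["confirmed"] and ev["event_type"] in MACE_COMPONENTS:
            subjects_by_arm[ev["arm"]].add(ev["subject_id"])
    return {arm: len(subjects) for arm, subjects in subjects_by_arm.items()}


# Made-up records for illustration only.
events = [
    {"subject_id": "001", "arm": "A", "event_type": "nonfatal_mi", "confirmed": True},
    {"subject_id": "001", "arm": "A", "event_type": "cv_death", "confirmed": True},
    {"subject_id": "002", "arm": "A", "event_type": "nonfatal_stroke", "confirmed": False},
    {"subject_id": "003", "arm": "B", "event_type": "nonfatal_stroke", "confirmed": True},
    {"subject_id": "004", "arm": "B", "event_type": "hospitalization", "confirmed": True},
]

mace_counts = count_mace_subjects(events)
print(mace_counts)
```

Investigator-reported events that the committee did not confirm (subject 002) and events outside the composite definition (subject 004) do not contribute to the count.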

In the cardio-pulmonary field, clinical trials in pulmonary arterial hypertension with a composite morbidity/mortality endpoint will require adjudication. The European Medicines Agency (EMA) has a specific guideline, “GUIDELINE ON THE CLINICAL INVESTIGATIONS OF MEDICINAL PRODUCTS FOR THE TREATMENT OF PULMONARY ARTERIAL HYPERTENSION”, under which a composite endpoint of clinical worsening can be used as a valid primary efficacy endpoint. The guideline states:
The investigation of a composite primary endpoint that reflects, in addition to mortality, time to clinical worsening is encouraged. The composition of this composite endpoint may vary depending on the severity and the aetiology of the disease. The following components are suggested:
1. All-cause death.
2. Time to non-planned PAH-related hospitalization.
3. Time to PAH-related deterioration identified by at least one of the following parameters:
i. increase in WHO FC;
ii. deterioration in exercise testing
iii. signs or symptoms of right-sided heart failure
Any chosen parameter should be clinically relevant, adequately defined, well validated and centrally adjudicated by a blinded adjudication committee.

In oncology studies, the RECIST criteria are used to evaluate whether a solid tumor has responded to treatment. If the response rate is the primary efficacy endpoint, it is usually expected that a central reader facility will review the images centrally to assess the treatment response. The central reader process is similar to the adjudication process. Here are some discussions about adjudication in oncology trials.

Who are the Adjudicators in the Adjudication Committee?

An adjudication committee is an independent group of experts that reviews clinical trial data in order to give expert opinions about clinical safety or efficacy events of interest.

The adjudicators are usually experts from academic settings. To remain independent, an adjudicator cannot be an investigator for the study and cannot serve on other committees for the study (such as the steering committee or the data monitoring committee).

Will the Adjudication Committee be Blinded to the Treatment Assignment?

For double-blinded studies, the adjudication committee members are always blinded to the treatment assignment so that the adjudication is unbiased.

For open-label studies, the adjudication process can still be kept blinded.

What are the Regulatory Requirements for Adjudication Process?

For a study with an endpoint adjudication committee, the primary efficacy endpoint analysis will be based on the adjudicated results. It is therefore critical to ensure that the adjudication process is adequate, valid, and unbiased.

FDA emphasizes the importance of an adequate adjudication process in its document “The Use of Clinical Source Data in the Review of Marketing Applications”. In many cases, the adjudication process may be audited by the FDA.
Although many types of clinical source data require minimal or no interpretation after collection (e.g., blood pressure, cholesterol, or other discrete laboratory values), other clinical source data types require detailed interpretation by expert clinicians to assign endpoint values (i.e., endpoint adjudication (e.g., examination of radiographic images to measure tumor size, or examination of hospital records or accumulated data to determine whether a myocardial infarction has occurred)). How the applicant evaluates these source data can critically affect the reported results of the trial. In most cases, it would be expected that such interpretations are made blindly, whether conducted by investigators or special assessment groups (e.g., endpoint assessment committees (EACs)). It is equally critical that there be well-described, prospectively defined, evaluation criteria. In some cases, inspection of the clinical source data by clinical review staff may be necessary to establish the reliability of the data in the CRFs and CRTs for FDA review.
However, the processes used to inspect these source data can themselves pose challenges.
Evaluation of clinical source data is often subjective and, depending on the procedures used, susceptible to bias that could affect both the values of clinical endpoints and the results of efficacy and safety analyses. An FDA audit that reveals deficiencies in endpoint adjudication may trigger the need for additional evaluation of the clinical source data, but the audit itself could be biased. Therefore, just as an applicant’s methods of adjudicating endpoints should be well-defined a priori and free of bias, FDA inspection of such data also should use well-specified audit procedures, generally blinded as to treatment assignment, agreed to before the audit to minimize bias.
On-site inspections related to endpoint adjudication may be warranted under certain circumstances. For example, review of some NDAs may raise questions as to whether proper procedures were followed on endpoint adjudication. Other examples include when re-adjudication requires special equipment only available at the clinical site to access the clinical source data, or when on-site visits are necessary to retrieve clinical source data for re-adjudication. 
If the audit determines that the data in the CRFs and CRTs are not reliable enough for review because of deficiencies in the applicant’s endpoint adjudication process, or in the quality of the actual source data itself, clinical review staff may conclude that a re-adjudication of the endpoints is necessary. Clinical review staff should establish acceptable re-adjudication procedures with the applicant, and the applicant is, in most instances, expected to conduct the re-adjudication and the appropriate reanalysis. Instances in which clinical review staff conduct the re-adjudication itself, excluding the applicant, should be rare and well-justified

Adjudication Committee Charter

A critical document for the adjudication process is the adjudication committee charter. The charter will define the composition of the committee, responsibilities of the adjudication committee, the adjudication process/flowchart, the size of the adjudication committee, …

What is the Typical Adjudication Process?

A good whitepaper by Quintiles (now IQVIA) describes best practices for adjudication committees.

In the studies with an adjudication process in which I was involved, a consensus adjudication process was employed. With a consensus adjudication process, the adjudicators must reach a consensus regarding the endpoint. Suppose we have three adjudicators. Each case is first assigned to two adjudicators, who assess it independently. If the two adjudicators give the same assessment, the event is considered adjudicated. If they disagree, the event is sent to the third adjudicator. The third adjudicator must agree with one of the initial adjudicators to close the case.

In some cases, the third adjudicator does not agree with either of the first two adjudicators, and a meeting may be needed to discuss the case and reach a consensus.
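The decision flow described above can be sketched in a few lines of Python. This is only an illustration of the three-adjudicator logic; the function name, status labels, and classification strings are hypothetical and are not taken from any adjudication system or from the whitepaper.

```python
def adjudicate(first, second, third=None):
    """Return (status, classification) for one event under consensus adjudication.

    first, second: assessments from the two initial independent adjudicators.
    third: assessment from the third adjudicator, or None if not yet obtained.
    """
    # The two initial adjudicators agree: the event is adjudicated.
    if first == second:
        return ("adjudicated", first)
    # They disagree and the third adjudicator has not reviewed the case yet.
    if third is None:
        return ("needs_third_review", None)
    # The third adjudicator sides with one of the initial adjudicators.
    if third in (first, second):
        return ("adjudicated", third)
    # No two adjudicators agree: a committee meeting is needed.
    return ("needs_committee_meeting", None)


# Hypothetical classifications for illustration only.
agree = adjudicate("MI", "MI")
pending = adjudicate("MI", "not an endpoint")
resolved = adjudicate("MI", "not an endpoint", third="MI")
meeting = adjudicate("MI", "not an endpoint", third="insufficient data")
print(agree, pending, resolved, meeting)
```

In this sketch the third adjudicator's assessment is only consulted when the first two disagree, which mirrors the sequence described in the text.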

Will a Separate Electronic System be Needed for Adjudication Process?

Yes, usually a separate electronic system will be needed for the adjudication process. The adjudication system will be independent of the electronic data capture (EDC) system that is for collecting the clinical data.

Further Readings: