Saturday, December 23, 2017

Composite Endpoint and Competing Risk Model

A competing risk is an event whose occurrence precludes the occurrence of the primary event of interest. For example, when the primary outcome is death due to cardiovascular causes, death due to non-cardiovascular causes serves as a competing risk, because subjects who die of non-cardiovascular causes (e.g., cancer) are no longer at risk of death due to a cardiovascular cause. However, when the primary outcome is all-cause mortality, competing risks are absent, as there are no events whose occurrence precludes death due to any cause. In event-driven clinical trials, if a study subject drops out of the study before the event of interest occurs, the dropout precludes the occurrence of the event of interest; this is also a competing risk.

The competing risk issue arises in clinical trials with a composite endpoint, that is, an endpoint with a composite outcome. A composite outcome consists of two or more component outcomes. Patients who have experienced any one of the events specified by the components are considered to have experienced the composite outcome. The main advantages supporting the use of a composite outcome are that it increases statistical efficiency because of higher event rates, which reduces sample size requirements, costs, and time; it helps investigators avoid an arbitrary choice between several important outcomes that refer to the same disease process; and it is a means of assessing the effectiveness of a patient-reported outcome that addresses more than one aspect of the patient’s health status.
Composite endpoints are common in clinical trials, especially in trials where the primary interest is to reduce adverse outcomes but these adverse outcomes do not occur frequently enough. If each individual component were used as the endpoint of its own study, the required sample size would be too large.

MACE (major adverse cardiac events) is a composite endpoint frequently used in clinical trials assessing the treatment effect in cardiac health. MACE is defined as any event of all-cause mortality, myocardial infarction (MI), or stroke. If a patient dies during the study, a subsequent MI or stroke will not be observed. If an MI or stroke occurs and the subject is discontinued from the study after that event, the death event will not be observed; one component is a competing risk for another component.
In clinical trials in pulmonary arterial hypertension, a composite endpoint is used to evaluate the treatment effect in reducing mortality and morbidity events. The EMA guidance “GUIDELINE ON THE CLINICAL INVESTIGATIONS OF MEDICINAL PRODUCTS FOR THE TREATMENT OF PULMONARY ARTERIAL HYPERTENSION” suggested time to clinical worsening as the primary efficacy endpoint, where clinical worsening is defined as a composite endpoint consisting of:
1. All-cause death.
2. Time to non-planned PAH-related hospitalization.
3. Time to PAH-related deterioration identified by at least one of the following parameters:
  • increase in WHO FC;
  • deterioration in exercise testing
  • signs or symptoms of right-sided heart failure

Arterial Hypertension”, the primary end point in a time-to-event analysis was a composite of death or a complication related to pulmonary arterial hypertension, whichever occurred first, up to the end of the treatment period. The composite endpoint includes the following events:
  • death (all-cause mortality)
  • hospitalization for worsening of PAH based on criteria defined in the study protocol
  • worsening of PAH resulting in need for lung transplantation or balloon atrial septostomy; initiation of parenteral (subcutaneous or intravenous) prostanoid therapy or chronic oxygen therapy due to worsening of PAH
  • disease progression (patients in modified NYHA/WHO functional class II or III at Baseline) confirmed by a decrease in 6MWD from Baseline (≥ 15%, confirmed by 2 tests on different days within 2 weeks) and worsening of NYHA/WHO functional class
  • disease progression (patients in modified NYHA/WHO functional class III or IV at Baseline) confirmed by a decrease in 6MWD from Baseline (≥ 15%, confirmed by 2 tests on different days within 2 weeks) and need for additional PAH-specific therapy.

There is a competing risk issue here. For example, lung transplantation and death compete with each other: if a patient has a lung transplantation, the disease course will change, and the chance of death and of other events will be altered.

A common approach to avoid the competing risk issue is to analyze the time to first event (any one of the components defined in the composite endpoint) as the primary efficacy endpoint, even though this approach is often criticized because the importance and severity of the components are not equal (death should be given far more weight than non-fatal events). FDA seems to be totally comfortable with the time to first event approach in both the composite endpoint situation (as evidenced by the approval of selexipag) and the recurrent event situation (as evidenced by the FDA advisory committee meeting discussion). In a panel discussion at the regulatory-industry workshop in 2017 on the topic of Better Characterization of Disease Burden by Using Recurrent Event Endpoints, Drs Bob Temple and Norman Stockbridge both commented that FDA is fine with the time to first event analysis as long as further analyses are performed to evaluate the treatment effect on each individual component.

A competing risk model may be used in the statistical analysis of clinical trial data either as the primary method or as a sensitivity analysis. In Schaapveld et al (2015), “Second Cancer Risk Up to 40 Years after Treatment for Hodgkin’s Lymphoma”, a competing risk model was used for analyzing the cumulative incidence of second cancers:
The cumulative incidence of second cancers was estimated with death treated as a competing risk, and trends over time were evaluated in competing-risk models, with adjustment for the effects of sex, age, and smoking status when appropriate.

A competing risk model is more likely to be used as a sensitivity analysis. For example, in the SPRINT study “A Randomized Trial of Intensive versus Standard Blood-Pressure Control”, the Fine–Gray model for the competing risk of death was used as a sensitivity analysis.

The use of competing risk models in clinical trials has been discussed quite extensively.

When there is a competing risk issue, Gray’s method or the Fine and Gray method can be used. These methods are based on the papers below:
  • Gray, R. J. (1988), “A Class of K-Sample Tests for Comparing the Cumulative Incidence of a Competing Risk,” Annals of Statistics, 16, 1141–1154.
  • Fine, J. P. and Gray, R. J. (1999), “A Proportional Hazards Model for the Subdistribution of a Competing Risk,” Journal of the American Statistical Association, 94, 496–509.

SAS macros are available for Gray’s method. More recently, these methods have been built into SAS itself (for example, the Fine and Gray model in PROC PHREG), so a competing risk analysis can be performed conveniently.
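To make the idea of cumulative incidence in the presence of competing risks more concrete, here is a minimal Python sketch of the nonparametric cumulative incidence estimator (the quantity compared by Gray’s test). The function name and the tiny data set are made up for illustration; this is not the SAS implementation.

```python
from collections import Counter

def cumulative_incidence(times, causes, cause_of_interest):
    """Nonparametric cumulative incidence function (CIF) for one cause.

    times  : follow-up time of each subject
    causes : 0 = censored, 1, 2, ... = type of event observed
    Returns a list of (event time, CIF) pairs.
    """
    events = Counter()    # (time, cause) -> number of events
    leaving = Counter()   # time -> subjects leaving the risk set
    for t, c in zip(times, causes):
        leaving[t] += 1
        if c != 0:
            events[(t, c)] += 1

    n_at_risk = len(times)
    event_free = 1.0      # probability of no event of any type before t
    cif = 0.0
    curve = []
    for t in sorted(leaving):
        d_all = sum(n for (tt, _), n in events.items() if tt == t)
        d_k = events[(t, cause_of_interest)]
        if d_all > 0:
            # Must be event-free just before t, then fail from this cause at t
            cif += event_free * d_k / n_at_risk
            event_free *= 1.0 - d_all / n_at_risk
            curve.append((t, cif))
        n_at_risk -= leaving[t]
    return curve

# Hypothetical data: cause 1 = event of interest, cause 2 = competing event
times = [1, 2, 3, 4, 5]
causes = [1, 2, 0, 1, 0]
print(cumulative_incidence(times, causes, cause_of_interest=1))
print(cumulative_incidence(times, causes, cause_of_interest=2))
```

Treating the competing event as simple censoring and using one minus the Kaplan-Meier estimate would generally overstate the cumulative incidence, which is why the estimator above multiplies by the overall event-free probability.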

Sunday, December 10, 2017

Recurrent Events versus Composite Events: Statistical Analysis Methods for Recurrent Events

Recurrent events are repeated occurrences of the same type of event.
A composite endpoint is a combination of various clinical events that might happen, such as heart attack, death, or stroke, where any one of those events counts as part of the composite endpoint.

While composite endpoint may also be discussed within the scope of the recurrent event endpoint, there are some distinctions between these two terms. The methods for statistical analysis are also different:
Recurrent Event Endpoint versus Composite Endpoint

Examples:
  • Recurrent event endpoint: relapses in multiple sclerosis; exacerbations in pulmonary diseases such as chronic obstructive pulmonary disease; bleeding episodes in hemophilia
  • Composite endpoint: MACE in cardiovascular trials, where MACE (major adverse cardiac event) includes death, MI, and stroke; clinical worsening event in pulmonary arterial hypertension, where clinical worsening includes all-cause death, PAH-related hospitalization, PAH-related deterioration of disease,…

Type of event:
  • Recurrent event endpoint: same type of event
  • Composite endpoint: different types of events

Contribution of each event:
  • Recurrent event endpoint: each event makes the same contribution to the total number of events
  • Composite endpoint: usually criticized because each component may contribute differently to the total count (death is a much more severe event than the others)

Study design:
  • Recurrent event endpoint: usually a fixed-duration design; events are collected over a fixed period of time
  • Composite endpoint: usually an event-driven study; different subjects may be followed for different durations

Event frequency:
  • Recurrent event endpoint: usually for events that occur relatively frequently
  • Composite endpoint: usually for events that happen infrequently or rarely (so the event types are combined to increase the power and minimize the sample size)

Can be analyzed as:
  • Recurrent event endpoint: frequency of events; annualized rate of events; time to first event; duration of events; duration of event-free time
  • Composite endpoint: time to the first event; time to event for each component; frequency of events

Competing risk:
  • Recurrent event endpoint: competing risk is less of an issue
  • Composite endpoint: competing risk is an issue
Example of a trial with recurrent event endpoint:
Example of a trial with composite endpoint:






While the composite endpoint is usually analyzed as time to first event (whichever component occurs first) using the log-rank test or the Cox proportional hazards model, recurrent events may be analyzed in several different ways. Below are some examples from published trials.

Emicizumab Prophylaxis in Hemophilia A with Inhibitors

The primary end point was the difference in the rate of treated bleeding events (hereafter referred to as the bleeding rate) over a period of at least 24 weeks between participants receiving emicizumab prophylaxis (group A) and those receiving no prophylaxis (group B) after the last randomly assigned participant had completed 24 weeks in the trial or had discontinued participation, whichever occurred first.
For all bleeding-related end points, comparisons of the bleeding rate in group A versus group B and the intraindividual comparisons were performed with the use of a negative binomial-regression model to determine the bleeding rate per day, which was converted to an annualized bleeding rate.
The primary efficacy end point was the annual rate of sickle cell–related pain crises, which was calculated as follows: total number of crises × 365 ÷ (end date − date of randomization + 1), with the end date defined as the date of the last dose plus 14 days. Annualized rates were used for the comparisons because they take into account the duration that a participant was in the trial. The crisis rate for every patient was annualized to 12 months. The annual crisis rate was imputed for patients who did not complete the trial. The difference in the annual crisis rate between the high-dose crizanlizumab group and the placebo group was analyzed with the use of the stratified Wilcoxon rank-sum test, with the use of categorized history of crises in the previous year (2 to 4 or 5 to 10 crises) and concomitant hydroxyurea use (yes or no) as strata. A hierarchical testing procedure was used (alpha level of 0.05 for high-dose crizanlizumab vs. placebo, and if significant, low-dose crizanlizumab vs. placebo).
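The annualization formula quoted above can be sketched in a few lines of Python. The function name and the dates are hypothetical; only the formula itself (crises × 365 ÷ (end date − date of randomization + 1), with the end date taken as last dose plus 14 days) comes from the quoted text.

```python
from datetime import date, timedelta

def annualized_crisis_rate(n_crises, randomization_date, last_dose_date):
    """Annualized rate = crises x 365 / (end date - randomization date + 1)."""
    end_date = last_dose_date + timedelta(days=14)   # end date per the quoted definition
    days_in_trial = (end_date - randomization_date).days + 1
    return n_crises * 365 / days_in_trial

# Hypothetical patient: randomized 1 Jan 2017, last dose 17 Dec 2017, 3 crises
rate = annualized_crisis_rate(3, date(2017, 1, 1), date(2017, 12, 17))
print(rate)
```

A patient who leaves the trial early contributes fewer days to the denominator, so the same number of crises translates into a higher annualized rate.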

A painful crisis was defined as a visit to a medical facility that lasted more than four hours for acute sickling-related pain (hereinafter referred to as a medical contact), which was treated with a parenterally administered narcotic (except for a few facilities in which only orally administered narcotics were used); the definition is similar to that used in a previous study. Annual rates were computed by dividing the number of crises by the number of years elapsed (e.g., 6 crises in 1.9 years = 3.16 crises per year). To test the effect of treatment on the crisis rate, the patients were ranked according to the number of crises they had had per year for observed periods of up to two years. Death was considered the worst outcome, followed by a stroke (defined as a documented new neurologic deficit lasting more than 24 hours, confirmed by a neurologist) or the institution of long-term transfusion therapy (more than four months); outcomes for all other patients were ranked according to the individual crisis rate. These ranks were used to compare the two treatment groups (Van der Waerden’s test). A rank statistic was planned for the primary analysis because it was expected to have more power to detect differences and to be less influenced by extreme values than a t-test of the means.

The primary efficacy endpoint was mean change from baseline in frequency of headache days for the 28-day period ending with week 24. A headache day was defined as a calendar day (00:00 to 23:59) when the patient reported four or more continuous hours of a headache, per the patient diary. Subsequent to study initiation, but prior to study completion and treatment unmasking, the protocol and statistical analysis plan for PREEMPT 2 was amended to change the primary and secondary endpoints, making frequency of headache days the PREEMPT 2 primary endpoint. This change was made based on several factors: availability of PREEMPT 1 data, guidance provided in newly issued International Headache Society clinical trial guidelines for evaluating headache prophylaxis in CM (34) and the earlier expressed preference of the US Food and Drug Administration (FDA), all of which supported using headache day frequency as a primary outcome measure for CM. For each primary and secondary variable, prespecified comparisons between treatment groups were done by analysis of covariance of the change from baseline, with the same variable’s baseline value as a covariate, with main effects of treatment group and medication overuse strata. The baseline covariate adjustment was prespecified as the primary analysis; sensitivity analyses (e.g., rank-sum test on changes from baseline without a baseline covariate) were also performed.

The primary outcome was the time to the first acute exacerbation of COPD, with acute exacerbation of COPD defined as “a complex of respiratory symptoms (increased or new onset) of more than one of the following: cough, sputum, wheezing, dyspnea, or chest tightness with a duration of at least 3 days requiring treatment with antibiotics or systemic steroids.” The primary analysis was based on a log-rank test of the difference between the two treatment groups in the time to the first exacerbation, with no adjustments for baseline covariates. A Cox proportional-hazards  model was used to adjust for differences in prespecified, prerandomization factors that might predict the risk of acute exacerbations of COPD.

The primary outcome was the effect of simvastatin on the exacerbation rate, which was defined as the number of exacerbations per person-year.

COPD exacerbation rates in the two study groups were compared with the use of a rate ratio. The independence of individual exacerbations was ensured by considering participants to have had two separate exacerbations if the onset dates were at least 14 days apart. Exacerbation rates in each group and the between-group differences were analyzed with the use of negative binomial regression modeling and time-weighted intention-to-treat analyses with adjustments of confidence intervals for between-participant variation (overdispersion).
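The 14-day rule for separating exacerbations and the rate per person-year can be illustrated with a short Python sketch. The function names and dates are hypothetical, and the sketch assumes the 14-day gap is measured from the onset of the last exacerbation that was counted, which the quoted text does not spell out.

```python
from datetime import date

def count_separate_exacerbations(onset_dates, min_gap_days=14):
    """Count exacerbations, treating onsets < min_gap_days apart as one event.

    Assumption: the gap is measured from the onset of the last counted event.
    """
    count = 0
    last_counted = None
    for onset in sorted(onset_dates):
        if last_counted is None or (onset - last_counted).days >= min_gap_days:
            count += 1
            last_counted = onset
    return count

def rate_per_person_year(total_events, total_followup_days):
    """Exacerbation rate expressed as events per person-year of follow-up."""
    return total_events / (total_followup_days / 365.25)

# Hypothetical participant with four recorded onsets
onsets = [date(2017, 1, 1), date(2017, 1, 10), date(2017, 1, 15), date(2017, 3, 1)]
n_events = count_separate_exacerbations(onsets)
print(n_events, rate_per_person_year(n_events, total_followup_days=365.25))
```

The rate ratio between two groups is then the ratio of their rates; the negative binomial model in the quoted text additionally accounts for between-participant variation, which this sketch does not attempt.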




FDA recommended the time to first exacerbation as the primary efficacy endpoint over the frequency of exacerbations. The time to first exacerbation is analyzed using the log-rank test or the Cox proportional hazards model.
FDA agrees that the frequency of exacerbations may be a clinically relevant endpoint; however, there are several statistical issues and challenges in providing a reliable and unbiased estimate of the treatment effect using this endpoint:
  • Dependencies of exacerbation on previous exacerbations within patients
  • Effect of influential cases as it can potentially impact the results
  • Distinguishing between early vs. late exacerbations as a function of time
  • Distinguishing between first vs. subsequent exacerbations within patients
  • Investigator biases in assessing the number of events (e.g. events occurring close together)

Tuesday, November 28, 2017

Bonferroni method, alpha level partition, and gatekeeper hierarchical test strategy in Bronchiectasis clinical trials

In a recent FDA advisory committee meeting on November 16, 2017, we saw a first-hand application of various approaches for multiplicity adjustment: the single-step Bonferroni method, a single-step arbitrary partition of the alpha level, and the gatekeeping (hierarchical) test procedure, which was discussed in one of my previous posts.

During this meeting of the Antimicrobial Drugs Advisory Committee (AMDAC), the committee considered new drug application (NDA) 209367 for ciprofloxacin dry powder for inhalation (DPI), sponsored by Bayer HealthCare Pharmaceuticals, Inc. The drug is being proposed for the reduction of exacerbations in non-cystic fibrosis bronchiectasis (NCFB) adult patients (≥18 years of age) with respiratory bacterial pathogens.

The clinical program to evaluate the safety and efficacy of ciprofloxacin DPI consisted of 2 nearly identical phase 3, randomized, multicenter, placebo-controlled trials known as RESPIRE 1 and RESPIRE 2. See table 1 below for the design information.


For both the RESPIRE 1 and RESPIRE 2 studies, the primary efficacy endpoint is time to first exacerbation. Within each study, there are three treatment arms with two hypothesis tests. To maintain the blinding, the placebo arm is further divided into placebo for the 28 days on/off treatment regimen and placebo for the 14 days on/off treatment regimen. For analysis purposes, however, the placebo groups are pooled. The hypothesis tests and the allocated alpha levels are listed below. For the RESPIRE 1 study, the alpha level of 0.025 for each hypothesis test is based on the Bonferroni method for multiplicity adjustment. For the RESPIRE 2 study, the alpha levels of 0.001 and 0.049 are based on an arbitrary partition (any partition is acceptable as long as the total alpha = 0.05).

RESPIRE 1 Study (Bonferroni method for multiplicity adjustment):
Hypothesis 1: ciprofloxacin DPI for 28 days on/off treatment regimen versus pooled placebo (alpha=0.025)
Hypothesis 2: ciprofloxacin DPI for 14 days on/off treatment regimen versus pooled placebo (alpha=0.025)
RESPIRE 2 Study (arbitrary partition of alpha level for multiplicity adjustment):
Hypothesis 1: ciprofloxacin DPI for 28 days on/off treatment regimen versus pooled placebo (alpha=0.001)
Hypothesis 2: ciprofloxacin DPI for 14 days on/off treatment regimen versus pooled placebo (alpha=0.049)
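The two alpha allocation schemes listed above can be written down in a few lines of Python. The alpha values come from the study descriptions; the function names and the p-values are hypothetical and serve only to show how the same result can be judged differently under the two schemes.

```python
def bonferroni_alphas(total_alpha, n_tests):
    """Bonferroni: split the total alpha equally across the tests."""
    return [total_alpha / n_tests] * n_tests

def is_valid_partition(alphas, total_alpha=0.05, tol=1e-12):
    """Any partition is acceptable as long as the pieces sum to at most total_alpha."""
    return all(a > 0 for a in alphas) and sum(alphas) <= total_alpha + tol

def significant(p_values, alphas):
    """Compare each p-value with its own allocated alpha."""
    return [p < a for p, a in zip(p_values, alphas)]

respire1 = bonferroni_alphas(0.05, 2)   # [0.025, 0.025] for Cipro 28, Cipro 14
respire2 = [0.001, 0.049]               # arbitrary partition for Cipro 28, Cipro 14
assert is_valid_partition(respire1) and is_valid_partition(respire2)

hypothetical_p = [0.030, 0.030]         # made-up p-values, not trial results
print(significant(hypothetical_p, respire1))  # judged against 0.025 each
print(significant(hypothetical_p, respire2))  # judged against 0.001 and 0.049
```

With an unequal partition, nearly all of the alpha is spent on one comparison, so the other comparison needs an extremely small p-value to be declared significant.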
The study results indicate some efficacy, but not consistently across all four hypothesis tests. For details about the study results, please see FDA's advisory committee briefing book, the study results for RESPIRE 1, and the study results for RESPIRE 2 on clinicaltrials.gov.

The study also included a long list of the secondary efficacy endpoints. To control the overall type I error rate associated with testing primary and secondary endpoints in two treatment regimens (Cipro 14 and Cipro 28) against placebo, separate hierarchical testing sequences of primary, key secondary and other secondary endpoints were pre-specified for each regimen with statistical testing at α=0.025 for each Cipro arm in RESPIRE 1 and α=0.001 for Cipro 28 and α=0.049 for Cipro 14 in RESPIRE 2. If the primary endpoint was significant for a Cipro regimen then the next endpoint in the sequence (i.e., key secondary endpoint) was tested within that Cipro regimen. Statistical testing would only continue to the next endpoint in the hierarchy if the preceding endpoint in the hierarchy showed significance. Endpoints which could not be statistically tested were considered to be exploratory. The hierarchical testing strategy is shown in Figure 2.
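The fixed-sequence logic described above (test the next endpoint only if the preceding one is significant; everything after the first failure is exploratory) can be sketched as follows. The function name, the third endpoint label, and all p-values are hypothetical and are not the RESPIRE results.

```python
def hierarchical_test(ordered_endpoints, alpha):
    """Fixed-sequence testing within one regimen at its allocated alpha.

    ordered_endpoints: list of (endpoint name, p-value) in pre-specified order.
    """
    results = []
    gate_open = True
    for name, p in ordered_endpoints:
        if not gate_open:
            results.append((name, "exploratory"))     # cannot be formally tested
        elif p < alpha:
            results.append((name, "significant"))     # gate stays open
        else:
            results.append((name, "not significant"))
            gate_open = False                         # stop confirmatory testing
    return results

# Hypothetical p-values for one regimen tested at alpha = 0.025
sequence = [
    ("time to first exacerbation", 0.010),
    ("frequency of exacerbations", 0.200),
    ("third endpoint in the sequence", 0.001),
]
for name, status in hierarchical_test(sequence, alpha=0.025):
    print(f"{name}: {status}")
```

Note that the third endpoint is labeled exploratory despite its small p-value, because the gate closed at the second endpoint.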



Unfortunately, the hierarchical strategy did not work well, and the majority of the secondary endpoints were not tested because of the non-significant results for the primary efficacy endpoint. As mentioned in FDA's briefing book:
Under the pre-specified hierarchical strategy, confirmatory testing of the first secondary endpoint (frequency of exacerbations) against Pooled Placebo, and all subsequent endpoints, could not be performed for Cipro 28 (both trials) and for Cipro 14 (RESPIRE 2) because the respective findings for the primary endpoint of TFE were not significant. In RESPIRE 1, confirmatory testing of Cipro 14 could only be performed up to the first secondary endpoint (FOE) which failed to show significance. With the exception of a statistically significant finding observed for one comparison (i.e., Cipro 14 day vs. Pooled Placebo for the primary endpoint in RESPIRE 1), all other comparisons were considered to be exploratory or not statistically significant. As indicated in Figure 2 there was the potential for up to 32 comparisons to show statistical significance (8 endpoints in each of two Cipro arms across two trials).
The FDA advisory committee was not convinced by the evidence of ciprofloxacin DPI efficacy. Here is the voting result. It is unlikely that FDA will approve a product with such a voting result, even though there is currently no approved drug for treating non-cystic fibrosis bronchiectasis.



Had a different study design and different method for multiplicity adjustment been used, the situation might be very different. The evidence for the experimental drug might be more obvious if a simpler study design was used - at least this is the situation for 14 day on/off regimen versus placebo.

We are now closely watching the fate of Aradigm's NDA for ciprofloxacin in treating non-CF bronchiectasis. Aradigm's pivotal studies (Orbit 3 and Orbit 4) are simpler in study design, with one of the two studies positive. One thing is for sure: there will not be the same complications in dealing with the multiplicity adjustment.


Saturday, November 25, 2017

Co-primary endpoints and multiple primary endpoints

In the recent FDA guidance 'Multiple Endpoints in Clinical Trials' and the EMA guidance 'Guideline on multiplicity issues in clinical trials', the terms 'co-primary endpoints' and 'multiple primary endpoints' are clarified.

Historically, the term 'co-primary endpoints' was used for different meanings in different clinical trial protocols, statistical analysis plans, and journal articles. In many cases, the term 'co-primary endpoints' was inappropriately used for really 'multiple primary endpoints'.

The term 'co-primary endpoints' should be used only when there is more than one primary endpoint and the study is declared a success only if all of the primary endpoints are statistically significant in favor of the experimental treatment. When co-primary endpoints are used, each primary endpoint is tested at a significance level of 0.05. There is no multiplicity issue involved.

In contrast, the term 'multiple primary endpoints' should be used when there is more than one primary endpoint and the study is declared a success if any one of the primary endpoints is statistically significant in favor of the experimental treatment. In this case, each primary endpoint is tested at a significance level determined by the method for multiplicity adjustment or simply by a partition of the alpha level.
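The difference between the two decision rules can be made concrete with a small Python sketch. The function names and p-values are hypothetical; the alpha values are simply examples of an equal split and an unequal partition.

```python
def coprimary_success(p_values, alpha=0.05):
    """Co-primary endpoints: every endpoint must be significant at the full alpha."""
    return all(p < alpha for p in p_values)

def multiple_primary_success(p_values, alphas):
    """Multiple primary endpoints: any one endpoint significant at its adjusted alpha."""
    return any(p < a for p, a in zip(p_values, alphas))

p_values = [0.030, 0.200]   # made-up p-values for two primary endpoints

print(coprimary_success(p_values))                        # needs both below 0.05
print(multiple_primary_success(p_values, [0.025, 0.025])) # Bonferroni split
print(multiple_primary_success(p_values, [0.04, 0.01]))   # unequal alpha partition
```

Co-primary endpoints do not inflate the type I error because all tests must succeed, but they reduce power; multiple primary endpoints offer several chances to win and therefore need the adjusted alpha levels.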

Here is what EMA guidance 'guideline on multiplicity issues in clinical trials' says:
If more than one primary endpoint is used to define study success, this success could be defined by a  positive outcome in all endpoints or it may be considered sufficient, if one out of a number of endpoints has a positive outcome. Whereas in the first definition the primary endpoints are designated  as co-primary endpoints, the latter case is different and would require appropriate adjustment for multiplicity. More generally, in case of more than two primary endpoints, adjustment is needed if not all endpoints need to be significant to define study success, and the inability to exclude deteriorations in other primary endpoints would have to be considered in the overall benefit/risk assessment.
In FDA's guidance 'multiple endpoints in clinical trials', the term 'co-primary endpoints' was extensively discussed and the examples of co-primary endpoints were provided. In section C of the guidance, it says:
For some disorders, there are two or more different features that are so critically important to the disease under study that a drug will not be considered effective without demonstration of a treatment effect on all of these disease features. The term used in this guidance to describe this circumstance of multiple primary endpoints is co-primary endpoints. Multiple primary endpoints become co-primary endpoints when it is necessary to demonstrate an effect on each of the endpoints to conclude that a drug is effective.
The guidance provided the following examples of co-primary endpoints where both co-primary endpoints needed to be statistically significant in order to declare the trial success:

  • A recent approach to studying treatments is to consider a drug effective for migraines only if pain and an individually-specified most bothersome second feature are both shown to be improved by the drug treatment. 
  • Drugs for Alzheimer’s disease have generally been expected to show an effect on both the defining feature of the disease, decreased cognitive function, and on some measure of the clinical impact of that effect. Because there is no single endpoint able to provide convincing evidence of both, co-primary endpoints are used. One primary endpoint is the effect on a measure of cognition in Alzheimer’s disease (e.g., the Alzheimer’s Disease Assessment Scale-Cognitive Component), and the second is the effect on a clinically interpretable measure of function, such as a clinician’s global assessment or an Activities of Daily Living Assessment.
In an article by Kantarjian et al, “Decitabine improves patient outcomes in myelodysplastic syndromes: results of a phase III randomized study”, the term ‘coprimary endpoints’ was incorrectly used for ‘multiple primary endpoints’, even though the multiplicity adjustment method (Bonferroni correction) was appropriately applied.


The coprimary endpoints in the current study were ORR and time to AML transformation or death. Response was assessed according to the International Working group (IWG) criteria……Two analyses, one interim and one final, were planned using the stopping rules of O’Brien and Fleming. The overall type 1 error rate was maintained at a maximum of 5% by applying a Bonferroni correction for the coprimary endpoints at the final analysis. A maximum P value of .024 was required to establish statistical significance using a 2-sided analysis for either of the coprimary endpoints (ORR or time to AML or Death).

In an article by McLaughlin et al, "Bosentan added to sildenafil therapy in patients with pulmonary arterial hypertension", the term 'co-primary endpoints' was used in a situation where 'multiple primary endpoints' should have been used. Note that the original protocol used a study design with two primary endpoints and a partition of the alpha level (0.04 for time to morbidity/mortality and 0.01 for change in 6MWD) as the approach for multiplicity adjustment.
The initial assumptions for the primary end-point were an annual rate of 21% on placebo with a risk reduced by 36% (hazard ratio (HR) 0.64) with bosentan and a negligible annual attrition rate. In addition, it was planned to conduct a single final analysis at 0.04 (two-sided), taking into account the existence of a co-primary end-point (change in 6MWD at 16 weeks) planned to be tested at 0.01 (two-sided). Over the course of the study, a number of amendments were introduced based on the evolution of knowledge in the field of PAH, as well as the rate of enrolment and blinded evaluation of the overall event rate. On implementation of an amendment in 2007, the 6MWD end-point was changed from a co-primary end-point to a secondary endpoint and the Type I error associated with the single remaining primary end-point was increased to 0.05 (two-sided).


Friday, November 03, 2017

SAD and MAD: Single Ascending Dose and Multiple Ascending Dose first-in-human studies

Acronyms are everywhere in clinical trials. Previously I mentioned that in the 21st Century Cures Act, the acronym RAT was used for the Regenerative Advanced Therapy designation; the term ‘RAT’ was criticized and was later changed to RMAT (Regenerative Medicine Advanced Therapy) in FDA’s implementation.

Now we have a pair of names SAD and MAD commonly used in early phase clinical trials. It does not mean anybody will be sad or mad. A sponsor should be happy (not SAD or MAD) when its development program can progress into the clinical trial stage.
  
SAD stands for single ascending dose and MAD stands for multiple ascending dose. SAD and MAD studies are typically the first-in-human (FIH) studies. They seek to gain information on safety and tolerability and on general pharmacokinetic (PK) and pharmacodynamic (PD) characteristics, and to identify the maximum tolerated dose (MTD). SAD/MAD studies can also be used to test cardiac safety and evaluate QT/QTc prolongation.

Many dose escalation studies are in effect SAD or MAD studies even though the SAD/MAD terms are not used. For example, the popular 3+3 design is one type of SAD/MAD study with a focus on safety and tolerability.

SAD/MAD studies are usually conducted in healthy volunteers in a clinical research unit (CRU) or phase I unit. But they can be conducted in patients when it is unethical to test the experimental drug (for example, oncology drugs and plasma-derived drugs) in healthy volunteers. SAD and MAD can be combined into one study within the same study protocol or conducted as two separate studies.

For SAD studies, the starting dose is based on the pre-clinical and animal studies. For MAD studies, the starting dose is usually based on results from the SAD study.

From the PK assessment standpoint, in SAD studies each subject receives a single dose, and serial PK samples are taken to evaluate the PK profile after a single dose. In MAD studies, each subject receives multiple doses; after steady state is achieved, serial PK samples are taken to evaluate the PK profile at steady state. Both types of study are conducted on a cohort basis, and subjects within each cohort receive the same dose level. With the PK results from SAD/MAD studies, dose linearity and dose proportionality can be evaluated.
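One common way to assess dose proportionality is the power model, ln(exposure) = a + b × ln(dose), where a slope b close to 1 is consistent with dose proportionality. The sketch below fits that slope by ordinary least squares in plain Python; the function name and the dose and AUC values are made up for illustration.

```python
import math

def power_model_slope(doses, exposures):
    """Least-squares slope b of ln(exposure) = a + b * ln(dose)."""
    x = [math.log(d) for d in doses]
    y = [math.log(e) for e in exposures]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical cohort-level mean AUC values from a SAD study
doses = [10, 20, 40, 80]        # mg
mean_auc = [52, 101, 210, 395]  # ng*h/mL
print(round(power_model_slope(doses, mean_auc), 3))
```

In practice the model is fitted to individual subject data and the conclusion is based on a confidence interval for the slope rather than on the point estimate alone.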

From the safety assessment standpoint, in both SAD and MAD studies the first cohort of subjects receives the lowest dose (the starting dose). Subjects are usually confined in a clinical research unit (CRU) with close safety monitoring. After each cohort, safety and tolerability are assessed to determine whether the study should proceed to the next cohort at a higher dose. The safety evaluation after each cohort is usually performed by an internal team within the sponsor, but it can certainly be performed by an independent committee such as a data and safety monitoring board (DSMB). With the safety data, the maximum tolerated dose (MTD) may be identified.

In SAD/MAD studies, within each cohort, placebo control can be added. Depending on whether there is a concurrent placebo control group, the SAD/MAD studies could have the following types.
  • SAD without placebo control
  • SAD with placebo control
  • MAD without placebo control
  • MAD with placebo control

When a placebo group is added to a SAD/MAD study, it is very common to use an n:1 randomization ratio within each cohort to avoid having too many placebo subjects in the final analysis. For the final analysis, subjects in the placebo group are pooled across all cohorts.
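As a rough illustration of n:1 randomization within cohorts and pooling of placebo subjects, here is a small Python sketch. The cohort labels, the 6 active to 2 placebo split (a 3:1 ratio), and the function name are all hypothetical choices for the example.

```python
import random
from collections import Counter

def randomize_cohort(dose_label, n_active, n_placebo, rng):
    """Return a shuffled list of arm assignments for one cohort."""
    arms = [dose_label] * n_active + ["placebo"] * n_placebo
    rng.shuffle(arms)
    return arms

rng = random.Random(2017)                 # fixed seed so the sketch is reproducible
cohorts = ["10 mg", "30 mg", "100 mg"]    # hypothetical ascending dose levels

all_assignments = []
for dose in cohorts:
    # 6 active : 2 placebo per cohort, i.e. a 3:1 ratio
    all_assignments.extend(randomize_cohort(dose, n_active=6, n_placebo=2, rng=rng))

# For the final analysis, placebo subjects from all cohorts form one pooled group
print(Counter(all_assignments))
```

With three cohorts, the pooled placebo group ends up the same size as each active dose group even though only two subjects per cohort receive placebo.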

Here are a couple of examples of SAD/MAD study designs. They are taken from presentation slides I made almost 20 years ago but are still relevant:






Thursday, October 26, 2017

NIH and FDA Release Protocol Template for Phase 2 and 3 IND/IDE Clinical Trials and e-Protocol Writing Tool

In a previous article, 'Clinical Trial Protocol Template', I mentioned the draft template from NIH/FDA.

NIH/FDA have now finalized the clinical trial protocol template.

NIH and FDA Release Protocol Template for Phase 2 and 3 IND/IDE Clinical Trials
The National Institutes of Health (NIH) and Food and Drug Administration (FDA) developed a clinical trial protocol template with instructional and example text for NIH-funded investigators to use when writing protocols for phase 2 and 3 clinical trials that require Investigational New Drug application (IND) or Investigational Device Exemption (IDE) applications. In March 2016 a draft template was released for public comment, generating nearly 200 comments from 60 respondents. All comments were carefully considered and many were incorporated into the final template. The agencies’ goal is to encourage and make it easier for investigators to prepare clinical trial protocols that are organized consistently and that contain all of the information necessary for the review of the protocol. The template follows the International Conference on Harmonisation (ICH) E6 (R2) Good Clinical Practice and is available as a Word document.
The NIH also released a secure web-based e-Protocol Writing Tool that allows investigators to generate a new protocol using the NIH-FDA Phase 2 and 3 IND/IDE Clinical Trial Protocol Template. The e-Protocol Writing Tool fosters protocol writing collaboration by allowing multiple writers and reviewers to participate in the protocol development process. The e-Protocol Writing Tool allows the author to assign writers and collaborators, and the tool assists the author with tracking progress and document version control.
The NIH expects to expand the development of the e-Protocol Writing Tool by adding instructional text and sample text for other types of studies, such as behavioral and phase 1 trials. Future releases of this e-Protocol Writing Tool will have improvements and enhanced tool functionality.

Saturday, October 21, 2017

CAR-T Gene Therapy Clinical Trials - Success Story and Beyond

Recently, two CAR-T (Chimeric Antigen Receptor T cell) gene therapies were approved by FDA. The first is tisagenlecleucel, manufactured by Novartis and indicated for certain pediatric and young adult patients with a form of acute lymphoblastic leukemia (ALL). The second is axicabtagene ciloleucel from Kite Pharma (now part of Gilead), indicated for adult patients with certain types of large B-cell lymphoma, a form of non-Hodgkin lymphoma (NHL), who have not responded to or who have relapsed after at least two other kinds of treatment.

Before the approval of tisagenlecleucel, FDA convened an advisory committee meeting on July 12, 2017 to discuss safety concerns. The official approval came about one and a half months after the advisory committee voted unanimously in favor of approval.

The second CAR-T product approval came just one and a half months after the first and did not need to go through the advisory committee meeting process.

From the clinical trial design standpoint, both approvals were based on a pivotal study with a relatively small but sufficient sample size. Since each study was single-arm with no control group, the results were compared with a historical control (or a commonly accepted criterion).

For Novartis’ tisagenlecleucel, the pivotal study is registered on clinicaltrials.gov as “A Phase II, Single Arm, Multicenter Trial to Determine the Efficacy and Safety of CTL019 in Pediatric Patients With Relapsed and Refractory B-cell Acute Lymphoblastic Leukemia”. The briefing book for the advisory committee meeting detailed the background of CAR-T, the study design, and the results.

The primary efficacy endpoint is the overall remission rate (ORR) during the 3 months after tisagenlecleucel administration; ORR includes complete remission (CR) and complete remission with incomplete hematologic recovery (CRi), as determined by independent review committee (IRC) assessment.

The pre-specified primary efficacy analysis tested the null hypothesis that the ORR was less than or equal to 20% against the alternative hypothesis that the ORR was greater than 20%, at an overall one-sided 2.5% level of significance. The study would meet its primary objective if the lower bound of the two-sided 95% exact Clopper-Pearson confidence interval (CI) for ORR was greater than 20% (note: I wrote an earlier article about the Clopper-Pearson confidence interval).

The study results were remarkable. The ORR was 82.5% (52/63). The lower bound of the 95% exact confidence interval using the Clopper-Pearson method was 70.9%, well above the pre-specified criterion of 20%.

The approval of Kite’s axicabtagene ciloleucel was based on a similar study design. On clinicaltrials.gov, the study was registered as “A Phase 1-2 Multi-Center Study Evaluating the Safety and Efficacy of KTE-C19 in Subjects With Refractory Aggressive Non-Hodgkin Lymphoma (NHL)”, the so-called ‘ZUMA-1’ trial. The primary efficacy endpoint is ORR (objective response rate), consisting of complete response [CR] plus partial response [PR] per the revised International Working Group (IWG) Response Criteria for Malignant Lymphoma.

The study results were announced in Kite’s press release. The ORR was 82% (83/101), with a p-value of less than 0.0001. The 95% exact confidence interval for ORR was not provided, but with the Clopper-Pearson method we can calculate the lower bound of the 95% confidence interval to be about 73%. This is probably well above the pre-specified ORR threshold (the historical control, if one exists) for patients without CAR-T treatment.
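
As a rough check on the lower bounds quoted for the two studies above, here is a minimal Python sketch of the exact Clopper-Pearson interval using only the standard library. The function names (`binom_sf`, `clopper_pearson`) and the bisection approach are my own illustration, not the software or method used in either submission; the counts 52/63 and 83/101 and the 20% null rate come from the text.

```python
from math import comb


def binom_sf(x, n, p):
    """Return P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


def _solve_increasing(f, target, tol=1e-10):
    """Bisection on [0, 1] for f(p) = target, where f is increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: binom_sf(x, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha / 2,
    # which is the same as P(X >= x + 1) equal to 1 - alpha / 2.
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: binom_sf(x + 1, n, p), 1 - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    studies = [("tisagenlecleucel", 52, 63), ("axicabtagene ciloleucel", 83, 101)]
    for label, x, n in studies:
        lower, upper = clopper_pearson(x, n)
        print(f"{label}: ORR = {x / n:.1%}, 95% CI = ({lower:.1%}, {upper:.1%})")
    # One-sided exact p-value for H0: ORR <= 20% in the tisagenlecleucel study.
    print(f"p-value vs 20%: {binom_sf(52, 63, 0.20):.3g}")
```

The lower bounds from this sketch should land close to the 70.9% and 73% figures quoted above. Note that the lower bound exceeding 20% is equivalent to the one-sided exact binomial test rejecting the null hypothesis at the 2.5% level, which is why the confidence interval criterion and the hypothesis test give the same conclusion.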

While we are excited about the advances in gene therapy, long-term safety still needs to be followed. Both approved products carry a boxed warning, and the main short-term safety issue, which can be life-threatening, is cytokine release syndrome (CRS).

It will be interesting to see how the cost of these CAR-T therapies will be managed and accepted by the community. Novartis’ tisagenlecleucel (brand name Kymriah) is priced at $475,000 for a treatment course. Kite’s axicabtagene ciloleucel (brand name Yescarta) is priced at $373,000 for a treatment course.

The development of therapies like CAR-T requires close collaboration among industry, academia, and government. Novartis’ tisagenlecleucel was mainly developed by the University of Pennsylvania. Kite’s axicabtagene ciloleucel was developed by the National Cancer Institute.

There is an article in New England Journal of Medicine by Dr. Rosenbaum “Tragedy, Perseverance, and Chance — The Story of CAR-T Therapy”.

CAR-T is now very popular in China, where many investigational clinical trials using CAR-T therapy are ongoing. Someone did a search on clinicaltrials.gov and found more CAR-T clinical trials in China than in the US.

Both CAR-T successes are in hematologic cancers; the next challenge will be to find a successful CAR-T therapy for solid tumors.


The success of CAR-T should give us hope that someday we can transplant pig organs into humans, an area in which my company is a pioneer. See the article in the New York Times, "Gene Editing Spurs Hope for Transplanting Pig Organs Into Humans".

Wednesday, October 18, 2017

Efficient orphan drug development - clinical trial designs in rare disease area

For a long time, drug development in rare diseases has been a niche area for biotechnology companies. With the rapid advances in genetics and ‘-omics’ in the precision medicine era, researchers are continuously identifying new diseases or disease variants. A prevalent disease can be dissected into many small subsets, each of which could become a rare disease.
Rare diseases pose challenges in clinical trial design. Some of the challenges are:
  • Small number of patients affected
  • Small number of patients for clinical trials
  • Heterogeneity of the disease
  • Limited understanding of the disease’s natural history
  • Lack of well-defined study endpoints
  • Limited early phase clinical trial data

There are several initiatives in the US and in European countries that focus on clinical trial designs and methodologies in the rare disease area.

In the US, NORD (National Organization for Rare Disorders) is a patient advocacy organization dedicated to individuals with rare diseases and the organizations that serve them. NORD, along with its more than 260 patient organization members, is committed to the identification, treatment, and cure of rare disorders through programs of education, advocacy, research, and patient services. NORD’s vision includes a culture of innovation that supports basic and translational research to create diagnostic tests and therapies for all rare diseases, and a regulatory environment that encourages development and timely approval of safe, effective diagnostics and treatments.
“Robert Temple, deputy center director for clinical science at CDER, said that one of his long held interests at FDA has been making clinical trials more efficient.
"This is particularly important in orphan territory because there are often few patients in total and there are usually very few near the centers that want to do the studies," Temple said.
Because of the challenges inherent in identifying and enrolling patients with rare disorders in studies, Temple said that having detailed natural history data "can make a tremendous difference in identifying the manifestations you want to try to treat and identifying the patients you should include" in a study.
Another important consideration, Temple said, is whether there are design features that can be built into a trial to make it more efficient.
For instance, Temple said that cross-over studies could be done in situations where enrolling patients is difficult and the disease being studied has a transient effect.
This isn't a new idea, Temple said, pointing to a 1976 cross-over study of the synthetic steroid danazol to treat hereditary angioedema that enrolled only nine patients. In that study the patients were randomized to the drug or a placebo until they had an attack, at which point they were moved to the other study arm.
Temple also said that FDA has seen some success with doing randomized withdrawal studies, particularly in situations where a placebo-controlled arm is not feasible.
"Sometimes in the course of development you'll have a lot of people who for one reason or another have been put on the drug because it's the only game in town…with a lot of people on the drug you can sometimes, if they're willing, do a randomized withdrawal study," Temple said.
But Temple emphasized that these types of studies are only appropriate in certain circumstances, and stressed that sponsors should consider approaches early on.
Billy Dunn, director of CDER's division of neurology products, said that especially for rare diseases FDA and industry need to look at novel approaches to studying drugs.
"We're not the cardiovascular division, we're not accustomed to having these multi-thousand patient trials…many things that might seem controversial or unusual in other settings, sometimes—not always, but sometimes—those can be more run-of-the-mill for us," Dunn said.
Dunn also said that different considerations must be made for studies of progressive diseases, where different stages or forms of a disease can vary considerably. In such situations, Dunn said that sponsors need to consider progression when determining enrollment criteria and, "while difficult to operationalize, having patient individualized outcomes where we attempt to assess, for a given stage or form of a disease, those aspects…that are most impairing."
Wilson Bryan, director of the Office of Tissues and Advanced Therapies at the Center for Biologics Evaluation and Research (CBER), said that one of the challenges for his office is working with academic sponsors and small biotech companies that do not have extensive experience in drug development.
This is an especially challenging issue in the rare disease space, Bryan said, because there are very few patients with a particular disease to begin with and any inefficiency in a study can waste precious time and resources. This is also true for advanced therapies as many are disease modifying and have prolonged or permanent effects, making it difficult for patients to enroll in additional studies in the future.
"Too often we have folks come in with [investigational new drug applications] INDs and they haven't started their natural history study yet, and they haven't started thinking about what their outcome measures are going to be for Phase III," Bryan said.
When should sponsors begin thinking about these things? According to Bryan, at the very early stages of drug development and well before a product is set to begin human studies.
"When I say early on in drug development I don't mean at Phase I, I mean when you first start to do preclinical studies, think about what is the target. The target is not getting into clinical trials … the target is getting a product on the market that's of use to patients," Bryan said.
Julia Beitz, director of CDER's Office of Drug Evaluation III, said that sponsors should also make sure to establish a baseline for patients' clinical, cognitive and developmental status before beginning a study and take repeat measurements of those features throughout the trial.
Beitz also said that sponsors should try, "to the extent that is possible," to stabilize patients' conditions before starting them on a drug, especially when patients are expected to need another intervention, such as surgery or a medical device.
"If these measures are instituted during the trial it becomes difficult to assess the independent effect of the new treatment on the patient," Beitz said.

In the social media era, it is important to have a platform where patients and caregivers can communicate and engage. INSPIRE is one such organization and provides an authentic platform for patient engagement.

A group of people is forming a DIA working group called NEED (the Nature and Extent of Evidence Needed for Decision) to discuss clinical trial design in rare diseases, including natural history studies, historical controls, and innovative study designs such as adaptive and Bayesian designs.

In European countries, there are several initiatives focusing on the clinical trial design aspects of rare disease or small population trials.