Friday, December 15, 2023

Randomized start design (RSD), delayed start design, randomized withdrawal design to assess disease modification effect

In the latest Global CardioVascular Clinical Trialists (CVCT) Workshop, one of the topics was "How to assess the disease modification in pulmonary arterial hypertension". The academic and industry representatives discussed the definition of disease modification and whether the various individual drugs met the criteria as disease modifiers. 

Disease modification requires that the intervention have an impact on the underlying pathology and pathophysiology of the disease. For regulatory purposes, a disease modifying effect is when an intervention delays the underlying pathological processes and is accompanied by improvement in clinical signs and symptoms of the disease. The opposite of the disease modifying effect is symptomatic improvement, where a treatment "may improve symptoms but does not affect the long-term survival or outcome in the disease, for example, use of diuretics in PAH". 

For a drug to be defined as a disease modification therapy (DMT), disease modifier, or disease-modifying agent (DMA), the following criteria need to be met: 
  • a drug targeting the underlying pathophysiology
  • distinction should be made from "symptomatic treatment" (which does not affect the underlying pathophysiology)
  • can achieve the goal of remission (partial or complete)
  • endures sustained clinical benefit (referred to as DMA). 
The last criterion "endures sustained clinical benefit" is difficult to meet. The traditional randomized, controlled, parallel design will not be sufficient. Clinical trial designs like delayed start design and randomized withdrawal design are needed to assess the disease modification effect. 

A new article by Zamanian et al, "Constructing the Framework for Disease Modification in Pulmonary Arterial Hypertension", attempted to define the disease modifier in the PAH field and discussed how to design clinical trials to measure the disease modification effect.

Randomized Start Design or Delayed Start Design

This design was discussed in previous posts: 
Randomized withdrawal design: 

The randomized withdrawal design was extensively discussed in FDA's guidance "Enrichment Strategies for Clinical Trials to Support Determination of Effectiveness of Human Drugs and Biological Products". The following paragraphs are extracted from the guidance. 


The randomized withdrawal design was originally proposed as an approach to enrich the study and reduce the sample size. The randomized withdrawal design (if it is feasible to implement) can be used to evaluate the long-term disease-modifying effect. 

In practice, the randomized withdrawal design can be implemented after the RCT - the responders from both the experimental drug group and the placebo group are re-randomized to receive the experimental drug or placebo. This is exactly what we did in the ICE study ("Intravenous immune globulin (10% caprylate-chromatography purified) for the treatment of chronic inflammatory demyelinating polyradiculoneuropathy (ICE study): a randomised placebo-controlled trial"), where the study contained an RCT portion to assess the treatment response and a re-randomized withdrawal portion to assess the relapse after the study drug withdrawal. 

The delayed start design and randomized withdrawal design have been mentioned frequently as a way to assess the disease modification effect. 

In FDA's guidance "Early Alzheimer’s Disease: Developing Drugs for Treatment", the randomized-start or randomized-withdrawal trial design was suggested:

In the FDA’s webinar on “Draft Guidance For Industry On Alzheimer’s Disease: Developing Drugs For The Treatment Of Early-Stage Disease”, the FDA presenters discussed the randomized start design or withdrawal design: 

“… If there is a significant effective treatment that couldn't serve as the basis of approval, we do not believe that that argument in and of itself does not demonstrate. This is where biomarkers come in. We learned in the trial results, the effect on up Alzheimer disease biomarker, it is still very and clear. Where the biomarker has been altered but there is no clinical effect. The clinical outcome was the opposite of what you want to see. The bottom line is that that understanding and how it relates to the clinical outcome still needs a bit of work. We would not be willing to accept the effect on the biomarker, as a basis for a circuit -- Sarah get approval. For that to be the case, it would be a fundamental itself in the disease process. In addition to biomarkers there are other ways to show disease modification, a randomized start design, or withdrawal design. These are based on clinical endpoints. These are difficult studies to design, and conduct, and interpret. We are open to use these approaches to show modification, let me show you what I mean by randomized start design. One would be on .8, the other will be on placebo, the patients on group to will be switched over to active treatment, patients in group 2 will be caught up to group 1, they will have a systematic effect of treatment. Patients that were switched to never really caught up to the first group, can argue for an effect on the disease, this is challenging to do, but we are open to the approach. It is a devastating condition, and an epidemic make -- particularly in late stages, the field is moving to conduct trials in early stages of the illness. As I pointed out they will pose regulatory challenges. We hope that's where our guidance will come in and suggest pathways forward. Thank you. I will have Russell Katz, come up and talk for the rest of the webinar.”

In one of the EMA presentations “The scientific and regulatory approaches to facilitating disease-modifying drug development and registration in a global environment”, the delayed start design (or randomized start) and randomized withdrawal design were mentioned.

 

In 2011, there was an FDA advisory committee meeting to discuss Teva's Parkinson's drug (rasagiline mesylate) for a disease modification indication. Even though the disease modification claim was voted down, the delayed start design was confirmed to be adequate to evaluate the disease modification effect.

Thursday, December 14, 2023

Defining 'disease modification effect', 'disease modification therapy (DMT)' or 'disease modifier'

The concepts of "disease modification," "disease modification therapy (DMT)," and "disease modifier" have been a focal point in the realm of drug development for chronic diseases. The distinction reaches a heightened significance when a drug under development qualifies as a true disease modifier. Disease modification is a default for acute diseases (for example, acute infections) and for gene therapies, transplants, and some surgeries. The focus of our discussion about disease modification is mainly on chronic diseases. 

Disease modification entails interventions or treatments designed not solely to alleviate symptoms but also to actively influence the trajectory of the disease, effectively impeding or halting its progression.

It's important to note that a unified definition for disease modification does not exist. The nuances of the term, as well as what qualifies as a disease modifier, can vary across different diseases. The understanding and criteria for disease modification may differ, reflecting the intricacies inherent to each specific medical condition.

In a review paper by Vollenhoven et al, "Conceptual framework for defining disease modification in systemic lupus erythematosus: a call for formal criteria", the authors put together a table summarizing various definitions for disease modification in different disease areas: 


As we are doing the clinical trials, the spectrum of treatment response can be listed as the following: 
Harm -> No Response -> Modest Response -> Strong Response -> Disease Modifying -> Cure. For most chronic diseases, the ultimate outcome of a 'cure' may not be achievable. A therapy with a disease modification effect will be desirable. 

There was a proposal to classify the disease modification into five different levels: 
Level 1: Slowing decline
Level 2: Arrest decline
Level 3: Disease improvement
Level 4: Remission
Level 5: Cure

In a review article for Alzheimer's disease, "Trial Designs Likely to Meet Valid Long-Term Alzheimer's Disease Progression Effects: Learning from the Past, Preparing for the Future", the changes in the level of functioning across time were depicted as the following. 'Slowing progression' would be considered as 'disease modification'. 


In the article "Alzheimer’s Disease: Towards a Personalized Polypharmacology Patient-Centered Approach", the following was said about the disease modification therapy in Alzheimer's disease:
A disease-modifying treatment (DMT) is defined as an intervention that produces an enduring change in the clinical progression of AD by interfering with the underlying pathophysiological mechanisms of the disease process that lead to neuronal death. Consequently, a true DMT cannot be established conclusively based on clinical outcome data alone, such a clinical effect must be accompanied by strong supportive evidence from a biomarker program.
In 2011, there was an FDA advisory committee meeting to discuss Teva's Parkinson's drug for a disease modification indication. Even though the disease modification claim was voted down, the FDA briefing book indicated that three hypothesis tests are needed to demonstrate the disease modification effect when analyzing the data from a study with a delayed start design: 
The study was to be analyzed according to three hypotheses, in the following order: 
  • Hypothesis 1: the contrast between the slope of drug and placebo response at Week 36 (using data from weeks 12-36; Linear Mixed Model with random intercept and slope) 
  • Hypothesis 2: the contrast of scores between baseline and Week 72 (Repeated Measures) 
  • Hypothesis 3: a non-inferiority analysis of the slopes of the early-start (ES) and delayed-start (DS) patients from weeks 48-72 (Linear Mixed Model with random intercept and slope) 
The first hypothesis was designed to determine that a difference between treatments emerged in Phase 1, the second hypothesis was designed to determine that there was a difference between ES and DS patients at the end of the study, and the third hypothesis was to determine that an “absolute” difference between the ES and DS patients persisted during Phase 2 (that is, even though a difference between groups at the end of the study might have existed [what was tested by Hypothesis 2], it was important to show that the two groups were not approaching each other). 
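
To make the idea of testing "in the following order" concrete, here is a minimal sketch of a fixed-sequence (hierarchical) testing procedure. It assumes each hypothesis is tested at the full alpha and that testing stops at the first failure; the text above only states that the hypotheses were ordered, so that rule, the function name, and the p-values are illustrative assumptions, not details taken from the briefing book.

```python
from typing import List


def fixed_sequence_test(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Test hypotheses in a pre-specified order, each at the full alpha.

    A hypothesis can only be declared successful if every hypothesis
    before it in the sequence was also successful (assumed rule).
    """
    results = []
    still_testing = True
    for p in p_values:
        passed = still_testing and p < alpha
        results.append(passed)
        still_testing = passed  # once one test fails, all later ones fail
    return results


# Hypothetical p-values for Hypotheses 1, 2 and 3 (not actual trial results).
# For Hypothesis 3 the p-value would come from a non-inferiority test.
print(fixed_sequence_test([0.01, 0.03, 0.20]))   # third hypothesis fails
print(fixed_sequence_test([0.01, 0.20, 0.001]))  # second fails, so third is not claimed
```

Under this rule, a disease modification claim would require all three entries to be True.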
To delve into the realm of disease modification therapy research, it is imperative to establish a standardized definition for disease modification, particularly tailored to the nuances of a specific disease area. This foundational step serves as a compass guiding subsequent investigations. Following the definition, the identification of endpoints to measure the disease modification effect becomes paramount. Given the nuanced nature of disease modification effects, the conventional clinical trial designs may prove insufficient. Hence, a specialized approach involving clinical trial designs (such as delayed start design and randomized withdrawal design) with multiple hypothesis tests becomes a requisite. Such a methodological shift is essential to comprehensively capture and validate the nuanced impacts of disease modification therapies.

Sunday, December 10, 2023

Significance level versus p-value

Sometimes, the significance level and the p-value get mixed up, which is confusing to some non-statisticians. It is not surprising to receive a question or request for a statistician to design a study to obtain a p-value of 0.05 or 0.01. While the significance level and the p-value are closely related, they are used at different stages of the trial - the significance level is used at the study design stage and the p-value is used at the analysis stage.


A significance level is usually set at 0.05 at the study design stage. After the study, the data are analyzed and the p-value is calculated. The p-value is then compared to the pre-specified significance level to determine if the study results are statistically significant. 

If the significance level is set at 0.01 at the study design stage, which can be tempting as a way to avoid doing two pivotal studies, it will set an unnecessarily high bar for declaring the trial successful at the analysis stage. 
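
Because the significance level is fixed at the design stage, it feeds directly into the sample size. The sketch below uses the standard normal-approximation formula for comparing two means with equal allocation, n per arm = 2 * ((z_{1-alpha/2} + z_{power}) * SD / delta)^2. The effect size, standard deviation, and 80% power are assumed values chosen purely for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(delta: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate patients per arm for a two-sided, two-sample comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)          # power = 1 - Type II error rate
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Assumed treatment difference of 0.5 with a standard deviation of 1.0
print(n_per_arm(delta=0.5, sd=1.0, alpha=0.05))  # 63 per arm
print(n_per_arm(delta=0.5, sd=1.0, alpha=0.01))  # 94 per arm
```

With everything else held constant, tightening the significance level from 0.05 to 0.01 raises the required sample size by roughly half in this example.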

"The significance level," "alpha" (α), and "Type I error rate" are essentially referring to the same concept in the context of hypothesis testing. These terms are often used interchangeably and are closely related. Here's a brief explanation of each:

Significance Level (Alpha, α): The significance level is a pre-defined threshold (usually denoted as α) set by the researcher before conducting a statistical test. It represents the maximum acceptable probability of making a Type I error. Common choices for alpha include 0.05 (5%), 0.01 (1%), and others. It determines the level of stringency for the test, where a smaller alpha indicates a more stringent test.

The significance level is just one of the parameters in calculating the sample size during the study design stage. Other parameters include the effect size (assumed treatment difference), the standard deviation, the statistical power (one minus the Type II error rate), and any alpha adjustment due to multiplicity issues or interim analyses.

Alpha (α): Alpha is the symbol used to represent the significance level in statistical notation. When you see α, it's referring to the predetermined threshold for statistical significance.

Type I Error Rate: The Type I error rate is the probability of making a Type I error, which occurs when you reject the null hypothesis when it is actually true. The significance level (alpha) directly relates to the Type I error rate because the significance level sets the limit for how often you are willing to accept such an error. The Type I error rate is typically equivalent to the significance level (alpha), assuming the test is properly conducted.

P-value: The p-value is calculated as part of the statistical analysis after the data has been collected. It measures the strength of the evidence against the null hypothesis based on the collected data. A smaller p-value indicates stronger evidence against the null hypothesis, and a larger p-value suggests weaker evidence.

The p-value measures the strength of evidence against a null hypothesis. The p-value is the probability, under the assumption of no effect or no difference (the null hypothesis), of obtaining a result equal to or more extreme than what was actually observed. The 'p' stands for probability, so a p-value always lies between 0 and 1. Values close to 0 indicate that the observed difference would be unlikely if chance alone were at work, whereas a p-value close to 1 indicates that the observed difference is entirely consistent with chance. If the p-value is low, it provides evidence against the null hypothesis, and the null hypothesis is rejected in favor of the alternative hypothesis (the assumption of an effect or difference). 

The p-value indicates how incompatible the data are with a specified statistical model constructed under a set of assumptions, together with a null hypothesis. The smaller the p-value, the greater the statistical incompatibility of the data with the null hypothesis. When we get a p-value that is greater than the pre-specified significance level, we fail to reject the null hypothesis - it means that there is insufficient evidence to reject it. 
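
The two-step logic (calculate the p-value from the data, then compare it with the pre-specified significance level) can be sketched as follows. The example assumes a test statistic that is approximately standard normal under the null hypothesis; the z values and function names are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Probability, under the null hypothesis, of a result at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Compare the observed p-value with the pre-specified significance level."""
    return p_value < alpha


for z in (0.5, 1.96, 2.5):  # hypothetical test statistics
    p = two_sided_p_value(z)
    print(f"z = {z}: p = {p:.4f}, "
          f"significant at 0.05: {is_significant(p, 0.05)}, "
          f"significant at 0.01: {is_significant(p, 0.01)}")
```

The same observed p-value can be declared significant or not depending on the significance level that was chosen before the data were seen.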

STAT national biotech reporter Damian Garde explains what p-value is:

Even though hypothesis testing and the p-value have been criticized (see a previous post "Retire Statistical Significance and p-value?"), the p-value is still the primary indicator used by the sponsor, regulator, medical community, and pretty much everybody to judge if a clinical trial is successful or not. 

Regulatory approval of a medicinal product depends on more than just a p-value. The approval depends on the totality of the evidence, the magnitude of the treatment difference, clinical significance or clinical meaningfulness, the confidence interval of the estimate, the safety profile, and whether the benefit outweighs the risk.

We have seen cases where a drug was approved even though the result was not statistically significant (i.e., the p-value did not reach the pre-specified significance level). See the previous post "Drugs Approved by FDA Despite Failed Trials or Minimal/Insufficient Data". We have also seen cases where a drug was not approved even though the result was statistically significant. See the article "FDA blocks Alnylam's bid to expand Onpattro label", where approval was denied even though the study results were statistically significant and published in the NEJM ("Patisiran Treatment in Patients with Transthyretin Cardiac Amyloidosis").

In the end, we can't retire the p-value. We rely on the p-value to measure how strong the evidence is. However, we should not be slaves to the p-value. 

Sunday, November 19, 2023

RCTs for Chinese Traditional Medicine and Botanical Drug Development

In the latest issue of JAMA (Journal of the American Medical Association), Yang et al published a paper "Traditional Chinese Medicine Compound (Tongxinluo) and Clinical Outcomes of Patients With Acute Myocardial Infarction The CTS-AMI Randomized Clinical Trial". The CTS-AMI randomized clinical trial is one of the first times that a traditional Chinese medicine has been tested in a large-scale, Western-style clinical trial.

Historically, the efficacy and safety of Chinese Traditional Medicine have not been based on randomized, controlled clinical trials and are, therefore, questioned by many. In the era of evidence-based medicine, researchers in China have started to conduct RCTs for Chinese Traditional Medicine. These RCTs have been published primarily in Chinese-language journals. It is rare for a study like the CTS-AMI RCT to be published in a prominent English-language journal such as JAMA. 

In an article "Traditional Chinese Medicine Proves Effective in Modern Clinical Trial", Dr Matthew Saybolt, a cardiologist with the Hackensack Meridian Jersey Shore University Medical Center, commented on the study: 
"I am not aware of any other large, well-run trials like this studying traditional Chinese medicine. This is a rarely run type of study, and I congratulate the authors for their work and publication in such a prestigious medical journal. The study was well conducted with a large sample size that was well powered to measure the outcomes.
In this trial, there is clearly a benefit to patients treated with this Chinese medicine compound compared to placebo.

A reduction in death, reinfarction or complications after a STEMI is a very exciting finding. We have for some time been trying to bend the curve and improve mortality and complications after STEMI. Any new therapy, if safe, that can accomplish this would be very appealing to patients and physicians alike."

Saybolt said he also observed some weaknesses in the way the study was conducted, one of which was that the participants were entirely Chinese citizens and predominantly male.

"Thus the findings may not be generalizable throughout the world or to women," he said. "Furthermore, the patients were less frequently—compared to the United States, for example—treated with traditional proven medicine after their myocardial infarctions. Therefore, the effect of the Chinese medicine may have been augmented by the lack of patient exposure to proven therapies.

"However, there was equivalent low utilization of these traditional medications in both groups," he continued. "Furthermore, the study drug Chinese medicine compound was composed of multiple plant and insect products. Thus, we do not know which component or combination of components were the active ingredients and what is the correct dose."

If a Chinese Traditional Medicine compound is to be approved in the US or other countries, the RCTs need to be conducted as multi-national clinical trials with a broad patient population. Given that the Chinese traditional medicine compound is extracted from herbs, the drug development program will need to follow regulatory guidance such as the US FDA guidance for industry "Botanical Drug Development". 

In recent years, psilocybin, the primary psychoactive substance in 'magic mushrooms', has been tested in clinical trials to study its effect on major depressive disorder, PTSD, and other conditions. The drug development process for psilocybin (if extracted from magic mushrooms) will need to follow the FDA guidances "Botanical Drug Development" and "Psychedelic Drugs: Considerations for Clinical Investigations".

Monday, November 13, 2023

Operationally seamless design versus inferentially seamless design

The biotech company Aerovate Therapeutics presented a Phase 2b/Phase 3 study of their inhaled imatinib in the treatment of pulmonary arterial hypertension. The study is dubbed the IMPAHCT study and is posted on clinicaltrials.gov. The study used a so-called 'operationally seamless design' to combine the phase 2 and phase 3 studies. 

“This operationally seamless approach to the Phase 2b/Phase 3 clinical trial design for AV-101, with continued enrollment and collection of multiple endpoints, underscores Aerovate’s commitment to making new treatment options available to patients with PAH as soon as possible without compromising safety and scientific rigor,”

A traditional clinical development program consists of phased clinical trials, including at least one phase 2 dose-finding study and at least one phase 3 confirmatory or pivotal study. A 'seamless design' is intended to combine the studies in different phases in the hope of expediting the clinical development. When adaptive clinical trial design was initially introduced, the 'seamless phase 2/3 study
In FDA's final guidance "Adaptive Designs for Clinical Trials of Drugs and Biologics", the terms 'seamless design' and 'adaptive seamless design' were no longer used; instead, such designs were described under the 'Adaptations to Treatment Arm Selection' section. The term 'seamless' occurred only once throughout the guidance, in the following sentence: 

In general, seamless designs that incorporate both dose selection and confirmation of efficacy of a selected dose (based on data from the entire trial) can be considered if the principles outlined in section III (Principles for Adaptive Designs) are followed.

Operationally Seamless (or simply Seamless Design)

A "Seamless Design" combines two separate trials (individual Phase 2 and Phase 3 trials) into one trial. "Operationally seamless design" specifically refers to the strategic and efficient organization of various aspects of the clinical trial process to ensure smooth and effective operations. Clinical trials are complex endeavors involving multiple stages, from protocol development and participant recruitment to data collection, analysis, and reporting. Operationally seamless design in this context aims to optimize these processes for enhanced efficiency and effectiveness.

Key features of operationally seamless design in clinical trials may include:

Separate Statistical Analyses: Statistical analyses for the Phase 2 and Phase 3 trials are kept separate, not combined. The data from the Phase 2 study will not be included in the statistical analysis of the Phase 3 data. Therefore, issues such as multiplicity adjustment and the handling of immature data during the interim analysis may be avoided. 

Integrated Workflows: Streamlining the various stages of the clinical trial, from patient recruitment to data collection and analysis, to minimize delays and improve overall trial efficiency.

Technology Integration: Leveraging technology solutions for data management, patient tracking, and communication to enhance the overall efficiency of the trial. This may involve using electronic data capture (EDC) systems, remote monitoring tools, and other technologies.

Collaboration and Communication: Fostering effective communication and collaboration among different stakeholders, including researchers, sponsors, regulatory bodies, and clinical sites, to ensure a cohesive and coordinated approach.

Patient-Centric Approaches: Implementing strategies that prioritize the experience of trial participants, making it easier for them to participate and comply with the trial requirements. This might involve the use of telemedicine, remote monitoring, or other patient-centric technologies.

Regulatory Compliance: Ensuring that the trial design and operations adhere to regulatory requirements, which helps in avoiding delays and ensuring the validity and reliability of the trial results.

Risk Management: Proactively identifying and managing potential risks throughout the trial to mitigate issues that could impact the overall progress and success of the study.

In summary, operationally seamless design in clinical trials is about creating a well-integrated and efficient process from the planning stages through to the conclusion of the trial. This approach aims to improve the quality of clinical trial data, reduce operational costs, and accelerate the development of new treatments.

The IMPAHCT study mentioned above uses an operationally seamless design, and the study design is described as follows: 

PART 1 (Phase 2b): Part 1, which is the Phase 2b portion of the trial, will assess the safety, tolerability, and efficacy of three twice-daily doses (10, 35, or 70 mg) of AV-101 against placebo and establish an optimal dose for Phase 3. The primary endpoint for this part is change in pulmonary vascular resistance (PVR) after 24 weeks compared to placebo. (note: it is said that approximately 200 patients (50 per arm) will be enrolled in this part of the study)

PART 2 (Intermediate, Phase 3): PART 2 begins immediately following enrollment of the last participant in the Phase 2b part of the trial and signifies the start of enrollment in the Phase 3 trial. Part two uses the same dosing as in the Phase 2b part of the trial with participants randomized across three AV-101 doses and placebo. Enrollment in part two will continue until the optimal AV-101 dose is selected based on results from the Phase 2b analysis.

PART 3 (Phase 3): This part of the trial will start once an optimal dose of AV-101 has been selected based on the Phase 2b results. All patients enrolling during this part of Phase 3 will be randomized to either the optimal dose of AV-101 or placebo. The primary endpoint for Phase 3 is change in six-minute walk distance (6MWD) at 24 weeks for the optimal dose of AV-101 compared to placebo.

To be operationally seamless, the IMPAHCT study employed a PART 2 (Intermediate, Phase 3) portion. During this stage, all patients for the Phase 2b portion of the study have been enrolled but are still being followed until they reach the 24-week endpoint and the interim analyses can be conducted. In the PART 2 (Intermediate, Phase 3) stage, patients are still randomized to one of the three dose arms or placebo. The data collected from the two active arms that are not selected for PART 3 will not be included in the final analyses and will be wasted. For example, if the 35 mg BID dose is selected as the optimal dose for PART 3, the final analyses will compare the 35 mg BID dose with placebo, and the data from the 10 mg BID and 70 mg BID treatment arms will be wasted. Suppose it takes 28 weeks from the randomization of the last patient in Phase 2b until the sponsor establishes the optimal dose for Phase 3. If an additional 100 patients are enrolled into PART 2 (Intermediate, Phase 3) during that time and are allocated equally across the four arms, the data from 50 of these patients (those not in the optimal dose or placebo arm) will be wasted. 
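
The arithmetic behind the wasted patients can be written out in a few lines. This sketch assumes equal allocation across three active arms and one placebo arm during PART 2; the function name and the enrollment figures are hypothetical and used only for illustration.

```python
def unused_part2_patients(n_enrolled: int, n_active_arms: int = 3, n_placebo_arms: int = 1) -> float:
    """Expected number of PART 2 patients randomized to active arms that are later dropped.

    Assumes equal allocation across all arms and that exactly one
    active dose is carried forward into PART 3.
    """
    total_arms = n_active_arms + n_placebo_arms
    unselected_arms = n_active_arms - 1
    return n_enrolled * unselected_arms / total_arms


print(unused_part2_patients(100))  # 50.0 patients excluded from the final analysis
print(unused_part2_patients(160))  # 80.0 if enrollment in PART 2 runs longer
```

The longer it takes to select the dose, the more patients accumulate in arms that will not contribute to the final comparison.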

With the operationally seamless design, efficiency is sacrificed for speed.


Inferentially Seamless Design (Adaptive Seamless Design)

An adaptive seamless design makes use of information (data) from patients enrolled before and after adaptation (pulls together data collected in both the Phase 2 and Phase 3 trials) in the final analysis. 
The primary purpose of using the adaptive seamless design is to combine both the dose selection and confirmation phases into one trial, so information from the learning stage (Phase 2) can be combined with the confirmatory analyses of Phase 3.

"Inferentially seamless design" refers to an approach that emphasizes the seamless integration of data and statistical methodologies to derive meaningful inferences and insights throughout the course of the trial. The goal is to enhance decision-making by continuously analyzing data, drawing inferences, and adapting the trial design based on emerging findings.

Adaptive seamless design is one of the adaptive designs. The typical setting for adaptive seamless design (inferentially seamless design) is to start the study with multiple active dose groups. At the interim analysis at the end of Phase 2 or close to the end of Phase 2, the data monitoring committee will review the unblinded data accumulated so far to select an optimal dose for Phase 3 portion of the study. The phase 2 data from the selected dose group and the placebo group will be included in the final analyses and contribute to the inferential analyses. The appropriate statistical methods need to be applied. For example, Grifols is conducting a phase 2/3 study with adaptive seamless design "Study of the Efficacy and Safety of Immune Globulin Intravenous (Human) Flebogamma® 5% DIF in Patients With Post-polio Syndrome (FORCE)" where a method proposed by Posch et al "Testing and estimation in flexible group sequential designs with adaptive treatment selection" was used to combine the data from phase 2 and phase 3.
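
To give a feel for what "appropriate statistical methods" can look like, the sketch below shows an inverse-normal combination test, one generic way of combining stage-wise p-values with pre-specified weights. It is not the Posch et al procedure or the FORCE analysis itself: the equal weights, the simple Bonferroni adjustment of the stage 1 p-value for the number of candidate doses, the function names, and the p-values are all assumptions made for illustration. Full methods typically embed such a combination test in a closed testing procedure.

```python
from math import sqrt
from statistics import NormalDist

_ND = NormalDist()


def _clip(p: float) -> float:
    """Keep p strictly inside (0, 1) so that inv_cdf is defined."""
    return min(max(p, 1e-12), 1 - 1e-12)


def bonferroni(p: float, n_comparisons: int) -> float:
    """Adjust a stage 1 p-value for having chosen among several doses (assumed adjustment)."""
    return min(1.0, p * n_comparisons)


def inverse_normal_combination(p1: float, p2: float,
                               w1: float = sqrt(0.5), w2: float = sqrt(0.5)) -> float:
    """Combine one-sided stage-wise p-values; weights must satisfy w1^2 + w2^2 = 1."""
    z = w1 * _ND.inv_cdf(1 - _clip(p1)) + w2 * _ND.inv_cdf(1 - _clip(p2))
    return 1 - _ND.cdf(z)


# Hypothetical one-sided p-values for the selected dose versus placebo
p_stage1 = bonferroni(0.02, n_comparisons=3)  # phase 2 portion, three doses compared
p_stage2 = 0.01                               # phase 3 portion, selected dose only
print(f"combined p-value: {inverse_normal_combination(p_stage1, p_stage2):.4f}")
```

The weights are fixed before the trial starts, which is what allows the phase 2 data to count toward the final inference without inflating the Type I error.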

The inferentially seamless design can be illustrated as the following (cited from the paper by Maca et al "Adaptive Seamless Phase II/III Designs— Background, Operational Aspects, and Examples"




Putting them side by side, here is the table to compare the operationally seamless design and the inferentially seamless design: 

| Operationally Seamless Design | Inferentially Seamless Design |
| --- | --- |
| Integration of processes and systems for smooth operation. | Integration of data and insights for seamless decision-making. |
| Not an adaptive design, since the data from the phase 2 study is not included in the inferential analysis. | One type of adaptive design (Adaptations to Treatment Arm Selection). |
| Emphasizes operational efficiency and workflow integration. | Focuses on integrating data and deriving meaningful insights. |
| Data collected from the phase 2 and phase 3 portions of the study is analyzed separately, and separate clinical study reports may be written. Data collected from the phase 2 portion of the study is not included in the final inferential analyses. | Data collected from the selected arms in phase 2 (the selected dose and the placebo arm) is combined with the phase 3 portion of the study and contributes to the final inferential analyses. |
| Overall sample size is larger, since the data from Phase 2 and the data from the unselected arms in Phase 3 will not be used in the final inferential analyses. | Overall sample size is smaller, since the data from some patients in the phase 2 portion of the study are used in the final inferential analyses. |
| No pre-specified rules for dropping the inferior arms or selecting the optimal dose arm for the confirmatory portion of the study. | Pre-specified rules for dropping the inferior arms or selecting the optimal arm for the confirmatory portion of the study. |
| Dose selection is in the hands of the sponsor (the sponsor can unblind the phase 2 portion of the study data upon its completion). | Dose selection is implemented through the data monitoring committee (only they can review the unblinded data). |
| No need to deal with the multiplicity issue, since the Phase 2 data is not used in the final analyses. | The multiplicity issue needs to be considered and alpha needs to be adjusted, since a portion of the phase 2 data will be included in the final analyses. Overrunning issues and the handling of immature data (i.e., incomplete data at the time of the data cut for the interim analysis for patients who haven't reached the study endpoint) need to be considered. |



Additional reading: 

Friday, November 03, 2023

Walk Distance versus Timed Walk - endpoints for measuring patients' function in clinical trials

Drug development has now moved into the patient-focused era. Patient-focused drug development (PFDD) is a systematic approach to help ensure that patients’ experiences, perspectives, needs, and priorities are captured and meaningfully incorporated into drug development and evaluation. With PFDD, the endpoint or outcome measure needs to be meaningful: it should reflect or describe how the patient feels, functions, and survives.

Various tools can be used to measure function. The most commonly used functional measures are probably walk distance and walking speed.

Measuring the walk distance by fixing the time: the six-minute walk test (6MWT) measures the distance a patient can walk in six minutes (6MWD), and the two-minute walk test (2MWT) measures the distance a patient can walk in two minutes (2MWD). 6MWD and 2MWD can provide functional, therapeutic-response, and prognostic data that are valuable in the care of patients with respiratory, cardiac, and neurological diseases. They can also be used as endpoints in clinical trials to evaluate treatment differences.

There is also a 10-minute walk test to measure fatigability and walking economy, but it is not commonly used as a primary efficacy endpoint in clinical trials.

Measuring the time/speed by fixing the distance: the 10-meter Walk Test is a performance measure used to assess walking speed in meters per second over a short distance. It can be employed to determine functional mobility, gait, and vestibular function. The Timed 25 Foot Walk (T25FW) is a quantitative mobility and leg function performance test based on a timed 25-foot walk. The time to complete the 25-foot walk can be used to calculate the walking speed (ft/s).
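To make the fixed-distance calculation concrete, here is a minimal Python sketch that converts a timed walk into a walking speed. The function name `walking_speed` and the patient times are hypothetical and are used only for illustration; they do not come from any study.

```python
def walking_speed(distance: float, seconds: float) -> float:
    """Return walking speed in distance units per second for a fixed-distance walk."""
    if seconds <= 0:
        raise ValueError("walk time must be positive")
    return distance / seconds


# Hypothetical times (assumptions, not study data):
# 25 feet walked in 6.25 seconds, 10 meters walked in 8 seconds.
t25fw_speed_ft_s = walking_speed(25.0, 6.25)   # feet per second
ten_meter_speed_m_s = walking_speed(10.0, 8.0)  # meters per second

print(f"T25FW speed: {t25fw_speed_ft_s:.2f} ft/s")
print(f"10-meter walk speed: {ten_meter_speed_m_s:.2f} m/s")
```

A shorter time over the same distance gives a higher speed, so an improvement shows up as a decrease in walk time or an increase in walking speed.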


Examples/applications:

6MWT/6MWD

The 6MWT is a sub-maximal exercise test used to assess exercise capacity and endurance. The distance covered over a time of 6 minutes is used as the outcome by which to compare changes in performance capacity. There is a specific guideline developed by ATS (American Thoracic Society): Guidelines for the Six-Minute Walk Test.

6MWT/6MWD was the primary efficacy outcome measure in pivotal studies in pulmonary arterial hypertension (PAH) and pulmonary hypertension associated with interstitial lung disease (PH-ILD).

BridgeBio conducted a pivotal study (ATTRibute-CM) to assess the efficacy and safety of acoramidis in the treatment of ATTR-CM (a heart disease). The study had two parts, with 6MWD as the primary efficacy endpoint for part 1:
  • Part 1 of the study failed to demonstrate a treatment difference in 6MWD
  • Part 2 of the study successfully demonstrated a treatment difference in the win ratio for clinical events (deaths and cardiovascular-related hospitalizations)

Alnylam's pivotal study (APOLLO-B study) demonstrated a statistically significant difference in the primary efficacy endpoint of 6MWD at week 52. However, the magnitude of the treatment difference was merely 14.7 meters. The study results were published in the New England Journal of Medicine, and the Advisory Committee voted in favor of approval; however, the FDA declined to approve the drug.

6MWT/6MWD may also be used in neurological diseases, for example: The 6-minute walk test and other endpoints in Duchenne Muscular Dystrophy: longitudinal natural history observations over 48 weeks from a multicenter study


2MWT/2MWD:

Both the 6MWT and the 2MWT are clinical assessments of a patient's functional capacity and endurance, particularly in individuals with cardiopulmonary or musculoskeletal conditions. Compared with 6MWT/6MWD, 2MWT/2MWD is less commonly used in clinical trials. However, the 2MWT is shorter and can be conducted in a smaller space, which makes it more suitable for clinics or confined settings. It is particularly useful when time constraints or physical limitations call for a shorter test, for example in patients who may have difficulty completing a longer test.

Two- and 6-minute walk tests assess walking capability equally in neuromuscular diseases

Grifols conducted a pivotal study to assess the efficacy of IGIV in the treatment of post-polio syndrome and used 2MWT as the primary efficacy endpoint. The study is still ongoing. 

MedDay Pharmaceuticals SA conducted a phase 3 study "MD1003-AMN MD1003 in Adrenomyeloneuropathy" with 2MWD as the primary efficacy endpoint

Adamas Pharmaceuticals conducted a phase 3 study "Safety and Efficacy of ADS-5102 in Multiple Sclerosis Patients With Walking Impairment" where 2MWT was used as a secondary endpoint (T25FW was used as the primary endpoint)

The 10 Metre Walk Test


Sarepta Therapeutics recently released the results of its confirmatory study of gene therapy for the treatment of DMD (Duchenne muscular dystrophy), in which the 10-meter walk test was one of the secondary efficacy endpoints. The 10-meter walk test results are shown here. The treatment differences are expressed in time (seconds). While all treatment differences are statistically significant, their clinical meaningfulness needs to be vetted by experts and regulators.



Timed 25 Foot Walk (T25FW)


The T25FW is a quantitative mobility and leg function performance test based on a timed 25-foot walk. The patient is directed to one end of a clearly marked 25-foot course and is instructed to walk 25 feet as quickly as possible, but safely. Timing starts when the instruction to start is given and ends when the patient has reached the 25-foot mark. The task is immediately administered again by having the patient walk back the same distance. Patients may use assistive devices when doing this task.

The drug AMPYRA® (dalfampridine) was approved for improving walking in adult patients with multiple sclerosis (MS). The drug label stated that the T25FW was the primary efficacy endpoint:

The primary measure of efficacy in both trials was walking speed (in feet per second) as measured by the Timed 25-foot Walk (T25FW), using a responder analysis. A responder was defined as a patient who showed faster walking speed for at least three visits out of a possible four during the double-blind period than the maximum value achieved in the five non-double-blind no treatment visits (four before the double-blind period and one after). 
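The responder definition quoted from the label can be sketched in a few lines of Python. This is only an illustration of the rule as worded above, not the sponsor's analysis code; the function name `is_responder` and all walking speeds are hypothetical.

```python
def is_responder(on_treatment_speeds, off_treatment_speeds, min_visits=3):
    """Apply the responder rule as quoted from the label.

    A patient is a responder if the walking speed (ft/s) at `min_visits` or
    more double-blind visits is faster than the maximum speed recorded at the
    off-treatment visits.
    """
    threshold = max(off_treatment_speeds)
    faster_visits = sum(1 for speed in on_treatment_speeds if speed > threshold)
    return faster_visits >= min_visits


# Hypothetical speeds in ft/s (assumptions, not trial data).
off_treatment = [2.00, 2.10, 1.90, 2.05, 2.00]   # five off-treatment visits
patient_a = [2.30, 2.20, 2.15, 2.00]             # faster at 3 of 4 visits
patient_b = [2.30, 2.05, 2.15, 2.00]             # faster at 2 of 4 visits

print(is_responder(patient_a, off_treatment))
print(is_responder(patient_b, off_treatment))
```

Because the threshold is the patient's own best off-treatment speed, the rule asks for consistent improvement rather than a single good visit.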

Acorda Therapeutics conducted phase 3 studies "Study of Fampridine-SR Tablets in Multiple Sclerosis Patients" and "Study of Oral Fampridine-SR in Multiple Sclerosis" where T25FW was used as the primary efficacy measure.

The paper by Cohen et al "A Phase 3, double-blind, placebo-controlled efficacy and safety study of ADS-5102 (Amantadine) extended-release capsules in people with multiple sclerosis and walking impairment" stated the following:
Walking speed ft/s was used for T25FW since walking speed is more normally distributed as compared to walking time, and is therefore a preferred approach. A 20% change in T25FW is considered a meaningful change in patients with MS


All four measures discussed above (6MWD, 2MWD, 10-meter walk test, T25FW) can be acceptable endpoints for confirmatory trials. Which measure to use in a specific trial depends on the indication and the study population. The endpoint selection should be discussed with the review division of the regulatory agency, such as the FDA.

Friday, October 20, 2023

Human Challenge Study Design in Action - a Dengue Fever vaccine trial

A human challenge study, also known as a controlled human infection model (CHIM), is a type of clinical research study in which healthy volunteers are intentionally exposed to a specific pathogen (such as a virus, bacterium, or parasite) under controlled conditions. The primary goal of these studies is to better understand the pathogen's behavior, the human immune response to it, and to test the effectiveness of potential treatments, vaccines, or preventive measures. Human challenge studies can provide valuable insights into disease progression, immunity, and treatment efficacy in a controlled and ethical manner.

These studies are typically conducted under strict ethical and safety guidelines to minimize the risk to participants. Participants are closely monitored, and their informed consent is obtained. Human challenge studies have been used to study a variety of diseases, including influenza, malaria, Dengue fever, and COVID-19, among others. They play a crucial role in advancing medical and scientific knowledge and can accelerate the development of treatments and vaccines.

A human challenge study was mentioned as an alternative clinical trial design at the beginning of the COVID-19 pandemic when the world was desperate to find an effective and safe vaccine. I wrote an article about this: "Human Challenge Study Design for Covid-19 Vaccine Clinical Trials?"

Just this morning, Janssen issued the announcement "Janssen Announces Promising Antiviral Activity Against Dengue in a Phase 2a Human Challenge Model". The results were from a phase 2a study titled "A Phase 2a, Randomized, Double-blind, Placebo Controlled Trial to Evaluate the Antiviral Activity, Safety, and Pharmacokinetics of Repeated Oral Doses of JNJ-64281802 Against Dengue Serotype 3 Infection in a Dengue Human Challenge Model in Healthy Adult Participants" that was posted on clinicaltrials.gov. Unfortunately, the clinical trial registration did not contain any description of the 'challenge' part, i.e., how the healthy volunteers are exposed to the infectious agent (in this case, the Dengue virus). We will just need to wait for the formal publication of the study to know the details.

In a paper by Porter et al "A human Phase I/IIa malaria challenge trial of a polyprotein malaria vaccine", the full details of the human challenge study, including the 'challenge' part, were discussed. The paper describes the 'sporozoite challenge' administered to the healthy volunteers.

 

Friday, October 13, 2023

Drugs Approved by FDA Despite Failed Trials or Minimal/Insufficient Data

I have been trying to collect cases in which drugs were approved by the FDA despite failed trials or minimal/insufficient data. For drugs treating rare diseases or diseases with unmet medical needs, the FDA may exercise flexibility and approve a drug under loosened criteria.

For diseases with clearly unmet medical needs such as ALS (Amyotrophic Lateral Sclerosis) and Alzheimer's disease, FDA officials have recently emphasized the urgent need for new treatments and pledged to use maximum "regulatory flexibility" when reviewing the NDA/BLA packages. By applying the maximum "regulatory flexibility", FDA has approved some drugs which do not meet the agency's traditional approval standards. Some of the approvals are really controversial and make me wonder if there is any boundary for the maximum "regulatory flexibility". 

The following article on BioSpace.com listed six drugs that earned FDA approval without substantial evidence of effectiveness.

6 Drugs Approved Despite Failed Trials or Minimal Data
  • Ipsen’s Sohonos (palovarotene) for the ultra-rare genetic disease fibrodysplasia ossificans progressiva (FOP)
  • Sarepta’s Elevidys as the first gene therapy for Duchenne muscular dystrophy (DMD)
  • Biogen's Qalsody (tofersen) to treat patients with superoxide dismutase 1 (SOD1)-ALS, a rare subtype of the fatal neurodegenerative disease
  • Biogen and Eisai's Aduhelm (aducanumab) for Alzheimer's disease
  • Jazz Pharmaceuticals and PharmaMar’s Zepzelca (lurbinectedin) for small cell lung cancer (SCLC) that had progressed on or after platinum-based chemotherapy
  • Acadia Pharmaceuticals’ Nuplazid (pimavanserin) to treat hallucinations and delusions associated with psychosis in Parkinson’s disease.
Some of these approvals gave sponsors the false hope that an innovative drug could be approved by the FDA even if the study failed to demonstrate effectiveness, as long as the drug treated a disease with an urgent unmet medical need. The recent story of BrainStorm's ALS drug is exactly such a case.

Friday, October 06, 2023

MCID (Minimal Clinically Important Difference) for 6MWD - how low can we go?

I went back to watch the FDA CRDAC (Cardiovascular and Renal Drugs Advisory Committee) meeting that discussed Alnylam's drug Patisiran for the treatment of ATTR-CM (transthyretin amyloid cardiomyopathy), a rare form of heart disease. The discussion centered on the clinical meaningfulness of the treatment effects on the primary efficacy endpoint of 6MWD (how many meters patients can walk in 6 minutes) and the secondary endpoint of KCCQ, a patient-reported quality-of-life measure.

The sponsor, Alnylam, conducted a phase III study called "APOLLO-B: A Study to Evaluate Patisiran in Participants With Transthyretin Amyloidosis With Cardiomyopathy (ATTR Amyloidosis With Cardiomyopathy)". The study results showed statistically significant differences in 6MWD and in KCCQ total score. However, the magnitude of the treatment differences was very small: 14.7 meters in 6MWD and 3.7 points in KCCQ at month 12.

To judge whether a treatment difference is clinically meaningful, people compare the magnitude of the treatment difference from the study with the MCID (minimal clinically important difference). The MCID is the smallest change in a treatment outcome that individual patients would identify as important and that would indicate a change in the patients' management. The MCID is a patient-centered concept that captures both the magnitude of the improvement and the value patients place on the change. In other words, the MCID is the smallest amount of change in the score of a scale recognized by the patient without considering the side effects and cost.

In the FDA's briefing book for the CRDAC meeting, the FDA cast doubt on Patisiran's efficacy:
The 6MWT, a performance outcome (PerfO), is a practical simple test that measures the distance that a patient can quickly walk on a flat, hard surface in a period of 6 minutes (the 6MWD). It evaluates the global and integrated responses of all the systems involved during exercise. The results of the APOLLO-B trial showed a statistically significant but small treatment effect for the primary efficacy endpoint. Subjects treated with patisiran experienced an average decrease in their 6MWD of 13 m at Month 12 from an average 6MWD of 361 m at baseline, while subjects in the placebo arm experienced an average decrease in their 6MWD of 31 m at Month 12 from an average 6MWD of 375 m at baseline. The change from baseline at Month 12 in 6MWT (Hodges-Lehmann [HL] estimate of median difference) for patisiran vs. placebo was 14.7 m (95% confidence interval [CI] 0.7, 28.7; p-value 0.04). Literature has reported a range of meaningful differences (22 to 90 m) reflective of the heterogeneity in cardiomyopathy patients (Mathai et al. 2012; Shoemaker et al. 2012).
 The KCCQ, a patient-reported outcome (PRO) and a disease-specific measure for HF, is a 23-item self-administered questionnaire developed to measure the patient’s perception of their health status, which includes heart failure symptoms, impact on physical and social function, and how heart failure impacts their quality of life (QOL) within a 2-week recall period. The KCCQ-OSS has a 0-100 transformed score range where higher scores reflect better health status (based on the Physical Limitation, Symptom Frequency, Symptom Burden, Quality of Life and Social Limitations Domain Scores). In the APOLLO-B trial,the treatment effect for the first secondary efficacy endpoint, change from baseline at Month 12 in KCCQ-OSS was small (3.7 points on a 0 to 100 transformed score range; 95% CI 0.2, 7.2; p-value 0.04). On average, subjects treated with patisiran had an increase in KCCQ-OSS of 0.3 points at Month 12 from the average baseline score of 69.8 points, while subjects in the placebo arm had a decrease in KCCQ-OSS of 3.4 points at Month 12 from the average baseline score of 70.3 points.  
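The Hodges-Lehmann (HL) estimate mentioned in the briefing book for 6MWD is, for two independent groups, the median of all pairwise differences between a treated patient's value and a control patient's value. Here is a minimal Python sketch of that calculation; the function name and the change-from-baseline values are hypothetical and are not APOLLO-B data.

```python
from itertools import product
from statistics import median


def hodges_lehmann_shift(treated, control):
    """Two-sample Hodges-Lehmann estimate: median of all (treated - control) differences."""
    return median(t - c for t, c in product(treated, control))


# Hypothetical changes from baseline in 6MWD, in meters (assumptions, not trial data).
treated_changes = [-5, -20, 10, -15]
control_changes = [-30, -40, -10, -35]

estimate = hodges_lehmann_shift(treated_changes, control_changes)
print(f"HL estimate of the location shift: {estimate} m")
```

Because it is built from medians of pairwise differences, the HL estimate is less sensitive to a few extreme values than a difference in means, which is one reason it is often paired with rank-based tests.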

The briefing book and presentation from the sponsor, Alnylam, devoted a lot of effort to arguing that the small, but statistically significant, treatment differences are clinically meaningful.


The sponsor attempted to derive an MCID using KCCQ category as an anchor, based on the data from the study itself (APOLLO-B study).


Not surprisingly, the MCIDs they generated were much smaller (in the range of 7 - 8 meters) than the MCIDs reported in the literature. If the MCID is indeed in the range of 7 - 8 meters, the 14.7 meters (the treatment difference observed in the APOLLO-B study) would be clinically meaningful.
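For readers unfamiliar with anchor-based MCIDs, one common variant takes the average change in the outcome among patients whom the anchor classifies as having changed only slightly. The Python sketch below illustrates that generic idea. It is an assumption on my part that it resembles the sponsor's approach, which may have used different anchor categories and a different method; the function name, category labels, and data are all hypothetical.

```python
from statistics import mean


def anchor_based_mcid(outcome_changes, anchor_categories, target="minimally improved"):
    """Mean outcome change among patients whose anchor category equals `target`."""
    selected = [
        change
        for change, category in zip(outcome_changes, anchor_categories)
        if category == target
    ]
    if not selected:
        raise ValueError(f"no patients in anchor category {target!r}")
    return mean(selected)


# Hypothetical 6MWD changes (meters) and anchor categories (assumptions, not study data).
changes_m = [10, 14, 12, -20, 30, 0]
anchors = [
    "minimally improved",
    "minimally improved",
    "minimally improved",
    "worsened",
    "much improved",
    "no change",
]

print(f"Anchor-based MCID: {anchor_based_mcid(changes_m, anchors)} m")
```

The result depends heavily on how the anchor categories are defined and on the population the data come from, which is why an MCID derived from the trial's own data can differ a lot from the values in the literature.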



During the FDA advisory committee meeting, most of the members were not convinced by the sponsor's presentation defending the clinical meaningfulness of the small treatment difference in 6MWD (about 14 meters). However, the majority of them (9-3) still voted in favor of Patisiran's efficacy and benefit-risk profile.

Pfizer's tafamidis is the only approved drug for the treatment of ATTR-CM. According to the product label, the treatment difference in 6MWD was much larger - 76 meters with 95% confidence interval 58, 94 meters at month 30. 

For the same 6MWD endpoint, the MCID may differ depending on the disease being treated, the patient population, whether patients are receiving background therapies, and so on. However, a treatment difference of about 14 meters is still not convincing as a clinically meaningful one. On a relative scale, 14 meters in patients with a baseline 6MWD of 361 meters is less than 5%. It is difficult to convince people that a treatment difference of less than 5% is clinically meaningful.

I am particularly interested in the MCID of 6MWD in lung diseases (especially pulmonary arterial hypertension).
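The relative-scale argument is simple arithmetic, sketched here in Python using the numbers quoted from the FDA briefing book (a 14.7 m treatment difference, with baseline 6MWD of 361 m in the Patisiran arm and 375 m in the placebo arm). The function name is just for illustration.

```python
def relative_difference_pct(treatment_difference_m: float, baseline_m: float) -> float:
    """Express a treatment difference as a percentage of the baseline 6MWD."""
    return 100.0 * treatment_difference_m / baseline_m


# Values quoted from the FDA briefing book for APOLLO-B.
difference_m = 14.7
pct_of_patisiran_baseline = relative_difference_pct(difference_m, 361)
pct_of_placebo_baseline = relative_difference_pct(difference_m, 375)

print(f"Relative to Patisiran-arm baseline: {pct_of_patisiran_baseline:.1f}%")
print(f"Relative to placebo-arm baseline:   {pct_of_placebo_baseline:.1f}%")
```

Against either baseline, the treatment difference comes out at roughly 4% of the baseline walk distance.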

Anne E. Holland (2014) "An official European Respiratory Society/American Thoracic Society technical standard: field walking tests in chronic respiratory disease" stated
“Available evidence suggests a minimal important difference (MID) of 30 m for the 6MWD in adults with chronic respiratory disease.”
Jude Moutchia (2023) "Minimal Clinically Important Difference in the 6-minute-walk Distance for Patients with Pulmonary Arterial Hypertension" found:
The minimal clinically important difference in the derivation sample was 33 meters (95% confidence interval, 27–38), which was almost identical to that in the validation sample (36 m [95% confidence interval, 29–43]). The minimal clinically important difference did not differ by age, sex, race, pulmonary hypertension etiology, body mass index, use of background therapy, or World Health Organization functional class.

Here is a table of published articles with estimated MCIDs. The MCID was found to be in the range of 20 - 54 meters depending on the indication/disease.

| Study/Article | Indication/Disease | MCID Range (meters) | MCID Midpoint (meters) |
| --- | --- | --- | --- |
| Chan (2015) | ARF | 20 – 30 | 25 |
| du Bois (2011) | IPF | 24 – 45 | 35 |
| Gilbert (2009) | PAH | 41 | 41 |
| Granger (2015) | Lung Cancer | 22 – 42 | 32 |
| Holland (2009) | DPLD/IPF | 29 – 34 | 32 |
| Holland (2010) | COPD | 25 | 25 |
| Mathai (2012) | PAH | 33 | 33 |
| Nathan (2015) | IPF | 22 – 37 | 30 |
| Polkey (2013) | COPD | 30 | 30 |
| Puhan (2008) | COPD | 35 | 35 |
| Puhan (2011) | COPD | 24 – 28 | 26 |
| Redelmeier (1997) | CLD | 54 | 54 |
| Swigris (2010) | IPF | 28 | 28 |
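As a quick check of the table against the APOLLO-B result, the Python sketch below stores each published MCID as a (low, high) range in meters, taken from the table above, and compares the 14.7 m treatment difference with them. Single-value MCIDs are stored with low equal to high; the variable names are just for illustration.

```python
# MCID ranges in meters, copied from the table above.
mcid_ranges_m = {
    "Chan (2015), ARF": (20, 30),
    "du Bois (2011), IPF": (24, 45),
    "Gilbert (2009), PAH": (41, 41),
    "Granger (2015), Lung Cancer": (22, 42),
    "Holland (2009), DPLD/IPF": (29, 34),
    "Holland (2010), COPD": (25, 25),
    "Mathai (2012), PAH": (33, 33),
    "Nathan (2015), IPF": (22, 37),
    "Polkey (2013), COPD": (30, 30),
    "Puhan (2008), COPD": (35, 35),
    "Puhan (2011), COPD": (24, 28),
    "Redelmeier (1997), CLD": (54, 54),
    "Swigris (2010), IPF": (28, 28),
}

observed_difference_m = 14.7  # APOLLO-B treatment difference in 6MWD

lowest_mcid = min(low for low, _ in mcid_ranges_m.values())
highest_mcid = max(high for _, high in mcid_ranges_m.values())
studies_exceeded = [
    study for study, (low, _) in mcid_ranges_m.items() if observed_difference_m >= low
]

print(f"Published MCIDs span {lowest_mcid} - {highest_mcid} m")
print(f"Studies whose lower MCID bound is reached by 14.7 m: {len(studies_exceeded)}")
```

The observed difference falls below even the lowest lower bound in the table (20 meters).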


Latest update: 

In the end, the FDA did not approve Patisiran for the treatment of ATTR-CM because the treatment difference in 6MWD was too small (well below the MCID) and not clinically meaningful, even though the FDA advisory committee voted in favor of Patisiran's benefit and there were no issues with safety or manufacturing.

Alnylam Announces Receipt of Complete Response Letter from U.S. FDA for Supplemental New Drug Application for Patisiran for the Treatment of the Cardiomyopathy of ATTR Amyloidosis

 "In its Complete Response Letter (CRL), the regulator said that Alnylam had not provided enough evidence of the therapy’s benefit in the proposed indication. At the same time, the FDA did not flag any problems with patisiran’s clinical safety, drug quality, manufacturing processes or study conduct.

“The CRL indicated that the clinical meaningfulness of patisiran’s treatment effects for the cardiomyopathy of ATTR amyloidosis had not been established,” according to the company’s announcement. In light of the rejection, Alnylam will no longer work toward an expanded label for Onpattro in the U.S."