Friday, December 15, 2023

Randomized start design (RSD), delayed start design, randomized withdrawal design to assess disease modification effect

At the latest Global CardioVascular Clinical Trialists (CVCT) Workshop, one of the topics was "How to assess disease modification in pulmonary arterial hypertension". The academic and industry representatives discussed the definition of disease modification and whether various individual drugs met the criteria as disease modifiers. 

Disease modification requires that the intervention have an impact on the underlying pathology and pathophysiology of the disease. For regulatory purposes, a disease-modifying effect is when an intervention delays the underlying pathological processes and is accompanied by improvement in the clinical signs and symptoms of the disease. The opposite of a disease-modifying effect is symptomatic improvement, which is defined as "may improve symptoms but does not affect the long-term survival or outcome in the disease, for example, use of diuretics in PAH". 

For a drug to be defined as a disease modification therapy (DMT), disease modifier, or disease-modifying agent (DMA), the following criteria need to be met: 
  • the drug targets the underlying pathophysiology
  • a distinction can be made from "symptomatic treatments" (which do not affect the underlying pathophysiology)
  • the drug can achieve the goal of remission (partial or complete)
  • the drug delivers a sustained clinical benefit. 
The last criterion, a sustained clinical benefit, is difficult to meet. The traditional randomized, controlled, parallel-group design will not be sufficient. Clinical trial designs such as the delayed start design and the randomized withdrawal design are needed to assess the disease modification effect. 

A new article by Zamanian et al, "Constructing the Framework for Disease Modification in Pulmonary Arterial Hypertension", attempted to define a disease modifier in the PAH field and discussed how to design clinical trials to measure the disease modification effect.

Randomized Start Design or Delayed Start Design

This design was discussed in previous posts: 
Randomized withdrawal design: 

The randomized withdrawal design was extensively discussed in FDA's guidance "Enrichment Strategies for Clinical Trials to Support Determination of Effectiveness of Human Drugs and Biological Products". The following paragraphs are extracted from the guidance. 


The randomized withdrawal design was originally proposed as an approach to enrich the study population and reduce the sample size. The randomized withdrawal design (when it is feasible to implement) can also be used to evaluate the long-term disease-modifying effect. 

In practice, the randomized withdrawal design can be implemented after the RCT - the responders from both the experimental drug group and the placebo group are re-randomized to receive the experimental drug or placebo. This is exactly what we did in the ICE study ("Intravenous immune globulin (10% caprylate-chromatography purified) for the treatment of chronic inflammatory demyelinating polyradiculoneuropathy (ICE study): a randomised placebo-controlled trial"), where the study contained an RCT portion to assess the treatment response and a re-randomized withdrawal portion to assess relapse after study drug withdrawal. 

The delayed start design and randomized withdrawal design have been mentioned frequently as a way to assess the disease modification effect. 

In FDA's guidance "Early Alzheimer’s Disease: Developing Drugs for Treatment", the randomized-start or randomized-withdrawal trial design was suggested:

In the FDA’s webinar on “Draft Guidance For Industry On Alzheimer’s Disease: Developing Drugs For The Treatment Of Early-Stage Disease”, the FDA presenters discussed the randomized start design or withdrawal design: 

“… If a treatment has a significant effect on a biomarker, we do not believe that that argument in and of itself demonstrates disease modification, and it couldn't serve as the basis of approval. This is where biomarkers come in. We learned from trial results that the effect on an Alzheimer's disease biomarker is still very unclear - there are cases where the biomarker has been altered but there is no clinical effect, or the clinical outcome was the opposite of what you want to see. The bottom line is that our understanding of the biomarkers and how they relate to the clinical outcome still needs a bit of work. We would not be willing to accept the effect on a biomarker as a basis for a surrogate approval. For that to be the case, the biomarker would have to be fundamental to the disease process itself. In addition to biomarkers, there are other ways to show disease modification: a randomized start design, or a withdrawal design. These are based on clinical endpoints. These are difficult studies to design, and conduct, and interpret. We are open to using these approaches to show disease modification. Let me show you what I mean by a randomized start design. One group would be on active treatment, the other would be on placebo; later, the patients in group 2 will be switched over to active treatment. If the patients in group 2 catch up to group 1, the treatment has a symptomatic effect. If the patients that were switched never really catch up to the first group, one can argue for an effect on the disease. This is challenging to do, but we are open to the approach. It is a devastating condition, and an epidemic, particularly in late stages; the field is moving to conduct trials in early stages of the illness. As I pointed out, these trials will pose regulatory challenges. We hope that's where our guidance will come in and suggest pathways forward. Thank you. I will have Russell Katz come up and talk for the rest of the webinar.”

In one of the EMA presentations “The scientific and regulatory approaches to facilitating disease-modifying drug development and registration in a global environment”, the delayed start design (or randomized start) and randomized withdrawal design were mentioned.

 

In 2011, there was an FDA advisory committee meeting to discuss Teva's Parkinson's drug (rasagiline mesylate) for a disease modification indication. Even though the disease modification claim was voted down, the delayed start design was confirmed to be adequate to evaluate the disease modification effect.

Thursday, December 14, 2023

Defining 'disease modification effect', 'disease modification therapy (DMT)' or 'disease modifier'

The concept of "disease modification," "disease modification therapy (DMT)," and "disease modifier" has been a focal point in drug development for chronic diseases. The distinction takes on heightened significance when a drug under development qualifies as a true disease modifier. Disease modification is the default for acute diseases (for example, acute infections) and for gene therapies, transplants, and some surgeries. The focus of our discussion about disease modification is mainly on chronic diseases. 

Disease modification entails interventions or treatments designed not solely to alleviate symptoms but also to actively influence the trajectory of the disease, effectively impeding or halting its progression.

It's important to note that a unified definition for disease modification does not exist. The nuances of the term, as well as what qualifies as a disease modifier, can vary across different diseases. The understanding and criteria for disease modification may differ, reflecting the intricacies inherent to each specific medical condition.

In a review paper by Vollenhoven et al, "Conceptual framework for defining disease modification in systemic lupus erythematosus: a call for formal criteria", the authors put together a table summarizing various definitions of disease modification in different disease areas: 


In clinical trials, the spectrum of treatment response can be listed as follows: 
Harm -> No Response -> Modest Response -> Strong Response -> Disease Modifying -> Cure. For most chronic diseases, the ultimate outcome of a 'cure' may not be achievable. A therapy with a disease modification effect will be desirable. 

There was a proposal to classify disease modification into five different levels: 
Level 1: Slowing decline
Level 2: Arrest decline
Level 3: Disease improvement
Level 4: Remission
Level 5: Cure

In a review article for Alzheimer's disease, "Trial Designs Likely to Meet Valid Long-Term Alzheimer's Disease Progression Effects: Learning from the Past, Preparing for the Future", the changes in the level of functioning across time were depicted as follows. 'Slowing progression' would be considered 'disease modification'. 


In the article "Alzheimer’s Disease: Towards a Personalized Polypharmacology Patient-Centered Approach", the following was said about disease modification therapy in Alzheimer's disease:
A disease-modifying treatment (DMT) is defined as an intervention that produces an enduring change in the clinical progression of AD by interfering with the underlying pathophysiological mechanisms of the disease process that lead to neuronal death. Consequently, a true DMT cannot be established conclusively based on clinical outcome data alone; such a clinical effect must be accompanied by strong supportive evidence from a biomarker program.
In 2011, there was an FDA advisory committee meeting to discuss Teva's Parkinson's drug for a disease modification indication. According to the FDA briefing book, to demonstrate the disease modification effect, three hypothesis tests were needed to analyze the data from the study with a delayed start design (even though the disease modification claim was voted down): 
The study was to be analyzed according to three hypotheses, in the following order: 
  • Hypothesis 1 - the contrast between the slopes of the drug and placebo responses at Week 36 (using data from weeks 12-36; linear mixed model with random intercept and slope) 
  • Hypothesis 2 - the contrast of scores between baseline and Week 72 (repeated measures) 
  • Hypothesis 3 - a non-inferiority analysis of the slopes of the early start (ES) and delayed start (DS) patients from weeks 48-72 (linear mixed model with random intercept and slope) 
The first hypothesis was designed to determine that a difference between treatments emerged in Phase 1, the second hypothesis was designed to determine that there was a difference between ES and DS patients at the end of the study, and the third hypothesis was to determine that an "absolute" difference between the ES and DS patients persisted during Phase 2 (that is, even though a difference between groups at the end of the study might have existed [what was tested by Hypothesis 2], it was important to show that the two groups were not approaching each other). 
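To make the logic of the three contrasts concrete, here is a hedged sketch that simulates a delayed start trial and computes the three quantities on the simulated data. The slopes, visit weeks, and sample size are made up for illustration, and per-subject least-squares slopes stand in for the actual linear mixed models:

```python
import numpy as np

rng = np.random.default_rng(0)
weeks_p1 = np.arange(12, 37, 4)   # Phase 1 analysis window (weeks 12-36)
weeks_p2 = np.arange(48, 73, 4)   # Phase 2 analysis window (weeks 48-72)

def simulate(start, slope, weeks, n=50, sd=1.0):
    """Each row is one subject's scores (higher = worse) at the visit weeks."""
    return start + slope * (weeks - weeks[0]) + rng.normal(0, sd, (n, len(weeks)))

def mean_slope(scores, weeks):
    """Average of per-subject least-squares slopes (points per week)."""
    return np.mean([np.polyfit(weeks, s, 1)[0] for s in scores])

drug_slope, placebo_slope = 0.05, 0.12   # assumed rates of decline

# Phase 1: early start (ES) on drug, delayed start (DS) on placebo
es_p1 = simulate(20.0, drug_slope, weeks_p1)
ds_p1 = simulate(20.0, placebo_slope, weeks_p1)

# Phase 2: both arms now on drug; DS carries forward its accumulated deficit
es_p2 = simulate(20.0 + drug_slope * 36, drug_slope, weeks_p2)
ds_p2 = simulate(20.0 + placebo_slope * 24 + drug_slope * 12, drug_slope, weeks_p2)

h1 = mean_slope(ds_p1, weeks_p1) - mean_slope(es_p1, weeks_p1)  # Phase 1 slope contrast
h2 = ds_p2[:, -1].mean() - es_p2[:, -1].mean()                  # ES/DS gap at Week 72
h3 = abs(mean_slope(ds_p2, weeks_p2) - mean_slope(es_p2, weeks_p2))  # Phase 2 slopes parallel?
```

Under a disease-modifying effect, the placebo arm declines faster in Phase 1 (h1 positive), the delayed start arm never catches up (h2 positive at Week 72), and the two arms decline in parallel once both are treated (h3 near zero) - exactly the pattern the three hypotheses were designed to establish.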
To delve into the realm of disease modification therapy research, it is imperative to establish a standardized definition for disease modification, particularly tailored to the nuances of a specific disease area. This foundational step serves as a compass guiding subsequent investigations. Following the definition, the identification of endpoints to measure the disease modification effect becomes paramount. Given the nuanced nature of disease modification effects, the conventional clinical trial designs may prove insufficient. Hence, a specialized approach involving clinical trial designs (such as delayed start design and randomized withdrawal design) with multiple hypothesis tests becomes a requisite. Such a methodological shift is essential to comprehensively capture and validate the nuanced impacts of disease modification therapies.

Sunday, December 10, 2023

Significance level versus p-value

Sometimes, the significance level and the p-value get mixed up and are confusing to some non-statisticians. It is not surprising to receive a question or a request for a statistician to design a study to obtain a p-value of 0.05 or 0.01. While the significance level and the p-value are closely related, they are used at different stages of a trial - the significance level is used at the study design stage and the p-value is used at the analysis stage.


The significance level is usually set at 0.05 at the study design stage. After the study, the data are analyzed and the p-value is calculated. The p-value is then compared to the pre-specified significance level to determine if the study results are statistically significant. 

If the significance level is instead set at 0.01 at the study design stage (which may be tempting as a way to avoid conducting two pivotal studies), it will set an unnecessarily high bar for declaring a successful trial at the analysis stage. 

"The significance level," "alpha" (α), and "Type I error rate" are essentially referring to the same concept in the context of hypothesis testing. These terms are often used interchangeably and are closely related. Here's a brief explanation of each:

Significance Level (Alpha, α): The significance level is a pre-defined threshold (usually denoted as α) set by the researcher before conducting a statistical test. It represents the maximum acceptable probability of making a Type I error. Common choices for alpha include 0.05 (5%), 0.01 (1%), and others. It determines the level of stringency for the test, where a smaller alpha indicates a more stringent test.

The significance level is just one of the parameters in calculating the sample size during the study design stage. Other parameters include the effect size (the assumed treatment difference), the standard deviation, the statistical power (one minus the type II error rate), and any alpha adjustment due to multiplicity issues, interim analyses, etc.
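As a small illustration of how the significance level feeds into the sample size, here is a sketch of the standard normal-approximation formula for a two-sided, two-sample comparison of means. The function name and the assumed effect size and standard deviation are made up for illustration:

```python
from statistics import NormalDist
import math

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate sample size per arm for a two-sided two-sample comparison
    of means: n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / delta) ** 2."""
    z = NormalDist().inv_cdf
    return math.ceil(2 * ((z(1 - alpha / 2) + z(power)) * sd / delta) ** 2)

n_05 = n_per_group(delta=5, sd=10, alpha=0.05, power=0.80)  # 63 per arm
n_01 = n_per_group(delta=5, sd=10, alpha=0.01, power=0.80)  # 94 per arm
```

Tightening alpha from 0.05 to 0.01 while holding everything else fixed inflates the sample size from 63 to 94 per arm - the practical cost of the "unnecessarily high bar" mentioned above.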

Alpha (α): Alpha is the symbol used to represent the significance level in statistical notation. When you see α, it's referring to the predetermined threshold for statistical significance.

Type I Error Rate: The Type I error rate is the probability of making a Type I error, which occurs when you reject the null hypothesis when it is actually true. The significance level (alpha) directly relates to the Type I error rate because the significance level sets the limit for how often you are willing to accept such an error. The Type I error rate is typically equivalent to the significance level (alpha), assuming the test is properly conducted.
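The link between alpha and the Type I error rate can be checked by simulation: when the null hypothesis is true, a test conducted at alpha = 0.05 should falsely declare significance in roughly 5% of trials. A sketch using a simple two-sample z-test on simulated data (the sample sizes and number of trials are arbitrary choices):

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(1)
alpha = 0.05
n_trials, n = 2000, 30
false_positives = 0
for _ in range(n_trials):
    # Both arms drawn from the same distribution, so the null hypothesis is true
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    se = math.sqrt(stdev(a) ** 2 / n + stdev(b) ** 2 / n)
    z = (mean(a) - mean(b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    false_positives += p < alpha

rate = false_positives / n_trials  # close to alpha = 0.05 by construction
```

The observed false positive rate hovers around 0.05, which is exactly what "the Type I error rate is typically equivalent to the significance level" means in practice.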

P-value: The p-value is calculated as part of the statistical analysis after the data has been collected. It measures the strength of the evidence against the null hypothesis based on the collected data. A smaller p-value indicates stronger evidence against the null hypothesis, and a larger p-value suggests weaker evidence.

The p-value measures the strength of evidence against a null hypothesis. The p-value is the probability, under the assumption of no effect or no difference (the null hypothesis), of obtaining a result equal to or more extreme than what was actually observed. The 'p' stands for probability, and the p-value can take any value between 0 and 1. Values close to 0 indicate that the observed difference is unlikely to be due to chance, whereas a p-value close to 1 suggests that it is highly likely that the difference observed is due to chance. If the p-value is low, it provides evidence against the null hypothesis, and the null hypothesis is rejected in favor of the alternative hypothesis (the assumption of an effect or difference). 

The p-value indicates how incompatible the data are with a specified statistical model constructed under a set of assumptions, together with a null hypothesis. The smaller the p-value, the greater the statistical incompatibility of the data with the null hypothesis. When we get a p-value that is greater than the pre-specified significance level, we fail to reject the null hypothesis - it means that there is insufficient evidence to reject it. 
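Putting the two stages together: the p-value computed at the analysis stage is simply compared against the significance level fixed at the design stage. A minimal sketch for an observed z statistic under the standard normal reference distribution (the statistic values are made up for illustration):

```python
from statistics import NormalDist

def two_sided_p(z):
    """Two-sided p-value for an observed z statistic under the null."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05             # pre-specified at the design stage
p_weak = two_sided_p(1.2)    # about 0.23: greater than alpha, fail to reject
p_strong = two_sided_p(2.8)  # about 0.005: less than alpha, reject the null
```

The decision rule is mechanical - reject when p < alpha - but, as discussed below, the p-value alone does not settle the regulatory question.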

STAT national biotech reporter Damian Garde explains what a p-value is:

Even though hypothesis testing and the p-value have been criticized (see a previous post "Retire Statistical Significance and p-value?"), the p-value is still the primary indicator used by sponsors, regulators, the medical community, and pretty much everybody else to judge whether a clinical trial is successful or not. 
Regulatory approval of a medicinal product depends on more than just a p-value. The approval depends on the totality of the evidence: the magnitude of the treatment difference, the clinical significance or clinical meaningfulness, the confidence interval of the estimate, the safety profile, and whether the benefit outweighs the risk.

We have seen cases in which a drug was approved even though the p-value was not statistically significant (i.e., did not reach the pre-specified significance level); see the previous post "Drugs Approved by FDA Despite Failed Trials or Minimal/Insufficient Data". We have also seen cases in which a drug was not approved even though the p-value was statistically significant; see the article "FDA blocks Alnylam's bid to expand Onpattro label", even though the study results were statistically significant and published in the NEJM ("Patisiran Treatment in Patients with Transthyretin Cardiac Amyloidosis").

In the end, we can't retire the p-value. We rely on the p-value to measure how strong the evidence is. However, we should not be slaves to the p-value.