On Biostatistics and Clinical Trials
CQ's web blog on the issues in biostatistics and clinical trials.
Thursday, February 22, 2024
Advancing Psychedelic Clinical Study Design - Virtual Public Meeting Organized by Reagan-Udall Foundation
Monday, January 15, 2024
Terminal events as intercurrent events in clinical trials
The ICH E9(R1) "Addendum on Estimands and Sensitivity Analysis in Clinical Trials to the Guideline on Statistical Principles for Clinical Trials" discusses intercurrent events and strategies for handling them. Intercurrent events are defined as:
Events occurring after treatment initiation that affect either the interpretation or the existence of the measurements associated with the clinical question of interest. It is necessary to address intercurrent events when describing the clinical question of interest in order to precisely define the treatment effect that is to be estimated.
Terminal events are one kind of intercurrent event. The ICH E9(R1) Addendum does not provide a formal definition of 'terminal events', but it gives examples:
Examples of intercurrent events that would affect the existence of the measurements include terminal events such as death and leg amputation (when assessing symptoms of diabetic foot ulcers), when these events are not part of the variable itself.
In a paper by Siegel et al "The role of occlusion: potential extension of the ICH E9 (R1) Addendum on Estimands and Sensitivity Analysis for Time-to-Event oncology studies", the terminal events were described as the following:
The estimands guidance also introduces the concept of a terminal event. Terminal events prevent the possibility of subsequent measurement. "For terminal events such as death, the variable cannot be measured after the intercurrent event, but neither should these data generally be regarded as missing." There are two examples given in the guidance, death and leg amputation. These examples clarify that terminal events physically prevent subsequent measurement, for any estimand in any study.
Terminality is an objective property of an event which renders further observation physically impossible. If an event is terminal, it is impossible to devise a study that can look beyond it. Indeed there is no meaningful clinical question regarding the treatment effect that manifests after a terminal event.
Terminal events can be defined as events that make the outcome measurement impossible and that are not part of the outcome itself (such as death, or ankle amputation in a trial assessing ankle function). Sometimes the outcome can still be measured after the terminal event, but the measurement is no longer meaningful. For example, in clinical trials of pulmonary diseases with a spirometry measure as the primary outcome, lung transplantation is a terminal event. After lung transplantation, spirometry can still be performed, but the result reflects the transplanted lungs, not the intended measure of the clinical trial endpoint.
Terminal events should be separated into fatal (death, mortality) and non-fatal terminal events (which may be called 'terminal events excluding mortality'). While they are all considered intercurrent events, the strategies for handling the fatal and non-fatal terminal events need to be different.
Strategies for Handling the Fatal Terminal Events
In general, the treatment policy strategy cannot be implemented for intercurrent events that are terminal events, since values for the variable after the intercurrent event do not exist. For example, an estimand based on this strategy cannot be constructed with respect to a variable that cannot be measured due to death.
Composite strategies (or composite variable strategies) are particularly useful for handling fatal terminal events (deaths). The occurrence of the fatal terminal intercurrent event is informative about the effect of the treatment, so it is incorporated into the endpoint. In practice, the outcomes after the fatal terminal intercurrent event cannot be observed and are instead assumed to take the worst values.
With the composite strategy, the terminal intercurrent events will be assigned a failed value. A failed value may be:
- Worst possible measure (for example, 0 for 6MWD and 0 for FEV1 or FVC measures)
- Worst observed value across all subjects at the endpoint visit
- Trimmed means (trimmed means and quantiles were mentioned in ICH E9 addendum training materials)
- The worst change (from baseline) of all subjects plus a random error. The error can be randomly drawn from a normal distribution with a mean of 0 and a variance equal to the residual variance estimated from the mixed model for all observed values of change from baseline
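The failed-value options above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `impute_failed_values`, its arguments, and the convention that lower values are worse (as for 6MWD or FVC) are all hypothetical, and the residual standard deviation is passed in rather than estimated from a mixed model. Trimmed means are not covered.

```python
import random

def impute_failed_values(values, died, method="worst_possible",
                         worst_possible=0.0, residual_sd=0.0, seed=0):
    """Assign a failed value to subjects with a fatal terminal event.

    values: endpoint values (or changes from baseline); None if unobserved.
    died:   True where the subject had the fatal terminal event.
    Assumes lower values are worse (hypothetical convention for this sketch).
    """
    rng = random.Random(seed)
    # Observed values from subjects without the terminal event
    observed = [v for v, d in zip(values, died) if not d and v is not None]
    result = []
    for v, d in zip(values, died):
        if not d:
            result.append(v)                      # keep observed value
        elif method == "worst_possible":
            result.append(worst_possible)         # e.g. 0 for 6MWD
        elif method == "worst_observed":
            result.append(min(observed))          # worst value among all subjects
        elif method == "worst_plus_error":
            # worst observed change plus a random N(0, residual_sd^2) error
            result.append(min(observed) + rng.gauss(0.0, residual_sd))
        else:
            raise ValueError(f"unknown method: {method}")
    return result

# Hypothetical 6MWD values at the endpoint visit; the third subject died
print(impute_failed_values([300.0, 250.0, None], [False, False, True],
                           method="worst_observed"))
```

Which option is appropriate depends on the endpoint and should be pre-specified; the sketch only shows how the assigned values differ mechanically.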
Functional endpoints can be confounded by loss of data because of patient deaths. To address this, FDA recommends sponsors use an analysis method that combines survival and function into a single overall measure, such as the joint rank test. In pivotal clinical trials in ALS, the joint rank test is almost the default method for analyzing the primary efficacy endpoint of the ALSFRS-R. The Joint Rank statistic ranks study participants in each treatment group, first by survival and then by ALSFRS-R score. The Joint Rank can increase power relative to analysis of either ALSFRS-R or survival alone in some circumstances, for example when mortality rates are high.
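To make the ranking idea concrete, here is a minimal sketch of a pairwise joint rank score, assuming a simple rule: between two deaths, the later death fares better; a survivor always fares better than a subject who died; and between two survivors, the better ALSFRS-R change fares better. The field names (`death_month`, `alsfrs_change`) and the data are hypothetical, and the hypothesis test applied to the scores afterwards is not shown.

```python
def compare(a, b):
    """Return +1 if subject a fared better than b, -1 if worse, 0 if tied."""
    a_dead = a["death_month"] is not None
    b_dead = b["death_month"] is not None
    if a_dead and b_dead:
        # both died: later death is the better outcome
        return (a["death_month"] > b["death_month"]) - (a["death_month"] < b["death_month"])
    if a_dead:
        return -1          # a died, b survived
    if b_dead:
        return 1           # a survived, b died
    # both survived: compare change from baseline in ALSFRS-R (higher is better)
    return (a["alsfrs_change"] > b["alsfrs_change"]) - (a["alsfrs_change"] < b["alsfrs_change"])

def joint_rank_scores(subjects):
    """Sum of pairwise comparisons of each subject against all subjects."""
    # comparing a subject with itself contributes 0, so no exclusion is needed
    return [sum(compare(s, other) for other in subjects) for s in subjects]

subjects = [
    {"arm": "active",  "death_month": None, "alsfrs_change": -2.0},
    {"arm": "placebo", "death_month": None, "alsfrs_change": -5.0},
    {"arm": "active",  "death_month": 10.0, "alsfrs_change": None},
    {"arm": "placebo", "death_month": 4.0,  "alsfrs_change": None},
]
scores = joint_rank_scores(subjects)
print(scores)
```

Subjects who died are ranked below all survivors, and ordered among themselves by time of death, so no post-death ALSFRS-R value ever needs to be imputed.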
It is acceptable to use a hypothetical strategy to handle the non-fatal terminal intercurrent events. "Hypothetical strategies: A scenario is envisaged in which the intercurrent event would not occur: the value of the variable to reflect the clinical question of interest is the value which the variable would have taken in the hypothetical scenario defined."
The value to be considered is the one that would have been collected had the patient not experienced the non-fatal terminal event. Outcomes after a non-fatal terminal event do not need to be measured. If they are measured (for example, spirometry after lung transplantation), the measurements can be disregarded and excluded from the analyses. In either case, the outcomes after the non-fatal terminal event are treated as missing values and usually need to be implicitly or explicitly predicted/imputed.
Friday, January 12, 2024
Post-Marketing Requirement (PMR) versus Post-Marketing Commitment (PMC)
Post-approval studies can be classified by FDA as a postmarketing requirement (PMR) or a postmarketing commitment (PMC).
A PMR is a study or clinical trial that an applicant (or sponsor) is required by statute or regulation to conduct postapproval. A PMC is a study or clinical trial that an applicant (or sponsor) agrees in writing to conduct postapproval, but that is not required by statute or regulation. PMRs and PMCs can be issued upon approval of a drug or postapproval, if warranted.
As a result, failure to conduct a PMR would be a violation of the Federal Food, Drug, and Cosmetic Act (FDCA) and/or implementing regulations, subject to enforcement action. Potential enforcement actions can include an FDA Warning Letter, charges under section 505(o)(1) of the FDCA, misbranding charges under section 502(z), or civil monetary penalties. In contrast, failure to conduct a PMC would not be a violation of the FDCA or regulations, and therefore not subject to enforcement action.
The table below compares the features of the PMR versus PMC:
| Feature | Post-Marketing Requirements (PMR) | Post-Marketing Commitments (PMC) |
|---|---|---|
| Definition | Regulatory obligations imposed by authorities | Voluntary commitments made by the sponsor |
| Purpose | Gather additional data on safety, efficacy, etc. | Obtain more information post-approval |
| Enforcement | Mandatory; non-compliance may lead to penalties | Voluntary, but sponsors are expected to fulfill |
| Imposition | Imposed by health regulatory agencies | Made voluntarily by the sponsor during approval |
| Consequences of Non-compliance | Regulatory actions, fines, or product withdrawal | Regulatory actions; may impact marketing authorization |
| Flexibility | Typically less flexible; regulatory mandates | Voluntary, but commitment should be honored |
| Origin | External (regulatory agency) | Internal (sponsor during regulatory approval) |
| Examples | Post-approval safety studies, surveillance | Additional clinical trials, long-term safety studies |
These PMCs were generally agreed upon by FDA and the applicant. Prior to the passage of FDAAA, FDA required postmarketing studies or clinical trials only in the situations described below:
• Subpart H and subpart E accelerated approvals for products approved under 505(b) of the Act or section 351 of the PHS Act, respectively, which require postmarketing studies to demonstrate clinical benefit (21 CFR 314.510 and 601.41, respectively);
• Deferred pediatric studies, where studies are required under section 505B of the Act (21 CFR 314.55(b) and 601.27(b)); and
• Subpart I and subpart H Animal Efficacy Rule approvals, where studies to demonstrate safety and efficacy in humans are required at the time of use (21 CFR 314.610(b)(1) and 601.91(b)(1), respectively).
Is the confirmatory trial after the accelerated approval PMR or PMC?
Is Post-Approval Pregnancy Study PMR or PMC?
"Notably, of the 99 postmarketing pregnancy studies in the 10-year period, all but one were PMRs. The only example of a pregnancy PMC is for Paxlovid, for treatment of COVID-19, which is a distinguishable example because the sponsor committed to this study while the drug was still under an Emergency Use Authorization (EUA), not an NDA."
In general, post-approval pregnancy studies are PMRs, not PMCs.
What are examples of the PMR versus PMC?
Large pharmaceutical companies posted their PMRs and PMCs only for the purpose of transparency. For example, here are the lists of PMRs and PMCs for Amgen and Janssen. These lists provide good examples of the kinds of studies that are conducted as PMRs and PMCs.
Monday, January 01, 2024
Exposure adjusted event rate (EAER) and exposure adjusted incidence rate (EAIR)
Comparing the incidence of AEs between treatment groups is valid and fair if the length of exposure is balanced between the two treatment groups. If the treatment exposure in one group is substantially longer than in the other group, the crude incidence of AEs may give a biased comparison.
"An analysis of the overall rate of serious events and the rate of specific serious events for each treatment group in critical subgroups (e.g., demographic, disease severity, excretory function, concomitant therapy) and by dose. The median duration of exposure should be examined across treatment groups. If there is a substantial difference in exposure across treatment groups, incidence rates should be calculated using person-time exposure in the denominator, rather than number of subjects in the denominator,...by treatment group"
"Analyses should be corrected for differences in drug exposure using person-time in the denominator to calculate mortality rates. If person-time exposure is not included in the submission (ideally, it should be requested at the pre-NDA/pre-BLA meeting), it should be requested as soon as the need is recognized..."
"We recommend that you provide in the BLA descriptive statistics for the number of serious infection episodes per person-year during the period of study observation. Additional information important to our review includes a frequency table giving the number of subjects with 0, 1, 2… serious infections, a description of each serious infection, and summary statistics for the length of observation of each subject."
"Based on our examination of historical data, we believe that a statistical demonstration of a serious infection rate per person-year less than 1.0 is adequate to provide substantial evidence of efficacy. You may test the null hypothesis that the serious infection rate is greater than or equal to 1.0 per person-year at the 0.01 level of significance or, equivalently, the upper one-sided 99% confidence limit would be less than 1.0."
However, in practice, the number of subjects is often used as the numerator in incidence rate calculations. It turns out that either the number of events or the number of subjects can be used as the numerator, depending on how the exposure (person-time) denominator is calculated.
Exposure-adjusted event rate (EAER): The number of events (if a patient has more than one occurrence of the same event, all occurrences are counted) divided by the total time exposed. This is sometimes referred to as person-time absolute rate. Total time exposed is calculated as the sum of each patient’s time in the interval, whether or not the patient experienced the event. The time unit used can be changed (e.g. if the original units are events per person-year, this can easily be converted to events per 100 person-years by multiplying by 100). The exposure time should be based on the same time interval in which any events that occur would be counted.
Exposure-adjusted incidence rate (EAIR): The number of patients with an event divided by the total time at risk for the event. Total time at risk will be calculated as the sum of time from the first dose (or randomisation) to first event for patients who experienced the event and the time during the entire assessment interval for patients who do not experience the event. This is sometimes referred to as the incidence rate or person-time incidence rate. As noted above, we believe the addition of “exposure-adjusted” or “person-time” is beneficial for clarity.
Time-at-risk EAIR considers patients' exposure to a specific AE in quantifying the risk of the AE. It is defined as the number of patients who experienced at least 1 specific AE, divided by the total exposure time (patient-years of exposure [PYE]) in each arm. For patients who experienced the specific AE, exposure time is calculated from the first dose date up to the first AE onset; for patients who did not, it is calculated from the first dose up to the data cutoff (if still on study treatment) or up to the last dose (if study treatment was discontinued).
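The difference between the two rates is easiest to see in a small calculation. The sketch below uses hypothetical patient records and hypothetical field names (`n_events`, `exposure_years`, `time_to_first_event_years`); it follows the definitions above, with EAER counting all events over total exposure and EAIR counting patients with an event over time at risk (censored at the first event).

```python
def eaer(patients, per=100.0):
    """Exposure-adjusted event rate: all events / total exposure time."""
    total_events = sum(p["n_events"] for p in patients)
    total_exposure = sum(p["exposure_years"] for p in patients)
    return per * total_events / total_exposure

def eair(patients, per=100.0):
    """Exposure-adjusted incidence rate: patients with event / time at risk."""
    n_with_event = sum(1 for p in patients if p["n_events"] > 0)
    # time at risk stops at the first event; otherwise it is the full exposure
    time_at_risk = sum(
        p["time_to_first_event_years"] if p["n_events"] > 0 else p["exposure_years"]
        for p in patients
    )
    return per * n_with_event / time_at_risk

# Hypothetical data for one treatment arm
patients = [
    {"n_events": 3, "exposure_years": 2.0, "time_to_first_event_years": 0.5},
    {"n_events": 0, "exposure_years": 1.0, "time_to_first_event_years": None},
    {"n_events": 1, "exposure_years": 2.0, "time_to_first_event_years": 1.5},
]

print(eaer(patients))            # 4 events / 5.0 PY  -> 80.0 per 100 PY
print(round(eair(patients), 1))  # 2 patients / 3.0 PY -> 66.7 per 100 PY
```

Because the first patient had three occurrences of the event, the EAER is higher than the EAIR here; the two rates coincide only when no patient has a repeat event and the time-at-risk and exposure denominators are the same.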
Friday, December 15, 2023
Randomized start design (RSD), delayed start design, randomized withdrawal design to assess disease modification effect
In the latest Global CardioVascular Clinical Trialists (CVCT) Workshop, one of the topics was "How to assess the disease modification in pulmonary arterial hypertension". The academic and industry representatives discussed the definition of disease modification and whether various individual drugs met the criteria for disease modifiers.
Disease modification requires that the intervention have an impact on the underlying pathology and pathophysiology of the disease. For regulatory purposes, a disease modifying effect is when an intervention delays the underlying pathological processes and is accompanied by improvement in clinical signs and symptoms of the disease. The opposite of the disease modifying effect is symptomatic improvement, which is defined as "may improve symptoms but does not affect the long-term survival or outcome in the disease, for example, use of diuretics in PAH".
- a drug targeting the underlying pathophysiology
- distinction should be made from the "symptomatic treatment" (do not affect underlying pathophysiology)
- Can achieve the goal of remission (partial or complete)
- endures sustained clinical benefit (referred to as DMA).
The randomized withdrawal design was originally proposed as an approach to enrich the study and reduce the sample size. The randomized withdrawal design (if it is feasible to implement) can be used to evaluate the long-term disease-modifying effect.
In FDA's guidance "Early Alzheimer’s Disease: Developing Drugs for Treatment", the randomized-start or randomized-withdrawal trial design was suggested:
In the FDA’s webinar on “Draft Guidance For Industry On Alzheimer’s Disease: Developing Drugs For The Treatment Of Early-Stage Disease”, the FDA presenters discussed the randomized start design or withdrawal design:
“… If there is a significant effective treatment that couldn't serve as the basis of approval, we do not believe that that argument in and of itself does not demonstrate. This is where biomarkers come in. We learned in the trial results, the effect on up Alzheimer disease biomarker, it is still very and clear. Where the biomarker has been altered but there is no clinical effect. The clinical outcome was the opposite of what you want to see. The bottom line is that that understanding and how it relates to the clinical outcome still needs a bit of work. We would not be willing to accept the effect on the biomarker, as a basis for a circuit -- Sarah get approval. For that to be the case, it would be a fundamental itself in the disease process. In addition to biomarkers there are other ways to show disease modification, a randomized start design, or withdrawal design. These are based on clinical endpoints. These are difficult studies to design, and conduct, and interpret. We are open to use these approaches to show modification, let me show you what I mean by randomized start design. One would be on .8, the other will be on placebo, the patients on group to will be switched over to active treatment, patients in group 2 will be caught up to group 1, they will have a systematic effect of treatment. Patients that were switched to never really caught up to the first group, can argue for an effect on the disease, this is challenging to do, but we are open to the approach. It is a devastating condition, and an epidemic make -- particularly in late stages, the field is moving to conduct trials in early stages of the illness. As I pointed out they will pose regulatory challenges. We hope that's where our guidance will come in and suggest pathways forward. Thank you. I will have Russell Katz, come up and talk for the rest of the webinar.”
In one of the EMA presentations “The scientific and regulatory approaches to facilitating disease-modifying drug development and registration in a global environment”, the delayed start design (or randomized start) and randomized withdrawal design were mentioned.
Thursday, December 14, 2023
Defining 'disease modification effect', 'disease modification therapy (DMT)' or 'disease modifier'
Disease modification entails interventions or treatments designed not solely to alleviate symptoms but also to actively influence the trajectory of the disease, effectively impeding or halting its progression.
It's important to note that a unified definition for disease modification does not exist. The nuances of the term, as well as what qualifies as a disease modifier, can vary across different diseases. The understanding and criteria for disease modification may differ, reflecting the intricacies inherent to each specific medical condition.
In a review paper by van Vollenhoven et al. "Conceptual framework for defining disease modification in systemic lupus erythematosus: a call for formal criteria", the authors put together a table summarizing various definitions for disease modification in different disease areas:
Level 1: Slowing decline
Level 2: Arrest decline
Level 3: Disease improvement
Level 4: Remission
Level 5: Cure
A disease-modifying treatment (DMT) is defined as an intervention that produces an enduring change in the clinical progression of AD by interfering with the underlying pathophysiological mechanisms of the disease process that lead to neuronal death. Consequently, a true DMT cannot be established conclusively based on clinical outcome data alone, such a clinical effect must be accompanied by strong supportive evidence from a biomarker program.
The study was to be analyzed according to three hypotheses, in the following order:
- Hypothesis 1-the contrast between the slope of drug and placebo response at Week 36 (using data from weeks 12-36; Linear Mixed Model with random intercept and slope)
- Hypothesis 2-the contrast of scores between baseline and Week 72 (Repeated Measures)
- Hypothesis 3-a non-inferiority analysis of the slopes of the ES and DS patients from weeks 48-72 (Linear Mixed Model with random intercept and slope)
The first hypothesis was designed to determine that a difference between treatments emerged in Phase 1, the second hypothesis was designed to determine that there was a difference between ES and DS patients at the end of the study, and the third hypothesis was to determine that an “absolute” difference between the ES and DS patients persisted during Phase 2 (that is, even though a difference between groups at the end of the study might have existed [what was tested by Hypothesis 2], it was important to show that the two groups were not approaching each other).
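The logic of Hypothesis 3 can be illustrated with a simplified calculation. The sketch below is not the linear mixed model described above: it swaps in a two-stage summary-measure approach (an ordinary least-squares slope per patient over weeks 48-72, then a normal-approximation confidence limit for the difference in mean slopes). The function names, the non-inferiority margin, the data, and the convention that a higher score is worse are all assumptions made for illustration.

```python
from statistics import NormalDist, mean, stdev

def ols_slope(weeks, scores):
    """Least-squares slope of scores on weeks for one patient."""
    mw, ms = mean(weeks), mean(scores)
    sxx = sum((w - mw) ** 2 for w in weeks)
    sxy = sum((w - mw) * (s - ms) for w, s in zip(weeks, scores))
    return sxy / sxx

def slope_noninferiority(es_scores, ds_scores, weeks, margin, alpha=0.05):
    """Check that the ES slope is not steeper than the DS slope by >= margin.

    Returns (difference in mean slopes, upper one-sided limit, conclusion).
    Assumes a higher score is worse, so converging groups would show a
    steeper (more positive) slope in the early-start (ES) group.
    """
    es = [ols_slope(weeks, s) for s in es_scores]
    ds = [ols_slope(weeks, s) for s in ds_scores]
    diff = mean(es) - mean(ds)
    se = (stdev(es) ** 2 / len(es) + stdev(ds) ** 2 / len(ds)) ** 0.5
    upper = diff + NormalDist().inv_cdf(1 - alpha) * se
    return diff, upper, upper < margin

# Hypothetical scores at weeks 48, 60, 72 for a few patients per group
weeks = [48, 60, 72]
es_scores = [[20.0, 21.0, 22.5], [18.0, 18.5, 19.5], [25.0, 26.5, 27.0]]
ds_scores = [[23.0, 24.0, 25.5], [21.0, 22.0, 22.5], [27.0, 28.0, 29.5]]
diff, upper, non_inferior = slope_noninferiority(es_scores, ds_scores, weeks, margin=0.15)
print(diff, upper, non_inferior)
```

If the upper limit of the slope difference stays below the margin, the data are consistent with the two groups running in parallel during Phase 2 rather than approaching each other.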
Sunday, December 10, 2023
Significance level versus p-value
Significance Level (Alpha, α): The significance level is a pre-defined threshold (usually denoted as α) set by the researcher before conducting a statistical test. It represents the maximum acceptable probability of making a Type I error. Common choices for alpha include 0.05 (5%), 0.01 (1%), and others. It determines the level of stringency for the test, where a smaller alpha indicates a more stringent test.
Alpha (α): Alpha is the symbol used to represent the significance level in statistical notation. When you see α, it's referring to the predetermined threshold for statistical significance.
Type I Error Rate: The Type I error rate is the probability of making a Type I error, which occurs when you reject the null hypothesis when it is actually true. The significance level (alpha) directly relates to the Type I error rate because the significance level sets the limit for how often you are willing to accept such an error. The Type I error rate is typically equivalent to the significance level (alpha), assuming the test is properly conducted.
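The link between alpha and the Type I error rate can be checked with a small simulation. The sketch below is an illustration under stated assumptions: two arms are drawn from the same normal distribution with known variance 1 (so the null hypothesis is true), a two-sided z-test is applied, and the function name and parameters are hypothetical.

```python
import random
from statistics import NormalDist

def simulate_type1_error(alpha=0.05, n_per_arm=30, n_trials=5000, seed=1):
    """Fraction of null trials in which a two-sided z-test gives p < alpha."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    se = (2.0 / n_per_arm) ** 0.5          # SE of the mean difference, variance 1
    rejections = 0
    for _ in range(n_trials):
        # both arms come from the same distribution: no true treatment effect
        arm_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        arm_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        z = (sum(arm_a) / n_per_arm - sum(arm_b) / n_per_arm) / se
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1                # a Type I error
    return rejections / n_trials

rate = simulate_type1_error()
print(rate)
```

Over many simulated trials, the proportion of false rejections should land close to the chosen alpha. The p-value is computed from each trial's data, while alpha is fixed in advance as the threshold it is compared against.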
Even though hypothesis testing and p-value have been criticized (see a previous post "Retire Statistical Significance and p-value?"), the p-value is still the primary indicator by the sponsor, regulator, medical community, and pretty much everybody to judge if a clinical trial is successful or not.