Monday, November 11, 2019

Breakthrough or Political Pressure: GV-971 and aducanumab for treatment of Alzheimer's Disease


Recently, we saw two apparent breakthroughs in drug development for Alzheimer's disease. The Chinese regulatory authority (NMPA) approved GV-971 (Oligomannate) for the treatment of mild-to-moderate Alzheimer's disease. GV-971 is the first drug approved anywhere in the world for Alzheimer's disease since 2003. While other Alzheimer's drugs have been developed based on the beta-amyloid theory, GV-971 targets a different mechanism. Based on the pre-clinical studies, GV-971 remodels the gut microbiome in a way that reduces the accumulation of neuroinflammatory cells - a pathway that many people are not convinced by. The GV-971 approval has triggered a great deal of discussion.

The GV-971 approval is also a hot topic on the Chinese social network (WeChat). There is support as well as doubt, and even some reports that data in the key paper were handled inappropriately. For example, one comment (translated from Chinese) reads:
"Discussion of GV-971 is already all over the internet, so let me add a few late comments: 1) We cannot demand that GV-971 run a 36-month AD trial just because Biogen ran one for 36 months; Biogen does not get to draw the target. A look at history shows that the FDA approved Aricept and Namenda based on 24-week and 28-week data, respectively. 2) A sudden worsening in the placebo arm toward the end of a trial is not particularly unusual; again, the Aricept and Namenda curves also dropped sharply late in the studies. 3) On the choice of biomarkers or clinical endpoints, I agree with Professor Wang Liming of Zhejiang University: the presence of beta-amyloid or tau is correlated with AD pathogenesis, but causality has never been established. So requiring cerebrospinal fluid biomarkers, while not unreasonable, is not necessarily relevant. Moreover, Chinese patients' families are reportedly strongly opposed to collecting cerebrospinal fluid (an invasive procedure), which could hurt patient compliance. 4) The mechanism of AD is a muddle to begin with; for GV-971, maybe this is something we don't know that we don't know, so keep a somewhat open mind. 5) The drug is extremely safe. If insurance covers it, taking it has only upside and no downside. This is unlike cancer treatment, where there are many good drugs and using a controversial therapy carries an opportunity cost; with AD, not using GV-971 forgoes no treatment opportunity. 6) Given the overall environment in China, I support the approval and insurance coverage. If this is not covered, the coverage goes instead to 'miracle drugs' with Chinese characteristics such as mouse nerve growth factor; better to cover something with at least some scientific basis. 7) Of course, all of the above rests on giving Green Valley and Professor Geng the full benefit of the doubt on credibility. Whether the track record of Green Valley as a drug company and of Professor Geng as a scientist supports the NMPA's trust in the data is another discussion. But since it was approved, the trust has apparently been given, ha ha."
Two weeks ago, we saw the news that Biogen (and Eisai) was planning to submit a new drug application for aducanumab for the treatment of early Alzheimer's disease:
“Biogen plans to pursue regulatory approval for aducanumab, an investigational treatment for early Alzheimer’s disease (AD). The Phase 3 EMERGE Study met its primary endpoint showing a significant reduction in clinical decline, and Biogen believes that results from a subset of patients in the Phase 3 ENGAGE Study who received sufficient exposure to high dose aducanumab support the findings from EMERGE. Patients who received aducanumab experienced significant benefits on measures of cognition and function such as memory, orientation, and language. Patients also experienced benefits on activities of daily living including conducting personal finances, performing household chores such as cleaning, shopping, and doing laundry, and independently traveling out of the home. If approved, aducanumab would become the first therapy to reduce the clinical decline of Alzheimer’s disease and would also be the first therapy to demonstrate that removing amyloid beta resulted in better clinical outcomes.
The decision to file is based on a new analysis, conducted by Biogen in consultation with the FDA, of a larger dataset from the Phase 3 clinical studies that were discontinued in March 2019 following a futility analysis….”
Earlier this year, the same Phase 3 EMERGE and ENGAGE studies had been terminated early for futility based on the recommendation of the data monitoring committee: "BIOGEN AND EISAI TO DISCONTINUE PHASE 3 ENGAGE AND EMERGE TRIALS OF ADUCANUMAB IN ALZHEIMER'S DISEASE". The futility analyses indicated that the primary efficacy endpoints would be unlikely to be met if the studies continued to completion.

Both the GV-971 approval and Biogen's planned submission of aducanumab benefit from the overall environment in the field of drug development for Alzheimer's disease. With so many failed trials and not a single new drug approved in the last 17 years, the pressure is on the regulatory authorities. We can see that the US FDA has been willing to work with sponsors to find a way to bring a new drug to millions of AD patients. This is exemplified by the release of the guidance "Early Alzheimer's Disease: Developing Drugs for Treatment, Guidance for Industry".

Traditionally, approval of an Alzheimer's drug requires positive results from two Phase 3 studies, with each study demonstrating efficacy on the co-primary endpoints: the AD Assessment Scale-cognitive subscale (ADAS-cog) and one of the following: activities of daily living, global severity, or global change ratings. GV-971's approval was based on a single pivotal clinical trial.

GV-971's approval in China has a political dimension. It was approved at a time when China wants to demonstrate that it can develop new, home-grown drugs for deadly diseases, and at a time when no other Alzheimer's drug has succeeded in the last 17 years.
China Alzheimer's Approval Raises Hope But Also Questions
"While the approval in China of a novel algae-derived drug for Alzheimer's appears to be a breakthrough, it has also left some wondering about long-term efficacy and the reasons for the apparent rush to grant the clearance ahead of a major pending filing for the disease in the US."
If Biogen's aducanumab gets the FDA's nod, politics will play a role: FDA verdict on Biogen's Alzheimer's drug likely to be a 'political decision,' analyst says

This is not the first time that politics has played a role in a drug approval. Some examples are listed below:


Monday, October 14, 2019

A clinical trial with sample size = 1?

In the October issue of the New England Journal of Medicine, Kim, Hu, et al. from Boston Children's Hospital published a paper, "Patient-Customized Oligonucleotide Therapy for a Rare Genetic Disease", on their study with only one patient - the so-called 'N of 1' or 'N of one' trial.

Drs. Woodcock and Marks from the FDA wrote an editorial on this study, "Drug Regulation in the Era of Individualized Therapies".

While the "N of 1" design fits into the paradigm of patient-centric drug development and precision medicine, a sample size of 1 doesn't fit into the current drug development and drug approval process.  We can see this from the editorial by Drs Woodcock and Marks. They raised a long list of questions with no answers: 
In these “N-of-one” situations, 
  • what type of evidence is needed before exposing a human to a new drug? 
  • Even in rapidly progressing, fatal illnesses, precipitating severe complications or death is not acceptable, so what is the minimum assurance of safety that is needed? 
  • How persuasive should the mechanistic or functional data be?
  • How should the dose and regimen be selected?
  • How much characterization of the product should be undertaken? 
  • How should the urgency of the patient’s situation or the number of people who could ultimately be treated affect the decisionmaking process?
  • In addition, how will efficacy be evaluated? At the very least, during the time needed to discover and develop an intervention, quantifiable, objective measures of the patient’s disease status should be identified and tracked, since, in an N-of-one experiment, evaluation of disease trends before and after treatment will usually be the primary method of assessing effectiveness. In this regard, there is precedent for the application of new efficacy measures to the study of small numbers of patients.
In a previous post, "How Low in Sample Size Can We Go? FDA approves ultra-orphan drug on a 4-patient trial", we discussed a case that FDA approved a drug based on a 4-patient trial - that was the drug approval with the fewest sample size I was aware of. 

With four patients, we can still do some statistical calculations, for example, a mean and a standard deviation. With one patient, there is essentially no statistical calculation to be done.
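As a toy illustration (not from the original post, using made-up numbers): with four values we can compute a mean and a sample standard deviation, while with one value the sample standard deviation is simply undefined.
import numpy as np

four_patients = np.array([2.1, 3.4, 2.8, 3.0])            # hypothetical measurements
print(four_patients.mean(), four_patients.std(ddof=1))     # mean and sample SD
one_patient = np.array([2.1])
print(one_patient.mean(), one_patient.std(ddof=1))         # sample SD is nan (undefined) when n = 1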

The former ASA president, Dr. Barry D. Nussbaum, wrote about a sample size of one in his President's Corner article "Bigger Isn't Always Better When It Comes to Data" - he was being sarcastic:
"While I am thinking in terms of humorous determinations of sample size, I sometimes suggest we should stop at samples of size one, since otherwise variance starts to get in the way. I always have a smile, but sometimes think the audience is taking me seriously."
But the YouTube video clip he suggested is interesting - a dialog between a biologist and a statistician about sample size.
"This reminded me of a YouTube video clip many of you may have seen in which a scientist and statistician try to collaborate. The scientist is hung up on a sample of size three since that is what was always used. The clip is humorous, and in reflection, sad as well. Look for yourself at https://goo.gl/9qdfjK"

In my previous post "N of 1 Clinical Trial Design and its Use in Rare Disease Studies", "N of 1" design is really meant for a study with multiple crossovers in the same patient, but will at least several sets of patients - some replicates are needed. 

Sunday, October 13, 2019

Real-World Evidence for Regulatory Decision Making - Some Updates

This year, real-world data and real-world evidence are the theme at every conference in the clinical trial and drug development area. Just a couple of months ago, I discussed "Generate Real-World Data (RWD) and Real-World Evidence (RWE) for Regulatory Purposes". Last month, at the annual Regulatory-Industry Statistics Workshop, real-world evidence was again the main topic. Below are some of the topics (with slides) that were discussed at the workshop.
There is also an upcoming conference, "Real-World Evidence Conference", this November 2019 in Cambridge, MA.

Where does real-world data come from?
Real-world data mainly come from claims data, EHRs (electronic health records), registries, and perhaps social media data. Data from a clinical trial using a real-world device (for example, actigraphy/accelerometry to measure a patient's function in the real world) may also be considered real-world data - a different, and perhaps more readily accepted, type of real-world data.

Even though real-world evidence is discussed everywhere, the application and acceptability of real-world data are still limited. Real-world data or real-world evidence is unlikely to replace clinical trials, especially the gold standard RCT (randomized, controlled trial). In this DIA discussion between Ms. Kunz and Ms. Mahoney, "Advancing the Use of Real-World Evidence for Regulatory Decision-Making", the potential applications of real-world evidence mentioned were label expansion and fulfilling post-marketing requirements. I would add that real-world evidence may also be applied in regulatory decision-making for ultra-rare diseases.

For real-world data, data quality is always an issue and a concern. In a recent presentation, "Leveraging Randomized Designs to Generate RWE", Dr. Bob Temple discussed the areas in which, and examples of how, real-world evidence has been used. He also made the following comments about data quality and precision.


See also two recent articles: US FDA's Temple On Real-World Evidence: 'I Find The Whole Thing Very Frustrating' and Real-World Evidence: Sponsors Look To US FDA Drug Reviews For Potential Pitfalls.

Officials from the European Medicines Agency (EMA) said in an article published recently in Clinical Pharmacology & Therapeutics that adequate statistical methods to extract, analyze, and interpret real-world data will be needed before it can be translated into credible evidence.

Sunday, September 08, 2019

Cox proportional hazards regression model: univariate, multivariate, adjusted, stratified

The Cox proportional hazards regression model goes by different names (Cox model, Cox regression model, proportional hazards model, ...), which can be used interchangeably. The original paper by D.R. Cox, "Regression Models and Life-Tables", is one of the most cited papers. Paired with the Kaplan-Meier method (and the log-rank test), the Cox proportional hazards model is the cornerstone of survival analysis and of all analyses with time-to-event endpoints.

With the Cox regression model, we can analyze time-to-event data much as we use linear regression and logistic regression: to compare treatment groups and to investigate explanatory variables. The explanatory variables may also be called independent variables, covariates, confounding factors, ...

Depending on whether the explanatory variables are continuous or categorical, and on how they enter the Cox regression model (as covariates or as stratification factors), the model can be fitted in different ways:

Univariate Cox regression: unstratified and unadjusted
Multivariate Cox regression: unstratified and adjusted
Stratified Cox regression: stratified and unadjusted
Stratified multivariate Cox regression: stratified and adjusted

The variables used for adjustment in a Cox regression can be categorical or continuous, but the variables used for stratification in a stratified Cox regression should be categorical.

The following are compiled from various sources listed below:



Univariate Cox Proportional Hazards Regression Model (Unstratified Unadjusted Proportional Hazards Regression Model)

The unstratified, unadjusted proportional hazards regression model is more commonly called the univariate Cox proportional hazards regression model. The model and its assumptions are:

h(t, xj1) = h0(t) exp(β1 xj1)

where:
h(.) is the hazard function as a function of time (relative to the start date), the patient's treatment, and the unknown regression parameter;
h0(t) is the unspecified baseline hazard function at time t;
xj1 is the treatment for patient j, coded as 0 (placebo group) or 1 (treatment group); this coding leads to a hazard ratio less than 1 if the estimated effect suggests a lower hazard in the treatment group relative to the placebo group;
β1 is the unknown regression parameter for the treatment effect on the log scale (the log hazard ratio).
The hazard ratio (HR) is a multiplicative constant λ1 = exp(β1) comparing the hazard function in the treatment group relative to the hazard function in the placebo group (identical to the baseline hazard function). A HR less than 1 indicates decreased hazard for the event of interest in the treatment group compared to the placebo group.
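For illustration with hypothetical numbers: an estimated β1 of -0.357 corresponds to HR = exp(-0.357) ≈ 0.70, i.e., at any given time the treatment group has roughly a 30% lower instantaneous risk of the event than the placebo group.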

SAS PROC PHREG (with the TIES=EXACT option for the “exact” handling of ties) is used to estimate the hazard ratio based on the partial maximum likelihood function; a Wald-test based two-sided CI is requested by the RISKLIMITS option (additional options controlling the output may be added):
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat / TIES=EXACT RISKLIMITS ALPHA=0.05;
RUN;
* tte represents variable containing event/censoring times;
* cnsr represents censoring variable (0=event, 1=censored);
* treat represents treatment group variable (0=placebo, 1=treatment);
* ALPHA=0.05 (the SAS default) requests a two-sided 95% confidence interval for the hazard ratio;
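For readers outside of SAS, the same univariate model can also be fitted with the Python lifelines package. The sketch below is not part of the original post; it uses simulated data and hypothetical column names (tte, event, treat), and note that lifelines expects an event indicator coded as 1 = event, i.e., the opposite of the cnsr coding above (event = 1 - cnsr).
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(2019)
n = 200
treat = rng.integers(0, 2, size=n)                                      # 0 = placebo, 1 = treatment
event_time = rng.exponential(scale=np.where(treat == 1, 1 / 0.7, 1.0))  # simulated true HR = 0.7
censor_time = rng.exponential(scale=2.0, size=n)
df = pd.DataFrame({
    "tte": np.minimum(event_time, censor_time),                         # event/censoring time
    "event": (event_time <= censor_time).astype(int),                   # 1 = event (i.e., 1 - cnsr)
    "treat": treat,
})

cph = CoxPHFitter()
cph.fit(df, duration_col="tte", event_col="event")                      # treat is the only covariate
cph.print_summary()   # the exp(coef) column for treat is the estimated hazard ratio with its 95% CI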

Stratified (Unadjusted) Proportional Hazards Regression

The stratified, unadjusted proportional hazards regression model and its assumptions are:

h(t, xjk1, k) = h0k(t) exp(β1 xjk1)

where:
h(.) is the hazard function as a function of time (relative to the start date), the patient's stratum and treatment, and the unknown regression parameter;
h0k(t) is the unspecified baseline hazard function for stratum k at time t;
xjk1 is the treatment for patient j in stratum k, coded as 0 (placebo group) or 1 (treatment group); this coding leads to a hazard ratio less than 1 if the estimated effect suggests a lower hazard in the treatment group relative to the placebo group;
β1 is the unknown regression parameter for the treatment effect on the log scale (the log hazard ratio).
The hazard ratio is a multiplicative constant λ1 = exp(β1) comparing the hazard function in the treatment group to the hazard function in the placebo group within each stratum. The baseline hazard functions are allowed to vary across strata; however, the hazard ratio is assumed to be common across strata. An HR less than 1 indicates a decreased hazard for the event of interest in the treatment group compared to the placebo group.

SAS PROC PHREG (with the TIES=EXACT option for the “exact” handling of ties) is used to estimate the hazard ratio based on the partial maximum likelihood function; a Wald-test based two-sided CI is requested by the RISKLIMITS option (additional options controlling the output may be added):
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat / TIES=EXACT RISKLIMITS ALPHA=0.05;
STRATA strat1 .. stratj;
RUN;
* tte represents variable containing event/censoring times;
* cnsr represents censoring variable (0=event, 1=censored);
* treat represents treatment group variable (0=placebo, 1=treatment);
* strat1 to stratj represent stratification variables;
Adjusted Cox Proportional Hazards Regression Model (including Univariate Cox Proportional Hazards Regression Model and Multivariate Cox Proportional Hazards Regression Model)
The purpose of the model is to evaluate the effect of a single factor (univariate) or simultaneously the effects of several factors (multivariate) on survival. In other words, it allows us to examine how specified factor(s) influence the rate of a particular event happening (e.g., infection, death) at a particular point in time. This rate is commonly referred to as the hazard rate. Predictor variables (or factors) are usually termed covariates in the survival-analysis literature.
The Cox model is expressed by the hazard function, denoted by h(t). Briefly, the hazard function can be interpreted as the risk of dying at time t. It can be estimated as follows:

h(t) = h0(t) exp(b1x1 + b2x2 + ... + bpxp)

where,
·         t represents the survival time
·         h(t) is the hazard function determined by a set of p covariates (x1,x2,...,xp)
·         the coefficients (b1,b2,...,bp) measure the impact (i.e., the effect size) of covariates.
·         the term h0 is called the baseline hazard. It corresponds to the value of the hazard if all the xi are equal to zero (the quantity exp(0) equals 1). The ‘t’ in h(t) reminds us that the hazard may vary over time.

The Cox model can be written as a multiple linear regression of the logarithm of the hazard on the variables xi, with the baseline hazard being an ‘intercept’ term that varies with time.
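Written out with the same notation, taking the logarithm of both sides gives:

log h(t) = log h0(t) + b1x1 + b2x2 + ... + bpxp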

The quantities exp(bi) are called hazard ratios (HR). A value of bi greater than zero, or equivalently a hazard ratio greater than one, indicates that as the value of the ith covariate increases, the event hazard increases and thus the length of survival decreases.

Put another way, a hazard ratio above 1 indicates a covariate that is positively associated with the event probability, and thus negatively associated with the length of survival.
      Univariate Cox proportional hazards regression model
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat / TIES=EXACT RISKLIMITS ALPHA=0.05;
RUN;

      Multivariate Cox proportional hazards regression model
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat cov1 .. covk / TIES=EXACT RISKLIMITS ALPHA=0.05;
RUN;
* tte represents variable containing event/censoring times;
* cnsr represents censoring variable (0=event, 1=censored);
* treat represents treatment group variable (0=placebo, 1=treatment);
* cov1 to covk represent covariates other than treatment group;

Stratified Multivariate Cox Regression (Stratified Adjusted Proportional Hazards Regression)

The stratified, adjusted proportional hazards regression model and its assumptions are:

h(t, xjk, k) = h0k(t) exp(β1 xjk1 + β2 xjk2 + ... + βp xjkp)

where:
h(.) is the hazard function as a function of time (relative to the start date), the patient's stratum and covariates, and the unknown vector of regression parameters;
h0k(t) is the unspecified baseline hazard function for patients with covariate vector (0, .., 0) in stratum k at time t;
xjk is the vector of baseline covariates for patient j in stratum k, with xjk1 representing treatment coded as 0 (placebo) or 1 (treatment);
β is the unknown regression parameter vector on the log scale (β1 represents the log hazard ratio for treatment).

The hazard ratio for the i-th covariate is a multiplicative constant exp(βi) comparing the hazard functions between the levels of the i-th covariate. The baseline hazard functions are allowed to vary across strata without any restrictions; however, the hazard ratio for each covariate (including treatment) is assumed to be common across strata. The estimate of each βi is adjusted for the other covariates in the model.

SAS PROC PHREG (with the TIES=EXACT option for the “exact” handling of ties) is used to estimate the hazard ratios based on the partial maximum likelihood function; a Wald-test based two-sided CI is requested by the RISKLIMITS option (additional options controlling the output may be added):
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat cov1 .. covk / TIES=EXACT RISKLIMITS ALPHA=0.05;
STRATA strat1 .. stratj;
RUN;
* tte represents variable containing event/censoring times;
* cnsr represents censoring variable (0=event, 1=censored);
* treat represents treatment group variable (0=placebo, 1=treatment);
* cov1 to covk represent covariates other than treatment group;
* strat1 to stratj represent stratification variables;
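Again outside the original post: a minimal sketch of the stratified, adjusted model with the Python lifelines package on simulated data. The column names (treat, age, region, tte, event) are assumptions for illustration; the strata argument gives each stratum its own baseline hazard, while the coefficients (and hence the hazard ratios) are assumed common across strata.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(2019)
n = 300
df = pd.DataFrame({
    "treat": rng.integers(0, 2, size=n),           # 0 = placebo, 1 = treatment
    "age": rng.normal(70, 8, size=n),              # hypothetical continuous covariate
    "region": rng.integers(0, 3, size=n),          # hypothetical stratification factor
})
lin_pred = -0.35 * df["treat"] + 0.02 * (df["age"] - 70)       # simulated true treatment log HR = -0.35
event_time = rng.exponential(scale=1.0, size=n) / np.exp(lin_pred)
censor_time = rng.exponential(scale=2.0, size=n)
df["tte"] = np.minimum(event_time, censor_time)
df["event"] = (event_time <= censor_time).astype(int)          # 1 = event, 0 = censored

cph = CoxPHFitter()
cph.fit(df, duration_col="tte", event_col="event", strata=["region"])   # treat and age are covariates
cph.print_summary()   # exp(coef) for treat is the covariate-adjusted, stratum-common hazard ratio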

Tuesday, September 03, 2019

Unstratified log-rank test and stratified log-rank test


The logrank test, or log-rank test, is a hypothesis test to compare the survival distributions of two samples. It is a nonparametric test and appropriate to use when the data are right skewed and censored (technically, the censoring must be non-informative). It is widely used in clinical trials to establish the efficacy of a new treatment in comparison with a control treatment when the measurement is the time to event (such as the time from initial treatment to a heart attack).

There are unstratified and stratified versions of the log-rank test (as described below). In a clinical trial with stratified randomization, the stratified log-rank test is commonly used, with the stratification factors from the randomization included in the test.

Even for a study with a time-to-event primary efficacy endpoint, the sample size calculation is usually based on the unstratified log-rank test (i.e., the stratification factors are not considered).

The log-rank test provides a p-value for comparing the survival distributions of two samples (e.g., two treatment groups); however, as described in an earlier post, "Splitting p-value and estimate of the treatment difference", the magnitude of the treatment difference (the hazard ratio) needs to be estimated using Cox proportional hazards regression - a semi-parametric method.



Unstratified log-rank test
Let ST(t) and Sp(t) denote the survival functions for the treatment group and placebo group, respectively.

The null hypothesis
  H0: ST(t) = Sp(t) for all t ≥ 0 “identical survival functions for both treatment groups”
is tested against the one-sided alternative hypothesis
  HA: ST(t) ≥ Sp(t) for all t ≥ 0 and ST(t) > Sp(t) for at least some t > 0
        “survival function in the treatment group superior to the survival function in the placebo group”

The unstratified log-rank test can be conducted by SAS PROC LIFETEST where the STRATA statement includes only the treatment group variable (treat). The TIME statement includes a variable with times to event (TTE) and an indicator variable for right censoring (cnsr) with 1 representing censoring (additional options controlling the output may be added):
PROC LIFETEST DATA=dataset METHOD=KM;
TIME tte*cnsr(1);
STRATA treat;
RUN;

As output of the procedure, the log-rank statistic S and its variance Var(S) are obtained. Under the null hypothesis, the test statistic Z = S/sqrt[Var(S)] is approximately normally distributed. The one-sided p-value is therefore obtained from the normally distributed Z statistic, and H0 is rejected if the p-value does not exceed the (nominal) significance level assigned to the test.
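Not from the original post: a minimal sketch of the two-sample (unstratified) log-rank test with the Python lifelines package on simulated data; note that lifelines reports the chi-square form of the statistic (the square of Z) with a two-sided p-value.
import numpy as np
from lifelines.statistics import logrank_test

rng = np.random.default_rng(2019)
n = 100
# simulate observed times and event indicators for two groups (simulated true HR = 0.7)
ev_trt, ev_pbo = rng.exponential(1 / 0.7, n), rng.exponential(1.0, n)
cn_trt, cn_pbo = rng.exponential(2.0, n), rng.exponential(2.0, n)
t_trt, e_trt = np.minimum(ev_trt, cn_trt), (ev_trt <= cn_trt).astype(int)
t_pbo, e_pbo = np.minimum(ev_pbo, cn_pbo), (ev_pbo <= cn_pbo).astype(int)

result = logrank_test(t_trt, t_pbo, event_observed_A=e_trt, event_observed_B=e_pbo)
print(result.test_statistic, result.p_value)   # chi-square statistic and two-sided p-value
# the one-sided test described above uses the signed Z = S / sqrt(Var(S)) instead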

Stratified log-rank test
Let ST,k(t) and Sp,k(t) denote the survival functions for the treatment group and placebo group in stratum k, k = 1, .., K, where K = m1 x m2 x .. x mj and mi denotes the number of categories for stratification factor i, 1 ≤ i ≤ j. The null hypothesis
  H0: ST,k(t) = Sp,k(t) for all t ≥ 0 and all k (“identical survival functions for both treatment groups in each stratum”) is tested against the one-sided alternative hypothesis
  HA: ST,k(t) ≥ Sp,k(t) for all t ≥ 0 and all k=1,..,K and ST,m(t) > Sp,m(t) for at least some t > 0 and some m (“survival function in the treatment group superior to the survival in the placebo group in at least one stratum”)

Log-rank tests are performed for each of the strata separately, obtaining the rank statistics Sk and variances Var(Sk), k = 1, 2, …, K. The stratified log-rank test statistic is constructed as Z = [S1 + … + SK] / sqrt[Var(S1) + … + Var(SK)]. Under the null hypothesis, Z is approximately normally distributed. The one-sided stratified log-rank p-value is therefore obtained from the normally distributed Z statistic, and H0 is rejected if the p-value does not exceed the (nominal) significance level assigned to the test.
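To make this construction concrete, here is a small Python sketch (not from the original post) that computes Sk and Var(Sk) within each stratum and combines them into the stratified Z statistic. The column names (tte, cnsr, treat, stratum) are assumptions for illustration, with cnsr coded as in the SAS examples (1 = censored) and treat coded 0/1.
import numpy as np
import pandas as pd

def logrank_components(d):
    # S = observed minus expected events in the treatment group; V = its (hypergeometric) variance
    S, V = 0.0, 0.0
    for t in np.sort(d.loc[d["cnsr"] == 0, "tte"].unique()):
        at_risk = d["tte"] >= t
        n = at_risk.sum()                                     # total number at risk at time t
        n1 = (at_risk & (d["treat"] == 1)).sum()              # number at risk in the treatment group
        events = (d["tte"] == t) & (d["cnsr"] == 0)
        e = events.sum()                                      # events at time t
        e1 = (events & (d["treat"] == 1)).sum()               # events in the treatment group at time t
        S += e1 - e * n1 / n                                  # observed - expected
        if n > 1:
            V += e * (n1 / n) * (1 - n1 / n) * (n - e) / (n - 1)
    return S, V

def stratified_logrank_z(df):
    parts = [logrank_components(g) for _, g in df.groupby("stratum")]
    S = sum(s for s, _ in parts)
    V = sum(v for _, v in parts)
    return S / np.sqrt(V)   # Z = [S1 + ... + SK] / sqrt[Var(S1) + ... + Var(SK)], approximately N(0,1) under H0

# usage sketch with a toy DataFrame
toy = pd.DataFrame({"tte": [3, 5, 5, 7, 2, 8, 4, 6],
                    "cnsr": [0, 0, 1, 0, 0, 1, 0, 0],
                    "treat": [1, 0, 1, 0, 1, 0, 1, 0],
                    "stratum": [1, 1, 1, 1, 2, 2, 2, 2]})
print(stratified_logrank_z(toy))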

The stratified log-rank test can be conducted with SAS PROC LIFETEST, where the STRATA statement includes the j strata variables (strat_1, .., strat_j) and the GROUP= option specifies the treatment variable (treat). The TIME statement includes a variable with times to event (tte) and an indicator variable for right censoring (cnsr), with 1 representing censoring (additional options controlling the output may be added):
PROC LIFETEST DATA=dataset METHOD=KM;
TIME tte*cnsr(1);
STRATA strat_1 .. strat_j GROUP=treat;
RUN;