Monday, October 14, 2019

A clinical trial with sample size = 1?

In the October issue of the New England Journal of Medicine, Kim and Hu et al. from Boston Children's Hospital published "Patient-Customized Oligonucleotide Therapy for a Rare Genetic Disease", a study with only one patient - the so-called 'N of 1' (or 'N of one') trial.

Drs. Woodcock and Marks from FDA wrote an editorial on this study, "Drug Regulation in the Era of Individualized Therapies".

While the "N of 1" design fits into the paradigm of patient-centric drug development and precision medicine, a sample size of 1 doesn't fit into the current drug development and drug approval process.  We can see this from the editorial by Drs Woodcock and Marks. They raised a long list of questions with no answers: 
In these “N-of-one” situations, 
  • what type of evidence is needed before exposing a human to a new drug? 
  • Even in rapidly progressing, fatal illnesses, precipitating severe complications or death is not acceptable, so what is the minimum assurance of safety that is needed? 
  • How persuasive should the mechanistic or functional data be?
  • How should the dose and regimen be selected?
  • How much characterization of the product should be undertaken? 
  • How should the urgency of the patient’s situation or the number of people who could ultimately be treated affect the decisionmaking process?
  • In addition, how will efficacy be evaluated? At the very least, during the time needed to discover and develop an intervention, quantifiable, objective measures of the patient’s disease status should be identified and tracked, since, in an N-of-one experiment, evaluation of disease trends before and after treatment will usually be the primary method of assessing effectiveness. In this regard, there is precedent for the application of new efficacy measures to the study of small numbers of patients.
In a previous post, "How Low in Sample Size Can We Go? FDA approves ultra-orphan drug on a 4-patient trial", we discussed a case in which FDA approved a drug based on a 4-patient trial - the approval with the smallest sample size I was aware of.

With four patients, we can still do some statistical calculations, for example, mean and standard deviation. With one patient, no statistical calculation is needed. 

The former ASA president, Dr. Barry D. Nussbaum, wrote about a sample size of one in his President's Corner article "Bigger Isn't Always Better When It Comes to Data" - he was being sarcastic:
"While I am thinking in terms of humorous determinations of sample size, I sometimes suggest we should stop at samples of size one, since otherwise variance starts to get in the way. I always have a smile, but sometimes think the audience is taking me seriously."
But the YouTube video clip he suggested was interesting - a dialog between a biologist and a statistician about the sample size. 
"This reminded me of a YouTube video clip many of you may have seen in which a scientist and statistician try to collaborate. The scientist is hung up on a sample of size three since that is what was always used. The clip is humorous, and in reflection, sad as well. Look for yourself at https://goo.gl/9qdfjK"

As discussed in my previous post, "N of 1 Clinical Trial Design and its Use in Rare Disease Studies", the "N of 1" design is really meant for a study with multiple crossovers in the same patient, and usually with at least several such patients - some replication is needed.

Sunday, October 13, 2019

Real-World Evidence for Regulatory Decision Making - Some Updates

This year, real-world data and real-world evidence have been the theme at every conference in the clinical trial and drug development area. Just a couple of months ago, I discussed "Generate Real-World Data (RWD) and Real-World Evidence (RWE) for Regulatory Purposes". Last month, at the annual Regulatory-Industry Statistics Workshop, real-world evidence was again the main topic. Below are some topics (with slides) that were discussed at the workshop.
There is also an upcoming conference, the "Real-World Evidence Conference", this November 2019 in Cambridge, MA.

Where is the real-world data coming from?
Real-world data will mainly come from claims data, EHRs (electronic health records), registries, and perhaps social media. Data from a clinical trial that uses a real-world device (for example, actigraphy/accelerometry to measure the patient's function in the real world) may also be considered real-world data - a different, and perhaps more readily accepted, type of real-world data.

Even though real-world evidence is discussed everywhere, the application and acceptability of real-world data are still limited. Real-world data or real-world evidence is unlikely to replace clinical trials, especially the gold standard of the RCT (randomized, controlled trial). In this DIA discussion between Ms. Kunz and Ms. Mahoney, "Advancing the Use of Real-World Evidence for Regulatory Decision-Making", the potential applications of real-world evidence mentioned were label expansion and fulfilling post-marketing requirements. I would say that real-world evidence may also be applied in regulatory decision making for ultra-rare diseases.

For real-world data, data quality is always an issue and a concern. In a recent presentation, "Leveraging Randomized Designs to Generate RWE", Dr. Bob Temple discussed areas and examples where real-world evidence has been used; he also commented on data quality and precision.


See also two recent articles: "US FDA's Temple On Real-World Evidence: 'I Find The Whole Thing Very Frustrating'" and "Real-World Evidence: Sponsors Look To US FDA Drug Reviews For Potential Pitfalls".

Officials from the European Medicines Agency (EMA) said in an article published recently in Clinical Pharmacology & Therapeutics that adequate statistical methods will be needed to extract, analyze, and interpret real-world data before they can be translated into credible evidence.

Sunday, September 08, 2019

Cox proportional hazards regression model: univariate, multivariate, adjusted, stratified

The Cox proportional hazards regression model goes by different names (Cox model, Cox regression model, proportional hazards model, ...), which can be used interchangeably. The original paper by D.R. Cox, "Regression models and life tables", is one of the most cited papers. Paired with the Kaplan-Meier method (and the log-rank test), the Cox proportional hazards model is the cornerstone of survival analysis and of all analyses with time-to-event endpoints.

With the Cox regression model, we can analyze time-to-event data much as we use linear regression and logistic regression: to compare treatment differences and to investigate explanatory variables. The explanatory variables may also be called independent variables, covariates, confounding factors, ...

Depending on whether the explanatory variables are continuous or categorical, and on how they enter the Cox regression model (as covariates or as stratification factors), the Cox regression model can be fitted in different ways:

• Univariate Cox regression: unstratified and unadjusted
• Multivariate Cox regression: unstratified and adjusted
• Stratified Cox regression: stratified and unadjusted
• Stratified multivariate Cox regression: stratified and adjusted

The variables used in adjusted Cox regression can be categorical or continuous, but the variables used in stratified Cox regression should be categorical.
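
For example, a continuous covariate such as age would first need to be categorized before it could serve as a stratification factor. A minimal sketch (the variable age, the 65-year cut-off, and the data set names are hypothetical; tte, cnsr, and treat are as in the code sections below):
DATA dataset2;
SET dataset;
IF age = . THEN agegrp = .;        * leave missing ages missing;
ELSE IF age < 65 THEN agegrp = 1;  * under 65 years;
ELSE agegrp = 2;                   * 65 years or older;
RUN;

PROC PHREG DATA=dataset2;
MODEL tte*cnsr(1)=treat / TIES=EXACT RISKLIMITS ALPHA=0.05;
STRATA agegrp;
RUN;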

The following notes are compiled from various sources.



Univariate Cox Proportional Hazards Regression Model (Unstratified Unadjusted Proportional Hazards Regression Model)

The unstratified, unadjusted proportional hazards regression model is more commonly called the univariate Cox proportional hazards regression model; the model and its assumptions are:

  h(t, xj1) = h0(t) × exp(β1 xj1)

where
• h(.) is the hazard function as a function of time (relative to the start date), the patient's treatment, and the unknown regression parameter;
• h0(t) is the unspecified baseline hazard function at time t;
• xj1 is the treatment for patient j, coded as 0 (placebo group) or 1 (treatment group); this coding leads to a hazard ratio less than 1 if the estimated effect suggests a lower hazard in the treatment group relative to the placebo group;
• β1 is the unknown regression parameter for the treatment effect on the log scale (the log hazard ratio).
The hazard ratio (HR) is a multiplicative constant λ1 = exp(β1) comparing the hazard function in the treatment group relative to the hazard function in the placebo group (identical to the baseline hazard function). A HR less than 1 indicates decreased hazard for the event of interest in the treatment group compared to the placebo group.

SAS PROC PHREG (with the TIES=EXACT option for the “exact” handling of ties) is used to estimate the hazard ratio based on the partial maximum likelihood function; a Wald-test based two-sided CI is requested by the RISKLIMITS option (additional options controlling the output may be added):
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat / TIES=EXACT RISKLIMITS ALPHA=0.05;
RUN;
* tte represents variable containing event/censoring times;
* cnsr represents censoring variable (0=event, 1=censored);
* treat represents treatment group variable (0=placebo, 1=treatment);
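
As a side note, the hazard ratio and its confidence limits for the treatment comparison can also be requested explicitly with PROC PHREG's HAZARDRATIO statement; a minimal sketch using the same variable names (the label text and ALPHA value are illustrative):
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat / TIES=EXACT;
HAZARDRATIO 'Treatment vs Placebo' treat / ALPHA=0.05 CL=WALD;
RUN;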

Stratified (Unadjusted) Proportional Hazards Regression

The stratified, unadjusted proportional hazards regression model and its assumptions are described by:

  h(t, xjk1, k) = h0k(t) × exp(β1 xjk1)

where
• h(.) is the hazard function as a function of time (relative to the start date), the patient's stratum and treatment, and the unknown regression parameter;
• h0k(t) is the unspecified baseline hazard function for stratum k at time t;
• xjk1 is the treatment for patient j in stratum k, coded as 0 (placebo group) or 1 (treatment group); this coding leads to a hazard ratio less than 1 if the estimated effect suggests a lower hazard in the treatment group relative to the placebo group;
• β1 is the unknown regression parameter for the treatment effect on the log scale (the log hazard ratio).
The hazard ratio is a multiplicative constant λ1 = exp(β1) comparing the hazard functions in the treatment group relative to those in the placebo group. The latter are allowed to vary across strata; however, the hazard ratio is assumed to be common across strata. An HR less than 1 indicates a decreased hazard for the event of interest in the treatment group compared to the placebo group.

SAS PROC PHREG (with the TIES=EXACT option for the “exact” handling of ties) is
used to estimate the hazard ratio based on the partial maximum likelihood function; a Wald-test based two-sided CI is requested by the RISKLIMITS option (additional options controlling the output may be added):
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat / TIES=EXACT RISKLIMITS ALPHA=0.05;
STRATA strat1 .. stratj;
RUN;
* tte represents variable containing event/censoring times;
* cnsr represents censoring variable (0=event, 1=censored);
* treat represents treatment group variable (0=placebo, 1=treatment);
* strat1 to stratj represent stratification variables;
Adjusted Cox Proportional Hazards Regression Model (including Univariate Cox Proportional Hazards Regression Model and Multivariate Cox Proportional Hazards Regression Model)
The purpose of the model is to evaluate the effect of a single factor (univariate) or simultaneously the effects of several factors (multivariate) on survival. In other words, it allows us to examine how the specified factor(s) influence the rate of a particular event happening (e.g., infection, death) at a particular point in time. This rate is commonly referred to as the hazard rate. Predictor variables (or factors) are usually termed covariates in the survival-analysis literature.
The Cox model is expressed through the hazard function, denoted h(t). Briefly, the hazard function can be interpreted as the risk of dying at time t. It can be estimated as follows:

  h(t) = h0(t) × exp(b1x1 + b2x2 + ... + bpxp)

where,
·         t represents the survival time
·         h(t) is the hazard function determined by a set of p covariates (x1,x2,...,xp)
·         the coefficients (b1,b2,...,bp) measure the impact (i.e., the effect size) of covariates.
·         the term h0 is called the baseline hazard. It corresponds to the value of the hazard if all the xi are equal to zero (the quantity exp(0) equals 1). The ‘t’ in h(t) reminds us that the hazard may vary over time.

The Cox model can thus be written as a multiple linear regression of the logarithm of the hazard on the variables xi, with the baseline hazard being an 'intercept' term that varies with time: log h(t) = log h0(t) + b1x1 + b2x2 + ... + bpxp.

The quantities exp(bi) are called hazard ratios (HR). A value of bi greater than zero, or equivalently a hazard ratio greater than one, indicates that as the value of the ith covariate increases, the event hazard increases and thus the length of survival decreases.

Put another way, a hazard ratio above 1 indicates a covariate that is positively associated with the event probability, and thus negatively associated with the length of survival.
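
As a purely illustrative (hypothetical) calculation: if the estimated coefficient for treatment were b1 = -0.36, the hazard ratio would be exp(-0.36) ≈ 0.70, i.e., roughly a 30% lower hazard in the treatment group; if the estimated coefficient for age (per year) were b2 = 0.05, the corresponding hazard ratio would be exp(0.05) ≈ 1.05, i.e., about a 5% higher hazard per additional year of age.
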
      Univariate Cox proportional hazards regression model
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat / TIES=EXACT RISKLIMITS ALPHA=0.05;
RUN;

      Multivariate Cox proportional hazards regression model
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat cov1 .. covk /
TIES=EXACT RISKLIMITS ALPHA=0.05;
RUN;
* tte represents variable containing event/censoring times;
* cnsr represents censoring variable (0=event, 1=censored);
* treat represents treatment group variable (0=placebo, 1=treatment);
* cov1 to covk represent covariates other than treatment group;
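
As a side note, the estimated coefficients and hazard ratios from this model can be saved to a data set for reporting via ODS OUTPUT; a minimal sketch, assuming the standard ParameterEstimates ODS table of PROC PHREG (the output data set name est is arbitrary):
ODS OUTPUT ParameterEstimates=est;   * capture the coefficient/hazard-ratio table;
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat cov1 .. covk / TIES=EXACT RISKLIMITS ALPHA=0.05;
RUN;

PROC PRINT DATA=est;
RUN;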

Stratified Multivariate Cox Regression (Stratified Adjusted Proportional Hazards Regression)

The stratified, adjusted proportional hazards regression model and its assumptions are illustrated by:

  h(t, xjk, k) = h0k(t) × exp(β1 xjk1 + β2 xjk2 + ... + βp xjkp)

where
• h(.) is the hazard function as a function of time (relative to the start date), the patient's stratum and covariates, and the unknown vector of regression parameters;
• h0k(t) is the unspecified baseline hazard function for patients with covariate vector (0, .., 0) in stratum k at time t;
• xjk is the vector of baseline covariates for patient j in stratum k, with xjk1 representing treatment coded as 0 (placebo) or 1 (treatment);
• β is the unknown regression parameter vector on the log scale (β1 represents the log hazard ratio for treatment).

The hazard ratio for the i-th covariate is a multiplicative constant exp(βi) comparing the hazard functions between the levels of the i-th covariate. Baseline hazard functions are allowed to vary across strata without any restrictions; however, the hazard ratio for each covariate (including treatment) is assumed to be common across strata. The estimate of each βi is adjusted for the other covariates in the model.

SAS PROC PHREG (with the TIES=EXACT option for the “exact” handling of ties) is
used to estimate the hazard ratios based on the partial maximum likelihood function; a Wald-test based two-sided CI is requested by the RISKLIMITS option (additional options controlling the output may be added):
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat cov1 .. covk /
TIES=EXACT RISKLIMITS ALPHA=0.05;
STRATA strat1 .. stratj;
RUN;
* tte represents variable containing event/censoring times;
* cnsr represents censoring variable (0=event, 1=censored);
* treat represents treatment group variable (0=placebo, 1=treatment);
* cov1 to covk represent covariates other than treatment group;
* strat1 to stratj represent stratification variables;

Tuesday, September 03, 2019

Unstratified log-rank test and stratified log-rank test


The logrank test, or log-rank test, is a hypothesis test to compare the survival distributions of two samples. It is a nonparametric test and appropriate to use when the data are right skewed and censored (technically, the censoring must be non-informative). It is widely used in clinical trials to establish the efficacy of a new treatment in comparison with a control treatment when the measurement is the time to event (such as the time from initial treatment to a heart attack).

There are an unstratified log-rank test and a stratified log-rank test (both described below). In a clinical trial with stratified randomization, the stratified log-rank test is commonly used, and the stratification factors used for randomization are included in the log-rank test.

Even when the primary analysis of a time-to-event efficacy endpoint uses the stratified log-rank test, the sample size calculation is usually based on the unstratified log-rank test (i.e., the stratification factors are not considered).
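
As a rough, purely illustrative calculation based on Schoenfeld's approximation (the hazard ratio, alpha, and power below are hypothetical), the required number of events under 1:1 randomization is d = 4(z1-α/2 + z1-β)² / (log HR)². For a target hazard ratio of 0.7, two-sided α = 0.05, and 90% power: d = 4 × (1.96 + 1.2816)² / (log 0.7)² ≈ 42.0 / 0.127 ≈ 331 events.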

The log-rank test provides a p-value for comparing the survival distributions of two samples (two treatment groups); however, as described in an earlier post, "Splitting p-value and estimate of the treatment difference", the magnitude of the treatment difference (the hazard ratio) needs to be estimated with Cox proportional hazards regression - a semi-parametric method.



Unstratified log-rank test
Let ST(t) and Sp(t) denote the survival functions for the treatment group and placebo group, respectively.

The null hypothesis
  H0: ST(t) = Sp(t) for all t ≥ 0 “identical survival functions for both treatment groups”
is tested against the one-sided alternative hypothesis
  HA: ST(t) ≥ Sp(t) for all t ≥ 0 and ST(t) > Sp(t) for at least some t > 0
        “survival function in the treatment group superior to the survival function in the placebo group”

The unstratified log-rank test can be conducted by SAS PROC LIFETEST where the STRATA statement includes only the treatment group variable (treat). The TIME statement includes a variable with times to event (TTE) and an indicator variable for right censoring (cnsr) with 1 representing censoring (additional options controlling the output may be added):
PROC LIFETEST DATA=dataset METHOD=KM;
TIME tte*cnsr(1);
STRATA treat;
RUN;

As an output of the procedure, the rank statistic S and its variance Var(S) are obtained. Under the null hypothesis, the test statistic Z = S/sqrt[Var(S)] is approximately normally distributed. The one-sided p-value is therefore obtained from the normally distributed Z statistic, and H0 is rejected if the p-value does not exceed the (nominal) significance level assigned to the test.
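
PROC LIFETEST itself reports the log-rank chi-square with a two-sided p-value; a minimal sketch of deriving the one-sided p-value described above from the ODS output (assuming the usual HomTests table name and that the observed effect is in the direction favoring the treatment group):
ODS OUTPUT HomTests=lrtest;          * capture the test-of-equality-over-strata table;
PROC LIFETEST DATA=dataset METHOD=KM;
TIME tte*cnsr(1);
STRATA treat;
RUN;

DATA onesided;
SET lrtest;
WHERE Test = 'Log-Rank';
z = SQRT(ChiSq);                     * |Z| recovered from the chi-square statistic;
p_one_sided = 1 - PROBNORM(z);       * valid only when the effect favors the treatment group;
RUN;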

Stratified log-rank test
Let ST,k(t) and Sp,k(t) denote the survival functions for the treatment group and the placebo group in stratum k, k = 1, .., K, where K = m1 x m2 x .. x mj and mi denotes the number of categories for stratification factor i, 1 ≤ i ≤ j.

The null hypothesis
  H0: ST,k(t) = Sp,k(t) for all t ≥ 0 and all k ("identical survival functions for both treatment groups in each stratum")
is tested against the one-sided alternative hypothesis
  HA: ST,k(t) ≥ Sp,k(t) for all t ≥ 0 and all k = 1, .., K, and ST,m(t) > Sp,m(t) for at least some t > 0 and some m ("survival function in the treatment group superior to the survival function in the placebo group in at least one stratum")

Log-rank tests are performed for each of the strata separately, obtaining the rank statistics Sk and variances Var(Sk), k = 1, 2, …, K. The stratified log-rank test statistic is constructed as Z = [S1 +…+ SK] / sqrt[Var(S1) +…+ Var(SK)]. Under the null hypothesis, Z is approximately normally distributed. The one-sided stratified log-rank p-value is therefore obtained from the normally distributed Z statistic, and H0 is rejected if the p-value does not exceed the (nominal) significance level assigned to the test.

The stratified log-rank test can be conducted with SAS PROC LIFETEST, where the STRATA statement includes the j strata variables (strat_1, .., strat_j) and the GROUP= option (after the slash) specifies the treatment variable (treat). The TIME statement includes a variable with times to event (TTE) and an indicator variable for right censoring (cnsr) with 1 representing censoring (additional options controlling the output may be added):
PROC LIFETEST DATA=dataset METHOD=KM;
TIME tte*cnsr(1);
STRATA strat_1 .. strat_j / GROUP=treat;
RUN;
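* tte represents variable containing event/censoring times;
* cnsr represents censoring variable (0=event, 1=censored);
* treat represents treatment group variable (0=placebo, 1=treatment);
* strat_1 to strat_j represent stratification variables;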