Cox proportional hazards regression model has been called different names (Cox model,
Cox regression model, Proportional hazards model, ... can be used interchangeably). The original paper by D.R. Cox "
Regression models and life tables" is one of the most cited papers. Paired with the Kaplan-Meier method (and the log-rank test), the Cox proportional hazards model is the cornerstone for the survival analyses or all analyses with time to event endpoints.
With Cox regression model, we are now able to analyze the time to event data just like the linear regressions and logistic regressions to compare the treatment difference and to investigate the explanatory variables. The explanatory variables may be called independent variables, covariates, confounding factors,...
Depending on whether the explanatory variables are continuous or categorical and depending on how the explanatory variables are used in the Cox regression model (as covariates, as stratification factors), Cox regression model can be fitted in different ways:
Univariate Cox regression
|
Unstratified and unadjusted
|
Multivariate Cox regression
|
Unstratified and adjusted
|
Stratified Cox regression
|
Stratified and unadjusted
|
Stratified Multivariate Cox regression
|
Stratified and adjusted
|
The variables used in adjusted Cox regression can be categorical or continuous, but the variables used in stratified Cox regression should be categorical.
The following are compiled from various sources listed below:
Univariate Cox Proportional Hazards Regression Model (Unstratified Unadjusted Proportional Hazards Regression Model)
The
unstratified, unadjusted proportional hazards regression model is more commonly
called Univariate Cox proportional hazards regression model and its assumptions
are illustrated by:
h (.) hazard
function as a function of time (relative to the start date), the patient’s
treatment
and the unknown regression parameter
h0(t) unspecified
baseline hazard function at time t
xj1
treatment for patient j coded as 0 (placebo group) or 1 (treatment group); this coding
leads to a hazard ratio less than 1 if the estimated effect suggests a lower hazard
in the treatment group relative to the placebo group
β1
unknown regression parameter for the treatment effect on the log-scale (the log
hazard ratio).
The
hazard ratio (HR) is a multiplicative constant λ1 = exp(β1)
comparing the hazard function in the treatment group relative to the hazard
function in the placebo group (identical to the baseline hazard function). A HR less than 1 indicates decreased hazard for the event of interest in the treatment
group compared to the placebo group.
SAS
PROC PHREG (with the TIES=EXACT option for the “exact” handling of ties) is used
to estimate the hazard ratio based on the partial maximum likelihood function;
a Wald-test based two-sided CI is requested by the RISKLIMITS option
(additional options controlling the output may be added):
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat
/
TIES=EXACT RISKLIMITS ALPHA=;
RUN;
* tte represents
variable containing event/censoring times;
* cnsr represents
censoring variable (0=event, 1=censored);
* treat represents
treatment group variable (0=placebo, 1=treatment);
Stratified (Unadjusted) Proportional Hazards Regression
The
stratified unadjusted proportional hazards regression model and its assumptions
are described by:
h (.) hazard
function as a function of time (relative to the start date), the patient’s
stratum
and treatment and the unknown regression parameter
h0k(t) unspecified
baseline hazard function for stratum k
at time t
xjk1
treatment for patient j in stratum k coded as 0 (placebo group) or 1 (treatment group); this coding
leads to a hazard ratio less than 1 if the estimated effect suggests a lower hazard
in the treatment group relative to the placebo group
β1
unknown regression parameter for the treatment effect on the log-scale (the log
hazard ratio).
The
hazard ratio is a multiplicative constant λ1 = exp(β1) comparing the hazard functions in the treatment
group relative to the hazard functions in the placebo group. The latter are allowed
varying across strata. However, the hazard ratio is assumed to be common across
strata. An HR less than 1 indicates decreased hazard for the event of interest in the
treatment group compared to the placebo group.
SAS
PROC PHREG (with the TIES=EXACT option for the “exact” handling of ties) is
used
to estimate the hazard ratio based on the partial maximum likelihood function;
a Wald-test based two-sided CI is requested by the RISKLIMITS option
(additional options controlling the output may be added):
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat
/
TIES=EXACT RISKLIMITS ALPHA=;
STRATA strat1
..
stratj;
RUN;
* tte represents
variable containing event/censoring times;
* cnsr represents
censoring variable (0=event, 1=censored);
* treat represents treatment
group variable (0=placebo, 1=treatment);
* strat1 to stratj
represent
stratification variables;
Adjusted Cox Proportional Hazards Regression Model
(including Univariate Cox Proportional Hazards Regression Model and Multivariate Cox Proportional Hazards Regression Model)
The purpose of the model is to evaluate the effect
of a single factor (univariate) or simultaneously the effect of several factors
(multivariate) on survival. In other words, it allows us to examine how
specified factor(s) influence the rate of a particular event happening (e.g.,
infection, death) at a particular point in time. This rate is commonly referred
as the hazard rate. Predictor variable(s) (or factor(s)) are usually
termed covariates in the survival-analysis literature.
The Cox model is expressed by the hazard
function denoted by h(t). Briefly, the hazard function can be
interpreted as the risk of dying at time t. It can be estimated as follow:
·
t represents the survival time
·
h(t) is the hazard function determined by a
set of p covariates (x1,x2,...,xp)
·
the coefficients (b1,b2,...,bp) measure the
impact (i.e., the effect size) of covariates.
·
the term h0 is called the baseline hazard. It
corresponds to the value of the hazard if all the xi are
equal to zero (the quantity exp(0) equals 1). The ‘t’ in h(t) reminds us that
the hazard may vary over time.
The Cox model can be written as a multiple
linear regression of the logarithm of the hazard on the variables xi, with the baseline hazard being an
‘intercept’ term that varies with time.
The quantities exp(bi) are
called hazard ratios (HR). A value of bi greater
than zero, or equivalently a hazard ratio greater than one, indicates that as
the value of the ith covariate
increases, the event hazard increases and thus the length of survival
decreases.
Put another way, a hazard ratio above 1 indicates a covariate
that is positively associated with the event probability, and thus negatively
associated with the length of survival.
Univariate Cox proportional hazards
regression model
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat
/
TIES=EXACT RISKLIMITS ALPHA;RUN;
Multivariate Cox proportional hazards regression
model
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat
cov1 .. covk /
TIES=EXACT RISKLIMITS ALPHA;
RUN;
* tte represents
variable containing event/censoring times;
* cnsr represents
censoring variable (0=event, 1=censored);
* treat represents
treatment group variable (0=placebo, 1=treatment);
* cov1 to covk
represent
covariates other than treatment group;
Stratified Multivariate Cox Regression (Stratified Adjusted Proportional Hazards Regression)
The
stratified adjusted proportional hazards regression model and its assumptions
are illustrated by:
h (.) hazard
function as a function of time (relative to the start date), the patient’s
stratum
and covariates, and the unknown vector of regression parameters
h0k(t) unspecified
baseline hazard function for patients with covariate vector (0, .., 0) for stratum
k at time t
xjk vector of
Baseline covariates for patient j
in stratum k, with xjk1
representing
treatment
coded as 0 (placebo) or 1 (treatment)
β: unknown regression parameter vector on
the log-scale (β1 represents the log hazard ratio for treatment).
The
hazard ratio for the i-th covariate is a multiplicative constant exp(βi) comparing the hazard
functions between the levels of the i-th covariate. Baseline hazard functions are allowed varying
across strata without any restrictions. However, the hazard ratio comparing a
covariate (including treatment) is assumed to be common across strata. The estimates
of βi are adjusted for the other covariates in the model.
SAS
PROC PHREG (with the TIES=EXACT option for the “exact” handling of ties) is
used
to estimate the hazard ratios based on the partial maximum likelihood function;
a Wald-test based two-sided CI is requested by the RISKLIMITS option
(additional options controlling the output may be added):
PROC PHREG DATA=dataset;
MODEL tte*cnsr(1)=treat
cov1 .. covk /
TIES=EXACT RISKLIMITS ALPHA;
STRATA strat1
..
stratj;
RUN;
* tte represents
variable containing event/censoring times;
* cnsr represents
censoring variable (0=event, 1=censored);
* treat represents
treatment group variable (0=placebo, 1=treatment);
* cov1 to covk
represent
covariates other than treatment group;
* strat1 to stratj
represent
stratification variables;