\[ %% % Add your macros here; they'll be included in pdf and html output. %% \newcommand{\R}{\mathbb{R}} % reals \newcommand{\E}{\mathbb{E}} % expectation \renewcommand{\P}{\mathbb{P}} % probability \DeclareMathOperator{\logit}{logit} \DeclareMathOperator{\logistic}{logistic} \DeclareMathOperator{\SE}{SE} \DeclareMathOperator{\sd}{sd} \DeclareMathOperator{\var}{var} \DeclareMathOperator{\cov}{cov} \DeclareMathOperator{\cor}{cor} \DeclareMathOperator{\Normal}{Normal} \DeclareMathOperator{\MVN}{MVN} \DeclareMathOperator{\LogNormal}{logNormal} \DeclareMathOperator{\Poisson}{Poisson} \DeclareMathOperator{\Beta}{Beta} \DeclareMathOperator{\Binom}{Binomial} \DeclareMathOperator{\Gam}{Gamma} \DeclareMathOperator{\Exp}{Exponential} \DeclareMathOperator{\Cauchy}{Cauchy} \DeclareMathOperator{\Unif}{Unif} \DeclareMathOperator{\Dirichlet}{Dirichlet} \DeclareMathOperator{\Wishart}{Wishart} \DeclareMathOperator{\StudentsT}{StudentsT} \DeclareMathOperator{\Weibull}{Weibull} \newcommand{\given}{\;\vert\;} \]

Cox’s Proportional Hazards

Peter Ralph

Advanced Biological Statistics

Cox’s proportional hazards

Cox’s orange pippin (credit: Wikimedia)

Proportional Hazards

Goal: understand how covariates affect survival time.

Idea: modify a nonparametric hazard rate by a linear predictor: \[ h(t) = h_0(t) e^{\beta x} .\]

plot of chunk r plotcoxph

Covariates

How does survival probability depend on the covariates?

age:        Age in years                                                        
sex:        Male=1 Female=2                                                     
ph.ecog:    ECOG performance score as rated by the physician. 
            (good=0-bad=4)
ph.karno:   Karnofsky performance score (bad=0-good=100) rated by physician     
pat.karno:  Karnofsky performance score as rated by patient                     
meal.cal:   Calories consumed at meals                                          
wt.loss:    Weight loss in last six months     

Covariates

How does survival probability depend on the covariates?

age:        Age in years                                                        
sex:        Male=1 Female=2                                                     
ph.ecog:    ECOG performance score as rated by the physician. 
            (good=0-bad=4)
ph.karno:   Karnofsky performance score (bad=0-good=100) rated by physician     
pat.karno:  Karnofsky performance score as rated by patient                     
meal.cal:   Calories consumed at meals                                          
wt.loss:    Weight loss in last six months     

\[\begin{aligned} h(t) &= \text{(baseline hazard rate at $t$)} \\ y_i &= \beta_0 + \sum_j \beta_j x_{ij} \\ h_i(t) &= \text{(hazard rate for $i$ at $t$)} \\ &= h(t) \times \exp\left( y_i \right) \\ S_i(t) &= \P\left\{\text{survival of $i$ until $t$}\right\} \\ &= \exp\left( - \int_0^t h_i(s) ds \right) . \end{aligned}\]

Fitting a Cox model

fullcox <- coxph(Surv(time, status)
                     ~ age + sex + ph.ecog + ph.karno + pat.karno
                     + meal.cal + wt.loss,
                 data = lung)
summary(fullcox)
## Call:
## coxph(formula = Surv(time, status) ~ age + sex + ph.ecog + ph.karno + 
##     pat.karno + meal.cal + wt.loss, data = lung)
## 
##   n= 168, number of events= 121 
##    (60 observations deleted due to missingness)
## 
##                 coef  exp(coef)   se(coef)      z Pr(>|z|)   
## age        1.065e-02  1.011e+00  1.161e-02  0.917  0.35906   
## sex       -5.509e-01  5.765e-01  2.008e-01 -2.743  0.00609 **
## ph.ecog    7.342e-01  2.084e+00  2.233e-01  3.288  0.00101 **
## ph.karno   2.246e-02  1.023e+00  1.124e-02  1.998  0.04574 * 
## pat.karno -1.242e-02  9.877e-01  8.054e-03 -1.542  0.12316   
## meal.cal   3.329e-05  1.000e+00  2.595e-04  0.128  0.89791   
## wt.loss   -1.433e-02  9.858e-01  7.771e-03 -1.844  0.06518 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##           exp(coef) exp(-coef) lower .95 upper .95
## age          1.0107     0.9894    0.9880    1.0340
## sex          0.5765     1.7347    0.3889    0.8545
## ph.ecog      2.0838     0.4799    1.3452    3.2277
## ph.karno     1.0227     0.9778    1.0004    1.0455
## pat.karno    0.9877     1.0125    0.9722    1.0034
## meal.cal     1.0000     1.0000    0.9995    1.0005
## wt.loss      0.9858     1.0144    0.9709    1.0009
## 
## Concordance= 0.651  (se = 0.029 )
## Likelihood ratio test= 28.33  on 7 df,   p=2e-04
## Wald test            = 27.58  on 7 df,   p=3e-04
## Score (logrank) test = 28.41  on 7 df,   p=2e-04

Model comparison

subcox <- coxph(Surv(time, status) ~ sex + ph.ecog, data = lung, subset=rowSums(is.na(lung[,2:10]))==0)
anova(fullcox, subcox)
## Analysis of Deviance Table
##  Cox model: response is  Surv(time, status)
##  Model 1: ~ age + sex + ph.ecog + ph.karno + pat.karno + meal.cal + wt.loss
##  Model 2: ~ sex + ph.ecog
##    loglik  Chisq Df P(>|Chi|)
## 1 -498.75                    
## 2 -503.14 8.7792  5    0.1182

Predictions

subcox_pred <- survfit(subcox, newdata=expand.grid(ph.ecog=0:3, sex=2))
plot(subcox_pred, lty=1, col=1:4, main='predicted survival, females')
legend("topright", lty=1, col=1:4, legend=paste("ECOG=", 0:3))

plot of chunk r prediction

In class

Interpretation:

  • describe the data
  • explain and explore the results
  • interpret conclusions.

Describe the data

We have survival time data for 228 lung cancer patients, of whom 90 are female and 138 are male. Most of the patients are between 55 and 75 years old, and the extremes are 39 and 82 years old. The median survival time was around 1-1.5 years, with females having around 20% longer survival times. The longest-living pateients lived for (at least) three years. Note: explain variables also.

Describe the covariates

pairs(as.data.frame(sapply(lung[c("age", "ph.ecog", "ph.karno", "pat.karno")], jitter)), col=lung$sex, pch=20)

plot of chunk r plotit

The ECOG scores and Karnoffsky scores (say something about the ranges of these).

The physician- and patient-assigned Karnofsky scores were positively correlated (\(r=0.5\)), and were negatively correlated with ECOG score (\(r=-0.8\) for physician-assigned). These were not strongly correlated with age.

Results

We fit a Cox proportional hazards model (…) The covariates that most strongly predicted survival were sex and ECOG score, and both were strongly statistically significant. Females were predicted to have about a 15% higher surivival probability at one year (todo: get precise number). Lower ECOG scores (ie, better patient health at initial diagnosis) predicted higher survival, with a 30% survival for ECOG 3 to 80% survival for ECOG 0. The difference between ECOG scores was larger than between sexes.

TODO: say something about uncertainty in estimates.

TODO: re-fit with ECOG as a factor, and compare

TODO: figure out why Karnofsky score is going the opposite direction

only_karno <- coxph(Surv(time, status) ~ sex + ph.karno, data = lung)
summary(only_karno)
## Call:
## coxph(formula = Surv(time, status) ~ sex + ph.karno, data = lung)
## 
##   n= 227, number of events= 164 
##    (1 observation deleted due to missingness)
## 
##               coef exp(coef)  se(coef)      z Pr(>|z|)   
## sex      -0.504210  0.603982  0.167723 -3.006  0.00265 **
## ph.karno -0.015155  0.984959  0.005727 -2.646  0.00814 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##          exp(coef) exp(-coef) lower .95 upper .95
## sex          0.604      1.656    0.4348    0.8391
## ph.karno     0.985      1.015    0.9740    0.9961
## 
## Concordance= 0.638  (se = 0.026 )
## Likelihood ratio test= 17.05  on 2 df,   p=2e-04
## Wald test            = 17.18  on 2 df,   p=2e-04
## Score (logrank) test = 17.5  on 2 df,   p=2e-04
// reveal.js plugins