Peter Ralph
Advanced Biological Statistics
Goal: understand how covariates affect survival time.
Idea: modify a nonparametric hazard rate by a linear predictor: \[ h(t) = h_0(t) e^{\beta x} .\]
How does survival probability depend on the covariates?
age: Age in years
sex: Male=1 Female=2
ph.ecog: ECOG performance score as rated by the physician.
(good=0-bad=4)
ph.karno: Karnofsky performance score (bad=0-good=100) rated by physician
pat.karno: Karnofsky performance score as rated by patient
meal.cal: Calories consumed at meals
wt.loss: Weight loss in last six months
How does survival probability depend on the covariates?
age: Age in years
sex: Male=1 Female=2
ph.ecog: ECOG performance score as rated by the physician.
(good=0-bad=4)
ph.karno: Karnofsky performance score (bad=0-good=100) rated by physician
pat.karno: Karnofsky performance score as rated by patient
meal.cal: Calories consumed at meals
wt.loss: Weight loss in last six months
\[\begin{aligned} h(t) &= \text{(baseline hazard rate at $t$)} \\ y_i &= \beta_0 + \sum_j \beta_j x_{ij} \\ h_i(t) &= \text{(hazard rate for $i$ at $t$)} \\ &= h(t) \times \exp\left( y_i \right) \\ S_i(t) &= \P\left\{\text{survival of $i$ until $t$}\right\} \\ &= \exp\left( - \int_0^t h_i(s) ds \right) . \end{aligned}\]
fullcox <- coxph(Surv(time, status)
~ age + sex + ph.ecog + ph.karno + pat.karno
+ meal.cal + wt.loss,
data = lung)
summary(fullcox)
## Call:
## coxph(formula = Surv(time, status) ~ age + sex + ph.ecog + ph.karno +
## pat.karno + meal.cal + wt.loss, data = lung)
##
## n= 168, number of events= 121
## (60 observations deleted due to missingness)
##
## coef exp(coef) se(coef) z Pr(>|z|)
## age 1.065e-02 1.011e+00 1.161e-02 0.917 0.35906
## sex -5.509e-01 5.765e-01 2.008e-01 -2.743 0.00609 **
## ph.ecog 7.342e-01 2.084e+00 2.233e-01 3.288 0.00101 **
## ph.karno 2.246e-02 1.023e+00 1.124e-02 1.998 0.04574 *
## pat.karno -1.242e-02 9.877e-01 8.054e-03 -1.542 0.12316
## meal.cal 3.329e-05 1.000e+00 2.595e-04 0.128 0.89791
## wt.loss -1.433e-02 9.858e-01 7.771e-03 -1.844 0.06518 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## age 1.0107 0.9894 0.9880 1.0340
## sex 0.5765 1.7347 0.3889 0.8545
## ph.ecog 2.0838 0.4799 1.3452 3.2277
## ph.karno 1.0227 0.9778 1.0004 1.0455
## pat.karno 0.9877 1.0125 0.9722 1.0034
## meal.cal 1.0000 1.0000 0.9995 1.0005
## wt.loss 0.9858 1.0144 0.9709 1.0009
##
## Concordance= 0.651 (se = 0.029 )
## Likelihood ratio test= 28.33 on 7 df, p=2e-04
## Wald test = 27.58 on 7 df, p=3e-04
## Score (logrank) test = 28.41 on 7 df, p=2e-04
subcox <- coxph(Surv(time, status) ~ sex + ph.ecog, data = lung, subset=rowSums(is.na(lung[,2:10]))==0)
anova(fullcox, subcox)
## Analysis of Deviance Table
## Cox model: response is Surv(time, status)
## Model 1: ~ age + sex + ph.ecog + ph.karno + pat.karno + meal.cal + wt.loss
## Model 2: ~ sex + ph.ecog
## loglik Chisq Df P(>|Chi|)
## 1 -498.75
## 2 -503.14 8.7792 5 0.1182
Interpretation:
We have survival time data for 228 lung cancer patients, of whom 90 are female and 138 are male. Most of the patients are between 55 and 75 years old, and the extremes are 39 and 82 years old. The median survival time was around 1-1.5 years, with females having around 20% longer survival times. The longest-living pateients lived for (at least) three years. Note: explain variables also.
pairs(as.data.frame(sapply(lung[c("age", "ph.ecog", "ph.karno", "pat.karno")], jitter)), col=lung$sex, pch=20)
The ECOG scores and Karnoffsky scores (say something about the ranges of these).
The physician- and patient-assigned Karnofsky scores were positively correlated (\(r=0.5\)), and were negatively correlated with ECOG score (\(r=-0.8\) for physician-assigned). These were not strongly correlated with age.
We fit a Cox proportional hazards model (…) The covariates that most strongly predicted survival were sex and ECOG score, and both were strongly statistically significant. Females were predicted to have about a 15% higher surivival probability at one year (todo: get precise number). Lower ECOG scores (ie, better patient health at initial diagnosis) predicted higher survival, with a 30% survival for ECOG 3 to 80% survival for ECOG 0. The difference between ECOG scores was larger than between sexes.
TODO: say something about uncertainty in estimates.
TODO: re-fit with ECOG as a factor, and compare
TODO: figure out why Karnofsky score is going the opposite direction
## Call:
## coxph(formula = Surv(time, status) ~ sex + ph.karno, data = lung)
##
## n= 227, number of events= 164
## (1 observation deleted due to missingness)
##
## coef exp(coef) se(coef) z Pr(>|z|)
## sex -0.504210 0.603982 0.167723 -3.006 0.00265 **
## ph.karno -0.015155 0.984959 0.005727 -2.646 0.00814 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## sex 0.604 1.656 0.4348 0.8391
## ph.karno 0.985 1.015 0.9740 0.9961
##
## Concordance= 0.638 (se = 0.026 )
## Likelihood ratio test= 17.05 on 2 df, p=2e-04
## Wald test = 17.18 on 2 df, p=2e-04
## Score (logrank) test = 17.5 on 2 df, p=2e-04