
Homework, week 11: Survival analysis

Assignment: Your task is to use Rmarkdown to write a short report, readable by a technically literate person. The code you used should not be visible in the final report (unless you have a good reason to show it).
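One common way to keep code out of the knitted report is to turn off code echoing in a setup chunk near the top of the Rmd file. The lines below are a minimal sketch of that approach, assuming the knitr package is installed (it is what R Markdown uses to run chunks); suppressing messages and warnings as well is an optional choice, not a requirement of the assignment.

```r
# Hide code, package start-up messages, and warnings in the knitted output.
# Individual chunks can still override this with echo = TRUE where showing
# the code is justified.
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)
```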

Due: Submit your work via Canvas by the end of the day (midnight) on Thursday, January 14th. Please submit both the Rmd file and the resulting HTML or PDF file. You can work with other members of the class, but I expect each of you to construct and run all of the scripts yourself.

The Problem

For this assignment you’ll analyze the built-in veteran dataset from the survival package. In this dataset, 137 males with advanced inoperable lung cancer were randomized to either a standard or an experimental chemotherapy; subsequent survival times are recorded (in days), along with the type of treatment received (“test” means experimental) and several other variables (see help(veteran)).
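As a starting point, the data can be loaded and inspected roughly as follows. This is only a sketch: the name `vet` is an arbitrary choice, and the recoding of `trt` and `prior` assumes the codings given in help(veteran) (1 = standard, 2 = test; 0 = no prior therapy, 10 = yes), so check those against the help page for your installed version.

```r
library(survival)

# Work on a copy so the packaged dataset is left untouched.
vet <- survival::veteran

# Recode numeric codes as labelled factors (codings assumed from help(veteran)).
vet$trt   <- factor(vet$trt,   levels = c(1, 2),  labels = c("standard", "test"))
vet$prior <- factor(vet$prior, levels = c(0, 10), labels = c("no", "yes"))

# Structure of the data and a cross-tabulation of treatment arm
# against status (1 = died, 0 = censored).
str(vet)
table(treatment = vet$trt, status = vet$status)
```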

Please describe the data (overall mortality rates, survival times, and patterns) both in words and with graphical summaries, and perform an analysis to determine which variables are significantly associated with survival, and in particular whether the experimental chemotherapy differs significantly from the standard treatment. Be sure to communicate both real-world significance and statistical uncertainty.

Your report should include both Kaplan-Meier plots, with multiple curves for different categories, and survival curves predicted by a fitted model that incorporates other covariates.
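The two kinds of plot might be produced along the following lines. This is a sketch under assumptions, not a prescribed solution: a Cox proportional hazards model stands in for "a fitted model" purely for illustration (the assignment does not name a model), and the covariates, the hypothetical patients in `new_patients`, and the plotting details are arbitrary choices.

```r
library(survival)

vet <- survival::veteran
vet$trt <- factor(vet$trt, levels = c(1, 2), labels = c("standard", "test"))

# Kaplan-Meier estimate with one curve per treatment arm.
km_fit <- survfit(Surv(time, status) ~ trt, data = vet)
plot(km_fit, col = 1:2, xlab = "Days", ylab = "Proportion surviving")
legend("topright", legend = levels(vet$trt), col = 1:2, lty = 1)

# Illustrative model choice: Cox regression with a few covariates.
cox_fit <- coxph(Surv(time, status) ~ trt + celltype + karno + age, data = vet)
summary(cox_fit)

# Predicted survival for two hypothetical patients who differ only in treatment.
new_patients <- data.frame(
  trt      = factor(c("standard", "test"), levels = levels(vet$trt)),
  celltype = factor("squamous", levels = levels(vet$celltype)),
  karno    = median(vet$karno),
  age      = median(vet$age)
)
pred_fit <- survfit(cox_fit, newdata = new_patients)
plot(pred_fit, col = 1:2, xlab = "Days", ylab = "Predicted proportion surviving")
legend("topright", legend = levels(vet$trt), col = 1:2, lty = 1)
```

Holding the other covariates fixed while varying only the treatment makes the predicted curves directly comparable; repeating the prediction for other cell types or Karnofsky scores shows how much those variables matter relative to treatment.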