R: using survival analysis models for prediction
9.6 years ago
hdy ▴ 160

I am learning survival analysis in R, especially the Cox proportional hazards model. I read a paper that used 80% of the samples as a training set and the remaining 20% as a test set.

As quoted:

On the training set, we first performed a pre-selection step to keep the top significant features correlated with overall survival (univariate Cox model, likelihood ratio test, P< 0.05). ...We used two computational methods to train the models: (i) Cox: the Cox proportional hazards model with LASSO for feature selection ...We then applied the models thereby obtained to the test set for prediction, and calculated the C-index using the R package survcomp.

I do not understand how they applied the fitted Cox model to the test set. For the training set, I can simply run the coxph function, but the returned results are "coef, exp(coef), se(coef), z, p" and the likelihood ratio test p-value. How can I treat this as a model and use it on the 20% test set?

machine-learning R model survival • 20k views
Could you give the reference, please?

The paper is "Assessing the clinical utility of cancer genomic and proteomic data across tumor types", published in Nature Biotechnology. Thanks!

9.5 years ago
sarajbc ▴ 50

You can try to do something like this:

# Fit the model on the training data (after feature selection; I believe the
# paper you mentioned uses LASSO, and R has a good package for this: glmnet)
library(survival)   # coxph(), Surv()
library(survcomp)   # concordance.index()
# Use bare column names inside Surv() so that "." does not pull
# Survival and Status in as predictors
cox_model = coxph(Surv(Survival, Status) ~ ., data = training_data)

# Predict risk scores (the linear predictor, not survival times) on the validation data
pred_validation = predict(cox_model, newdata = validation_data)

# Compute the concordance index on the validation data
cindex_validation = concordance.index(pred_validation, surv.time = validation_data$Survival,
                                      surv.event = validation_data$Status, method = "noether")

See more here

Hope it helps
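For anyone who wants to try the whole workflow end to end, here is a minimal sketch using only the survival package. It is not the paper's pipeline: the lung dataset, the three covariates, the seed, and the variable names are all assumptions made for illustration, and I have swapped survcomp's concordance.index for survival's own concordance() so that nothing beyond the survival package is needed.

```r
library(survival)  # provides coxph(), Surv(), concordance() and the lung data

# Illustrative data only (an assumption): replace with your own data frame
dat <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog")])

# Random 80/20 split into training and test sets
set.seed(1)
train_idx <- sample(nrow(dat), size = floor(0.8 * nrow(dat)))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Fit the Cox model on the training set only
fit <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = train)

# Apply the fitted model to the test set: one risk score
# (linear predictor) per test sample
lp_test <- predict(fit, newdata = test, type = "lp")

# C-index on the test set; reverse = TRUE because a higher risk score
# should go with shorter survival
test$lp <- lp_test
c_index <- concordance(Surv(time, status) ~ lp, data = test,
                       reverse = TRUE)$concordance
print(c_index)
```

The fitted object itself is the "model": predict() combines the stored coefficients with the covariates in newdata, so the test set never influences the fit.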

Hi @sarajbc, apologies for reviving this old thread. I am struggling with a similar problem: I want to predict patient survival from methylation data. Could you please help me understand what glmnet Cox regression actually predicts? For my datasets, all of my predictions are negative, and I have no idea what these values mean.
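On the negative predictions: as far as I know, predicting from a Cox model returns by default the linear predictor, i.e. a log relative hazard, not a survival time. That holds for coxph (type = "lp") and, to my understanding, for glmnet with family = "cox" (type = "link"). Such a score can be negative, and only its ordering across samples matters for the C-index. Here is a small sketch with coxph; the lung data, the covariates, and the shift of 10 are assumptions chosen for illustration, and I have not reproduced the glmnet case here.

```r
library(survival)  # provides coxph(), Surv(), concordance() and the lung data

# Illustrative data only (an assumption)
dat <- na.omit(lung[, c("time", "status", "age", "sex")])
fit <- coxph(Surv(time, status) ~ age + sex, data = dat)

# Linear predictor: log relative hazard, may be negative
lp <- predict(fit, newdata = dat, type = "lp")
# Relative hazard: exp(lp), always positive
risk <- predict(fit, newdata = dat, type = "risk")
same_scale <- isTRUE(all.equal(unname(exp(lp)), unname(risk)))

# Shifting every score by a constant makes them all negative here,
# but the ranking, and hence the C-index, is unchanged
dat$lp <- lp
dat$lp_shifted <- lp - 10
c1 <- concordance(Surv(time, status) ~ lp, data = dat,
                  reverse = TRUE)$concordance
c2 <- concordance(Surv(time, status) ~ lp_shifted, data = dat,
                  reverse = TRUE)$concordance
print(c(same_scale = same_scale, c_original = c1, c_shifted = c2))
```

So all-negative scores are not a problem in themselves: a more negative value just means a lower predicted hazard relative to the reference, and exponentiating gives a relative hazard if you prefer a positive scale.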

