Question

Using survival analysis models in R for prediction

2

10.1 years ago

hdy ▴ 180

I am learning survival analysis in R, especially the Cox proportional hazards model. I read a paper that uses 80% of the samples as a training set and the remaining 20% as a test set.

The paper says:

On the training set, we first performed a pre-selection step to keep the top significant features correlated with overall survival (univariate Cox model, likelihood ratio test, P< 0.05). ...We used two computational methods to train the models: (i) Cox: the Cox proportional hazards model with LASSO for feature selection ...We then applied the models thereby obtained to the test set for prediction, and calculated the C-index using the R package survcomp.

I do not understand how they applied the fitted Cox model to the test set. For the training set, I can simply run the coxph function, but the output is a table of "coef, exp(coef), se(coef), z, p" plus a likelihood ratio test p-value. How can I treat this as a model and use it on the 20% test set?

machine-learning R model survival • 20k views

ADD COMMENT • link updated 2.7 years ago by Ram 44k • written 10.1 years ago by hdy ▴ 180

0

Could you give the reference, please?

ADD REPLY • link 10.1 years ago by russhh 5.7k

1

The paper is "Assessing the clinical utility of cancer genomic and proteomic data across tumor types", published in Nature Biotechnology. Thanks!

ADD REPLY • link 10.1 years ago by hdy ▴ 180

Ram · Accepted Answer · 2014-08-26

5

10.0 years ago

sarajbc ▴ 50

You can try to do something like this:

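# Load survival (coxph, Surv) and survcomp (concordance.index)
library(survival)
library(survcomp)

# Fit the model on the training data, after feature selection (I believe the paper you mentioned uses LASSO; R has a good package for this: glmnet).
# Refer to the columns by name inside Surv() so that `.` expands only to the remaining columns and does not pull Survival and Status in as predictors.
cox_model = coxph(Surv(Survival, Status) ~ ., data = training_data)

# Predict risk scores (the linear predictor) for the validation data; a higher score means a higher hazard
pred_validation = predict(cox_model, newdata = validation_data)

# Determine concordance between the risk scores and the observed survival
cindex_validation = concordance.index(pred_validation, surv.time = validation_data$Survival,
                                      surv.event = validation_data$Status, method = "noether")
cindex_validation$c.index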
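
If it helps, here is a minimal end-to-end sketch of the LASSO step with glmnet on simulated data. Everything about the data is made up for illustration (the feature names `feat1` to `feat20`, the effect sizes, the censoring rate), and I use `concordance()` from the survival package for the C-index instead of survcomp, so treat it as an outline under those assumptions rather than what the paper actually did.

```r
library(survival)
library(glmnet)

set.seed(1)
n <- 200
p <- 20

# Simulated features (hypothetical names); only the first two affect the hazard
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("feat", 1:p)))
true_lp <- 0.8 * x[, 1] - 0.6 * x[, 2]
event_time <- rexp(n, rate = exp(true_lp))
censor_time <- rexp(n, rate = 0.3)
time <- pmin(event_time, censor_time)
status <- as.integer(event_time <= censor_time)

# 80/20 split into training and test samples
train <- sample(n, size = round(0.8 * n))
test <- setdiff(seq_len(n), train)

# LASSO-penalised Cox model (alpha = 1), penalty chosen by cross-validation
y_train <- cbind(time = time[train], status = status[train])
cv_fit <- cv.glmnet(x[train, ], y_train, family = "cox", alpha = 1)

# Features kept by the LASSO (non-zero coefficients at lambda.min)
beta <- as.matrix(coef(cv_fit, s = "lambda.min"))[, 1]
selected_features <- names(beta)[beta != 0]

# Risk scores (linear predictor) for the held-out test samples
risk_test <- as.numeric(predict(cv_fit, newx = x[test, ],
                                s = "lambda.min", type = "link"))

# C-index on the test set; reverse = TRUE because a higher risk score
# should go with a shorter survival time
cindex_test <- concordance(Surv(time[test], status[test]) ~ risk_test,
                           reverse = TRUE)$concordance
print(selected_features)
print(cindex_test)
```

The point is that the "model" is just the fitted coefficients: `predict()` multiplies them with the test-set features to give one risk score per patient, and the C-index measures how well the ranking of those scores agrees with the observed survival times.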

See more here

Hope it helps

ADD COMMENT • link updated 2.7 years ago by Ram 44k • written 10.0 years ago by sarajbc ▴ 50

0

Hi @sarajbc, apologies for raising this old thread again. I am struggling with a similar problem: I want to predict patient survival from methylation data. Could you please help me understand what glmnet Cox regression actually predicts? For my datasets, all of my predictions are negative values, and I have no clue what these actually mean.

ADD REPLY • link 5.3 years ago by Researcher ▴ 130
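
On the negative predictions asked about above: as far as I know, the default prediction from a Cox model is not a survival time but the linear predictor, i.e. a log relative hazard. A negative value means a hazard lower than that of the reference (for `coxph`, a subject with average covariate values; for glmnet with `type = "link"`, a subject with all features equal to zero), so only the ordering and the differences between patients are meaningful. A small sketch with `coxph` on the `lung` data set that ships with the survival package; the two patients in `new_patients` are hypothetical:

```r
library(survival)

# Cox model on the lung cancer data shipped with the survival package
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Two hypothetical patients: a 50-year-old woman and a 75-year-old man
# (in this data set sex is coded 1 = male, 2 = female)
new_patients <- data.frame(age = c(50, 75), sex = c(2, 1))

# Linear predictor: log hazard relative to the reference subject;
# negative means lower risk than the reference, positive means higher
lp <- predict(fit, newdata = new_patients, type = "lp")

# Same information on the hazard-ratio scale: exp(lp)
hr <- predict(fit, newdata = new_patients, type = "risk")

# Predicted survival curves, and survival probability at one year
sf <- survfit(fit, newdata = new_patients)
surv_1y <- summary(sf, times = 365)$surv

print(lp)
print(hr)
print(surv_1y)
```

So all-negative scores are not an error in themselves. If you need something on a probability scale, you have to combine the scores with an estimate of the baseline hazard, which `survfit()` does for a `coxph` fit.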