Question: Using survival analysis models in R for prediction
5.9 years ago by
hdy110
United States
hdy110 wrote:

I am learning survival analysis in R, especially the Cox proportional hazards model. I read a paper that uses 80% of the samples as a training set and the remaining 20% as a test set.

 

As quoted "On the training set, we first performed a pre-selection step to keep the top significant features correlated with overall survival (univariate Cox model, likelihood ratio test, P < 0.05). ... We used two computational methods to train the models: (i) Cox: the Cox proportional hazards model with LASSO for feature selection ... We then applied the models thereby obtained to the test set for prediction, and calculated the C-index using the R package survcomp."

 

I do not understand how they applied the fitted Cox models to the test set. For the training set, I can simply call the coxph function, but the returned results are "coef, exp(coef), se(coef), z, p" and the likelihood ratio test p-value. How can I treat this as a model and use it on the 20% test set?


Could you give the reference, please?

written 5.9 years ago by russhh5.4k

The paper is "Assessing the clinical utility of cancer genomic and proteomic data across tumor types", published in Nature Biotechnology. Thanks!

written 5.9 years ago by hdy110
5.9 years ago by
sarajbc50
sarajbc50 wrote:

You can try something like this:

library(survival)   # coxph, Surv
library(survcomp)   # concordance.index (Bioconductor package)

# Fit the model on the training data (after feature selection; I believe the paper you mentioned uses LASSO, and R has a good package for this: glmnet)
cox_model = coxph(Surv(Survival, Status) ~ ., data = training_data)

# Compute risk scores for the validation data (predict() returns the linear predictor by default, not survival times)
pred_validation = predict(cox_model, newdata = validation_data)

# Compute the concordance index on the validation data
cindex_validation = concordance.index(pred_validation, surv.time = validation_data$Survival,
                                      surv.event = validation_data$Status, method = "noether")
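
The same steps can be sketched end to end on a public dataset. This is only an illustration under a few assumptions: the `lung` data that ships with the survival package stands in for the paper's data, the three covariates are an arbitrary choice, and `concordance()` from the survival package is swapped in for survcomp's `concordance.index()` so that the example needs only one package.

```r
library(survival)

# Assumption: the built-in `lung` data stands in for the real cohort.
set.seed(1)
dat <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog")])
dat$status <- dat$status - 1   # recode 1/2 (censored/dead) to 0/1

# Random 80/20 split into training and validation sets
n <- nrow(dat)
train_idx <- sample(n, size = floor(0.8 * n))
training_data <- dat[train_idx, ]
validation_data <- dat[-train_idx, ]

# Fit the Cox model on the training set only
cox_model <- coxph(Surv(time, status) ~ ., data = training_data)

# Apply the fitted coefficients to the validation set:
# type = "lp" returns the linear predictor, i.e. a risk score per patient
validation_data$risk_score <- predict(cox_model, newdata = validation_data,
                                      type = "lp")

# C-index on the validation set; reverse = TRUE because a higher
# risk score should go with a shorter survival time
cindex <- concordance(Surv(time, status) ~ risk_score,
                      data = validation_data, reverse = TRUE)
print(cindex$concordance)
```

The point is that the fitted `coxph` object is the model: `predict()` reuses its coefficients on new rows, so nothing is refitted on the validation set.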

 

See more here: http://stats.stackexchange.com/questions/48298/computing-c-index-for-an-external-validation-of-a-cox-ph-model-with-r

 

Hope it helps


Hi @sarajbc, and apologies for reviving this old thread. I am struggling with a similar problem: I want to predict patient survival from methylation data. Can you please help me understand what glmnet Cox regression actually predicts? For my datasets, all of my predictions are negative values, and I have no clue what they mean.

written 14 months ago by Researcher50
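
On the negative predictions: as far as I know, for `family = "cox"` the default `predict()` output in glmnet (`type = "link"`) is the linear predictor, i.e. the log relative hazard, not a survival time. A negative value only means a lower hazard than a reference patient whose covariates are all zero, so the ranking of patients matters, not the sign. Here is a minimal sketch, again assuming the `lung` data from the survival package as a stand-in and an arbitrary set of covariates:

```r
library(survival)
library(glmnet)

# Assumption: `lung` stands in for the real data; covariates are arbitrary.
set.seed(1)
dat <- na.omit(lung[, c("time", "status", "age", "sex",
                        "ph.ecog", "ph.karno", "wt.loss")])
x <- as.matrix(dat[, c("age", "sex", "ph.ecog", "ph.karno", "wt.loss")])
# glmnet expects a two-column response named "time" and "status" (1 = event)
y <- cbind(time = dat$time, status = as.numeric(dat$status == 2))

train_idx <- sample(nrow(dat), floor(0.8 * nrow(dat)))

# LASSO-penalised Cox model (alpha = 1); lambda chosen by cross-validation
cv_fit <- cv.glmnet(x[train_idx, ], y[train_idx, ], family = "cox", alpha = 1)

# type = "link": linear predictor (log relative hazard), can be negative
lp <- predict(cv_fit, newx = x[-train_idx, ], s = "lambda.min", type = "link")
# type = "response": relative risk, always positive
rr <- predict(cv_fit, newx = x[-train_idx, ], s = "lambda.min",
              type = "response")

# The two scales differ only by exponentiation
print(all.equal(as.numeric(rr), exp(as.numeric(lp))))
```

Either scale gives the same patient ordering, so the C-index on the test set is the same whichever one you pass to the concordance function.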