Hi, I want to perform survival analysis on a TCGA dataset using the “survival” package in R. For each gene, the model is “coxph(Surv(time, censor) ~ exprs)”, where time is the survival time (for dead patients) or the last follow-up time (for alive patients), censor is the vital status of each cancer sample (alive=0, dead=1), and exprs is the gene expression value. I have about 1000 genes, so I fit the model 1000 times, once per gene.
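To make the setup concrete, here is a minimal sketch of the per-gene loop on simulated data rather than the real TCGA values. The names `exprs_mat` and `fit_one_gene`, the sample and gene counts, and the simulated distributions are placeholders I made up for illustration; only the `coxph(Surv(time, censor) ~ exprs)` call matches what I actually run.

```r
library(survival)

set.seed(1)

# Simulated stand-in for the real data (all values here are made up).
n_samples <- 100
n_genes   <- 5
time      <- rexp(n_samples, rate = 0.1)    # survival or last follow-up time
censor    <- rbinom(n_samples, 1, 0.6)      # alive = 0, dead = 1
exprs_mat <- matrix(
  rnorm(n_samples * n_genes),
  nrow = n_samples,
  dimnames = list(NULL, paste0("gene", seq_len(n_genes)))
)

# Fit one univariate Cox model and return the coefficient and its Wald p-value.
fit_one_gene <- function(exprs, time, censor) {
  fit <- coxph(Surv(time, censor) ~ exprs)
  summary(fit)$coefficients[1, c("coef", "Pr(>|z|)")]
}

# One model per gene (per column of the expression matrix).
results <- t(apply(exprs_mat, 2, fit_one_gene, time = time, censor = censor))
results <- as.data.frame(results)
colnames(results) <- c("coef", "p_value")

print(results)
```

With the real data, `exprs_mat` would be a samples-by-genes matrix with about 1000 columns, so `results` would have one row per gene.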
I also tried the same model with only the coding of censor changed, from “alive=0 and dead=1” to “alive=1 and dead=0”. The p-values change a lot. The number of significant genes is almost the same, but the overlap between the significant genes from the two codings is quite small (~30%).
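The comparison can be sketched roughly like this, again on simulated data. The 0.05 cutoff, the helper name `cox_p_value`, and measuring overlap as shared genes divided by genes significant under either coding are assumptions made for this example, not necessarily how the ~30% figure was computed.

```r
library(survival)

set.seed(42)

# Simulated stand-in for the real data (all values here are made up).
n_samples <- 200
n_genes   <- 20
time      <- rexp(n_samples, rate = 0.1)
censor    <- rbinom(n_samples, 1, 0.5)      # alive = 0, dead = 1
exprs_mat <- matrix(
  rnorm(n_samples * n_genes),
  nrow = n_samples,
  dimnames = list(NULL, paste0("gene", seq_len(n_genes)))
)

# Wald p-value of the expression term in a univariate Cox model.
cox_p_value <- function(exprs, time, status) {
  fit <- coxph(Surv(time, status) ~ exprs)
  summary(fit)$coefficients[1, "Pr(>|z|)"]
}

# Coding 1: alive = 0, dead = 1. Coding 2: alive = 1, dead = 0.
p_dead1 <- apply(exprs_mat, 2, cox_p_value, time = time, status = censor)
p_dead0 <- apply(exprs_mat, 2, cox_p_value, time = time, status = 1 - censor)

alpha     <- 0.05                            # assumed significance cutoff
sig_dead1 <- names(p_dead1)[p_dead1 < alpha]
sig_dead0 <- names(p_dead0)[p_dead0 < alpha]

# Overlap as shared genes over genes significant under either coding.
n_shared <- length(intersect(sig_dead1, sig_dead0))
n_either <- length(union(sig_dead1, sig_dead0))
overlap  <- if (n_either > 0) n_shared / n_either else NA_real_

print(data.frame(gene = names(p_dead1), p_dead1 = p_dead1, p_dead0 = p_dead0))
print(overlap)
```

Because the simulated expression values are pure noise, the numbers printed here say nothing about the real data; the sketch only shows the two fits and the overlap calculation side by side.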
From my understanding, how alive and dead are coded should not affect anything. So why does it change the result?