I have two groups of genes, each containing more than 2,000 genes. I assessed the clinical relevance of every gene in both groups by survival analysis with the “survival” package in R, fitting one model per gene:
coxph(Surv(time, censor) ~ exprs)
Here, time is the survival time (for patients who died) or the time of last follow-up (for patients still alive), censor records whether each patient is dead or alive, and exprs is the gene's expression value measured in RPKM.
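For readers who want to reproduce the per-gene step, here is a minimal sketch of how such a loop might look. The data are simulated and the names expr_mat, fit_one and cox_res are hypothetical, not taken from my actual script. It assumes the status variable is coded 1 = dead, 0 = alive, which is one of the codings Surv() accepts.

```r
library(survival)

set.seed(1)
n_samples <- 100
n_genes <- 20

# Simulated inputs (assumption): follow-up time, event indicator (1 = dead,
# 0 = alive), and a genes x samples matrix of expression values.
time <- rexp(n_samples, rate = 0.1)
censor <- rbinom(n_samples, size = 1, prob = 0.7)
expr_mat <- matrix(rlnorm(n_genes * n_samples), nrow = n_genes,
                   dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Fit one univariate Cox model for a single gene and return its coefficient
# (log hazard ratio), hazard ratio, and Wald p-value.
fit_one <- function(exprs) {
  fit <- coxph(Surv(time, censor) ~ exprs)
  s <- summary(fit)$coefficients
  out <- s[1, c("coef", "exp(coef)", "Pr(>|z|)")]
  names(out) <- c("coef", "hr", "p")
  out
}

# One row per gene.
cox_res <- as.data.frame(t(apply(expr_mat, 1, fit_one)))
head(cox_res)
```

The coef column is the log hazard ratio per unit of expression, so exp(coef) is the hazard ratio reported in the hr column.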
Each model gives a P-value and a coefficient for the gene. A previous study stated that “the Cox model also provides a coefficient for each term, which is related to its contribution to the hazard ratio. A positive coefficient indicates that the gene increases the hazard ratio, while a negative coefficient indicates that expression of the gene is protective.”
I marked every gene with a P-value <= 0.05 as a prognostic gene, then plotted the distribution of the coefficients of the prognostic genes in each group as a boxplot. The coefficients in group1 are significantly lower than those in group2 (wilcox.test in R). I interpreted this as group1 containing more protective genes or fewer harmful genes, since lower coefficients mean more strongly negative coefficients as well as smaller positive ones.
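The comparison described above can be sketched roughly as follows. The coefficient table is simulated purely so the code runs: in it the P-values are drawn independently of the coefficients, which would not be true of real Cox results. The names make_res, res, prog and sign_tab are hypothetical. The sign table at the end is an addition to what I actually did; it is one way to check whether the shift comes from more negative coefficients, fewer positive ones, or both.

```r
set.seed(2)

# Simulated per-gene Cox results for one gene group (assumption: values are
# random and only stand in for real coefficients and p-values).
make_res <- function(n, shift, label) {
  data.frame(group = label,
             coef = rnorm(n, mean = shift, sd = 0.5),
             p = runif(n))
}
res <- rbind(make_res(2000, -0.1, "group1"), make_res(2000, 0.1, "group2"))
res$group <- factor(res$group)

# Keep genes called prognostic at the P <= 0.05 cutoff.
prog <- res[res$p <= 0.05, ]

# Distribution of coefficients per group, and the rank-sum test on them.
boxplot(coef ~ group, data = prog, ylab = "Cox coefficient (log hazard ratio)")
wt <- wilcox.test(coef ~ group, data = prog)
wt$p.value

# Count coefficient signs per group: negative = protective, positive = harmful.
sign_tab <- table(prog$group, ifelse(prog$coef < 0, "protective", "harmful"))
sign_tab
```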
Is it meaningful to do this comparison? What does this result mean to you? Can you please tell me your interpretation?