I would like to find genes correlated with poor prognosis. I am doing a simple survival analysis:
- divide patients into two groups by gene expression (using median as cutoff).
- find genes significantly associated with overall survival time (using the coxph function in R).
- check whether the genes in my list are up- or down-regulated in cancer samples compared to normal samples.
- keep genes with a hazard ratio larger than 1 (the low-expression group lives longer) that are up-regulated in cancer samples, and genes with a hazard ratio smaller than 1 that are down-regulated in tumors.
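
The steps above could be sketched in R roughly as follows, for a single gene. This is only a minimal sketch on simulated data: the variable names (`expr_tumor`, `expr_normal`, `obs_time`, `status`), the simulated effect sizes, and the use of a difference in means as the tumor-vs-normal comparison are all assumptions for illustration, not real data. The low-expression group is set as the reference level, so a hazard ratio above 1 means the high-expression group has the higher risk.

```r
library(survival)

set.seed(1)
n <- 200

# Simulated (hypothetical) data: log-scale expression of one gene in tumors
expr_tumor  <- rnorm(n, mean = 0)
# Simulated expression of the same gene in normal tissue (assumed lower)
expr_normal <- rnorm(50, mean = -1)

# Survival times: higher expression -> higher hazard -> shorter survival
event_time  <- rexp(n, rate = 0.1 * exp(0.8 * expr_tumor))
censor_time <- rexp(n, rate = 0.03)
obs_time    <- pmin(event_time, censor_time)
status      <- as.integer(event_time <= censor_time)  # 1 = death, 0 = censored

# Step 1: split patients at the median; "low" is the reference group
group <- factor(ifelse(expr_tumor > median(expr_tumor), "high", "low"),
                levels = c("low", "high"))

# Step 2: Cox model for high vs. low expression
fit  <- coxph(Surv(obs_time, status) ~ group)
coef_table <- summary(fit)$coefficients
hr   <- coef_table["grouphigh", "exp(coef)"]  # hazard ratio, high vs. low
pval <- coef_table["grouphigh", "Pr(>|z|)"]

# Step 3: direction of change in tumor vs. normal (log-scale difference in means)
lfc <- mean(expr_tumor) - mean(expr_normal)

# Step 4: is the survival effect in the same direction as the tumor/normal change?
concordant <- (hr > 1 && lfc > 0) || (hr < 1 && lfc < 0)

print(c(HR = hr, p = pval, logFC = lfc))
print(concordant)
```

When this is repeated over many genes, the per-gene p-values would normally be corrected for multiple testing, for example with `p.adjust(pvals, method = "BH")`.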
Am I doing it right? Is the 4th step necessary? Must genes with a hazard ratio larger than 1 be up-regulated in tumor compared to normal tissue for the hazard ratio to make sense?