Hi, I want to do a simple survival analysis using TCGA expression data with the survival package in R. First, I divide the patients into two groups: those with low expression of gene A and those with high expression of gene A. Then I plot the survival curves in R and calculate a P value for the difference between the groups. The cutoff I use to divide the patients is the mean mRNA expression value of gene A across the whole dataset. My question is whether this cutoff is suitable. Do you have any suggestions? Could you please point me to some references?
If all that you need to hear is whether or not it is okay to divide 'high' and 'low' expression based on the mean, then the answer is: 'yes, it is okay'. You can also do the following:
- divide based on the median
- divide based on upper-, middle-, and lower tertiles
- divide by quartiles
- divide expression based on some other clinical trait
- et cetera.
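As a rough illustration of the options listed above, here is a minimal sketch using the survival package. It runs on simulated data, not TCGA: the data frame `df` and the columns `expr`, `time`, and `status` are hypothetical stand-ins for gene A expression and overall survival, so adapt the names to your own table.

```r
library(survival)

set.seed(1)
n <- 200

# Simulated stand-in data (an assumption, not real TCGA values):
# expr = gene A expression, time = follow-up time, status = 1 if event observed
df <- data.frame(
  expr   = rnorm(n, mean = 8, sd = 2),
  time   = rexp(n, rate = 0.1),
  status = rbinom(n, 1, 0.7)
)

# Two groups split at the median; use mean(df$expr) instead to split at the mean
cutoff <- median(df$expr)
df$group <- factor(ifelse(df$expr > cutoff, "high", "low"),
                   levels = c("low", "high"))

# Kaplan-Meier curves for the two groups
fit <- survfit(Surv(time, status) ~ group, data = df)
plot(fit, col = c("blue", "red"), xlab = "Time", ylab = "Survival probability")
legend("topright", legend = levels(df$group), col = c("blue", "red"), lty = 1)

# Log-rank test and its P value
lr <- survdiff(Surv(time, status) ~ group, data = df)
p_logrank <- 1 - pchisq(lr$chisq, df = length(lr$n) - 1)
print(p_logrank)

# Three groups split at the tertiles; probs = c(0, 0.25, 0.5, 0.75, 1) gives quartiles
df$tertile <- cut(df$expr,
                  breaks = quantile(df$expr, probs = c(0, 1/3, 2/3, 1)),
                  include.lowest = TRUE,
                  labels = c("low", "mid", "high"))
survdiff(Surv(time, status) ~ tertile, data = df)
```

Only the line that defines the groups changes between the mean, median, tertile, and quartile versions; the `survfit` and `survdiff` calls stay the same.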
In addition, you can take a look at this related answer: A: cox proportional hazard model
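For the Cox model route, one possible sketch is to keep expression as a continuous covariate instead of cutting it into groups. Again the data are simulated, and the names `dat`, `expr`, `time`, and `status` are hypothetical placeholders for your own columns.

```r
library(survival)

set.seed(2)
n <- 200

# Simulated stand-in data (an assumption): higher expr gives a higher hazard
expr <- rnorm(n, mean = 8, sd = 2)
dat <- data.frame(
  expr   = expr,
  time   = rexp(n, rate = 0.1 * exp(0.2 * (expr - 8))),
  status = rbinom(n, 1, 0.7)
)

# Cox proportional hazards model with expression as a continuous covariate
cox <- coxph(Surv(time, status) ~ expr, data = dat)

# Coefficient table: exp(coef) is the hazard ratio per unit increase in expr
print(summary(cox)$coefficients)

# Check of the proportional hazards assumption
print(cox.zph(cox))
```

This avoids choosing a cutoff at all, at the cost of assuming a log-linear relationship between expression and hazard.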
I'm giving an answer so that this thread will not be bumped up further to the main page.