Question: Gene expression comparison of Cancer and Normal tissue
I have a basic question about RNA-seq normalisation when comparing cancer/tumor tissue expression data to normal tissue expression data.

Assuming that cancer cells are highly proliferative, most of them will be in S/G2/M phase, whereas normal cells will mostly be in G1 phase. Hence, if we compare a gene that is expressed during S phase, it will appear highly expressed in tumors, whereas a gene that is suppressed during S/G2/M will appear highly expressed in normal cells. How is this accounted for in RNA-seq analysis?

For example, I want to know whether a specific pathway or gene is highly expressed in tumor cell lines because of aberrant activation of that pathway/gene, or simply because a specific phase of the cell cycle predominates, although in many cases the two amount to the same thing.

Most RNA-seq normalization methods assume that most genes are unchanged between conditions and should have roughly equal expression (see Robinson and Oshlack, 2010 for example). The data are adjusted accordingly, so that the bulk of genes in the middle of the expression range have the same mean in each data set. Genes that are differentially expressed (DE) will, by definition, not behave like “most genes”, so even after the data are adjusted with a normalization constant they will still show a difference, up or down, between conditions. Thus you should be able to see cell cycle genes, or any other pathway genes, as differentially expressed, as long as the assumption that most genes are not DE holds. If, on the other hand, a large fraction of genes is DE between conditions, the assumption breaks down. This is where externally added standards (spike-ins) can help, because the assumption then shifts to the standards being equal between conditions.
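
To make this concrete, here is a minimal sketch in plain Python. It is not the TMM method from Robinson and Oshlack; it uses the simpler median-of-ratios idea (the approach DESeq takes), which relies on the same "most genes are unchanged" assumption. The counts are made-up toy data, and the helper names (size_factors, normalized_log2fc, make_tumor) are hypothetical, chosen only for illustration. The tumor sample is simulated as sequenced twice as deeply as the normal sample, with the first n_up genes standing in for a 4-fold up-regulated cell cycle program.

```python
import math
from statistics import median

def size_factors(samples):
    """Median-of-ratios size factors.

    samples: list of samples, each a list of per-gene counts in the same gene order.
    """
    n_genes = len(samples[0])
    # Use only genes with nonzero counts in every sample so the log is defined.
    usable = [g for g in range(n_genes) if all(s[g] > 0 for s in samples)]
    # Per-gene reference: mean of log counts across samples (log of the geometric mean).
    log_ref = {g: sum(math.log(s[g]) for s in samples) / len(samples) for g in usable}
    # Each sample's factor is the median, over genes, of its ratio to the reference.
    return [
        math.exp(median(math.log(s[g]) - log_ref[g] for g in usable))
        for s in samples
    ]

def normalized_log2fc(normal, tumor):
    """log2(tumor / normal) per gene after dividing each sample by its size factor."""
    sf_normal, sf_tumor = size_factors([normal, tumor])
    return [
        math.log2((t / sf_tumor) / (n / sf_normal))
        for n, t in zip(normal, tumor)
    ]

def make_tumor(baseline, n_up, depth=2, fold=4):
    """Toy tumor sample: 'depth' times deeper sequencing, first n_up genes up 'fold'-fold."""
    return [b * depth * (fold if g < n_up else 1) for g, b in enumerate(baseline)]

# Toy baseline expression for 1000 genes (arbitrary made-up counts).
normal = [10 + (g % 90) for g in range(1000)]

# Case 1: only 5% of genes are truly DE, so the assumption holds.
fc_few = normalized_log2fc(normal, make_tumor(normal, n_up=50))

# Case 2: 60% of genes are truly DE, so the assumption is violated.
fc_many = normalized_log2fc(normal, make_tumor(normal, n_up=600))

print("5% DE:  up-regulated gene", round(fc_few[0], 2), "| unchanged gene", round(fc_few[500], 2))
print("60% DE: up-regulated gene", round(fc_many[0], 2), "| unchanged gene", round(fc_many[900], 2))
```

In the first case the depth difference is removed and the up-regulated genes keep a log2 fold change of about 2, while the unchanged genes sit at about 0. In the second case the median is set by the up-regulated genes themselves, so they look unchanged and the genuinely unchanged genes look 4-fold down. That is the situation in which a size factor computed from spike-in standards alone would be the safer reference.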

