Hi all,
I want to integrate 19 RNA-seq studies to build a coexpression network. First, I median-normalized the samples within each study, and then used ComBat from the sva package to correct for batch effects. Now I see that some genes show very high correlation in different studies. However, when I look at their expression values, all of them are expressed at negligible levels. In other words, I suspect their expression values are noise rather than signal.
I think I should build the coexpression network from only a subset of genes rather than all of them: the genes whose expression changes significantly across the different experiments. If that is right, how should I find these genes?
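
To make the idea concrete, here is a minimal sketch of the kind of filter I have in mind. It assumes the corrected data is a genes x samples matrix on a log scale. The function name `select_genes` and the thresholds (`min_expr`, `min_frac`, `top_var_frac`) are hypothetical placeholders that I made up for illustration, not recommended values, and the toy data is simulated.

```python
import numpy as np

def select_genes(expr, min_expr=1.0, min_frac=0.2, top_var_frac=0.5):
    """Return a boolean mask over genes (rows) that pass both filters.

    expr: 2-D array, genes x samples, assumed to be log-scale and
          already normalized and batch-corrected.
    All thresholds are illustrative assumptions, not recommendations.
    """
    expr = np.asarray(expr, dtype=float)
    # Filter 1: the gene exceeds min_expr in at least min_frac of samples.
    expressed = (expr > min_expr).mean(axis=1) >= min_frac
    keep = np.zeros(expr.shape[0], dtype=bool)
    idx = np.flatnonzero(expressed)
    if idx.size == 0:
        return keep
    # Filter 2: among expressed genes, keep the most variable fraction.
    variances = expr.var(axis=1, ddof=1)
    n_keep = max(1, int(np.ceil(top_var_frac * idx.size)))
    ranked = idx[np.argsort(variances[idx])[::-1]]
    keep[ranked[:n_keep]] = True
    return keep

# Simulated toy data: 3 near-zero genes, 3 expressed but flat genes,
# and 3 expressed genes that vary across samples.
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.05, size=(3, 10))
flat = 5.0 + rng.normal(0.0, 0.05, size=(3, 10))
variable = 5.0 + rng.normal(0.0, 2.0, size=(3, 10))
expr = np.vstack([noise, flat, variable])

mask = select_genes(expr, min_expr=1.0, min_frac=0.2, top_var_frac=0.5)
print(mask)
```

On this toy matrix only the last three rows should be kept: the near-zero genes fail the expression filter, and the flat genes rank below the variable ones on variance. What I do not know is whether this sort of expression-plus-variance filter is appropriate for coexpression analysis, or how the cutoffs should be chosen.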