Dear all, I am studying gene expression in the forebrain. I would like to know whether specific groups of genes, such as RBPs, TFs, cell adhesion molecules, neurotransmitters, etc., are specifically expressed in this region. To that end, I am collecting data from different parts of the forebrain and studying differential gene expression. Since I am gathering data mostly from already published sequencing and profiling studies, I need to understand exactly how many samples I should study. Can I compare data sets from two different studies? Since I will be studying and comparing only the control (WT) samples, I would expect data generated on the same kind of platform, either Illumina RNA-Seq or microarray, to show a similar pattern. However, I fear that if I randomly take two RNA-Seq studies (both generated on the Illumina platform), there will be a disparity in gene expression. On the other hand, if I study only one data set, the outcome might be biased and not representative. How do you think I should approach this problem?
Directly comparing datasets from two studies can distort the interpretation, because each dataset carries its own biases. For this situation there are packages that enable meta-analysis, i.e. combining data across studies and platforms. metaSeq is one such package. I would urge you to explore along these lines.
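To make the idea of a meta-analysis concrete, here is a minimal sketch of one common approach: each study is analysed on its own, and only the per-gene p-values are combined afterwards, here with Fisher's method. This is an illustration under stated assumptions, not the procedure of any particular package; the function name `fisher_combine`, the gene names and the p-values are all made up for the example.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values for one gene with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-squared distribution
    with 2k degrees of freedom under the null (k = number of studies).
    For even degrees of freedom the survival function has a closed form,
    so no external library is needed.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2

    # Chi-squared survival function for 2k degrees of freedom:
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Hypothetical per-study p-values for two genes (illustrative only).
per_study_p = {
    "geneA": [0.04, 0.03, 0.20],
    "geneB": [0.60, 0.45, 0.80],
}

for gene, ps in per_study_p.items():
    print(gene, fisher_combine(ps))
```

Because each study is tested internally before anything is combined, study-specific biases affect only that study's own p-values rather than a direct comparison of raw expression values across studies. Note that Fisher's method assumes the studies are independent and ignores the direction of the effect.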