Dear all, I am studying gene expression in the forebrain. I would like to know whether specific groups of genes, such as RBPs, TFs, cell adhesion molecules, neurotransmitters, etc., are specifically expressed in this region. To that end, I am collecting data from different parts of the forebrain and studying differential gene expression. Since I am gathering data mostly from already published sequencing and profiling studies, I need to understand exactly how many samples I should study. Can I compare data sets from two different studies? Since I will be studying and comparing only the control (Wt) samples, I would expect data generated on the same kind of platform (either Illumina RNA-Seq or microarray) to show a similar pattern. My fear is that if I pick two RNA-Seq studies at random (both generated on the Illumina platform), their gene expression will still differ. On the other hand, if I study only one data set, the outcome might be biased and not representative. How do you think I should approach this problem?
Dear c.chakraborty, if you want to do a gene expression meta-analysis, e.g. between case and control, you need to select studies that each have case samples and paired control samples. You cannot compare cases from one study with controls from a different study, because of the large noise and batch-to-batch variation. I recommend that you search for differentially expressed genes (DEGs) in each data set separately and then combine the significant results across data sets with Fisher's method.
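To make the Fisher's method suggestion concrete, here is a minimal sketch of combining per-gene p-values from two studies. The gene names and p-values are made up for illustration, and `fisher_combine` is a hypothetical helper, not part of any package. It assumes the studies are independent and that each gene was tested in both; note that the method ignores the direction of change, so that would need a separate check.

```python
import math

def fisher_combine(pvalues):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln p_i)
    follows a chi-square distribution with 2k degrees of freedom under
    the null hypothesis (k = number of p-values).
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # statistic / 2
    # Chi-square survival function for even degrees of freedom (2k)
    # has the closed form exp(-h) * sum_{i=0}^{k-1} h^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return 2.0 * half_stat, math.exp(-half_stat) * total

# Hypothetical per-study p-values from separate DEG analyses
study_a = {"GeneA": 0.01, "GeneB": 0.20, "GeneC": 0.03}
study_b = {"GeneA": 0.04, "GeneB": 0.50, "GeneD": 0.001}

# Only genes tested in both studies can be combined
shared = sorted(study_a.keys() & study_b.keys())
combined = {g: fisher_combine([study_a[g], study_b[g]])[1] for g in shared}

for gene, p in combined.items():
    print(f"{gene}\tcombined p = {p:.4g}")
```

The combined p-values would still need multiple-testing correction across genes before calling anything significant.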
Dear Shamim, I am not going to use controls from one study and experimental samples from another. That is not the goal at all! :P The studies done so far on brain sub-regions have control data sets (i.e., samples that are supposed to be homeostatic: no drug, no gene knock-out). I am interested in using only the control data sets, say two from hypothalamic gene expression studies and two from cortex studies, and comparing their gene expression. The question is: can I use two control data sets obtained on the Illumina RNA-Seq platform by two different groups, or should I use only one control data set from one group? If I use both, I have to assume that things such as biases from RNA degradation during the actual experiment, the efficiency of poly(A) purification (if any), read alignment, and the coverage obtained were similar across the transcriptome studies. And that is a roll of the dice.