Hi, I have performed RNA-seq on samples from two different conditions (each with at least two or three replicates). I have aligned the samples using TopHat and now wish to perform differential expression analysis using Cufflinks. I have a question about cuffmerge. I suspect that one replicate from one condition is an outlier. Should I include this sample when generating the merged assembly file with cuffmerge, or exclude it? What is the likely effect on the differential expression analysis if I include or exclude it?
It depends on what basis you believe it to be an outlier. If it just has unusually high expression over the same transcripts that the other samples have, then including it should not be much of a problem for cuffmerge. However, if you believe that it shows expression in regions where it shouldn't (i.e. expression 'noise'), then cuffmerge will interpret these as genuine transcripts and include them in the consensus transcriptome.
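If you decide to exclude it, the simplest way is to leave that sample's assembly out of the text file you give to cuffmerge, which lists one GTF path per line. Here is a minimal Python sketch of that. The sample directory names, the `transcripts.gtf` layout, and the `build_manifest` helper are all made up for illustration, so adjust them to your own files.

```python
from pathlib import Path


def build_manifest(gtf_paths, exclude=()):
    """Return manifest text with one GTF path per line.

    A path is skipped when the name of its parent directory is in `exclude`.
    This assumes one Cufflinks output directory per sample (hypothetical layout).
    """
    excluded = set(exclude)
    kept = [p for p in gtf_paths if Path(p).parent.name not in excluded]
    return "\n".join(kept) + "\n"


# Hypothetical sample names: two conditions, suspected outlier is condB_rep2.
gtfs = [
    "condA_rep1/transcripts.gtf",
    "condA_rep2/transcripts.gtf",
    "condB_rep1/transcripts.gtf",
    "condB_rep2/transcripts.gtf",
    "condB_rep3/transcripts.gtf",
]

manifest = build_manifest(gtfs, exclude=["condB_rep2"])
# Save this text to a file (e.g. assemblies.txt) and pass that file to cuffmerge.
print(manifest, end="")
```

Building the list once with and once without the suspect sample lets you run cuffmerge on both and compare the merged assemblies, which is probably the most direct way to see whether that sample adds spurious transcripts.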