I'm dealing with a classic dilemma: I performed an RNA-seq experiment on two biological replicates for condition A and two others for condition B. After alignment and differential expression analysis using the DESeq package, I have a whole list of genes with fold changes of A vs B. Now, my question is: where do I put the cutoff?
- From a biological point of view, I'm tempted (as others have done) to set a fold change of 2 as the cutoff. Twice as many transcripts is arguably meaningful at the biological level for a cell. But is it really? If we assume it is, that brings me to the next point:
- What is a good cutoff for the p-value? I'm tempted to use padj (i.e. FDR-corrected), and the hits I get that way are almost surely genuine (in fact, I tested them by qPCR and they are indeed differentially expressed between A and B). However, am I missing potentially interesting hits by being too restrictive? If so, where do I set my cutoff?
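
To make the two cutoffs concrete, here is a minimal Python sketch of applying a fold-change threshold and a padj threshold together, then relaxing both to see which extra genes appear. Everything in it is an assumption for illustration: the rows, the gene names, the `select_hits` helper and the relaxed thresholds (1.5-fold, padj < 0.25) are invented, and the table is only a stand-in for an exported DESeq results table, not its actual format.

```python
import math

# Hypothetical rows standing in for an exported results table.
# Gene names and numbers are invented for illustration only.
results = [
    {"gene": "geneA", "fold_change": 4.0, "padj": 0.001},
    {"gene": "geneB", "fold_change": 0.2, "padj": 0.03},
    {"gene": "geneC", "fold_change": 1.6, "padj": 0.0005},
    {"gene": "geneD", "fold_change": 3.0, "padj": 0.20},
    {"gene": "geneE", "fold_change": 2.5, "padj": None},  # assume padj may be missing
]


def select_hits(rows, min_fold=2.0, max_padj=0.05):
    """Return names of genes passing both cutoffs, up- or down-regulated."""
    min_log2 = math.log2(min_fold)
    hits = []
    for row in rows:
        padj = row["padj"]
        # Skip rows without a usable adjusted p-value or fold change.
        if padj is None or row["fold_change"] <= 0:
            continue
        # Work on the log2 scale so 2-fold up and 2-fold down are symmetric.
        log2_fc = math.log2(row["fold_change"])
        if abs(log2_fc) >= min_log2 and padj < max_padj:
            hits.append(row["gene"])
    return hits


strict = select_hits(results)                                # 2-fold, padj < 0.05
relaxed = select_hits(results, min_fold=1.5, max_padj=0.25)  # looser, illustrative

# Genes that only show up once the cutoffs are loosened.
only_relaxed = [g for g in relaxed if g not in strict]
print(strict, only_relaxed)
```

Comparing `strict` with `only_relaxed` is one way to see how many candidates sit just outside the stricter thresholds before deciding which of them are worth a qPCR check.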
FYI: I'm dealing with Illumina, single-end 50 bp, non-strand-specific, bacterial RNA-seq data.
Thank you all for your input on this,