I am processing RNA-Seq data from 2 plant genotypes collected over 3 time points. Each genotype has 2 conditions (control and treatment), and each condition has 3 biological replicates. For 2 of the 3 time points, one biological replicate, in either control or treatment, has a much lower read count than the other replicates of the same condition (e.g. 1-3 million vs. 9-12 million reads). I normalize the data with TMM before calling DEGs with edgeR, which I thought should handle the differences in read counts. However, the number of DEGs almost doubles when I repeat the analysis without the low-read-count sample (110 vs. 210 DEGs).
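To make the depth imbalance concrete, here is a minimal Python sketch of how the low-depth replicate could be flagged and why its depth matters at the counts-per-million (CPM) scale. It is not my actual edgeR pipeline and it does not implement TMM. The sample names, library sizes, the 50% cutoff, and the helper functions are all made up for illustration.

```python
from statistics import median

# Hypothetical library sizes (total reads) for one time point.
# Names and numbers are invented; they only mimic the 1-3M vs. 9-12M pattern.
library_sizes = {
    "ctrl_rep1": 10_500_000,
    "ctrl_rep2": 9_200_000,
    "ctrl_rep3": 2_100_000,  # the low-depth replicate
    "trt_rep1": 11_800_000,
    "trt_rep2": 9_900_000,
    "trt_rep3": 10_400_000,
}


def group_of(sample):
    """Derive the condition name by dropping the trailing '_repN' suffix."""
    return sample.rsplit("_", 1)[0]


def flag_low_depth(sizes, min_fraction=0.5):
    """Return samples whose library size is below min_fraction of their group median.

    The 0.5 default is an arbitrary illustrative cutoff, not a recommendation.
    """
    groups = {}
    for sample, size in sizes.items():
        groups.setdefault(group_of(sample), []).append(size)
    return [
        sample
        for sample, size in sizes.items()
        if size < min_fraction * median(groups[group_of(sample)])
    ]


def cpm(count, library_size):
    """Convert a raw read count to counts per million."""
    return count * 1_000_000 / library_size


def reads_for_cpm(cpm_value, library_size):
    """Raw read count that corresponds to a given CPM in a library of this size."""
    return cpm_value * library_size / 1_000_000


low = flag_low_depth(library_sizes)
print("Low-depth samples:", low)

# The same CPM value rests on far fewer reads in the shallow library,
# so its per-gene counts are noisier even after scaling.
for sample in ("ctrl_rep1", "ctrl_rep3"):
    print(sample, "reads at 1 CPM:", reads_for_cpm(1, library_sizes[sample]))
```

In this toy example, 1 CPM corresponds to about 10 reads in the deep library but only about 2 reads in the shallow one. That is the kind of difference I suspect scaling normalization alone cannot compensate for.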
I am considering whether I should remove the samples with low read counts from the analysis. The downside is that I would be left with only 2 biological replicates in the affected condition for DEG calling. Would you mind sharing your thoughts or suggestions?
Thanks a lot in advance!