While running a differential expression analysis with DESeq2, I got some odd results: some genes with mostly missing (zero) counts were detected as significant DEGs with very low p-values, as shown below.
Gene   log2FoldChange  pvalue    padj
Actc1  25.34           7.34E-72  4.45E-68

Sample1  Sample2  Sample3  Sample4  Sample5  Sample6
0.00     0.00     0.00     0.00     0.00     796.48

Control1  Control2  Control3  Control4  Control5
0.00      0.00      0.00      0.00      0.00

The row ends with two further values that have no column header: 132.75 and 0.00.
Sorry, it's not so easy to see here, but only one of the 11 samples (6 samples + 5 controls) actually has a non-zero count, so I would not expect this gene to be detected as a significant DEG. Many genes with this pattern were detected as significant DEGs.
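To make the pattern concrete, here is a minimal base-R sketch that counts how many samples carry a non-zero value for a gene and computes the per-group means. The matrix holds only the Actc1 values quoted above. The names `counts`, `n_nonzero`, `group_means` and `single_sample_genes`, and the cutoff of one sample, are illustrative assumptions and not part of my actual pipeline.

```r
# Actc1 values copied from the row above (Sample1..Sample6, then Control1..Control5).
counts <- matrix(
  c(0, 0, 0, 0, 0, 796.48,
    0, 0, 0, 0, 0),
  nrow = 1,
  dimnames = list("Actc1", c(paste0("Sample", 1:6), paste0("Control", 1:5)))
)

# Group labels in the same column order as the matrix.
conds <- c(rep("Sample", 6), rep("Control", 5))

# Number of samples with a non-zero value, per gene.
n_nonzero <- rowSums(counts > 0)

# Per-group mean for each gene (columns come out in alphabetical order: Control, Sample).
group_means <- t(apply(counts, 1, function(x) tapply(x, conds, mean)))

# Genes whose whole signal comes from at most one sample
# (the cutoff of 1 is an assumption chosen for illustration).
single_sample_genes <- names(n_nonzero)[n_nonzero <= 1]

print(n_nonzero)
print(round(group_means, 2))
print(single_sample_genes)
```

With a full matrix in place of the one-row example, `single_sample_genes` would list every gene that shows this single-sample pattern, which might help to check how many of the significant DEGs are affected.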
The code used to obtain these results looks like this:
# 6 samples followed by 5 controls
conds <- c(rep("Sample", 6), rep("Control", 5))
# contrast levels must be quoted strings
res <- results(dds, contrast = c("conds", "Sample", "Control"))
Why do these genes get such low p-values and padj values that they are detected as DEGs?
Can somebody explain this to me?