I am not a statistician and my knowledge of statistics is limited. I have been working on a project where we used RNA-seq and analysed the results with DESeq2. While experimenting with log2 fold change (LFC) thresholds, I noticed that the p-values differ depending on whether I set the LFC threshold inside the results() function or filter manually afterwards. To make this clearer, here is the code I used for the two strategies:
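# Strategy 1: LFC threshold set inside results()
res <- results(dds, contrast = c("treatment", "AICAr", "ctrl"), lfcThreshold = 1, alpha = 0.05)
# Strategy 2: default results(), then manual filtering on LFC and adjusted p-value
res <- results(dds, contrast = c("treatment", "AICAr", "ctrl"))
# which() drops genes whose padj is NA instead of returning NA rows
res <- res[which(abs(res$log2FoldChange) > 1 & res$padj < 0.05), ]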
The first strategy seems very conservative: it gives much larger p-values and far fewer significantly different genes than the second. Can someone explain why this is so? Also, which do you think is the better approach?
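In case it helps anyone reproduce the comparison without my data, here is a minimal sketch on simulated counts. It assumes the DESeq2 package is installed and uses its makeExampleDESeqDataSet() helper, so the design variable is "condition" with levels A and B rather than my real "treatment" column. The object names (res_thr, res_std, n_thr, n_post) are only for illustration.

```r
# Reproducible sketch on SIMULATED data (not the real AICAr experiment).
library(DESeq2)

set.seed(1)
# 2000 genes, 8 samples, two-level factor "condition" (A vs B)
dds <- makeExampleDESeqDataSet(n = 2000, m = 8, betaSD = 1)
dds <- DESeq(dds)

# Strategy 1: LFC threshold set inside results()
res_thr <- results(dds, contrast = c("condition", "B", "A"),
                   lfcThreshold = 1, alpha = 0.05)

# Strategy 2: default results(), then manual filtering afterwards
res_std <- results(dds, contrast = c("condition", "B", "A"))
keep_post <- which(abs(res_std$log2FoldChange) > 1 & res_std$padj < 0.05)

# Number of genes called significant under each strategy
n_thr  <- sum(res_thr$padj < 0.05, na.rm = TRUE)
n_post <- length(keep_post)
cat("threshold inside results():", n_thr, "\n")
cat("manual filtering afterwards:", n_post, "\n")

# Per-gene p-values side by side, to see how they differ
head(data.frame(lfc        = res_std$log2FoldChange,
                pvalue_std = res_std$pvalue,
                pvalue_thr = res_thr$pvalue))
```

The exact counts will depend on the simulated data, so I have not listed expected output here.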