Huge difference between P and adj P value
3.2 years ago
Dr SKY ▴ 20

Hello friends, I recently analysed an RNA-Seq dataset and identified DEGs using DESeq2. Everything was fine until I noticed that with a P value cutoff of 0.05 I am left with 1,700 DEGs, whereas with a Padj cutoff of 0.05 I am left with only 127 DEGs. Isn't this surprising? How can there be such a huge difference in the number of significant genes just from using the adjusted P value?

Thank you to anyone who can spare a little time and energy to guide me. I am stuck!

rna-seq • 1.1k views

Check the expression levels of your DEGs. Genes with robust expression give more robust statistics (padj). Does this mean that the lowly expressed genes aren't differentially expressed? No, but we can't establish it statistically. One way of dealing with this is to add more replicates (look for more experiments in public databases like GEO).


Thank you for your reply

3.2 years ago

It's completely expected. Under classic multiple testing correction (Bonferroni), with around 20,000 genes tested you'd expect your adjusted P-values to be up to 20,000 times larger than your unadjusted values. Obviously it's not as bad as this because we don't use Bonferroni in RNA-seq (DESeq2's padj uses the Benjamini-Hochberg procedure by default), but the multiple testing burden is still high.

Another way of looking at it: under a standard hypothesis test at the 5% level, we would expect 5% of the tests on genes with no real difference to give a false positive. With around 20,000 genes, that's around 1,000 genes, so we would expect the number of genes passing a 5% threshold to be at least 1,000 fewer for an adjusted p-value.
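
To make this concrete, here is a rough Python sketch (not what DESeq2 runs internally, just a hand-written Bonferroni and Benjamini-Hochberg adjustment) applied to simulated p-values. The gene counts (19,000 null genes, 1,000 genes with a real effect) and the way the "real effect" p-values are drawn are assumptions made up for illustration, so the exact numbers it prints won't match your data.

```python
import random


def bonferroni(pvalues):
    """Bonferroni: multiply each p-value by the number of tests, capped at 1."""
    n = len(pvalues)
    return [min(1.0, p * n) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so that the
    # adjusted values never decrease as the raw p-value increases.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def count_below(values, threshold=0.05):
    """Number of values strictly below the threshold."""
    return sum(v < threshold for v in values)


random.seed(0)
N_NULL, N_TRUE = 19_000, 1_000  # hypothetical split, for illustration only

# Null genes: p-values are uniform on [0, 1].
pvals = [random.random() for _ in range(N_NULL)]
# Genes with a real effect: p-values skewed towards 0 (arbitrary choice).
pvals += [random.random() ** 8 for _ in range(N_TRUE)]

n_raw = count_below(pvals)
n_bh = count_below(benjamini_hochberg(pvals))
n_bonf = count_below(bonferroni(pvals))

print("raw p < 0.05:        ", n_raw)
print("BH padj < 0.05:      ", n_bh)
print("Bonferroni p < 0.05: ", n_bonf)
```

In the raw count, roughly 5% of the 19,000 null genes slip through by chance alone, which is the "around 1,000 genes" mentioned above. The BH count drops most of those, and the Bonferroni count is never larger than the BH count.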


Thank you so very much. I can now finally proceed to the next level.
