Random dataset and DESeq2
10 months ago
Saleh • 0

Hi everyone,

I have real gene count data that I reshuffled (the counts for each gene in a column were shuffled), and I ran DESeq2 to check whether I would get any significant genes. I got more than a hundred. Isn't this result surprising? Why am I getting so many significant genes? Is there something wrong with my approach here?

DESeq2 • 601 views

I don't think you've given enough detail about your approach (what's the data? are you correcting for covariates? how did you shuffle? how are you defining significant? how are you estimating library sizes?) to enable anyone to comment on whether it's right or wrong.

10 months ago
LauferVA 4.2k

Saleh

Not only is this not surprising, it is the basis for an entire class of statistical techniques referred to as permutation-based testing, which can be used to derive accurate test statistics when one is worried about insufficient Type I error control for some reason or another. I think that is the place to start reading (try https://en.wikipedia.org/wiki/Permutation_test first, then move to the literature from there).

If these test statistics did not generate significant results, they would have no ability to be used to empirically derive alpha (threshold for Type I Error).
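To make the idea concrete, here is a minimal sketch of a permutation test for a single gene, assuming two groups and a difference in means as the test statistic. This is not what DESeq2 does internally; the function name `permutation_p_value` and the toy numbers are made up for illustration only.

```python
import random

def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    values: one measurement per sample (hypothetical toy data).
    labels: True/False group membership per sample.
    """
    rng = random.Random(seed)

    def stat(lab):
        a = [v for v, l in zip(values, lab) if l]
        b = [v for v, l in zip(values, lab) if not l]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        # Shuffling the labels breaks any real group/value association,
        # so each pass gives one draw from the empirical null.
        rng.shuffle(shuffled)
        if stat(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (1 + hits) / (n_perm + 1)

# Toy example: two clearly separated groups of five samples each.
values = [1, 2, 3, 4, 5, 101, 102, 103, 104, 105]
labels = [False] * 5 + [True] * 5
print(permutation_p_value(values, labels))
```

The shuffled runs are the point: the fraction of them that look at least as extreme as the real data is what gives you an empirical p-value, and hence an empirical handle on alpha.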

You haven't told us how many gene annotations you are using, but nearly all gene sets in use contain roughly 18,000 to 50,000 genes. 100 DEGs is therefore between 1 in 180 and 1 in 500, which is not at all unreasonable even after controlling for multiple testing using the false discovery rate (FDR).
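The arithmetic behind those ratios, as a quick sketch. The count of 100 and the two gene-set sizes are the round figures from this thread, not values taken from your data, and `one_in` is just a helper name I made up.

```python
def one_in(n_significant, n_genes):
    """Return k such that n_significant out of n_genes is '1 in k'."""
    return n_genes / n_significant

n_significant = 100  # round figure from the question ("more than hundred")
for n_genes in (18_000, 50_000):
    fraction = n_significant / n_genes
    print(f"{n_genes} genes: 1 in {one_in(n_significant, n_genes):.0f} "
          f"({fraction:.2%} of genes tested)")
```

Swap in your actual number of tested genes and significant calls to see where your run falls.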
