GSEA with fgsea package and using stat from DESeq2 output - what to do with NA padj values?
11 weeks ago
Fossil ▴ 20

Hello, I am running GSEA in R using the fgsea package and using stat from the DESeq2 output. With my DESeq2 output, I removed genes that have no DESeq2 data (log2FC, stat, p, etc...).

Question: I was wondering if I also need to remove rows with padj=NA. I have a couple of genes that have baseMean, log2FC, lfcSE, stat, pval but no padj (probably due to the threshold).

What should I do with these genes? They are still processed in fgsea as I am working with stat and not the p value.

Thanks!

RNA-seq DESeq2 GSEA • 484 views
11 weeks ago
Ming Tommy Tang ★ 4.3k

First, take a look at those genes. What gene types are they (non-coding RNA?)? Are they expressed at a very low level? How many of them are there? Most of the time it is probably fine to just remove them and rank the gene list using the stat or the p-value.

By the way, this video may help a little too.

11 weeks ago
ATpoint 84k

The NAs come from the independent filtering and outlier removal of DESeq2.

The statistics behind fgsea rely on ranks and permutations. Hence, you cannot just set the NAs to some value (like 0), as this creates many arbitrary rank ties, which might skew the p-value calculation.
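To illustrate the point about ties, here is a minimal base R sketch. The gene names and statistic values are made up for illustration; in practice the vector would come from the `stat` column of the DESeq2 results.

```r
# Toy Wald statistics; three genes have NA (hypothetical values).
stat <- c(geneA = 4.2, geneB = -3.1, geneC = NA, geneD = 1.7,
          geneE = NA, geneF = -0.4, geneG = NA)

# Replacing NA with 0 puts all three genes at exactly the same value.
stat_zero <- stat
stat_zero[is.na(stat_zero)] <- 0
n_tied_zero <- sum(duplicated(stat_zero))   # entries repeating an earlier value

# Dropping the NAs instead leaves a ranking without artificial ties.
stat_clean <- stat[!is.na(stat)]
n_tied_clean <- sum(duplicated(stat_clean))

print(sort(stat_zero, decreasing = TRUE))
print(sort(stat_clean, decreasing = TRUE))
```

With thousands of NA genes, the block of identical zeros in the middle of the ranking gets correspondingly large, and the order within that block is arbitrary.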

Two options:

  • Remove NAs (I would probably do that) since these did not have good evidence to even be considered in the final DE analysis
  • Turn off all independent filtering and outlier detection in DESeq2. If so, I would definitely recommend prefiltering before running DESeq() as suggested in the vignette, or using edgeR::filterByExpr to really focus on genes with sufficient counts for analysis
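A rough sketch of both options follows. The data frame is a toy stand-in for `as.data.frame(results(dds))` with made-up values, and the object name `dds` and the prefilter thresholds in the commented part are assumptions, not something from this thread. The second option is left as comments because it needs DESeq2 and a real dataset.

```r
# Toy stand-in for a DESeq2 results table; all values are hypothetical.
res <- data.frame(
  gene     = c("geneA", "geneB", "geneC", "geneD", "geneE", "geneF"),
  baseMean = c(850, 420, 3, 96, 2, 310),
  stat     = c(4.2, -3.1, 0.9, 1.7, -0.2, -0.4),
  padj     = c(0.001, 0.01, NA, 0.2, NA, 0.8)
)

# Option 1: keep only genes that have both a stat and a padj.
res_kept <- res[!is.na(res$stat) & !is.na(res$padj), ]

# Named vector of the Wald statistic, sorted in decreasing order.
ranks <- sort(setNames(res_kept$stat, res_kept$gene), decreasing = TRUE)
print(ranks)

# Option 2 (not run here; requires DESeq2 and an existing `dds` object):
# keep <- rowSums(counts(dds) >= 10) >= 3   # prefilter; thresholds are placeholders
# dds  <- DESeq(dds[keep, ])
# res  <- results(dds, independentFiltering = FALSE, cooksCutoff = FALSE)
```

The resulting `ranks` vector can then be passed as the `stats` argument of `fgsea::fgsea()`, together with a named list of gene sets as `pathways`.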

Thank you for the help! That makes sense.

I will probably just remove the NAs (there are about 6500 padj NAs out of 20,000 rows) and keep the independent filtering on.
