Question: Weird R behaviour with NAs in DESeq2 results
Carlo Yague wrote (9 days ago):

Hello!

I have a DESeq2 results object (a data-frame-like structure) in R with some NAs in the padj column. Strangely, those NAs seem to be handled inconsistently by two nearly identical functions that I use to test genes for down- or up-regulation. What is happening here? A quick fix would be to change those NAs to 1 (never significant), but I want to understand :) Here is my code:

significant_D <- function(x){return(x$padj < 0.01 & (x$log2FoldChange) < -0.584)}

significant_O <- function(x){return(x$padj < 0.01 & (x$log2FoldChange) > 0.584)}

head(DESeq_results[which(is.na(DESeq_results$padj)),])
               baseMean log2FoldChange     lfcSE        stat    pvalue      padj
               <numeric>      <numeric> <numeric>   <numeric> <numeric> <numeric>
WBGene00021406 20.718704    -0.21384520 0.5063939 -0.42229021 0.6728132        NA
WBGene00021407  3.961096     0.66807041 1.1159760  0.59864226 0.5494115        NA
WBGene00021405  1.939649    -1.16416923 1.6007395 -0.72726966 0.4670608        NA
WBGene00021409  1.719862     1.91952086 2.2842210  0.84033940 0.4007181        NA
WBGene00235257  7.055687     0.23653150 0.8488041  0.27866442 0.7805024        NA
WBGene00015246 15.699319     0.06366663 0.6514634  0.09772863 0.9221478        NA

sum(is.na(significant_D(DESeq_results)))
[1] 2558

sum(is.na(significant_O(DESeq_results)))
[1] 1452

sum(is.na(significant_D(DESeq_results)) & is.na(significant_O(DESeq_results)))
[1] 0

sum(is.na(DESeq_results$padj))
[1] 8582
[1] 8582

Devon Ryan wrote (9 days ago):

The confusion is due to the following:

> TRUE & NA
NA
> FALSE & NA
FALSE
> NA & NA
NA

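The outputs of significant_D and significant_O are logical vectors containing some NA values. Wherever one of them returns NA or TRUE, the other necessarily returns FALSE: you can only get TRUE or NA when the fold change passes that function's filter, which means it fails the opposite filter in the other function, and FALSE & NA is FALSE. No gene is therefore NA in both vectors, so the penultimate sum is 0. Genes with an NA padj whose fold change passes neither filter are FALSE in both vectors, which is why the two NA counts add up to less than the total number of NA padj values. As an aside, it makes sense that FALSE & NA is FALSE: NA can be read as "unknown", and the result is FALSE whatever the unknown value turns out to be.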
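
To make this concrete, here is a minimal sketch on a made-up table. The data frame `res` and the helper `significant_D_safe` are hypothetical names used only for illustration (they are not part of DESeq2 or of the original post); the two `significant_*` functions are the ones from the question.

```r
# Toy stand-in for a DESeq2 results table (values invented for illustration).
res <- data.frame(
  log2FoldChange = c(-1.2, 1.9, 0.1, -2.0, 1.5),
  padj           = c(NA,   NA,  NA,  0.001, 0.5)
)

# The two filters from the question.
significant_D <- function(x){return(x$padj < 0.01 & (x$log2FoldChange) < -0.584)}
significant_O <- function(x){return(x$padj < 0.01 & (x$log2FoldChange) > 0.584)}

d <- significant_D(res)
o <- significant_O(res)

# NA padj + passing fold change  -> NA & TRUE  -> NA
# NA padj + failing fold change  -> NA & FALSE -> FALSE
print(d)
print(o)

# Each NA in one vector is matched by FALSE in the other, so no row is NA in both.
print(sum(is.na(d) & is.na(o)))

# One possible NA-safe variant (an assumption, not the poster's code):
# test for NA first so that an unknown padj always yields FALSE.
significant_D_safe <- function(x) {
  !is.na(x$padj) & x$padj < 0.01 & x$log2FoldChange < -0.584
}
d_safe <- significant_D_safe(res)
print(d_safe)

# which() drops NAs as well, so it gives the indices of the rows that are TRUE.
print(which(d))
```

Only the fourth row of the toy table has a known padj below 0.01 together with a fold change below -0.584, so it is the only row that the NA-safe variant and `which()` report. Whether treating NA as "not significant" is appropriate depends on why padj is NA in your data.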


Carlo Yague replied (9 days ago):

Good catch! Thanks!