Question: Error in `[.data.frame`(res.df, , cols) : undefined columns selected when exporting glimma
0
gravatar for n.tear
5 weeks ago by
n.tear20
n.tear20 wrote:

Hi,

I am trying to export my RNAseq results with glimma and I am following this workshop (https://bioconductor.github.io/BiocWorkshops/rna-seq-data-analysis-with-deseq2.html#building-reports) unfortuneatly I am geting an error as per the title.

Please would someone be able to educate me as to why this is occuring? My code below:

#Glimma
library(org.Hs.eg.db)
res.s$symbol <- mapIds(org.Hs.eg.db,keys = rownames(res.s),column = "SYMBOL",keytype="ENSEMBL")
table(rownames(res.s) %in% keys(org.Hs.eg.db, "ENSEMBL"))

library("Glimma")
tmp <- "Z:/R.DESeq2"
status <- as.numeric(res.s$padj < .1)
anno <- data.frame(GeneID=rownames(res.s), symbol=res.s$symbol)
glMDPlot(res.sLFC, status=status, counts=counts(dds.s,normalized=TRUE),
         groups=dds.s$comparison, transform=FALSE,
         samples=colnames(dds.s), anno=anno,
         path=tmp, folder="glimma", launch=FALSE)

Error in [.data.frame(res.df, , cols) : undefined columns selected

Many thanks in advance Nathan

rna-seq • 174 views
ADD COMMENTlink written 5 weeks ago by n.tear20

Hello Nathan. What is the output of:

str(res.sLFC)
colnames(res.sLFC)
colnames(dds.s)

?

ADD REPLYlink written 5 weeks ago by Kevin Blighe61k

Hi Kevin thank you for your reply

str(res.sLFC)
    Formal class 'DESeqResults' [package "DESeq2"] with 7 slots
      ..@ priorInfo      :List of 4
      .. ..$ type         : chr "apeglm"
      .. ..$ package      : chr "apeglm"
      .. ..$ version      :Classes 'package_version', 'numeric_version'  hidden list of 1
      .. .. ..$ : int [1:3] 1 10 0
      .. ..$ prior.control:List of 7
      .. .. ..$ no.shrink            : int [1:2] 1 2
      .. .. ..$ prior.mean           : num 0
      .. .. ..$ prior.scale          : num 0.865
      .. .. ..$ prior.df             : num 1
      .. .. ..$ prior.no.shrink.mean : num 0
      .. .. ..$ prior.no.shrink.scale: num 15
      .. .. ..$ prior.var            : num 0.749
      ..@ rownames       : chr [1:31172] "ENSG00000000003" "ENSG00000000005" "ENSG00000000419" "ENSG00000000457" ...
      ..@ nrows          : int 31172
      ..@ listData       :List of 4
      .. ..$ baseMean      : num [1:31172] 30899.28 2.93 2457.81 1310.71 417.27 ...
      .. ..$ log2FoldChange: Named num [1:31172] -2.98 -3.053 0.289 -0.931 0.92 ...
      .. .. ..- attr(*, "names")= chr [1:31172] "ENSG00000000003" "ENSG00000000005" "ENSG00000000419" "ENSG00000000457" ...
      .. ..$ lfcSE         : Named num [1:31172] 0.314 1.084 0.165 0.183 0.334 ...
      .. .. ..- attr(*, "names")= chr [1:31172] "ENSG00000000003" "ENSG00000000005" "ENSG00000000419" "ENSG00000000457" ...
      .. ..$ svalue        : num [1:31172] 0.00011 0.03615 0.88092 0.86355 0.81761 ...
      ..@ elementType    : chr "ANY"
      ..@ elementMetadata:Formal class 'DFrame' [package "S4Vectors"] with 6 slots
      .. .. ..@ rownames       : NULL
      .. .. ..@ nrows          : int 4
      .. .. ..@ listData       :List of 2
      .. .. .. ..$ type       : chr [1:4] "intermediate" "results" "results" "results"
      .. .. .. ..$ description: chr [1:4] "
      .. .. ..@ elementType    : chr "ANY"
      .. .. ..@ elementMetadata: NULL
      .. .. ..@ metadata       : list()
      ..@ metadata       :List of 6
      .. ..$ filterThreshold: Named num 0.986
      .. .. ..- attr(*, "names")= chr "11.63265%"
      .. ..$ filterTheta    : num 0.116
      .. ..$ filterNumRej   :'data.frame':  50 obs. of  2 variables:
      .. .. ..$ theta : num [1:50] 0 0.0194 0.0388 0.0582 0.0776 ...
      .. .. ..$ numRej: num [1:50] 8324 8372 8443 8492 8551 ...
      .. ..$ lo.fit         :List of 2
      .. .. ..$ x: num [1:50] 0 0.0194 0.0388 0.0582 0.0776 ...
      .. .. ..$ y: num [1:50] 8322 8379 8436 8493 8551 ...
      .. ..$ alpha          : num 0.1
      .. ..$ lfcThreshold   : num 2
    > colnames(res.sLFC)
    [1] "baseMean"       "log2FoldChange" "lfcSE"          "svalue"        
    > colnames(dds.s)
     [1] "C18"     "C54"     "C56"     "C59"     "C23"     "C27"     "C38"
ADD REPLYlink modified 4 weeks ago • written 5 weeks ago by n.tear20

Should res.sLFC not just be a data-frame? - and you will have to specify the x and y values via xval and yval, respectively.

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by Kevin Blighe61k

I generated the res.sLFC using:

res.sLFC <- lfcShrink(dds.s, coef="co", type="apeglm", lfcThreshold=2)

As a data frame:

> res.sLFCdf <- as.data.frame(res.sLFC)
> res.sLFCdf
                    baseMean log2FoldChange      lfcSE       svalue
ENSG00000000003 3.089928e+04   -2.980109835 0.31391008 1.098978e-04
ENSG00000000005 2.927032e+00   -3.052505162 1.08427276 3.614998e-02
ENSG00000000419 2.457810e+03    0.288835065 0.16471434 8.809206e-01
ENSG00000000457 1.310707e+03   -0.930850511 0.18271970 8.635484e-01
ENSG00000000460 4.172665e+02    0.919899394 0.33440589 8.176108e-01

Now I have changed toa data frame the error is:

>glMDPlot(res.sLFCdf, status=status, counts=counts(dds.s,normalized=TRUE),
+          groups=dds.s$comparison, transform=FALSE,
+          samples=colnames(dds.s), anno=anno,
+          path=tmp, folder="glimma", launch=FALSE)
Error in if (isDefaultStates(x) && isDefaultCols(cols)) { : 
  missing value where TRUE/FALSE needed
ADD REPLYlink modified 4 weeks ago • written 5 weeks ago by n.tear20

I see. You will have to also set the following:

glMDPlot(..., xval <- 'baseMean', yval <- 'log2FoldChange', ...)

In fact, you may even have to create a new column for the logged base mean, and then use that:

res.sLFCdf$logmean <- log(res.sLFCdf$baseMean + 1)
glMDPlot(..., xval <- 'logmean', yval <- 'log2FoldChange', ...)
ADD REPLYlink written 5 weeks ago by Kevin Blighe61k

Hi Kevin, many thanks for your continued support.

Unfortunately I am still having no luck

> res.sLFCdf$logmean <- log(res.sLFCdf$baseMean + 1)
> glMDPlot(res.sLFCdf, xval <- 'logmean', yval <- 'log2FoldChange', 
+          status=status, counts=counts(dds.s,normalized=TRUE),
+          groups=dds.s$comparison, transform=FALSE,
+          samples=colnames(dds.s), anno=anno,
+          path=tmp, folder="glimma", launch=TRUE)
Error in if (isDefaultStates(x) && isDefaultCols(cols)) { : 
  missing value where TRUE/FALSE needed
ADD REPLYlink written 5 weeks ago by n.tear20

Seems that there are missing or infinite values in your logmean variable. Can check the output of this:

table( is.na(res.sLFCdf$logmean) )

table( res.sLFCdf$logmean == 'Inf' )
ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by Kevin Blighe61k

output suggests not

> table( is.na(res.sLFCdf$logmean) )

FALSE 
31172 
> table( res.sLFCdf$logmean == 'Inf' )

FALSE 
31172
ADD REPLYlink written 5 weeks ago by n.tear20

Can you literally check your input data, in both formatting (string, numeric, factor, etc) and also for other missing values? Like, literally, look through it or output it to disk and open it in Excel.

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by Kevin Blighe61k

Yes looked in exel and no missing values and checked the class of each column in res.sLFCdf and they are correct

could it be an issue that not all the rownames of res.s are in org.Hs.eg.db when I make the anno?

> library(org.Hs.eg.db)
> res.s$symbol <- mapIds(org.Hs.eg.db,keys = rownames(res.s),column = "SYMBOL",keytype="ENSEMBL")
'select()' returned 1:many mapping between keys and
columns
> table(rownames(res.s) %in% keys(org.Hs.eg.db, "ENSEMBL"))

FALSE  TRUE 
10649 20523

I also have some NAs in my status object. could this be a problem?

status <- as.numeric(res.s$padj < .1)
status
   [1]  1  1  0  1  1  0  1  0  1  0  0  1  0  1  1  1  0
  [18]  0  0  0  0  0  0  1  1  0  1  0  0  1  0  0  0  1
  [35]  1  1  0  1  0  1  1  0  0  1  1  0  1  1  0  0  0
  [52]  0  0  0  1  0  0  0  0  0  0  0  1  0  0  0  1  0
  [69]  0 NA  1  0  1  0  1  0  0 NA  0  0  0  0  0  0  0
ADD REPLYlink modified 4 weeks ago • written 4 weeks ago by n.tear20

Did you manage to get any further with this? I am limited in what I can do from a distance. Best idea would be to find the last known working example, and then work from there

ADD REPLYlink written 4 weeks ago by Kevin Blighe61k
1

Hi Kevin,

Thanks for your help. Ive done exacly as you have said above and managed to work it out in the end.

Cheers

Nathan

ADD REPLYlink written 4 weeks ago by n.tear20
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1703 users visited in the last hour