Question: DEXseq Error - SummarizedExperiment
0
gravatar for BM
5 months ago by
BM40
United Kingdom
BM40 wrote:

I have an error when performing a DEXseq count for exon usuage from RNA-seq data.

DEXSeqDataSetFromSE( SE, design= ~ sample + exon + condition:exon )

I receive the error: Error in DEXSeqDataSetFromSE(SE, design = ~sample + exon + condition:exon) : 
make sure your SummarizedExperiment object contain the columns gene_id, tx_name and exonic_part

Any help would be appreciated.

This is the full code

library(DEXSeq)
library(GenomicRanges)
library(GenomicFeatures)
library(GenomicAlignments)
library(biomaRt)
hse = makeTxDbFromBiomart( biomart="ensembl", dataset="mmusculus_gene_ensembl", host="ensembl.org")
library(txdb.mmusculus.ucsc.mm10.knowngene)
exonicParts = exonicParts( hse, linked.to.single.gene.only = TRUE )
list.files(pattern=".bam$")
fls = list.files(pattern="bam$", full=TRUE )
bamlst = BamFileList( fls, index=character(), yieldSize=100000, obeyQname=TRUE )
SE = summarizeOverlaps( exonicParts, bamlst, mode="Union", singleEnd=FALSE, ignore.strand=TRUE, inter.feature=FALSE, fragments=TRUE )
DEXSeqDataSetFromSE( SE, design= ~ sample + exon + condition:exon )
Error in DEXSeqDataSetFromSE(SE, design = ~sample + exon + condition:exon) : 
make sure your SummarizedExperiment object contain the columns gene_id, tx_name and exonic_part

**Session Info:**

> sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] biomaRt_2.38.0              GenomicAlignments_1.18.1    Rsamtools_1.34.1           
 [4] Biostrings_2.50.2           XVector_0.22.0              GenomicFeatures_1.34.8     
 [7] DEXSeq_1.28.3               RColorBrewer_1.1-2          AnnotationDbi_1.44.0       
[10] DESeq2_1.22.2               SummarizedExperiment_1.12.0 DelayedArray_0.8.0         
[13] matrixStats_0.54.0          GenomicRanges_1.34.0        GenomeInfoDb_1.18.2        
[16] IRanges_2.16.0              S4Vectors_0.20.1            Biobase_2.42.0             
[19] BiocGenerics_0.28.0         BiocParallel_1.16.6        

loaded via a namespace (and not attached):
 [1] httr_1.4.0             bit64_0.9-7            splines_3.5.2          assertthat_0.2.1      
 [5] Formula_1.2-3          statmod_1.4.30         BiocManager_1.30.4     latticeExtra_0.6-28   
 [9] blob_1.1.1             GenomeInfoDbData_1.2.0 progress_1.2.1         pillar_1.4.0          
[13] RSQLite_2.1.1          backports_1.1.4        lattice_0.20-38        digest_0.6.18         
[17] checkmate_1.9.3        colorspace_1.4-1       htmltools_0.3.6        Matrix_1.2-17         
[21] plyr_1.8.4             XML_3.98-1.19          pkgconfig_2.0.2        genefilter_1.64.0     
[25] zlibbioc_1.28.0        xtable_1.8-4           snow_0.4-3             scales_1.0.0          
[29] htmlTable_1.13.1       tibble_2.1.1           annotate_1.60.1        ggplot2_3.1.1         
[33] nnet_7.3-12            lazyeval_0.2.2         survival_2.44-1.1      magrittr_1.5          
[37] crayon_1.3.4           memoise_1.1.0          hwriter_1.3.2          foreign_0.8-71        
[41] prettyunits_1.0.2      tools_3.5.2            data.table_1.12.2      hms_0.4.2             
[45] stringr_1.4.0          munsell_0.5.0          locfit_1.5-9.1         cluster_2.0.9         
[49] compiler_3.5.2         rlang_0.3.4            grid_3.5.2             RCurl_1.95-4.12       
[53] rstudioapi_0.10        htmlwidgets_1.3        bitops_1.0-6           base64enc_0.1-3       
[57] gtable_0.3.0           curl_3.3               DBI_1.0.0              R6_2.4.0              
[61] gridExtra_2.3          rtracklayer_1.42.2     knitr_1.22             bit_1.1-14            
[65] Hmisc_4.2-0            stringi_1.4.3          Rcpp_1.0.1             geneplotter_1.60.0    
[69] rpart_4.1-15           acepack_1.4.1          xfun_0.7
ADD COMMENTlink modified 4 months ago by benformatics1.1k • written 5 months ago by BM40

make sure your SummarizedExperiment object contain the columns gene_id, tx_name and exonic_part**

Did you do that? Check the DEXseq manual on #Requirements on GTF files.

ADD REPLYlink modified 5 months ago • written 5 months ago by ATpoint24k
0
gravatar for benformatics
4 months ago by
benformatics1.1k
ETH Zurich
benformatics1.1k wrote:

This is a Bioconductor issue it seems... In the past it looks like DEXSeq took an "exonic parts" object created through a disjointExons call despite the current manual suggesting that users use an exonicParts call.

If it is no problem for you to reorder your ranges then a simple fix is the following:

orderByGeneName <- order(unlist(exonicParts$gene_id, use.names=FALSE))
exonic_rle <- runLength(Rle(unlist(exonicParts$gene_id[orderByGeneName],use.names=FALSE))) 
exonicParts <- exonicParts[orderByGeneName]
exonic_part <- unlist(lapply(exonic_rle, seq_len), use.names=FALSE)
exonicParts$exonic_part <- exonic_part

If you already have your ranges and want to keep them in the same order since you already made your SE object then use the following (simply uses a temporary variable):

orderByGeneName <- order(unlist(exonicParts$gene_id, use.names=FALSE))
exonic_rle <- runLength(Rle(unlist(exonicParts$gene_id[orderByGeneName],use.names=FALSE))) 
exonicParts2 <- exonicParts[orderByGeneName]
exonic_part <- unlist(lapply(exonic_rle, seq_len), use.names=FALSE)
exonicParts2$exonic_part <- exonic_part
exonicParts$exonic_part <- exonicParts2$exonic_part[match(as.character(exonicParts),as.character(exonicParts2))]
rm(exonicParts2)
## double check that the following is TRUE
identical(rowRanges(SE),exonicParts)
rowData(SE)$exonic_part <-  exonicParts$exonic_part

Although I did notice some minor discrepancies (~3-7%) between the disjointExons and exonicParts; which I think relates to associations between genes and exons.

ADD COMMENTlink written 4 months ago by benformatics1.1k

I tried what you suggested using option 1,now I have the following error:

converting counts to integer mode Error in DESeqDataSet(rse, design, ignoreRank = TRUE) : all samples have 0 counts for all genes. check the counting script.

ADD REPLYlink written 4 months ago by BM40

I assume you checked that your count table actually contains non-zero counts?

Ultimately, this should only provide some meta-data to the final object... the counts should remain unchanged.

ADD REPLYlink written 4 months ago by benformatics1.1k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1976 users visited in the last hour