Question

load BSgenome.Hsapiens.UCSC.hg38 or hg19 in R

5.9 years ago

sophialovechan ▴ 80

Hi everyone, I am using chromVAR to analyze ATAC-seq data. When I tried to add GC content to my data, I ran into a problem loading the genome sequences. The code is below.

counts_GC <- addGCBias(counts, genome = BSgenome.Hsapiens.UCSC.hg38)

The error is:

Error in loadFUN(x, seqname, ranges) : 
  trying to load regions beyond the boundaries of non-circular sequence "chr1"

When I used BSgenome.Hsapiens.UCSC.hg19 instead, the error changed to:

Error in loadFUN(x, seqname, ranges) : 
  trying to load regions beyond the boundaries of non-circular sequence "chr17"

Can anyone help me understand what is happening here and how I can solve it? Thank you so much, I really appreciate it.

R • 4.4k views

ADD COMMENT • link updated 5.9 years ago by GenoMax 141k • written 5.9 years ago by sophialovechan ▴ 80

Please post a few example records from the counts object, and also check its structure with str(counts).

ADD REPLY • link 5.9 years ago by cpad0112 21k

This is the output of rowRanges for the counts object:

GRanges object with 51883 ranges and 3 metadata columns:
          seqnames               ranges strand |     score      qval
             <Rle>            <IRanges>  <Rle> | <integer> <numeric>
      [1]     chr1     [ 10358,  10857]      * |        82   8.28541
      [2]     chr1     [ 11151,  11650]      * |        53   5.35280
      [3]     chr1     [ 29063,  29562]      * |       150  15.03696
      [4]     chr1     [ 32325,  32824]      * |       244  24.46690
      [5]     chr1     [114730, 115229]      * |        53   5.35280
      ...      ...                  ...    ... .       ...       ...
  [51879]     chrY [58989595, 58990094]      * |       104  10.43868
  [51880]     chrY [58991072, 58991571]      * |        68   6.83379
  [51881]     chrY [58992165, 58992664]      * |       108  10.81628
  [51882]     chrY [59004913, 59005412]      * |        64   6.41733
  [51883]     chrY [59213724, 59214223]      * |        95   9.54581
                             name
                      <character>
      [1]     MLC_new_rep2_peak_1
      [2]             SplM_peak_1
      [3]     MLC_new_rep2_peak_2
      [4]             SplM_peak_3
      [5]             SplM_peak_4
      ...                     ...
  [51879] MLC_new_rep2_peak_32686
  [51880] MLC_new_rep2_peak_32687
  [51881] MLC_new_rep2_peak_32688
  [51882] MLC_new_rep2_peak_32689
  [51883] MLC_new_rep2_peak_32690

ADD REPLY • link updated 5.9 years ago by GenoMax 141k • written 5.9 years ago by sophialovechan ▴ 80

The strand information looks like this:

> counts@rowRanges@strand@values
[1] *
Levels: + - *

Is something wrong with this? I just imported BAM files from a bowtie alignment followed by samtools sort.

ADD REPLY • link 5.9 years ago by sophialovechan ▴ 80

What is the class of the object? It must be RangedSummarizedExperiment or SummarizedExperiment.

This works for me:

> data(example_counts, package = "chromVAR")
> class(example_counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"
> subset_counts <- example_counts[1:10,]
> class(subset_counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"
> example_counts <- addGCBias(subset_counts, genome = BSgenome.Hsapiens.UCSC.hg19)
> class(example_counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"

session info:

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_IN.UTF-8       LC_NUMERIC=C               LC_TIME=en_IN.UTF-8       
 [4] LC_COLLATE=en_IN.UTF-8     LC_MONETARY=en_IN.UTF-8    LC_MESSAGES=en_IN.UTF-8   
 [7] LC_PAPER=en_IN.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_IN.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] chromVAR_1.0.2                    BSgenome.Hsapiens.UCSC.hg19_1.4.0
 [3] BSgenome_1.46.0                   rtracklayer_1.38.3               
 [5] Biostrings_2.46.0                 XVector_0.18.0                   
 [7] GenomicRanges_1.30.3              GenomeInfoDb_1.14.0              
 [9] IRanges_2.12.0                    S4Vectors_0.16.0                 
[11] BiocGenerics_0.24.0              

loaded via a namespace (and not attached):
 [1] bitops_1.0-6                matrixStats_0.53.1          DirichletMultinomial_1.20.0
 [4] TFBSTools_1.16.0            bit64_0.9-7                 httr_1.3.1                 
 [7] rprojroot_1.3-2             tools_3.4.4                 backports_1.1.2            
[10] R6_2.2.2                    DT_0.4                      seqLogo_1.44.0             
[13] DBI_1.0.0                   lazyeval_0.2.1              colorspace_1.3-2           
[16] bit_1.1-12                  compiler_3.4.4              Biobase_2.38.0             
[19] DelayedArray_0.4.1          plotly_4.7.1                caTools_1.17.1             
[22] scales_0.5.0                readr_1.1.1                 stringr_1.3.1              
[25] digest_0.6.15               Rsamtools_1.30.0            rmarkdown_1.9              
[28] R.utils_2.6.0               pkgconfig_2.0.1             htmltools_0.3.6            
[31] htmlwidgets_1.2             rlang_0.2.0                 RSQLite_2.1.1              
[34] VGAM_1.0-5                  shiny_1.0.5                 bindr_0.1.1                
[37] jsonlite_1.5                BiocParallel_1.12.0         gtools_3.5.0               
[40] dplyr_0.7.4                 R.oo_1.22.0                 RCurl_1.95-4.10            
[43] magrittr_1.5                GO.db_3.5.0                 GenomeInfoDbData_1.0.0     
[46] Matrix_1.2-14               Rcpp_0.12.16                munsell_0.4.3              
[49] R.methodsS3_1.7.1           stringi_1.2.2               yaml_2.1.19                
[52] SummarizedExperiment_1.8.1  zlibbioc_1.24.0             plyr_1.8.4                 
[55] grid_3.4.4                  blob_1.1.1                  promises_1.0.1             
[58] miniUI_0.1.1                CNEr_1.14.0                 lattice_0.20-35            
[61] splines_3.4.4               annotate_1.56.2             hms_0.4.2                  
[64] KEGGREST_1.18.1             knitr_1.20                  pillar_1.2.2               
[67] reshape2_1.4.3              TFMPvalue_0.0.8             glue_1.2.0                 
[70] XML_3.98-1.11               evaluate_0.10.3             data.table_1.11.2          
[73] png_0.1-8                   httpuv_1.4.3                tidyr_0.8.0                
[76] purrr_0.2.4                 gtable_0.2.0                poweRlaw_0.70.1            
[79] assertthat_0.2.0            ggplot2_2.2.1               mime_0.5.1                 
[82] xtable_1.8-2                later_0.7.2                 viridisLite_0.3.0          
[85] tibble_1.4.2                GenomicAlignments_1.14.2    AnnotationDbi_1.40.0       
[88] memoise_1.1.0               bindrcpp_0.2.2

ADD REPLY • link 5.9 years ago by cpad0112 21k

I did the same as you. With a subset of my data it worked, but when I used the whole data set again, I got the same error.

> class(counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"
> subset_counts <- counts[1:10,]
> class(subset_counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"
> example_counts <- addGCBias(subset_counts, genome = BSgenome.Hsapiens.UCSC.hg19)
> adGC_counts <- addGCBias(counts, genome = BSgenome.Hsapiens.UCSC.hg19)
Error in loadFUN(x, seqname, ranges) : 
  trying to load regions beyond the boundaries of non-circular sequence "chr17"

ADD REPLY • link updated 5.9 years ago by GenoMax 141k • written 5.9 years ago by sophialovechan ▴ 80

Sometimes it is simply a version problem. Make sure that your R installation and packages are up to date, and cross-check against the session info posted above. The example I gave was copied from the manual, except for the number (500 vs. 10).

ADD REPLY • link 5.9 years ago by cpad0112 21k
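
Since a 10-row subset works while the full object fails, one possibility (not confirmed anywhere in this thread) is that a few peaks extend past the end of a chromosome in the chosen genome build, for example because the reads were aligned to a different build or because peaks were resized near a chromosome end. The sketch below shows one way to look for such rows. `find_out_of_bounds` is a hypothetical helper written for illustration, not part of chromVAR, and the peaks are toy data; only the two chromosome lengths are taken from hg19.

```r
library(GenomicRanges)

# Hypothetical helper (not a chromVAR function): returns TRUE for every range
# that lies on an unknown sequence or outside [1, chromosome length].
find_out_of_bounds <- function(peaks, chrom_lengths) {
  len <- chrom_lengths[as.character(seqnames(peaks))]
  unname(is.na(len) | start(peaks) < 1 | end(peaks) > len)
}

# Toy peaks for illustration only
toy_peaks <- GRanges(
  seqnames = c("chr17", "chr17", "chrY"),
  ranges = IRanges(start = c(100, 81195000, 500),
                   end   = c(599, 81195499, 999))
)

# Lengths of chr17 and chrY in hg19
toy_lengths <- c(chr17 = 81195210L, chrY = 59373566L)

bad <- find_out_of_bounds(toy_peaks, toy_lengths)

# Only the second toy peak runs past the end of chr17
print(toy_peaks[bad])
```

With the real data, the same check could be run as `bad <- find_out_of_bounds(rowRanges(counts), seqlengths(BSgenome.Hsapiens.UCSC.hg19))`. If any rows are flagged, it is worth confirming that the alignment and the BSgenome package use the same build before dropping them with `counts[!bad, ]`.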