Question: load BSgenome.Hsapiens.UCSC.hg38 or hg19 in R
12 months ago, sophialovechan wrote:

Hi everyone, I am using chromVAR to analyze ATAC-seq data. When I try to add GC content to my data, I run into problems loading the sequences. The code is below.

counts_GC <- addGCBias(counts, genome = BSgenome.Hsapiens.UCSC.hg38)

The error is:

Error in loadFUN(x, seqname, ranges) : 
  trying to load regions beyond the boundaries of non-circular sequence "chr1"

When I used BSgenome.Hsapiens.UCSC.hg19, the error changed to:

Error in loadFUN(x, seqname, ranges) : 
  trying to load regions beyond the boundaries of non-circular sequence "chr17"

Can anyone help me understand what happened here and how I can solve this problem? Thank you so much, I really appreciate it.
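
The error message says that at least one range extends past the end of a chromosome in the chosen genome. Below is a minimal sketch, in base R only, of how one might look for such ranges. The helper name find_out_of_bounds is hypothetical, the chr17 peak is made up for illustration (the other two peaks are taken from the rowRanges output in this thread), and the lengths are the hg19 values for three chromosomes only.

```r
# Hypothetical helper: return the peaks that fall outside their chromosome.
# `peaks` is a data.frame with columns seqnames, start, end.
# `chrom_lengths` is a named numeric vector (names are chromosome names).
find_out_of_bounds <- function(peaks, chrom_lengths) {
  limit <- unname(chrom_lengths[as.character(peaks$seqnames)])
  # Chromosomes missing from chrom_lengths give NA; flag those as well.
  bad <- is.na(limit) | peaks$start < 1 | peaks$end > limit
  peaks[bad, , drop = FALSE]
}

# hg19 lengths for three chromosomes (illustration only).
hg19_lengths <- c(chr1 = 249250621, chr17 = 81195210, chrY = 59373566)

peaks <- data.frame(
  seqnames = c("chr1", "chr17", "chrY"),
  start    = c(10358, 81195000, 59213724),  # chr17 row is a made-up example
  end      = c(10857, 81195499, 59214223),
  stringsAsFactors = FALSE
)

bad_peaks <- find_out_of_bounds(peaks, hg19_lengths)
print(bad_peaks)
```

With the real objects, and assuming SummarizedExperiment and the BSgenome package are loaded, the inputs could be taken from as.data.frame(rowRanges(counts)) and seqlengths(BSgenome.Hsapiens.UCSC.hg19). Any rows returned point at ranges that do not fit the chosen assembly.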


Please post example records from the counts object, and also check the output of str() on counts.

Reply written 12 months ago by cpad0112

This is the result from rowRanges:

GRanges object with 51883 ranges and 3 metadata columns:
          seqnames               ranges strand |     score      qval
             <Rle>            <IRanges>  <Rle> | <integer> <numeric>
      [1]     chr1     [ 10358,  10857]      * |        82   8.28541
      [2]     chr1     [ 11151,  11650]      * |        53   5.35280
      [3]     chr1     [ 29063,  29562]      * |       150  15.03696
      [4]     chr1     [ 32325,  32824]      * |       244  24.46690
      [5]     chr1     [114730, 115229]      * |        53   5.35280
      ...      ...                  ...    ... .       ...       ...
  [51879]     chrY [58989595, 58990094]      * |       104  10.43868
  [51880]     chrY [58991072, 58991571]      * |        68   6.83379
  [51881]     chrY [58992165, 58992664]      * |       108  10.81628
  [51882]     chrY [59004913, 59005412]      * |        64   6.41733
  [51883]     chrY [59213724, 59214223]      * |        95   9.54581
                             name
                      <character>
      [1]     MLC_new_rep2_peak_1
      [2]             SplM_peak_1
      [3]     MLC_new_rep2_peak_2
      [4]             SplM_peak_3
      [5]             SplM_peak_4
      ...                     ...
  [51879] MLC_new_rep2_peak_32686
  [51880] MLC_new_rep2_peak_32687
  [51881] MLC_new_rep2_peak_32688
  [51882] MLC_new_rep2_peak_32689
  [51883] MLC_new_rep2_peak_32690
Reply written 12 months ago by sophialovechan (modified 12 months ago by genomax)

The strand information looks like this:

> counts@rowRanges@strand@values
[1] *
Levels: + - *

Is there something wrong with this? I just imported BAM files from a bowtie alignment followed by samtools sort.

Reply written 12 months ago by sophialovechan

What is the class of the object? The object class must be RangedSummarizedExperiment or SummarizedExperiment.

This works for me:

> data(example_counts, package = "chromVAR")
> class(example_counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"
> subset_counts <- example_counts[1:10,]
> class(subset_counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"
> example_counts <- addGCBias(subset_counts, genome = BSgenome.Hsapiens.UCSC.hg19)
> class(example_counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"

Session info:

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_IN.UTF-8       LC_NUMERIC=C               LC_TIME=en_IN.UTF-8       
 [4] LC_COLLATE=en_IN.UTF-8     LC_MONETARY=en_IN.UTF-8    LC_MESSAGES=en_IN.UTF-8   
 [7] LC_PAPER=en_IN.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_IN.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] chromVAR_1.0.2                    BSgenome.Hsapiens.UCSC.hg19_1.4.0
 [3] BSgenome_1.46.0                   rtracklayer_1.38.3               
 [5] Biostrings_2.46.0                 XVector_0.18.0                   
 [7] GenomicRanges_1.30.3              GenomeInfoDb_1.14.0              
 [9] IRanges_2.12.0                    S4Vectors_0.16.0                 
[11] BiocGenerics_0.24.0              

loaded via a namespace (and not attached):
 [1] bitops_1.0-6                matrixStats_0.53.1          DirichletMultinomial_1.20.0
 [4] TFBSTools_1.16.0            bit64_0.9-7                 httr_1.3.1                 
 [7] rprojroot_1.3-2             tools_3.4.4                 backports_1.1.2            
[10] R6_2.2.2                    DT_0.4                      seqLogo_1.44.0             
[13] DBI_1.0.0                   lazyeval_0.2.1              colorspace_1.3-2           
[16] bit_1.1-12                  compiler_3.4.4              Biobase_2.38.0             
[19] DelayedArray_0.4.1          plotly_4.7.1                caTools_1.17.1             
[22] scales_0.5.0                readr_1.1.1                 stringr_1.3.1              
[25] digest_0.6.15               Rsamtools_1.30.0            rmarkdown_1.9              
[28] R.utils_2.6.0               pkgconfig_2.0.1             htmltools_0.3.6            
[31] htmlwidgets_1.2             rlang_0.2.0                 RSQLite_2.1.1              
[34] VGAM_1.0-5                  shiny_1.0.5                 bindr_0.1.1                
[37] jsonlite_1.5                BiocParallel_1.12.0         gtools_3.5.0               
[40] dplyr_0.7.4                 R.oo_1.22.0                 RCurl_1.95-4.10            
[43] magrittr_1.5                GO.db_3.5.0                 GenomeInfoDbData_1.0.0     
[46] Matrix_1.2-14               Rcpp_0.12.16                munsell_0.4.3              
[49] R.methodsS3_1.7.1           stringi_1.2.2               yaml_2.1.19                
[52] SummarizedExperiment_1.8.1  zlibbioc_1.24.0             plyr_1.8.4                 
[55] grid_3.4.4                  blob_1.1.1                  promises_1.0.1             
[58] miniUI_0.1.1                CNEr_1.14.0                 lattice_0.20-35            
[61] splines_3.4.4               annotate_1.56.2             hms_0.4.2                  
[64] KEGGREST_1.18.1             knitr_1.20                  pillar_1.2.2               
[67] reshape2_1.4.3              TFMPvalue_0.0.8             glue_1.2.0                 
[70] XML_3.98-1.11               evaluate_0.10.3             data.table_1.11.2          
[73] png_0.1-8                   httpuv_1.4.3                tidyr_0.8.0                
[76] purrr_0.2.4                 gtable_0.2.0                poweRlaw_0.70.1            
[79] assertthat_0.2.0            ggplot2_2.2.1               mime_0.5.1                 
[82] xtable_1.8-2                later_0.7.2                 viridisLite_0.3.0          
[85] tibble_1.4.2                GenomicAlignments_1.14.2    AnnotationDbi_1.40.0       
[88] memoise_1.1.0               bindrcpp_0.2.2
Reply written 12 months ago by cpad0112 (modified 12 months ago)

I did the same as you. Using a subset of my data, it worked. But when I used the whole data set again, I got the same error.

> class(counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"
> subset_counts <- counts[1:10,]
> class(subset_counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"
> example_counts <- addGCBias(subset_counts, genome = BSgenome.Hsapiens.UCSC.hg19)
> adGC_counts <- addGCBias(counts, genome = BSgenome.Hsapiens.UCSC.hg19)
Error in loadFUN(x, seqname, ranges) : 
  trying to load regions beyond the boundaries of non-circular sequence "chr17"
Reply written 12 months ago by sophialovechan (modified 12 months ago by genomax)

Sometimes it is simply a version problem. Make sure that your packages and R are up to date, and cross-check against the session info posted above. The example I gave was copied from the manual, except for the number of rows (10 instead of 500).
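
One way to do this cross-check, as a rough sketch: compare the locally installed versions against a few versions copied from the sessionInfo() output above. The helper installed_version and the choice of which packages to compare are assumptions made for illustration; packages that are not installed are reported as NA.

```r
# Reference versions copied from the sessionInfo() output in this thread.
reference <- c(chromVAR = "1.0.2", BSgenome = "1.46.0",
               GenomicRanges = "1.30.3", SummarizedExperiment = "1.8.1")

# Hypothetical helper: installed version as a string, or NA if not installed.
installed_version <- function(pkg) {
  tryCatch(as.character(utils::packageVersion(pkg)),
           error = function(e) NA_character_)
}

report <- data.frame(
  package   = names(reference),
  reference = unname(reference),
  installed = unname(vapply(names(reference), installed_version, character(1))),
  stringsAsFactors = FALSE
)

# TRUE when the local package is older than the reference, NA if not installed.
report$older_than_reference <- vapply(seq_len(nrow(report)), function(i) {
  if (is.na(report$installed[i])) return(NA)
  package_version(report$installed[i]) < package_version(report$reference[i])
}, logical(1))

print(report)
```

Rows with TRUE in the last column are candidates for updating before rerunning addGCBias.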

Reply written 12 months ago by cpad0112 (modified 12 months ago)