load BSgenome.Hsapiens.UCSC.hg38 or hg19 in R
5.9 years ago

Hi everyone, I am using chromVAR to analyze ATAC-seq data. When I try to add GC content to my data, loading the genome sequences fails. The code is below.

counts_GC <- addGCBias(counts, genome = BSgenome.Hsapiens.UCSC.hg38)

The error is:

Error in loadFUN(x, seqname, ranges) : 
  trying to load regions beyond the boundaries of non-circular sequence "chr1"

When I used BSgenome.Hsapiens.UCSC.hg19 instead, the error changed to:

Error in loadFUN(x, seqname, ranges) : 
  trying to load regions beyond the boundaries of non-circular sequence "chr17"

Can anyone explain what is happening here and how I can solve this problem? Thank you so much, I really appreciate it.


Please post a few example records from the counts object, and also check its structure with str(counts).
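
A minimal sketch of that kind of inspection, using the example data shipped with chromVAR as a stand-in for the poster's own `counts` object (an assumption; substitute your own object when running it):

```r
library(SummarizedExperiment)

# chromVAR's bundled example stands in for the poster's `counts` object here.
data(example_counts, package = "chromVAR")

head(rowRanges(example_counts))            # first few peak records
summary(width(rowRanges(example_counts)))  # distribution of peak widths
str(example_counts, max.level = 2)         # top-level slots of the object
```

The peak coordinates and widths are usually the most informative part when sequence loading fails.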


Here is the result of rowRanges on the counts object:

GRanges object with 51883 ranges and 3 metadata columns:
          seqnames               ranges strand |     score      qval
             <Rle>            <IRanges>  <Rle> | <integer> <numeric>
      [1]     chr1     [ 10358,  10857]      * |        82   8.28541
      [2]     chr1     [ 11151,  11650]      * |        53   5.35280
      [3]     chr1     [ 29063,  29562]      * |       150  15.03696
      [4]     chr1     [ 32325,  32824]      * |       244  24.46690
      [5]     chr1     [114730, 115229]      * |        53   5.35280
      ...      ...                  ...    ... .       ...       ...
  [51879]     chrY [58989595, 58990094]      * |       104  10.43868
  [51880]     chrY [58991072, 58991571]      * |        68   6.83379
  [51881]     chrY [58992165, 58992664]      * |       108  10.81628
  [51882]     chrY [59004913, 59005412]      * |        64   6.41733
  [51883]     chrY [59213724, 59214223]      * |        95   9.54581
                             name
                      <character>
      [1]     MLC_new_rep2_peak_1
      [2]             SplM_peak_1
      [3]     MLC_new_rep2_peak_2
      [4]             SplM_peak_3
      [5]             SplM_peak_4
      ...                     ...
  [51879] MLC_new_rep2_peak_32686
  [51880] MLC_new_rep2_peak_32687
  [51881] MLC_new_rep2_peak_32688
  [51882] MLC_new_rep2_peak_32689
  [51883] MLC_new_rep2_peak_32690

The strand information looks like this:

counts@rowRanges@strand@values
[1] *
Levels: + - *

Is something wrong with this? I just imported BAM files from a Bowtie alignment followed by samtools sort.


What is the class of the object? It must be RangedSummarizedExperiment or SummarizedExperiment.

This works for me:

> data(example_counts, package = "chromVAR")
> class(example_counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"
> subset_counts <- example_counts[1:10,]
> class(subset_counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"
> example_counts <- addGCBias(subset_counts, genome = BSgenome.Hsapiens.UCSC.hg19)
> class(example_counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"

session info:

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_IN.UTF-8       LC_NUMERIC=C               LC_TIME=en_IN.UTF-8       
 [4] LC_COLLATE=en_IN.UTF-8     LC_MONETARY=en_IN.UTF-8    LC_MESSAGES=en_IN.UTF-8   
 [7] LC_PAPER=en_IN.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_IN.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] chromVAR_1.0.2                    BSgenome.Hsapiens.UCSC.hg19_1.4.0
 [3] BSgenome_1.46.0                   rtracklayer_1.38.3               
 [5] Biostrings_2.46.0                 XVector_0.18.0                   
 [7] GenomicRanges_1.30.3              GenomeInfoDb_1.14.0              
 [9] IRanges_2.12.0                    S4Vectors_0.16.0                 
[11] BiocGenerics_0.24.0              

loaded via a namespace (and not attached):
 [1] bitops_1.0-6                matrixStats_0.53.1          DirichletMultinomial_1.20.0
 [4] TFBSTools_1.16.0            bit64_0.9-7                 httr_1.3.1                 
 [7] rprojroot_1.3-2             tools_3.4.4                 backports_1.1.2            
[10] R6_2.2.2                    DT_0.4                      seqLogo_1.44.0             
[13] DBI_1.0.0                   lazyeval_0.2.1              colorspace_1.3-2           
[16] bit_1.1-12                  compiler_3.4.4              Biobase_2.38.0             
[19] DelayedArray_0.4.1          plotly_4.7.1                caTools_1.17.1             
[22] scales_0.5.0                readr_1.1.1                 stringr_1.3.1              
[25] digest_0.6.15               Rsamtools_1.30.0            rmarkdown_1.9              
[28] R.utils_2.6.0               pkgconfig_2.0.1             htmltools_0.3.6            
[31] htmlwidgets_1.2             rlang_0.2.0                 RSQLite_2.1.1              
[34] VGAM_1.0-5                  shiny_1.0.5                 bindr_0.1.1                
[37] jsonlite_1.5                BiocParallel_1.12.0         gtools_3.5.0               
[40] dplyr_0.7.4                 R.oo_1.22.0                 RCurl_1.95-4.10            
[43] magrittr_1.5                GO.db_3.5.0                 GenomeInfoDbData_1.0.0     
[46] Matrix_1.2-14               Rcpp_0.12.16                munsell_0.4.3              
[49] R.methodsS3_1.7.1           stringi_1.2.2               yaml_2.1.19                
[52] SummarizedExperiment_1.8.1  zlibbioc_1.24.0             plyr_1.8.4                 
[55] grid_3.4.4                  blob_1.1.1                  promises_1.0.1             
[58] miniUI_0.1.1                CNEr_1.14.0                 lattice_0.20-35            
[61] splines_3.4.4               annotate_1.56.2             hms_0.4.2                  
[64] KEGGREST_1.18.1             knitr_1.20                  pillar_1.2.2               
[67] reshape2_1.4.3              TFMPvalue_0.0.8             glue_1.2.0                 
[70] XML_3.98-1.11               evaluate_0.10.3             data.table_1.11.2          
[73] png_0.1-8                   httpuv_1.4.3                tidyr_0.8.0                
[76] purrr_0.2.4                 gtable_0.2.0                poweRlaw_0.70.1            
[79] assertthat_0.2.0            ggplot2_2.2.1               mime_0.5.1                 
[82] xtable_1.8-2                later_0.7.2                 viridisLite_0.3.0          
[85] tibble_1.4.2                GenomicAlignments_1.14.2    AnnotationDbi_1.40.0       
[88] memoise_1.1.0               bindrcpp_0.2.2

I did the same as you. With a subset of my data it worked, but when I used the whole data set again, I got the same error.

> class(counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"
> subset_counts <- counts[1:10,]
> class(subset_counts)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"
> example_counts <- addGCBias(subset_counts, genome = BSgenome.Hsapiens.UCSC.hg19)
> adGC_counts <- addGCBias(counts, genome = BSgenome.Hsapiens.UCSC.hg19)
Error in loadFUN(x, seqname, ranges) : 
  trying to load regions beyond the boundaries of non-circular sequence "chr17"
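
The error message says that at least one range extends past the end of the named chromosome, and the fact that the first ten rows work is consistent with only some peaks being affected. One possible cause (an assumption, not confirmed in this thread) is that the peaks were called against a different assembly, or were resized to a fixed width near a chromosome end. A minimal sketch for locating such ranges follows; `find_out_of_bounds` is a hypothetical helper, not part of chromVAR, and the toy peaks are made up for illustration:

```r
library(GenomicRanges)

# Hypothetical helper (not part of chromVAR): flag ranges that are on an
# unknown chromosome or that fall outside the chromosome's length.
find_out_of_bounds <- function(peaks, chrom_lengths) {
  len <- chrom_lengths[as.character(seqnames(peaks))]
  unname(is.na(len) | start(peaks) < 1L | end(peaks) > len)
}

# Toy data: chr17 is 81,195,210 bp long in hg19, so the second peak overhangs.
toy_lengths <- c(chr17 = 81195210L)
toy_peaks <- GRanges("chr17", IRanges(start = c(1000L, 81195000L), width = 500L))
bad_toy <- find_out_of_bounds(toy_peaks, toy_lengths)
toy_peaks[bad_toy]

# With real data (assumes `counts` is the RangedSummarizedExperiment above):
# library(BSgenome.Hsapiens.UCSC.hg19)
# bad <- find_out_of_bounds(rowRanges(counts),
#                           seqlengths(BSgenome.Hsapiens.UCSC.hg19))
# rowRanges(counts)[bad]   # inspect the offending peaks
# counts_GC <- addGCBias(counts[!bad, ], genome = BSgenome.Hsapiens.UCSC.hg19)
```

If many peaks are flagged, dropping them only hides the problem; it is worth checking first that the reads were aligned to the same assembly as the BSgenome package being used.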

Sometimes it is simply a version problem. Make sure that R and your libraries are up to date, and cross-check against the session info posted above. The example I gave was copied from the manual, except for the number of rows (500 changed to 10).
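
A short sketch for collecting the versions to compare against the session info above. The package list is an assumption about which packages matter here; adjust it as needed:

```r
# Report the R version and the versions of the packages involved.
r_version <- R.version.string
pkgs <- c("chromVAR", "BSgenome", "GenomicRanges", "SummarizedExperiment")

# Only query packages that are actually installed, to avoid errors.
installed <- pkgs[pkgs %in% rownames(installed.packages())]
versions <- sapply(installed, function(p) as.character(packageVersion(p)))

print(r_version)
print(versions)

# If BiocManager is installed, this reports out-of-date or mismatched
# Bioconductor packages (needs network access, so it is left commented out):
# BiocManager::valid()
```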
