Whole exomes: Rsamtools, RAM issue
7.2 years ago
bioinfo8 ▴ 230

Hi,

I have many whole-exome BAM files (aligned to a reference). As an R admirer, I used RStudio to do an initial analysis on one of them (~2.7 GB) with Rsamtools.

1) Check whether the file is single- or paired-end:

> testPairedEndBam("1.bam")   
[1] TRUE 
> quickBamFlagSummary("1.bam")           # Got detailed information

2) Open the BAM file

> bam <- BamFile("1.bam", asMates = TRUE)
> bam
    class: BamFile 
    path: 1.bam
    index: 1.bam.bai
    isOpen: FALSE 
    yieldSize: NA 
    obeyQname: FALSE 
    asMates: TRUE                                       
    qnamePrefixEnd: NA 
    qnameSuffixStart: NA
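
(Side note: yieldSize: NA above means that a later scanBam() call will try to return every record in the file in one go. BamFile() also takes a yieldSize argument that caps how many records each call returns; a minimal sketch, with a hypothetical variable name and an arbitrary chunk size:)

    # hypothetical: cap each scanBam() call at 500,000 records (size is arbitrary)
    bam_chunked <- BamFile("1.bam", asMates = TRUE, yieldSize = 500000)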

3) Some high-level information

> seqinfo(bam)
Seqinfo object with 1133 sequences from an unspecified genome:   # and other per-sequence information follows
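
(If only particular regions are of interest, scanBam() need not touch the rest of the file: the which argument of ScanBamParam() restricts it to the given ranges via the index. A minimal sketch; "chr1" and the coordinates are made-up examples, so take the real names from seqinfo(bam).)

    library(GenomicRanges)
    # only alignments overlapping chr1:1-2,000,000 ("chr1" is an assumed name)
    param <- ScanBamParam(which = GRanges("chr1", IRanges(1, 2000000)))
    some_reads <- scanBam(bam, param = param)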

4) Read everything in the file with scanBam()

details <- scanBam(bam)

But at this step it just keeps running and I am stuck. Any thoughts, please? How much RAM is needed to process a ~3 GB BAM file in RStudio? I am on a 64-bit Windows 8.1 machine with 16 GB of RAM.

Thanks!

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] zlibbioc_1.16.0 IRanges_2.4.8 XVector_0.10.0 futile.logger_1.4.3 parallel_3.3.2
[6] tools_3.3.2 GenomicRanges_1.22.4 lambda.r_1.1.9 futile.options_1.0.0 Biostrings_2.38.4
[11] S4Vectors_0.8.11 BiocGenerics_0.16.1 BiocParallel_1.4.3 Rsamtools_1.26.1 GenomeInfoDb_1.6.3
[16] stats4_3.3.2 bitops_1.0-6

R Rsamtools whole exome RAM BAM • 2.3k views
---

Check in R how much memory you can use:

memory.limit()
---
> memory.limit()
[1] 16365

But what if I want to check this for a specific file?
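
(For reference: a cheap way to size up one particular file without loading it is countBam(), which walks the file and returns record and nucleotide counts rather than the alignments themselves. A minimal sketch:)

    > countBam("1.bam")   # the 'records' and 'nucleotides' columns give the scale of the file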

---

I think memory is not the problem then. Do you get an error? How long did you let it run?

Maybe traceback can give you any clues?

traceback()
---

It was running for 31 minutes and then showed the following error:

> details <- scanBam(bam) 
Error in value[[3L]](cond) : cannot allocate vector of size 1024.0 Mb
  file: 1.bam
  index: 1.bam.bai

> traceback()
8: stop(conditionMessage(err), "\n  file: ", path(file), "\n  index: ", 
       index(file))
7: value[[3L]](cond)
6: tryCatchOne(expr, names, parentenv, handlers[[1L]])
5: tryCatchList(expr, classes, parentenv, handlers)
4: tryCatch({
       .Call(func, .extptr(file), space, flag, simpleCigar, tagFilter, 
           mapqFilter, ...)
   }, error = function(err) {
       stop(conditionMessage(err), "\n  file: ", path(file), "\n  index: ", 
           index(file))
   })
3: .io_bam(.scan_bamfile, file, reverseComplement, yieldSize(file), 
       tmpl, obeyQname(file), asMates(file), qnamePrefix, qnameSuffix, 
       param = param)
2: scanBam(bam)
1: scanBam(bam)
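
(The traceback shows a plain allocation failure: scanBam() is materialising the entire file as R lists in a single call. A common workaround, sketched here with an arbitrary chunk size rather than as a tested fix for this exact file, is to reopen the BamFile with a yieldSize and process the reads chunk by chunk, keeping only what you need from each chunk:)

    bf <- BamFile("1.bam", asMates = TRUE, yieldSize = 200000)  # 200k records per call (arbitrary)
    open(bf)
    repeat {
        chunk <- scanBam(bf)[[1]]          # one list element when no 'which' is given
        if (length(chunk$qname) == 0L)     # a zero-length chunk signals end of file
            break
        # summarise / filter the chunk here instead of keeping it all
    }
    close(bf)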
---
[comment content not captured in this snapshot]
---

Thanks, but I did not find it useful.

---

Having the same issue in 2020: Rsamtools' scanBam overflows the RAM and the whole PC freezes. I'm on Ubuntu 20.04.
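
(If the whole file really has to be traversed, another lever besides yieldSize is the what argument of ScanBamParam(): by default scanBam() returns every field of every record (scanBamWhat()), so requesting only the fields you actually need shrinks the in-memory result considerably. A sketch, with the field choice as an example only:)

    param <- ScanBamParam(what = c("rname", "pos", "cigar"))   # only these three fields
    details <- scanBam("1.bam", param = param)                 # still the whole file, but far smaller objects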
