Hi,
I have many whole-exome BAM files (aligned to a reference). As an R admirer, I used RStudio to do an initial analysis on one of them (~2.7 GB) using Rsamtools.
1) Check whether the reads are single-end or paired-end:
> testPairedEndBam("1.bam")
[1] TRUE
> quickBamFlagSummary("1.bam") # Got detailed information
2) Create a reference to the BAM file:
> bam <- BamFile("1.bam", asMates = TRUE)
> bam
class: BamFile
path: 1.bam
index: 1.bam.bai
isOpen: FALSE
yieldSize: NA
obeyQname: FALSE
asMates: TRUE
qnamePrefixEnd: NA
qnameSuffixStart: NA
3) Some high level information
> seqinfo(bam)
Seqinfo object with 1133 sequences from an unspecified genome: # and other information
4) Read all the reads in the file using scanBam()
details <- scanBam(bam)
But this step just keeps running and never finishes, so I am stuck here. Any thoughts, please? How much RAM is required to process a ~3 GB BAM file in RStudio? I have a 64-bit Windows 8.1 computer with 16 GB RAM.
Thanks!
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] zlibbioc_1.16.0 IRanges_2.4.8 XVector_0.10.0 futile.logger_1.4.3 parallel_3.3.2
[6] tools_3.3.2 GenomicRanges_1.22.4 lambda.r_1.1.9 futile.options_1.0.0 Biostrings_2.38.4
[11] S4Vectors_0.8.11 BiocGenerics_0.16.1 BiocParallel_1.4.3 Rsamtools_1.26.1 GenomeInfoDb_1.6.3
[16] stats4_3.3.2 bitops_1.0-6
Check in R how much memory you can use:
But if I wanted to check for a specific file, then?
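To get a rough idea for one specific file before loading it, one option is to count the records first rather than read them. This is a minimal sketch, assuming Rsamtools is installed. It uses the small example BAM shipped with the package so that it runs as-is. The "two bytes per base" figure is only a crude lower bound for the sequence and quality strings, not a measured memory footprint, and the variable names are just for illustration:

```r
library(Rsamtools)

# Example BAM shipped with Rsamtools; replace with "1.bam" for the real file.
bam_path <- system.file("extdata", "ex1.bam", package = "Rsamtools")

# Size of the compressed file on disk, in bytes.
disk_bytes <- file.size(bam_path)

# countBam() tallies records and nucleotides without keeping the reads in memory.
counts <- countBam(bam_path)

# Assumption: once decompressed, seq and qual each need about one byte per base.
# All other fields (qname, cigar, tags, ...) come on top of this.
approx_seq_qual_bytes <- 2 * counts$nucleotides

cat("On disk:", disk_bytes, "bytes\n")
cat("Records:", counts$records, "\n")
cat("seq + qual alone, uncompressed: about", approx_seq_qual_bytes, "bytes\n")
```

If the estimate is already a sizeable fraction of the available RAM, reading every field with a single scanBam() call is unlikely to work.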
I think memory is not the problem then. Do you get an error? How long do you let it run?
Maybe traceback() can give you some clues?
It ran for 31 minutes and then showed the following error:
Since you are using Windows, maybe this answer on Stack Overflow can help:
http://stackoverflow.com/questions/5171593/r-memory-management-cannot-allocate-vector-of-size-n-mb/24754706#24754706
Thanks, but I did not find it useful.
Having the same issue in 2020. Rsamtools's scanBam() overflows the RAM and the whole PC freezes. I'm on Ubuntu 20.04.
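Something that may help in this situation is to avoid pulling every field of every read into memory at once. The sketch below is one possible approach, not necessarily the only or best one: it sets yieldSize on the BamFile so each scanBam() call returns a limited number of records, and uses ScanBamParam(what = ...) to fetch only a few fields. It runs on the example BAM shipped with Rsamtools. The chunk size of 1000 and the chosen fields are assumptions for illustration and would need adjusting for a real exome file:

```r
library(Rsamtools)

# Example BAM shipped with Rsamtools; replace with "1.bam" for the real file.
bam_path <- system.file("extdata", "ex1.bam", package = "Rsamtools")

# Only fetch the fields needed; leaving out seq and qual saves a lot of memory.
param <- ScanBamParam(what = c("rname", "pos", "qwidth", "mapq"))

# yieldSize caps how many records each scanBam() call returns.
# 1000 suits the tiny example; a much larger value may suit an exome BAM.
bf <- BamFile(bam_path, yieldSize = 1000)
open(bf)

total_records <- 0
repeat {
  chunk <- scanBam(bf, param = param)[[1]]
  n <- length(chunk$pos)
  if (n == 0) break   # no records left in the file
  total_records <- total_records + n
  # Summarise `chunk` here and keep only the summary, not the chunk itself.
}
close(bf)

total_records
```

Because the file is opened once and read in pieces, only one chunk is held in memory at a time, so peak memory depends on yieldSize rather than on the size of the whole BAM.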