Question: Error in parsing BAM file using readAligned
kspata50 wrote:


I was trying to count the number of reads within a specific target region. To do this, I followed the Biostars answer "How To Extract Reads From Bam That Overlap With Specific Regions?"

I installed the necessary packages in an RStudio session and ran the following command:

reads <- readAligned("Sample.sorted.bam", type = "BAM")

I am getting the following error:

Error: UserArgumentMismatch arugment 'type' had value 'BAM' allowable values: 'SolexaExport' 'SolexaAlign' 'SolexaPrealign' 'SolexaRealign' 'SolexaResult' 'MAQMap' 'MAQMapShort' 'MAQMapview' 'Bowtie' 'SOAP'

If I change the command to:

reads <- readAligned("Sample.sorted.bam", type = "Bowtie")

I am getting this error:

Error: Input/Output 'readAligned' failed to parse files dirPath: 'Sample.sorted.bam' pattern: '' type: 'Bowtie' error: incorrect number of fields (1) 749827-1.sorted.bam:1

My R session Info:

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] rtracklayer_1.42.1          ShortRead_1.40.0            GenomicAlignments_1.18.0   
 [4] SummarizedExperiment_1.12.0 DelayedArray_0.8.0          matrixStats_0.54.0         
 [7] Biobase_2.42.0              Rsamtools_1.34.0            GenomicRanges_1.34.0       
[10] GenomeInfoDb_1.18.1         Biostrings_2.50.1           XVector_0.22.0             
[13] IRanges_2.16.0              S4Vectors_0.20.1            BiocParallel_1.16.2        
[16] BiocGenerics_0.28.0        

loaded via a namespace (and not attached):
 [1] zlibbioc_1.28.0        lattice_0.20-35        hwriter_1.3.2          tools_3.5.1           
 [5] grid_3.5.1             latticeExtra_0.6-28    Matrix_1.2-14          GenomeInfoDbData_1.2.0
 [9] RColorBrewer_1.1-2     BiocManager_1.30.4     bitops_1.0-6           RCurl_1.95-4.11       
[13] compiler_3.5.1         XML_3.98-1.16

How can I resolve this error and parse the BAM file?

swbarnes25.8k wrote:

Solexa? SOAP? That's an 8-year-old solution. I recommend finding something a little more up to date, like BEDTools.

Thank you for replying. I tried the following samtools command:

samtools view Sample.bam "ref:6000-11000" > region.bam

My original Sample.bam has 3264461 mapped reads, and region.bam has 3203443 reads.

I think this command also includes reads that only partially overlap the given range. I want the number of reads that lie entirely within the region, for example reads whose alignment starts at or after 6000 and ends at or before 11000.

kspata50

BEDTools can be that specific.

swbarnes25.8k
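
To illustrate the difference raised in this thread: a region query with samtools view returns every read that overlaps the region, whereas the question asks for reads that lie entirely inside it. The following is only a minimal sketch of that containment check, not the method of any particular tool. The function name count_contained and its arguments are hypothetical. It assumes SAM text on standard input, 1-based inclusive coordinates, and that the alignment end is the start position plus the reference-consuming CIGAR operations (M, D, N, = and X) minus one.

```shell
# Hypothetical helper: count SAM records that lie entirely inside a region.
# Usage: count_contained REF START END   (1-based, inclusive; SAM text on stdin)
count_contained() {
  awk -v ref="$1" -v lo="$2" -v hi="$3" '
    BEGIN { FS = "\t"; n = 0 }
    /^@/ { next }                        # skip SAM header lines
    $3 != ref || $6 == "*" { next }      # other reference, or no alignment
    {
      cigar = $6
      span = 0
      # Sum the operations that consume reference bases: M, D, N, = and X.
      while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
        op = substr(cigar, RLENGTH, 1)
        if (op ~ /[MDN=X]/) span += substr(cigar, 1, RLENGTH - 1)
        cigar = substr(cigar, RLENGTH + 1)
      }
      aln_end = $4 + span - 1            # last reference position covered
      if ($4 >= lo && aln_end <= hi) n++
    }
    END { print n }
  '
}
```

Assuming an indexed BAM file, it could be used as: samtools view Sample.sorted.bam "ref:6000-11000" | count_contained ref 6000 11000. With BEDTools, requiring full containment is usually expressed through the -f 1.0 option of bedtools intersect (the minimum overlap as a fraction of each read), with the region supplied in a BED file, which uses 0-based start coordinates.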
