Has Anyone Successfully Used The Bioconductor Shortread Package To Read In Pacbio Read Alignments, In Sam/Bam Format?
1
2
Entering edit mode
12.0 years ago
Joe Fass ▴ 180

I've aligned PacBio subreads (fasta) to a reference using Heng Li's BWA (the long read aligner - BWA-SW, using the command 'bwa bwasw'), and I'm trying to read these in using

> r = readAligned("test.bam", type="BAM")

Now, I know there's a potential problem here, in that BWA-SW will produce multiple lines with the same read id when it finds disconnected alignment blocks within reads (which should be the case for my data - transcript reads mapped to the genome). However, the resulting readAligned object:

> r
class: AlignedRead
length: 59664 reads; width: 126..12721 cycles
chromosome: NA NA ... NA NA
position: NA NA ... NA NA
strand: NA NA ... NA NA
alignQuality: NumericQuality
alignData varLabels: flag

has 59,664 reads, which is more than the number of unique read ids in the file, but fewer than the number of lines in the file. Furthermore, a test file containing 1 single alignment read in as so:

> r
class: AlignedRead
length: 0 reads; width:  cycles
chromosome:
position:
strand:
alignQuality: NumericQuality
alignData varLabels: flag

Can anybody suggest where I might have gone wrong?

session info:

> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GenomicFeatures_1.6.9 AnnotationDbi_1.16.19 Biobase_2.14.0       
 [4] ShortRead_1.12.4      latticeExtra_0.6-19   RColorBrewer_1.0-5   
 [7] Rsamtools_1.6.3       lattice_0.20-6        Biostrings_2.22.0    
[10] GenomicRanges_1.6.7   IRanges_1.12.6       

loaded via a namespace (and not attached):
 [1] BSgenome_1.22.0    DBI_0.2-5          RCurl_1.91-1       RSQLite_0.11.1    
 [5] XML_3.9-4          biomaRt_2.10.0     bitops_1.0-4.1     grid_2.14.1       
 [9] hwriter_1.3        rtracklayer_1.14.4 tools_2.14.1       zlibbioc_1.0.1    
r bioconductor short bam • 4.3k views
ADD COMMENT
3
Entering edit mode
12.0 years ago
Joe Fass ▴ 180

Ahem. From the help for readAligned():

[?]

So, no PacBio BAM files in ShortRead. Leaving this up as a testament to not thoroughly reading the documentation.

ADD COMMENT
0
Entering edit mode

Rsamtools gives more flexible access to the BAM files, so you'll be able to read the data in, but probably need to do your own further manipulation.

ADD REPLY

Login before adding your answer.

Traffic: 2009 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6