I've aligned PacBio subreads (fasta) to a reference using Heng Li's BWA (the long read aligner - BWA-SW, using the command 'bwa bwasw'), and I'm trying to read these in using
> r = readAligned("test.bam", type="BAM")
Now, I know there's a potential problem here, in that BWA-SW will produce multiple lines with the same read id when it finds disconnected alignment blocks within reads (which should be the case for my data - transcript reads mapped to the genome). However, the resulting readAligned object:
> r class: AlignedRead length: 59664 reads; width: 126..12721 cycles chromosome: NA NA ... NA NA position: NA NA ... NA NA strand: NA NA ... NA NA alignQuality: NumericQuality alignData varLabels: flag
has 59,664 reads, which is more than the number of unique read ids in the file, but fewer than the number of lines in the file. Furthermore, a test file containing 1 single alignment read in as so:
> r class: AlignedRead length: 0 reads; width: cycles chromosome: position: strand: alignQuality: NumericQuality alignData varLabels: flag
Can anybody suggest where I might have gone wrong?
> sessionInfo() R version 2.14.1 (2011-12-22) Platform: x86_64-redhat-linux-gnu (64-bit) locale:  C attached base packages:  stats graphics grDevices utils datasets methods base other attached packages:  GenomicFeatures_1.6.9 AnnotationDbi_1.16.19 Biobase_2.14.0  ShortRead_1.12.4 latticeExtra_0.6-19 RColorBrewer_1.0-5  Rsamtools_1.6.3 lattice_0.20-6 Biostrings_2.22.0  GenomicRanges_1.6.7 IRanges_1.12.6 loaded via a namespace (and not attached):  BSgenome_1.22.0 DBI_0.2-5 RCurl_1.91-1 RSQLite_0.11.1  XML_3.9-4 biomaRt_2.10.0 bitops_1.0-4.1 grid_2.14.1  hwriter_1.3 rtracklayer_1.14.4 tools_2.14.1 zlibbioc_1.0.1