What Aligner To Use For Paired-End 454 Reads?
9.2 years ago
Bioscientist

I read the BWA manual. It seems that for longer 454 reads it recommends BWA-SW (bwa bwasw), which only supports single-end reads.

So what about 454 paired-end reads? Should I use Bowtie? Also, 454 runs contain reads of variable length. How should I deal with the very short reads? (I see some reads are only 10 bp.) Thanks.

@SRR000945.10444  E78I8DJ01CQCVG  length=108
TGACTTTGTAATTTCCATATTTAAATTCCTTCATCTGATTTTCAGCTTCTCAGGGAAACCACCTAATATCCCTACCAGACAGTCATCTTTCATCTACTGAATAATTCC
+
AAABAAA====AABAAABBA@@AAAAAAABBBBBBBBB@@@@BBBBBBBBBBBAAAAAAABABBBBAAAAAAA@??AAAAAA???@=<<<@??????@@@@?==9995
@SRR000945.10446  E78I8DJ01A06WQ  length=10
AACTCCCAGA
+
555=555995
@SRR000945.10447  E78I8DJ01A1GWE  length=48
GCTCAAATAGTTATCTTCCCTGAGATGCCTCCCACTAACTCACAATCG
+
AAAA@@@A???AAAA??===?AAAAAA@@@?????@@@???????===
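
On the very short reads: one option, which nobody in this thread spells out, is simply to drop them before alignment. Below is a minimal Python sketch, assuming unwrapped four-line FASTQ records like the ones above. The function names and the 20 bp cutoff are illustrative assumptions, not something taken from the BWA manual.

```python
import io

def read_fastq(handle):
    """Yield (header, sequence, separator, quality) from a four-line FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a blank line) stops the iteration
            break
        seq = handle.readline().rstrip("\n")
        sep = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, sep, qual

def filter_by_length(records, min_length):
    """Keep only the records whose sequence has at least min_length bases."""
    for record in records:
        if len(record[1]) >= min_length:
            yield record

# Two of the records from the question, used as sample input.
SAMPLE = (
    "@SRR000945.10446  E78I8DJ01A06WQ  length=10\n"
    "AACTCCCAGA\n"
    "+\n"
    "555=555995\n"
    "@SRR000945.10447  E78I8DJ01A1GWE  length=48\n"
    "GCTCAAATAGTTATCTTCCCTGAGATGCCTCCCACTAACTCACAATCG\n"
    "+\n"
    "AAAA@@@A???AAAA??===?AAAAAA@@@?????@@@???????===\n"
)

if __name__ == "__main__":
    # 20 is an arbitrary illustrative cutoff; choose one that suits your data.
    for record in filter_by_length(read_fastq(io.StringIO(SAMPLE)), 20):
        print("\n".join(record))
```

With this sample only the 48 bp read is printed; the 10 bp read is discarded.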

Single-end versus paired-end alignments are generated with the samse and sampe commands, and I don't think it matters whether you align with aln or bwasw.

An obvious choice would be Newbler. If you can't use that, you could try GMAP. I don't have enough experience with the whole range of aligners to make a specific recommendation, though.

I am frankly surprised that 454 would report a 10 bp read.

Please see the edit. Yes, it does have a 10 bp read... or maybe this is not 454?

Hi Istvan, so for 454 paired-end reads, can I simply treat them like Illumina/Solexa reads? I mean: first "bwa aln", then "bwa sampe"?

9.2 years ago
lh3

BWASW-0.6 supports paired-end reads (as you know, BWA does not work with 454). Smalt is a good one. Bowtie2 should also work well, though I have not tried this myself. A caveat with Bowtie2 that I found recently: for a gap in a tandem repeat, where the gap placement is ambiguous, Bowtie2 places the gap at a largely random position in the repeat (see this). This may pose challenges to SNP/indel callers such as samtools, which assume that the mapper tries its best to put the gap at the leftmost position. It is also fair to say that this is a weakness of samtools. I tried GMAP a few years ago. At that time its performance was not as good as that of mappers tuned for read mapping, but this may have changed recently.
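
To make "leftmost position" concrete: inside a tandem repeat, a deletion can be shifted one base to the left whenever the base that enters the gap on the left equals the base that leaves it on the right, and the resulting sequence stays the same. The Python sketch below illustrates that idea only. The function names are hypothetical, and this is not the code that samtools or Bowtie2 actually use.

```python
def apply_deletion(ref, start, length):
    """Return ref with the bases ref[start:start+length] removed (0-based start)."""
    return ref[:start] + ref[start + length:]

def left_align_deletion(ref, start, length):
    """Shift a deletion as far left as possible without changing the result.

    Moving the gap one base left swaps ref[start - 1] into the deleted block
    and ref[start + length - 1] out of it, so the move is neutral only when
    those two bases are identical.
    """
    while start > 0 and ref[start - 1] == ref[start + length - 1]:
        start -= 1
    return start

if __name__ == "__main__":
    ref = "GGACACACACTT"   # "AC" repeated four times, flanked by GG and TT
    start, length = 6, 2   # a mapper reports the 2 bp gap in the middle of the repeat
    leftmost = left_align_deletion(ref, start, length)
    print(leftmost)
    # Both placements describe the same deleted sequence.
    print(apply_deletion(ref, start, length) == apply_deletion(ref, leftmost, length))
```

A caller that assumes leftmost placement would treat the gap at position 6 and the gap at position 2 as different events unless the alignment is normalized first, which is the problem described above.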

Sorry, I'm still confused. Can I treat them like Illumina/Solexa reads? I mean: first "bwa aln", then "bwa sampe"?

No, 454 reads have a high homopolymer indel error rate. aln+sampe would not work well.
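
To illustrate what a homopolymer indel looks like: the typical 454 error is a miscounted run of identical bases, so a read may show AAAA where the reference has AAA. The Python sketch below run-length encodes sequences so that such differences stand out. The function names are illustrative assumptions, not part of any aligner.

```python
from itertools import groupby

def homopolymer_runs(seq):
    """Run-length encode a sequence as a list of (base, run_length) pairs."""
    return [(base, len(list(group))) for base, group in groupby(seq)]

def differs_only_in_run_lengths(a, b):
    """True if a and b differ, but only in how long their homopolymer runs are."""
    bases_a = [base for base, _ in homopolymer_runs(a)]
    bases_b = [base for base, _ in homopolymer_runs(b)]
    return a != b and bases_a == bases_b

if __name__ == "__main__":
    reference = "TTAAAC"
    read = "TTAAAAC"  # one extra A in the homopolymer run
    print(homopolymer_runs(reference))
    print(homopolymer_runs(read))
    print(differs_only_in_run_lengths(reference, read))
```

Aligning such a read requires opening a gap rather than scoring a mismatch, so an aligner needs to tolerate frequent small indels to handle these reads.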