Question: Semi-global (end-gap-free) alignment of long reads?
rrwick (Australia) wrote, 2.4 years ago:

I would like to align PacBio reads to contigs with an end-gap-free, semi-global approach. The exact kind of alignment I'd like is described here as an 'overlap alignment'. The alignment must keep going until it reaches the end of either the read or the contig.

For example, these alignments are all okay:

  AAAAA        AAAAAAAAAAA
  |||||          |||||||
BBBBBBBBB        BBBBBBB

    AAAAAAAA     AAAAAAAA
    |||||           |||||
BBBBBBBBB           BBBBBBBBB

But this is not okay because the alignment terminates before the sequences do:

  AAAAAAA
  |||||
BBBBBBBBBBB

I was hoping for an efficient tool (I need to do a lot of these) that handles error-prone long reads well. BLASR and BWA-MEM do local alignment and therefore won't work for me. GraphMap claims to do semi-global alignment and is the best I've found so far, but it too often gives alignments that terminate before reaching the end of either sequence. Are there other appropriate tools I haven't found?
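For readers unfamiliar with the term, the overlap alignment described above can be sketched as a small dynamic program. This is only an illustration under stated assumptions: the function name and the scoring parameters (`match`, `mismatch`, `gap`) are made up for the example, gaps are scored linearly rather than with an affine model, and the O(n*m) running time makes it unsuitable for real PacBio reads. It shows the two defining properties: leading overhangs cost nothing, and the alignment must run to the end of at least one sequence.

```python
def overlap_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score of the best overlap (end-gap-free) alignment of a and b.

    Illustrative sketch only: linear gap penalty, exact DP, O(len(a) * len(b)) time.
    """
    n, m = len(a), len(b)
    # Row 0 is all zeros: the alignment may start anywhere in b for free.
    prev = [0] * (m + 1)
    best = prev[m]  # cell (0, m): the empty alignment
    for i in range(1, n + 1):
        # Column 0 is zero: the alignment may start anywhere in a for free.
        curr = [0] * (m + 1)
        for j in range(1, m + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # No zero floor here (unlike local alignment): once started,
            # the alignment has to keep going.
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        best = max(best, curr[m])  # last column: alignment reaches the end of b
        prev = curr
    # Last row: alignment reaches the end of a.
    return max(best, max(prev))


if __name__ == "__main__":
    print(overlap_alignment_score("ACGT", "TTACGTTT"))    # read contained in contig
    print(overlap_alignment_score("GGGACGT", "ACGTCCC"))  # suffix-prefix overlap
```

The only differences from a global (Needleman-Wunsch) alignment are the zero-initialised first row and column and the fact that the result is taken from the last row or last column rather than from the bottom-right cell.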

pacbio alignment • 996 views
dariober (Glasgow, UK) wrote, 2.4 years ago:

bowtie2 has an --end-to-end mode, but I don't know how it copes with long, error-prone reads. vmatch also has a -complete option (-complete: specify that query sequences must match completely) and is very flexible; it requires a license, which is free for academic use. Finally, exonerate is also very flexible.

I guess the choice also depends on the size of the genome and the number of reads you have to align.


I wasn't familiar with exonerate, but it looks good! It has an alignment model (affine:overlap) which is exactly the alignment type that I need. I'll check it out to see how well it performs with long reads and long reference sequences.

(reply by rrwick, 2.4 years ago)

Unfortunately, I see that exonerate cannot use its heuristics to speed up the alignment process when in affine:overlap mode. It therefore computes an optimal alignment exhaustively, which is far too slow: it took about 2 minutes to align a single read to a single contig.

(reply by rrwick, 2.4 years ago)
rrwick (Australia) wrote, 2.4 years ago:

Since I asked this question, the GraphMap developers have created a new branch which helps make more of their alignments semi-global in the way I require. My current approach is therefore to run GraphMap and then use my own script to filter out alignments that are only local.
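The filtering script itself is not shown in this thread. As a rough idea of what such a check could look like, here is a minimal sketch; the function name, its parameters, and the `tolerance` option are hypothetical, and it assumes the alignment coordinates have already been parsed from the aligner's output as 0-based, half-open positions on the aligned strand.

```python
def is_overlap_alignment(read_start, read_end, read_len,
                         ref_start, ref_end, ref_len, tolerance=0):
    """Return True if the alignment runs to an end of the read or contig on both sides.

    Hypothetical helper: coordinates are assumed 0-based, half-open,
    and given on the aligned strand. `tolerance` is the number of
    unaligned bases allowed at an end before the alignment counts as local.
    """
    # Left side: the alignment must begin at the start of the read or the contig.
    starts_at_an_end = read_start <= tolerance or ref_start <= tolerance
    # Right side: the alignment must reach the end of the read or the contig.
    reaches_an_end = (read_len - read_end <= tolerance
                      or ref_len - ref_end <= tolerance)
    return starts_at_an_end and reaches_an_end


if __name__ == "__main__":
    # Read fully contained in the contig: keep.
    print(is_overlap_alignment(0, 5, 5, 2, 7, 9))
    # Alignment stops before either sequence ends: discard as local.
    print(is_overlap_alignment(0, 5, 7, 2, 7, 11))
```

A small non-zero `tolerance` may be a reasonable choice for error-prone reads, where an aligner can clip a few bases at an end even when the overlap is genuine.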
