Pairwise alignment of two long chromosomes (~100Mbp) with lastz
3 days ago

I'm trying to align corresponding chromosomes (~100 Mbp long) of two dog breeds, German Shepherd and Labrador.

This is for a nonscientific purpose, so alignment quality is not a priority and fast execution is preferred.

I used the program lastz with the following script:

lastz german_shepherd.fasta labrador.fasta \
  --notransition --step=20 --nogapped \
  --format=maf --ambiguous=iupac \
  > shepard_labrador.maf

This is analogous to the first example in the tutorial (https://lastz.github.io/lastz/), except that I use it for two closely related sequences. The script takes a very long time to run, and the output file had grown to ~80 GB by the time I had to terminate the process. The sequences themselves are only ~100 MB in size.

Upon examining the output, I saw that the script created many overlapping aligned segments. For example, the overlapping segments [0:146] and [0:231] appear in two different alignments.
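To make the overlap concrete, here is a minimal Python sketch of how the overlapping blocks in the MAF output could be counted. It assumes the standard MAF layout (an `a` line opens each block, followed by `s <name> <start> <size> <strand> <srcSize> <text>` lines), that the first `s` line of each block is the first sequence given to lastz, and that this sequence is on the `+` strand. The function names `target_intervals` and `count_overlapping` are hypothetical helpers for illustration, not part of lastz.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on the first sequence


def target_intervals(maf_lines: Iterable[str]) -> List[Interval]:
    """Collect the interval covered on the first sequence of every MAF block."""
    intervals: List[Interval] = []
    expect_target = False
    for line in maf_lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "a":
            # An "a" line opens a new alignment block.
            expect_target = True
        elif fields[0] == "s" and expect_target:
            # s <name> <start> <size> <strand> <srcSize> <text>
            start, size = int(fields[2]), int(fields[3])
            intervals.append((start, start + size))
            # Ignore the remaining "s" lines of this block.
            expect_target = False
    return intervals


def count_overlapping(intervals: List[Interval]) -> int:
    """Count intervals that overlap some interval sorted before them."""
    overlapping = 0
    furthest_end = -1  # furthest end coordinate seen so far
    for start, end in sorted(intervals):
        if start < furthest_end:
            overlapping += 1
        furthest_end = max(furthest_end, end)
    return overlapping
```

Passing an open file handle of the `.maf` output to `target_intervals` and the result to `count_overlapping` gives a rough count of how many blocks overlap an earlier one. The list of intervals is held in memory, so for an output of tens of GB it is safer to run this on a truncated sample of the file.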

Does anyone know how I can force the identified segments not to overlap? It works fine with the chicken and human chromosomes from the tutorial example, but with two closely related sequences I get all these overlaps, which take forever to process and make the output ridiculously large and ambiguous.

Thanks for your time.
