Question: Fastest pairwise alignment for 10,000 sequences
sheinsch (United States) wrote, 3.3 years ago:

I need pairwise alignment scores for 10,000 amino acid sequences ranging from 200 aa to 4,000 aa. I am currently using the EMBOSS wrapper in Python to run the comparisons. However, at the current rate the whole batch would take roughly 2,000 days to complete. That seems far too long, and I suspect there is a better way to do this.

What I have tried already:

I have excluded any pair of sequences whose lengths alone make an identity above 50% impossible.
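
For readers who want to reproduce that length filter, here is a minimal sketch. It assumes identity is counted as matches divided by alignment length, so a pair can never score above min(len) / max(len). The function name `candidate_pairs` and the `lengths` dictionary are hypothetical and not part of EMBOSS.

```python
from typing import Dict, Iterator, Tuple


def candidate_pairs(
    lengths: Dict[str, int], min_identity: float = 0.5
) -> Iterator[Tuple[str, str]]:
    """Yield ID pairs whose length ratio alone does not rule out min_identity.

    Assumption: identity = matches / alignment length, so the best possible
    identity for a pair is shorter_length / longer_length.
    """
    # Sort by length so each inner scan can stop at the first too-long partner.
    ordered = sorted(lengths.items(), key=lambda item: item[1])
    for i in range(len(ordered)):
        short_id, short_len = ordered[i]
        for j in range(i + 1, len(ordered)):
            long_id, long_len = ordered[j]
            if short_len < min_identity * long_len:
                break  # every later sequence is at least as long, so stop here
            yield short_id, long_id


# Toy lengths (made up for illustration only).
example_lengths = {"a": 200, "b": 390, "c": 400, "d": 4000}
example_pairs = list(candidate_pairs(example_lengths))
```

Sorting first means the inner loop exits early instead of testing all pairs, which matters when most of the 10,000 sequences differ a lot in length.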


This is why they invented the BLAST algorithm.

Written 3.3 years ago by Benn
abascalfederico (Spain) wrote, 3.3 years ago:

For local alignments I would use BLAST. It will take hours, not days.
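
As a rough sketch of the post-processing step, assuming the all-vs-all search was run with BLAST's tabular output (`-outfmt 6`, whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), the hits could be collapsed to one score per sequence pair like this. The function name `best_hits` is hypothetical. Note that the reported identity covers only the locally aligned region, not the full sequences.

```python
from typing import Dict, Iterable, Tuple


def best_hits(rows: Iterable[str]) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Keep the highest-bitscore hit per unordered pair of sequence IDs.

    Assumes tab-separated rows in the default `-outfmt 6` column order.
    Returns {(id_a, id_b): (percent_identity, bitscore)} with id_a < id_b.
    """
    best: Dict[Tuple[str, str], Tuple[float, float]] = {}
    for row in rows:
        row = row.strip()
        if not row or row.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = row.split("\t")
        query, subject = fields[0], fields[1]
        if query == subject:
            continue  # ignore self-hits
        pident = float(fields[2])
        bitscore = float(fields[11])
        # A-vs-B and B-vs-A are the same pair, so use a sorted key.
        first, second = sorted((query, subject))
        pair = (first, second)
        if pair not in best or bitscore > best[pair][1]:
            best[pair] = (pident, bitscore)
    return best


# Toy rows (made up for illustration only).
example_rows = [
    "s1\ts1\t100.0\t300\t0\t0\t1\t300\t1\t300\t0.0\t600",
    "s1\ts2\t62.5\t280\t100\t5\t1\t280\t3\t282\t1e-90\t310",
    "s2\ts1\t62.5\t280\t100\t5\t3\t282\t1\t280\t1e-90\t312",
]
example_hits = best_hits(example_rows)
```

In practice the rows would come from the results file, for example by passing an open file handle to `best_hits`.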

If you need global alignments, and all the sequences are homologous and share the same domains, you could build a multiple sequence alignment with MAFFT and calculate the percent identity of each pair from it.
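
A minimal sketch of the identity calculation, assuming the aligned rows have already been read from the alignment file into a dictionary of equal-length strings. The function names are hypothetical, and the choice of denominator (columns where at least one of the two sequences has a residue) is an assumption; other definitions of percent identity exist.

```python
from itertools import combinations
from typing import Dict, Tuple


def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of the same multiple alignment.

    Assumption: columns where both rows have a gap are ignored; every other
    column counts toward the denominator.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == gap and b == gap:
            continue  # gap introduced by other sequences in the alignment
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def pairwise_identities(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Compute percent identity for every pair of rows in the alignment."""
    return {
        (id_a, id_b): percent_identity(row_a, row_b)
        for (id_a, row_a), (id_b, row_b) in combinations(alignment.items(), 2)
    }


# Toy alignment (made up for illustration only).
example_alignment = {"s1": "MKT-AL", "s2": "MRTQA-", "s3": "MKT-AL"}
example_identities = pairwise_identities(example_alignment)
```

For 10,000 sequences this pure-Python loop would still visit about 50 million pairs, so it is meant to show the calculation rather than to be fast.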


Thanks, I will give that a shot.

Written 3.3 years ago by sheinsch