Fastest pairwise alignment for 10,000 sequences
8.2 years ago
sheinsch ▴ 10

I need pairwise alignment scores for 10,000 amino acid sequences ranging from 200 aa to 4,000 aa. I am currently running the comparisons through the EMBOSS wrapper in Python. However, judging by the current alignment rate, the whole batch would take roughly 2,000 days to complete. That seems far too long, and I suspect there is a better way to do this.

What I have tried already:

I have excluded any pair that cannot reach an identity above 50% given the lengths of the two sequences.
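
For reference, a minimal sketch of such a length filter. It assumes identity is counted as matches divided by alignment length, so the best a global alignment can reach is shorter length / longer length. The function name `candidate_pairs` and the example IDs are hypothetical.

```python
import math

def candidate_pairs(lengths, min_identity=0.5):
    """Yield (id_a, id_b) pairs whose lengths still allow identity > min_identity.

    Assumption: identity = matches / alignment length, so the upper bound
    for a global alignment is shorter_length / longer_length.
    """
    ordered = sorted(lengths.items(), key=lambda item: item[1])
    for i, (id_a, len_a) in enumerate(ordered):
        for id_b, len_b in ordered[i + 1:]:
            if len_a / len_b <= min_identity:
                break  # every later sequence is at least as long, so stop early
            yield id_a, id_b

# Made-up lengths, for illustration only.
lengths = {"seqA": 200, "seqB": 390, "seqC": 400, "seqD": 4000}
pairs = list(candidate_pairs(lengths))

# Without any filter, 10,000 sequences give this many unordered pairs.
total_pairs = math.comb(10_000, 2)
```

Sorting by length lets the inner loop stop at the first sequence that is too long, so pairs that fail the filter are never enumerated.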


This is why they invented the BLAST algorithm.

8.2 years ago
abascalfederico ★ 1.2k

For local alignments I would use BLAST. It will take hours, not days.
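
A rough sketch of an all-vs-all protein search with the BLAST+ command-line tools, driven from Python. The file names, the thread count, and the helper names `blast_commands` and `best_hits` are assumptions for illustration; the example rows are made up. Note that `pident` in BLAST output covers only the locally aligned region, not the full sequence length.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def blast_commands(fasta, db_name, out_tsv, threads=4):
    """Build the argument lists for makeblastdb and an all-vs-all blastp run."""
    makedb = ["makeblastdb", "-in", fasta, "-dbtype", "prot", "-out", db_name]
    search = ["blastp", "-query", fasta, "-db", db_name, "-outfmt", "6",
              "-num_threads", str(threads), "-out", out_tsv]
    return makedb, search

def best_hits(handle):
    """Keep the highest-bitscore hit per unordered pair, skipping self hits."""
    best = {}
    for row in csv.DictReader(handle, fieldnames=BLAST_COLUMNS, delimiter="\t"):
        if row["qseqid"] == row["sseqid"]:
            continue
        pair = tuple(sorted((row["qseqid"], row["sseqid"])))
        score = float(row["bitscore"])
        if pair not in best or score > best[pair][1]:
            best[pair] = (float(row["pident"]), score)
    return best

# To actually run the search (requires BLAST+ on the PATH):
#   import subprocess
#   for cmd in blast_commands("seqs.fasta", "seqs_db", "hits.tsv"):
#       subprocess.run(cmd, check=True)

# Made-up tabular rows, only to show the parsing step.
example = io.StringIO(
    "seqA\tseqA\t100.000\t200\t0\t0\t1\t200\t1\t200\t1e-150\t410\n"
    "seqA\tseqB\t62.500\t192\t70\t2\t5\t196\t8\t197\t3e-80\t240\n"
    "seqB\tseqA\t62.500\t192\t70\t2\t8\t197\t5\t196\t3e-80\t238\n"
)
hits = best_hits(example)
```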

If you need global alignments, and all the sequences are homologous and share the same domains, you could build a multiple sequence alignment with MAFFT and calculate percent identity for each pair from it.
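
A minimal sketch of the second step, reading pairwise percent identity off an existing alignment. It assumes the alignment was written as aligned FASTA (for example with `mafft --auto input.fasta > aligned.fasta`) and that identity is counted over columns where at least one of the two sequences has a residue; other denominators are equally common, so adjust as needed. The helper names and the toy alignment are hypothetical.

```python
import io
from itertools import combinations

def read_aligned_fasta(handle):
    """Parse aligned FASTA into {name: aligned_sequence}."""
    records, name = {}, None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}

def percent_identity(row_a, row_b, gap="-"):
    """Percent identical columns, ignoring columns where both rows are gaps."""
    matches = compared = 0
    for a, b in zip(row_a, row_b):
        if a == gap and b == gap:
            continue  # column belongs to other sequences in the alignment
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Toy alignment, made up for illustration.
aligned = io.StringIO(">s1\nMKT-AYIA\n>s2\nMKTQAYLA\n>s3\nM-T-AYIA\n")
rows = read_aligned_fasta(aligned)
identities = {
    (a, b): percent_identity(rows[a], rows[b])
    for a, b in combinations(sorted(rows), 2)
}
```

Once the alignment exists, each pair costs only one pass over the alignment columns, which is far cheaper than running a separate dynamic-programming alignment per pair.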


Thanks, I will give that a shot.
