Small Problem: Multiple Alignment To Consensus Sequence
8.0 years ago
MT ▴ 30

Hi Biostar,

I'm working with soft-clipped reads in Python, mainly using Biopython. I have gathered some small clusters of interesting sequences that I want to run some motif analysis on, maybe BLAST them, etc.

My problem is: I have between 3 and 20 sequences in each cluster, and I want to reduce each cluster to a consensus sequence. The sequences are highly similar, but sometimes a cluster contains a few corrupted sequences. That means I cannot simply calculate the consensus, since a single mismatching sequence might introduce gaps or otherwise affect the consensus too much.

Is there a way to automatically (without human intervention) discard any badly matching sequences from a multiple alignment?

My current implementation first runs a ClustalW multiple alignment, gets the consensus, then does a pairwise alignment of each sequence to the consensus using EMBOSS needle and discards poorly matching sequences. Then the consensus is rebuilt. This seems rather clumsy and is terribly slow.
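
For illustration, the filter-and-rebuild loop described above might be sketched as follows. This is only a minimal sketch under some assumptions, not my actual code: it assumes the rows are already aligned (equal-length strings with "-" for gaps, e.g. taken from the ClustalW output), it scores each row against the consensus column by column instead of re-aligning with needle, and the function names and the 0.8 identity threshold are made up for the example.

```python
from collections import Counter


def column_consensus(rows):
    """Majority-vote consensus over equal-length aligned rows (gaps count as a symbol)."""
    if not rows:
        raise ValueError("need at least one aligned row")
    if len({len(r) for r in rows}) != 1:
        raise ValueError("aligned rows must all have the same length")
    # For each alignment column, take the most common symbol.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def identity(row, consensus):
    """Fraction of alignment columns where the row agrees with the consensus."""
    if not consensus:
        return 0.0
    return sum(a == b for a, b in zip(row, consensus)) / len(consensus)


def filter_and_rebuild(rows, min_identity=0.8, max_rounds=5):
    """Drop rows that match the consensus poorly, then rebuild the consensus.

    min_identity and max_rounds are hypothetical defaults, not tuned values.
    Returns (gapped consensus, list of rows that were kept).
    """
    kept = list(rows)
    for _ in range(max_rounds):
        consensus = column_consensus(kept)
        survivors = [r for r in kept if identity(r, consensus) >= min_identity]
        # Stop when nothing was discarded, or when the filter would discard everything.
        if len(survivors) == len(kept) or not survivors:
            break
        kept = survivors
    return column_consensus(kept), kept


if __name__ == "__main__":
    aligned = ["ACGTACGT", "ACGTACGT", "ACGAACGT", "TTTTTTTT"]
    consensus, kept = filter_and_rebuild(aligned)
    print(consensus, len(kept))
```

Because the rows are scored directly in alignment coordinates, no second alignment step is needed. The trade-off is that a badly corrupted sequence can still distort the initial multiple alignment itself, so re-aligning the surviving sequences before the final consensus may still be worthwhile.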

Any advice is greatly appreciated!

python biopython multiple-alignment consensus • 2.4k views
8.0 years ago
k.nirmalraman ★ 1.1k

How about this online tool where you can define the threshold?

