Question: Small Problem: Multiple Alignment To Consensus Sequence
5.0 years ago by
European Union
misconstruction20 wrote:

Hi Biostar,

I'm working with soft-clipped reads in Python, mainly using Biopython. I have gathered some small clusters of interesting sequences, on which I want to do some motif analysis, maybe BLAST them, etc.

My problem is: I have between 3 and 20 sequences in each cluster, and I want to reduce each cluster to a consensus sequence. The sequences are highly similar, but sometimes a few corrupted sequences are in the cluster. That means I cannot simply calculate the consensus, since a single mismatching sequence might introduce gaps or otherwise affect the consensus too much.

Is there a way to automatically (without human intervention) discard any badly matching sequences from a multiple alignment?

My current implementation first does a ClustalW multiple alignment, gets the consensus, then does a pairwise alignment of each sequence to the consensus using EMBOSS needle and discards poorly matching sequences. Then the consensus is rebuilt. This seems rather clumsy, and is terribly slow.
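To make the workflow above concrete, here is a minimal sketch of the same filter-and-rebuild loop done directly on the columns of an existing multiple alignment, so no pairwise re-alignment is needed. It is not my actual code: the function names, the `min_identity` cutoff of 0.8 and the `max_rounds` limit are all hypothetical, and it assumes the aligned sequences are plain, non-empty strings of equal length with `-` for gaps.

```python
from collections import Counter


def column_consensus(aligned):
    """Majority-vote consensus over equal-length aligned sequences."""
    consensus = []
    for column in zip(*aligned):
        # Take the most frequent symbol (gaps included) in this column.
        symbol, _ = Counter(column).most_common(1)[0]
        consensus.append(symbol)
    return "".join(consensus)


def identity(seq, consensus):
    """Fraction of alignment columns where seq agrees with the consensus."""
    matches = sum(a == b for a, b in zip(seq, consensus))
    return matches / len(consensus)


def filter_outliers(aligned, min_identity=0.8, max_rounds=5, gap="-"):
    """Drop poorly matching sequences, rebuilding the consensus each round.

    min_identity and max_rounds are illustrative values, not tuned ones.
    Returns the ungapped consensus and the sequences that were kept.
    """
    kept = list(aligned)
    for _ in range(max_rounds):
        consensus = column_consensus(kept)
        passing = [s for s in kept if identity(s, consensus) >= min_identity]
        # Stop when nothing was dropped, or when too few sequences would remain.
        if len(passing) == len(kept) or len(passing) < 2:
            break
        kept = passing
    return column_consensus(kept).replace(gap, ""), kept


cluster = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGAAC", "TTTTTTTTTT"]
final_consensus, kept_sequences = filter_outliers(cluster)
```

Here the last sequence agrees with the consensus in only 2 of 10 columns and is dropped in the first round. Note that a corrupted read can still distort the alignment itself, which this column-wise check does not undo.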

Any advice is greatly appreciated!

5.0 years ago by
k.nirmalraman910 wrote:

How about this online tool where you can define the threshold?
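Independently of any particular tool, the idea of a consensus threshold can be sketched in a few lines of Python. This is only an illustration and not how the tool above works internally: the function name, the default `threshold` of 0.7 and the `N` placeholder are assumptions, and the input is assumed to be an already aligned, non-empty set of equal-length strings.

```python
from collections import Counter


def threshold_consensus(aligned, threshold=0.7, ambiguous="N"):
    """Call a column only if its top symbol reaches the given frequency."""
    consensus = []
    for column in zip(*aligned):
        symbol, count = Counter(column).most_common(1)[0]
        # Columns without a clear majority are masked rather than guessed.
        if count / len(column) >= threshold:
            consensus.append(symbol)
        else:
            consensus.append(ambiguous)
    return "".join(consensus)


aligned = ["ACGT", "ACGT", "ACGA", "TCGA"]
relaxed = threshold_consensus(aligned, threshold=0.7)
strict = threshold_consensus(aligned, threshold=0.8)
```

With a higher threshold, a single odd sequence cannot decide a column on its own; the position is masked instead.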
