Question: Multiple sequence alignment (MSA) editing
3.1 years ago by
TEman10 wrote:

I want to remove all rare insertions (those that occur in fewer than 5% of the sequences) from a multiple sequence alignment file (Clustal .aln) with 699 sequences.

That is, I have an MSA with many columns in which only one or two sequences carry an insertion while all the other sequences have a gap ("-"). There are far too many of these columns to edit manually.

Any suggestions on how to do this?

alignment clustal R • 1.0k views

Do you specifically want to do this in R?

If you use Biopython, you can create an ungapped consensus sequence with a threshold for inclusion of a particular residue in a column.

written 3.1 years ago by Joe18k
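
A consensus sequence collapses the whole alignment into one sequence, so if the goal is to keep all 699 sequences and only drop the sparse columns, a column filter may be closer to what is asked. Below is a minimal pure-Python sketch of that idea, not a tested pipeline. The function name `drop_rare_insertion_columns` and its parameters are hypothetical, and it assumes the alignment has already been loaded as a list of equal-length strings in which "-" (or ".") marks a gap.

```python
def drop_rare_insertion_columns(seqs, min_fraction=0.05, gap_chars="-."):
    """Return the aligned sequences with rare-insertion columns removed.

    A column is dropped when the fraction of sequences that carry a residue
    (any character not in gap_chars) is below min_fraction.
    Hypothetical helper for illustration; not part of any library.
    """
    if not seqs:
        return []

    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    n = len(seqs)
    keep = []
    for col in range(length):
        # Count sequences that have a residue (not a gap) in this column.
        residues = sum(1 for s in seqs if s[col] not in gap_chars)
        if residues / n >= min_fraction:
            keep.append(col)

    # Rebuild every sequence from the retained columns only.
    return ["".join(s[c] for c in keep) for s in seqs]


if __name__ == "__main__":
    # Toy alignment: 39 sequences with a gap, 1 with an insertion (2.5%).
    toy = ["AC-GT"] * 39 + ["ACTGT"]
    print(drop_rare_insertion_columns(toy)[-1])
```

With 699 sequences and the default threshold, a column is removed when 34 or fewer sequences have a residue in it. To apply this to a Clustal file, the strings could be obtained with Biopython, for example via `AlignIO.read(path, "clustal")` and `str(record.seq)` for each record; writing the filtered result back out is left open here.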