Question: Multiple sequence alignment (MSA) editing
24 months ago, TEman (10) wrote:

I want to remove all rare insertions (those occurring in fewer than 5% of the sequences) from a multiple sequence alignment file (Clustal .aln) containing 699 sequences.

That is, my MSA has many columns in which only one or two sequences carry an insertion while all the other sequences have a gap ("-"). There are far too many of these columns to remove manually.

Any suggestions on how to do this?

alignment clustal R • 618 views

Do you specifically want to do this in R?

If you use Biopython, you can create an ungapped consensus sequence with a threshold for including a particular residue in a column.
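
If what you need is the alignment itself with the sparse columns removed, rather than a consensus sequence, the same threshold idea can be applied per column. Below is a minimal pure-Python sketch, not a Biopython feature: the function name `drop_rare_insertion_columns`, its parameters, and the toy data are made up for illustration, and it assumes the aligned sequences are already loaded as equal-length strings with "-" (or ".") as the gap character.

```python
def drop_rare_insertion_columns(seqs, min_fraction=0.05, gap_chars="-."):
    """Return the aligned sequences with sparsely occupied columns removed.

    A column is kept when the fraction of sequences that carry a residue
    (any character not in gap_chars) is at least min_fraction.
    Hypothetical helper written for this example.
    """
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    n = len(seqs)
    keep = []
    for col in range(length):
        # Count the sequences that have a residue (not a gap) in this column.
        occupied = sum(1 for s in seqs if s[col] not in gap_chars)
        if occupied / n >= min_fraction:
            keep.append(col)

    # Rebuild every sequence from the retained columns only.
    return ["".join(s[c] for c in keep) for s in seqs]


# Toy alignment: 40 sequences, and the third column has a residue in only
# one of them (1/40 = 2.5%, below the 5% cutoff), so it is dropped.
toy = ["AC-GT"] * 39 + ["ACTGT"]
filtered = drop_rare_insertion_columns(toy, min_fraction=0.05)
print(filtered[0], filtered[-1])  # prints: ACGT ACGT
```

To apply this to a real file, one option is to read it with Biopython's `AlignIO.read("alignment.aln", "clustal")` (the filename is a placeholder), take `str(record.seq)` for each record, and write the filtered strings back out with the original identifiers. Note that dropping a column also deletes the inserted residues from the few sequences that had them.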

written 24 months ago by jrj.healey (13k)