Question: How To Align Protein Sequences Of Unequal Length?
6.2 years ago
Sagar Nikam
Pune, India
Sagar Nikam wrote:

I have 4500 protein sequences with very low similarity. They are at least 40 residues long, and the longest are about 300 to 500. Does this affect the quality of a phylogenetic tree built from the output of a multiple sequence alignment (MSA)?

What should I do if I want to align all of the protein sequences without losing useful information? Should I do further data curation? If yes, how?

6.2 years ago
Boston, MA USA
Larry_Parnell wrote:

The quality of the MSA strongly influences the output of the phylogenetic analysis.

In my opinion, it is more informative to build phylogenetic trees not from all 4500 sequences at once, but separately for each gene or protein family. This makes more sense biologically and reduces the complexity of the problems of sequence diversity and length. If two protein families show some plausible degree of shared evolutionary history, then you can attempt to combine their respective trees after separate MSAs and trees are built.
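As a rough illustration of the per-family split, here is a minimal Python sketch. It assumes you already have a family label for each sequence ID (how you obtain those labels is not covered here), and the names `parse_fasta`, `group_by_family`, `to_fasta` and the `family_of` mapping are hypothetical helpers made up for this example, not part of any library. Each resulting group can then be written to its own FASTA file and aligned separately.

```python
from collections import defaultdict


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def group_by_family(records, family_of):
    """Group (header, sequence) records by family label.

    family_of is an assumed dict mapping sequence ID -> family name.
    IDs missing from the mapping end up under "unassigned".
    """
    groups = defaultdict(list)
    for header, seq in records:
        fields = header.split()
        seq_id = fields[0] if fields else ""  # ID = first word of the header
        groups[family_of.get(seq_id, "unassigned")].append((seq_id, seq))
    return dict(groups)


def to_fasta(records):
    """Format (id, sequence) pairs as FASTA text, one sequence per line."""
    return "".join(f">{seq_id}\n{seq}\n" for seq_id, seq in records)
```

Usage would look something like `groups = group_by_family(parse_fasta(open("all.fasta").read()), family_of)`, followed by writing `to_fasta(groups[name])` to one file per family; the file name `all.fasta` is only a placeholder. Looking at the sequence lengths within each group is also an easy way to spot fragments that may need curation before alignment.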

