Creating a phylogenetic tree following MSA
10 weeks ago

Hi, on my Windows machine I used the Clustal Omega (clustalo) command-line tool to run an MSA on 1,000 different protein sequences (around 30-40 amino acids long). I now want to use the aligned output to build a phylogenetic tree. Which tool would be the best fit for this job? I plan to use more sequences in the future, so a tool that scales to larger datasets would have an advantage.

Clustalo trees MSA Phylogenetic Python • 362 views
10 weeks ago
Mensur Dlakic ★ 22k

For that many sequences, and especially considering they are short, an educated guess is that many of them are identical. If so, there is no point in including them all in a tree. You will end up with branches that have several leaves of zero length, because there is no distance between the sequences. You can get the same information from the fact that they are identical.

Besides, removing identical sequences will also make the tree more amenable to visual inspection. I have looked at trees ranging from a couple of entries to tens of thousands. To my eyes, trees with more than 150-200 entries are very difficult to inspect, and those with over a thousand are almost impossible to grasp in a meaningful way. Cutting out the redundant sequences should help with that.

CD-HIT can be used to cluster sequences and remove redundant ones:

https://sites.google.com/view/cd-hit
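For the exact-duplicate case only, the collapsing step can also be sketched in plain Python. This is a minimal illustration and not a replacement for CD-HIT, which clusters by a similarity threshold. The function names `read_fasta` and `collapse_identical` and the toy sequences are made up for this example, and the sketch assumes FASTA input in which gap characters can be ignored when comparing sequences.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def collapse_identical(records):
    """Group headers by identical sequence (gaps removed, case-insensitive).

    Returns a dict mapping each unique sequence to the list of headers
    that share it, in order of first appearance.
    """
    groups = {}
    for header, seq in records:
        key = seq.replace("-", "").upper()
        groups.setdefault(key, []).append(header)
    return groups


# Toy input (hypothetical sequences): seq1 and seq2 differ only by a gap.
example = """>seq1
MKT-AYIAK
>seq2
MKTAYIAK
>seq3
MKTAYLAK
""".splitlines()

groups = collapse_identical(read_fasta(example))
for seq, headers in groups.items():
    # Print one representative header, the group size, and the sequence.
    print(headers[0], len(headers), seq)
# seq1 2 MKTAYIAK
# seq3 1 MKTAYLAK
```

Keeping the list of headers per unique sequence preserves the information about which entries were identical, so only one representative per group needs to go into the tree.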

There is a list of all phylogeny-related programs:

https://evolution.genetics.washington.edu/phylip/software.html

Most of them will not have any problem with the dataset size you have. I recommend IQ-TREE from the Maximum Likelihood list and MrBayes among Bayesian inference programs. MEGA will probably do fine if your main feature of interest is user-friendliness.
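If you go with IQ-TREE, a run can be scripted from Python roughly as follows. This is only a sketch under assumptions: the executable is assumed to be installed as `iqtree2` (IQ-TREE 2.x; other versions use a different binary name and older ones use different option spellings), the file name `aligned.fasta` is a placeholder, and the helper names are made up for this example. The options shown are `-s` (alignment file), `-m MFP` (automatic model selection), `-B` (ultrafast bootstrap replicates) and `-T` (threads).

```python
import shutil
import subprocess


def build_iqtree_command(alignment, executable="iqtree2",
                         bootstraps=1000, threads="AUTO"):
    """Build the argument list for an IQ-TREE 2 run (nothing is executed)."""
    return [
        executable,
        "-s", alignment,        # input alignment (FASTA, PHYLIP, ...)
        "-m", "MFP",            # let ModelFinder pick a substitution model
        "-B", str(bootstraps),  # ultrafast bootstrap replicates
        "-T", str(threads),     # number of threads, or AUTO
    ]


def run_iqtree(alignment, **kwargs):
    """Run IQ-TREE and return the expected path of the resulting tree file."""
    cmd = build_iqtree_command(alignment, **kwargs)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} was not found on PATH")
    subprocess.run(cmd, check=True)
    # By default IQ-TREE uses the alignment file name as the output prefix.
    return alignment + ".treefile"


# Show the command that would be run for a placeholder alignment file.
print(" ".join(build_iqtree_command("aligned.fasta")))
```

Calling `run_iqtree("aligned.fasta")` would then start the analysis. The resulting Newick tree can be opened in any tree viewer.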


I recommend using Linux and the command line :)
