MAFFT branch length vs. BLAST
7.7 years ago
shirani.s ▴ 10

I am trying to compare 10 sequences (~1 Mbp each) to estimate their divergence and build a phylogenetic tree. I am getting my tree from MAFFT, which reports branch lengths in substitutions per site. However, these branch lengths are about 10 times higher than the divergence I get from pairwise BLAST comparisons between the sequences.

Is there a justification for this? If not, what are the alternatives for getting consistent results?

I suspect 1Mbp is too much for one or both programs.

Check the length requirements and limitations in the corresponding README files.

7.4 years ago
pawlowac ▴ 80

MAFFT isn't made for multiple genome alignment, particularly because of horizontal gene transfer and rearrangements (assuming these occur in what you are looking at). Also, a MAFFT guide tree and pairwise BLAST identity are very different things, and the MAFFT guide tree is not a phylogenetic tree. If you want a rough approximation, I suggest FastTree, but I'm not sure your alignment would be good enough for phylogenetic reconstruction.
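To make the "different things" point concrete, here is a minimal sketch of two distance measures that are often confused. The raw mismatch fraction (p-distance) is roughly what 1 minus BLAST identity gives you, and only over the regions BLAST managed to align. A model-corrected distance in substitutions per site is closer to what tree branch lengths mean. I'm using the Jukes-Cantor (JC69) correction purely as an illustration; it is not what MAFFT computes for its guide tree, and the function names are my own.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of mismatching sites over aligned columns with no gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


def jc69_distance(p):
    """Jukes-Cantor corrected distance (substitutions per site) from a p-distance."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy aligned pair, made up for illustration only.
p = p_distance("ACGTACGTAC", "ACGTTCGTAC")
d = jc69_distance(p)
print(p, d)
```

The corrected distance is always at least as large as the raw mismatch fraction, but for closely related sequences the gap is small, so a correction like this would not by itself account for a tenfold difference. That size of discrepancy points more toward the alignment or the kind of tree being read.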

You need to give more information on what you are trying to do. If you just want a species phylogeny, then maybe pick ~10 housekeeping genes and align them separately (possibly with MAFFT; I like the L-INS-i method, linsi), or use something like PhyloSift.
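If you go the housekeeping-gene route, one common follow-up (my assumption, not the only option) is to concatenate the per-gene alignments into a single alignment before building the tree. A rough sketch of that step is below. It assumes each gene has already been aligned separately and loaded as a dict mapping a taxon name to its aligned sequence; the function name and the toy data are hypothetical.

```python
def concatenate_alignments(gene_alignments):
    """Join per-gene alignments end to end, padding missing taxa with gaps.

    gene_alignments: list of dicts, each mapping taxon name -> aligned sequence.
    Returns a dict mapping taxon name -> concatenated aligned sequence.
    """
    taxa = sorted({taxon for aln in gene_alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon absent from this gene gets an all-gap block of the same width.
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(blocks) for taxon, blocks in parts.items()}


# Toy input: two already-aligned genes, with one taxon missing from each.
gene1 = {"A": "AC-T", "B": "ACGT"}
gene2 = {"A": "GG", "C": "GA"}
supermatrix = concatenate_alignments([gene1, gene2])
for name, seq in supermatrix.items():
    print(name, seq)
```

The concatenated alignment can then be written out as FASTA and passed to a tree builder such as FastTree. Keeping the genes separate and comparing the individual gene trees is a reasonable alternative if you suspect horizontal transfer.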

If you are not interested in a species phylogeny, then I would suggest looking at gene content and focusing on what you are interested in studying.


