Phylogenetic tree in R
12 months ago

Why do the BLAST results for my sample sequence show top hits with 97-100% similarity, while the bootstrap value at my sample's branch point in the phylogenetic tree is only 30/100? Only the branch point of my sample has such a low bootstrap value; the other branch points all have high values. How do I interpret this result?

BLAST Bootstrap Phylogenetic Tree R • 362 views
12 months ago
Mensur Dlakic ★ 10k

Bootstrap values show support for a particular branch configuration. Generally speaking, branch support is not correlated with percent identity of the entries. For example, it is possible to have 3-4 entries with >95% identity to each other, all on the same branch and with poor bootstrap support. It is also possible to have 3-4 entries with <80% identity to each other, all on the same branch and with high bootstrap support. Until you provide more details and the tree itself, it is impossible to answer this question except in general terms.

Hi Sir, I actually used the EzBioCloud platform for BLAST. These are all 16S rRNA genes of the genus Bacillus. My sample, which I got sequenced here, is labeled "MT7". If you could check these links it would be really helpful for me. 1) Tree: https://docs.google.com/presentation/d/1UnQCzhJtKOjBz0gQwgV8wQnSB0_7MHL8R853w710pGo/edit?usp=sharing

Thank you so much for the response.

It is tough to get a clear-cut tree with sequences as similar as yours. Specifically, you have three sequences at the top, of which two are identical, and the third has 1 substitution in 1432 residues. So the two identical sequences definitely need to be next to each other, since nothing is more similar to either of them than the other. Next, during the reconstruction the program has to "decide" which of the two identical sequences (MT7 and ASJC01000029) is closer to AMSH01000114. That, of course, is a trick question, since they are identical and therefore equally close to AMSH01000114. A possible solution is to slot AMSH01000114 in between the other two sequences, but make the AMSH01000114 branch long so that the other two are still closer to each other than to AMSH01000114. Those are the three topological possibilities, and they are roughly equally likely given the minuscule differences between the three top sequences. With three roughly equally likely arrangements, each one is recovered in about a third of the bootstrap replicates, which is why you get a bootstrap value of 30.
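To make the tie concrete, here is a rough Python sketch. It uses toy random sequences as stand-ins for MT7, ASJC01000029 and AMSH01000114 (not the real 16S data), and the position of the single substitution is made up. It resamples alignment columns with replacement, as a bootstrap does, and checks how often the two identical sequences are exactly equidistant from the third.

```python
import random

def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def bootstrap_columns(length, rng):
    """Column indices for one bootstrap replicate (sampled with replacement)."""
    return [rng.randrange(length) for _ in range(length)]

LENGTH = 1432  # alignment length mentioned in the thread
rng = random.Random(0)

# Toy stand-ins: random sequence, NOT the real 16S rRNA data.
mt7 = "".join(rng.choice("ACGT") for _ in range(LENGTH))
asjc = mt7   # identical to the query
pos = 700    # hypothetical position of the single substitution
amsh = mt7[:pos] + ("A" if mt7[pos] != "A" else "G") + mt7[pos + 1:]

replicates = 500
ties = 0       # replicates where MT7 and ASJC are equidistant from AMSH
no_signal = 0  # replicates where the one variable column was never sampled
for _ in range(replicates):
    cols = bootstrap_columns(LENGTH, rng)
    a = [mt7[i] for i in cols]
    b = [asjc[i] for i in cols]
    c = [amsh[i] for i in cols]
    if hamming(a, c) == hamming(b, c):
        ties += 1
    if hamming(a, c) == 0:
        no_signal += 1

# Chance that one given column is absent from a replicate: (1 - 1/L)^L, about 1/e.
p_absent = (1 - 1 / LENGTH) ** LENGTH

print(f"MT7 and ASJC equidistant from AMSH: {ties}/{replicates}")
print(f"all three sequences identical: {no_signal}/{replicates}")
print(f"expected fraction missing the variable column: {p_absent:.3f}")
```

The tie holds in every replicate by construction, so no resampling can tell the program which of the two identical sequences should sit next to the third. In addition, roughly 37% of replicates never sample the one variable column, so in those replicates all three sequences look identical.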

A simple solution that will fix that branch is to remove ASJC01000029 from the tree, because it is identical to your query and therefore unnecessary. In that case the relationship between MT7 and AMSH01000114 will be unambiguous and the bootstrap support will be high.
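If there are many sequences, duplicates can be found before building the tree rather than by eye. A minimal sketch, assuming the sequences are already loaded into a dict of name to sequence and trimmed to the same region; the function name `collapse_identical` and the short example sequences are made up for illustration:

```python
def collapse_identical(records, keep=()):
    """Drop sequences that are exact duplicates of one already seen.

    records: dict mapping sequence name -> sequence string.
    keep: names that should be retained when they have duplicates.
    Returns (kept, dropped); dropped maps each removed name to the
    name of the identical sequence that was retained.
    """
    representative = {}  # sequence -> name of the retained copy
    kept, dropped = {}, {}
    # Visit preferred names first so they become the representatives.
    ordered = sorted(records, key=lambda name: name not in keep)
    for name in ordered:
        seq = records[name].upper()
        if seq in representative:
            dropped[name] = representative[seq]
        else:
            representative[seq] = name
            kept[name] = records[name]
    return kept, dropped

# Toy example; these short strings are placeholders, not real 16S sequences.
records = {
    "ASJC01000029": "ACGTACGTAC",
    "MT7": "ACGTACGTAC",
    "AMSH01000114": "ACGTACGTAT",
}
kept, dropped = collapse_identical(records, keep={"MT7"})
print(sorted(kept))
print(dropped)
```

Exact string comparison only catches duplicates that cover the same region with the same length, so sequences that differ only in how they were trimmed would not be collapsed by this sketch.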

Oh, thank you so much, I understand now. I'll try that. You really helped me a lot. Also, one more question: what does 100% similarity mean in the case of 16S rRNA? Does it mean the sequences are exactly the same?

I don't know what exactly 100% similarity means on that BLAST platform you are using, because the words similar and identical are only similar but not identical. Sorry, couldn't resist :-))

A safe bet is that they mean identical, because that is what is typically reported, so it likely means the sequences are exactly the same. You can see below how BLASTn reports its matches; there are plenty of sequences identical to ASJC01000029, and so presumably to your sequence as well.
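For reference, percent identity over an alignment can be sketched as below. This is one common definition (matching positions divided by aligned, non-gap positions); different tools use different denominators, so it is an assumption that the platform in question computes it this way.

```python
def percent_identity(seq1, seq2):
    """Percent identity of two aligned sequences, ignoring gap columns."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared

# Toy sequences: one mismatch in 1432 positions, as in the thread.
ref = "A" * 1432
one_off = "G" + "A" * 1431
print(percent_identity(ref, ref))
print(round(percent_identity(ref, one_off), 2))
```

Under this definition, exactly 100% is reached only when every compared position matches.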

Oh, I understand now, thank you so much once again. You really helped me a lot.

Did the branch support go to 100 once you removed the identical sequence?

Yes, thank you. I also learned that the 16S rRNA gene can be identical in some species. Thanks again!!!

Also, I used the maximum likelihood method and the K80 evolutionary model for tree construction.
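To give a sense of scale for the K80 (Kimura two-parameter) model, here is a sketch of its standard pairwise distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. A maximum likelihood program uses the model inside the likelihood calculation rather than through pairwise distances, so this is only an illustration. The example assumes the single substitution is a transition, which the thread does not state.

```python
import math

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def k80_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        n += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    if n == 0:
        raise ValueError("no comparable positions")
    if ts == 0 and tv == 0:
        return 0.0  # identical sequences have zero distance
    p, q = ts / n, tv / n
    # math.log raises ValueError if the sequences are too divergent for K80.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Toy sequences: one A/G transition in 1432 positions (assumed, see above).
ref = "A" * 1432
one_off = "G" + "A" * 1431
print(k80_distance(ref, ref))
print(f"{k80_distance(ref, one_off):.6f}")
```

With one difference in 1432 sites the distance is about 0.0007 substitutions per site, and two identical sequences are at distance zero from each other and at the same distance from any third sequence.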