Question: Phylogenetic tree in R
3 months ago, phoenix.sum13 wrote:

How come the BLAST result for my sample sequence shows top hits with 97-100% similarity, but the bootstrap value at the branch point of my sample is only 30/100 in the phylogenetic tree? Only the branch point of my sample has such a low bootstrap value; the rest of the branch points have high bootstrap values. How do I interpret this result?

3 months ago, Mensur Dlakic (USA) wrote:

Bootstrap values show support for a particular branch configuration. Generally speaking, branch support is not correlated with percent identity of the entries. For example, it is possible to have 3-4 entries with >95% identity to each other, all on the same branch and with poor bootstrap support. It is also possible to have 3-4 entries with <80% identity to each other, all on the same branch and with high bootstrap support. Until you provide more details and the tree itself, it is impossible to answer this question except in general terms.


Hi Sir, I actually used the EzBioCloud platform for BLAST. These are all 16S rRNA genes of the Bacillus genus. My sample, which I had sequenced here, is labeled "MT7". If you could check these links, it would be really helpful for me.

1) Tree: https://docs.google.com/presentation/d/1UnQCzhJtKOjBz0gQwgV8wQnSB0_7MHL8R853w710pGo/edit?usp=sharing

2) Sequences: https://drive.google.com/file/d/1OzEr9mBjMpDayEV9G54SyGBzRxTxsU7E/view?usp=sharing

Thank you so much for the response.

written 3 months ago by phoenix.sum13

It is tough to get a clear-cut tree with sequences as similar as yours. Specifically, you have three sequences at the top, of which two are identical, and the third has 1 substitution in 1432 residues. The two identical sequences definitely need to be next to each other, since nothing is more similar to either of them than the other. Next, during the reconstruction the program has to "decide" which of the two identical sequences (MT7 and ASJC01000029) is closer to AMSH01000114. That, of course, is a trick question, since they are identical and therefore equally close to AMSH01000114. A possible solution is to slot AMSH01000114 in between the other two sequences, but make the AMSH01000114 branch long so that the other two are still closer to each other than to AMSH01000114. Those are the three topological possibilities, and they are roughly equally likely given the minuscule differences between the three top sequences. That's why you get a bootstrap value of 30.
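
To make the tie concrete, here is a minimal Python sketch on made-up 10-base stand-ins for the three sequences (not the real 16S data; `count_differences` is a hypothetical helper). It shows that the two identical sequences are equally distant from the third, so the data cannot favor one arrangement over another. As an added side calculation, and assuming the usual bootstrap that resamples alignment columns with replacement, it also computes how often the single variable column is drawn at all in a replicate of 1432 columns.

```python
def count_differences(a, b):
    """Number of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)

# Toy stand-ins for the three top sequences (NOT the real 16S data).
toy = {
    "MT7":          "ACGTACGTAC",
    "ASJC01000029": "ACGTACGTAC",  # identical to MT7
    "AMSH01000114": "ACGTACGTAT",  # one substitution
}

d_pair = count_differences(toy["MT7"], toy["ASJC01000029"])
d_mt7 = count_differences(toy["MT7"], toy["AMSH01000114"])
d_asjc = count_differences(toy["ASJC01000029"], toy["AMSH01000114"])
print(d_pair, d_mt7, d_asjc)  # 0 1 1: both identical sequences are equally far

# Chance that one specific column is sampled at least once when n columns
# are drawn with replacement (one bootstrap replicate).
n = 1432
p_site_sampled = 1 - (1 - 1 / n) ** n
print(round(p_site_sampled, 3))  # about 0.632
```

In the remaining replicates (roughly 37%) the three resampled sequences are identical to each other, which leaves the program even less to go on for that branch.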

A simple solution that will fix that branch is to remove ASJC01000029 from the tree, because it is identical to your query and therefore unnecessary. In that case the relationship between MT7 and AMSH01000114 will be unambiguous and the bootstrap support will be high.
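
One way to do this before building the tree is to collapse exact duplicates in the FASTA file. The following is only a sketch in plain Python: `read_fasta` and `drop_duplicates` are hypothetical helpers, the toy sequences are invented, and it assumes the sequences cover the same region, since exact string comparison treats a shorter copy of the same gene as different.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records

def drop_duplicates(records, keep=("MT7",)):
    """Keep one record per distinct sequence, preferring names listed in `keep`."""
    chosen = {}
    for name, seq in records:
        if seq not in chosen or name in keep:
            chosen[seq] = name
    survivors = set(chosen.values())
    # Return the surviving records in their original order.
    return [(name, seq) for name, seq in records if name in survivors]

# Toy input (NOT the real 16S data).
toy_fasta = """>MT7
ACGTACGTAC
>ASJC01000029
ACGTACGTAC
>AMSH01000114
ACGTACGTAT
"""

unique = drop_duplicates(read_fasta(toy_fasta))
print([name for name, _ in unique])  # ['MT7', 'AMSH01000114']
```

The `keep` argument is there so that the query (MT7) survives rather than its identical database match, whichever comes first in the file.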

ADD REPLYlink written 3 months ago by Mensur Dlakic6.0k

Oh thank you so much, I understood. I'll try that. You really helped me a lot. Also, one more question: what does 100% similarity mean in the case of 16S rRNA? Does it mean they're exactly the same?

written 3 months ago by phoenix.sum13

I don't know what exactly 100% similarity means on that BLAST platform you are using, because the words similar and identical are only similar but not identical. Sorry, couldn't resist :-))

A safe bet is that they mean identical, because that is what is typically reported, so it likely means they are exactly the same. You can see below how BLASTn reports its matches; there are plenty of sequences identical to ASJC01000029, so presumably to your sequence as well.

[Image: BLASTn match list]
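
If in doubt, percent identity can be checked directly on the aligned pair. This is a rough sketch with a hypothetical `percent_identity` helper and synthetic sequences; it skips gapped columns, which is one convention among several (tools differ in what they put in the denominator), so the number may not match a given platform exactly.

```python
def percent_identity(a, b, gap="-"):
    """Percent of aligned, ungapped columns where a and b carry the same base."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Synthetic 1432-base sequence and a copy with one substitution at the end.
ref = "ACGT" * 358
one_off = ref[:-1] + "A"

print(percent_identity(ref, ref))                # 100.0
print(round(percent_identity(ref, one_off), 2))  # 99.93
```

Note that 1 difference in 1432 positions is 99.93%, which a report rounded to whole percents would also show as 100, so checking the mismatch count is safer than relying on the displayed percentage.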

modified 3 months ago • written 3 months ago by Mensur Dlakic

Oh I understood now, thank you so much once again. You really helped me a lot.

written 3 months ago by phoenix.sum13

Did the branch support go to 100 once you removed the identical sequence?

written 3 months ago by Mensur Dlakic

Yes, thank you. I also learned that the 16S rRNA gene can be identical in some species. Thanks again!!!

https://drive.google.com/file/d/1_H92Gj8l2Jim7dib6ZgVjQqY0KwAqw6U/view?usp=sharing

written 3 months ago by phoenix.sum13

Also, I used the maximum likelihood method and the K80 evolutionary model for tree construction.
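
For reference, K80 (Kimura's two-parameter model) distinguishes transitions from transversions. A maximum likelihood program uses the model inside its likelihood calculation rather than as a pairwise distance, but the closed-form K80 distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)] with P and Q the proportions of transitions and transversions, gives a feel for how small the differences are here. The sketch below uses a hypothetical `k80_distance` helper and synthetic sequences, and it does not guard against highly divergent pairs, where the logarithm is undefined.

```python
import math

# Purine <-> purine and pyrimidine <-> pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def k80_distance(a, b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Use only columns where both sequences have an unambiguous base.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable columns")
    p = sum(1 for pair in pairs if pair in TRANSITIONS) / n
    q = sum(1 for x, y in pairs if x != y and (x, y) not in TRANSITIONS) / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Synthetic 1432-base sequence and a copy with one transition (T -> C).
ref = "ACGT" * 358
one_transition = ref[:-1] + "C"

d = k80_distance(ref, one_transition)
print(d)  # very close to 1/1432, i.e. about 0.0007 substitutions per site
```

With a single substitution the corrected distance is almost the same as the raw proportion of differences, so the choice of model has little effect on that part of the tree.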

written 3 months ago by phoenix.sum13