Question: Phylogenetic Tree to compare my sequences to a database
2.1 years ago, Neuls wrote:

Hi,

We are currently screening lactobacilli in different kinds of samples. I have isolated around 40 strains, sequenced a fragment of their 16S gene, and BLASTed it to identify the species.

I'd like to build some kind of phylogenetic tree. To do so, I have downloaded, in FASTA format, all lactobacilli 16S sequences available in European databases. I'd like to build a tree from these sequences together with the ones I sequenced, so that I can place mine relative to the downloaded ones. I thought about doing this with the neighbor-joining algorithm, but I'm not sure.

I'm pretty new to building phylogenetic trees, so I'd appreciate any tips and a good tool recommendation for Unix systems or Windows. I have also done a bit of research in forums and found this topic (What is the fastest way and software to build phylogenetic trees from WGS NGS data). Would the method described in the answers there be valid?

Thank you :)

modified 2.1 years ago by abascalfederico • written 2.1 years ago by Neuls
2.1 years ago, abascalfederico (Spain) wrote:

How many sequences do you have? I would try a maximum likelihood approach using RAxML or PhyML. The RAxML-gui helps, as does PhyML's web server at ATGC or the program SeaView. A key step is to build as reliable an alignment as possible. You can use MAFFT or MUSCLE and then clean unreliably aligned regions with trimAl, Gblocks, by hand...
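
To illustrate what "cleaning unreliably aligned regions" can mean in its simplest form, here is a minimal Python sketch that drops alignment columns consisting mostly of gaps. It is only a crude gap-fraction filter, much simpler than what trimAl or Gblocks do; the function name `trim_gappy_columns`, the dict-based input, and the 0.5 default threshold are assumptions made for this example.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    Returns a new dict with the same names and the trimmed sequences.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all aligned sequences must have the same length")
    # Keep only the column indices that are not dominated by gaps.
    keep = [
        i for i in range(length)
        if sum(seq[i] == "-" for seq in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in zip(names, seqs)}


# Toy alignment (hypothetical data): the fifth column is a gap in 2 of 3 sequences.
example = {
    "ref1": "ACGT-ACG",
    "ref2": "ACGT-ACG",
    "iso1": "ACGTTAC-",
}
trimmed = trim_gappy_columns(example, max_gap_fraction=0.5)
```

In the toy data only the fifth column is removed; the last column has a single gap out of three sequences and is kept.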

Also, before trying an ML approach, you may want to build a quick neighbor-joining tree to see whether you have any strange sequences.
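
A quick way to spot strange sequences even before building a tree is to look at pairwise distances in the alignment. The sketch below is not neighbor joining itself, only a distance-based sanity check: it flags sequences whose median p-distance to all other sequences is unusually high. The function names and the 0.3 threshold are assumptions for illustration, and the pure-Python all-against-all loop will be slow for thousands of sequences.

```python
from statistics import median


def p_distance(a, b):
    """Fraction of differing sites over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0  # no shared sites: treat the pair as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)


def flag_outliers(alignment, threshold=0.3):
    """Return names whose median p-distance to the other sequences exceeds threshold."""
    names = list(alignment)
    flagged = []
    for name in names:
        distances = [
            p_distance(alignment[name], alignment[other])
            for other in names if other != name
        ]
        if distances and median(distances) > threshold:
            flagged.append(name)
    return flagged


# Toy alignment (hypothetical data): "odd" differs from the others at almost every site.
toy = {
    "a": "ACGTACGTAC",
    "b": "ACGTACGTAT",
    "c": "ACGAACGTAC",
    "odd": "TGCATGCATG",
}
suspects = flag_outliers(toy, threshold=0.3)
```

The median is used instead of the mean so that one very divergent sequence does not inflate the score of all the normal ones.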

written 2.1 years ago by abascalfederico

In addition, I would suggest running a test for the optimal substitution model, e.g. ModelTest. According to a recent article by Tan et al. (2015), automatic filtering of alignments (Gblocks, trimAl) does not improve phylogenetic trees.

modified 2.1 years ago • written 2.1 years ago by Michael Dondrup

Interesting paper indeed. What do you mean by substitution model?

written 2.1 years ago by Neuls

What I meant is explained here: http://www.molecularevolution.org/resources/models/nucleotide
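
As a concrete illustration, the simplest nucleotide substitution model, Jukes-Cantor (JC69), assumes equal base frequencies and equal rates for all substitutions. Under it, the observed proportion of differing sites p underestimates the true distance because multiple substitutions can hit the same site, and the corrected distance is d = -(3/4) ln(1 - 4p/3). A minimal Python sketch (the function name is chosen for this example):

```python
import math


def jc69_distance(p):
    """Jukes-Cantor corrected distance from the observed proportion of differing sites p."""
    if not 0 <= p < 0.75:
        # At p >= 0.75 the sequences look random under JC69 and the log is undefined.
        raise ValueError("JC69 is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# The correction is small for similar sequences and grows as p increases.
d_small = jc69_distance(0.1)
d_large = jc69_distance(0.5)
```

Richer models such as HKY or GTR relax these assumptions, and tools like ModelTest compare how well each model fits the alignment.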

written 2.1 years ago by Michael Dondrup

Thank you, this indeed helped me :)

written 2.1 years ago by Neuls

Although I haven't gone through the paper, my guess is that it depends. In my experience, cleaning alignments has solved problems in phylogenetic reconstruction, and this makes perfect sense from a theoretical perspective: any extraneous sequence or badly aligned region can confound phylogenetic reconstruction. However, if done automatically, there will probably be many cases in which part of the phylogenetic signal is lost during cleaning, which would explain the worse performance of automatic cleaning. I should read the paper, though :-)

written 2.1 years ago by abascalfederico

Hi, I downloaded around 3000 sequences; the ones I sequenced are 40 in total. I tried a few web services, but I cannot upload such a large sequence set. I have already aligned all the sequences together using MUSCLE. So neighbor joining would be the first option, and then a likelihood approach using the programs you mentioned? Also, I fear that so many sequences will make an unclear tree with too much information, and I don't know how to cope with this. I just want to place my sequences on some kind of "reference tree" made from the sequences I downloaded.

modified 2.1 years ago • written 2.1 years ago by Neuls

RAxML can handle thousands of sequences, but the resulting tree would be harder to interpret.

You can try some software to remove sequence redundancy. With cd-hit you can remove sequences that are X% identical. That would for sure reduce the size of your alignment.

To remove redundancy I prefer to use Jalview (a multiple alignment viewer and editor). You select all but your sequences and then click "remove redundancy", where you can select different thresholds. Also make sure to remove strange, largely incomplete sequences and poorly aligned regions.
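
The idea of thinning the reference set while keeping your own sequences can be sketched in a few lines of Python. This is a simple greedy filter on an existing alignment, not the algorithm used by CD-HIT or Jalview; the function names, the `protected` argument, and the 99% default threshold are assumptions for illustration.

```python
def identity(a, b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def remove_redundancy(alignment, protected, threshold=99.0):
    """Greedily keep one representative among references >= threshold % identical.

    alignment: dict mapping name -> aligned sequence.
    protected: set of names (e.g. your own isolates) that are always kept.
    """
    kept = {name: seq for name, seq in alignment.items() if name in protected}
    representatives = []  # non-protected sequences retained so far
    for name, seq in alignment.items():
        if name in protected:
            continue
        # Keep this reference only if it is not too similar to any kept reference.
        if all(identity(seq, alignment[rep]) < threshold for rep in representatives):
            representatives.append(name)
            kept[name] = seq
    return kept


# Toy alignment (hypothetical data): ref2 duplicates ref1, iso1 is our own isolate.
aln = {
    "ref1": "ACGTACGTAC",
    "ref2": "ACGTACGTAC",
    "ref3": "ACGTTTGTAC",
    "iso1": "ACGTACGTAC",
}
reduced = remove_redundancy(aln, protected={"iso1"}, threshold=99.0)
```

Here `ref2` is dropped as a duplicate of `ref1`, while `iso1` is kept even though it is identical to `ref1`, because it is in the protected set.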

written 2.1 years ago by abascalfederico

Oh yes, I know Jalview! I forgot it could remove redundancy. Anyhow, I have already tried to upload those sequences, but sadly my PC can't handle it, so I think I need a better PC for this task ^^". Thank you for the advice, I was pretty lost with phylogenetic trees!

written 2.1 years ago by Neuls