Question

Phylogenetic Tree to compare my sequences to a database

0

7.2 years ago

Neuls ▴ 20

Hi,

We are currently screening lactobacilli in different kinds of samples. I have isolated around 40 strains, sequenced a fragment of the 16S rRNA gene from each, and BLASTed the sequences to identify the species.

I'd like to build some kind of phylogenetic tree. To do so, I have downloaded, in FASTA format, all the lactobacilli 16S sequences available in European databases. I'd like to build a tree from these sequences that also includes the ones I sequenced, so that I can place my isolates relative to the downloaded ones. I thought about doing this with a neighbour-joining algorithm, but I'm not sure.

I'm pretty new to building phylogenetic trees, so I'd appreciate any tips, as well as a recommendation for a good tool for Unix systems or Windows. I have also done a bit of research in the forums and found this topic (What is the fastest way and software to build phylogenetic trees from WGS NGS data). Would the method described in the answers there be valid?

Thank you :)

phylogenetic-tree • 2.5k views

ADD COMMENT • link updated 12 months ago by Ram 43k • written 7.2 years ago by Neuls ▴ 20

score 0 · Answer 1 · 2017-02-28

0

7.2 years ago

abascalfederico ★ 1.2k

How many sequences do you have? I would try a maximum likelihood approach using RAxML or PhyML. raxmlGUI helps, as do the PhyML web server at ATGC and the program SeaView. A key step is to build as reliable an alignment as possible. You can use MAFFT or MUSCLE and then clean unreliably aligned regions with trimAl, Gblocks, or by hand.

Also, before trying an ML approach you may want to build a quick neighbor-joining tree to check whether you have any strange sequences.
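
As a rough illustration of what "strange" can mean here, the sketch below computes pairwise p-distances on an existing alignment and flags sequences that are unusually far from everything else. This is not neighbor joining itself, only a simple distance check in the same spirit. The function names, the gap handling, and the `factor=2.0` cutoff are all assumptions made for the example, not part of any tool mentioned above.

```python
from itertools import combinations

# Characters treated as missing data (an assumption of this sketch).
MISSING = set("-N?")


def p_distance(a, b):
    """Proportion of differing sites among columns where both sequences have a base."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in MISSING or y in MISSING:
            continue
        compared += 1
        if x != y:
            differing += 1
    # With no overlapping sites, treat the pair as maximally distant.
    return differing / compared if compared else 1.0


def mean_distances(alignment):
    """alignment: dict of name -> aligned sequence (at least two entries).

    Returns a dict of name -> mean p-distance to all other sequences.
    """
    totals = {name: 0.0 for name in alignment}
    for (n1, s1), (n2, s2) in combinations(alignment.items(), 2):
        d = p_distance(s1, s2)
        totals[n1] += d
        totals[n2] += d
    others = len(alignment) - 1
    return {name: total / others for name, total in totals.items()}


def flag_outliers(alignment, factor=2.0):
    """Names whose mean distance exceeds `factor` times the (upper) median."""
    means = mean_distances(alignment)
    ordered = sorted(means.values())
    median = ordered[len(ordered) // 2]
    return [name for name, m in means.items() if m > factor * median]
```

Flagged sequences are worth inspecting by eye in an alignment viewer before deciding whether to drop them, since a high distance can also mean a genuinely divergent species.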

ADD COMMENT • link 7.2 years ago by abascalfederico ★ 1.2k

1

In addition, I would suggest running a test for the optimal substitution model, e.g. with ModelTest. According to a recent article by Tan et al. (2015), automatic filtering of alignments (Gblocks, trimAl) does not improve phylogenetic trees.
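
Model-selection tools generally fit several substitution models to the alignment and rank them with an information criterion such as AIC. The arithmetic behind that ranking can be sketched as follows. The log-likelihood values are invented for illustration, the parameter counts cover only the substitution model (not branch lengths), and this is not the code of ModelTest or any other program.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def rank_models(fits):
    """fits: dict of model name -> (log-likelihood, number of free parameters).

    Returns a list of (model, AIC) tuples sorted from best to worst.
    """
    scores = {model: aic(lnl, k) for model, (lnl, k) in fits.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical fits for one alignment (made-up log-likelihoods).
example_fits = {
    "JC69": (-5230.4, 0),   # equal rates, equal base frequencies
    "K80": (-5101.7, 1),    # transition/transversion ratio
    "HKY85": (-5050.2, 4),  # ts/tv ratio + 3 free base frequencies
    "GTR": (-5048.9, 8),    # 5 free rates + 3 free base frequencies
}

for model, score in rank_models(example_fits):
    print(f"{model}: AIC = {score:.1f}")
```

In this made-up case the richer GTR model fits slightly better than HKY85, but not by enough to pay for its extra parameters, which is the kind of trade-off a model test is meant to settle.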

ADD REPLY • link 7.2 years ago by Michael 54k

0

Interesting paper indeed. What do you mean by substitution model?

ADD REPLY • link 7.2 years ago by Neuls ▴ 20

0

What I meant is explained here: http://www.molecularevolution.org/resources/models/nucleotide
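
To give a feel for what a substitution model does, here is a minimal sketch of the simplest one, Jukes-Cantor (JC69), which assumes all substitutions are equally likely. It converts the observed proportion of differing sites p into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3), correcting for multiple changes at the same site. The function name is just for illustration.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor corrected distance from an observed proportion of differences p.

    Defined only for 0 <= p < 0.75; at p = 0.75 the sequences are saturated.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must be in the range [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


for p in (0.01, 0.1, 0.3, 0.6):
    print(f"observed p = {p:.2f}  ->  JC69 distance = {jc69_distance(p):.3f}")
```

More complex models (K80, HKY85, GTR and so on) relax the equal-rates assumption, which is why it is worth testing which one suits your data.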

ADD REPLY • link 7.2 years ago by Michael 54k

0

Thank you, this really helped me :)

ADD REPLY • link 7.2 years ago by Neuls ▴ 20

0

Although I haven't gone through the paper, my guess is that it depends. In my experience, cleaning alignments has solved problems in phylogenetic reconstruction, and this makes perfect sense from a theoretical perspective: any extraneous sequence or badly aligned region can confound phylogenetic reconstruction. However, if cleaning is done automatically, there will probably be many cases in which part of the phylogenetic signal is lost, which would explain the worse performance of automatic cleaning. I should read the paper, though :-)

ADD REPLY • link 7.2 years ago by abascalfederico ★ 1.2k

0

Hi, I downloaded around 3000 sequences, and I sequenced 40 myself. I tried a few web services, but I cannot upload such a large sequence set. So far I have aligned all the sequences together using MUSCLE. So would neighbour joining be the first option, followed by a likelihood approach using the programs you mentioned? Also, I fear that this many sequences will produce an unclear tree with too much information, and I don't know how to cope with that. I just want to place my sequences on some kind of "reference tree" made from the sequences I downloaded.

ADD REPLY • link 7.2 years ago by Neuls ▴ 20

2

RAxML can handle thousands of sequences, but the resulting tree will be harder to interpret.

You can try some software to remove sequence redundancy. With CD-HIT (cd-hit-est for nucleotide sequences) you can remove sequences that are X% identical or more to another sequence in the set. That would certainly reduce the size of your alignment.
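
The idea behind identity-based redundancy removal can be sketched in a few lines. This is a naive greedy version working on an existing alignment, not CD-HIT's actual implementation, and the function names, the `keep` argument, and the 0.99 default threshold are assumptions for the example. References are compared only with each other, so a reference identical to one of your own isolates is not thrown away.

```python
def identity(a, b):
    """Fraction of identical sites among columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def remove_redundancy(alignment, threshold=0.99, keep=()):
    """Greedy redundancy filter on an aligned set of sequences.

    alignment: dict of name -> aligned sequence.
    keep: names that must never be removed (e.g. your own isolates).
    Returns the names to retain: protected names first, then one
    representative per group of near-identical reference sequences.
    """
    protected = set(keep)
    own = [name for name in alignment if name in protected]
    references = [name for name in alignment if name not in protected]
    # Visit the most complete references first so they become representatives.
    references.sort(key=lambda name: -sum(c != "-" for c in alignment[name]))
    representatives = []
    for name in references:
        seq = alignment[name]
        if all(identity(seq, alignment[rep]) < threshold for rep in representatives):
            representatives.append(name)
    return own + representatives
```

Note that this compares every reference against every representative, so for ~3000 sequences it will be slow compared with a dedicated tool; it is only meant to show the logic.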

To remove redundancy I prefer to use Jalview (a multiple alignment viewer and editor). You select all sequences except your own and then click "remove redundancy", where you can choose different thresholds. Also make sure to remove strange or largely incomplete sequences and poorly aligned regions.
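
For the last point, a minimal sketch of gap-based cleaning might look like the code below: first drop sequences that are mostly gaps, then drop alignment columns that are mostly gaps. Gap content is only a crude stand-in for "poorly aligned", and the 0.5 thresholds and function names are assumptions for illustration, so check the result by eye.

```python
def drop_incomplete(alignment, max_gap_fraction=0.5):
    """Remove sequences whose fraction of gap characters exceeds the threshold.

    alignment: dict of name -> aligned sequence (all the same length).
    """
    return {
        name: seq
        for name, seq in alignment.items()
        if seq.count("-") / len(seq) <= max_gap_fraction
    }


def drop_gappy_columns(alignment, max_gap_fraction=0.5):
    """Remove alignment columns in which too many sequences have a gap."""
    seqs = list(alignment.values())
    n_seqs = len(seqs)
    keep_cols = [
        i
        for i in range(len(seqs[0]))
        if sum(seq[i] == "-" for seq in seqs) / n_seqs <= max_gap_fraction
    ]
    return {
        name: "".join(seq[i] for i in keep_cols)
        for name, seq in alignment.items()
    }
```

Running `drop_incomplete` before `drop_gappy_columns` matters, because a few very short sequences can otherwise make many columns look gappy.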

ADD REPLY • link 7.2 years ago by abascalfederico ★ 1.2k

0

Oh yeah, I know Jalview! I forgot it could remove redundancy. Anyhow, I have already tried to load those sequences, but sadly my PC can't handle them, so I think I need a better PC for this task ^^". Thank you for the advice, I was pretty lost with phylogenetic trees!

ADD REPLY • link 7.2 years ago by Neuls ▴ 20