Question: Best way to build a phylogenetic tree with missing data
5 months ago, Picasa wrote:

Hi all,

I have 50 species and 8 genes. My goal is to build a phylogenetic tree, but I don't have all 8 genes for every species. Some species have just one gene, for example, and most of the genes are fragmented (so for the same gene, the sequence length differs between species).

What is the best way to do this?

1) Concatenate all genes for each species and then perform a single multiple alignment (so with 50 FASTA files)

2) Align each gene separately (so with 8 FASTA files) and then concatenate the alignments per species?

3) Other methods?

Thanks for your help.

5 months ago, Maciej Motyka wrote:

I don't know what the best way is, but you can align the genes separately, build one tree per gene, and then compare the 8 trees using treespace. You can select the most representative tree yourself or let treespace create it for you based on the input trees. Additionally, this method will tell you whether all of the genes point to the same evolutionary history or whether recombination or horizontal gene transfer occurred. Treespace is easy to use and has great vignettes that walk you through the whole process.
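
To make the idea of comparing gene trees concrete, here is a minimal Python sketch. It is not treespace (an R package) and does not reproduce its metrics or its median-tree method. It only illustrates one simple way to score pairwise disagreement: count the clades found in one rooted tree but not the other, after restricting both trees to the species they share. The nested-tuple tree format, the function names, and the toy trees are all assumptions made for illustration.

```python
def leaf_names(tree):
    """Collect the leaf names of a rooted tree written as nested tuples."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaf_names(child) for child in tree))


def clades(tree, keep):
    """Non-trivial clades of the tree after restricting it to the leaves in `keep`."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node]) & keep
        below = frozenset().union(*(visit(child) for child in node))
        # Skip single leaves and the clade containing every retained leaf.
        if 1 < len(below) < len(keep):
            found.add(below)
        return below

    visit(tree)
    return found


def clade_distance(tree_a, tree_b):
    """Number of clades present in exactly one tree, over the shared species only."""
    shared = frozenset(leaf_names(tree_a) & leaf_names(tree_b))
    return len(clades(tree_a, shared) ^ clades(tree_b, shared))


def most_central(trees):
    """Name of the tree with the smallest summed distance to all the others."""
    names = list(trees)
    totals = {
        n: sum(clade_distance(trees[n], trees[m]) for m in names if m != n)
        for n in names
    }
    return min(names, key=totals.get)


# Toy gene trees (hypothetical); species E is missing from gene2.
gene_trees = {
    "gene1": ((("A", "B"), "C"), ("D", "E")),
    "gene2": ((("A", "B"), "C"), "D"),
    "gene3": ((("A", "C"), "B"), ("D", "E")),
}
print(clade_distance(gene_trees["gene1"], gene_trees["gene3"]))
print(most_central(gene_trees))
```

Because each pair is compared only on its shared species, distances between pairs with very different overlap are not strictly comparable. A dedicated package handles this more carefully.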

5 months ago, Mensur Dlakic wrote:

Your option 2) is the way to do it. Trimming the alignment after concatenation is always a good idea.

I would exclude any species that doesn't have at least half the genes, or half the combined sequence length in case there are large differences in gene sizes.
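
As a rough illustration of option 2 combined with the "at least half the genes" filter, the sketch below concatenates per-gene alignments into one supermatrix and pads absent genes with gaps. It assumes each gene has already been aligned on its own and is held as a dictionary of species to aligned sequence. The function name, the gene and species names, and the use of "-" for missing data are assumptions for illustration, not a prescribed workflow.

```python
def concatenate(gene_alignments, min_gene_fraction=0.5):
    """Join per-gene alignments into one supermatrix.

    gene_alignments: {gene: {species: aligned_sequence}}
    Species with fewer than min_gene_fraction of the genes are dropped.
    A gene that a species lacks is filled with '-' over that gene's full length.
    """
    genes = list(gene_alignments)
    lengths = {}
    for gene, alignment in gene_alignments.items():
        sizes = {len(seq) for seq in alignment.values()}
        if len(sizes) != 1:
            raise ValueError(f"sequences for {gene} are not all the same length")
        lengths[gene] = sizes.pop()

    species = sorted({sp for aln in gene_alignments.values() for sp in aln})
    supermatrix = {}
    for sp in species:
        present = sum(sp in gene_alignments[g] for g in genes)
        if present < min_gene_fraction * len(genes):
            continue  # too little data for this species
        supermatrix[sp] = "".join(
            gene_alignments[g].get(sp, "-" * lengths[g]) for g in genes
        )
    return supermatrix


# Toy input with three hypothetical genes; sp3 and sp4 have only one gene each.
toy = {
    "rbcL": {"sp1": "ATGC", "sp2": "AT-C", "sp3": "A-GC"},
    "matK": {"sp1": "GGA", "sp2": "GGT"},
    "cox1": {"sp1": "TT", "sp4": "TA"},
}
matrix = concatenate(toy)
for name, seq in matrix.items():
    print(name, seq)
```

Filtering on combined sequence length instead, as suggested for genes of very different sizes, would mean counting non-gap characters per species rather than genes present.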


Picasa replied 5 months ago:

Thanks! Do you have any recommendations for the trimming step?

Mensur Dlakic replied 5 months ago:

I usually trim at a 50% gap threshold.
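
One possible reading of a 50% gap threshold is "drop every alignment column in which more than half of the sequences have a gap". The sketch below implements that reading in plain Python. Whether a column at exactly 50% is kept, and whether "?" counts as a gap, are assumptions here. The function name is hypothetical, and dedicated trimming tools may define the threshold differently.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Remove columns whose gap fraction exceeds max_gap_fraction.

    alignment: {species: aligned_sequence}, all sequences the same length.
    Returns the trimmed alignment and the 0-based indices of the kept columns.
    """
    sequences = list(alignment.values())
    n = len(sequences)
    keep = [
        i
        for i, column in enumerate(zip(*sequences))
        if sum(char in "-?" for char in column) / n <= max_gap_fraction
    ]
    trimmed = {
        name: "".join(seq[i] for i in keep) for name, seq in alignment.items()
    }
    return trimmed, keep


toy_alignment = {
    "sp1": "ATG-C",
    "sp2": "A-G-C",
    "sp3": "A---T",
    "sp4": "AT--C",
}
trimmed, kept_columns = trim_gappy_columns(toy_alignment)
print(kept_columns)
for name, seq in trimmed.items():
    print(name, seq)
```

Returning the kept column indices makes it possible to work out where each gene starts and ends after trimming a concatenated alignment.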

Picasa replied 5 months ago:

Thanks. To infer the phylogeny, do you run a partitioned analysis using the coordinates of each gene?
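
For context on what "gene coordinates" look like in practice, here is a small sketch that turns per-gene alignment lengths into 1-based column ranges within the concatenated alignment. The output follows the RAxML-style partition line `DNA, name = start-end`. The choice of that format, the function name, and the gene names are assumptions, and the thread does not say which program is used. If columns are trimmed after concatenation, the lengths must be counted on the trimmed alignment.

```python
def partition_lines(gene_lengths, data_type="DNA"):
    """Build RAxML-style partition lines from (gene, length) pairs in concatenation order."""
    lines = []
    start = 1  # partition coordinates are 1-based and inclusive
    for gene, length in gene_lengths:
        if length == 0:
            continue  # a gene trimmed away entirely gets no partition
        end = start + length - 1
        lines.append(f"{data_type}, {gene} = {start}-{end}")
        start = end + 1
    return lines


# Hypothetical gene names and alignment lengths.
parts = partition_lines([("rbcL", 4), ("matK", 3), ("cox1", 2)])
print("\n".join(parts))
```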