Question: Relationship Between The Accuracy Of A Phylogenetic Tree And The Length Of The Sequences Used For Analysis
Asked 4.8 years ago by kmkdesilva (United States):

Hi all,

What is the relationship between the length of the sequences used for phylogenetic tree building and the accuracy of the generated tree?

I have generated a set of sequences of around 20,000 bp each, built from NGS data using the Stacks pipeline. I am wondering whether there is a connection between sequence length and the accuracy of the inferred tree.

Any help is highly appreciated and thank you in advance.

Regards, kmkdesilva

modified 7 weeks ago by Biostar • written 4.8 years ago by kmkdesilva

It depends on what you are studying (phylogeny of species or of genes?). Also, what exactly do you mean by connection? A linear correlation? A correlation that can be expressed by some other kind of function? A limit below (or above) which the tree is unreliable? Let me speculate a little: there is certainly a connection. Imagine your sequences are 10 bp long. Would you trust the phylogenetic tree? Conversely, imagine your sequences are as long as an entire chromosome (I assume that, given your data, you are studying phylogeny based on a given gene). Would you trust the tree, or would you worry about mixing the phylogenetic signal at your locus with the signal from several other loci that might have evolved with different histories? Although I root for the motto "the more the better", if more means adding low-quality sequences, or sequences from a portion of the genome whose evolution differed from that of the specific locus you are interested in, then more is not always better.
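
The 10 bp thought experiment can be tried out in a small simulation. What follows is only a sketch under assumptions that do not come from this thread: four taxa on a known tree ((A,B),(C,D)), a Jukes-Cantor substitution model, made-up branch lengths, and a simple distance-based rule (the four-point condition on raw mismatch counts) standing in for a real tree-building program. The function names are hypothetical. Every site here shares one history, so this illustrates the sampling-noise side of the argument only, not the mixed-histories caveat.

```python
import math
import random

BASES = "ACGT"
# For each base, the three bases it can change into.
OTHERS = {b: [x for x in BASES if x != b] for b in BASES}


def evolve(seq, t, rng):
    """Evolve seq along a branch of length t (expected substitutions
    per site) under the Jukes-Cantor model."""
    # Probability that a site ends the branch in a different state.
    p_change = 0.75 * (1.0 - math.exp(-4.0 * t / 3.0))
    return [rng.choice(OTHERS[b]) if rng.random() < p_change else b
            for b in seq]


def hamming(x, y):
    """Number of sites at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(x, y))


def simulate_quartet(length, rng):
    """Simulate tips A, B, C, D on the true tree ((A,B),(C,D)).
    Branch lengths (0.05 internal, 0.1 terminal) are arbitrary choices."""
    root = [rng.choice(BASES) for _ in range(length)]
    left = evolve(root, 0.05, rng)
    right = evolve(root, 0.05, rng)
    return (evolve(left, 0.1, rng), evolve(left, 0.1, rng),
            evolve(right, 0.1, rng), evolve(right, 0.1, rng))


def infer_pairing(a, b, c, d):
    """Four-point rule: pick the pairing with the smallest summed
    distance. Returns None when the minimum is tied."""
    sums = {
        "AB|CD": hamming(a, b) + hamming(c, d),
        "AC|BD": hamming(a, c) + hamming(b, d),
        "AD|BC": hamming(a, d) + hamming(b, c),
    }
    best = min(sums.values())
    winners = [k for k, v in sums.items() if v == best]
    return winners[0] if len(winners) == 1 else None


def recovery_rate(length, replicates, seed=0):
    """Fraction of replicates in which the true pairing AB|CD is the
    unique best choice (ties count as failures)."""
    rng = random.Random(seed)
    hits = sum(infer_pairing(*simulate_quartet(length, rng)) == "AB|CD"
               for _ in range(replicates))
    return hits / replicates


if __name__ == "__main__":
    for n in (10, 50, 200, 1000, 5000):
        print(n, recovery_rate(n, replicates=100))
```

Under these assumptions the recovery rate should rise with sequence length, because longer alignments reduce the sampling noise in the pairwise distances. It says nothing about what happens once the extra sites come from loci with different histories.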

written 4.8 years ago by Fabio Marroni

It is actually a phylogeny of species. I extracted these SNPs from the whole genome, and I tried to remove most of the low-quality data at the beginning. I was wondering whether accuracy increases or decreases as the sequence length increases.

written 4.8 years ago by kmkdesilva