Question: Detecting Derived Sites
ngsgene (United States) wrote 9.0 years ago:

I am working on coding sequences from two plant species and have identified their orthologs using reciprocal best BLAST hits. To detect the derived sites in one of the species (the species of interest), I am treating the other as carrying the ancestral sequence; comparing the two at every site then lets me identify the derived sites in the species of interest.

I need to separate the synonymous and non-synonymous derived sites so that I can compute their summary statistics separately. I would like to understand why they are classified differently, and I would welcome suggestions on how this can be done.

non orthologues blast • 1.8k views
Modified 8.4 years ago by Casey Bergman; written 9.0 years ago by ngsgene
Casey Bergman (Athens, GA, USA) wrote 9.0 years ago:

1) With only two species, you cannot determine which state is ancestral and which is derived. You need a minimum of 3 OTUs with a clear outgroup to determine ancestral states, and even then you can be misled.

2) Synonymous and non-synonymous sites are treated differently because they are under different selective regimes, and therefore their substitution rates differ drastically from one another. Synonymous rates are more similar across genes than non-synonymous rates. See Wen-Hsiung Li's textbook for detailed consideration of this classic topic in molecular evolution.
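
The two points above can be combined in a small sketch. Given three aligned, in-frame coding sequences (the focal species, a sister species, and an outgroup), a change is assigned to the focal lineage only when the sister and the outgroup agree, and it is then labelled synonymous or non-synonymous using the standard genetic code. The function name `derived_changes` and the simple parsimony rule are illustrative assumptions, not a method proposed in this thread; codons with gaps, ambiguous bases, or more than one difference are simply skipped.

```python
from itertools import product

# Standard genetic code, built in TCAG order (an assumption: nuclear genes).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def derived_changes(focal, sister, outgroup):
    """Return (codon_index, ancestral_codon, derived_codon, kind) tuples
    for changes inferred on the focal lineage by simple parsimony."""
    if not (len(focal) == len(sister) == len(outgroup)) or len(focal) % 3:
        raise ValueError("sequences must be aligned, in frame, and equal length")
    results = []
    for i in range(0, len(focal), 3):
        f, s, o = (seq[i:i + 3].upper() for seq in (focal, sister, outgroup))
        if any(c not in CODON_TABLE for c in (f, s, o)):
            continue  # gap or ambiguous base: skip this codon
        if s != o or f == s:
            continue  # no change, or the ancestral state cannot be resolved
        if sum(a != b for a, b in zip(f, s)) != 1:
            continue  # several differences in one codon: order is ambiguous
        same_aa = CODON_TABLE[f] == CODON_TABLE[s]
        results.append((i // 3, s, f, "synonymous" if same_aa else "nonsynonymous"))
    return results


if __name__ == "__main__":
    # Toy sequences invented for illustration only.
    for row in derived_changes("ATGCTAGAT", "ATGCTGGAA", "ATGCTGGAA"):
        print(row)
```

Even with an outgroup, this kind of parsimony call can be wrong when the same site has changed more than once, which is the sense in which one "can be misled".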


That's a good point. I had been considering one of the two as an outgroup and deriving information based on that.

Reply written 9.0 years ago by ngsgene
Vitis (New York) wrote 9.0 years ago:

Without an outgroup, you can't really say which states are ancestral and which are derived. There are many different methods for calculating Ka/Ks, if that's what you're trying to do. As a starting point, take a look at Nei and Gojobori 1986 to see what those parameters mean and how they are calculated. More modern methods use a phylogenetic tree and codon-based models to estimate Ka/Ks site-by-site and lineage-by-lineage, as implemented in HyPhy.
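
As a rough illustration of what those parameters rest on, the sketch below counts the number of synonymous and non-synonymous sites in a coding sequence in the spirit of Nei and Gojobori 1986: for each codon position, the fraction of the three possible single-base changes that leave the amino acid unchanged is counted as synonymous. The function names are hypothetical, changes to stop codons are counted here as non-synonymous (implementations differ on this point), and the second half of the method, counting the observed differences between two sequences, is not shown.

```python
from itertools import product
from math import log

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites_codon(codon):
    """Number of synonymous sites in one codon (a value between 0 and 3)."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                s += 1.0 / 3.0  # one of the three possible changes at this position
    return s


def site_counts(seq):
    """Return (S, N): synonymous and non-synonymous site totals for a sequence.
    Assumes an in-frame sequence of unambiguous codons with no stop codon."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    s_total = sum(syn_sites_codon(c) for c in codons)
    return s_total, 3 * len(codons) - s_total


def jukes_cantor(p):
    """Jukes-Cantor correction turning a proportion of differences into a distance."""
    return -0.75 * log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    print(site_counts("ATGTTAGGG"))
```

Dividing the observed synonymous and non-synonymous differences by S and N gives the proportions pS and pN, which `jukes_cantor` converts into the distances whose ratio is reported as Ka/Ks.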


No, you didn't understand. You need at least three taxa to infer ancestral or derived states, and only if you're pretty sure that one of the three diverged before the other two split can you use it as an outgroup. You can never figure out ancestral and derived states from two taxa alone. As for Tajima's D, if I understand correctly, it's a statistic used at the population level. It's an indicator of rare allele frequencies, which means you need at least 3 entries/sequences. By the way, identifying orthologs is not as straightforward as you think; usually BLAST alone is not enough.
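
To make the population-level nature of Tajima's D concrete, here is a minimal sketch of the standard formula (Tajima 1989), which compares the mean number of pairwise differences with the number of segregating sites scaled by the sample size. The function name `tajimas_d` is hypothetical, and the sketch assumes equal-length, gap-free aligned sequences from one population; it returns `None` when the variance term is zero, which happens when there are no segregating sites or the sample is too small for the statistic to be informative.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(seqs)
    if n < 3:
        raise ValueError("need at least 3 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating sites (columns with more than one state).
    seg = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants that depend only on the sample size n.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg + e2 * seg * (seg - 1)
    if variance <= 1e-12:
        return None  # statistic undefined for this sample
    return (pi - seg / a1) / sqrt(variance)


if __name__ == "__main__":
    # Toy sample invented for illustration: three singleton variants.
    print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"]))
```

An excess of rare variants pulls the mean pairwise difference below the scaled number of segregating sites and makes D negative. Running the function once on synonymous sites and once on non-synonymous sites is one way to compute the statistic separately for the two classes.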

Reply written 9.0 years ago by Vitis

I have 50 accessions of one species, one of them being the reference sequence. Basically, I am running BLAST between the reference sequence and the other species and keeping the genes with a bidirectional best hit, as Khader has pointed out in this thread.
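
The bidirectional best hit step described above can be sketched as follows, assuming both searches were saved in BLAST's default 12-column tabular format (`-outfmt 6`), where column 1 is the query ID, column 2 the subject ID, and column 12 the bit score. The function names and the choice of bit score as the ranking criterion are assumptions made for illustration.

```python
def best_hits(lines):
    """Map each query ID to its highest-scoring subject ID.
    `lines` is an iterable of BLAST tabular (-outfmt 6) lines."""
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject, bits = fields[0], fields[1], float(fields[11])
        if query not in best or bits > best[query][1]:
            best[query] = (subject, bits)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a_gene, b_gene) pairs that are each other's best hit."""
    ab = best_hits(a_vs_b)
    ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in ab.items() if ba.get(b) == a)


if __name__ == "__main__":
    # Toy hit tables invented for illustration; real input would be read
    # from the two BLAST output files.
    a_vs_b = [
        "geneA1\tgeneB1\t98.0\t300\t6\t0\t1\t300\t1\t300\t1e-150\t540",
        "geneA1\tgeneB2\t80.0\t300\t60\t0\t1\t300\t1\t300\t1e-60\t220",
        "geneA2\tgeneB2\t90.0\t200\t20\t0\t1\t200\t1\t200\t1e-80\t300",
    ]
    b_vs_a = [
        "geneB1\tgeneA1\t98.0\t300\t6\t0\t1\t300\t1\t300\t1e-150\t540",
        "geneB2\tgeneA1\t80.0\t300\t60\t0\t1\t300\t1\t300\t1e-60\t220",
    ]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Ties in bit score are resolved here by keeping the first hit seen, which is one reason reciprocal best hits can miss or mis-assign orthologs in gene families with recent duplications.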

When you say blast is not enough, are there any other (better) tools that can be used then?

Reply modified 8 months ago by RamRS; written 9.0 years ago by ngsgene

I am studying a single species, and the other is therefore being treated as the outgroup; as Bergman pointed out, I need to look into that. Initially I am looking to calculate Tajima's D, and I am trying to work out the synonymous/non-synonymous separation: what exactly needs to be separated if I am computing Tajima's D per gene, while synonymous changes are defined at the amino acid level?

Reply written 9.0 years ago by ngsgene

BLAST-based ortholog identification is one type of approach; the other major type uses phylogenetic trees to guide the ortholog identification. There is a very good summary of this here. Personally, I'd prefer tree-based approaches.

What Is The Best Method To Find Orthologous Genes Of A Species?

Reply modified 8 months ago by RamRS; written 9.0 years ago by Vitis

Also, for 50 accessions (individuals from different populations?) within one species, if they're very close (check ~10 genes?), it should be straightforward to find the orthologs, because genetic divergence within one species is usually very small. You can probably use the read mapping results to reconstruct the orthologs directly.

Reply written 9.0 years ago by Vitis