Question: Correlation Between Genome Conservation Scores (Phastcons Vs. Phylop)?
8.8 years ago by
Adrian680
Cambridge, MA
Adrian680 wrote:

I've been looking at using conservation scores, obtained from the UCSC Genome database, as a means to prioritize some SNPs we're looking at that are in non-coding regions.

The PhastCons score is the probability that each nucleotide belongs to a conserved element, whereas abs(phyloP) is the -log(p-value) under a null hypothesis of neutral evolution; a negative sign indicates faster-than-expected evolution, while positive values imply conservation.

In eyeballing the data a bit, I was a little surprised that there appears to be only a weak correlation (0.397) between the phyloP and PhastCons values at each site. (I was looking at PhastCons vs. 1 - exp(-phyloP), for sites with positive phyloP only.) While I realize that they're different statistics measuring slightly different things, I would still have expected them to be quite highly correlated.
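For anyone who wants to reproduce this kind of comparison, here is a minimal sketch in plain Python. It assumes the two tracks have already been loaded as equal-length lists of per-site scores with no missing values; the function names and the `base` parameter are made up for illustration. One thing worth checking against the track documentation: as far as I know, phyloP reports base-10 logs, in which case the back-transform would be 1 - 10^(-phyloP) rather than 1 - exp(-phyloP), so the base is left configurable. A rank correlation (Spearman) is included alongside Pearson because it does not depend on which monotonic transform is applied.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (undefined if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

def phylop_to_probability(score, base=10.0):
    """Map a positive phyloP score to 1 - p, ASSUMING score = -log_base(p)."""
    return 1.0 - base ** (-score)

def compare_scores(phastcons, phylop, base=10.0):
    """Pearson and Spearman correlation over sites with positive phyloP only."""
    pairs = [(c, phylop_to_probability(p, base))
             for c, p in zip(phastcons, phylop) if p > 0]
    cons, prob = zip(*pairs)
    return pearson(cons, prob), spearman(cons, prob)
```

Calling `compare_scores(phastcons, phylop)` returns the two coefficients as a tuple; passing `base=math.e` reproduces the 1 - exp(-phyloP) transform used in the question.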

Any experiences or thoughts about using these scores?

conservation comparative • 25k views
8.8 years ago by
Ning-Yi Shao380
United States
Ning-Yi Shao380 wrote:

I am not sure about phyloP, but I once used PhastCons scores. The PhastCons score ranges from 0 to 1 and indicates the level of conservation. But the score does not behave linearly: most of the genome has a score of 0 or no score at all, and only some parts have quite high scores, usually annotated genes (here I am talking about the scores based on the placental mammals). Here are the figures I drew from the 17-way PhastCons scores around 2007 (figure 1, figure 2; in figure 2 the y axis is log-transformed, the left end is score 0 and the right end is score 1, and the many gaps without scores are perhaps due to gaps in the genome alignments).

I don't think you will simply find a high correlation between the two scoring systems. But the most highly conserved regions identified by the two systems should be highly correlated. You could also try different PhastCons tracks, which are based on different species sets, to find the most suitable one, but I am not optimistic that you will find a high correlation.
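To check the two points above on real data, something like the following sketch could be used. It assumes both tracks are loaded as equal-length lists of per-site scores over the same positions, with unscored sites already dropped; the function names and the 5% cutoff are arbitrary choices for illustration. Note that PhastCons has many tied values at the extremes, so membership in the top set is somewhat arbitrary among ties.

```python
def fraction_zero(scores):
    """Share of sites whose score is exactly 0."""
    return sum(1 for s in scores if s == 0) / len(scores)

def top_sites(scores, fraction):
    """Indices of the highest-scoring sites (at least one site)."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])

def top_overlap(phastcons, phylop, fraction=0.05):
    """Jaccard index between the top-scoring site sets of the two tracks."""
    a = top_sites(phastcons, fraction)
    b = top_sites(phylop, fraction)
    return len(a & b) / len(a | b)
```

A `top_overlap` value near 1 would mean the two scores agree on which sites are most conserved, even if the overall per-site correlation is weak.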

8.8 years ago by
Alex Reynolds27k
Seattle, WA USA
Alex Reynolds27k wrote:

I believe that per-base PhastCons scores are the result of a windowed calculation (and the windowing may be tuned depending on the alignment and the genomes involved), while per-base phyloP scores are obtained from a per-base calculation.

Therefore, assuming the same genome alignment is used for score generation, it may be difficult to calculate an informative correlation score without first transforming the phyloP scores in a similar fashion, so as to make a fair comparison.
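One simple way to try this is to smooth the phyloP track with a moving average before correlating it with PhastCons. The sketch below is only a crude stand-in: PhastCons scores come from a phylogenetic hidden Markov model, so their effective smoothing is not a fixed-width window, and the window size of 11 here is an arbitrary placeholder. It also assumes the scores cover a contiguous stretch with no missing positions.

```python
def sliding_mean(scores, window=11):
    """Centered moving average; windows are truncated at the sequence ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    # Prefix sums let each window mean be computed in constant time.
    prefix = [0.0]
    for s in scores:
        prefix.append(prefix[-1] + s)
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return smoothed
```

The smoothed phyloP values can then be compared with PhastCons site by site, and repeating this over a few window sizes shows how sensitive the correlation is to that choice.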
