Correlation Between Genome Conservation Scores (PhastCons vs. phyloP)?
13.9 years ago
Adrian ▴ 700

I've been looking at using conservation scores, obtained from the UCSC Genome database, as a means to prioritize some SNPs we're looking at that are in non-coding regions.

The PhastCons score is the probability that each nucleotide belongs to a conserved element, whereas abs(phyloP) is the -log(p-value) under a null hypothesis of neutral evolution. A negative sign indicates faster-than-expected evolution, while positive values imply conservation.

In eyeballing the data a bit, I was a little surprised that there appears to be only a weak (0.397) correlation between the phyloP and PhastCons values at each site. (I was looking at PhastCons vs. 1 - exp(-phyloP), for sites with positive phyloP only.) While I realize that they're different statistics measuring slightly different things, I would still have expected them to be quite highly correlated.
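
For anyone wanting to reproduce this kind of comparison, here is a minimal sketch of the calculation described above, assuming the two tracks have already been loaded as equal-length per-site lists. The function names and the `base` parameter are my own and not part of any UCSC tool. The natural base mirrors the 1 - exp(-phyloP) transform used in the post; if the track stores -log10 p-values, `base=10` would be the matching choice.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def phylop_to_prob(score, base=math.e):
    """Map a positive phyloP score (-log p-value) to 1 - p."""
    return 1.0 - base ** (-score)


def correlate_positive_sites(phastcons, phylop, base=math.e):
    """Correlate PhastCons with transformed phyloP, positive phyloP sites only."""
    pairs = [
        (pc, phylop_to_prob(pp, base))
        for pc, pp in zip(phastcons, phylop)
        if pp > 0
    ]
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    return pearson(xs, ys)


if __name__ == "__main__":
    # Made-up toy values, purely illustrative (not real track data).
    toy_phastcons = [0.1, 0.5, 0.9, 0.2]
    toy_phylop = [0.5, 1.0, 2.0, -1.0]
    print(correlate_positive_sites(toy_phastcons, toy_phylop))
```

Note that sites with a negative phyloP are dropped before the correlation is computed, so the result only describes the conserved side of the distribution.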

Any experiences or thoughts about using these scores?

conservation comparative • 35k views
13.9 years ago
Ning-Yi Shao ▴ 390

I am not sure about phyloP, but I once used PhastCons scores. The PhastCons score runs from 0 to 1 and indicates the conservation level. But it is not distributed linearly: most of the genome has a score of 0 or no score at all, and only some parts have quite high scores, usually annotated genes (here I am talking about the scores based on the placental mammals). Here are the figures I drew from the 17-way PhastCons scores around 2007 (figure 1, figure 2). Figure 2 has a log-transformed y axis; the left is score 0 and the right is score 1. Many positions have no score, perhaps because of gaps in the genome alignments.

I don't think you will simply find a high correlation between the two scoring systems. But the most highly conserved regions identified by the two should be highly correlated. You could also try different PhastCons tracks, which are based on different species spans, to find the right one, but I am not optimistic that you will find a high correlation.
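
Since the PhastCons distribution is so skewed, one option (my suggestion, not something tested on the real tracks) is a rank-based correlation, which only asks whether the two scores order the sites similarly. A minimal sketch, assuming equal-length per-site lists; the function names are hypothetical and ties get their average rank:

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

With many sites sharing a PhastCons score of exactly 0, the tie handling matters a lot, so it may still be worth restricting the comparison to sites scored by both tracks.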

13.9 years ago

I believe that per-base PhastCons scores are the result of a windowed calculation (and the windowing may be tuned depending on the alignment and the genomes involved), while per-base phyloP scores are obtained from a per-base calculation.

Therefore, assuming the same genome alignment is used for score generation, it may be difficult to calculate an informative correlation score without first transforming the phyloP scores in a similar fashion, so as to make a fair comparison.
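
As a rough illustration of that kind of transformation, here is a minimal sketch that smooths per-base phyloP scores with a centered moving average before any correlation is computed. A plain moving average is only a crude stand-in for whatever smoothing PhastCons actually applies, and the function name and default window size are arbitrary assumptions on my part.

```python
def sliding_mean(scores, window=5):
    """Centered moving average; the window is truncated at both ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        chunk = scores[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

The smoothed phyloP values can then be compared against PhastCons at the same positions, and repeating this for a few window sizes would show how sensitive the correlation is to the choice.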
