Calculating conservation depth of multiple regions (in the form of BED coordinates)
7.7 years ago

Hello! I am trying to calculate the conservation depth of multiple regions (100-3000 bp) in the human genome. Since there are more than 10,000 regions, it is not possible to check conservation manually. I downloaded the phastCons score files from UCSC (the 46way wigFix files) and averaged the phastCons scores over each of my regions, but the results were a bit confusing. For example, chr7:21114483-21117423 (hg19) has phastCons max = 1 and mean = 0.437672, and in the UCSC browser it is conserved down to fish, whereas chr7:20838708-20841649 (hg19) has max = 1 and mean = 0.459534, yet in the browser it is conserved in mammals only. If a region with a score of about 0.4 is conserved in Tetraodon, I would expect every other region with a similar score to be conserved down to fish as well. Why is there this contradiction? Kindly guide me, or suggest some other way of getting a proper conservation depth for these regions.
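For reference, the per-region averaging described here can be sketched roughly as below. This is only a minimal sketch, not the exact script that produced the numbers above. It assumes the score files are uncompressed fixedStep wiggle text with one value per line, and that regions are given as 0-based, half-open BED coordinates. The names `parse_wigfix` and `region_stats` are made up for illustration.

```python
def parse_wigfix(lines):
    """Yield (chrom, pos0, score) for each base in fixedStep wiggle text.

    pos0 is 0-based; the 'start' field in a fixedStep header is 1-based.
    """
    chrom, pos, step = None, 0, 1
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("fixedStep"):
            # Header looks like: fixedStep chrom=chr7 start=101 step=1
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields["chrom"]
            pos = int(fields["start"]) - 1
            step = int(fields.get("step", 1))
            continue
        yield chrom, pos, float(line)
        pos += step


def region_stats(lines, chrom, start, end):
    """Mean, max and scored fraction for the BED interval [start, end)."""
    scores = [s for c, p, s in parse_wigfix(lines)
              if c == chrom and start <= p < end]
    if not scores:
        return None
    return {
        "mean": sum(scores) / len(scores),
        "max": max(scores),
        # Bases without any score are ignored by the mean, so report them.
        "covered": len(scores) / (end - start),
    }


# Tiny made-up input, not real phastCons data.
demo = "fixedStep chrom=chr7 start=101 step=1\n0.0\n0.5\n1.0\n1.0\n"
stats = region_stats(demo.splitlines(), "chr7", 100, 104)
print(stats)
```

Note that a mean computed this way summarizes how strong the scores are across the region; it carries no information about which species contribute to them. Scanning the whole file once per region would also be slow for 10,000 regions, so a real run would need some form of indexing.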

conservation alignment phastcons ucsc chain files • 1.7k views

Another solution I saw was to use the chain files to find the most distant species in which at least 50% of the query sequence is conserved. But the chain files on UCSC are split into blocks or patches according to conservation. I can't work out how to combine those blocks to look up my region of interest, or how to calculate the 50% conservation threshold against the most distant species.
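One way the block-combining step could look is sketched below, under several assumptions. The chain files are taken to be human-referenced, so the target (`tName`, `tStart`) is hg19. Aligned coverage is used as a stand-in for "conserved". The species list is supplied by the user, ordered from nearest to most distant. The function names and the toy species labels are hypothetical, and which chain set to use (all chains, nets, liftOver chains) is left open.

```python
def chain_target_blocks(lines):
    """Yield (tName, start, end) for every ungapped block in UCSC chain text.

    Header: chain score tName tSize tStrand tStart tEnd qName ... id
    Data lines: 'size dt dq', and a final line with 'size' only.
    """
    t_name, t_pos = None, 0
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        if fields[0] == "chain":
            t_name, t_pos = fields[2], int(fields[5])
            continue
        size = int(fields[0])
        yield t_name, t_pos, t_pos + size
        if len(fields) == 3:
            t_pos += size + int(fields[1])  # skip the gap on the target


def covered_fraction(blocks, chrom, start, end):
    """Fraction of [start, end) covered by the union of the blocks."""
    clipped = sorted((max(s, start), min(e, end))
                     for c, s, e in blocks
                     if c == chrom and s < end and e > start)
    covered, reach = 0, start
    for s, e in clipped:
        if e > reach:  # count only bases not already covered
            covered += e - max(s, reach)
            reach = e
    return covered / (end - start)


def conservation_depth(blocks_by_species, species_by_distance,
                       chrom, start, end, min_fraction=0.5):
    """Most distant species whose blocks cover >= min_fraction of the region."""
    deepest = None
    for species in species_by_distance:  # ordered nearest first
        blocks = blocks_by_species.get(species, [])
        if covered_fraction(blocks, chrom, start, end) >= min_fraction:
            deepest = species
    return deepest


# Made-up toy data: one small chain for "mouse", one hand-written block for "fish".
chain_text = ("chain 1000 chr7 159138663 + 100 200 chrQ 5000 + 0 90 1\n"
              "40 10 0\n"
              "50\n")
blocks = {
    "mouse": list(chain_target_blocks(chain_text.splitlines())),
    "fish": [("chr7", 100, 130)],
}
order = ["mouse", "fish"]
print(conservation_depth(blocks, order, "chr7", 100, 200))
print(conservation_depth(blocks, order, "chr7", 100, 140))
```

Taking the union of blocks means that overlapping chains from the same species are not counted twice. The 50% threshold is exposed as `min_fraction` so it can be varied.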


Perhaps you might investigate per-base conservation/evolution signal, like phyloP.


No, not per base; I think that is the main issue. phyloP and phastCons both give me per-base scores, but I need to estimate the conservation depth of a whole patch/region. A very crude solution was to average the per-base scores across the region to get a single mean score, but that is not working, as I described in my question. Thank you so much for answering; I am looking forward to your suggestions and comments. I just need to know the most distant organism in which each region is conserved, and there are more than 10K such regions, which makes viewing them as a UCSC custom track impossible.
