Question: Ambiguous fields in FANTOM 5 Enhancer_TSS_association.bed file
rohitsatyam102 wrote (5 weeks ago):

Hi all

I downloaded a file named "enhancer tss associations" from the enhancer database Slidebase. However, I cannot work out what the coordinates in the first three columns of this BED file represent. I understand that the fourth column, with entries like "chr1:167440766-167441089; NM_052862; RCSD1; R:0.319; FDR:0", gives the enhancer coordinates, transcript accession number, gene symbol, some score, and false discovery rate. I am not sure what kind of score the R score is. I went through the Andersson et al. paper that the website points to, but I couldn't find an explanation. Also, the last two columns don't make sense to me.

Corentin wrote (5 weeks ago):

Hi,

The file is in the BED12 format: http://genome.ucsc.edu/FAQ/FAQformat.html#format1 . This format is used to display tracks on a Genome Browser.

The last two columns describe where blocks are drawn on the Genome Browser. In my understanding, one block represents the enhancer and the other the TSS. Column 11 (blockSizes) gives the length of each block, and column 12 (blockStarts) gives the start of each block relative to chromStart (the second column).
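
As a rough illustration of how the block columns relate to the first three columns, here is a minimal Python sketch that converts the relative block starts of one BED12 line into absolute coordinates. The function name `bed12_blocks` and the example line are made up for illustration; the coordinates are not taken from the FANTOM5 file. In BED12 the first block starts at chromStart and the last block ends at chromEnd, so columns 2 and 3 should simply span from the start of one feature to the end of the other.

```python
def bed12_blocks(line):
    """Return the absolute (chrom, start, end) of every block in a BED12 line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError("expected at least 12 tab-separated BED12 columns")

    chrom = fields[0]
    chrom_start = int(fields[1])
    block_count = int(fields[9])
    # blockSizes and blockStarts are comma-separated and may end with a comma.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    if len(sizes) != block_count or len(starts) != block_count:
        raise ValueError("blockCount does not match blockSizes/blockStarts")

    # blockStarts are offsets from chromStart, not genomic positions.
    return [(chrom, chrom_start + s, chrom_start + s + size)
            for s, size in zip(starts, sizes)]


# Hypothetical line (illustrative values only): a 300 bp block at the left
# edge of the interval and a 1 bp block at its right edge.
example = "\t".join([
    "chr1", "1000", "5000", "example_name", "319", ".",
    "1000", "5000", "0,0,0", "2", "300,1", "0,3999",
])
blocks = bed12_blocks(example)
for block in blocks:
    print(block)
```

Running this on a few lines of the real file and comparing the resulting block coordinates with the enhancer coordinates in the 4th column should show which block is the enhancer and which is the TSS.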

You can see an example of the two blocks here (notice how the line name corresponds to the 4th column of your file):

http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr1%3A858252%2D861621&hgsid=791359097_TnfoVJubF5SaAM0recpdIpTpvsGI

The R score (calculated as a Pearson correlation coefficient) represents the strength of the association between an enhancer and a TSS: the higher it is, the stronger the association. As you can see, the higher the R score, the higher the value in the score column (this is because the score column is used to draw the blocks in different shades of grey).
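
If you want to check the relationship between R and the score column yourself, a small sketch like the following splits the 4th column into its parts. The function name `parse_association_name` is hypothetical, and the field layout (five parts separated by ";", with "R:" and "FDR:" prefixes) is assumed from the single example quoted in the question, so other lines in the file may need extra handling.

```python
def parse_association_name(name):
    """Split a name such as 'chr1:1-2; NM_x; GENE; R:0.3; FDR:0' into its parts."""
    parts = [p.strip() for p in name.split(";")]
    if len(parts) != 5:
        raise ValueError("expected 5 ';'-separated parts, got %d" % len(parts))

    region, transcript, gene, r_field, fdr_field = parts
    chrom, span = region.split(":")
    start, end = span.split("-")
    return {
        "enhancer": (chrom, int(start), int(end)),
        "transcript": transcript,
        "gene": gene,
        # Assumed layout: the value follows the 'R:' or 'FDR:' prefix.
        "r": float(r_field.split(":", 1)[1]),
        "fdr": float(fdr_field.split(":", 1)[1]),
    }


# The example name quoted in the question.
info = parse_association_name(
    "chr1:167440766-167441089; NM_052862; RCSD1; R:0.319; FDR:0"
)
print(info["enhancer"], info["gene"], info["r"], info["fdr"])
```

Collecting `info["r"]` alongside column 5 for each line would let you see how the two values track each other.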

For more information you can also read the FANTOM5 paper: https://www.nature.com/articles/nature12787


Thanks, Corentin, for your explanation. However, it is still unclear to me what the first three columns of the BED file represent. I am sure they aren't the enhancer coordinates, and they aren't the TSS either. I would like to understand what the start and end coordinates refer to in this case.

Reply written 4 weeks ago by rohitsatyam102