Question: Ambiguous fields in FANTOM 5 Enhancer_TSS_association.bed file
10 months ago, rohitsatyam102 wrote:

Hi all

I downloaded a file named "enhancer tss associations" from the enhancer database Slidebase. However, I am having trouble working out what the coordinates in the first three columns of this BED file represent. I understand that the fourth column, containing entries like "chr1:167440766-167441089; NM_052862; RCSD1; R:0.319; FDR:0", gives the enhancer coordinates, transcript accession number, gene symbol, some score, and false discovery rate. I am not sure what kind of score the R score represents. I went through the paper by Andersson et al. to which the website points, but I couldn't find anything. Also, the last two columns don't make sense to me.

rna-seq genome gene • 261 views
10 months ago, Corentin wrote:


The file is in the BED12 format. This format is used to display tracks on a genome browser.

The last two columns describe where blocks are drawn on the genome browser. In my understanding, one block represents the enhancer and the other the TSS. One column (blockSizes, the 11th) gives the length of each block, and the other (blockStarts, the 12th) gives the start of each block relative to the start position in the second column (chromStart).
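To make the block arithmetic concrete, here is a minimal Python sketch that turns the last two columns into absolute coordinates. The function name `block_intervals` and the sample line are hypothetical (the coordinates are invented for illustration, not taken from the FANTOM5 file); only the column layout follows the BED12 format.

```python
def block_intervals(bed12_line):
    """Return absolute (start, end) pairs for each block of a BED12 line."""
    fields = bed12_line.rstrip("\n").split("\t")
    chrom_start = int(fields[1])  # column 2: chromStart
    # Columns 11 and 12 are comma-separated lists; a trailing comma is allowed.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    # Block starts are offsets from chromStart, not genomic positions.
    return [(chrom_start + s, chrom_start + s + size)
            for s, size in zip(starts, sizes)]


# Hypothetical BED12 line with invented coordinates (assumption, not real data).
line = "\t".join([
    "chr1", "1000", "5000", "example", "0", ".",
    "1000", "1000", "0,0,0", "2", "400,1000", "0,3000",
])
print(block_intervals(line))  # [(1000, 1400), (4000, 5000)]
```

Running this on a line from the downloaded file should show which block overlaps the enhancer coordinates given in the fourth column and which one sits at the TSS.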

You can see an example of the two blocks here (notice how the line name corresponds to the 4th column of your file):

The R score (calculated as a Pearson correlation coefficient) represents the strength of the association between an enhancer and a TSS: the higher it is, the stronger the association. As you can see, the higher the R score, the higher the value in the score column (the fifth column), because the score column is used to draw the blocks in different shades of grey.
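If you want to compare R with the score column across the whole file, the name field has to be split up first. A minimal sketch, assuming every name has the same five semicolon-separated parts as the example in the question; the function name `parse_association_name` and the dictionary keys are hypothetical.

```python
def parse_association_name(name):
    """Split a name like 'enhancer; transcript; gene; R:x; FDR:y' into a dict."""
    # Assumption: exactly five ';'-separated parts, as in the example entry.
    parts = [p.strip() for p in name.split(";")]
    enhancer, transcript, gene, r_part, fdr_part = parts
    return {
        "enhancer": enhancer,
        "transcript": transcript,
        "gene": gene,
        "R": float(r_part.split(":", 1)[1]),
        "FDR": float(fdr_part.split(":", 1)[1]),
    }


# Example entry quoted in the question.
name = "chr1:167440766-167441089; NM_052862; RCSD1; R:0.319; FDR:0"
info = parse_association_name(name)
print(info["gene"], info["R"], info["FDR"])  # RCSD1 0.319 0.0
```

The parsed `R` value can then be set against the fifth column of the same line to check how the two relate.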

For more information you can also read the FANTOM5 paper:


Thanks, Corentin, for your explanation. However, it is still unclear to me what the first three columns of the BED file represent. I am sure they aren't the coordinates of the enhancers, and they aren't the TSS either. I would like to understand what the start and end coordinates refer to in this case.

written 10 months ago by rohitsatyam102

I did some testing on the UCSC genome browser and it seems that the first three columns correspond to the whole feature (the enhancer + TSS). This is probably so that the genome browser displays everything.

The coordinates do not exactly match the features (the region seems to start before and end after the actual enhancer and TSS, which is probably to improve the view).

But since I have not found any documentation for it, I am not 100% sure. Let us know if you manage to find an answer.
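One way to test this on the file itself is to measure how far columns 2 and 3 extend beyond the outermost blocks. A minimal sketch; `span_offsets` is a hypothetical helper and the sample fields are invented. As far as I know, the UCSC BED description expects the first block to begin at chromStart and the last block to end at chromEnd, so a well-formed line should give `(0, 0)`.

```python
def span_offsets(fields):
    """Return (gap before the first block, gap after the last block) in bp."""
    chrom_start, chrom_end = int(fields[1]), int(fields[2])
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    first_block_start = chrom_start + min(starts)
    last_block_end = chrom_start + max(s + size for s, size in zip(starts, sizes))
    return first_block_start - chrom_start, chrom_end - last_block_end


# Hypothetical BED12 fields with invented coordinates (assumption, not real data).
fields = ["chr1", "1000", "5000", "example", "0", ".",
          "1000", "1000", "0,0,0", "2", "400,1000", "0,3000"]
print(span_offsets(fields))  # (0, 0)
```

If every line of the file returns `(0, 0)`, the first three columns are simply the span from the start of the first block to the end of the last one, and any padding around the enhancer and TSS comes from the block sizes themselves.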

written 9 months ago by Corentin