Identifying/Annotating Enhancers
11 months ago
cthangav ▴ 100

I have a file with the coordinates of possible enhancers in reprogramming mouse cells (fibroblasts and iPS cells, mm10 genome). The datasheet looks like this, where RE is the enhancer and TG/TF are the associated genes/transcription factors:

 TF TG  Score   FDR REs
 Pou5f1 L1td1   1.09E+06    9.24E-06    chr4_98727908_98728497
 Pou5f1 Cd109   580062  9.24E-06    chr9_78623220_78624062;chr9_78615332_78616035
 Sox2   L1td1   428168  9.24E-06    chr4_98727908_98728497;chr4_98726641_98726938

I want to confirm that they are enhancers and categorize them as known or novel enhancers, as in this figure. Is there a way to do this?

Graph of Known/Novel Enhancers

I was thinking of comparing the regions with a known-enhancer database such as EnhancerAtlas (but it looks like they use mm9) or H3K4me1 ENCODE data (although there isn't a track for iPSCs). Is there a way to find/count the union set in R, or do I need to use bedtools?

bedtools R ENCODE • 641 views

The question here, for me, is not so much how to do this technically but rather what is considered an enhancer.

After all, a variety of methods are used to call "enhancers", a term that, depending on context, means either any regulatory element or one carrying certain marks. ATAC-seq measures open chromatin, ChIP-seq can probe associated marks such as H3K4me1 and H3K27ac, and CAGE-seq can identify active transcription of non-promoter elements. You will find literature calling enhancers with any of these methods, but without any functional data on whether the called region indeed has regulatory activity.

As such, there is much uncertainty when pulling from any database. You may first want to check whether there are datasets or databases that use the exact assay you used to call enhancers, and then compare against those. Overlap between independent methods is naturally limited, and just because a CAGE-seq database misses an enhancer you called does not necessarily mean it is unreported; it may be in all of the ChIP-based databases.
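For the mechanical part of such a comparison, here is a minimal Python sketch, not a definitive pipeline. It parses the `REs` column from the question (`chr_start_end`, joined by `;`) and labels each region "known" if it overlaps any reference interval by at least `min_overlap` bases, otherwise "novel". The reference intervals in `known` are made up purely for illustration; real ones would come from whichever database you choose, and they must be on the same assembly as your calls (mm9 coordinates would first need converting to mm10, for example with UCSC liftOver). All function names are hypothetical, and coordinates are assumed to be half-open.

```python
from collections import Counter, defaultdict


def parse_region(token):
    """Turn 'chr4_98727908_98728497' into ('chr4', 98727908, 98728497)."""
    # rsplit keeps chromosome names that themselves contain underscores intact.
    chrom, start, end = token.rsplit("_", 2)
    return chrom, int(start), int(end)


def parse_res_column(field):
    """Split one REs cell ('a;b;c') into a list of (chrom, start, end)."""
    return [parse_region(tok) for tok in field.split(";") if tok]


def index_by_chrom(regions):
    """Group (start, end) pairs by chromosome for quicker lookup."""
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))
    return by_chrom


def overlaps(region, index, min_overlap=1):
    """True if region shares at least min_overlap bases with any indexed interval."""
    chrom, start, end = region
    for k_start, k_end in index.get(chrom, []):
        if min(end, k_end) - max(start, k_start) >= min_overlap:
            return True
    return False


def classify(candidates, known, min_overlap=1):
    """Label each unique candidate region as 'known' or 'novel'."""
    index = index_by_chrom(known)
    return {
        region: "known" if overlaps(region, index, min_overlap) else "novel"
        for region in set(candidates)
    }


# REs cells copied from the example rows in the question.
res_cells = [
    "chr4_98727908_98728497",
    "chr9_78623220_78624062;chr9_78615332_78616035",
    "chr4_98727908_98728497;chr4_98726641_98726938",
]
candidates = [r for cell in res_cells for r in parse_res_column(cell)]

# ASSUMPTION: invented reference intervals, standing in for a real database.
known = [("chr4", 98728000, 98729000), ("chr9", 78615000, 78615500)]

labels = classify(candidates, known)
counts = Counter(labels.values())
print(counts)
```

The linear scan per chromosome is fine for a few thousand regions; for genome-scale sets, `bedtools intersect` or `GenomicRanges::findOverlaps` in R do the same job far more efficiently. Which `min_overlap` counts as "the same enhancer" is a judgment call and will change the known/novel split.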


Is there a common identifier or piece of information shared between your "possible enhancer coordinates" datasheet and databases such as EnhancerAtlas that could be used to map between the two?
