3.8 years ago by
UT Southwestern Medical Center
It sounds like you're trying to identify 'active enhancers'. You have a couple of interesting options for doing this. For visualization, @Vivek Bhardwaj's answer is great, but I'm assuming that you'd like a list of putative 'active enhancers' on which to overlay your TF ChIP-seq data.
The literature tells us that enhancers (whether active or inactive) are typically marked with H3K4me1, and that active enhancers are typically also marked with H3K27Ac. From your post, I gather you may not be aware that there is also literature showing H3K4me3 to be associated with active enhancers. So enhancer definitions are somewhat scattered, and many histone modification patterns are cell-line specific.
The first and simplest option is to use bedtools intersect to look for overlapping peaks of H3K4me1 + H3K27Ac. There are a couple of caveats, but it's a good starting point. You have to take into account that enhancers are typically distal to genes (there ARE intragenic enhancers, but this post does not cover them; they're a whole other mess). So your best bet is to head over to the UCSC Table Browser, download a RefSeq or Gencode annotation of all coding transcripts for your species, and then use bedtools slop to extend every transcript by 5 kb in both directions, i.e. 5 kb upstream of the TSS and 5 kb downstream of the transcript end. Here $GENOME is a tab-delimited file of chromosome names and sizes.
bedtools slop -i $REFSEQ -g $GENOME -b 5000 > $REFSEQ_SLOPED
After getting your sloped transcripts, you can overlay your histone data and identify H3K4me1 peaks that lie outside those extended transcripts and overlap an H3K27Ac peak. You could then argue that these areas are 'probable enhancers'.
bedtools intersect -a $H3K4me1 -b $REFSEQ_SLOPED -v | bedtools intersect -a - -b $H3K27Ac > $OUTPUT_ENHANCERS
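If it helps, the two steps above can be wrapped into one small function. This is only a sketch: the function name and argument order are made up for illustration, and it adds -u to the last intersect (not in the command above) so that each H3K4me1 peak is reported once, as the original peak, even if it overlaps several H3K27Ac peaks.

```shell
# Hypothetical helper (name and argument order are illustrative, not from any tool).
# Usage: active_enhancers H3K4me1.bed H3K27Ac.bed transcripts.bed genome.sizes [flank_bp]
active_enhancers() {
    local k4me1=$1 k27ac=$2 transcripts=$3 genome=$4 flank=${5:-5000}
    local masked
    masked=$(mktemp) || return 1
    # 1) extend transcripts by $flank bp on both sides
    # 2) keep H3K4me1 peaks with no overlap to the extended transcripts (-v)
    # 3) of those, keep peaks overlapping at least one H3K27Ac peak (-u: report once)
    bedtools slop -i "$transcripts" -g "$genome" -b "$flank" > "$masked" &&
        bedtools intersect -a "$k4me1" -b "$masked" -v |
        bedtools intersect -a - -b "$k27ac" -u
    local status=$?
    rm -f "$masked"
    return "$status"
}
```

You would call it like: active_enhancers $H3K4me1 $H3K27Ac $REFSEQ $GENOME > $OUTPUT_ENHANCERS. Drop the -u if you prefer the overlapping sub-intervals rather than the whole H3K4me1 peaks.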
Your other option is to use PARE, which identifies possible enhancers based on the peak-valley-peak (PVP) pattern described in a couple of enhancer reviews. Enhancers typically show a large 'peak' of histone mark signal, followed by a low 'valley' of little or no signal, and then another 'peak' of histone mark signal. Think: /\ __ /\
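To make the peak-valley-peak idea concrete, here is a toy scan in awk. To be clear, this is NOT how PARE works internally; it is just my illustration of the pattern. It assumes a sorted, contiguous, fixed-width binned signal file (chrom, start, end, signal), and the function name and the two thresholds are invented for the example.

```shell
# Toy peak-valley-peak scan; NOT the PARE algorithm.
# Input: sorted, contiguous bedGraph-style bins (chrom, start, end, signal).
# Reports each valley (run of bins below $low) that is flanked directly
# on both sides by a bin with signal >= $high.
find_pvp() {
    local bins=$1 high=$2 low=$3
    awk -v hi="$high" -v lo="$low" 'BEGIN { OFS = "\t" }
    $1 != chrom { chrom = $1; left = 0; invalley = 0 }   # reset at chromosome boundaries
    {
        if ($4 >= hi) {
            if (invalley) print chrom, vstart, vend       # right-hand peak closes the valley
            left = 1; invalley = 0
        } else if ($4 < lo) {
            if (left && !invalley) { invalley = 1; vstart = $2 }
            if (invalley) vend = $3
        } else {
            left = 0; invalley = 0                         # intermediate signal breaks the pattern
        }
    }' "$bins"
}
```

For example, find_pvp bins.bedgraph 8 2 prints the coordinates of every valley sitting between two high-signal bins. Real data is noisier than this, which is why a dedicated tool is the better choice for actual analysis.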
You could supply your H3K4me1 + H3K27Ac files to PARE instead of two H3K4me1 replicates, because you are interested in active enhancers. If you only care about identifying enhancers (poised or active), you can use PARE as the manual suggests, without modifications.
ChromHMM is also something that you could take a look at. It's more complicated than the above, and it requires a bit of reading to know which histone marks are typically found in which regions of the genome, so that you can annotate the output. But essentially ChromHMM runs a multivariate hidden Markov model on the binarized presence or absence of each chromatin mark and learns combinatorial 'states', which can then be used to annotate a genome.
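As a rough sketch of what a ChromHMM run looks like: you describe your data in a tab-delimited cell/mark/file table, binarize it with BinarizeBed, then learn states with LearnModel. The helper names, the BED file names, and the directory layout below are all made up for illustration; check the ChromHMM manual for the options that fit your data.

```shell
# Hypothetical file names: replace with your own aligned-read BED files.
# Prints a tab-delimited table: cell type, mark, BED file.
write_cellmark_table() {
    local cell=$1
    printf '%s\t%s\t%s\n' \
        "$cell" H3K4me1 "${cell}_H3K4me1.bed" \
        "$cell" H3K27Ac "${cell}_H3K27Ac.bed" \
        "$cell" H3K4me3 "${cell}_H3K4me3.bed"
}

# Binarize the marks, then learn an $nstates-state model.
# Arguments: ChromHMM jar, chromosome sizes file, directory of BED files,
# cell/mark table, output directory, number of states, assembly name (e.g. hg19).
run_chromhmm() {
    local jar=$1 chromsizes=$2 beddir=$3 table=$4 outdir=$5 nstates=$6 assembly=$7
    java -mx4000M -jar "$jar" BinarizeBed "$chromsizes" "$beddir" "$table" "$outdir/binarized" &&
        java -mx4000M -jar "$jar" LearnModel "$outdir/binarized" "$outdir/model" "$nstates" "$assembly"
}
```

Something like write_cellmark_table MyCells > cellmarkfiletable.txt followed by run_chromhmm ChromHMM.jar hg19.txt beds cellmarkfiletable.txt out 8 hg19 would be the shape of it. The number of states is a choice you have to make and justify; there is no single right answer.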
I hope some of this is useful to you. Let me know if you have any questions.
Sinji • 3.0k