Why are most super-enhancers identified by ROSE located in promoter regions? How can I remove them?
12 months ago
Erica • 0

Hi all,

I called 90,112 peaks from H3K27ac ChIP-seq data using MACS2, and then identified 1,108 super-enhancers (SEs) using ROSE with the parameters '-s 12500 -t 2500'. However, I found that most of the SEs are located in promoter regions.

The genomic annotation of 1,108 SEs by ROSE:

[image not available]

The genomic annotation of 90,112 peaks by MACS2:

[image not available]

I am mainly interested in SEs located in distal intergenic regions, but only a small number of the SEs fall into that category. Could anyone suggest how to solve this problem? Any suggestion would be highly appreciated. Thank you in advance.

super-enhancer ROSE • 668 views
Hi Erica, were you able to find a solution? I am trying to do the same. Also, did you use ChIPseeker to find the feature distribution? I wonder which ROSE output file you used as the SE input for it. Thanks!!

5 months ago
ElCascador ▴ 20

Hi, you will first need to get the TSS information for your genome:

wget -q -O - "http://ftp.ensembl.org/pub/grch37/current/gtf/homo_sapiens/Homo_sapiens.GRCh37.87.gtf.gz" | gunzip -c | grep -v "^#" | awk '($3=="gene")' | grep protein_coding | awk '{OFS="\t"};{if($7 == "+"){start = $4} else if($7 == "-"){start = $5}};{print $1, start-1, start}' | grep -v "\." | grep -v "_" > tss.bed
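To make the strand logic in that one-liner easier to follow, here is a minimal sketch on two made-up gene records. The file names toy.gtf and toy_tss.bed, the gene IDs and the coordinates are all invented for illustration; only the awk rule mirrors the command above.

```shell
# Two hypothetical protein-coding gene records in GTF layout (tab-separated).
printf '1\ttoy\tgene\t1000\t5000\t.\t+\t.\tgene_id "geneA"; gene_biotype "protein_coding";\n'  > toy.gtf
printf '1\ttoy\tgene\t8000\t9000\t.\t-\t.\tgene_id "geneB"; gene_biotype "protein_coding";\n' >> toy.gtf

# Plus strand: the TSS is the gene start (column 4).
# Minus strand: the TSS is the gene end (column 5).
# BED is 0-based and half-open, hence the 1 bp interval tss-1 .. tss.
awk 'BEGIN{OFS="\t"} $3=="gene" {tss = ($7=="+") ? $4 : $5; print $1, tss-1, tss}' toy.gtf > toy_tss.bed

cat toy_tss.bed
```

The plus-strand gene gets its TSS at position 1000 and the minus-strand gene at position 9000, each written as a 1 bp BED interval.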

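With this, you can use awk to extend each TSS as far as you want (here 2,000 bp each way; the start is clamped at 0 so it never goes negative):

awk 'BEGIN{OFS="\t"} {s=$2-2000; if(s<0) s=0; print $1,s,$3+2000}' tss.bed > tss_2kb.bed

Then use bedtools intersect with -v to keep only the regions that do not overlap any extended TSS. Note that Ensembl names chromosomes without the "chr" prefix (1, 2, X), so make sure the names match those in your peak file:

bedtools intersect -a yourpeakfile -b tss_2kb.bed -v > filtered_tss.bed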

You can then check again with ChIPseeker whether this worked.
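As a quick sanity check that needs neither ChIPseeker nor bedtools, the exclusion step can be approximated in plain awk on a tiny example. This is only a sketch: the toy file names and coordinates are invented, and the quadratic loop is meant for illustrating the overlap test, not for genome-scale files.

```shell
# Hypothetical toy inputs: two TSS windows and three peaks (BED, 0-based, half-open).
printf '1\t0\t3000\n1\t7000\t11000\n' > toy_tss_2kb.bed
printf '1\t500\t900\n1\t4000\t5000\n1\t10500\t12000\n' > toy_peaks.bed

# Load the TSS windows from the first file (NR==FNR), then print only
# the peaks from the second file that overlap none of them.
awk 'BEGIN{OFS="\t"}
     NR==FNR {chr[NR]=$1; s[NR]=$2; e[NR]=$3; n=NR; next}
     {hit=0
      for(i=1;i<=n;i++) if($1==chr[i] && $2<e[i] && $3>s[i]) {hit=1; break}
      if(!hit) print}' toy_tss_2kb.bed toy_peaks.bed > toy_filtered.bed

# Report how many peaks were kept out of the total.
echo "kept $(wc -l < toy_filtered.bed) of $(wc -l < toy_peaks.bed) peaks"
```

Only the peak at 4000-5000 survives here, since the other two fall inside a TSS window. Comparing line counts of yourpeakfile and filtered_tss.bed in the same way shows how many regions the real filter removed.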

