How to split a BED file according to a promoter window (-30 to +300 bp)
5.2 years ago
Lila M ★ 1.1k

Hi everybody,

I have a BED file of interest that I want to compare with a bigWig file to get the coverage. To do that, I am using deepTools:

 multiBigwigSummary BED-file -b Norm -out promoter_coverage.zip --BED file.BED --outRawCounts coverage


I want to calculate the coverage over the promoter region (-30 bp to +300 bp around the TSS) and I have been thinking about the best way to do that. My first idea was to edit the BED file as follows:

chr   start  end         chr   start   end (= start +300)
chr1  14362  29370  ---> chr1  14362  14662


And for the gene body (if I want a window from 300 bp to the end):

chr   start  end         chr   start (+300)  end
chr1  14362  29370  ---> chr1     14662    29370


Does this approach make sense? Or does anybody know a better one?

Thank you!

ChIP-Seq bed deepTools coverage edit • 1.4k views

Please take a look at this thread, specifically Alex Reynolds' comment. It looks like you're attempting to calculate promoter pausing indexes, or something similar.

You would need to generate a coverage profile for promoters and then for gene bodies and then perform the appropriate calculations. Generating the coverage profiles can be done using the bedops program using Alex's approach.


I would appreciate it if you could expand a bit on your answer and also respond to my question. Maybe both approaches are correct, but bedops is totally new to me and I would like to know whether my approach is correct.

Thank you


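In general, yes, your approach works. For genes on the + strand, you would subtract 30 from the start coordinate to get the new column 2, and add 300 to the start coordinate to get the new column 3. For genes on the - strand you would do the opposite, working from the end coordinate: subtract 300 to get column 2 and add 30 to get column 3.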

Something like:

awk -v OFS='\t' '{if ($6 == "+") print $1, $2-30, $2+300, $4, $5, $6; else print $1, $3-300, $3+30, $4, $5, $6}' INFILE > OUTFILE
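
As a quick sanity check, here is a minimal sketch of the promoter window wrapped in a shell function and run on two made-up records (the first borrows the coordinates from the question; the gene names and the second record are invented). It assumes a tab-separated 6-column BED with the strand in column 6, takes the TSS as column 2 on the + strand and column 3 on the - strand, clamps negative starts to 0, and skips records without a + or - strand. The function name `promoter_window` is hypothetical.

```shell
# promoter_window: print a -30/+300 bp window around the TSS of each record.
# Assumes a tab-separated 6-column BED (chr, start, end, name, score, strand).
promoter_window() {
  awk -F'\t' -v OFS='\t' '
    $6 == "+" { s = $2 - 30;  e = $2 + 300 }   # TSS = start on the + strand
    $6 == "-" { s = $3 - 300; e = $3 + 30 }    # TSS = end on the - strand
    $6 == "+" || $6 == "-" {
      if (s < 0) s = 0                          # BED starts cannot be negative
      print $1, s, e, $4, $5, $6
    }
  ' "$@"
}

# Made-up input: one gene per strand.
printf 'chr1\t14362\t29370\tgeneA\t0\t+\nchr1\t34554\t36081\tgeneB\t0\t-\n' | promoter_window
```

With a real file you would call it as `promoter_window INFILE > OUTFILE`. Windows that run past the end of a chromosome are not handled here, so check them against your chromosome sizes if that matters for your data.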

For the gene body, on the + strand you would simply add 300 bp to the start coordinate in column 2 and keep column 3 the way it is. For genes on the - strand you do the opposite: keep column 2 and subtract 300 bp from the end coordinate in column 3.
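
The gene-body case could be sketched the same way. Again this is only a sketch under assumptions: a tab-separated 6-column BED with the strand in column 6, a hypothetical function name `gene_body`, and made-up test records. Genes of 300 bp or less would end up with an empty or inverted interval, so this version drops them; records without a + or - strand are skipped as well.

```shell
# gene_body: trim the first 300 bp (relative to the TSS) from each record.
# Assumes a tab-separated 6-column BED (chr, start, end, name, score, strand).
gene_body() {
  awk -F'\t' -v OFS='\t' '
    { s = $2; e = $3 }
    $6 == "+" { s = $2 + 300 }   # TSS is the start on +, so move column 2
    $6 == "-" { e = $3 - 300 }   # TSS is the end on -, so move column 3
    # Keep stranded records only, and drop genes of 300 bp or less.
    ($6 == "+" || $6 == "-") && s < e { print $1, s, e, $4, $5, $6 }
  ' "$@"
}

# Made-up input: one gene per strand plus one gene shorter than 300 bp.
printf 'chr1\t14362\t29370\tgeneA\t0\t+\nchr1\t34554\t36081\tgeneB\t0\t-\nchr1\t100\t250\tshort\t0\t+\n' | gene_body
```

Whether to drop short genes or keep them in some other form is a design choice; adjust the last condition if you would rather handle them differently.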

So yes. Your approach works. Just make sure that you're subtracting and adding bp in the appropriate direction.


Thank you very much :) Anyway, I'm going to have a look at bedops!!