I have already obtained a narrowPeak file for my mouse sample using MACS2. Next, I want to find the peaks in my sample (per chromosome) that overlap TSSs in the mouse reference genome. In other words, I want to identify which peaks lie in mouse promoter regions. I have split both my experimental file and the mouse genome into separate per-chromosome files using a Perl script. Do you have any recommendations on how to proceed, again using Perl, to identify peaks in my sample that overlap TSSs in the reference genome?
You could get TSS positions from gene annotations specific to your reference genome build (for example, a GTF file that matches the assembly you aligned to).
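As a minimal sketch of that step, assuming a standard tab-delimited GTF with `gene` features and a `gene_id` attribute: the helper below writes one 1-bp BED element per gene TSS, taking the start coordinate on the + strand and the end coordinate on the - strand. The function name `gtf_to_tss` and the file name `annotation.gtf` are placeholders, not part of any toolkit.

```shell
# Hypothetical helper: read a GTF on stdin, write one 1-bp BED line per gene TSS.
# Assumes GTF columns 1=chrom, 3=feature, 4=start, 5=end (1-based, inclusive),
# 7=strand, 9=attributes containing gene_id "..."
gtf_to_tss() {
  awk 'BEGIN { FS = OFS = "\t" }
       $0 !~ /^#/ && $3 == "gene" {
         # TSS is the start on the + strand and the end on the - strand
         tss = ($7 == "-") ? $5 : $4
         # Pull the gene_id value out of the attribute column; "." if absent
         id = "."
         if (match($9, /gene_id "[^"]+"/))
           id = substr($9, RSTART + 9, RLENGTH - 10)
         # BED is 0-based, half-open: 1-based position p becomes [p-1, p)
         print $1, tss - 1, tss, id, ".", $7
       }'
}

# Example usage (annotation.gtf is a placeholder; sort-bed ships with BEDOPS):
#   gtf_to_tss < annotation.gtf | sort-bed - > tss.bed
```

Sorting the result matters because bedops and bedmap expect sorted BED input. If you want promoter windows rather than single-base TSSs, you would pad these elements upstream and downstream by whatever distance you define as a promoter.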
Once you have that, you can find set intersections via something like:
$ bedops --element-of 1 peaks.bed tss.bed > peaks_that_overlap_tss.bed
If you'd rather see specifically which peaks overlap which TSS:
$ bedmap --echo --echo-map tss.bed peaks.bed > tss_with_associated_peaks.bed
The difference between bedops and bedmap, very generally, is that bedops does set and interval operations (intersect, difference, etc.), while bedmap does associations between genomic elements in sets ("mapping").
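To make that distinction concrete, here is a rough awk sketch of the two ideas. It is not how BEDOPS implements them, and the function names are made up for illustration. It assumes tab-delimited BED files, loads one file fully into memory, and compares every pair of elements, so it is only reasonable for small inputs such as per-chromosome files.

```shell
# Hypothetical helpers, for illustration only. Two elements overlap when they
# share a chromosome and at least 1 bp in 0-based, half-open BED coordinates.

# Set-style: print each peak that overlaps at least one TSS.
# usage: peaks_overlapping_tss peaks.bed tss.bed
peaks_overlapping_tss() {
  awk 'BEGIN { FS = OFS = "\t" }
       # First file on the awk command line (the TSS file): store intervals
       NR == FNR { n++; c[n] = $1; s[n] = $2 + 0; e[n] = $3 + 0; next }
       # Second file (peaks): print a peak once, at its first overlapping TSS
       { for (i = 1; i <= n; i++)
           if (c[i] == $1 && s[i] < $3 + 0 && $2 + 0 < e[i]) { print; break } }
      ' "$2" "$1"
}

# Map-style: print every TSS, then "|", then its overlapping peaks joined by ";".
# usage: tss_with_peaks tss.bed peaks.bed
tss_with_peaks() {
  awk 'BEGIN { FS = OFS = "\t" }
       # First file on the awk command line (the peak file): store whole lines
       NR == FNR { n++; line[n] = $0; c[n] = $1; s[n] = $2 + 0; e[n] = $3 + 0; next }
       # Second file (TSS): collect all overlapping peaks for this TSS
       { hits = ""
         for (i = 1; i <= n; i++)
           if (c[i] == $1 && s[i] < $3 + 0 && $2 + 0 < e[i])
             hits = hits (hits == "" ? "" : ";") line[i]
         print $0 "|" hits }
      ' "$2" "$1"
}
```

The first function answers "which of my peaks hit a TSS?" and returns a subset of the peaks. The second answers "for each TSS, which peaks hit it?" and returns every TSS with its associated peaks attached, including TSSs with no peaks at all.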