Minimum overlap when comparing BED files
3.1 years ago

Hello everybody!

I have a collection of genomic coordinates (a BED file) generated in an ATAC-Seq experiment. These genomic locations are annotated as distal intergenic. I have also collected known enhancer regions from publications and databases into a single BED file. What I would like to do is find the overlap between my two BED files.

For that, I am using bedtools intersect. One parameter that I am not quite sure how to set is the -f option, the minimum overlap required as a fraction of A (the first input file). This sets how much overlap is needed before two genomic regions from my two input files are considered to overlap. My question is: what is the minimum overlap needed to confidently say there is an overlap? Any thoughts or recommendations?

Thanks!
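For readers following along, here is a minimal Python sketch of the quantity that -f thresholds: the number of overlapping bases divided by the length of the A interval. The function names and the coordinates are made up for illustration, and this is not how bedtools is implemented internally.

```python
def overlap_bp(a_start, a_end, b_start, b_end):
    """Number of bases shared by two half-open BED intervals (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def fraction_of_a(a, b):
    """Overlap length divided by the length of A, i.e. the value -f is compared to."""
    a_start, a_end = a
    b_start, b_end = b
    return overlap_bp(a_start, a_end, b_start, b_end) / (a_end - a_start)


# Hypothetical coordinates, chosen only for illustration.
peak = (1000, 1500)      # ATAC-Seq peak (A), 500 bp long
enhancer = (1400, 2400)  # known enhancer (B), 1000 bp long

frac = fraction_of_a(peak, enhancer)
print(frac)  # 0.2, so this pair passes -f 0.1 but not -f 0.5
```

Note that the fraction is relative to A only: a short peak sitting entirely inside a long enhancer has a fraction of 1.0, however long the enhancer is.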

bedtools • 669 views

Hi,

From what I understood, you would like to annotate the BED file resulting from the ATAC-Seq experiment with known enhancer regions from another BED file. In annotation tools for structural variants, a percentage of overlap or a reciprocal overlap (usually ~70%) is commonly used. But I think the choice depends more on the interpretation or the analysis you are going to do with the results.
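To make the "reciprocal overlap" idea concrete, here is a small Python sketch: the overlap has to cover at least the chosen fraction of both intervals, not just of A. In bedtools intersect this corresponds to adding -r alongside -f. The function name and the coordinates below are hypothetical.

```python
def reciprocal_overlap(a, b, min_frac):
    """True if the overlap covers at least min_frac of A and at least min_frac of B."""
    overlap = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    len_a = a[1] - a[0]
    len_b = b[1] - b[0]
    return overlap >= min_frac * len_a and overlap >= min_frac * len_b


# Hypothetical intervals, for illustration only.
peak = (1000, 1500)             # 500 bp
similar_enhancer = (1100, 1600)  # 500 bp, shares 400 bp with the peak
broad_enhancer = (900, 3000)     # 2100 bp, contains the whole peak

print(reciprocal_overlap(peak, similar_enhancer, 0.7))  # True: 80% of both
print(reciprocal_overlap(peak, broad_enhancer, 0.7))    # False: 100% of A, ~24% of B
```

The second case shows why the reciprocal criterion is stricter: a peak fully contained in a much broader enhancer region is rejected, which may or may not be what you want for enhancer annotation.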


Hi,

Thanks for the suggestion! Yes, in a way what I want to do is say whether my identified genomic regions correspond to reported enhancers. However, the enhancer collection I could find is derived from many different cell types, so I am not sure whether 70% would be too high a threshold. Thanks again, I really did not have much of an idea!


There is no gold-standard answer to these "cutoff" questions. In my experience with these kinds of things, it should not matter too much what you use. I can hardly imagine that the cutoff has the power to change the big-picture outcome. If you feel that e.g. 50% is a good cutoff, then go ahead. With the default cutoff (any overlap of at least 1 bp) you get a few more hits, but again, I doubt the big picture changes. The only important thing is not to spend too much time on these little details, but rather to decide on a cutoff and move forward. If things do not work at all and nothing downstream makes sense, you can always go back and change the parameters.
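If you want to check quickly how much the cutoff matters for your own data, one option is to count hits at a few cutoffs and compare. The sketch below does this in plain Python on toy intervals; the coordinates, the cutoff values and the count_hits helper are all assumptions for illustration, and on real files you would more likely rerun bedtools intersect with different -f values.

```python
def count_hits(peaks, enhancers, min_frac):
    """Count peaks (A) with at least one enhancer (B) covering >= min_frac of the peak."""
    hits = 0
    for p_start, p_end in peaks:
        length = p_end - p_start
        for e_start, e_end in enhancers:
            overlap = max(0, min(p_end, e_end) - max(p_start, e_start))
            if overlap > 0 and overlap / length >= min_frac:
                hits += 1
                break  # count each peak once, however many enhancers it hits
    return hits


# Toy data on a single chromosome (hypothetical coordinates).
peaks = [(1000, 1500), (5000, 5400), (9000, 9200)]
enhancers = [(1400, 2400), (5100, 5300), (8000, 10000)]

# 1e-9 mimics the bedtools default of "any overlap of at least 1 bp".
counts = {cutoff: count_hits(peaks, enhancers, cutoff)
          for cutoff in (1e-9, 0.25, 0.5, 0.7)}
print(counts)  # {1e-09: 3, 0.25: 2, 0.5: 2, 0.7: 1}
```

If the counts on your real data barely move across reasonable cutoffs, that supports simply picking one and moving on; if they swing a lot, the cutoff deserves a closer look.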
