Question: Find consensus overlap between many bed files
goodez (United States) wrote 19 months ago:

I usually use bedtools intersect to find overlapping regions between BED files, but it seems to report overlaps only between a pair of files. I need something that takes many BED files and reports only the regions contained in all of them.

Example input, one interval from each of 4 separate BED files:

chr1    50    100
chr1    60    120
chr1    30    90
chr1    50    90

Desired output:

chr1    60    90

Any tools for this? Maybe I should just cat all the bed files together and merge them?

Tags: overlap, intersect, bedtools, genome
venu (Germany) wrote 19 months ago:

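"Maybe I should just cat all the bed files together and merge them?"

You can do that, but cat followed by merge reports the union of all intervals (chr1 30 120 for your example) rather than the region shared by every file. For the shared region there is also multiIntersectBed.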


Thanks! My servers at work had multiIntersectBed already installed, which I guess is also known as bedtools multiinter as genomax suggested. This is just the functionality I was looking for, nothing fancy, just common regions between many bed files.

(reply by goodez, 19 months ago)
genomax (United States) wrote 19 months ago:

See also the related thread: Intersect multiple BED files (bedops option)

There is also bedtools multiinter:

Tool:    bedtools multiinter (aka multiIntersectBed)
Version: v2.26.0
Summary: Identifies common intervals among multiple
         BED/GFF/VCF files.

Usage:   bedtools multiinter [OPTIONS] -i FILE1 FILE2 .. FILEn
         Requires that each interval file is sorted by chrom/start.

Thanks to you and venu for pointing me in the right direction. +1 for both

(reply by goodez, 19 months ago)
Alex Reynolds (Seattle, WA USA) wrote 19 months ago:

To intersect intervals from 1 to N files, simply use bedops --intersect:

$ bedops --intersect A.bed B.bed ... N.bed > answer.bed

More details from the documentation: http://bedops.readthedocs.io/en/latest/content/reference/set-operations/bedops.html#intersect-i-intersect

This requires sorted BED files. You can use BEDOPS sort-bed to do this quickly.

I think the other toolkit suggestions now require sorted BED files, as well.

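As a rough illustration of the sorting requirement: the tool suggested above is BEDOPS sort-bed, but where it is not installed, a plain coreutils sort should give a comparable ordering for tab-delimited BED files. This is a sketch under that assumption, not the BEDOPS implementation; the helper name sort_bed_plain and the file names are placeholders.

```shell
# Sort a BED file by chromosome (byte order), then by numeric start and end.
# Coreutils stand-in for `sort-bed in.bed`; assumes tab-delimited input
# with no header or track lines.
sort_bed_plain() {
    LC_ALL=C sort -k1,1 -k2,2n -k3,3n "$1"
}

# Hypothetical usage (placeholder file names):
#   sort_bed_plain A.bed > A.sorted.bed
#   sort_bed_plain B.bed > B.sorted.bed
#   bedops --intersect A.sorted.bed B.sorted.bed > answer.bed
```

If BEDOPS is available, sort-bed itself is the safer choice, since its output order is what bedops expects.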

I was able to find all overlapping regions of my BEDs using bedtools multiinter, although I had to filter the output to only keep regions found in all BEDs.

I'll try out bedops --intersect too since the output might be easier to keep in my pipeline. To sort the BEDs, I found bedtools sort worked great.

(reply by goodez, 19 months ago)
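
The filtering step described in this reply can be sketched as a single awk pipe. This assumes the default bedtools multiinter output, where column 4 holds the number of input files covering each sub-interval; the helper name keep_shared and the file names are placeholders, not part of bedtools.

```shell
# Keep only multiinter sub-intervals covered by all N inputs (column 4 == N)
# and print them as 3-column BED. Assumed column layout: chrom, start, end,
# count, list of files, then one 0/1 column per input file.
keep_shared() {
    awk -v n="$1" 'BEGIN { OFS = "\t" } $4 == n { print $1, $2, $3 }'
}

# Hypothetical usage with four sorted BED files (placeholder names):
#   bedtools multiinter -i a.bed b.bed c.bed d.bed | keep_shared 4 > shared.bed
```

For the four example intervals in the question, only the chr1 60 90 piece is covered by all four files, so it is the only row that passes the filter. If the inputs contain book-ended intervals, adjacent pieces may come out split; piping the result through bedtools merge should join them.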

bedops --intersect will give you the answer you want without additional steps.

(reply by Alex Reynolds, 19 months ago)

Thanks. With bedtools it was just one pipe to awk, but I'm going to switch to bedops. +1 for every answer lol, they're all good.

(reply by goodez, 19 months ago)

I agree. I used bedops --intersect; it gives the same result as bedtools multiinter, and you don't even need the filtering step.

(reply by rohitsatyam102, 27 days ago)