Question: Find consensus overlap between many bed files
goodez wrote:

I usually use bedtools intersect to find overlapping regions between BED files, but it seems this tool can only report the overlap between a pair of files. I need something that takes many BED files and reports only the regions contained in all of them.

Example input from 4 separate bed files:

chr1    50    100
chr1    60    120
chr1    30    90
chr1    50    90

Desired output:

chr1    60    90

Any tools for this? Maybe I should just cat all the bed files together and merge them?

venu wrote:

"Maybe I should just cat all the bed files together and merge them?"

You can do that. But there is also multiIntersectBed.
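One caveat on the cat-and-merge idea: merging the concatenated files reports the union of the intervals (chr1 30 120 for the example in the question), not the part shared by every file. As a rough sketch of what "contained in all files" means, the following uses only sort and awk. The function name consensus is made up for illustration, and the sketch assumes that intervals within any single file do not overlap each other.

```shell
# Hypothetical helper (not part of bedtools or BEDOPS):
#   consensus FILE1 FILE2 ... FILEn
# prints the regions covered by all n files.
# Assumption: intervals inside one file never overlap each other, so a
# coverage depth of n can only be reached when every file contributes.
consensus() {
  cat "$@" |
    # Turn each interval into a +1 event at its start and a -1 event at its end.
    awk 'BEGIN { OFS = "\t" } { print $1, $2, 1; print $1, $3, -1 }' |
    # Order events by chromosome, then position; at equal positions the -1
    # events come first because BED intervals are half-open.
    LC_ALL=C sort -k1,1 -k2,2n -k3,3n |
    awk -v n="$#" 'BEGIN { OFS = "\t" }
      $1 != chrom { chrom = $1; depth = 0 }
      {
        prev = depth
        depth += $3
        # Depth just reached n: a shared region opens here.
        if (prev < n && depth == n) start = $2
        # Depth just dropped below n: the shared region closes here.
        if (prev == n && depth < n && $2 + 0 > start + 0) print chrom, start, $2
      }'
}

# With the four one-line files from the question this prints: chr1  60  90
#   consensus a.bed b.bed c.bed d.bed
```

This is only meant to make the definition concrete; the dedicated tools suggested in this thread are the better choice for real data.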


Thanks! My servers at work had multiIntersectBed already installed, which I guess is also known as bedtools multiinter as genomax suggested. This is just the functionality I was looking for, nothing fancy, just common regions between many bed files.

(reply written 3 months ago by goodez)
genomax wrote:

Intersect multiple BED files (bedops option)

There is also bedtools multiinter:

Tool:    bedtools multiinter (aka multiIntersectBed)
Version: v2.26.0
Summary: Identifies common intervals among multiple
         BED/GFF/VCF files.

Usage:   bedtools multiinter [OPTIONS] -i FILE1 FILE2 .. FILEn
         Requires that each interval file is sorted by chrom/start.

Thanks to you and venu for pointing me in the right direction. +1 for both

(reply written 3 months ago by goodez)
Alex Reynolds wrote:

To intersect intervals from 1 to N files, simply use bedops --intersect:

$ bedops --intersect A.bed B.bed ... N.bed > answer.bed

More details from the documentation: http://bedops.readthedocs.io/en/latest/content/reference/set-operations/bedops.html#intersect-i-intersect

This requires sorted BED files. You can use BEDOPS sort-bed to do this quickly.

I think the other toolkit suggestions now require sorted BED files, as well.
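If neither sort-bed nor bedtools sort is at hand, a plain sort in the C locale gives an ordering by chromosome, then start, then end. This is a minimal sketch, not the toolkits' own method: the function name sort_bed_plain is invented here, and the claim that this ordering is acceptable to bedops and bedtools multiinter is an assumption worth checking against each tool's documentation.

```shell
# Hypothetical helper (not part of bedtools or BEDOPS).
# Sort a BED file by chromosome name (byte order), then numerically by
# start and end coordinate. LC_ALL=C keeps the chromosome order
# independent of the user's locale (chr1 < chr10 < chr2).
sort_bed_plain() {
  LC_ALL=C sort -k1,1 -k2,2n -k3,3n "$1"
}

# Assumed usage (file names are placeholders):
#   sort_bed_plain A.unsorted.bed > A.bed
#   bedops --intersect A.bed B.bed > answer.bed
```

sort-bed is likely faster on large inputs, so this fallback is mainly useful on machines where the toolkits' own sorters are missing.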


I was able to find all overlapping regions of my BEDs using bedtools multiinter, although I had to filter the output to only keep regions found in all BEDs.

I'll try out bedops --intersect too since the output might be easier to keep in my pipeline. To sort the BEDs, I found bedtools sort worked great.
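The filtering step mentioned above can be sketched as a single awk filter. It relies on the multiinter output layout, where the fourth column holds the number of input files overlapping each reported interval. The helper name keep_shared and the file names are placeholders, not something from the original pipeline.

```shell
# Hypothetical helper: read bedtools multiinter output on stdin and keep
# only the rows whose 4th column (number of files overlapping the
# interval) equals the total number of input files, passed as $1.
# Only the chrom/start/end columns are printed.
keep_shared() {
  awk -v n="$1" 'BEGIN { OFS = "\t" } $4 == n { print $1, $2, $3 }'
}

# Assumed usage with four sorted inputs (file names are placeholders):
#   bedtools multiinter -i a.bed b.bed c.bed d.bed | keep_shared 4 > shared.bed
```

For the four example intervals in the question, the only row with a count of 4 is chr1 60 90, which is the desired output.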

(reply written 3 months ago by goodez)

bedops --intersect will give you the answer you want without additional steps.

(reply written 3 months ago by Alex Reynolds)

Thanks, it was just one pipe to awk using bedtools, but I'm going to switch to bedops. +1 for every answer lol, they're all good.

(reply written 3 months ago by goodez)