Using multiIntersectBed to filter original bed files
3.2 years ago

I am trying to use multiIntersectBed to find the overlap between multiple bed files and THEN use that overlap to filter the original bed files, for example to keep only rows that fall within overlaps found in 2 or more libraries. In the example below, it is easy enough to filter the output to keep only rows where column 4 (num) >= 2. How would I then remove every row from a.bed that does not fall within these overlapping intervals?

$ multiIntersectBed -header -i a.bed b.bed c.bed

chrom  start  end  num  list   a.bed  b.bed  c.bed
chr1   6      8    1    1      1      0      0
chr1   8      12   2    1,3    1      0      1
chr1   12     15   3    1,2,3  1      1      1
chr1   15     20   2    1,2    1      1      0
chr1   20     22   1    2      0      1      0
chr1   22     30   2    1,2    1      1      0
chr1   30     32   1    2      0      1      0
chr1   32     34   1    3      0      0      1
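
For reference, the "easy enough" filtering step might look like the sketch below. It rebuilds the output shown above as a tab-separated file so it runs without bedtools; the file names `multi.txt` and `shared.bed` and the variable `MIN_FILES` are made up for illustration. One thing to watch: with `-header`, a plain `$4 >= 2` test in awk lets the header line through, because `num` is compared as a string.

```shell
# Sample multiIntersectBed output from the question, header included, tab-separated.
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  chrom start end num list a.bed b.bed c.bed \
  chr1 6 8 1 1 1 0 0 \
  chr1 8 12 2 1,3 1 0 1 \
  chr1 12 15 3 1,2,3 1 1 1 \
  chr1 15 20 2 1,2 1 1 0 \
  chr1 20 22 1 2 0 1 0 \
  chr1 22 30 2 1,2 1 1 0 \
  chr1 30 32 1 2 0 1 0 \
  chr1 32 34 1 3 0 0 1 > multi.txt

MIN_FILES=2
# "$4 + 0" forces a numeric comparison, so the header (num -> 0) is dropped.
awk -F'\t' -v min="${MIN_FILES}" '$4 + 0 >= min' multi.txt | cut -f1-3 > shared.bed
cat shared.bed

# With bedtools installed, a.bed could then be filtered against these
# intervals, for example (not run here):
#   intersectBed -u -a a.bed -b shared.bed > a.filter.bed
```

Note that `-u` keeps every a.bed row that overlaps a shared segment at all, which is looser than requiring the whole row to fall inside the shared region.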
bedtools multiIntersectBed • 1.6k views

I hope you realize that multiIntersectBed gives you the exact nucleotides that overlap between samples, not the ranges as a whole. In this situation here:

A) -------------------
B)      --------------------
C)                   -----------

it would give you a value of 3 (present in all ranges) only for a single nucleotide (the first one of C, which is the only position present in all three). It would not return the intervals of A, B, and C. Just making sure you know that, because I did not realize how multiIntersectBed works for quite a while. Is this really what you want?
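
To make the point concrete, here is a toy sketch of the per-segment counting using only awk and sort. It is not how bedtools is implemented, just a small sweep over start/end events. The coordinates are hypothetical, chosen to mirror the diagram above (A = 0-19, B = 5-25, C = 18-29), and the sketch assumes a single chromosome.

```shell
# Hypothetical coordinates mirroring the diagram: A, B and C share only base 18-19.
printf '%s\t%s\t%s\t%s\n' \
  chr1 0 19 A \
  chr1 5 25 B \
  chr1 18 29 C > intervals.bed

# Turn each interval into a +1 event at its start and a -1 event at its end,
# sort the events by position, then print every covered segment with its depth.
awk -F'\t' '{ print $2 "\t1"; print $3 "\t-1" }' intervals.bed \
  | sort -k1,1n -k2,2n \
  | awk -F'\t' '
      NR > 1 && $1 + 0 > prev && depth > 0 { print "chr1\t" prev "\t" $1 "\t" depth }
      { depth += $2; prev = $1 + 0 }
    ' > segments.bed
cat segments.bed
```

Only the segment 18-19 comes out with a depth of 3. The original A, B and C intervals never appear in the output as whole ranges.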

3.2 years ago

Here's a general solution for filtering a.bed against n files with another toolkit, BEDOPS (inputs must be sorted, e.g. with sort-bed):

$ OVERLAP_THRESHOLD=2
$ bedops --everything a.bed b.bed c.bed ... n.bed \
    | bedmap --count --echo --delim '\t' - \
    | uniq \
    | awk -v ot=${OVERLAP_THRESHOLD} '$1 >= ot' \
    | cut -f2- \
    | bedops -e 1 a.bed - \
    > a.filter.bed
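
To see what the middle of that pipeline does without installing anything, the sketch below runs the `uniq`, `awk` and `cut` stages on hand-written input. The rows stand in for `bedmap --count --echo --delim '\t'` output and are hypothetical, as are the file names `counted.txt` and `overlapped.bed`. Column 1 is the number of elements overlapping each element, the element itself included, so a count of 1 means it overlaps nothing else.

```shell
# Hypothetical stand-in for `bedmap --count --echo --delim '\t'` output:
# column 1 is the overlap count, the remaining columns are the echoed element.
printf '%s\t%s\t%s\t%s\n' \
  3 chr1 6 20 \
  3 chr1 8 15 \
  4 chr1 12 32 \
  2 chr1 22 30 \
  1 chr1 32 34 \
  1 chr1 40 50 \
  2 chr1 60 70 \
  2 chr1 60 70 > counted.txt

OVERLAP_THRESHOLD=2
# Collapse identical adjacent lines, keep elements meeting the threshold,
# then strip the count column to get plain BED intervals back.
uniq counted.txt \
  | awk -v ot="${OVERLAP_THRESHOLD}" '$1 >= ot' \
  | cut -f2- > overlapped.bed
cat overlapped.bed
```

The count is per overlapping element rather than per input file, so if a single file contains elements that overlap each other, they contribute to the count too.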