Using bedtools to find the intersection of BEDPE and BED
2
5
5.5 years ago

Hi, I have a BEDPE file (describing loops across the genome; the distance between anchors may be quite long) and a BED file.

The data look like this:

BEDPE (the first 6 columns describe the loop: chromosome1, start1, end1, chromosome2, start2, end2; the remaining columns are attributes that are useful to keep):

chr1    1050000 1060000 chr1    1180000 1190000 0,255,255       241     107.673 11     8.802 143.514 120.144 1.09607073802e-16       9.5834568345e-17        2.5647576134     8e-07       1.6487531336e-16        2       1060000 1180000 7071.06781187


BED:

chr1 10000 10271 CTCF 1000 . 10000 10271 10,190,254

I want to find the overlaps between the anchor regions in the BEDPE file and the features in the BED file. How can I use bedtools to do this?

BTW, is there a way to properly sort the BEDPE file? I tried sorting with the command "sort -k1,1 -k2,2n infile", as recommended by bedtools. Is that suitable for a BEDPE file? Or should I use "sort -k1,1 -k2,2 -k3,3 -k4,4 -k5,5 -k6,6 infile"?

genome • 4.1k views
4
5.5 years ago

$ intersectBed -a bedpe -b hg19_cytoband.bed -wao

chr1 1050000 1060000 chr1 1180000 1190000 0,255,255 241 107.673 11 8.802 143.514 120.144 1.09607073802e-16 9.5834568345e-17 2.5647576134 8e-07 1.6487531336e-16 2 1060000 1180000 7071.06781187 chr1 0 2300000 p36.33 gneg 10000

Where

chr1 0 2300000 p36.33 gneg 10000

is the feature in the -b file. The last column (10000) is the number of base pairs that overlap the feature in -a.

http://bedtools.readthedocs.io/en/latest/content/tools/intersect.html

To sort a BED file, use the sortBed command in bedtools.

$ sortBed -i bedpe
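On the plain `sort` question: `sort -k1,1 -k2,2n` orders by chromosome and then numerically by the first anchor's start. The longer command in the question has no `n` on the coordinate keys, so coordinates would be compared as text (1180000 would sort before 200000). Here is a minimal sketch, assuming a tab-delimited BEDPE, that sorts on the first anchor and breaks ties on the second. The file names `loops.bedpe` and `loops.sorted.bedpe` and the two records are made up for illustration.

```shell
# Hypothetical two-line BEDPE, deliberately out of order.
printf 'chr1\t1180000\t1190000\tchr1\t1300000\t1310000\n'  > loops.bedpe
printf 'chr1\t200000\t210000\tchr1\t500000\t510000\n'     >> loops.bedpe

# -t sets tab as the field separator; keys ending in "n" compare numerically.
# LC_ALL=C keeps the chromosome-name ordering byte-wise and reproducible.
LC_ALL=C sort -t "$(printf '\t')" \
    -k1,1 -k2,2n -k3,3n -k4,4 -k5,5n -k6,6n \
    loops.bedpe > loops.sorted.bedpe
```

Whether the extra keys matter depends on the downstream tool. If the file is only read as BED on its first three columns, as discussed in this thread, the first two keys are the ones that count.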

0

Yep, I got similar results. I used this command:

bedtools intersect -wa -wb -a bedpe -b bed -sorted

But it seems that the overlap is only computed between the first 3 columns of the -a file and the -b file. Is there a way to also find the overlaps between columns 4-6 of the -a file and the first 3 columns of the -b file at the same time?

1

With BED input, overlaps are always computed on the first 3 columns of the tab-delimited file: chromosome, start, and end coordinates. The remaining columns you see in the output are just the data entries carried over from the corresponding input files, reported according to options such as -wa, -wb, and -wao. If you want to work on other columns, you have to construct a new BED file from the columns you want and use that for your downstream operations.

1

Break up your BEDPE file using cut. I would also annotate each line in the BEDPE so you can match the two positions back to each other afterwards. If your BEDPE file is

chr1 100 200 chr1 500 600 ...

I would break it up into one file containing

chr1 100 200 POS1

and another file containing

chr1 500 600 POS1

Then run intersectBed on each BED file you generated.
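A rough sketch of that split, using awk rather than cut so that a shared ID can be appended in the same pass. The file names (`loops.bedpe`, `ctcf.bed`, `anchor1.bed`, `anchor2.bed`), the `loop<N>` IDs, and the single CTCF peak are all made up for illustration. The bedtools step only runs if bedtools is on the PATH.

```shell
# Hypothetical inputs: one loop and one peak placed inside its first anchor.
printf 'chr1\t1050000\t1060000\tchr1\t1180000\t1190000\n' > loops.bedpe
printf 'chr1\t1055000\t1055271\tCTCF\n' > ctcf.bed

# Write each anchor to its own BED file, tagged with the same per-line ID
# (loop1, loop2, ...) so the two ends can be joined again later.
awk 'BEGIN { FS = OFS = "\t" } { print $1, $2, $3, "loop" NR }' loops.bedpe > anchor1.bed
awk 'BEGIN { FS = OFS = "\t" } { print $4, $5, $6, "loop" NR }' loops.bedpe > anchor2.bed

# Intersect each anchor file with the BED features separately.
if command -v bedtools >/dev/null 2>&1; then
    bedtools intersect -wa -wb -a anchor1.bed -b ctcf.bed > anchor1.hits.txt
    bedtools intersect -wa -wb -a anchor2.bed -b ctcf.bed > anchor2.hits.txt
fi
```

The ID in column 4 is what lets you tell which hits in the two result files belong to the same loop.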

2
5.3 years ago
PT ▴ 20

Use pairToBed:

bedtools pairtobed -a file.bedpe -b file.bed
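pairToBed also takes a `-type` option that controls which ends of the pair must overlap a BED feature; `either` is the default and `both` requires both anchors to be hit. A small sketch under made-up inputs (the file names and the single CTCF peak are hypothetical), guarded so that it only calls bedtools when it is installed:

```shell
# Hypothetical inputs: a 6-column BEDPE loop and a peak inside its first anchor.
printf 'chr1\t1050000\t1060000\tchr1\t1180000\t1190000\n' > loops.bedpe
printf 'chr1\t1055000\t1055271\tCTCF\n' > ctcf.bed

if command -v bedtools >/dev/null 2>&1; then
    # Report loops where at least one anchor overlaps a feature (the default).
    bedtools pairtobed -a loops.bedpe -b ctcf.bed -type either > either.txt
    # Report only loops where both anchors overlap a feature.
    bedtools pairtobed -a loops.bedpe -b ctcf.bed -type both > both.txt
fi
```

Comparing the two result files shows which loops have a feature at one anchor only and which have one at both.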