Hi, I have a BEDPE file (describing loops across the genome; the distance between the two anchors may be quite long) and a BED file.
The data look like this:
BEDPE (the first 6 columns describe the loop: chromosome1, start1, end1, chromosome2, start2, end2; the remaining columns are useful attributes):
chr1 1050000 1060000 chr1 1180000 1190000 0,255,255 241 107.673 11 8.802 143.514 120.144 1.09607073802e-16 9.5834568345e-17 2.5647576134 8e-07 1.6487531336e-16 2 1060000 1180000 7071.06781187
BED:
chr1 10000 10271 CTCF 1000 . 10000 10271 10,190,254
I want to find the overlap between the anchor regions in the BEDPE file and the regions in the BED file. How can I use bedtools to do this?
BTW, is there a way to properly sort the BEDPE file? I tried sorting with the command "sort -k1,1 -k2,2n infile", which is recommended by bedtools. Is it suitable for a BEDPE file? Or should I use "sort -k1,1 -k2,2 -k3,3 -k4,4 -k5,5 -k6,6 infile"?
Yep, I got similar results. I used this command:
bedtools intersect -wa -wb -a bedpe -b bed -sorted
But it seems that the overlap is only between the first 3 columns of the -a file and the -b file. Is there a way to also find the overlap between columns 4-6 of the -a file and the first 3 columns of the -b file at the same time?
For BED files, overlap is always computed on the first 3 columns of the tab-delimited file: chromosome, start and end coordinates. The rest of what you see in the output is just the data from the corresponding input files, reported according to options such as -wa, -wb and -wao. If you want to work on other columns, you simply have to construct a new BED file from the desired columns and use that for your downstream operations.
Break up your BEDPE file using cut. I would also annotate each line in the BEDPE so you can match the two positions afterwards. If your BEDPE file is
chr1 100 200 chr1 500 600 ...
I would break it up like
chr1 100 200 POS1
and the other file
chr1 500 600 POS1
Then run intersectBed on each BED file you generated.
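The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the inputs are tab-delimited, the file names (loops.bedpe, peaks.bed, anchor1.bed, anchor2.bed) and the two tiny example records are made up, and awk is used instead of cut so that the POS label can be added in the same pass. Each anchor file is a plain BED file, so the "sort -k1,1 -k2,2n" ordering that bedtools recommends for -sorted applies to it directly.

```shell
#!/bin/sh
set -eu
LC_ALL=C; export LC_ALL   # keep sort order consistent across files

# Made-up example inputs (hypothetical names and values, tab-delimited).
printf 'chr1\t1050000\t1060000\tchr1\t1180000\t1190000\t241\n'  > loops.bedpe
printf 'chr1\t100\t200\tchr1\t500\t600\t7\n'                   >> loops.bedpe
printf 'chr1\t1185000\t1185200\tCTCF\n'                         > peaks.bed
printf 'chr1\t150\t180\tCTCF\n'                                >> peaks.bed

# Split each loop into its two anchors; POS<line number> ties them together.
awk 'BEGIN{FS=OFS="\t"} {print $1,$2,$3,"POS" NR}' loops.bedpe \
    | sort -k1,1 -k2,2n > anchor1.bed
awk 'BEGIN{FS=OFS="\t"} {print $4,$5,$6,"POS" NR}' loops.bedpe \
    | sort -k1,1 -k2,2n > anchor2.bed
sort -k1,1 -k2,2n peaks.bed > peaks.sorted.bed

# Intersect each anchor file with the BED file (skipped if bedtools is absent).
if command -v bedtools >/dev/null 2>&1; then
    bedtools intersect -wa -wb -a anchor1.bed -b peaks.sorted.bed -sorted > anchor1_hits.txt
    bedtools intersect -wa -wb -a anchor2.bed -b peaks.sorted.bed -sorted > anchor2_hits.txt
else
    echo "bedtools not found; skipping the intersect step" >&2
fi
```

The POS label in column 4 of each hit lets you pair the results from the two anchors back to the original loop. It may also be worth looking at bedtools pairtobed, which takes a BEDPE file as -a and a BED file as -b and has a -type option (for example "either" or "both") for choosing which anchors must overlap.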