Bedtools Intersect Error
1
0
6.0 years ago
dally ▴ 200

I am trying to intersect two BED files using the intersectBed command of bedtools, but I am getting an error that I did not run into when I originally generated these files.

I run this:

intersectBed -a H3K1-Chip.MACS2_peaks.narrowPeak -b pol_summits_windowed > pol-H3K1-model

And it generates this error:

Error: Invalid record in file pol_summits_windowed. Record is

chr1    249106180    248956422    Pol-II-Chip.MACS2_peak_8225    22.27517

It seems to be adding and subtracting the flanks incorrectly.

This is the corresponding record from the unflanked file that I'm trying to flank:

chr1    249106179    249106180    Pol-II-Chip.MACS2_peak_8225

The pol_summits_windowed file was created with bedtools from a pol_summit BED file (summits called by MACS, provided by the sequencing core), flanked by 250 bp in either direction using the flankBed function.

The narrowPeak file contains all H3K4me1 peaks from the sequencing core. I want to keep the peaks from this file that overlap my flanked summit file.

Looking at the record, the start coordinate is larger than the end coordinate. I don't know what went wrong in the flank command.

Any ideas? It works fine if I don't flank my pol summit file, but then I lose out on some H3K4me1 peaks.

EDIT: It's also worth noting that I have run flankBed on this file BEFORE, so that might be what is causing the error.

bedtools error intersect • 6.0k views
2
6.0 years ago

The end site is before the start site.
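A quick way to see how many records are affected is to list every line where the start exceeds the end. This is only a minimal sketch: the file names check_demo.bed and bad_records.txt are made up for illustration, and the two sample records are the ones quoted in the question.

```shell
# Sample records taken from the question (tab-separated BED).
printf 'chr1\t249106179\t249106180\tPol-II-Chip.MACS2_peak_8225\n' > check_demo.bed
printf 'chr1\t249106180\t248956422\tPol-II-Chip.MACS2_peak_8225\t22.27517\n' >> check_demo.bed

# Print the line number and content of every record whose start is greater than its end.
awk 'BEGIN { FS = OFS = "\t" } $2 > $3 { print NR, $0 }' check_demo.bed > bad_records.txt
cat bad_records.txt
```

On the real data, the same awk line would be run on pol_summits_windowed instead of the demo file.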

0

Yes, I'm aware that this is the problem now. However, it is caused by the flankBed step, because without it the intersect works. How do I flank these summits without causing this?

The reason I need to flank is that otherwise I lose true positives that I want to keep when intersecting my Pol summits with a mark such as H3K1.

0

I'm not sure why that's happening, but as long as the coordinates are correct (apart from being in reverse order), you can fix this with awk (or Excel). Do the ranges look right?
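The awk fix could look roughly like this. It is a sketch under assumptions: swap_demo.bed and swap_demo.fixed.bed are placeholder names, and the input is the single failing record from the error message.

```shell
# The failing record from the error message (tab-separated).
printf 'chr1\t249106180\t248956422\tPol-II-Chip.MACS2_peak_8225\t22.27517\n' > swap_demo.bed

# Swap columns 2 and 3 only when start > end; all other records pass through unchanged.
awk 'BEGIN { FS = OFS = "\t" } $2 > $3 { tmp = $2; $2 = $3; $3 = tmp } { print }' \
    swap_demo.bed > swap_demo.fixed.bed
cat swap_demo.fixed.bed
```

Note that swapping only makes the record parse. For this record the swapped interval would run from 248956422 to 249106180, which is 149758 bp wide rather than 250 bp, so it is worth checking the ranges before relying on the result.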

0

The ranges are correct. The problem is that this only occurs for some start and end coordinates, not all of them.
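One thing that might explain why only some records are hit: the end value in the failing record, 248956422, happens to equal the length of chr1 in the GRCh38/hg38 assembly. If the genome file passed to flankBed uses those lengths, a flank that starts past the chromosome end would be clipped back to it. That is a guess on my part and not confirmed anywhere in this thread. A rough check, with the chromosome length hard-coded as an assumption and placeholder file names:

```shell
# Assumed chromosome length: the end coordinate seen in the failing record.
chrom_len=248956422

# The unflanked summit record quoted in the question (tab-separated).
printf 'chr1\t249106179\t249106180\tPol-II-Chip.MACS2_peak_8225\n' > summit_demo.bed

# Report records whose end coordinate lies beyond the assumed chromosome length.
awk -v maxlen="$chrom_len" 'BEGIN { FS = OFS = "\t" } $3 > maxlen { print $0, "beyond_" maxlen }' \
    summit_demo.bed > beyond_end.txt
cat beyond_end.txt
```

For a real run the lengths would come per chromosome from the same genome file that was given to flankBed, rather than from a single hard-coded value.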

0

Weird. I wonder if it has something to do with the size of the interval. Anyway, an awk one-liner or Excel formula should be able to fix that. Sure is frustrating, though!