bedtools merge, ERROR: file has non positional records, which are only valid for the groupBy tool.
7.4 years ago

I am trying to use bedtools merge (which I have used many times before) to merge overlapping elements in a BED file, and I am getting the error:

"ERROR: file has non positional records, which are only valid for the groupBy tool."

I have:
- sorted the bed file by chromosome and start position
- checked the chromosome column for any non-standard notation
- checked the start and end columns for any non-numerical characters
- checked that the start values are all smaller than the end values

None of the above items seem to be an issue. The file has almost 2 million lines, so checking the file manually is not an option.

Any suggestions for how to remedy this issue would be greatly appreciated.

Here is the head of the file:

chr1 11874 12227 DDX11L1 NR_046018.2 +
chr1 12613 12721 DDX11L1 NR_046018.2 +
chr1 13221 14409 DDX11L1 NR_046018.2 +
chr1 14362 14829 WASH7P NR_024540.1 -
chr1 14970 15038 WASH7P NR_024540.1 -
chr1 15796 15947 WASH7P NR_024540.1 -
chr1 16607 16765 WASH7P NR_024540.1 -
chr1 16858 17055 WASH7P NR_024540.1 -
chr1 17233 17368 WASH7P NR_024540.1 -

bedtools • 14k views

According to the bedtools source code on GitHub, that error is triggered by one of two checks: "Enforce integer coordinates" or "Enforce tab-separated files". Maybe the file is not tab-delimited, or there is a malformed coordinate somewhere in it.
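A quick way to test both possibilities on a 2-million-line file is to scan it for records that are not tab-separated or whose start/end fields are not plain integers. The following is a minimal Python 3 sketch; the function name `find_bad_bed_lines` and the exact rules are my own assumptions for illustration, not the checks bedtools itself performs.

```python
def find_bad_bed_lines(lines, max_report=10):
    """Return (line_number, reason, line) tuples for suspicious BED records.

    Hypothetical helper: it only approximates the two checks mentioned above.
    """
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            problems.append((lineno, "fewer than 3 tab-separated fields", line))
        else:
            for name, value in (("start", fields[1]), ("end", fields[2])):
                # isdigit() is False for values with spaces, signs or letters.
                if not (value.isascii() and value.isdigit()):
                    problems.append(
                        (lineno, f"{name} is not a plain integer: {value!r}", line)
                    )
                    break
        if len(problems) >= max_report:
            break
    return problems


if __name__ == "__main__":
    sample = [
        "chr1\t11874\t12227\tDDX11L1\n",   # well-formed
        "chr1\t 12613\t12721\tDDX11L1\n",  # stray space before the start
        "chr1 13221 14409 DDX11L1\n",      # space-separated, not tabs
    ]
    for lineno, reason, line in find_bad_bed_lines(sample):
        print(lineno, reason, repr(line))
```

Printing the line with `repr` makes stray spaces and missing tabs visible, which is easy to miss when viewing the file in a terminal. For a real file, pass an open file handle instead of the sample list.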


Thank you for the help. It turns out that the Python code generating the input file was adding a space in front of each integer (in addition to the tab separators).


Sometimes it is easiest just to subset the file until you find the offending line.
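Subsetting can be done as a bisection, so that even a 2-million-line file needs only about 20 rounds. Here is a rough Python 3 sketch. It assumes the error is caused by an individual line regardless of its neighbours, and `fails` is a hypothetical callable that you supply, which returns True when the tool rejects the given subset of lines.

```python
def find_offending_line(lines, fails):
    """Return the 0-based index of one line that makes fails() true, or None.

    Assumes a subset fails exactly when it contains at least one bad line.
    """
    if not fails(lines):
        return None
    lo, hi = 0, len(lines)  # invariant: lines[lo:hi] contains a bad line
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(lines[lo:mid]):
            hi = mid  # a bad line is in the first half
        else:
            lo = mid  # otherwise it must be in the second half
    return lo


if __name__ == "__main__":
    records = [f"chr1\t{i * 100}\t{i * 100 + 50}\n" for i in range(1000)]
    records[637] = "chr1\t 63700\t63750\n"  # inject a stray space

    # Stand-in predicate for the demo; see the note below for a real one.
    def has_space(subset):
        return any(" " in rec for rec in subset)

    print(find_offending_line(records, has_space))
```

In practice `fails` could write the subset to a temporary file, run `bedtools merge -i` on it, and report whether the command exited with an error. If several lines are bad, this finds only one of them, so fix it and rerun.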

7.4 years ago

Solved the issue:

I had generated the BED file with Python code, and my print statement was placing space characters next to each coordinate, so bedtools merge didn't recognize the start and end coordinates as integers.

Code generating the error:

print >> OutputFile, chrom, "\t", start, "\t", end

Code that works:

# join() needs strings, so convert the coordinates in case they are integers
to_print = [chrom, start, end]
print >> OutputFile, "\t".join(map(str, to_print))
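For anyone hitting this under Python 3, here is a small sketch of the same pitfall and fix. The variable values are taken from the first line of the example file, and the `io.StringIO` buffers stand in for the output file. Note that the exact placement of the extra spaces can differ from what the Python 2 print statement produces.

```python
import io

chrom, start, end = "chr1", 11874, 12227

# Pitfall: print() joins its arguments with a space by default, so passing
# "\t" as a separate argument leaves spaces around every tab.
buggy = io.StringIO()
print(chrom, "\t", start, "\t", end, file=buggy)

# Fix: let print() use the tab itself as the separator.
fixed = io.StringIO()
print(chrom, start, end, sep="\t", file=fixed)

# repr() makes the stray spaces visible.
print(repr(buggy.getvalue()))
print(repr(fixed.getvalue()))
```

The standard library's `csv.writer` with `delimiter="\t"` is another option when the records have many columns.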