Question: Error using bedtools "intersect" command
9 weeks ago, hila wrote:

Hi all, I'm trying to intersect a vcf file (as file -a) with a bed file (as file -b). I'm getting this error:

ERROR: Received illegal bin number 4294967295 from getBin call.
ERROR: Unable to add record to tree.

After reading some previous similar questions, I made sure that my bed file is tab delimited and sorted.

When using the -sorted flag I got an error on my first position in the bed file:

Error: Sorted input specified, but the file file.bed has the following out of order record
1       2336225 2337283

Would appreciate any suggestions

Thanks Hila

*Adding:* My command line is:

bedtools intersect -a file_a.gz -b file_b.bed

Neither file has a header.

bedtools software error • 257 views
modified 9 weeks ago by Pierre Lindenbaum115k • written 9 weeks ago by hila0

Are the chromosomes in the BED file named 1, 2, 3 rather than chr1, chr2, chr3?

written 9 weeks ago by jomo018450

Yes. Does it matter? Should I change them to chr1, chr2, ...?

written 9 weeks ago by hila0

Yes, this does matter!

All the files in your analysis pipeline should use the same naming scheme. Otherwise you will run into problems like this one.

fin swimmer

written 9 weeks ago by finswimmer8.2k
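A minimal sketch of how the naming scheme of both files could be compared, and the BED file renamed if needed. The toy data, the temporary directory, and the assumption that the VCF uses chr-prefixed names are all hypothetical; the thread does not confirm which scheme file_a.gz uses.

```shell
# Toy inputs standing in for file_b.bed and file_a.gz (hypothetical data).
workdir=$(mktemp -d)
printf '1\t2336225\t2337283\n2\t100\t200\n' > "$workdir/file_b.bed"
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t2336300\t.\tA\tG\nchr2\t150\t.\tC\tT\n' \
    | gzip -c > "$workdir/file_a.gz"

# Distinct chromosome names in the BED file (column 1).
bed_names=$(cut -f1 "$workdir/file_b.bed" | LC_ALL=C sort -u | paste -sd ' ' -)

# Distinct chromosome names in the VCF body (header lines start with '#').
vcf_names=$(gzip -dc "$workdir/file_a.gz" | grep -v '^#' | cut -f1 | LC_ALL=C sort -u | paste -sd ' ' -)

echo "BED: $bed_names"
echo "VCF: $vcf_names"

# Assumption: the BED lacks the prefix and has no header, so prefix every line.
sed 's/^/chr/' "$workdir/file_b.bed" > "$workdir/file_b.chr.bed"
fixed_names=$(cut -f1 "$workdir/file_b.chr.bed" | LC_ALL=C sort -u | paste -sd ' ' -)
echo "BED after renaming: $fixed_names"
```

If the two lists already match, no renaming is needed and the problem lies elsewhere.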

In addition to @finswimmer's response: your sort order is numeric (1, 2, 3). If the gz file uses chr1, chr2, chr3, it is probably sorted lexicographically (chr1, chr10, ...), so the two orders would not match.

written 9 weeks ago by jomo018450

Hello, can you please add your command line and the header of each file to your post?

written 9 weeks ago by Bastien Hervé2.7k

I've added it to the original post. Thanks!

written 9 weeks ago by hila0

Sort both files with sort -V on the appropriate columns (e.g., 1-3 in the given file.bed) and then try again.

modified 9 weeks ago • written 9 weeks ago by Jeffin Rockey990
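A small sketch of that suggestion on hypothetical toy data. The -V (version sort) option is a GNU coreutils extension and may not exist in other sort implementations; the file names and temporary directory are made up for the example.

```shell
workdir=$(mktemp -d)
tab=$(printf '\t')

# Toy BED file in arbitrary order (hypothetical data).
printf '10\t100\t200\n2\t500\t600\nX\t50\t60\n1\t2336225\t2337283\n1\t100\t200\n' \
    > "$workdir/file.bed"

# Version sort on column 1 (chromosome), numeric sort on column 2 (start).
LC_ALL=C sort -t "$tab" -k1,1V -k2,2n "$workdir/file.bed" > "$workdir/file.sorted.bed"
cat "$workdir/file.sorted.bed"

# Chromosome column of the result, joined on one line for a quick look.
version_order=$(cut -f1 "$workdir/file.sorted.bed" | paste -sd ' ' -)
first_line=$(head -n 1 "$workdir/file.sorted.bed")
echo "$version_order"
```

Whatever order is chosen, the same keys would need to be applied to the other input as well (for a VCF, column 1 and the position in column 2), since -sorted expects both files in a consistent chromosome order. The bedtools documentation also describes a -g genome file for declaring the expected chromosome order.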

Hello hila,

how did you sort the two files? Also have a look at the lines before and after the one bedtools is complaining about. What do they look like?

fin swimmer

written 9 weeks ago by finswimmer8.2k
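One way to inspect the lines in question, sketched on hypothetical toy data: print the first lines with invisible characters made visible, then flag any line that is not a well-formed three-column BED record. The checks chosen here (tab delimiters, integer coordinates, start not greater than end) are assumptions about what might be wrong, not a diagnosis of the file in this thread.

```shell
workdir=$(mktemp -d)

# Toy BED (hypothetical): line 2 uses spaces instead of tabs, line 3 has start > end.
printf '1\t2336225\t2337283\n1 2400000 2400100\n1\t2500000\t2499000\n2\t100\t200\n' \
    > "$workdir/file.bed"

# Show the first two lines unambiguously: tabs appear as \t, line ends as $.
sed -n '1,2l' "$workdir/file.bed"

# Report lines with fewer than three tab-separated fields, non-integer
# coordinates, or start > end. In a three-column file a trailing carriage
# return (Windows line endings) also fails the integer test on column 3.
bad_lines=$(awk -F'\t' '
    NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || ($2 + 0) > ($3 + 0) {
        printf "line %d: %s\n", NR, $0
    }' "$workdir/file.bed")
echo "$bad_lines"
bad_count=$(printf '%s\n' "$bad_lines" | wc -l | tr -d ' ')
```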

Hi, the sorting was done by

cat file_name|sort -k1,1n -k2,2n

The line bedtools is complaining about is the first line in the BED file; the line after it is in order.

Thanks Hila

written 9 weeks ago by hila0
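One detail of that sort command may be worth checking: -k1,1n compares column 1 as a number, so a non-numeric chromosome name such as X is treated as 0. The sketch below, on hypothetical toy data, compares it with a text comparison on column 1 (sort -k1,1 -k2,2n), the form the bedtools documentation uses in its sorting examples. Whether this difference is what triggers the error in this thread is not confirmed.

```shell
workdir=$(mktemp -d)

# Toy BED file (hypothetical data).
printf '10\t100\t200\n2\t500\t600\nX\t50\t60\n1\t2336225\t2337283\n' > "$workdir/file.bed"

# Column 1 compared as a number: "X" counts as 0 and sorts first.
numeric_order=$(LC_ALL=C sort -k1,1n -k2,2n "$workdir/file.bed" | cut -f1 | paste -sd ' ' -)

# Column 1 compared as text, column 2 as a number.
text_order=$(LC_ALL=C sort -k1,1 -k2,2n "$workdir/file.bed" | cut -f1 | paste -sd ' ' -)

echo "sort -k1,1n -k2,2n: $numeric_order"
echo "sort -k1,1  -k2,2n: $text_order"
```

The two commands give different chromosome orders, so a file sorted one way and a file sorted the other way would disagree when both are passed to bedtools with -sorted.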

Looks ok. Have you done it for both files?

In your initial post you wrote the filename is file_a.gz. How did you compress it? Using bgzip would be the right way.

fin swimmer

written 9 weeks ago by finswimmer8.2k

I got the gz file from another source; I didn't make it myself. The thing is, this code already worked with the same gz file and another BED file, so I guess the problem is with the new BED file. I just can't find it :(

written 9 weeks ago by hila0
9 weeks ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Your coordinates are too large for bedtools. This typically happens when a chromosome is longer than 500 Mb: https://github.com/arq5x/bedtools2/blob/e5ad7e48108681fe93ee4600d07699ab278e3c56/src/utils/BinTree/BinTree.cpp#L118

written 9 weeks ago by Pierre Lindenbaum115k
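A rough way to test this explanation against a BED file, sketched on hypothetical toy data. Two assumptions are made here: the limit variable simply reuses the "500mb" figure from the answer rather than an exact value read from the bedtools source, and the reading of 4294967295 as a "no bin found" marker is an inference from the number itself (it equals 2^32 - 1, which is what -1 becomes as an unsigned 32-bit integer).

```shell
workdir=$(mktemp -d)

# Toy BED file (hypothetical): line 2 has coordinates beyond 500 Mb.
printf '1\t2336225\t2337283\n1\t600000000\t600000100\n2\t100\t200\n' > "$workdir/file_b.bed"

# The bin number in the error message equals 2^32 - 1.
all_ones=$(( (1 << 32) - 1 ))
echo "$all_ones"

# Hypothetical threshold taken from the answer above, not from the source code.
limit=500000000

# Print the line number and content of every record whose end exceeds the limit.
too_large=$(awk -F'\t' -v limit="$limit" '($3 + 0) > limit { print NR ": " $0 }' "$workdir/file_b.bed")
echo "$too_large"
```

For the VCF input the same check would apply to the position in column 2, after skipping header lines that start with '#'.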