Error: unable to open file or unable to determine types for file bed file
3
6.8 years ago

I'm trying to do a bedtools compare on a couple of bed files. One of the bed files keeps throwing this error:

Error: unable to open file or unable to determine types for file

My file is as follows:

chr1 810865 3198369
chr1 844270 845356
chr1 882432 10373009
chr1 1104962 1173985
chr1 2616309 2617058
chr1 3056425 3245459
chr1 4704545 5447621

I have checked that the start position is before the end position. I have added and removed the column headings "chrom", "start" and "end" (and also tried "chromStart" and "chromEnd"), but no joy. I originally created this bed file from a tab-delimited file in R. Does anyone know what has gone wrong?

bed sequencing bedtools • 14k views
2

Try perl -p -i -e 's/ /\t/g' bed_file.txt to convert the spaces to tabs, then rerun bedtools.

1

Have you ensured that you're always using tabs to separate the columns? I've seen a couple of files with spaces randomly thrown in, and that tends to break things.
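A quick way to find stray spaces is to flag every line that contains one. The sketch below is only an illustration: the function name lines_with_spaces is made up for this example, and it assumes a plain BED file in which no field legitimately contains a space.

```python
def lines_with_spaces(lines):
    """Return (line_number, text) pairs for lines that contain a space character."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        # Assumption: in a plain BED file, any space is a separator problem.
        if " " in text:
            flagged.append((number, text))
    return flagged


# Hypothetical sample: the first line is tab-separated, the second is not.
sample = ["chr1\t810865\t3198369\n", "chr1 844270 845356\n"]
for number, text in lines_with_spaces(sample):
    print(f"line {number} contains spaces: {text!r}")
```

To run it on a real file, pass the open file object instead of the sample list, for example lines_with_spaces(open("file.bed")).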

1

As others have suggested, this is most likely caused by the file having spaces in place of tabs. I note you mention this table was written in R, where the default separator for write.table is indeed a space. I have this function in my .Rprofile to make writing bioinformatics-style tables easier:

write.simple.table <- function(...) {
  write.table(..., quote=FALSE, row.names=FALSE, col.names=FALSE, sep='\t')
}


4
6.8 years ago

OK, so this is what I did, in case anyone else gets the same problem:

1. As per Ashutosh, I ran perl -p -i -e 's/ /\t/g' on the file. In the end I don't think this was the problem, but it is probably a good idea anyway.

2. I uploaded the bed file to the UCSC genome browser as a custom track. If the bed file is incorrectly formatted, it tells you what the problem is, and it is much better at debugging than bedtools.

3. Beware of directory paths when running the command line; the same error is reported when bedtools cannot open the file.

Sorted now

Thanks everyone

0

I got the same error message from intersectBed using a gtf file.

I uploaded it to UCSC genome browser as a custom track as suggested (https://genome.ucsc.edu/cgi-bin/hgCustom) and it told me what my issue was:

"chromStart after chromEnd (1462431 > 1462417)"

Thanks

Note: I had to change my chromosome name to get this to work as E. coli isn't one of their species options.
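The same start-after-end check can be done locally without uploading anything. This is a minimal sketch, not what UCSC or bedtools actually run: the function name find_inverted_intervals and its column parameters are made up for this example. The defaults assume BED (start and end in the second and third columns); for a GTF file, where they are the fourth and fifth columns, pass start_col=3 and end_col=4.

```python
def find_inverted_intervals(lines, start_col=1, end_col=2):
    """Return (line_number, start, end) for tab-separated lines where start > end."""
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(start_col, end_col):
            continue  # too few columns to check; skipped in this sketch
        try:
            start, end = int(fields[start_col]), int(fields[end_col])
        except ValueError:
            continue  # non-numeric coordinates, e.g. a header line
        if start > end:
            bad.append((number, start, end))
    return bad


# Hypothetical sample reusing the coordinates from the error message above.
sample = ["chr1\t810865\t3198369\n", "chr1\t1462431\t1462417\n"]
print(find_inverted_intervals(sample))
```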

0

Sometimes it is caused by a header added by the program that generated the bed file. In this case, tail -n +2 file.bed > newfile.bed helps, because it drops the first line.
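Note that tail -n +2 removes the first line unconditionally, even if it is real data. A slightly more cautious variant drops the first line only when it does not look like a data row. This is a sketch under assumptions: the function name drop_header is hypothetical, and "looks like data" here just means that the second and third fields are non-negative integers.

```python
def drop_header(lines):
    """Return the lines without the first one if it does not look like a BED record."""
    if not lines:
        return lines
    fields = lines[0].split()
    # Assumption: a data row has at least three fields with integer start and end.
    looks_like_data = (
        len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit()
    )
    return lines if looks_like_data else lines[1:]


with_header = ["chrom\tstart\tend\n", "chr1\t810865\t3198369\n"]
without_header = ["chr1\t810865\t3198369\n"]
print(drop_header(with_header))
print(drop_header(without_header))
```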

3
6.8 years ago

Run cat -T on your file to verify that it has tabs where you think they should be (GNU cat shows each tab as ^I). If it has multiple spaces where a tab is expected, use sed 's/ \+/\t/g' on the file to replace each run of one or more spaces with a single tab character.
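If GNU cat and sed are not available, the same replacement can be sketched in Python. The function name spaces_to_tabs is made up for this example. Like the sed command, it turns every run of spaces into one tab, so a leading space on a line becomes a leading tab.

```python
import re


def spaces_to_tabs(line):
    """Replace each run of one or more spaces with a single tab."""
    return re.sub(r" +", "\t", line)


# Hypothetical line with a double space between the first two columns.
line = "chr1  810865 3198369\n"
fixed = spaces_to_tabs(line)
# repr() makes tabs visible as \t, much as cat -T shows them as ^I.
print(repr(fixed))
```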

0
6.8 years ago

One easy thing to try: although the bed format only requires those 3 columns, I've gotten that error and fixed it by adding a fourth column containing '+', because the command I was using needed that information.