Question: Error: unable to open file or unable to determine types for file bed file
4.2 years ago by
United Kingdom
sebastianzeki0170 wrote:

I'm trying to do a bedtools compare on a couple of bed files. One of the bed files keeps throwing this error:

Error: unable to open file or unable to determine types for file

My file is as follows:

chr1 810865 3198369
chr1 844270 845356
chr1 882432 10373009
chr1 1104962 1173985
chr1 2616309 2617058
chr1 3056425 3245459
chr1 4704545 5447621

I have checked that the start position is before the end position. I have added and removed the column headings "chrom", "start" and "end" (and also tried "chromStart" and "chromEnd"), but no joy. I originally created this bed file from a tab-delimited file in R. Does anyone know what has gone wrong?

sequencing bed bedtools • 8.8k views

Try perl -p -i -e 's/ /\t/g' bed_file.txt to convert spaces to tabs in place, then rerun bedtools.

written 4.2 years ago by Ashutosh Pandey11k

Have you ensured that you're always using tabs to separate the columns? I've seen a couple of files with spaces randomly thrown in, and that tends to break things.
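
A quick way to find stray spaces is to list the lines that do not split cleanly on tabs. This is only a sketch: the sample file is made up for illustration, and flagging every literal space is a heuristic that would also catch a legitimate space inside a name column.

```shell
# Hypothetical sample: line 1 is tab-delimited, line 2 uses spaces,
# line 3 mixes a tab with a space.
bed=$(mktemp)
printf 'chr1\t810865\t3198369\nchr1 844270 845356\nchr1\t882432 10373009\n' > "$bed"

# Print the line number of every record that has fewer than 3
# tab-separated fields or contains a literal space anywhere.
suspect=$(awk -F'\t' 'NF < 3 || / / { print NR }' "$bed")
echo "suspect lines:"
echo "$suspect"
```

If this prints nothing after "suspect lines:", the delimiters are probably not the cause of the error.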

written 4.2 years ago by Devon Ryan88k

As others have suggested, this is most likely caused by the file having spaces in place of tabs. I note you mention this table was written in R, where the default separator for write.table is indeed a space. I have this function in my .Rprofile to make writing bioinformatics-style tables easier:

write.simple.table <- function(...){
    write.table(..., quote=FALSE, row.names=FALSE, col.names=FALSE, sep='\t')
}

 

written 4.2 years ago by David W4.7k
4.2 years ago by
United Kingdom
sebastianzeki0170 wrote:

OK, so this is what I did, in case anyone else gets the same problem:

1. As per Ashutosh, I ran perl -p -i -e 's/ /\t/g' on the file. I think this was not the problem in the end, but it is probably a good idea.

2. Uploaded the bed file to the UCSC Genome Browser as a custom track. If the bed file is incorrectly formatted, it will tell you what the problem is, and it is much better at debugging than bedtools.

3. Beware of directory paths when running the command line.
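
Since the error also covers "unable to open file", the path point in step 3 can be checked before bedtools is run at all. The helper below is a hypothetical sketch (the name check_bed_path is invented here); it only tests that the path exists, is readable and is not empty.

```shell
# Hypothetical helper: sanity-check a path before handing it to bedtools.
check_bed_path() {
    # $1: path to the file you intend to pass to bedtools
    if [ ! -e "$1" ]; then
        echo "missing: $1 (current directory: $(pwd))"
        return 1
    elif [ ! -r "$1" ]; then
        echo "not readable: $1"
        return 1
    elif [ ! -s "$1" ]; then
        echo "empty: $1"
        return 1
    fi
    echo "ok: $1"
}
```

For example, check_bed_path a.bed && check_bed_path b.bed && bedtools intersect -a a.bed -b b.bed would only reach bedtools if both files can be opened; a.bed and b.bed are placeholder names.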

 

Sorted now.

Thanks everyone.

I got the same error message from intersectBed using a gtf file.

I uploaded it to UCSC genome browser as a custom track as suggested (https://genome.ucsc.edu/cgi-bin/hgCustom) and it told me what my issue was:

"chromStart after chromEnd (1462431 > 1462417)"

Thanks

Note: I had to change my chromosome name to get this to work as E. coli isn't one of their species options.
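
The same check can be run locally without uploading anything. This sketch assumes a tab-delimited BED file, where start and end are columns 2 and 3; for a GTF file like the one above they would be columns 4 and 5 instead. The sample records are made up, apart from the coordinates quoted in the UCSC message.

```shell
# Hypothetical sample: the second record has start > end.
bed=$(mktemp)
printf 'chr1\t100\t200\nchr1\t1462431\t1462417\nchr1\t300\t300\n' > "$bed"

# Print line number, start and end for every record whose start
# is greater than its end (adding 0 forces a numeric comparison).
inverted=$(awk -F'\t' '($2 + 0) > ($3 + 0) { print NR "\t" $2 "\t" $3 }' "$bed")
echo "$inverted"
```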

written 3.6 years ago by Tim Amos20
4.2 years ago by
Alex Reynolds27k
Seattle, WA USA
Alex Reynolds27k wrote:

Run cat -T on your file to verify that it has tabs where you think they should be (GNU cat prints each tab as ^I). If it has multiple spaces where a tab is expected, use sed 's/ \+/\t/g' on the file to replace each run of one or more spaces with a single tab character.
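
As a rough end-to-end sketch of that check-and-fix cycle, the block below uses a made-up two-line file. Two substitutions are made for portability and are assumptions, not the commands above: cat -t (which GNU cat treats as -vT and BSD/macOS cat also accepts) stands in for cat -T, and awk stands in for the GNU-specific sed expression. Note that the awk version splits on any run of whitespace, so it also collapses mixed tabs and spaces.

```shell
# Hypothetical sample: columns separated by runs of spaces.
bed=$(mktemp)
printf 'chr1  810865   3198369\nchr1 844270 845356\n' > "$bed"

# Tabs would appear as ^I; none do here, so the file has no tabs.
cat -t "$bed"

# Re-join the whitespace-separated fields with single tabs.
fixed=$(mktemp)
awk 'BEGIN { OFS = "\t" } { $1 = $1; print }' "$bed" > "$fixed"

# Every field boundary should now show up as ^I.
cat -t "$fixed"

# Count lines that still do not have exactly 3 tab-separated fields.
bad_lines=$(awk -F'\t' 'NF != 3' "$fixed" | wc -l)
echo "lines without 3 tab-separated fields: $bad_lines"
```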

4.2 years ago by
swbarnes25.0k
United States
swbarnes25.0k wrote:

One easy thing to try: although the bed format only requires those 3 columns, I've gotten that error and fixed it by adding a fourth column containing '+', because the command I was using needed that information.
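
If a tool does need strand information, one way to supply it is to pad a 3-column file out to 6 columns. This sketch follows the standard BED column order (name in column 4, score in column 5, strand in column 6) rather than putting '+' in column 4, and the placeholder values ".", 0 and "+" are assumptions that should be replaced with real data where it exists.

```shell
# Hypothetical 3-column, tab-delimited input.
bed3=$(mktemp)
printf 'chr1\t810865\t3198369\nchr1\t844270\t845356\n' > "$bed3"

# Append placeholder name ("."), score (0) and strand ("+") columns.
bed6=$(mktemp)
awk 'BEGIN { FS = OFS = "\t" } { print $1, $2, $3, ".", 0, "+" }' "$bed3" > "$bed6"
cat "$bed6"
```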
