Hi, I have been struggling with an error in bedtools intersect. The command I am trying to run is as follows:
bedtools intersect -a sorted.vcf -b nstd166.GRCh38.variant_call_chr.vcf.gz -wo -sorted  -f 0.8 -r  -g Homo_sapiens_assembly38.fasta.fai 
For some of the files that I am assessing, I don't get any errors and the output is obtained without issues. But sometimes the error I receive is as follows:
Error: The genome file Homo_sapiens_assembly38.fasta.fai has no valid entries. Exiting.
I have been looking for the cause of the problem and have seen that this is a fairly common failure caused by the structure of the genome file, which in my case is the following:
chrI  15072421 101 112
According to the bedtools documentation itself, however, the structure should be:
chrI  15072421
chrII 15279323
...
chrX  17718854
chrM  13794
My question is, how is it possible that I get an output for some of the files but the error for others?
Thanks in advance!
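One way to rule the file layout in or out is to trim the .fai index down to the two-column layout shown in the documentation excerpt and count the lines that look usable. This is a minimal sketch, not bedtools' own validation. The helper names `fai_to_genome` and `count_valid_entries`, the file names `sample.fasta.fai` and `sample.genome`, and the toy offsets in the sample index are all made up for illustration. It also assumes that a "valid" entry means a tab-delimited line with a name and an integer length.

```shell
# Hypothetical helper: keep only the name and length columns of a .fai index.
fai_to_genome() {
    # $1: path to a .fai index, $2: output genome file (tab-delimited)
    cut -f1,2 "$1" > "$2"
}

# Hypothetical check: count tab-delimited lines with a name and an integer length.
# This is an assumption about what counts as a usable entry, not bedtools' own rule.
count_valid_entries() {
    awk -F'\t' 'NF >= 2 && $2 ~ /^[0-9]+$/ { n++ } END { print n + 0 }' "$1"
}

# Tiny stand-in for a real index; the offset columns are illustrative values only.
printf 'chr1\t248956422\t7\t60\t61\nchr2\t242193529\t100\t60\t61\n' > sample.fasta.fai

fai_to_genome sample.fasta.fai sample.genome
count_valid_entries sample.genome
```

If the count comes back as 0 for the real index, the file is probably not tab-delimited or is not the index you expected. If it looks fine, the input files that trigger the error are the next thing to compare.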
You are using a version of bedtools prior to 2.29. More recent versions have changes in the way the -g file is read and more detailed error messages, so I'd suggest you try the current version to shed some light on this.
Hi,
Please try:
Kevin
A bedtools genome file, as used with -g, is a tab-delimited table giving chromosome names and lengths, and the desired order of the chromosomes. Only the first two columns are used, so a .fai file is suitable. The FASTA file itself is not suitable.
Indeed, Sir, it is not expected a FASTA