Exec format error: bcftools failing to open an indexed VCF file
3.2 years ago
Ella • 0

Hello,

From a VCF file, I am trying to pull the GT information (in the format 0/1, 0/0, 2/0, etc.) for certain positions per chromosome per sample. When I run the following command, I get this warning for every chromosome and contig:

bcftools query -t 4:58000000-59000000 -f '[ %GT]\n' myfile.vcf

[W::vcf_parse] Contig 'scaffold_869' is not defined in the header.
(Quick workaround: index the file with tabix.)

So I compressed the file with bgzip and indexed it with tabix:

bgzip myfile.vcf 
tabix myfile.vcf.gz

When I re-run the above query with the new indexed file, I get:

bcftools query -t 4:58000000-59000000 -f '[ %GT]\n' myfile.vcf.gz.tbi
[E::hts_hopen] Failed to open file myfile.vcf.gz.tbi
[E::hts_open_format] Failed to open file "myfile.vcf.gz.tbi" : Exec format error
Failed to read from myfile.vcf.gz.tbi: Exec format error

I have checked and changed the myfile.vcf.gz.tbi permissions to -rwxrwxrwx via chmod. I have checked the file format via htsfile and get:

myfile.vcf.gz.tbi:        Tabix compressed index data 

Also, the input VCF data is position sorted. I am using samtools version 1.10-98 and htslib 1.10.2-135 on a university server.

Can someone please suggest what is going wrong and why my file cannot be read?

Thanks in advance

samtools bcftools VCF tabix HTSlib • 3.8k views
3.2 years ago

Query the compressed VCF itself rather than its .tbi index, and use -r instead of -t. Not

bcftools query -t 4:58000000-59000000 -f '[ %GT]\n' indexed.vcf.gz.tbi

but

bcftools query -r 4:58000000-59000000 -f '[ %GT]\n' indexed.vcf.gz

[W::vcf_parse] Contig 'scaffold_869' is not defined in the header. (Quick workaround: index the file with tabix.)

means that the records contain a contig named scaffold_869 that is not defined in the header (lines starting with ##contig).
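
As a rough sketch of the corrected call, assuming the file names from the question: bcftools is given the bgzip-compressed VCF and locates the .tbi next to it on its own, so the index never appears on the command line. The helper `data_file_for` is hypothetical (it is not part of bcftools) and only illustrates which path to pass. With -r the index is used to jump to the region, whereas -t streams the whole file.

```shell
#!/bin/sh
# Hypothetical helper (not part of bcftools): map an index path
# back to the data file that bcftools expects as its argument.
data_file_for() {
    case "$1" in
        *.tbi) printf '%s\n' "${1%.tbi}" ;;
        *.csi) printf '%s\n' "${1%.csi}" ;;
        *)     printf '%s\n' "$1" ;;
    esac
}

vcf=$(data_file_for myfile.vcf.gz.tbi)   # myfile.vcf.gz

# Run the query only if bcftools and the compressed VCF are present.
if command -v bcftools >/dev/null 2>&1 && [ -f "$vcf" ]; then
    # The .tbi index sitting beside "$vcf" is picked up automatically.
    bcftools query -r 4:58000000-59000000 -f '[ %GT]\n' "$vcf"
fi
```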


Hi Pierre, thanks for your help. I have added the chromosome lengths to the header using:

awk '/^#CHROM/ {printf("##contig=<ID=chr1,length=263206000>\n##contig=<ID=chr2,length=347119000>\n##contig=<ID=chr3,length=309664000>\n##contig=<ID=chr4,length=340129330>\n##contig=<ID=chr5,length=275007000>\n##contig=<ID=chr6,length=214272000>\n##contig=<ID=chr7,length=255482000>\n");} {print;}' file.vcf > file_header.vcf

After compressing and indexing file_header.vcf as before, I still get an Exec format error:

 bcftools query -r 4:58000000-59000000 -f '[ %GT]\n' file_header.vcf.gz.tbi 

[E::hts_hopen] Failed to open file file_header.vcf.gz.tbi
[E::hts_open_format] Failed to open file "file_header.vcf.gz.tbi" : Exec format error
Failed to read from file_header.vcf.gz.tbi: Exec format error

I have full permissions and am using samtools 1.10-98-gfaab8b0 + htslib 1.10.2-135-gf4f7f24. Any other suggestions on why samtools can't read my file?
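
Two things stand out in this follow-up. The command still passes the .tbi index instead of file_header.vcf.gz. The added header lines are named chr1 to chr7, while the query and the warning refer to contigs named 4 and scaffold_869. The IDs in ##contig lines have to match the CHROM column exactly. One possible way to avoid typing them by hand is to derive them from the records. This is a minimal sketch, assuming a tab-delimited, uncompressed VCF. `add_missing_contigs` is a hypothetical helper, and lengths are left out because they are not known from the file itself.

```shell
#!/bin/sh
# Hypothetical helper: print the VCF with one ##contig line added, just
# before #CHROM, for every CHROM value that the header does not define.
# The file is read twice: pass 1 collects names, pass 2 writes the output.
add_missing_contigs() {
    awk -F '\t' '
        NR == FNR {
            if (match($0, /^##contig=<ID=[^,>]+/)) {
                # "##contig=<ID=" is 13 characters, so the ID starts at 14
                defined[substr($0, 14, RLENGTH - 13)] = 1
            } else if (NF > 0 && $0 !~ /^#/ && !($1 in seen)) {
                seen[$1] = 1
                order[++n] = $1      # remember first-seen order
            }
            next
        }
        /^#CHROM/ {
            for (i = 1; i <= n; i++)
                if (!(order[i] in defined))
                    printf "##contig=<ID=%s>\n", order[i]
        }
        { print }
    ' "$1" "$1"
}
```

Usage would look like `add_missing_contigs file.vcf > file_header.vcf`, followed by `bgzip file_header.vcf`, `tabix -p vcf file_header.vcf.gz`, and a query against file_header.vcf.gz (not the .tbi).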
