Question: Generating .Vcf.Idx Files From Cmdline
8.2 years ago by
Roman Valls Guimerà (530) wrote:


A colleague pointed out this problem with GATK, which generates indexes in memory on the fly if it doesn't find those on disk:

INFO  14:05:14,023 RMDTrackBuilder - Creating Tribble index in memory for file dbsnp_132.vcf
WARN  14:09:31,798 FSLockWithShared - WARNING: Unable to lock file dbsnp_132.vcf.idx (could not open read/write file channel)
WARN  14:09:31,798 RMDTrackBuilder - Unable to write to dbsnp_132.vcf.idx for the index file, creating index in memory only

I've read and gone through these two resources:

But I found it a bit awkward to have to write my own Java class just to generate an index file from a .vcf :-!

VCFTools does not seem to have that functionality either, at least according to their documentation:

Does anybody know how to generate those .vcf.idx files in a more straightforward way?

Thanks in advance !

PS: I just couldn't resist including this link too :)

vcf gatk vcftools • 24k views

GATK should generate those indexes on disk as part of processing. It looks like you might not have permission to write to the directory. I don't know of an indexing function in GATK, but you might be able to run something like 'ValidateVariants' as a lightweight way to make an index as a side effect.

written 8.2 years ago by Brad Chapman (9.5k)

Hi Roman - curious what you set your permissions to for the VCF file? I am running into this same problem and cannot figure it out. Thanks!

written 8.2 years ago by Caddymob (960)

Indeed this was a permissions issue. We intend to have a shared reference genomes repository in our HPC environment by using your script:

This means that the directories/files there shouldn't be writeable by all users and that's why GATK complains about it.

I'll try your lightweight suggestion and integrate it on

Thanks Brad !

written 8.2 years ago by Roman Valls Guimerà (530)

Caddymob, I didn't set the permissions myself; the HPC sysadmins did. All you have to do is grant write permission on the directory named in the error message (chmod ug+w dir).
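Before changing permissions, it can help to confirm that the directory really is the problem. The following is a minimal sketch, not part of GATK: the function name `index_write_status` is made up for illustration, and it assumes the index is expected next to the VCF with an `.idx` suffix, as in the log above.

```python
import os


def index_write_status(vcf_path):
    """Report whether an index file next to vcf_path could be created.

    Hypothetical helper: it only inspects the filesystem and does not
    call GATK or any other tool.
    """
    vcf_path = os.path.abspath(vcf_path)
    directory = os.path.dirname(vcf_path)
    idx_path = vcf_path + ".idx"
    return {
        "idx_path": idx_path,
        "idx_exists": os.path.exists(idx_path),
        # Creating a new file needs write and search (execute)
        # permission on the containing directory.
        "dir_writable": os.access(directory, os.W_OK | os.X_OK),
    }
```

If `dir_writable` comes back `False` for the account that runs GATK, the `chmod ug+w dir` fix above (or pre-building the index as a user who can write there) is the likely remedy.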

written 8.2 years ago by Roman Valls Guimerà (530)
7.8 years ago by
Jitendra (50) wrote:

Hi, I found igvtools to be the best option for VCF indexing.

igvtools can be run from the command line or from within IGV itself (File > Run igvtools...). After launching, choose the Index command and browse to your .vcf file. The index file (.idx) will be created in the same directory as the .vcf file.
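For the command-line route, the call can be scripted. This is a rough sketch that assumes an `igvtools` launcher is on the PATH and that the `index` subcommand takes the VCF path as its only argument; the Python function names are hypothetical wrappers, not part of igvtools.

```python
import subprocess


def igvtools_index_command(vcf_path, igvtools="igvtools"):
    """Build the argument list for `igvtools index <file>`."""
    return [igvtools, "index", vcf_path]


def index_with_igvtools(vcf_path, igvtools="igvtools"):
    """Run igvtools and return the expected index path.

    Assumes the .idx file is written beside the VCF, as described above,
    so the directory must be writable by the calling user.
    """
    subprocess.run(igvtools_index_command(vcf_path, igvtools), check=True)
    return vcf_path + ".idx"
```

Calling `index_with_igvtools("dbsnp_132.vcf")` once, as a user with write access to the shared directory, should leave the index on disk for later read-only users.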

Thanks! Jitendra

8.2 years ago by
Rlong (340) wrote:

Depending on what you want to do downstream with these VCFs, you also have the option of using Heng Li's bgzip and tabix. bgzip compresses the file into independently compressed blocks, then tabix indexes those blocks and produces a .tbi index file. You can then query the compressed VCF by region with tabix, much as you would query an indexed BAM with samtools.
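As a sketch of that workflow, assuming `bgzip` and `tabix` are installed and on the PATH, the commands could be assembled like this. The helper names are made up for illustration; note that the result is a `.tbi` index for the compressed file, not the `.vcf.idx` that GATK looks for.

```python
import subprocess


def bgzip_tabix_commands(vcf_path):
    """Return the two commands that compress and index a VCF."""
    gz_path = vcf_path + ".gz"
    return [
        ["bgzip", vcf_path],              # replaces file.vcf with file.vcf.gz
        ["tabix", "-p", "vcf", gz_path],  # writes file.vcf.gz.tbi
    ]


def region_query_command(gz_path, region):
    """Build a tabix region query, e.g. region='1:1000-2000'."""
    return ["tabix", gz_path, region]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

For example, `run_all(bgzip_tabix_commands("calls.vcf"))` followed by `subprocess.run(region_query_command("calls.vcf.gz", "1:1000-2000"))` would print the records overlapping that region; `calls.vcf` here is a placeholder file name.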


That's not the use case I was looking for, but it's definitely worth knowing, thanks for your feedback !

written 8.2 years ago by Roman Valls Guimerà (530)