How to use GatherVcfs in GATK 4.1.4.0
4.1 years ago
raf.marcondes ▴ 100

Hi all, I used GenomicsDBImport to generate a genomic database from a set of GVCF files covering a number of genomic intervals. For each interval, GenomicsDBImport produced a directory whose contents look like this:

[rmarcondes@boslogin02 db_180]$ pwd
/n/holyscratch01/edwards_lab/rafa/genomic_DBs/db_180
[rmarcondes@boslogin02 db_180]$ ls -a
.          111$1$5060538  114$1$7908999  117$1$5787846  callset.json        vidmap.json
..         112$1$3923849  115$1$2315063  118$1$2797076  __tiledb_workspace.tdb
110$1$3996139  113$1$1360579  116$1$1734749  119$1$4990981  vcfheader.vcf

Now I want to use GatherVcfs to merge all my GVCFs into a single VCF file. I assumed the files I needed to pass to GatherVcfs were the "vcfheader.vcf" files in each interval directory, like this:

java -Xmx200g -XX:ParallelGCThreads=20 -jar $GATKPATH GatherVcfs \
-I /n/holyscratch01/edwards_lab/rafa/genomic_DBs/db_3/vcfheader.vcf \
-I /n/holyscratch01/edwards_lab/rafa/genomic_DBs/db_4/vcfheader.vcf \
-I /n/holyscratch01/edwards_lab/rafa/genomic_DBs/db_5/vcfheader.vcf \
-I /n/holyscratch01/edwards_lab/rafa/genomic_DBs/db_227/vcfheader.vcf \
-I /n/holyscratch01/edwards_lab/rafa/genomic_DBs/db_228/vcfheader.vcf \
-I /n/holyscratch01/edwards_lab/rafa/genomic_DBs/db_229/vcfheader.vcf \
-O thevcf.vcf

But that produced a VCF file with no variants, only a header and a list of all contigs.

What gives? Thanks for any pointers!

4.1 years ago
Ram 43k

The first thing that comes to mind when I see this is that the vcfheader.vcf files, as their name suggests, probably contain only VCF headers. That would make sense, given that GenomicsDBImport creates a GenomicsDB workspace from VCF files rather than writing out a merged VCF.

In other words, you're combining header files and expecting data to be populated out of thin air.
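
A quick way to check this is to count the non-header lines in one of those files. The helper name `count_records` below is made up for illustration; it relies only on the fact that VCF header lines start with `#`.

```shell
# Count data records (non-header lines) in a VCF; header lines start with '#'.
count_records() {
    # grep exits with status 1 when the count is 0, so the status is ignored
    grep -vc '^#' "$1" || true
}

# Hypothetical usage on one of the workspaces from the question:
# count_records /n/holyscratch01/edwards_lab/rafa/genomic_DBs/db_3/vcfheader.vcf
```

If this prints 0 for each vcfheader.vcf, there is nothing for GatherVcfs to gather from them.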

See this: https://github.com/Intel-HLS/GenomicsDB/wiki/Importing-VCF-data-into-GenomicsDB

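Your data is most likely in the array subdirectories of each workspace (the ones named like 110$1$3996139), not in vcfheader.vcf. As far as I know, callset.json and vidmap.json only hold the sample and contig/field mappings. GATK tools read a workspace as a whole through a gendb:// URI rather than through any single file inside it.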
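
A minimal sketch of one common route, assuming the goal is a joint-called VCF: run GenotypeGVCFs on each workspace through a gendb:// URI, then pass the resulting per-interval VCFs to GatherVcfs. The reference path, the output file names, and the `gatk` wrapper being on PATH are assumptions, not taken from the post. The script only prints the commands unless RUN is set to an empty string.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumptions (not from the original post): the reference path, the output
# names, and that the GATK4 `gatk` wrapper script is on PATH.
REF="reference.fasta"                                   # hypothetical reference
DB_ROOT="/n/holyscratch01/edwards_lab/rafa/genomic_DBs" # workspace root from the question
RUN="${RUN-echo}"    # dry run by default: prints commands; use RUN= to execute

gather_args=()
for n in 3 4 5; do   # workspace indices, listed in genomic order
    out="db_${n}.vcf.gz"
    # Joint-genotype one GenomicsDB workspace via its gendb:// URI
    $RUN gatk GenotypeGVCFs -R "$REF" -V "gendb://${DB_ROOT}/db_${n}" -O "$out"
    gather_args+=(-I "$out")
done

# Concatenate the per-interval VCFs into a single file
$RUN gatk GatherVcfs "${gather_args[@]}" -O thevcf.vcf.gz
```

GatherVcfs expects its inputs in genomic order and without overlapping records, so the loop order matters.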
