Question: A strange problem with annotation of vcf file
2.5 years ago, seta wrote:

Hi friends,

I performed variant calling on a BAM file (whole-genome sequencing of a human sample) with GATK (version 4) and then tried to annotate the result with dbSNP, but without success. I downloaded the human reference genome and dbSNP files from the following addresses:

To annotate with dbSNP, I first used the following GATK command:

gatk VariantAnnotator -R genom.fa -I file1.bam -V file1.vcf -A coverage -D All_20180418.vcf

GATK4 said

Warning: VariantAnnotator is a BETA tool and is not yet ready for use in production

and did not produce any output. So I used bcftools for annotation with the following command:

bcftools annotate -c ID -a All_20180418.vcf.gz file1.vcf.gz > file1_annotate.vcf

However, bcftools did not annotate my VCF file either, and it returned no error.

Could you please help me understand what is wrong, and tell me how I can annotate my VCF file with dbSNP?

Many thanks

modified 2.2 years ago by zx8754 • written 2.5 years ago by seta

Did you try this one:

written 18 months ago by Shicheng Guo

2.5 years ago, Kevin Blighe (Republic of Ireland) wrote:

SnpSift Annotate does a good job of annotating the ID field of a VCF. You will have to download the dbSNP VCF, though, which was > 10 GB the last time I checked.

written 2.5 years ago by Kevin Blighe

Yes, I downloaded the dbSNP VCF file, which is 15.2 GB gzipped. Could you please tell me what is wrong with my attempt to annotate the VCF file?

written 2.5 years ago by seta

I cannot comment on the GATK command because I have never used it. The warning message issued by that command speaks for itself, too.

Regarding BCFtools, you likely have to first unset your VCF IDs and then annotate the result, reading the second step's input from the pipe:

bcftools annotate --remove ID file1.vcf.gz | \
bcftools annotate -c ID -a All_20180418.vcf.gz -Ov - \
> file1_annotate.vcf
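To check whether an annotation run actually changed anything, one option is to count how many records carry an ID other than '.'. The following is a minimal sketch using only gzip and awk; the function name count_annotated_ids is made up for illustration and is not part of bcftools.

```shell
#!/bin/sh
# Hypothetical helper (not a bcftools command): print "annotated<TAB>total"
# for the records of a VCF. Accepts plain or gzip/bgzip-compressed input,
# because `gzip -cdf` passes uncompressed data through unchanged.
count_annotated_ids() {
    gzip -cdf "$1" | awk -F '\t' '
        !/^#/ { total++; if ($3 != ".") annotated++ }   # column 3 is the VCF ID field
        END   { printf "%d\t%d\n", annotated, total }'
}
```

Running count_annotated_ids file1_annotate.vcf should then report a non-zero first number if any rs IDs were transferred; a first number of 0 suggests that no record matched the dbSNP file.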

Of course, SnpSift also works.

Nota bene: also check that your contig names are the same as those in the dbSNP VCF. One may use a 'chr' prefix while the other does not.
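The contig-name check can be sketched with standard Unix tools alone. The two function names below are hypothetical helpers, not part of any toolkit, and the sketch scans every record, so it will be slow on a file the size of the dbSNP VCF; if that file has a tabix index, `tabix -l` lists its contigs much faster.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct contig names (column 1) used in the
# records of a VCF. Works on plain or gzip/bgzip-compressed files.
vcf_contigs() {
    gzip -cdf "$1" | awk -F '\t' '!/^#/ { print $1 }' | LC_ALL=C sort -u
}

# Hypothetical helper: print contig names that occur in the first VCF but
# not in the second. Any output means some records cannot match by position.
contigs_only_in_first() {
    a=$(mktemp)
    b=$(mktemp)
    vcf_contigs "$1" > "$a"
    vcf_contigs "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b"   # lines unique to the first sorted list
    rm -f "$a" "$b"
}
```

For example, contigs_only_in_first file1.vcf.gz All_20180418.vcf.gz printing chr1, chr2, and so on would point to a 'chr' prefix mismatch. In that case, the --rename-chrs option of bcftools annotate, which takes a file of "old_name new_name" pairs, is one way to bring the two files into agreement.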

modified 2.5 years ago • written 2.5 years ago by Kevin Blighe

Many thanks for your points. I'll try them

modified 2.5 years ago • written 2.5 years ago by seta

Good luck my friend.

written 2.5 years ago by Kevin Blighe