Question: Why Does Vcftools Vcf-Annotate Produce An Empty File
5.4 years ago, ttom wrote:

I am trying to annotate a VCF file by adding dbSNP information to the INFO column, using the annotation file shown below:

CHROM   FROM INFO/DBSNP
1       69428   dbSNP_134
1       69476   dbSNP_134
1       69496   dbSNP_134
1       69511   dbSNP_131
1       69590   dbSNP_134
1       69594   dbSNP_134

Both the VCF file and the annotation file were bgzipped and tabix-indexed. I then ran the following vcftools vcf-annotate command:

 zcat sample.vcf.gz | vcf-annotate -a annotation.txt.gz -d key=INFO,ID=DBSNP,Number=1,Type=String,Description='My annotation' -c CHROM,FROM,INFO/DBSNP >test_annotated.vcf

The command ran without errors and created the output file 'test_annotated.vcf'. The only change in the output was a new header line describing INFO/DBSNP; no DBSNP information was added to any variant. The VCF file and the annotation file do have variants with the same CHROM and POS, so I am not sure why nothing is being annotated.

Has anyone done this successfully before? Any help is much appreciated. I posted the same question on VCFtools-help (http://bit.ly/ZjT3YG). Petr or Adam, I would appreciate some help :)

Thanks, Tinu
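
The statement above that both files share CHROM and POS values can be checked directly. The following is a minimal sketch, not part of vcftools: the function name shared_positions is made up for illustration, and it assumes both files have been decompressed to plain text with chromosome in column 1 and position in column 2.

```shell
# Count VCF records whose CHROM and POS also appear in the annotation file.
# Both inputs are plain text (decompress first, e.g. with zcat);
# columns 1 and 2 are assumed to hold chromosome and position in both files.
shared_positions() {
    vcf=$1
    ann=$2
    awk '
        /^#/ || NF < 2 { next }                             # skip headers and blank lines
        FILENAME == ARGV[1] { seen[$1 ":" $2] = 1; next }   # first file: annotation
        ($1 ":" $2) in seen { n++ }                         # VCF record with a match
        END { print n + 0 }
    ' "$ann" "$vcf"
}
```

Called as shared_positions sample.vcf annotation.txt, it prints the number of VCF records that have a matching annotation line. A count above zero means the positions overlap and the problem lies elsewhere.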

I would show a few lines of the annotation.txt.gz file as well. Just to make sure that the chromosome names are identical. That is the most common reason.

written 5.4 years ago by Istvan Albert
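
The chromosome-name check suggested in this comment can be scripted. This is a rough sketch under stated assumptions: chrom_names and chrom_mismatch are hypothetical helper names, and both inputs are plain-text files with the chromosome in the first column.

```shell
# Print the distinct chromosome names found in column 1 of a file,
# skipping '#' header lines and blank lines.
chrom_names() {
    awk '!/^#/ && NF { print $1 }' "$1" | sort -u
}

# Show names that occur in only one of the two files.
# comm -3 hides names common to both: names only in the first file are
# printed flush left, names only in the second file are indented by a tab.
chrom_mismatch() {
    first_names=$(mktemp)
    second_names=$(mktemp)
    chrom_names "$1" > "$first_names"
    chrom_names "$2" > "$second_names"
    comm -3 "$first_names" "$second_names"
    rm -f "$first_names" "$second_names"
}
```

Running chrom_mismatch sample.vcf annotation.txt on the decompressed files prints nothing when the names agree. A header line that does not start with '#' will show up as an extra name.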

I have already posted a few lines from the annotation.txt file above, but here they are once more:

CHROM   FROM INFO/DBSNP
1       69428   dbSNP_134
1       69476   dbSNP_134
1       69496   dbSNP_134
1       69511   dbSNP_131
1       69590   dbSNP_134
1       69594   dbSNP_134

And here are a few lines from sample.vcf, the input VCF file to which I need to add the annotations:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
1       69270   .       A       G       2068.69 VQSRTrancheSNP99.50to99.90      AC=126;AF=0.818;AN=154;BaseQRankSum=-3.337;DP=120201;Dels=0.00;FS=0.000;HaplotypeScore=0.0549;InbreedingCoeff=0.2860;MLEAC=129;MLEAF=0.838;MQ=2.04;MQ0=30366;MQRankSum=5.157;QD=0.29;ReadPosRankSum=-2.035;SNPEFF_AMINO_ACID_CHANGE=S108;SNPEFF_CODON_CHANGE=tcA/tcG;SNPEFF_EFFECT=SYNONYMOUS_CODING;SNPEFF_EXON_ID=exon_1_69037_69829;SNPEFF_FUNCTIONAL_CLASS=SILENT;SNPEFF_GENE_BIOTYPE=protein_coding;SNPEFF_GENE_NAME=OR4F5;SNPEFF_IMPACT=LOW;SNPEFF_TRANSCRIPT_ID=ENST00000534990;VQSLOD=-3.545e+01;culprit=MQ
1       69428   rs140739101     T       G       46115.5 VQSRTrancheSNP99.50to99.90      AC=74;AF=0.053;AN=1390;BaseQRankSum=40.754;DB;DP=89129;Dels=0.00;FS=134.908;HaplotypeScore=0.2101;InbreedingCoeff=0.4406;MLEAC=66;MLEAF=0.047;MQ=15.57;MQ0=18658;MQRankSum=4.927;QD=10.09;ReadPosRankSum=-8.476;SNPEFF_AMINO_ACID_CHANGE=F113C;SNPEFF_CODON_CHANGE=tTt/tGt;SNPEFF_EFFECT=NON_SYNONYMOUS_CODING;SNPEFF_EXON_ID=exon_1_69091_70008;SNPEFF_FUNCTIONAL_CLASS=MISSENSE;SNPEFF_GENE_BIOTYPE=protein_coding;SNPEFF_GENE_NAME=OR4F5;SNPEFF_IMPACT=MODERATE;SNPEFF_TRANSCRIPT_ID=ENST00000335137;VQSLOD=-2.032e+01;culprit=MQ
written 5.4 years ago by ttom

5.4 years ago, ttom wrote:

Querying the annotation file directly with tabix gave no output, which pointed to a problem with the index:

tabix annotation.txt.gz 1:69428-69428

These are the commands I had used earlier to bgzip and tabix-index the annotation file:

bgzip annotation.txt; tabix -s 1 -b 2 -e 3 annotation.txt.gz

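I later realized that the tabix options were incorrect: -e 3 set the end column to column 3, which holds the INFO/DBSNP values rather than a position. I indexed the file again with column 2 as both the start and the end column: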

bgzip annotation.txt; tabix -s 1 -b 2 -e 2 annotation.txt.gz # start and end column both set to column 2
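
A quick sanity check before indexing is to confirm that the columns passed to -b and -e contain only integer positions. This is a small sketch rather than anything tabix provides; non_numeric_count is a hypothetical helper name and the input is assumed to be the uncompressed annotation file.

```shell
# Count data lines whose given column is not a plain non-negative integer.
# A column that is suitable for tabix -b or -e should give a count of 0.
non_numeric_count() {
    file=$1
    col=$2
    awk -v c="$col" '
        /^#/ || !NF { next }            # skip comment and blank lines
        $c !~ /^[0-9]+$/ { n++ }        # value is not an integer position
        END { print n + 0 }
    ' "$file"
}
```

For the annotation file in this thread, non_numeric_count annotation.txt 3 would count every data line, because column 3 holds values such as dbSNP_134, while column 2 holds positions. A header line without a leading '#' is counted as well.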

zcat sample.vcf.gz | vcf-annotate -a annotation.txt.gz -d key=INFO,ID=DBSNP,Number=1,Type=String,Description='My annotation' -c CHROM,FROM,INFO/DBSNP >test_annotated.vcf

grep 69428 test_annotated.vcf # could see that the DBSNP annotation was added to the INFO column

1       69428   rs140739101     T       G       46115.5 VQSRTrancheSNP99.50to99.90      AC=74;AF=0.053;AN=1390;BaseQRankSum=40.754;DB;DP=89129;Dels=0.00;FS=134.908;HaplotypeScore=0.2101;
InbreedingCoeff=0.4406;MLEAC=66;MLEAF=0.047;MQ=15.57;MQ0=18658;MQRankSum=4.927;QD=10.09;ReadPosRankSum=-8.476;
SNPEFF_AMINO_ACID_CHANGE=F113C;SNPEFF_CODON_CHANGE=tTt/tGt;SNPEFF_EFFECT=NON_SYNONYMOUS_CODING;
SNPEFF_EXON_ID=exon_1_69091_70008;SNPEFF_FUNCTIONAL_CLASS=MISSENSE;SNPEFF_GENE_BIOTYPE=protein_coding;
SNPEFF_GENE_NAME=OR4F5;SNPEFF_IMPACT=MODERATE;SNPEFF_TRANSCRIPT_ID=ENST00000335137;VQSLOD=-2.032e+01;
culprit=MQ;**DBSNP=dbSNP_134**
modified 5.4 years ago by Istvan Albert • written 5.4 years ago by ttom
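
The grep on a single position in the answer above confirms one record. To see how many records were annotated overall, something like the following sketch could be used; count_annotated is a hypothetical helper name, and it assumes a tab-delimited VCF with INFO in column 8.

```shell
# Count data records whose INFO column (column 8) carries the given key,
# for example DBSNP. The key must start the field or follow a semicolon,
# so a longer key that merely ends in the same letters is not counted.
count_annotated() {
    file=$1
    key=$2
    awk -F'\t' -v k="$key" '
        /^#/ { next }                             # skip header lines
        $8 ~ ("(^|;)" k "(=|;|$)") { n++ }        # key present in INFO
        END { print n + 0 }
    ' "$file"
}
```

For example, count_annotated test_annotated.vcf DBSNP prints the number of variants that received the new tag; a result of 0 would reproduce the original problem.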

Thanks for following up with the answer.

written 5.4 years ago by Istvan Albert

Petr Danecek helped with the troubleshooting.

written 5.4 years ago by ttom