Why Does Vcftools Vcf-Annotate Produce An Empty File
11.1 years ago
ttom ▴ 220

I am trying to annotate a VCF file by adding DBSNP information to the INFO column, using the annotation file shown below:

CHROM   FROM INFO/DBSNP
1       69428   dbSNP_134
1       69476   dbSNP_134
1       69496   dbSNP_134
1       69511   dbSNP_131
1       69590   dbSNP_134
1       69594   dbSNP_134

Both the VCF file and the annotation file were bgzipped and tabix-indexed. I used the following vcftools vcf-annotate command:

 zcat sample.vcf.gz | vcf-annotate -a annotation.txt.gz -d key=INFO,ID=DBSNP,Number=1,Type=String,Description='My annotation' -c CHROM,FROM,INFO/DBSNP >test_annotated.vcf

The command ran without any errors and created the output VCF file 'test_annotated.vcf'. The header gained a line describing INFO/DBSNP, but the DBSNP information was not added to any variant, even though the VCF and the annotation file have variants with the same CHROM and POS in common. I am not sure why the variants are not getting annotated.
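A quick way to quantify the symptom described above is to count how many variant records in the output carry the new INFO key. The following is only a sketch using standard tools; the function name count_annotated is hypothetical, and it assumes an uncompressed, tab-delimited VCF with INFO in column 8.

```shell
# Report how many variant records a VCF has and how many carry a given INFO key.
# count_annotated is a hypothetical helper, not part of vcftools.
count_annotated() {
    vcf=$1   # uncompressed VCF, e.g. test_annotated.vcf
    key=$2   # INFO key to look for, e.g. DBSNP
    awk -F'\t' -v key="$key" '
        /^#/ { next }                                # skip header lines
        { total++ }                                  # every other line is a record
        $8 ~ ("(^|;)" key "(=|;|$)") { hit++ }       # key at start of INFO or after ";"
        END { printf "%d records, %d with %s\n", total, hit, key }
    ' "$vcf"
}

# Usage (file name taken from the question):
# count_annotated test_annotated.vcf DBSNP
```

If the second number is zero while the first is not, the annotation lookup is failing rather than the VCF being dropped.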

Has anyone done this successfully before? Any help is much appreciated. I posted the same question over at VCFtools-help (http://bit.ly/ZjT3YG). Petr or Adam, I would appreciate some help :)

Thanks, Tinu


I would show a few lines of the annotation.txt.gz file as well, just to make sure that the chromosome names are identical to those in the VCF. A mismatch there is the most common reason.
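The chromosome-name check suggested here can be sketched with standard tools. This is an illustration only: the helper names chrom_names and missing_chroms are hypothetical, and it assumes both files are gzip/bgzip-compressed, whitespace-delimited, with the chromosome in column 1.

```shell
# List the distinct chromosome names in column 1 of a compressed file,
# skipping blank lines and header lines that start with '#'.
chrom_names() {
    gzip -dc "$1" | awk '!/^#/ && NF { print $1 }' | sort -u
}

# Print chromosome names that occur in the VCF but not in the annotation file.
# Any name printed here (e.g. "chr1" vs "1") can never be matched.
missing_chroms() {
    vcf_names=$(mktemp)
    ann_names=$(mktemp)
    chrom_names "$1" > "$vcf_names"
    chrom_names "$2" > "$ann_names"
    comm -23 "$vcf_names" "$ann_names"    # lines unique to the first list
    rm -f "$vcf_names" "$ann_names"
}

# Usage (file names taken from the question):
# missing_chroms sample.vcf.gz annotation.txt.gz
```

No output means every chromosome name in the VCF also appears in the annotation file, so the cause lies elsewhere.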


I have already included a few lines from the annotation.txt file above, but here they are once more:

CHROM   FROM INFO/DBSNP
1       69428   dbSNP_134
1       69476   dbSNP_134
1       69496   dbSNP_134
1       69511   dbSNP_131
1       69590   dbSNP_134
1       69594   dbSNP_134

And here are a few lines from sample.vcf, the input VCF file to which I need to add the annotations:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
1       69270   .       A       G       2068.69 VQSRTrancheSNP99.50to99.90      AC=126;AF=0.818;AN=154;BaseQRankSum=-3.337;DP=120201;Dels=0.00;FS=0.000;HaplotypeScore=0.0549;InbreedingCoeff=0.2860;MLEAC=129;MLEAF=0.838;MQ=2.04;MQ0=30366;MQRankSum=5.157;QD=0.29;ReadPosRankSum=-2.035;SNPEFF_AMINO_ACID_CHANGE=S108;SNPEFF_CODON_CHANGE=tcA/tcG;SNPEFF_EFFECT=SYNONYMOUS_CODING;SNPEFF_EXON_ID=exon_1_69037_69829;SNPEFF_FUNCTIONAL_CLASS=SILENT;SNPEFF_GENE_BIOTYPE=protein_coding;SNPEFF_GENE_NAME=OR4F5;SNPEFF_IMPACT=LOW;SNPEFF_TRANSCRIPT_ID=ENST00000534990;VQSLOD=-3.545e+01;culprit=MQ
1       69428   rs140739101     T       G       46115.5 VQSRTrancheSNP99.50to99.90      AC=74;AF=0.053;AN=1390;BaseQRankSum=40.754;DB;DP=89129;Dels=0.00;FS=134.908;HaplotypeScore=0.2101;InbreedingCoeff=0.4406;MLEAC=66;MLEAF=0.047;MQ=15.57;MQ0=18658;MQRankSum=4.927;QD=10.09;ReadPosRankSum=-8.476;SNPEFF_AMINO_ACID_CHANGE=F113C;SNPEFF_CODON_CHANGE=tTt/tGt;SNPEFF_EFFECT=NON_SYNONYMOUS_CODING;SNPEFF_EXON_ID=exon_1_69091_70008;SNPEFF_FUNCTIONAL_CLASS=MISSENSE;SNPEFF_GENE_BIOTYPE=protein_coding;SNPEFF_GENE_NAME=OR4F5;SNPEFF_IMPACT=MODERATE;SNPEFF_TRANSCRIPT_ID=ENST00000335137;VQSLOD=-2.032e+01;culprit=MQ
11.1 years ago
ttom ▴ 220

Querying the annotation file for a position that I know is in it gave no output:

tabix annotation.txt.gz 1:69428-69428

This is the command I had used earlier to bgzip and tabix-index the annotation file:

bgzip annotation.txt; tabix -s 1 -b 2 -e 3 annotation.txt.gz

I later realized that the tabix options were incorrect: -e 3 points the end-position column at the INFO/DBSNP column, which is not a coordinate. I indexed the file again, giving column 2 as both the start and the end column:

bgzip annotation.txt; tabix -s 1 -b 2 -e 2 annotation.txt.gz

Then I reran the annotation:

zcat sample.vcf.gz | vcf-annotate -a annotation.txt.gz -d key=INFO,ID=DBSNP,Number=1,Type=String,Description='My annotation' -c CHROM,FROM,INFO/DBSNP >test_annotated.vcf

grep 69428 test_annotated.vcf

Now the DBSNP annotation has been added to the INFO column:

1       69428   rs140739101     T       G       46115.5 VQSRTrancheSNP99.50to99.90      AC=74;AF=0.053;AN=1390;BaseQRankSum=40.754;DB;DP=89129;Dels=0.00;FS=134.908;HaplotypeScore=0.2101;
InbreedingCoeff=0.4406;MLEAC=66;MLEAF=0.047;MQ=15.57;MQ0=18658;MQRankSum=4.927;QD=10.09;ReadPosRankSum=-8.476;
SNPEFF_AMINO_ACID_CHANGE=F113C;SNPEFF_CODON_CHANGE=tTt/tGt;SNPEFF_EFFECT=NON_SYNONYMOUS_CODING;
SNPEFF_EXON_ID=exon_1_69091_70008;SNPEFF_FUNCTIONAL_CLASS=MISSENSE;SNPEFF_GENE_BIOTYPE=protein_coding;
SNPEFF_GENE_NAME=OR4F5;SNPEFF_IMPACT=MODERATE;SNPEFF_TRANSCRIPT_ID=ENST00000335137;VQSLOD=-2.032e+01;
culprit=MQ;**DBSNP=dbSNP_134**
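The re-indexing and the sanity query from this answer can be combined into one step. This is a sketch under stated assumptions: bgzip and tabix are on the PATH, the annotation file is already bgzipped and position-sorted, and any header line starts with '#'. The function name reindex_and_check is hypothetical.

```shell
# Rebuild the tabix index of a single-position annotation file, then confirm
# that a point query returns at least one record.
# reindex_and_check is a hypothetical helper, not part of tabix or vcftools.
reindex_and_check() {
    ann_gz=$1    # bgzipped annotation file, e.g. annotation.txt.gz
    region=$2    # a site known to be in the file, e.g. 1:69428-69428
    # -s: chromosome column, -b: start column, -e: end column.
    # Start and end both use column 2 because each row describes one position.
    # -f overwrites an index left over from an earlier attempt.
    tabix -f -s 1 -b 2 -e 2 "$ann_gz" || return 1
    # Succeeds only if the query returns something; an empty result means
    # vcf-annotate would find nothing at this site either.
    [ -n "$(tabix "$ann_gz" "$region")" ]
}

# Usage (file name and region taken from the answer):
# reindex_and_check annotation.txt.gz 1:69428-69428 && echo "index looks usable"
```

Running the point query before vcf-annotate makes an indexing mistake visible immediately, instead of showing up later as an unannotated VCF.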

Thanks for following up with the answer.


Petr Danecek helped with the troubleshooting.
