Question: vcftools FORMAT ID error during filtering
Asked 4.6 years ago by wickell.david (United States):

I have been using vcftools v0.1.12b (Mac OS X 10.9.5) to filter SNP data quite successfully for the last month or so. Recently I ran into a problem when trying to filter a group of .vcf files generated in TASSEL. Below is an example of the kind of command I am using (to be clear, this happens with every vcftools function I have tried), along with the file header and terminal output, in the hope that it is something stupid and I have just been awake too long to see it... ANY help would be greatly appreciated.

I should also point out that the Perl tools (such as "vcf-concat") seem to work fine with this dataset.

vcftools --vcf ../vcf/out.vcf --min-meanDP 50 --out mDP50 --recode

 

##fileformat=VCFv4.0
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth (only filtered reads used for calling)">
##FORMAT=<ID=GQ,Number=1,Type=Float,Description="Genotype Quality">
##FORMAT=<ID=PL,Number=3,Type=Float,Description="Normalized, Phred-scaled likelihoods for AA,AB,BB genotypes where A=ref and B=alt; not applicable if site is not biallelic">
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency">


VCFtools - v0.1.12b
(C) Adam Auton and Anthony Marcketta 2009

Parameters as interpreted:
    --vcf ../vcf/out.vcf
    --min-meanDP 50
    --out mDP50
    --recode


Error: ID required in FORMAT field description: =<ID=PL,Number=3,Type=Float,Description="Normalized, Phred-scaled likelihoods for AA,AB,BB genotypes where A=ref and B=alt; not applicable if site is not biallelic">

The error seems to say that this FORMAT field has no ID, and yet the line quoted in the error itself contains ID=PL!

-UPDATE-

I'm not sure it really qualifies as a fix, but I did end up getting my file to work. I'm just not quite sure why it works... I copied and pasted the entire document into a new text file and saved it in a newly created directory, if that makes any difference. I did NOT make any edits to the text of the final document; I merely copied and pasted it in chunks using TextEdit. Aside from their names (gType2.vcf vs. out.vcf), the two files look identical. But hey, it works, so I'm not going to complain.
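Whether the two files really are identical can be checked at the byte level, since differences such as line endings do not show up in a text editor. The following is only a minimal sketch: the helper name `first_difference` is made up for illustration, and the file names in the usage comment are the ones mentioned above.

```python
from itertools import zip_longest


def first_difference(data_a: bytes, data_b: bytes):
    """Return (offset, byte_a, byte_b) for the first differing byte.

    Returns None when both inputs are byte-for-byte identical. If one
    input is shorter, the missing byte is reported as None.
    """
    for offset, (a, b) in enumerate(zip_longest(data_a, data_b)):
        if a != b:
            return offset, a, b
    return None


# Hypothetical usage with the two files from this post:
#
#   with open("out.vcf", "rb") as old, open("gType2.vcf", "rb") as new:
#       print(first_difference(old.read(), new.read()))
```

A result of `None` would mean the files match exactly; anything else gives the offset at which to start looking. Reading both files fully into memory is fine for a quick check but may be slow for very large VCFs.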

As for the answer portion of this post: would anyone like to speculate about what the original problem may have been, or why copy-paste-save was the solution? Could my original file have been corrupted somehow? I am particularly interested because another grad student had almost exactly the same problem with an input file for Pal_finder3 (a Perl script) on the same computer. A different error (obviously), but the same solution... Sorry for the long post, and thanks again!
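One way to test the corruption idea is to look for bytes in the original file that a text editor would not display, for example a byte-order mark, carriage returns, or other control characters. The sketch below is an assumption about what might be worth checking, not a confirmed diagnosis; the function name `find_hidden_bytes` and the `max_lines` cutoff are invented for this example.

```python
def find_hidden_bytes(raw: bytes, max_lines: int = 200) -> dict:
    """Map 1-based line numbers to (position, byte value) pairs for bytes
    that are neither printable ASCII nor a tab.

    Lines are split on LF only, so a stray carriage return is reported
    as byte 13. Only the first `max_lines` lines are scanned, which is
    normally enough to cover a VCF header.
    """
    findings = {}
    for number, line in enumerate(raw.split(b"\n")[:max_lines], start=1):
        odd = [
            (pos, value)
            for pos, value in enumerate(line)
            if value != 0x09 and not 0x20 <= value <= 0x7E
        ]
        if odd:
            findings[number] = odd
    return findings


# Hypothetical usage on the file that triggered the error:
#
#   with open("../vcf/out.vcf", "rb") as handle:
#       print(find_hidden_bytes(handle.read()))
```

An empty result would suggest the header contains nothing unusual at the byte level. Note that legitimate non-ASCII text in a Description field would also be reported, so any hit needs to be judged in context.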

-David
