Question: SnpSift extractFields error: Cannot find 'Description' in info line: '##FORMAT=<ID=AD,Number=.,Type=Integer,Description=Allelic depths for the ref and alt alleles in the order listed>
t250 (Netherlands) wrote, 3.6 years ago:

Hi all, I'm trying to use SnpSift extractFields (SnpSift 4.1g, build 2015-05-17) to convert an mpileup gVCF file into a tab-delimited text file. This has worked for me with previous files, but those were not gVCF (GATK HaplotypeCaller) files.

I have also tried exporting the file with GATK's VariantsToTable, but it does not accept the SnpEff fields. I did not add the SnpEff annotations myself; they come from a colleague.

The SnpSift command I use:

java -jar ~/void/tools/snpEff/SnpSift.jar extractFields -s "," -e "." grep.vcf  CHROM POS ID REF ALT "ANN[*].GENE" "ANN[*].IMPACT" "ANN[*].EFFECT"  >out.txt

I get the error Cannot find 'Description' in info line:

'##FORMAT=<ID=AD,Number=.,Type=Integer,Description=Allelic depths for the ref and alt alleles in the order listed>

Oddly, I do get output if I extract only CHROM POS ID REF ALT and leave out the extra fields. But I want the extra information that was added with SnpEff.

My VCF header is too big to post on Biostars, but I can email it directly to anyone who thinks they can help me fix the output table problem. I have emailed the developer about it but have not had a response in 4 days.

Thanks very much, Tesa

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 3.6 years ago:

I would try adding quotes around the description.

sed -e 's/Description=Allelic depths/Description="Allelic depths/' -e 's/ in the order listed>/ in the order listed">/' grep.vcf > grep2.vcf

Hi Pierre, thanks for this. I think you are on to something. It didn't fix the problem outright, but I now get a new error, which apparently points to the next offending line.

Cannot find 'Description' in info line: '##FORMAT=<ID=DP,Number=1,Type=Integer,Description=Approximate read depth (reads with MQ=255 or with bad mates are filtered)>

I hope there aren't a bazillion of these lines. Any idea how to find the offenders more efficiently than one by one?

Cheers, Tesa
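
One possible way to handle all such lines in a single pass, rather than one by one, is sketched below. It generalizes the sed approach above and rests on some assumptions: that Description is the last key in each ##INFO, ##FORMAT, or ##FILTER header entry, and that those lines end in ">" with no trailing whitespace or carriage return. The function names list_unquoted and quote_descriptions are made up for illustration and are not part of SnpSift or any other tool.

```shell
# Hypothetical helpers (names are illustrative only).
# Both read VCF text on stdin and write to stdout.

# Print, with line numbers, every header line whose Description
# value does not start with a double quote.
list_unquoted() {
  grep -nE '^##(INFO|FORMAT|FILTER)=<.*Description=[^"]'
}

# Wrap unquoted Description values in double quotes.
# Assumes Description is the last key and the line ends in '>'.
# Lines that are already quoted, and all data lines, pass through unchanged.
quote_descriptions() {
  sed -E '/^##(INFO|FORMAT|FILTER)=</ s/Description=([^"].*)>$/Description="\1">/'
}
```

With these defined, "list_unquoted < grep.vcf" would show the offending header lines, and "quote_descriptions < grep.vcf > grep2.vcf" would write a repaired copy. It is worth checking the header of grep2.vcf by eye before running extractFields on it.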
