Question: Gatk Vcf File To Consensus Sequence Use Vcfutils.Pl
0
gravatar for Yu
7.4 years ago by
Yu100
China
Yu100 wrote:

Hi, I have a GATK vcf file and want to get the consensus sequence. I know samtools-bcftools-vcfutils pipeline to get the consensus sequence, but I try use GATK and vcfutils to get the result. However, when I input GATK vcf file to vcfutils, the program report

Use of uninitialized value in addition (+) at /my/bin/vcfutils.pl line 518, <> line 94433.
Use of uninitialized value in numeric lt (<) at my/bin/vcfutils.pl line 508, <> line 94434.

and didn't get the consensus sequence.( file context are N)

The code GATK get vcf file

java -Xmx4g -jar /my/GenomeAnalysisTK-1.5-30/GenomeAnalysisTK.jar -R /my/Drosophila3R.fa -T UnifiedGenotyper -I IN.3R.sam.GATK1.bam -o snps.raw.EMIT_ALL_CONFIDENT_SITES.vcf -out_mode EMIT_ALL_CONFIDENT_SITES

The code samtools-bcftools get vcf file

samtools mpileup -uD  -f Drosophila3R.fa sorted.3R.bam | bcftools view -cg  -  > test.vcf

snps.raw.EMITALLCONFIDENT_SITES.vcf

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT whatever

3R 4 . T . 33.01 . AC=0;AF=0.00;AN=2;DP=1;MQ=37.00;MQ0=0 GT:DP 0/0:1

3R 5 . T . 39.01 . AC=0;AF=0.00;AN=2;DP=3;MQ=37.00;MQ0=0 GT:DP 0/0:3

3R 6 . C . 42.03 . AC=0;AF=0.00;AN=2;DP=4;MQ=37.00;MQ0=0 GT:DP 0/0:4

test.vcf

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sorted.3R.bam

3R 4 . T . 33 . DP=1;AF1=0;AC1=0;DP4=0,1,0,0;MQ=37;FQ=-30 PL:DP 0:1

3R 5 . T . 39 . DP=3;AF1=0;AC1=0;DP4=0,3,0,0;MQ=37;FQ=-36 PL:DP 0:3

3R 6 . C . 42 . DP=4;AF1=0;AC1=0;DP4=1,3,0,0;MQ=37;FQ=-39 PL:DP 0:4

I don't know why the vcfuilts.pl can't get the consensus sequence from GATK vcf file. Can anybody help me to fix the error?

gatk • 5.6k views
ADD COMMENTlink written 7.4 years ago by Yu100
1
gravatar for lh3
7.4 years ago by
lh331k
United States
lh331k wrote:

Vcfutils.pl requires the FQ INFO, the consensus quality. To convert GATK vcf, you'd better write a script by yourself. You can use GT.

ADD COMMENTlink written 7.4 years ago by lh331k

thanks a lot :)

by the way,is there a way of calculating FQ INFO from the data GATK gives?

Eureka!

I find the FQ==QUAL-3

so I can calcualting it use QUAL and use GT to distinguish heterozygotes and homozygotes

ADD REPLYlink modified 7.4 years ago • written 7.4 years ago by Yu100

Hi

Actually I am also trying the similar thing, can you tell from where you got the information "FQ==QUAL-3". Is it in some manual of samtools.

Thanks

ADD REPLYlink modified 5.1 years ago • written 5.1 years ago by vicky20
1
gravatar for swbarnes2
7.4 years ago by
swbarnes26.5k
United States
swbarnes26.5k wrote:

The dot in the reference sequences might be a problem. But yes, the vcfutils script defiantely does use FQ. So either manually add that to your vcf, or you can change vcfutils. It's in perl, which means it's a plain text file, so it's not too hard to alter, though you'll need to learn how to read regex'es. Changing the existing script is probably easier than writing a new script from scratch.

I wrote a bit about that part of the program here;

http://seqanswers.com/forums/showthread.php?t=11651

ADD COMMENTlink written 7.4 years ago by swbarnes26.5k

thanks a lot :)

ADD REPLYlink written 7.4 years ago by Yu100
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2109 users visited in the last hour