Gatk Vcf File To Consensus Sequence Use Vcfutils.Pl
2
0
Entering edit mode
12.0 years ago
Yu ▴ 140

Hi, I have a GATK vcf file and want to get the consensus sequence. I know samtools-bcftools-vcfutils pipeline to get the consensus sequence, but I try use GATK and vcfutils to get the result. However, when I input GATK vcf file to vcfutils, the program report

Use of uninitialized value in addition (+) at /my/bin/vcfutils.pl line 518, <> line 94433.
Use of uninitialized value in numeric lt (<) at my/bin/vcfutils.pl line 508, <> line 94434.

and didn't get the consensus sequence.( file context are N)

The code GATK get vcf file

java -Xmx4g -jar /my/GenomeAnalysisTK-1.5-30/GenomeAnalysisTK.jar -R /my/Drosophila3R.fa -T UnifiedGenotyper -I IN.3R.sam.GATK1.bam -o snps.raw.EMIT_ALL_CONFIDENT_SITES.vcf -out_mode EMIT_ALL_CONFIDENT_SITES

The code samtools-bcftools get vcf file

samtools mpileup -uD  -f Drosophila3R.fa sorted.3R.bam | bcftools view -cg  -  > test.vcf

snps.raw.EMITALLCONFIDENT_SITES.vcf

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT whatever

3R 4 . T . 33.01 . AC=0;AF=0.00;AN=2;DP=1;MQ=37.00;MQ0=0 GT:DP 0/0:1

3R 5 . T . 39.01 . AC=0;AF=0.00;AN=2;DP=3;MQ=37.00;MQ0=0 GT:DP 0/0:3

3R 6 . C . 42.03 . AC=0;AF=0.00;AN=2;DP=4;MQ=37.00;MQ0=0 GT:DP 0/0:4

test.vcf

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sorted.3R.bam

3R 4 . T . 33 . DP=1;AF1=0;AC1=0;DP4=0,1,0,0;MQ=37;FQ=-30 PL:DP 0:1

3R 5 . T . 39 . DP=3;AF1=0;AC1=0;DP4=0,3,0,0;MQ=37;FQ=-36 PL:DP 0:3

3R 6 . C . 42 . DP=4;AF1=0;AC1=0;DP4=1,3,0,0;MQ=37;FQ=-39 PL:DP 0:4

I don't know why the vcfuilts.pl can't get the consensus sequence from GATK vcf file. Can anybody help me to fix the error?

gatk • 8.0k views
ADD COMMENT
1
Entering edit mode
12.0 years ago
lh3 33k

Vcfutils.pl requires the FQ INFO, the consensus quality. To convert GATK vcf, you'd better write a script by yourself. You can use GT.

ADD COMMENT
0
Entering edit mode

thanks a lot :)

by the way,is there a way of calculating FQ INFO from the data GATK gives?

Eureka!

I find the FQ==QUAL-3

so I can calcualting it use QUAL and use GT to distinguish heterozygotes and homozygotes

ADD REPLY
1
Entering edit mode

Hi

Actually I am also trying the similar thing, can you tell from where you got the information "FQ==QUAL-3". Is it in some manual of samtools.

Thanks

ADD REPLY
1
Entering edit mode
12.0 years ago

The dot in the reference sequences might be a problem. But yes, the vcfutils script defiantely does use FQ. So either manually add that to your vcf, or you can change vcfutils. It's in perl, which means it's a plain text file, so it's not too hard to alter, though you'll need to learn how to read regex'es. Changing the existing script is probably easier than writing a new script from scratch.

I wrote a bit about that part of the program here;

http://seqanswers.com/forums/showthread.php?t=11651

ADD COMMENT
0
Entering edit mode

thanks a lot :)

ADD REPLY

Login before adding your answer.

Traffic: 1757 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6