Question: Fq In Vcf
gravatar for potassiumiodide0990
5.0 years ago by
United States
potassiumiodide099070 wrote:

Hi everyone! I'm a very very new to these variant formats. So please, kindly help me out. I have a SNP vcf file. And I need the hom and het variants. But I dont want to use the 012 output format of vcftools, its just too much of hassle. So I was thinking of using bcftools/samtools. I was thinking of using the FQ tag for this task. FQ : Consensus quality. If positive, FQ equals the phred-scaled probability of there being two or more different alleles. If negative, FQ equals the minus phred-scaled probability of all chromosomes being identical. Notably, given one sample, FQ is positive at hets and negative at homs. But my vcf file does not contain FQ tags in the INFO column. So I'm guessing I'd have to use vcf2fq command to get FQ. How do I go about issuing this command? Please help.

vcf samtools bcftools vcftools • 2.1k views
ADD COMMENTlink modified 5.0 years ago by Ashutosh Pandey11k • written 5.0 years ago by potassiumiodide099070
gravatar for Biomonika (Noolean)
5.0 years ago by
State College, PA, USA
Biomonika (Noolean)3.0k wrote:

I am not sure if I understand your problem, but wouldn't just looking at AF enough?

In your VCF file in 8th column, AF means allele frequency for each ALT allele in the same order as listed. So essentially AF=0.5 would be heterozygous site.

ADD COMMENTlink written 5.0 years ago by Biomonika (Noolean)3.0k
gravatar for Ashutosh Pandey
5.0 years ago by
Ashutosh Pandey11k wrote:

vcf2fq command is a part of program and is used to generate consensus sequence for a given diploid organism. vcf2fq input.vcf > consensus.fq

But depending on your need I think using the GT tag in the vcf file should be the right approach. For a diploid organisms, you will see on of these values: 1/1, 0/1 or 0/0 in each row.

1/1 represents homozygous SNP. In other words, both the chromosomes in a homologous pair show different nucleotides.

0/1 represents heterozygous SNP. In other words, one chromosome from the pair of homologous chromosomes shows different nucleotide and the other chromosome has the same nucleotide as reference.

0/0 represents homozygous reference represents "no SNP" as each chromosome in a homologous pair contains the same nucleotide as reference.

All you need to do is grep these rows from your vcf file. For example for case 1/1,

egrep "^#|1/1:" Input.vcf > Homozygous_SNP.vcf

ADD COMMENTlink modified 5.0 years ago • written 5.0 years ago by Ashutosh Pandey11k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1833 users visited in the last hour