Allele frequency from VCF
3.1 years ago
niyomiw88 • 0

I am trying to figure out why the allele frequencies I get from Plink and vcftools differ from the AF values in my VCF file. The VCF was produced by GATK joint genotyping and filtered with vcftools, and it contains 448 samples. Here is one line from the VCF (cut off after the first few samples):

DS235882        421     .       T       A       10758.8 .       AC=29;AF=0.186;AN=156;BaseQRankSum=0.394;DP=1264;ExcessHet=0.0000;FS=1.155;InbreedingCoeff=0.6596;MLEAC=32;MLEAF=0.205;MQ=59.92;MQRankSum=0.00;QD=33.41;ReadPosRankSum=0.335;SOR=0.617  GT:AD:DP:GQ:PGT:PID:PL:PS       0/0:22,0:22:60:.:.:0,60,900:.   ./.:0,47:.:99:1|1:405_G_A:2166,147,0:405        ./
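As a sanity check on the INFO field itself, the AF value should equal AC divided by AN. Below is a minimal Python sketch of that check, using a shortened copy of the INFO string from the line above. The helper name `parse_info` is made up for illustration, and the sketch assumes a biallelic site, so that AC holds a single value.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flags without '=' map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


# Shortened INFO string from the VCF line above (biallelic site assumed).
info = "AC=29;AF=0.186;AN=156;DP=1264"
fields = parse_info(info)

ac = int(fields["AC"])  # number of ALT alleles among called genotypes
an = int(fields["AN"])  # total number of called alleles
af = ac / an

print(f"AC/AN = {af:.3f}, AF in INFO = {fields['AF']}")
```

For this line AC/AN matches the stored AF to three decimals. Assuming diploid calls, AN=156 also means that only 78 of the 448 samples have a called genotype at this site.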

I computed the allele frequencies with vcftools:

vcftools --vcf input.vcf --freq2 --out allele_freq --max-alleles 2

and this is what I got:

CHROM   POS N_ALLELES   N_CHR   {FREQ}
DS235882    101 2   140 1   0
DS235882    117 2   148 1   0
DS235882    128 2   150 1   0
DS235882    206 2   160 1   0

I also tried Plink:

plink --vcf input.vcf --freq --allow-extra-chr --out allele_freq

which gave:

#CHROM  ID  REF ALT ALT_FREQS   OBS_CT
DS235882    .   T   A   0   140
DS235882    .   T   A   0   148
DS235882    .   A   T   0   150
DS235882    .   A   G   0   160

vcftools gives 1 and 0 in the two frequency columns at every site, and Plink gives an ALT frequency of 0 at every site. What did I do wrong? Any advice or tips? I also extracted the AF values from the INFO column of my VCF, and they look like this:

DS235882 101 0.012
DS235882 117 0.041
DS235882 128 0.042
DS235882 206 0.012
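One way to cross-check both tools would be to count alleles directly from the GT strings of a record, skipping missing calls. The Python sketch below is only an illustration: the function name `allele_freqs` and the example genotype list are hypothetical and not taken from my file.

```python
import re


def allele_freqs(genotypes, n_alleles=2):
    """Count alleles in GT strings such as '0/1' or '1|1'.

    Missing alleles ('.') are skipped. Returns (called_alleles, freqs),
    where freqs is None if no allele was called at the site.
    """
    counts = [0] * n_alleles
    for gt in genotypes:
        # Alleles are separated by '/' (unphased) or '|' (phased).
        for allele in re.split(r"[/|]", gt):
            if allele != ".":
                counts[int(allele)] += 1
    total = sum(counts)
    if total == 0:
        return 0, None
    return total, [c / total for c in counts]


# Hypothetical genotypes for one biallelic site (not from the real VCF).
example = ["0/0", "./.", "0/1", "1|1"]
n_called, freqs = allele_freqs(example)
print(n_called, freqs)
```

Here the first frequency is for the REF allele and the second for ALT, the same order as the vcftools --freq2 columns, and the number of called alleles plays the role of N_CHR / OBS_CT.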

I would like to understand why these analyses do not reproduce the AF values in the VCF. Thanks in advance.

SNP genome • 1.0k views

Hi,

Did you figure this out? I have a similar issue with allele counts: I get higher allele counts when the samples are processed with GATK 3.8 than with GATK 4.
