1000 Genome Derived Allele Frequency
7.6 years ago

Hi all,

I am looking for a way to extract the derived allele frequency (DAF) of variants from the 1000 Genomes VCF: ALL.wgs.phase3_shapeit2_mvncall_integrated_v5a.20130502.sites.vcf.gz

e.g.

1    10177    rs367896724    A    AC    100    PASS    AC=2130;AF=0.425319;AN=5008;NS=2504;DP=103152;EAS_AF=0.3363;AMR_AF=0.3602;AFR_AF=0.4909;EUR_AF=0.4056;SAS_AF=0.4949;AA=|||unknown(NO_COVERAGE);VT=INDEL


Is the AF=0.425319 in the INFO field equivalent to the derived allele frequency?

Chung

vcf 1000genome DAF • 4.4k views
7.6 years ago
lh3 33k

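Please read the VCF spec first. AF is the alternate allele frequency. AA is the ancestral allele. If both are present, you can use them to get the derived allele frequency: when AA matches REF, the ALT allele is the derived one and DAF = AF; when AA matches ALT, flip AF (DAF = 1 - AF).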
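For illustration only, here is a minimal Python sketch of that polarization step. It is not an official tool, and the function name `derived_allele_frequency` is made up for this example. It assumes a biallelic record, that the first `|`-separated field of AA holds the ancestral allele (as in the 1000 Genomes lines in this thread), and that letter case in AA can be ignored. Comparing AA directly against REF/ALT is a simplification for indels, so treat indel results with care.

```python
from typing import Optional


def derived_allele_frequency(vcf_line: str) -> Optional[float]:
    """Return the DAF for one biallelic sites-VCF record, or None if it cannot be polarized.

    Hypothetical helper for illustration; assumes INFO contains no whitespace.
    """
    fields = vcf_line.split()
    ref, alt, info = fields[3], fields[4], fields[7]

    # INFO is a ';'-separated list of KEY=VALUE pairs (flags carry no '=').
    info_map = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)

    # Skip multiallelic records and records missing AF or AA.
    if "," in alt or "AF" not in info_map or "AA" not in info_map:
        return None

    af = float(info_map["AF"])

    # Assumption: the ancestral allele is the first '|'-separated field of AA.
    ancestral = info_map["AA"].split("|")[0].upper()

    if ancestral == ref.upper():
        return af          # ALT is derived, so DAF = AF
    if ancestral == alt.upper():
        return 1.0 - af    # REF is derived, so DAF = 1 - AF
    return None            # unknown or ambiguous ancestral state


unknown_aa = "1 10177 rs367896724 A AC 100 PASS AC=2130;AF=0.425319;AN=5008;AA=|||unknown(NO_COVERAGE);VT=INDEL"
print(derived_allele_frequency(unknown_aa))  # None: ancestral allele is unknown
```

Sites where the ancestral allele is unknown return `None`, so they can be filtered out rather than silently treated as AF.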

Hi,

I would like to ask, using a new example, about extracting the DAF from the 1000 Genomes VCF.

In the prior post, the example VCF line does not contain ancestral allele information (it is shown as "unknown"). I suppose that if the ancestral allele is not given for a variant, we cannot extract the DAF from that VCF line.

1 10177 rs367896724 A AC 100 PASS AC=2130;AF=0.425319;AN=5008;NS=2504;DP=103152;EAS_AF=0.3363;AMR_AF=0.3602;AFR_AF=0.4909;EUR_AF=0.4056;SAS_AF=0.4949;AA=A|||;VT=INDEL


If the variant has an ancestral allele, such as AA=A in the new example, then the ancestral and reference alleles are the same base for this variant.

Is AF=0.425319 equivalent to the DAF in this case? I think flipping AF (1 - AF) is not required here.

suimye