How to calculate Minor Allele Frequency (MAF) for your own population data
1
0
Entering edit mode
5.6 years ago

Dear All,

I have a VCF file containing 60 samples.

I want to calculate MAF for each variant with respect to 60 samples present in a VCF file.

Can anyone please tell me how to calculate MAF for you own population data?

Thank you In advance for your help

VCF MAF • 10.0k views
ADD COMMENT
0
Entering edit mode

This thread may help you....!!

ADD REPLY
0
Entering edit mode

Thank you, Nitin!! Will try to calculate manually using the mentioned formula in the link

ADD REPLY
1
Entering edit mode
5.6 years ago

You can add the AF and MAF via the BCFtools +fill-tags plugin.

See my answer, here, in particular part 4: A: How to use bcftools to calculate AF INFO field from AC and AN in VCF?

bcftools +fill-tags test.vcf
[W::bcf_hdr_check_sanity] PL should be declared as Number=G
5   135337248   .   CT  C   .   PASS    END=135337249;HOMLEN=3;HOMSEQ=TTT;SVLEN=-1;SVTYPE=DEL;AC=1;AN=2;NS=1;AF=0.5;MAF=0.5;AC_Het=1;AC_Hom=0;AC_Hemi=0;HWE=1;ExcHet=1  GT:AD   0/1:205,118
5   135337259   .   AG  A   .   PASS    END=135337260;HOMLEN=4;HOMSEQ=GGGG;SVLEN=-1;SVTYPE=DEL;AC=1;AN=2;NS=1;AF=0.5;MAF=0.5;AC_Het=1;AC_Hom=0;AC_Hemi=0;HWE=1;ExcHet=1 GT:AD   0/1:190,220
5   135337259   .   A   AG  .   PASS    END=135337259;HOMLEN=5;HOMSEQ=GGGGG;SVLEN=1;SVTYPE=INS;AC=1;AN=2;NS=1;AF=0.5;MAF=0.5;AC_Het=1;AC_Hom=0;AC_Hemi=0;HWE=1;ExcHet=1 GT:AD   0/1:192,71
5   135337264   .   GA  G   .   PASS    END=135337265;HOMLEN=1;HOMSEQ=A;SVLEN=-1;SVTYPE=DEL;AC=1;AN=2;NS=1;AF=0.5;MAF=0.5;AC_Het=1;AC_Hom=0;AC_Hemi=0;HWE=1;ExcHet=1    GT:AD   0/1:184,83
5   135337274   .   A   ATTATTGCATCAACTCCTCCGACATCTCTTCCCCTGCAAGAGTTCAGGCCCACAGGTTCTGGTGTGGGCTTGCTCAGCTGGAGGTAGCCTGAGGTGAGCTGGAG    .PASS   END=135337274;HOMLEN=23;HOMSEQ=TTATTGCATCAACTCCTCCGACA;SVLEN=103;SVTYPE=INS;AC=1;AN=2;NS=1;AF=0.5;MAF=0.5;AC_Het=1;AC_Hom=0;AC_Hemi=0;HWE=1;ExcHet=1    GT:AD   0/1:130,17

Kevin

ADD COMMENT
0
Entering edit mode

Thank you, Kevin!!!

I think it will help

I trying to run bcftools but I got below error:

bcftools: plugins/fill-tags.c:432: process: Assertion `rec->n_allele < 8*sizeof(int)' failed. Aborted (core dumped)

Trying to solve the error. Once the error is solved I will get back to you!!

ADD REPLY
0
Entering edit mode

What is the source of your VCF? Did you install the plugin exactly as per A: How to use bcftools to calculate AF INFO field from AC and AN in VCF? ?

ADD REPLY
0
Entering edit mode

Thank again Kevin,

What is the source of your VCF? => The file created using below command( merging of multiple vcf):

java -jar ~/Tools/GenomeAnalysisTK-3.8-1-0-gf15c1c3ef/GenomeAnalysisTK.jar -T CombineVariants -R ~/Database2/HG19_UCSC/hg19/Sequence/Chromosomes/GATK_UCSC_HG19.fasta --variant input.list -o All_sample_merge_GATK.vcf -genotypeMergeOptions REQUIRE_UNIQUE

File head:1

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  sample4 sample6 sample7 sample10    sample14    sample15    sample18    sample19    sample21    sample22    sample23    sample24    sample29    sample32    sample33    sample34    sample35    sample37    sample39    sample42    sample43    sample46    sample49    sample51    sample53    sample54    sample55    sample57    sample58    sample60    sample63    sample65    sample67    sample74    sample75    sample77    sample81    sample85    sample86    sample89    sample91    sample93    sample99    sample101   sample104   sample105   sample108   sample109   sample114   sample118   sample119   sample120   sample121   sample12    sample25    sample27    sample40    sample56    sample62    sample66
chr1    10146   .   AC  A   365.28  VQSRTrancheBOTH99.90to100.00    AC=1;AF=0.5;AN=2;BaseQRankSum=-1.498;ClippingRankSum=-0.659;DP=29;FS=10;GQ_MEAN=9;MLEAC=1;MLEAF=0.5;MQ=33.66;MQ0=0;MQRankSum=0.3;NCC=0;NEGATIVE_TRAIN_SITE;PG=0,0,0;QD=17.39;ReadPosRankSum=-0.18;SOR=1.85;VQSLOD=-8.428;culprit=MQ;set=variant64GT:AD:DP:GQ:PL:PP  ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   0/1:2,19:21:9:393,0,9:393,0,9   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.   ./.:.:.:.:.:.

Did you install the plugin exactly as per A: How to use bcftools to calculate AF INFO field from AC and AN in VCF? ? => Yes, I have followed exactly same step

ADD REPLY
0
Entering edit mode

Your VCF already has the AF tag ? For the variant in your post, it is 0.5 (50%). With the AF, you can automatically infer the MAF.

ADD REPLY
0
Entering edit mode

Can you explain it in little details with respect to MAF?

ADD REPLY
0
Entering edit mode

Sure!

In pseudocode, for bi-allelic sites:

Rule 1:

If (AF < 0.5) then MAF = AF

Rule 2:

if (AF > 0.5) then MAF = 1.0 - AF

The minor allele may not always be the variant base in your VCF

ADD REPLY
0
Entering edit mode

The calculation you explained is with respect to all samples?

Means the "If (AF < 0.5) then MAF = AF" 0.5 MAF of that particular allele is with respect to all sample present in VCF?

ADD REPLY
0
Entering edit mode

Yes, these tags are with respect to all samples that have genotypes.

Samples with ./.:.:.:.:.:. are not included, because these are effectively NA (missing)

ADD REPLY
1
Entering edit mode

Thank you, Kevin, this answers my main question.!!

ADD REPLY

Login before adding your answer.

Traffic: 2629 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6