Question: Confused about the AF tag in VCF from samtools/bcftools mpileup
Makplus T wrote:

Hi,

I called variants in human NGS data with the samtools/bcftools mpileup and call workflow and got VCF files. Because of false positives, I need to filter further, for example keeping only positions supported by more than 20 reads, or removing variants whose frequency is below a threshold (e.g. AF < 4%).

The VCF file from bcftools call does not contain the INFO/AF tag, so I added it with the command below:

 bcftools +fill-tags in.bcf -Ob -o out.bcf -- -t AF

But after adding AF, I only get two AF values across my variant records: AF=1 for GT 1/1 and AF=0.5 for GT 0/1.
It seems AF in samtools is not the same concept as in GATK? With GATK I get a per-site AF = alt reads / read depth, whereas with samtools I only get AF = AC/AN, where AC is always reported as 1 and AN is 1 or 2 in my samples.

Maybe I have made a mistake somewhere that someone can point out?

Thanks

Tags: variants, af, samtools, gatk, vcf
written 4 weeks ago by Makplus T • modified 4 weeks ago by Renesh

It's possible that they are different. Check the VCF header to see how AF is defined there in your GATK-produced VCF compared to the mpileup-produced VCF.
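A minimal Python sketch of that header check, assuming an uncompressed VCF read as text. The helper name `tag_definitions` is hypothetical, and the two header lines in the example are illustrative only: they reuse the descriptions quoted in this thread and are not copied from the poster's files.

```python
import re

# Matches header lines such as ##INFO=<ID=AF,...,Description="...">
_HEADER_RE = re.compile(r'^##(INFO|FORMAT)=<ID=([^,>]+),.*Description="([^"]*)"')


def tag_definitions(vcf_lines, tag="AF"):
    """Return {(field, tag): description} for every header declaration of `tag`."""
    found = {}
    for line in vcf_lines:
        if not line.startswith("##"):
            break  # meta-information lines end at the #CHROM line
        match = _HEADER_RE.match(line)
        if match and match.group(2) == tag:
            found[(match.group(1), tag)] = match.group(3)
    return found


# Illustrative header lines (assumed wording, see the note above).
bcftools_like = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
]
gatk_like = [
    "##fileformat=VCFv4.2",
    '##FORMAT=<ID=AF,Number=A,Type=Float,'
    'Description="Allele fractions of alternate alleles in the tumor">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
]

print(tag_definitions(bcftools_like))  # {('INFO', 'AF'): 'Allele frequency'}
print(tag_definitions(gatk_like))
```

For a real file, passing an open text handle (`open("calls.vcf")`) in place of the list should work the same way. Note whether AF is declared under INFO (one value per site) or FORMAT (one value per sample), since that alone often explains the difference.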

written 4 weeks ago by Kevin Blighe

Thanks Kevin, I think I get the point. AF in the GATK-produced VCF stands for "allele fraction of the alternate allele in the tumor", while AF in the VCF produced by bcftools means "allele frequency". I think the latter is a population concept: the fraction of ALT alleles among all called alleles across all samples. So when I give it only one human sample, it always reports AF=1 or AF=0.5.
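The read-level fraction from the original question (alt reads / read depth) can be sketched per sample from the FORMAT/AD field, assuming the VCF carries one. As far as I know, bcftools mpileup only writes AD when asked (e.g. `-a AD`), so that is an assumption about how the file was produced. The function names and the example records below are hypothetical. The thresholds are the ones mentioned in the question (more than 20 reads, fraction of at least 4%).

```python
def alt_read_fractions(format_keys, sample_field):
    """Return (depth, [fraction per ALT allele]) from FORMAT/AD, or None.

    Depth here is sum(AD), i.e. the reads counted by the caller, which
    may be lower than the raw DP value.
    """
    values = dict(zip(format_keys.split(":"), sample_field.split(":")))
    ad = values.get("AD")
    if ad is None or "." in ad.split(","):
        return None  # AD not present, or missing for this sample
    depths = [int(x) for x in ad.split(",")]
    total = sum(depths)
    if total == 0:
        return None
    # AD[0] is the REF count; AD[1:] are the ALT counts, in ALT order.
    return total, [d / total for d in depths[1:]]


def passes_filters(format_keys, sample_field, min_depth=20, min_fraction=0.04):
    """Keep a record with depth > min_depth and any ALT fraction >= min_fraction."""
    result = alt_read_fractions(format_keys, sample_field)
    if result is None:
        return False
    depth, fractions = result
    return depth > min_depth and any(f >= min_fraction for f in fractions)


# Hypothetical FORMAT and sample columns from single-sample records.
print(passes_filters("GT:PL:AD", "0/1:255,0,255:27,3"))  # True: 30 reads, 3 ALT
print(passes_filters("GT:AD", "0/1:98,2"))               # False: fraction below 4%
print(passes_filters("GT:AD", "1/1:0,15"))               # False: only 15 reads
```

This value is separate from INFO/AF, so a heterozygous call still gets AF=0.5 from AC/AN whatever its read support is.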

written 4 weeks ago by Makplus T

Renesh wrote:

The INFO/AF tag in bcftools and GATK should be the same. AF is calculated as AC/AN, where AC is the total ALT allele count and AN is the total number of alleles in called genotypes. If you have only one sample, AF = 1 for GT = 1/1 means you have 2 ALT alleles out of a total of 2 alleles. Similarly, AF = 0.5 for GT 0/1 means 1 ALT allele out of a total of 2 alleles. If possible, please share a few lines from your VCF file for further clarification.
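The AC/AN arithmetic can be sketched in a few lines of Python. This only mirrors the definition given here and is not bcftools' or GATK's actual implementation. The function name `allele_frequency` and its `n_alt` parameter are hypothetical, and allele indices in the GT strings are assumed not to exceed `n_alt`.

```python
import re


def allele_frequency(genotypes, n_alt=1):
    """Compute (AC, AN, AF) from GT strings such as '0/1' or '1|1'.

    AC and AF are lists with one entry per ALT allele.
    """
    ac = [0] * n_alt
    an = 0
    for gt in genotypes:
        for allele in re.split(r"[/|]", gt):
            if allele == ".":
                continue  # missing alleles do not count towards AN
            an += 1
            index = int(allele)
            if index > 0:
                ac[index - 1] += 1
    af = [count / an for count in ac] if an else [None] * n_alt
    return ac, an, af


# Single-sample cases from this thread.
print(allele_frequency(["1/1"]))  # ([2], 2, [1.0])
print(allele_frequency(["0/1"]))  # ([1], 2, [0.5])

# With several samples AF can take other values.
print(allele_frequency(["0/1", "0/0", "1/1", "0/0"]))  # ([3], 8, [0.375])
```

With one diploid sample, the only values a called variant site can take are 0.5 and 1, which matches what the question reports.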

written 4 weeks ago by Renesh

Makes sense. I provided a single sample to samtools mpileup, so I think AF = 0.5 or 1 is actually the right result.

written 4 weeks ago by Makplus T