Question: SNP filtering by allele fraction
3 months ago by
drowl130 wrote:


I have a multi-sample VCF generated by GATK with an additional 'Allele fraction' (AF) annotation in the FORMAT field.

I want to filter the SNPs in the VCF at the sample-genotype level using the AF in the FORMAT field. I also want to remove individual sample entries with missing values (".") without removing the whole variant.

I have tried the GATK command SelectVariants --select 'vc.hasAttribute("AF")', but that doesn't work.

Does anyone know of a simple way or a tool that could accomplish this?

Suggestions highly appreciated!

sequencing snp genome • 171 views

Hi, could you please post an example of your table and the output you want to obtain after filtering?

written 3 months ago by hugo.avila160


I figured it out eventually. I originally wanted to remove entries with missing values because, when I tried a 0.5 filter threshold with the command bcftools view -e 'FORMAT/AF[*] >= 0.5', the whole variant was excluded for all samples even if only one sample's entry triggered the filter or had a missing value.
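
The behaviour described above is what a record-level filter would be expected to do: a condition on FORMAT/AF[*] counts as true for the record when any single sample satisfies it, so -e drops the entire line. Here is a minimal Python sketch of that logic. It is not bcftools itself; the function names are hypothetical, and treating missing values as skipped is an assumption made for illustration.

```python
from typing import List, Optional

def any_sample_matches(afs: List[Optional[float]], threshold: float) -> bool:
    """True if at least one non-missing per-sample AF is >= threshold."""
    return any(af is not None and af >= threshold for af in afs)

def min_at_least(afs: List[Optional[float]], threshold: float) -> bool:
    """True if the smallest non-missing AF is >= threshold (False if all are missing)."""
    present = [af for af in afs if af is not None]
    return bool(present) and min(present) >= threshold

# One record with three samples; None stands for a missing "." value.
record_afs = [0.8, 0.2, None]

# "Exclude if any sample matches": one matching sample removes the whole record.
excluded_by_any = any_sample_matches(record_afs, 0.5)

# "Include if the minimum passes": one low sample is enough to fail the record.
kept_by_min = min_at_least(record_afs, 0.5)
```

In both cases the decision is made once per record, never per sample.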

The command bcftools view -i 'MIN(FMT/AF)>=0.5' worked well by keeping the whole variant and just excluding sites that failed the threshold.
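
One caveat: bcftools view -i/-e decides per record, so it keeps or drops whole lines. If the goal is to blank only the failing genotypes while keeping every variant, bcftools filter with -S . (--set-GTs) or the +setGT plugin may be closer to what was originally asked. The following plain-Python sketch shows that per-sample masking on a single VCF data line. mask_low_af is a hypothetical helper; it assumes a tab-separated record with one AF value per sample and a GT key in the FORMAT column.

```python
def mask_low_af(line: str, threshold: float = 0.5) -> str:
    """Set GT to missing for samples whose FORMAT/AF is missing or below threshold.

    The record itself is always kept; only per-sample genotypes change.
    """
    fields = line.rstrip("\n").split("\t")
    keys = fields[8].split(":")            # FORMAT column, e.g. GT:AF
    gt_i, af_i = keys.index("GT"), keys.index("AF")
    for col in range(9, len(fields)):      # sample columns start at index 9
        values = fields[col].split(":")
        # Trailing FORMAT fields may be dropped in VCF; treat that as missing.
        af = values[af_i] if af_i < len(values) else "."
        if af == "." or float(af) < threshold:
            values[gt_i] = "./."
            fields[col] = ":".join(values)
    return "\t".join(fields)

# Toy record with three samples (made-up values, not from the original VCF).
record = "\t".join([
    "chr1", "100", ".", "A", "G", "50", "PASS", ".",
    "GT:AF", "0/1:0.8", "0/1:0.2", "0/1:.",
])
masked = mask_low_af(record)
print(masked.split("\t")[9:])  # ['0/1:0.8', './.:0.2', './.:.']
```

The second and third samples lose their genotype calls, while the first sample and the variant line itself are untouched.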


written 3 months ago by drowl130