Question: Allele frequency of vcf file
1
4.9 years ago by
juncheng180
köln
juncheng180 wrote:

I know it is a stupid question, but I'm a beginner at SNP calling and really confused.

I ran samtools and bcftools and got a VCF file like this:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  accepted_hits.bam
1       565286  .       C       T       222     .       DP=27;VDB=2.078631e-01;AF1=1;AC1=2;DP4=0,0,14,12;MQ=40;FQ=-105  GT:PL:GQ        1/1:255,78,0:99

 

However, I want not only AF1 (the max-likelihood estimate of the first ALT allele frequency) but also AF (the allele frequency). How can I get output containing both?

I ran samtools and bcftools like this:

samtools mpileup -ugf /data/Genomes/Homo_sapiens/Homo_sapiens_assembly19.fasta accepted_hits.bam | bcftools view -bvcg - > SRR1294493.bcf

 

bcftools view SRR1294493.bcf | vcfutils.pl varFilter -D 44 > SRR1295163.vcf

Thanks a lot.


It seems you are missing the RO and AO tags (the reference and alternate observation counts).

written 4.9 years ago by Adrian Pelin

Yes, they are missing. Do you know why that is?

written 4.9 years ago by juncheng
1

Not sure. Try FreeBayes; I know it reports more than enough information.

written 4.9 years ago by Adrian Pelin
2
2.7 years ago by
Ryan Layer60
United States
Ryan Layer60 wrote:

export BCFTOOLS_PLUGINS="$HOME/src/bcftools/plugins/"

bcftools +fill-AN-AC SRR1294493.bcf

1
2.7 years ago by
Noushin N550
Baltimore, MD
Noushin N550 wrote:

If by allele frequency you mean the fraction of reads supporting the alternate allele, that information can be derived from the DP4 field.

DP4=0,0,14,12;

DP4 reports the number of high-quality reads covering the position that carry the reference allele on the forward and reverse strands, followed by the number carrying the alternate allele on the forward and reverse strands.

The allele frequency here would be 0 (0/26) for the reference allele and 1.0 (26/26) for the alternate allele.
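
If you want this number for every record rather than working it out by hand, here is a minimal Python sketch of the same arithmetic. The function name `alt_read_fraction` is made up for illustration, and it assumes the INFO column is passed in as a plain semicolon-separated string with DP4 in the order ref-forward, ref-reverse, alt-forward, alt-reverse, as in the record from the question.

```python
def alt_read_fraction(info):
    """Return the fraction of DP4 reads supporting the alternate allele.

    Returns None when DP4 is absent or all four counts are zero.
    (Hypothetical helper, not part of samtools or bcftools.)
    """
    # Split "KEY=VALUE;KEY=VALUE" into a dict, skipping flag-only entries.
    fields = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    if "DP4" not in fields:
        return None
    # Assumed order: ref-forward, ref-reverse, alt-forward, alt-reverse.
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in fields["DP4"].split(","))
    total = ref_fwd + ref_rev + alt_fwd + alt_rev
    if total == 0:
        return None
    return (alt_fwd + alt_rev) / total


# INFO column copied from the record in the question.
info = "DP=27;VDB=2.078631e-01;AF1=1;AC1=2;DP4=0,0,14,12;MQ=40;FQ=-105"
print(alt_read_fraction(info))  # 26/26 -> 1.0
```

Note that this is a read-level fraction at one site in one sample, which is not the same thing as a population allele frequency.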

0
4.9 years ago by
juncheng180
köln
juncheng180 wrote:

If anyone knows anything, please help. I'm really struggling at the moment.

0
2.1 years ago by
gsr9999110
United States
gsr9999110 wrote:

Hi juncheng,

You can use a bcftools plugin to add the missing AF tag to your output VCF file.

The following example may be useful:

export BCFTOOLS_PLUGINS="$HOME/mytools/bcftools/bcftools-1.3.1/plugins/"

samtools mpileup -t DP,AD,ADF,ADR,SP,INFO/AD,INFO/ADF,INFO/ADR -Bugf ~/reference/human_g1k_v37.fasta -l target_regions.bed na12878_AlignedReads.bam 2> mpileup_errorLog.log | \
  bcftools call -vmO u -f GQ 2> bcftoolsCall_errorLog.log | \
  bcftools plugin fill-tags -Oz -o output_variants.vcf -- -t AF 2> bcftoolsPlugin_errorLog
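
As far as I understand it, the AF that fill-tags writes is computed from the called genotypes (alternate allele count divided by the total number of called alleles, i.e. AC/AN), not from read counts. A rough Python sketch of that calculation is below; the function names are hypothetical, it only handles the first ALT allele, and it is meant to show the arithmetic rather than reproduce what the plugin does internally.

```python
import re


def allele_counts(genotypes):
    """Return (AC, AN) for the first ALT allele from GT strings like '0/1'.

    Hypothetical helper: AC counts alleles called as '1', AN counts all
    non-missing alleles across the samples.
    """
    ac = 0
    an = 0
    for gt in genotypes:
        # GT alleles are separated by '/' (unphased) or '|' (phased).
        for allele in re.split(r"[/|]", gt):
            if allele == ".":  # missing call, not counted in AN
                continue
            an += 1
            if allele == "1":
                ac += 1
    return ac, an


def allele_frequency(genotypes):
    """Return AC / AN, or None when no alleles were called."""
    ac, an = allele_counts(genotypes)
    return ac / an if an else None


# The single sample in the question is homozygous ALT (GT = 1/1).
print(allele_frequency(["1/1"]))  # AC=2, AN=2 -> 1.0
```

With a single sample this can only ever be 0, 0.5 or 1, so AF becomes more informative once several samples are called together.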
