Question: Allele frequency of vcf file
juncheng (köln) wrote 5.2 years ago:

I know this is a basic question, but I'm a beginner at SNP calling and really confused.

I ran samtools and bcftools and got a VCF file like this:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  accepted_hits.bam
1       565286  .       C       T       222     .       DP=27;VDB=2.078631e-01;AF1=1;AC1=2;DP4=0,0,14,12;MQ=40;FQ=-105  GT:PL:GQ        1/1:255,78,0:99

 

However, I want not only AF1 (the max-likelihood estimate of the first ALT allele frequency) but also AF (allele frequency). How can I get output containing both?

I ran samtools and bcftools like this:

samtools mpileup -ugf /data/Genomes/Homo_sapiens/Homo_sapiens_assembly19.fasta accepted_hits.bam | bcftools view -bvcg - > SRR1294493.bcf

 

bcftools view SRR1294493.bcf | vcfutils.pl varFilter -D 44 > SRR1295163.vcf

Thanks a lot.


It seems you are missing the RO and AO tags (reference observation count and alternate observation count).

Reply written 5.2 years ago by Adrian Pelin

Yes, they are missing. Do you know why that is?

Reply written 5.2 years ago by juncheng

Not sure. Try FreeBayes; I know it reports more than enough information.

Reply written 5.2 years ago by Adrian Pelin
Ryan Layer (United States) wrote 3.1 years ago:

export BCFTOOLS_PLUGINS="$HOME/src/bcftools/plugins/"

bcftools +fill-AN-AC SRR1294493.bcf

Noushin N (Baltimore, MD) wrote 3.1 years ago:

If by allele frequency you mean the fraction of reads supporting the alternate allele, that information is provided in the DP4 field.

DP4=0,0,14,12;

DP4 reports four read counts at the position: reads carrying the reference allele on the forward and reverse strands, followed by reads carrying the alternate allele on the forward and reverse strands.

The allele frequency here would be 0 (0/26) for the reference allele and 1.0 (26/26) for the alternate allele.
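
To make the arithmetic concrete, here is a minimal Python sketch that pulls DP4 out of an INFO string and computes the fraction of reads supporting the alternate allele. The helper names `parse_info` and `alt_read_fraction` are made up for illustration and are not part of samtools or bcftools; the INFO string is the one from the question.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-style keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue  # tolerate a trailing semicolon
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def alt_read_fraction(info):
    """Fraction of DP4 reads that support the alternate allele.

    Returns None when DP4 reports no reads at all.
    """
    ref_fwd, ref_rev, alt_fwd, alt_rev = (
        int(x) for x in parse_info(info)["DP4"].split(",")
    )
    total = ref_fwd + ref_rev + alt_fwd + alt_rev
    if total == 0:
        return None
    return (alt_fwd + alt_rev) / total


# INFO column copied from the record in the question.
info = "DP=27;VDB=2.078631e-01;AF1=1;AC1=2;DP4=0,0,14,12;MQ=40;FQ=-105"
print(alt_read_fraction(info))  # 1.0, i.e. 26/26
```

Note that this is a per-site read fraction for one sample, which is not the same thing as a population allele frequency computed from genotypes.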

juncheng (köln) wrote 5.2 years ago:

If anyone knows anything, please help. I'm really struggling at the moment.

gsr9999 (United States) wrote 2.5 years ago:

Hi juncheng,

You can use a bcftools plugin to add the missing AF tag to your output VCF file.

The following example may be useful to you:

export BCFTOOLS_PLUGINS="$HOME/mytools/bcftools/bcftools-1.3.1/plugins/"

samtools mpileup -t DP,AD,ADF,ADR,SP,INFO/AD,INFO/ADF,INFO/ADR -Bugf ~/reference/human_g1k_v37.fasta -l target_regions.bed na12878_AlignedReads.bam 2> mpileup_errorLog.log | \
    bcftools call -vmO u -f GQ 2> bcftoolsCall_errorLog.log | \
    bcftools plugin fill-tags -Oz -o output_variants.vcf -- -t AF 2> bcftoolsPlugin_errorLog
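
As far as I understand it, the AF tag is the alternate allele count divided by the total number of called alleles (AC/AN), both taken from the sample genotypes. The Python sketch below only illustrates that arithmetic; `allele_frequencies` is a hypothetical helper, not the plugin's actual code, and it assumes GT strings such as `0/1` or `1|1` with `.` for missing alleles.

```python
def allele_frequencies(genotypes, n_alt):
    """Return (AN, AC, AF) computed from a list of GT strings.

    genotypes: GT values such as "0/1", "1|1" or "./."
    n_alt:     number of ALT alleles at the site
    AC and AF are lists with one entry per ALT allele.
    """
    ac = [0] * n_alt
    an = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue  # missing alleles do not count towards AN
            an += 1
            index = int(allele)
            if index > 0:
                ac[index - 1] += 1  # allele 1 is the first ALT allele
    af = [count / an for count in ac] if an else [None] * n_alt
    return an, ac, af


# The single sample in the question has GT 1/1 at this site.
print(allele_frequencies(["1/1"], n_alt=1))  # (2, [2], [1.0])
```

With only one sample in the file, AF computed this way can only be 0, 0.5 or 1 for a diploid genotype, so it becomes informative mainly for multi-sample VCFs.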
