Question: Six phred-scaled genotype likelihoods for biallelic site
gravatar for hugo.zeberg
23 months ago by
hugo.zeberg0 wrote:

I have generated a VCF file from a bam file using samtools mpileup with the arguments -d 8000 -B -q30 -Q30.

I don't really understand the resulting PL field. Here is an example:

2   167050053   .   T   C,<*>   0   .   DP=3;I16=0,0,2,1,0,0,108,3888,0,0,111,4107,0,0,67,1539;QS=0,1,0;VDB=0.243476;SGB=-0.511536;MQSB=1;MQ0F=0    PL  101,9,0,101,9,101

In the above line there are 6 genotype likelihoods.

Based on the VCF specification. Biallelic sites has the ordering: AA,AB,BB; for triallelic sites the ordering is: AA,AB,BB,AC,BC,CC.

To me, it seems as if samtool treats the site as triallelic with 6 values, i.e. ,<*> is treat as an allele. In all positions were the reads match the reference I have <> and not the the dot ( "." ) I am used to.

I have used samtools 1.9.

Can anyone explain this behaviour? All help would be highly appreciated.

snp vcf genotype likelihood • 618 views
ADD COMMENTlink modified 23 months ago by finswimmer13k • written 23 months ago by hugo.zeberg0

So I found this:

5.5 Representing unspecified alleles and REF-only blocks (gVCF)

"A symbolic alternate allele <*> is used to represent this unspecified alternate allele"

So it indeed a triallelic site, with a unspecified alternate allele. How do I remove this unspecified alternate allele?

ADD REPLYlink written 23 months ago by hugo.zeberg0
gravatar for finswimmer
23 months ago by
finswimmer13k wrote:

Hello hugo.zeberg ,

the output you are showing is the output of mpileup and not the final vcf file. For this you need to use bcftools call.

When using samtools/bcftools for variant calling this is a two step procedure. In the first step mpileup is used for finding potential position where a variant can be. In the seconde step, only these position are investigate in more detail whether there this is a "real" variant.

So my understanding of the mpileup output is, that at this position there could be a REF (T), a specific ALT (C) or any other ALT (<*>). Add the callstep like this to get the final vcf:

$ bcftools mpileup -Ou -d 8000 -B -q30 -Q30 input.bam|bcftools call -mv -Ov -o output.vcf

Since v1.9 samtools mpileupis deprecated and should be replaced by bcftools mpileup.

fin swimmer

ADD COMMENTlink written 23 months ago by finswimmer13k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1024 users visited in the last hour