I generated a VCF file from a BAM file using samtools mpileup with the arguments
-d 8000 -B -q30 -Q30.
I don't really understand the resulting PL field. Here is an example:
2 167050053 . T C,<*> 0 . DP=3;I16=0,0,2,1,0,0,108,3888,0,0,111,4107,0,0,67,1539;QS=0,1,0;VDB=0.243476;SGB=-0.511536;MQSB=1;MQ0F=0 PL 101,9,0,101,9,101
In the above line there are 6 genotype likelihoods.
According to the VCF specification, biallelic sites have the ordering AA,AB,BB; triallelic sites have the ordering AA,AB,BB,AC,BC,CC.
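To check my reading of that ordering, here is a minimal Python sketch that labels the six PL values from the example line. It assumes diploid genotypes and the index rule from the VCF specification, where genotype j/k (with j <= k) sits at position k*(k+1)/2 + j. The function name genotype_order and the variable names are my own, not part of samtools.

```python
def genotype_order(n_alleles):
    """Return diploid genotypes (j, k) in VCF PL order.

    Assumes the VCF rule that genotype j/k (j <= k) has index k*(k+1)/2 + j,
    which corresponds to looping over k first and then over j <= k.
    """
    order = []
    for k in range(n_alleles):
        for j in range(k + 1):
            order.append((j, k))
    return order


# REF followed by the ALT alleles, copied from the example record.
alleles = ["T", "C", "<*>"]
# PL values copied from the example record.
pl = [101, 9, 0, 101, 9, 101]

# Pair each PL value with the genotype it belongs to.
labelled = {
    f"{alleles[j]}/{alleles[k]}": value
    for (j, k), value in zip(genotype_order(len(alleles)), pl)
}

for genotype, value in labelled.items():
    print(genotype, value)
```

If this reading is right, the six values correspond to T/T, T/C, C/C, T/<*>, C/<*> and <*>/<*>, in that order, so C/C (PL 0) is the most likely genotype.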
To me, it seems as if samtools treats the site as triallelic, with 6 values, i.e. <*> is treated as an allele. In all positions where the reads match the reference, I get <*> and not the dot (".") I am used to.
I have used samtools 1.9.
Can anyone explain this behaviour? All help would be highly appreciated.