So I've sequenced samples on the MinION NGS platform and called SNPs with SAMtools/BCFtools. To corroborate or reject these SNVs, I've sequenced the same samples by Sanger.
According to the VCF spec (v4.2), the QUAL field is defined as:
Phred-scaled quality score for the assertion made in ALT. i.e. −10log10 prob(call in ALT is wrong). If ALT is ‘.’ (no variant) then this is −10log10 prob(variant), and if ALT is not ‘.’ this is −10log10 prob(no variant). If unknown, the missing value should be specified. (Numeric)
When examining the VCF file generated by the SAMtools/BCFtools pipeline, I find QUAL values ranging from as low as 5.0... to as high as 196.0 for an alternative allele, with DP values on the order of 10^3 (which makes me happy, as it increases my confidence that this position is a real SNP).
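To put those QUAL values in perspective, here is a minimal sketch that inverts the Phred formula quoted from the spec. The helper names `phred_to_error_prob` and `error_prob_to_phred` are my own, not part of SAMtools/BCFtools; the two inputs are simply the extremes I see in my QUAL column.

```python
import math


def phred_to_error_prob(q):
    """Probability that the assertion is wrong, given Phred score Q = -10*log10(P)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


# Lowest and highest QUAL values observed in my VCF.
for qual in (5.0, 196.0):
    print(f"QUAL {qual:>5}: P(no variant) = {phred_to_error_prob(qual):.3g}")
```

So a QUAL of 5 corresponds to roughly a 32% chance that there is no variant at the site, while 196 corresponds to a probability on the order of 10^-20.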
When I sequence the same position by Sanger, it may also support the alternative allele. But the Sanger quality scores max out at 60 in my data, and it seems from "The Sanger FASTQ file format for sequences with quality scores" that the Phred scores there are defined similarly: -10*log10(probability that the base was called erroneously).
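For concreteness, this is how I understand the per-base scores to be stored in a Sanger-style FASTQ quality line: each character is the Phred score plus an ASCII offset of 33. The sketch below is only illustrative; the quality string is made up rather than taken from my reads, and `decode_quality_string` is a hypothetical helper name.

```python
PHRED_OFFSET = 33  # Sanger-style FASTQ: ASCII code = Phred score + 33


def decode_quality_string(qual, offset=PHRED_OFFSET):
    """Turn a FASTQ quality line into a list of per-base Phred scores."""
    return [ord(ch) - offset for ch in qual]


# Made-up quality string, not real data.
example_quals = "!+5?I]"
scores = decode_quality_string(example_quals)

for ch, q in zip(example_quals, scores):
    # Per-base error probability implied by each Phred score.
    print(f"{ch!r} -> Q{q:<2} -> P(base wrong) = {10 ** (-q / 10):.1e}")
```

Note that these are scores for individual base calls within a single read, whereas QUAL in the VCF is a single score for the whole site.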
- Why are these quality scores (NGS, Sanger) so distinctly different from each other?
- Is MinION NGS data not encoded with the same Phred+33 ASCII offset as Sanger FASTQ?
- Is a comparison between the two quality scores a valid one, to some extent?
- Edit: is there another field in the VCF of my NGS data that corresponds to the Sanger Phred score, perhaps 'MQ' ("Root-mean-square mapping quality of covering reads")?