Question: Variant Call Format - Explanation Of Quality Scores?
Asked 8.1 years ago by Ian (University of Manchester, UK):

In variant call format (VCF) files produced at the end of the samtools mpileup variant detection pipeline there are two quality scores:

1) QUAL (col 6) = Phred-scaled probability that the variant shown in the ALT column is wrong.

2) INFO (col 8) MQ tag = RMS (root mean square) mapping quality of the reads covering the site.

The two scores do not have a linear relationship. Some variants have a high mapping quality but a lower QUAL.
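For reference, here is a minimal sketch of how the two numbers are defined. It assumes the VCF specification's definitions: QUAL is -10*log10 of the probability that the ALT call is wrong, and MQ is the root mean square of the MAPQ values of the reads covering the site. The function names and the read values are made up for illustration, and this is not the samtools implementation.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred-scaled score Q into an error probability, P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def rms_mapping_quality(mapqs):
    """Root mean square of per-read MAPQ values, as summarised by the MQ tag."""
    return math.sqrt(sum(m * m for m in mapqs) / len(mapqs))


# QUAL = 30 corresponds to an error probability of roughly 0.001.
print(phred_to_error_prob(30))

# Three hypothetical reads over one site: two well placed, one ambiguous.
print(rms_mapping_quality([60, 60, 0]))
```

QUAL summarises the evidence for the variant call itself, while MQ only summarises how confidently the covering reads were placed, so there is no reason to expect one to track the other.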

Can anyone describe how these two scores are created and how they are related?

I want to be able to filter variants based on these two scores, but do not fully understand what they mean. Should I filter on both scores, or just on QUAL?

Thanks.

Tags: vcf, samtools
Answered 8.1 years ago by Jeremy Leipzig (Philadelphia, PA):

Mapping quality is assigned by the aligner, and the values differ wildly from aligner to aligner. It is like an E-value but less useful. Each aligner assigns mapping quality based on its own internal estimate of the probability that a read is accurately placed.

If you start filtering based on mapping quality you run the risk of biasing against SNPs that are in repeats or gene families.
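To make the filtering question concrete, here is a rough sketch that pulls QUAL and the MQ tag out of a single VCF data line and filters on QUAL by default, leaving an MQ cutoff optional because of the repeat and gene-family caveat. The function names, the thresholds and the example line are all hypothetical; pick cutoffs that suit your own aligner and data.

```python
def parse_qual_and_mq(vcf_line):
    """Return (QUAL, MQ) from one tab-separated VCF data line; None if missing."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Column 6 (index 5) is QUAL; "." means the value is missing.
    qual = None if fields[5] == "." else float(fields[5])
    # Column 8 (index 7) is INFO: semicolon-separated KEY=VALUE pairs or flags.
    info = {}
    for entry in fields[7].split(";"):
        key, _, value = entry.partition("=")
        info[key] = value
    mq = float(info["MQ"]) if "MQ" in info else None
    return qual, mq


def passes_filter(vcf_line, min_qual=20.0, min_mq=None):
    """Keep a variant if QUAL >= min_qual and, optionally, MQ >= min_mq."""
    qual, mq = parse_qual_and_mq(vcf_line)
    if qual is None or qual < min_qual:
        return False
    if min_mq is not None and (mq is None or mq < min_mq):
        return False
    return True


# Made-up record: decent QUAL but low MQ, e.g. a SNP inside a repeat.
example = "chr1\t100\t.\tA\tG\t50\t.\tDP=12;MQ=20"
print(parse_qual_and_mq(example))
print(passes_filter(example))             # QUAL-only filter
print(passes_filter(example, min_mq=30))  # adding an MQ cutoff
```

With the MQ cutoff switched on, the example record is dropped even though its QUAL is fine, which is the bias being warned about.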


Yes, this is called MAPQ in SAM, and it is arguably the least popular column.

Reply written 8.1 years ago by Jeremy Leipzig

Just to clarify: you are talking about MQ, right? Thanks for replying; I couldn't find anything as useful on the VCF site (unless I missed it).

Reply written 8.1 years ago by Ian