Quality, MQ0F, other parameters for proper VCF filtering
7.3 years ago

Hi all,

After reading BioStars posts for quite some time, I have settled on what I think is a reasonable parameter set for filtering a VCF file after variant calling. I still have some doubts, though:

  • When I request a minimum quality of 20, I'm basically saying that I only want variants with at most a 1% probability of being wrong. Would allowing 5% (a quality of about 13) be too permissive?
  • When I plot the Phred scores from the 6th field of the VCF file (QUAL), I get a weird small peak at the maximum, 486. Is this a computational limit?
  • The MQ0F field tells me the fraction of reads with mapping quality 0 among the reads that called the variant. I usually filter out variants that have more than 1% of the reads having map quality 0. Is it too strict?
  • What are your policies on the alternative allele frequency in the reads? I tried keeping only variants where the alternative allele was supported by at least 20% of the reads, but I lose a lot of them. Are all of those really not to be trusted?
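
For reference, the Phred arithmetic behind the first point can be sketched as follows. This is only an illustration of the standard definition Q = -10 * log10(p); the function names are made up for this example.

```python
import math

def phred_to_error_prob(q):
    """Convert a Phred-scaled quality into the probability that the call is wrong."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Convert an error probability into a Phred-scaled quality."""
    return -10 * math.log10(p)

# QUAL 20 corresponds to a 1% error probability.
print(phred_to_error_prob(20))

# A 5% error probability corresponds to a QUAL of roughly 13.
print(round(error_prob_to_phred(0.05), 1))
```

So relaxing the cutoff from 1% to 5% would mean lowering the QUAL threshold from 20 to about 13.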
SNP alignment genome variant VCF • 2.7k views

"The MQ0F field tells me the fraction of reads with mapping quality 0 among the reads that called the variant. I usually filter out variants that have more than 1% of the reads having map quality 0. Is it too strict?"

Sounds too strict to me. Obviously, if 100% of reads have a mapping quality of 0, that's a problem. But if 10% of them do, it's not clear to me that it matters very much, as long as the other 90% have a high MAPQ. A MAPQ of 0 sounds like an odd cutoff anyway, as a MAPQ of 3 or lower indicates a read that mapped ambiguously.

The allele fraction to consider depends on the platform, as they have idiosyncratic, nonrandom errors at different rates. Also, the ploidy and various biases affect this. There isn't really a single number that's universally good for either allele fraction or quality; it depends on your experiment (the goal, the sample prep, the software pipeline, etc).
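
Since the thresholds depend on the experiment, it can help to keep them as adjustable parameters rather than hard-coding them. Below is a minimal sketch of that idea in plain Python. It assumes a bcftools-style VCF line whose INFO column carries `MQ0F` and `DP4` (ref-forward, ref-reverse, alt-forward, alt-reverse read counts); the function names, the default thresholds, and the example record are all made up for illustration and are not recommendations.

```python
def parse_info(info_field):
    """Parse a VCF INFO column into a dict; flags without '=' map to True."""
    info = {}
    for item in info_field.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info

def passes_filters(vcf_line, min_qual=20.0, max_mq0f=0.1, min_alt_frac=0.2):
    """Return True if a single VCF data line passes all three (hypothetical) cutoffs."""
    fields = vcf_line.rstrip("\n").split("\t")
    qual = fields[5]                 # 6th column: Phred-scaled QUAL
    info = parse_info(fields[7])     # 8th column: INFO

    # Reject records with a missing or too-low QUAL.
    if qual == "." or float(qual) < min_qual:
        return False

    # Reject records where too many reads have mapping quality 0.
    if "MQ0F" in info and float(info["MQ0F"]) > max_mq0f:
        return False

    # Reject records where too few reads support the alternative allele.
    if "DP4" in info:
        ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in info["DP4"].split(","))
        total = ref_fwd + ref_rev + alt_fwd + alt_rev
        if total == 0 or (alt_fwd + alt_rev) / total < min_alt_frac:
            return False

    return True

# Made-up record: QUAL 50, 5% MQ0 reads, 12 of 30 reads support the alternative allele.
record = "chr1\t1000\t.\tA\tG\t50\t.\tDP=30;MQ0F=0.05;DP4=10,8,7,5"
print(passes_filters(record))                  # lenient MQ0F cutoff of 10%
print(passes_filters(record, max_mq0f=0.01))   # the stricter 1% cutoff from the question
```

With the made-up record above, the variant is kept under the 10% MQ0F cutoff and dropped under the 1% cutoff, which shows how sensitive the result can be to that single parameter.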
