Meaning of samtools 'vcfutils.pl varFilter -Q'?
12.9 years ago
Ian 6.0k

I have been using the suggested mpileup based pipeline for variant detection here.

The vcfutils.pl varFilter step does useful things for filtering the summary VCF, including the minimum and maximum number of reads that should be present. However, can anyone explain the -Q flag, other than the one line explanation included in the man page?

Usage: vcfutils.pl varFilter [options] <in.vcf>

Options: -Q INT minimum RMS mapping quality for SNPs [10]

Is this a way of filtering SNP/indel quality?

For example, in the old pileup-based method here, the following was suggested for SNP filtering (or >=50 for indels).

samtools.pl varFilter raw.pileup | awk '$6>=20' > final.pileup

Also, is the -Q flag the same as the value that can be specified in seqtk (qual_thres)?

Usage: seqtk fq2fa <in.fq> [qual_thres]

Thank you!

samtools vcf variant • 14k views

I can't explain the parameter to you but I am currently experimenting with these filters also. I cannot get the -Q flag to filter anything, regardless of the value I set. I also found that I had to manually specify a zero value for most of the command line parameters as the default settings were filtering out ALL of my variants. Any input on this would be very useful!

Somewhere there is a distinction between consensus sequence quality and SNP quality, but I just cannot tell how it relates to my examples.

Hopefully someone will have an answer - it would be good to know how to apply basic score-based filtering with vcftools for SNPs/Indels.

12.9 years ago

fq2fa wouldn't have anything to do with MAPQ, since FASTQ consists of sequence and base call quality scores. MAPQ is a measure of the quality of the alignment, not of the read base calls themselves. It will tend to be higher in more mappable areas and lower in repeats.

I think maybe you are getting confused because MAPQ in samtools and the MQ in INFO in VCF are termed "phred-scaled", but that is mainly just so that higher numbers are better: with phred, scores get bigger the less chance there is of being wrong. It doesn't mean they are actually phred base call scores.
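To make the "phred-scaled" point concrete, here is a minimal sketch of the generic Phred transform, Q = -10 * log10(P). The function names are made up for illustration. The same arithmetic applies to both kinds of score; what differs is the event P refers to (a wrong base call for FASTQ qualities, a wrong alignment position for MAPQ).

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled score Q into a probability of error, P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a probability of error P into a Phred-scaled score, Q = -10*log10(P)."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Higher scores mean a smaller chance of being wrong.
    for q in (10, 20, 30):
        print(f"Q={q}  P(error)={phred_to_error_prob(q):g}")
```

So a score of 20 means a 1 in 100 chance of error whichever quantity is being scored, which is why the two are easy to mix up.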

12.9 years ago
Russh ★ 1.2k

I don't quite follow why there's a bounty on this question. The -Q flag allows you to set a threshold for the quality of the reads supporting a SNP. On a given line in the VCF, the MQ tag contains the root mean square of the MAPQ scores for reads which support that SNP. My understanding is that -Q [INT] allows you to disregard SNPs in the VCF with an MQ value less than INT.
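As a rough illustration of that understanding, the sketch below reads the MQ tag from the INFO column (column 8) of a VCF record and compares it with a threshold. This is not the vcfutils.pl source, only an approximation of the behaviour described above: the function names and the example record are invented, and treating a record with no MQ tag as passing is an assumption.

```python
import math
import re

# Matches an MQ=<number> entry inside the semicolon-separated INFO column.
MQ_PATTERN = re.compile(r"(?:^|;)MQ=(\d+(?:\.\d+)?)")


def rms(values):
    """Root mean square, the statistic the MQ tag summarises over MAPQ scores."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def passes_mq(vcf_line: str, min_mq: float = 10) -> bool:
    """Return True if the record's MQ is at least min_mq (10 mirrors the -Q default)."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = fields[7]  # INFO is the 8th VCF column
    match = MQ_PATTERN.search(info)
    if match is None:
        # Assumption: a record without an MQ tag is not filtered on this criterion.
        return True
    return float(match.group(1)) >= min_mq


if __name__ == "__main__":
    # Hypothetical record, not taken from any real data set.
    record = "chr1\t1000\t.\tA\tG\t50\t.\tDP=20;MQ=8\tGT\t0/1"
    print(passes_mq(record))             # MQ=8 against the threshold of 10
    print(passes_mq(record, min_mq=5))   # same record with a looser threshold
```

Note that column 6 (QUAL, 50 in the made-up record) plays no part here; it is a separate score from MQ, so a filter like awk '$6>=20' and -Q act on different fields.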

MQ is a measure of the confidence in the uniqueness of the alignment, not the quality of the reads themselves.

I offered bounty as no-one had answered. My understanding of the VCF file is that col 6 gives the quality of the SNP and MQ gives the quality of the reads.

No probs. I also don't know why my answer was voted down. Is it incorrect?

According to what I have read and what Jeremy says, it looks wrong. If I am wrong, I am happy to change my mind (no offence meant!)
