Filtering Out Reads Based On Base Quality With Samtools Seems Not To Work Properly
Entering edit mode
12.3 years ago
Pascal ★ 1.5k

We have an issue with the -Q parameter in samtools mpileup: it seems not working properly.

In a bam file I know that the all reads have base Q = 40, the following commands doesn't filter out nothing:

samtools mpileup -Q 93 out_sorted.bam

Here's a complete workflow to reproduce quickly the issue we are experiencing based on simulated data. It generates a bam file and converts all read base qualities to 40 then samtools mpileup is called with a minimum of 93:

samtools faidx human_g1k_v37.fasta 20:1-5000000 > ref.fasta
wgsim -N 200000 -r 0 -R 0 -X 0 ref.fasta out.read1.fq out.read2.fq > wgsim.out
awk ' /222/ { gsub("2", "I"); print $0; next } { print } ' out.read1.fq > out.read1b.fq  
awk ' /222/ { gsub("2", "I"); print $0; next } { print } ' out.read2.fq > out.read2b.fq
bwa index ref.fasta 
bwa aln ref.fasta out.read1b.fq -f out1.sai
bwa aln ref.fasta out.read2b.fq -f out2.sai
bwa sampe -f out.sam -r '@RG\tID:foo\tSM:bar' ref.fasta out1.sai out2.sai out.read1b.fq out.read2b.fq
samtools view -hbS out.sam -o out.bam
samtools sort out.bam out_sorted
samtools index out_sorted.bam
samtools mpileup -Q 93 out_sorted.bam > mpileup.txt

Please let me know if I did not understand something or if there is a real issue here with the -Q function.

samtools • 7.7k views
Entering edit mode

perhaps 93 is an invalid value? just guessing try 92 or 60 or even 41

Entering edit mode
12.0 years ago
FGV ▴ 170

I'm under the impression that the -Q and -q option won't filter out anything from the pileup file. That is, even reads/bases are below the threshold they will still show up there. SAMTOOLS just won't take them into account when calculating genotype likelihoods (-g).

Also, -Q option in the manual: "-Q INT Minimum base quality for a base to be considered [13] "

which leads me to think that SAMTOOLS won't throw away entire read, but just ignore bases with lower than -Q quality when computing genotype likelihoods (GL)...


Login before adding your answer.

Traffic: 2155 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6