Remove both pair end reads with low mapping quality using Samtools
1
1
Entering edit mode
9.4 years ago
DVA ▴ 630

To my understanding, when use samtools -q 20 on paired end sequencing data, if one read of a pair has quality 30, the other has 0, only the 0 quality one will be removed. So I'm wondering if samtools has a option to remove both, as I'm trying to test my program and I prefer my reads stay paired.

Thank you very much!

mapping-quality samtools • 3.7k views
ADD COMMENT
1
Entering edit mode

Does that actually happen? It must be fairly rare and, what samtools utility are you running? If it's mpileup, then you don't need to worry about reads as it's doing the per-base pileup. In short, you're either creating unnecessary work for yourself or it'd be good to give more information about what you are trying to do.

ADD REPLY
0
Entering edit mode

brentp : I've just tested a few bams from 1000G, it doesn't happen often, but it happens:

$ curl -s "ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/150140/alignment/150140.chrom20.ILLUMINA.bwa.CHM1.20131218.bam"  |\
samtools view  - |\
head -n 20000 | \
grep -E '(SRR642643.96013593|SRR642643.4610992|SRR642643.101471738)' | sort -k1,1 | cut -f 1-9

SRR642643.101471738    163    20    61043    37    100M    =    61133    124
SRR642643.101471738    83    20    61133    36    65S35M    =    61043    -124
SRR642643.4610992    163    20    61252    29    38M62S    =    61340    187
SRR642643.4610992    83    20    61340    37    100M    =    61252    -187
SRR642643.96013593    163    20    61234    29    40M60S    =    61322    187
SRR642643.96013593    83    20    61322    37    100M    =    61234    -187
ADD REPLY
0
Entering edit mode

you mean because one end has a qual of 37 and the other of 29?

ADD REPLY
0
Entering edit mode

It happens to me when I try to test some of my own software. I guess it is very rare in real sequencing analysis. Thank you.

ADD REPLY
1
Entering edit mode
9.4 years ago

Samtools doesn't have such an option (it would be too rarely needed to incorporate it). You can query sort the BAM file and write a little Python script with pysam to process the pairs as a unit. That'd be a simple enough way to produce the same result.

ADD COMMENT
0
Entering edit mode

Thanks! I believe there is no simple ways to get around with it:)

ADD REPLY

Login before adding your answer.

Traffic: 1591 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6