Question: Not consistent result between SAM filtering and bowtie
gravatar for Picasa
4.2 years ago by
Picasa540 wrote:


I have a list of contaminants that I want to filter out from my paired end:

bowtie2 -x contaminants -1 pair1.fastq -2 pair2.fastq -S out.sam

Now I want to extract only unmapped reads from the .sam file but I want to keep only paired end.

So if read1 map somewhere and read2 not, I want to discard these PE.

samtools view -f 4 -bu input.bam | samtools view -f 8 -bu - | java -jar  picard-tools/picard.jar  SamToFastq  I=/dev/stdin F=out1.fq.gz F2=out2.fq.gz  FU=unpaired.fq.gz

I got a problem, the output of bowtie 2 is :

5317475 reads; of these:
15317475 (100.00%) were paired; of these:
15155323 (98.94%) aligned concordantly 0 times
155208 (1.01%) aligned concordantly exactly 1 time
6944 (0.05%) aligned concordantly >1 times
15155323 pairs aligned concordantly 0 times; of these:
95593 (0.63%) aligned discordantly 1 time
15059730 pairs aligned 0 times concordantly or discordantly; of these:
30119460 mates make up the pairs; of these:
29853788 (99.12%) aligned 0 times
195884 (0.65%) aligned exactly 1 time
69788 (0.23%) aligned >1 times
2.55% overall alignment rate

which mean I should have 15155323 PE right ?

But when I do :

grep '@' out1.fastq | wc -l

I got : 14798490 PE

1) What's wrong with my command ?

2) Moreover my unpaired.fq.gz is empty

bowtie sam • 1.3k views
ADD COMMENTlink modified 3.7 years ago by Biostar ♦♦ 20 • written 4.2 years ago by Picasa540

Hello picasa1983!

It appears that your post has been cross-posted to another site:

This is typically not recommended as it runs the risk of annoying people in both communities.

ADD REPLYlink written 4.2 years ago by Pierre Lindenbaum130k

Sorry but I try to increase my chance of answers. This problem annoy me and it seems like nobody got an answer.

ADD REPLYlink written 4.2 years ago by Picasa540

Try read binning approach for a 1 step solution: C: BBSplit syntax for generating builds for the reference genome and how to call di

ADD REPLYlink modified 4.2 years ago • written 4.2 years ago by genomax89k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 761 users visited in the last hour