Question: unmapped reads in paired-end run pf bowtie2
0
gravatar for Assa Yeroslaviz
15 months ago by
Assa Yeroslaviz1.2k
Munich
Assa Yeroslaviz1.2k wrote:

I have a data set of paired-end samples which I'm mapping with bowtie2.

I am mainly focus on the unmapped reads, as these are the reads we're interested in.

This is a sample output of the mapping results from on of my runs:

11216394 reads; of these:
11216394 (100.00%) were paired; of these:
8079466 (72.03%) aligned concordantly 0 times
2987422 (26.63%) aligned concordantly exactly 1 time
149506 (1.33%) aligned concordantly >1 times
----
8079466 pairs aligned concordantly 0 times; of these:
  417642 (5.17%) aligned discordantly 1 time
----
7661824 pairs aligned 0 times concordantly or discordantly; of these:
  15323648 mates make up the pairs; of these:
    15226093 (99.36%) aligned 0 times
    63924 (0.42%) aligned exactly 1 time
    33631 (0.22%) aligned >1 times
32.13% overall alignment rate

Out of this output, how many unmapped reads do I have?

From my understanding I was thinking it is the fourth row from the bottom = 15226093?

If I calculate all three options in this summary i have aligned once - (29874222) + (417642 *2) + 63924 = 6874052 aligned >1 - (1495062) + 33631 = 332643 unaligned - 15226093 sum = 6874052 + 332643 + 15226093 = 22432788

This last number is the sum of my input reads of this sample.

But When I look at the number of reads in the unmapped.fastq file after the mapping it corresponds to the third row of the file (x2) = 8079466 x2 = 16158932?

But than adding the mapped and unmapped reads together give me more reads, than the there were in total.

Can anyone help me understand what is happening here?

How do I calculate the number of unmapped reads in a paired-end mapped sample?

thanks Assa

ADD COMMENTlink modified 15 months ago by Devon Ryan92k • written 15 months ago by Assa Yeroslaviz1.2k

What was the exact command you used? There are a few options for outputting various subsets of unmapped reads, it's likely you used the wrong one.

ADD REPLYlink written 15 months ago by Devon Ryan92k

This is the commands i use

bowtie2 --sensitive --end-to-end -p 10 --un-conc-gz firstRun.Unmapped.gz --phred33 -x genome -1 1.fq.gz -2 2.fq.gz | samtools view -Sbhu -F 4 - | samtools sort - -o sorted.bam
samtools index sorted.bam

The file I need in the next step is firstRun.Unmapped.gz.

I didn't subset the bam file with samtools, I thought the unmapped reads are outputted to this file.

ADD REPLYlink written 15 months ago by Assa Yeroslaviz1.2k

Just to add - this is the output of the same file when running the samtools flagstat command

7206695 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
7206695 + 0 mapped (100.00% : N/A)
7206695 + 0 paired in sequencing
3615165 + 0 read1
3591530 + 0 read2
6273856 + 0 properly paired (87.06% : N/A)
7131084 + 0 with itself and mate mapped
75611 + 0 singletons (1.05% : N/A)
3574 + 0 with mate mapped to a different chr
719 + 0 with mate mapped to a different chr (mapQ>=5)

I can't see how the numbers of the stat (bowtie2 output) file and the flagstat output correlate.

ADD REPLYlink modified 15 months ago • written 15 months ago by Assa Yeroslaviz1.2k
1
gravatar for Devon Ryan
15 months ago by
Devon Ryan92k
Freiburg, Germany
Devon Ryan92k wrote:

The discrepency you're seeing is due to using --un-conc-gz, which is writing the 8079466 pairs that didn't align concordantly to the specified fastq files. You'll instead want --un-gz instead.

ADD COMMENTlink written 15 months ago by Devon Ryan92k

I thought so too. But if I use the same command but with --un-gz and the input files -1 1.fq.gz -2 2.fq.gz, the unmapped file is empty.

Is there a way to map the files in a paired-end modus and still get the unmapped reads, not pairs?

ADD REPLYlink written 15 months ago by Assa Yeroslaviz1.2k

Then I'd suggest not writing them to a file with bowtie2 but instead keeping them in the SAM output and making fastq files from that (samtools fastq -f 4).

ADD REPLYlink written 15 months ago by Devon Ryan92k

thanks, I'll look into it

ADD REPLYlink modified 15 months ago • written 15 months ago by Assa Yeroslaviz1.2k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1674 users visited in the last hour