bowtie output statistic in RSEM
7.6 years ago
ccshao ▴ 10

I used RSEM to map paired-end RNA-seq data to a transcriptome, and I am confused by the bowtie output and by the final mapped BAM file.

The command is:

rsem-calculate-expression -p 6 --paired-end $fastqP_1_fa,$fastqP_2_fa  $fastqP_3_fa,$fastqP_4_fa $rsemIndex $output

$fastqP_1_fa and $fastqP_3_fa are the two mate files for one lane; $fastqP_2_fa and $fastqP_4_fa are the two mate files for the other lane. It is the same library sequenced in two lanes.

The bowtie output statistics are:

# reads processed: 31414684
# reads with at least one reported alignment: 15977643 (50.86%)
# reads that failed to align: 15437041 (49.14%)
Reported 89378248 paired-end alignments to 1 output stream(s)

However, across the two lanes there are 15577283 * 2 + 15837401 * 2 = 62829368 total reads. So what does "reads processed: 31414684" mean here? Why is that only half of the total reads? And what is 89378248?

The output BAM file has 209630578 lines, 46851725 total reads (these are unique reads, i.e., each is counted once even if it maps to multiple locations), and 15977643 mapped reads, which is the only number I understand.

Could someone give me a clue on this?

RNA-Seq bowtie RSEM • 3.1k views
7.6 years ago

Well, 2 * 31414684 = 62829368, so the line should probably say "read pairs processed" rather than "reads processed". One read pair may align to multiple locations, hence the 89378248 reported paired-end alignments.
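
To make the arithmetic explicit, here is a small Python check using only the numbers quoted in the question. The variable names are mine, and reading bowtie's "reads processed" as read pairs is the assumption being tested, not something taken from the bowtie documentation.

```python
# Read-pair counts per lane, as given in the question.
lane1_pairs = 15_577_283
lane2_pairs = 15_837_401

# Numbers copied from the bowtie summary in the question.
bowtie_processed = 31_414_684    # "# reads processed"
bowtie_aligned = 15_977_643      # "# reads with at least one reported alignment"
bowtie_alignments = 89_378_248   # "Reported ... paired-end alignments"

total_pairs = lane1_pairs + lane2_pairs
total_reads = 2 * total_pairs    # two mates per pair

# Assumption under test: bowtie counts pairs, not individual mates.
counts_are_pairs = (bowtie_processed == total_pairs)

# Percentage of pairs with at least one alignment.
pct_aligned = round(100 * bowtie_aligned / bowtie_processed, 2)

# Average number of reported alignments per aligned pair.
mean_alignments_per_aligned_pair = bowtie_alignments / bowtie_aligned

print("total pairs:", total_pairs)
print("total reads:", total_reads)
print("bowtie 'reads processed' equals total pairs:", counts_are_pairs)
print("percent aligned:", pct_aligned)
print("mean alignments per aligned pair:", mean_alignments_per_aligned_pair)
```

If the pair interpretation holds, the processed count matches the sum of the two lanes, and the reported alignments exceed the number of aligned pairs simply because each aligned pair is reported at several transcript locations on average.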

