Question: RSEM ignores quality scores?
Asked 2.8 years ago by AW (United Kingdom):

Dear all,

I want to quantify gene expression for my Trinity transcriptome assembly using RSEM. I am getting confusing results depending on how I specify the quality score encoding of my Illumina paired-end reads. They are Illumina 1.5 reads with Phred+64 quality scores.

The RSEM manual says that phred+33 is default:

--phred33-quals Input quality scores are encoded as Phred+33. (Default: on)

--phred64-quals Input quality scores are encoded as Phred+64 (default for GA Pipeline ver. >= 1.3). (Default: off)
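Before choosing between the two flags, it can help to confirm the encoding directly from the FASTQ quality lines. The following is a minimal sketch of a common heuristic, not part of RSEM or Bowtie2: the function name `guess_phred_offset` and the cut-off values are assumptions for illustration (characters below `;` only occur in Phred+33 data, characters above `J` normally only occur in Phred+64 Illumina data).

```python
def guess_phred_offset(quality_strings):
    """Guess the ASCII offset (33 or 64) from FASTQ quality strings.

    Hypothetical helper for illustration. Returns None when the observed
    characters are compatible with both encodings.
    """
    codes = [ord(c) for q in quality_strings for c in q]
    if not codes:
        return None
    if min(codes) < 59:   # below ';' : impossible in Phred+64 / Solexa+64
        return 33
    if max(codes) > 74:   # above 'J' (Q41 in Phred+33): assume Phred+64
        return 64
    return None           # ambiguous range, inspect more reads


# Quality strings typical of each encoding (made-up examples).
print(guess_phred_offset(["IIII#III"]))   # contains '#', so Phred+33
print(guess_phred_offset(["hhhhBBBB"]))   # 'h' and 'B' are typical of Illumina 1.5
```

Running this over the quality lines of the first few thousand reads is usually enough; a `None` result means more reads are needed before deciding.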

a. When I specify --phred64-quals, I get the following alignment stats:

rsem-calculate-expression --paired-end 1.out_paired.fastq 2.out_paired.fastq Trinity.fasta Trinity --phred64-quals --bowtie2 -p 32

19618532 reads; of these:
  19618532 (100.00%) were paired; of these:
    3453455 (17.60%) aligned concordantly 0 times
    3684847 (18.78%) aligned concordantly exactly 1 time
    12480230 (63.61%) aligned concordantly >1 times
82.40% overall alignment rate

bowtie2 -q --phred64 --sensitive --dpad 0 --gbar 99999999 --mp 1,1 --np 1 --score-min L,0,-0.1 -I 1 -X 1000 --no-mixed --no-discordant -p 10 -k 200 -x Trinity.fasta -1 1.out_paired.fastq -2 2.out_paired.fastq | samtools view -S -b -o temp/ bam -

b. However, I get almost exactly the same stats (the counts differ by only one or two) when I don’t specify Phred+64. How can this be if RSEM defaults to Phred+33?

rsem-calculate-expression --paired-end 1.out_paired.fastq 2.out_paired.fastq Trinity.fasta Trinity --bowtie2 -p 32

19618532 reads; of these:
  19618532 (100.00%) were paired; of these:
    3453456 (17.60%) aligned concordantly 0 times
    3684845 (18.78%) aligned concordantly exactly 1 time
    12480231 (63.61%) aligned concordantly >1 times
82.40% overall alignment rate

I checked the RSEM standard output and it does pass --phred33 to Bowtie2. So Phred+33 is the default, but how can it give the same alignment?

bowtie2 -q --phred33 --sensitive --dpad 0 --gbar 99999999 --mp 1,1 --np 1 --score-min L,0,-0.1 -I 1 -X 1000 --no-mixed --no-discordant -p 10 -k 200 -x Trinity.fasta -1 1.out_paired.fastq -2 2.out_paired.fastq | samtools view -S -b -o temp/ bam -

c. When I explicitly specify --phred33-quals, I get the same alignment stats as in (b):

rsem-calculate-expression --paired-end 1.out_paired.fastq 2.out_paired.fastq Trinity.fasta Trinity --phred33-quals --bowtie2 -p 32

19618532 reads; of these:
  19618532 (100.00%) were paired; of these:
    3453456 (17.60%) aligned concordantly 0 times
    3684845 (18.78%) aligned concordantly exactly 1 time
    12480231 (63.61%) aligned concordantly >1 times
82.40% overall alignment rate

Any help would be much appreciated.

A

Answer posted 2.8 years ago by AW:

Response from Bo Li

Hi Alison,

RSEM does not ignore quality scores. It uses the quality scores to help it allocate multi-mapping reads. But when RSEM calls Bowtie2, we set the Bowtie2 parameters in a way that ignores quality scores. In particular, we set '--mp 1,1', which implies a mismatch penalty of 1 regardless of the quality score.

Hope it helps, Bo

https://groups.google.com/forum/#!topic/rsem-users/21DFXodwWzA
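The effect of '--mp 1,1' described in the response can be illustrated with a short sketch. It assumes the mismatch penalty formula given in the Bowtie2 manual for '--mp MX,MN', namely MN + floor((MX - MN) * min(Q, 40) / 40), with defaults MX=6 and MN=2. The function name `mismatch_penalty` and the example quality character are made up for illustration; this is not RSEM or Bowtie2 code.

```python
import math


def mismatch_penalty(q, mx=6, mn=2):
    """Mismatch penalty for Phred quality q, per the Bowtie2 manual formula.

    Defaults mirror Bowtie2's default --mp 6,2 (assumption stated above).
    """
    return mn + math.floor((mx - mn) * (min(q, 40.0) / 40.0))


# The same quality character decoded under the two offsets.
char = "B"                   # low-quality marker in Illumina 1.5 reads
q_as_64 = ord(char) - 64     # Q2 when read as Phred+64
q_as_33 = ord(char) - 33     # Q33 when misread as Phred+33

# With Bowtie2's default penalties the offset changes the score...
print(mismatch_penalty(q_as_64), mismatch_penalty(q_as_33))              # 2 5

# ...but with --mp 1,1 (as RSEM sets it) every mismatch costs exactly 1.
print(mismatch_penalty(q_as_64, 1, 1), mismatch_penalty(q_as_33, 1, 1))  # 1 1
```

Because MX equals MN, the quality-dependent term is multiplied by zero, so the choice between --phred33 and --phred64 cannot change the alignment scores of mismatches. This is consistent with the near-identical alignment stats in the question.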
