Question

How Does Tophat Deal With Low Quality Bases?

1

13.4 years ago

pinkiii1984v ▴ 20

Hi,

I am working with paired-end (PE) RNA-seq data. Some of my samples have poor-quality bases at the end of the reverse reads. How does TopHat deal with such reads? Are these bases clipped during mapping, and how does that affect the mapping quality?

Thank you

tophat rna-seq • 4.8k views

ADD COMMENT • link 13.4 years ago by pinkiii1984v ▴ 20

0

AFAIK TopHat does not perform soft or hard clipping, so you'll get better read mapping if you clip the low-quality bases yourself. I clip from the end of the read and keep the read if the clipped length is >= 50 bases; otherwise I remove it.
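For illustration only (this is not the poster's actual script), the trim-then-filter step described above could be sketched as follows. The Phred+33 encoding, the Q20 cutoff, and the function names are assumptions made for the example; only the 50-base minimum length comes from the post.

```python
def trim_3prime_low_quality(seq, qual, min_q=20, offset=33):
    """Strip trailing bases whose Phred quality is below min_q.

    Assumes Phred+33 encoded quality strings; min_q=20 is an
    illustrative threshold, not one given in the thread.
    """
    end = len(seq)
    # Walk back from the 3' end until a base of acceptable quality is found.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def trim_and_filter(seq, qual, min_q=20, min_len=50, offset=33):
    """Trim the 3' end, then keep the read only if it is still >= min_len bases."""
    trimmed_seq, trimmed_qual = trim_3prime_low_quality(seq, qual, min_q, offset)
    if len(trimmed_seq) >= min_len:
        return trimmed_seq, trimmed_qual
    return None  # read is too short after trimming and is discarded


if __name__ == "__main__":
    # 60 high-quality bases ('I' = Q40) followed by 5 low-quality ones ('#' = Q2).
    read = "A" * 60 + "C" * 5
    quals = "I" * 60 + "#" * 5
    print(trim_and_filter(read, quals))
```

Note that this sketch treats each read on its own. With paired-end data, discarding one mate means the two FASTQ files must be brought back in sync before mapping, which is not handled here.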

ADD REPLY • link 13.4 years ago by Arun 2.4k

score 1 · Answer 1 · 2012-07-03

As Arun puts it (and that should be an answer rather than a comment ;-) ), very few tools use qualities directly. Rightly so, I might add, since the way base quality measures are generated lacks a proper foundation, at least with respect to the numerical probabilities they stand for.

Note that a good base quality is 40, which means a one in 10,000 chance of being wrong, yet at the same time just about all sequencing platforms introduce about 1 miscall per 100 bases. Trimming reads back from their ends prior to processing is the most common approach.
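The arithmetic behind those numbers follows from the standard Phred definition, Q = -10 * log10(p), where p is the probability that the base call is wrong. A small sketch (the function names are made up for this example):

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score to the implied error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability to the equivalent Phred quality score."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Error probability implied by a reported quality of 40.
    print(phred_to_error_prob(40))
    # Phred score that corresponds to roughly 1 miscall per 100 bases.
    print(error_prob_to_phred(0.01))
```

So a reported Q40 claims one error in 10,000 calls, while an observed rate of about one miscall per 100 bases corresponds to only Q20.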

score 0 · Answer 2 · 2012-07-03

0

13.4 years ago

pinkiii1984v ▴ 20

If I don't trim the poor-quality bases, how does TopHat deal with them?

ADD COMMENT • link 13.4 years ago by pinkiii1984v ▴ 20

0

Please add your contributions as a follow-up comment rather than as a new answer. As for your question: the quality string is simply ignored.

ADD REPLY • link 13.4 years ago by Istvan Albert 103k

0

If TopHat ignores the quality string, will that affect the mapping quality?

ADD REPLY • link 13.3 years ago by pinkiii1984v ▴ 20

0

I think (but you should check this with the developers) that it ignores the qualities during the alignment procedure but does make use of them when the mapping quality is computed; at least this is what maq/bwa does: http://maq.sourceforge.net/qual.shtml. And now just a personal opinion: in general, don't read too much into the qualities. As values they are rough approximations whose accuracy is far lower than what is implied.

ADD REPLY • link 13.3 years ago by Istvan Albert 103k