Question: How to identify with Tophat maximum number of uniquely aligned reads by allowing mismatches only conditionally?
4.7 years ago, trakhtenberg (United States) wrote:

When identifying unique reads, I assume that if TopHat is set to allow mismatches, a read with a single perfect alignment may be tagged as having multiple alignments because additional mismatched alignments are accepted. On the other hand, if TopHat is set to disallow mismatches, reads whose only alignment contains one mismatch will be excluded. Is it possible to set TopHat parameters so that 1 mismatch is allowed only if a read has 0 alignments, then 2 mismatches only if this still yields 0 alignments, and so on (until a maximum of x accepted mismatches is reached)? Or is this best accomplished after the alignment, by filtering the output files (e.g., by alignment quality scores) before passing them to Cufflinks? Either way, how can this be accomplished? Thanks.

rna-seq • 2.4k views
4.7 years ago, Ashutosh Pandey wrote:

By default, TopHat reports the best (primary) alignments based on alignment scores (AS). So even if more than one alignment fulfills the mapping criteria given by the user, it will report only the best alignment. In other words, use the default settings during alignment and TopHat should be smart enough to report the alignments you are looking for. If you set very stringent parameters, such as allowing no mismatches, you will lose a lot of valid alignments. Use the defaults if you don't know how TopHat works.
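To illustrate why best-score reporting already behaves like the tiered scheme in the question (0 mismatches first, then 1, then 2, ...), here is a minimal Python sketch. It is not TopHat's code: the function names and the `(read_name, locus, mismatches)` tuples are hypothetical, and the mismatch count stands in for the alignment score as a simplification.

```python
from collections import defaultdict

def best_hits(alignments):
    """Map each read name to the loci of its fewest-mismatch alignments.

    `alignments` is an iterable of (read_name, locus, mismatches) tuples;
    this layout is an assumption made for the sketch.
    """
    by_read = defaultdict(list)
    for read, locus, mismatches in alignments:
        by_read[read].append((mismatches, locus))

    best = {}
    for read, hits in by_read.items():
        fewest = min(m for m, _ in hits)
        # Keep only hits tied at the lowest mismatch count for this read.
        best[read] = [locus for m, locus in hits if m == fewest]
    return best

def unique_reads(alignments):
    """Return the sorted names of reads with exactly one best hit."""
    return sorted(read for read, loci in best_hits(alignments).items()
                  if len(loci) == 1)

example = [
    ("r1", "chr1:100", 0),  # perfect hit
    ("r1", "chr5:900", 1),  # worse hit, so r1 still counts as unique
    ("r2", "chr2:50", 1),   # only hit has one mismatch, still unique
    ("r3", "chr3:10", 0),
    ("r3", "chr7:20", 0),   # tie at the best level, so r3 is a multi-aligner
]
print(unique_reads(example))
```

In this toy input, r1 and r2 each have a single best hit, while r3 has two equally good hits and is not counted as unique.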


So accepted_hits.bam will contain single best alignments, and multi-aligners only if they have the same AS? Thank you.

Reply written 4.7 years ago by trakhtenberg

Yes. If more than one alignment has the best AS (alignment score), TopHat2 will report all of them. However, the default setting for --max-multihits is 20, which means that if there are 30 alignments that all have the best AS, TopHat2 will report 20 of them at random. If there are only 5 alignments with the best AS, all of them will be reported. Alignments with the second-best AS won't be reported unless you use the --report-secondary-alignments option.
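If you then want to keep only uniquely aligned reads before Cufflinks, one option is to filter on the optional SAM tag NH (number of reported alignments for the read). The sketch below is a minimal Python illustration, assuming the alignments are available as SAM text lines and that the aligner wrote an NH:i tag; the function names and the demo records are made up for the example.

```python
def nh_value(sam_line):
    """Return the NH:i value of a SAM alignment line, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 SAM columns are mandatory; optional tags follow them.
    for tag in fields[11:]:
        if tag.startswith("NH:i:"):
            return int(tag[len("NH:i:"):])
    return None

def keep_unique(sam_lines):
    """Yield header lines plus alignment lines whose NH tag equals 1."""
    for line in sam_lines:
        if line.startswith("@") or nh_value(line) == 1:
            yield line

# Made-up records for illustration only.
demo = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t0\tchr1\t100\t50\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tNH:i:1",
    "r2\t0\tchr2\t200\t3\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tNH:i:2",
    "r2\t256\tchr9\t300\t3\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tNH:i:2",
]
for line in keep_unique(demo):
    print(line)
```

The input lines could come from something like `samtools view -h accepted_hits.bam`. Lines without an NH tag are dropped by this sketch, which may or may not be what you want.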

Reply written 4.7 years ago by Ashutosh Pandey

It's all clear now, thank you, but I have a follow-up question: when MAPQ score 4 (single best alignment) is assigned, does it take both reads in the pair into consideration, so that even if each read on its own is a multi-aligner, they may be unique as a pair? (In that case, each of these reads would get MAPQ 4 even though each on its own is a multi-aligner.)


Reply written 4.7 years ago by trakhtenberg