What Happened To -K In Tophat For Multiple-Mapping Reads?
11.0 years ago
gaelgarcia05 ▴ 280

Selecting -g n in tophat does not discard reads that map more than n times; instead, it reports only n alignments for those reads, chosen from among all of their top-scoring alignments.

I think there used to be an option -k that would allow one to discard reads that topped x alignments -- whatever happened to that? I only see -g in the tophat 2 manual, no reporting options like before...

multiple-alignment tophat2 rna-seq tophat • 3.1k views
10.0 years ago
Dan D 7.4k

You're correct that it appears to be gone, as it's now an unrecognized option on the command line. If I had to hazard a guess, I would speculate that it's because it's tricky to know what to do with the discarded reads. They're certainly not "unmapped," but then do you make another BAM for the "abundant" reads? Either that or there was a change to the algorithm where concurrency considerations made it difficult to track the total number of alignments for a given read until later on in the process, where any efficiency gains would be wiped out.

Fortunately, you can easily remove these reads downstream of tophat using BAMTools filter. For example, if you wanted to remove any read which mapped 20 or more times, you could supply the following JSON to the tool:

{
   "tag" : "NH:<20"
}
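
If BAMTools is not at hand, the same cutoff can be sketched in plain Python over SAM text. This is only an illustration, not part of tophat or BAMTools: the function names `nh_value` and `filter_by_nh` and the `max_hits` parameter are made up for this example, and it assumes every alignment carries an `NH:i:` tag (records without one are dropped here, which is a choice of this sketch).

```python
import sys


def nh_value(sam_line):
    """Return the NH tag (number of reported alignments) as an int, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 SAM columns are mandatory; optional tags start at column 12.
    for field in fields[11:]:
        if field.startswith("NH:i:"):
            return int(field[5:])
    return None


def filter_by_nh(lines, max_hits=20):
    """Yield header lines plus alignments whose NH is strictly below max_hits."""
    for line in lines:
        if line.startswith("@"):
            # Header lines pass through untouched.
            yield line
            continue
        nh = nh_value(line)
        # Assumption of this sketch: alignments lacking NH are discarded.
        if nh is not None and nh < max_hits:
            yield line


if __name__ == "__main__":
    # Reads SAM from stdin and writes the filtered SAM to stdout.
    sys.stdout.writelines(filter_by_nh(sys.stdin, max_hits=20))
```

Saved under a hypothetical name such as filter_nh.py, it could sit between two samtools calls, e.g. `samtools view -h accepted_hits.bam | python filter_nh.py | samtools view -b -o filtered.bam -`.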