Question

tophat2 0% mapping rate for strand-specific RNA-seq

7.9 years ago

ag1194 • 0

Hi,

I am a new tophat user running my first RNA-seq analysis, using already-published data as a test. However, I got a 0.0% mapping rate, so I am obviously making a mistake.

The data I am using are strand-specific RNA-seq, which I accessed through the GEO database. I ran tophat with the options given in the paper, using the following command:

tophat2 -r 25 --coverage-search -G UCSC/refGene_mm9.gtf --library-type fr-firststrand /bowtie2/mm9  subset697.fastq &> tophat.log

I was looking at the Nature Protocols paper for tophat and noticed that, for strand-specific data, they provide two different fastq files. I think that is the problem in my case, since I only have left_reads in my tmp folder. Does anybody have experience with strand-specific RNA-seq? And if I do need two fastq files, any ideas on what I can do, given that the data in the GEO database consist of one big fastq file?

Thank you!! -A

RNA-Seq • 2.1k views

updated 7.9 years ago by Devon Ryan • written 7.9 years ago by ag1194

Hi, I am using tophat2 to align fastq files downloaded from dbGaP and have the same issue. Have you figured out a solution yet? Thanks.

7.4 years ago by wwu222

score 0 · Answer 1 · 2016-06-04

Depending on when this was sequenced, fr-firststrand may not be appropriate. You might try fr-secondstrand in case this is an old dataset. Also, drop -r 25: it sets the expected inner distance between mates, which only applies to paired-end data, and yours is single-end. (To answer your question on that point: no, you don't need two files.) Your full command should be:

tophat2 --coverage-search -G UCSC/refGene_mm9.gtf --library-type fr-secondstrand /bowtie2/mm9  subset697.fastq &> tophat.log

You could alternatively try fr-unstranded, which should always work.
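
If it is unclear which library type fits, one option is to run the same reads once per library type and compare the summaries. The following is only a rough sketch: the index, GTF, and FASTQ paths are copied from the commands in this thread, the out_<type> directory and log names are made up for illustration, and it assumes each TopHat2 output directory contains an align_summary.txt with a line mentioning "overall read mapping rate".

```shell
# Print the overall mapping-rate line from a TopHat2 output directory.
# Assumption: TopHat2 wrote align_summary.txt into that directory.
mapping_rate() {
  grep -i "overall read mapping rate" "$1/align_summary.txt"
}

# Align the same single-end reads once per library type, each run in its own
# output directory (out_<type> is a hypothetical naming scheme).
try_library_types() {
  for lib in fr-unstranded fr-firststrand fr-secondstrand; do
    tophat2 -o "out_${lib}" --coverage-search -G UCSC/refGene_mm9.gtf \
      --library-type "${lib}" /bowtie2/mm9 subset697.fastq \
      > "tophat_${lib}.log" 2>&1
    # One line per run: library type, then the reported mapping rate.
    printf '%s\t%s\n' "${lib}" "$(mapping_rate "out_${lib}")"
  done
}
```

Calling try_library_types from the directory that holds subset697.fastq prints one line per library type. If every run still reports 0%, the library type is probably not the cause, and the index or the reads themselves are worth checking.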

As an aside, I'd recommend ditching tophat2 and switching to hisat2, which has replaced it.
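
For anyone making that switch, here is a hedged sketch of how the tophat2 library types correspond to hisat2's --rna-strandness option for single-end reads. It prints the command rather than running it. The index path hisat2_index/mm9 and the output name subset697.sam are hypothetical; hisat2 needs its own index built with hisat2-build and cannot use the bowtie2 index from the tophat2 command.

```shell
# Map a tophat2 --library-type value to hisat2's --rna-strandness value for
# single-end reads: R = read is the reverse complement of the transcript,
# F = read matches the transcript, empty = unstranded (the hisat2 default).
hisat2_strandness() {
  case "$1" in
    fr-firststrand)  echo "R" ;;
    fr-secondstrand) echo "F" ;;
    fr-unstranded)   echo "" ;;
    *) echo "unknown library type: $1" >&2; return 1 ;;
  esac
}

# Build and print (not run) a single-end hisat2 command.
# hisat2_index/mm9 and subset697.sam are placeholder names for this sketch.
hisat2_command() {
  strand="$(hisat2_strandness "$1")" || return 1
  cmd="hisat2 -x hisat2_index/mm9 -U subset697.fastq -S subset697.sam"
  if [ -n "${strand}" ]; then
    cmd="${cmd} --rna-strandness ${strand}"
  fi
  echo "${cmd}"
}
```

For example, hisat2_command fr-secondstrand prints a command ending in --rna-strandness F, while fr-unstranded leaves the option off entirely.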