Question: Paired-end stranded RNA-seq...still confused about TopHat
3.5 years ago by
cadeans10 wrote:

Hey Everyone,

I'm trying to figure out the appropriate TopHat settings for strand-specific, paired-end RNA-seq data. I've read several posts about this but am still uncertain about the right settings. I'm hoping someone can confirm that my understanding of sense/antisense and forward/reverse reads is correct. As I understand it:

1) mRNA is an exact match to the DNA coding sequence (aside from U in place of T and the absence of introns) and matches the sense strand
2) in library prep (TruSeq stranded in my case) the first strand of the cDNA library, which is antisense to the original gene, is used for sequencing, while the second strand, which is dUTP-marked, gets degraded
3) for paired-end sequencing, after bridge PCR and sequencing, the sense strand becomes read 1 (forward read) and the antisense strand becomes read 2 (reverse read).

So, when running TopHat, read 1 (/1) is used for the forward read and read 2 (/2) for the reverse read, with the library type set to fr-firststrand?

My ultimate goal is to use the BAM files from TopHat to get raw counts in HTSeq, where I understand the appropriate library setting is "reverse". Btw, I'm doing all this in Galaxy, as I'm not very proficient at coding. Thanks in advance for any reassurance.
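For readers working on the command line rather than in Galaxy, the settings discussed above could be sketched roughly as follows. This is only an illustration under assumptions: the index name, file names, and output directory are hypothetical placeholders, and the helper functions are made up for this example. The flags themselves (`--library-type fr-firststrand` and `-o` for TopHat; `-f`, `-r`, and `-s reverse` for htseq-count) are the ones the two tools document, but check them against the versions you have installed.

```python
import shlex


def tophat_command(index, reads_1, reads_2, out_dir="tophat_out"):
    """Build a TopHat call for a dUTP-type (e.g. TruSeq stranded) paired-end library."""
    return [
        "tophat",
        "--library-type", "fr-firststrand",  # library type used for dUTP protocols
        "-o", out_dir,                       # output directory (hypothetical name)
        index,                               # Bowtie index prefix (hypothetical name)
        reads_1,                             # file with the /1 reads
        reads_2,                             # file with the /2 reads
    ]


def htseq_command(bam, gtf):
    """Build an htseq-count call matching a fr-firststrand library."""
    return [
        "htseq-count",
        "-f", "bam",      # input is BAM
        "-r", "pos",      # assumption: the BAM is sorted by position
        "-s", "reverse",  # counterpart of fr-firststrand
        bam,
        gtf,
    ]


def as_shell(cmd):
    """Render an argument list as a single shell-quoted line."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    print(as_shell(tophat_command("genome_index", "sample_1.fastq", "sample_2.fastq")))
    print(as_shell(htseq_command("tophat_out/accepted_hits.bam", "genes.gtf")))
```

The script only prints the two command lines; it does not run anything. htseq-count writes its counts to standard output, so in practice you would redirect that to a file.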

rna-seq alignment • 3.6k views
3.5 years ago by
Istvan Albert ♦♦ 79k wrote:

Using the words forward and reverse in this context introduces more confusion, since THIS forward (sense direction) is not THAT forward (genomic forward). You should not use the words forward and reverse in this context; just keep the sense/antisense terminology.

Due to the vagaries of the library prep, the reads in file 2 will correspond to the sense direction of the transcript (they will indicate the 5' ends of the fragments that came from the transcript).
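To make the rule above concrete, here is a small sketch of how the transcript's strand could be worked out from an aligned read. It assumes you already know, for each read, whether it is mate 1 or mate 2 and whether it aligned to the reverse genomic strand (both are recorded in the SAM flag). The function name and arguments are hypothetical and written only for this illustration.

```python
def transcript_strand(is_read1, is_reverse, library_type="fr-firststrand"):
    """Return '+' or '-': the genomic strand of the transcript the read came from.

    is_read1   -- True for mate 1 (/1), False for mate 2 (/2)
    is_reverse -- True if the read aligned to the reverse genomic strand
    """
    if library_type not in ("fr-firststrand", "fr-secondstrand"):
        raise ValueError("unsupported library type: " + library_type)

    aligned = "-" if is_reverse else "+"

    # fr-firststrand (dUTP): mate 2 carries the sense sequence.
    # fr-secondstrand: mate 1 carries the sense sequence.
    sense_mate_is_read1 = library_type == "fr-secondstrand"

    if is_read1 == sense_mate_is_read1:
        # This mate is the sense mate, so it aligns to the transcript's own strand.
        return aligned
    # This mate is antisense, so the transcript lies on the opposite strand.
    return "+" if aligned == "-" else "-"


if __name__ == "__main__":
    # A dUTP pair from a gene on the '+' strand:
    # mate 1 aligns in reverse, mate 2 aligns forward.
    print(transcript_strand(is_read1=True, is_reverse=True))
    print(transcript_strand(is_read1=False, is_reverse=False))
```

Both mates of a properly paired fragment should give the same answer, which is a handy sanity check on a few reads from a gene whose strand you know.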


Hi Istvan,

Thanks for the input. Regarding my use of forward and reverse, I am a little confused by their meaning. For my general understanding of the process, I tried to follow the sense and antisense strands through library prep and sequencing. I end up with the first sequenced read as matching the sense strand, but I think I might be misunderstanding something about the bridge PCR and sequencing, particularly because you say the sense transcripts should be present in the read 2 files. Either I'm missing something in the sequencing process or I don't understand how read 1 and read 2 are defined. Could you shed some light on this for me, please? Thanks in advance.

My reasoning:

The first strand of cDNA is antisense (this is what remains after dUTP degradation) --> adapters are added to this strand, and during bridge PCR the complement (sense strand) is created --> after further clustering, all sense strands are washed away, leaving only antisense strands --> the sequencing process uses these antisense strands as a template to produce reads that are sense (and presumably these are the first reads in the fasta file: read 1, /1).

written 3.5 years ago by cadeans10

I think this paper has a good explanation (though I can't check since I can't access it from here)


written 3.5 years ago by Istvan Albert ♦♦ 79k

Hi cadeans, did you figure out why the reads in file 2 correspond to the sense direction of the transcript? I have looked through the article but still don't get it ...

Here is my reasoning:

5' ----------- 3' mRNA fragment

   --> read 1
3' ----------- 5' cDNA (first strand)
5' ----------- 3' cDNA (second strand that is reverse and complementary to the first strand)
           <-- read 2
modified 2.2 years ago • written 2.2 years ago by flsnike10
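One way to see why file 2 ends up holding the sense reads is to follow a toy sequence through a dUTP-style prep. The sketch below assumes the usual description of these kits: only the first-strand cDNA (antisense) survives, read 1 reports that strand starting from its 5' end, and read 2 reports the opposite strand starting from the other end of the fragment. The fragment sequence and function names are made up for illustration, and the mRNA is written in DNA letters for simplicity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def dutp_read_pair(mrna_fragment, read_len):
    """Return (read1, read2) for an mRNA fragment written 5'->3'."""
    first_strand = revcomp(mrna_fragment)  # antisense cDNA, written 5'->3'
    read1 = first_strand[:read_len]        # read 1: same sequence as the first strand
    read2 = mrna_fragment[:read_len]       # read 2: opposite strand, i.e. the sense sequence
    return read1, read2


if __name__ == "__main__":
    fragment = "ATGGCCAAGTTTCCGA"  # hypothetical mRNA fragment
    read1, read2 = dutp_read_pair(fragment, 5)
    print("read 1:", read1)  # antisense; its reverse complement is the 3' end of the fragment
    print("read 2:", read2)  # sense; identical to the 5' end of the fragment
```

In terms of the diagram above, read 1 would sit on the first-strand cDNA line and read 2 on the line that has the same sequence as the mRNA, starting from its 5' end.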
3.5 years ago by
Adrian Pelin2.2k wrote:

I think all of this is correct, except for a minor point on (1): there is this thing called RNA editing (the mRNA sequence differs from the DNA exons), which happens in some weirdo eukaryotic parasites but also in plants.
