Tophat Library Type And Pair Orientation For Illumina Data
8.7 years ago
Chai_AF ▴ 80

I have some questions about read orientation in TopHat that I have not been able to find clear answers to, and I hope you can shed some light on them.

1. What is the orientation of the reads for paired-end data under the library types fr-unstranded, fr-firststrand, and fr-secondstrand? I am not sure whether these strand-specific settings are related to the pair orientation (ff, fr, rf) in Bowtie. In Bowtie, this parameter (ff, fr, rf) needs to be specified, and TopHat uses Bowtie for alignment. However, I cannot find a way to set this parameter in TopHat.

2. The description of the library type says: "Reads from the left-most end of the fragment (in transcript coordinates) map to the transcript strand, and the right-most end maps to the opposite strand." Does the left-most end of the fragment in transcript coordinates correspond to the read in the first file of the paired-end data?

3. Some of the public data that I collected come with no information about the library type, only the platform (standard Illumina or strand-specific Illumina). Can I assume that they are fr-unstranded and fr-secondstrand, respectively?

tophat library rna • 11k views
8.7 years ago
Ido Tamir 5.2k

1) It tells TopHat how the library was prepared from the mRNA and which direction the reads have with respect to the original mRNA.

It is different from the Bowtie parameter, which only concerns how the paired-end reads are oriented towards each other (-> ->, -> <-, <- ->). The library type also applies to single-end reads.

tophat:

ff-firststrand: -> ->
fr-firststrand: -> <-


There are 3 possibilities (and I hope I don't mix up the last two):

• Unstranded is the simplest: the directionality of the mRNA was not preserved. You get reads that are sense as well as antisense to the mRNA (ideally half and half).
• firststrand: Directionality is preserved, and one sequences the first strand that is generated, which is antisense to the mRNA. The read you get out should be equivalent in directionality to the reverse complement of the mRNA, and you come in from the right (3' end of the mRNA).
• secondstrand: Directionality is preserved, and one sequences the second strand that is generated, which is sense to the mRNA. The read you get out should be equivalent in directionality to the mRNA, and you come in from the left (5' end of the mRNA).
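As a rough illustration of the three cases above, here is a small sketch that works out which transcript strand a single alignment implies under each library type. The function name `transcript_strand` and its arguments are made up for this example and are not part of TopHat; the rules follow the library-type descriptions in the TopHat manual as I understand them, so treat it as a sanity check rather than a description of what TopHat does internally.

```python
# Hypothetical helper (not a TopHat API): infer the transcript strand
# implied by one aligned read under a given --library-type value.

FLIP = {"+": "-", "-": "+"}


def transcript_strand(library_type, is_read1, read_strand):
    """Return '+', '-', or None if the strand cannot be inferred.

    library_type -- 'fr-unstranded', 'fr-firststrand' or 'fr-secondstrand'
    is_read1     -- True for read 1 or a single-end read, False for read 2
    read_strand  -- genomic strand the read aligned to, '+' or '-'
    """
    if read_strand not in FLIP:
        raise ValueError("read_strand must be '+' or '-'")
    if library_type == "fr-unstranded":
        # Strand information was lost during library preparation.
        return None
    if library_type == "fr-secondstrand":
        # Read 1 is sense to the transcript, read 2 is antisense.
        return read_strand if is_read1 else FLIP[read_strand]
    if library_type == "fr-firststrand":
        # Read 1 is antisense to the transcript, read 2 is sense.
        return FLIP[read_strand] if is_read1 else read_strand
    raise ValueError("unknown library type: " + library_type)


if __name__ == "__main__":
    for lib in ("fr-unstranded", "fr-firststrand", "fr-secondstrand"):
        print(lib, "read 1 on '+' implies transcript strand:",
              transcript_strand(lib, True, "+"))
```

Note that within a properly paired fr fragment, read 1 and read 2 align to opposite genomic strands, so both mates imply the same transcript strand.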

2) This depends on the library preparation, which is why you have to specify it.

3) You cannot assume anything about public data. Try it and check whether the result makes sense. For example, if you see roughly equal numbers of reads coming from both strands, the library was unstranded.
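One way to run the check from point 3 is to take read 1 alignments that overlap annotated genes and count how often the read strand matches the gene strand. The sketch below assumes you have already extracted such (read strand, gene strand) pairs by some other means; the function name `guess_library_type` and the 0.8 cutoff are arbitrary choices for illustration, not anything TopHat defines.

```python
# Hypothetical strandedness check on pre-extracted strand pairs.
# Each pair is (strand read 1 aligned to, strand of the overlapping gene).


def guess_library_type(pairs, cutoff=0.8):
    """Return (guessed library type, fraction of sense read-1 alignments)."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (read_strand, gene_strand) pair")
    sense = sum(1 for read_strand, gene_strand in pairs
                if read_strand == gene_strand)
    frac = sense / len(pairs)
    if frac >= cutoff:
        # Read 1 is mostly sense to the gene.
        return "fr-secondstrand", frac
    if frac <= 1 - cutoff:
        # Read 1 is mostly antisense to the gene.
        return "fr-firststrand", frac
    # Reads come from both strands in comparable numbers.
    return "fr-unstranded", frac


if __name__ == "__main__":
    toy = [("+", "+")] * 5 + [("-", "+")] * 5  # made-up data
    print(guess_library_type(toy))
```

A fraction that is clearly away from 0.5 but short of the cutoff (say 0.65) is reported as unstranded here, though in practice it would deserve a closer look at the data.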


Thank you, Ido, for the answers. By the way, does the fr in fr-unstranded, fr-firststrand, and fr-secondstrand stand for forward-reverse? I wonder whether the paired-end read orientation can be set in TopHat.


Thanks are expressed by upvoting. I never had to set the read orientation for TopHat. For me it worked with the default.


"I never had to set read orientation for tophat. For me it worked with the default". Is this true? If you have fr-secondstrand data and you run TopHat with the default (that is, you do not specify any library type on the command line), does it work correctly?


I meant the paired-end read orientation, like ff vs fr. My data is always fr, where the reads are sequenced inwards from the two ends of the fragment, like -> <-. I think this was not in my answer when Chai_AF asked, and I updated it afterwards.


Ido, what do you mean by "you come in from the right (3' end of the mRNA)", and vice versa?

Does anyone know yet whether this information is actually used for mapping specificity, or whether it is just passed on to the XS:A tag?