Question: Read pair orientation : Illumina TruSeq Stranded mRNA library
erwan.scaon (Nantes - France) wrote, 22 months ago:

Given strand-specific 2x150 paired reads generated with Illumina TruSeq Stranded mRNA, I need to choose the right value for the featureCounts "strandSpecific" option (1 = stranded vs 2 = reversely stranded).

Below is a compilation of helpful posts on the subject. They made me pick strandSpecific = 2.
I am under the impression that everyone uses slightly different terms to describe the same thing, so I am still a bit confused and would be glad to hear some feedback.

### Posts recap ###

Post 1 (2015/03/20) :

some protocols, like Illumina Truseq Stranded mRNA library prep kit, first sequence the blue end (which is the right-most read, which is from the first strand, which is complementary to mRNA), then sequence the red end (which is the left-most read, which is from the second strand, which is the same as mRNA). Therefore, you need to use “fr-firststrand” in the Tuxedo suite, and use “-s reverse” option if you use htseq-count.

Which I understand as:

  --R2-->
5'--------second_strand--------3'
3'---------first_strand--------5'
                        <--R1--

Post 2 (~2015/09) :

OP : in library prep (TruSeq stranded in my case)

Answer : Due to the vagaries of the library prep the reads in file 2 will correspond to the sense direction of the transcript (will indicate the 5' ends of the fragments that came from the transcript).

Post 3 (~2015/10) :

OP : I've got Illumina sequencing reads that generated by the TruSeq Stranded mRNA Sample Prep Kit.

Answer : For TruSeq kits (well all dUTP kits), read #2 dictates the strand.

Post 4 (~2016/12) :

If you're using Illumina's TruSeq stranded protocol, the first read of each pair should contain the sequence of the template strand, i.e., it is in the opposite direction of the coding sequence. Thus, you should set strandSpecific=2 in featureCounts.

Post 5 (~2017/02) :

OP : Featurecounts of Rsubread package has two options for stranded libraries: 1 (stranded) and 2 (reversely stranded)

Answer : fr-firststrand = reversely stranded (the R2 read is in the same direction as the RNA fragment)

### Interpretation ###

Post 1 defines Illumina TruSeq Stranded mRNA as fr-firststrand, and post 5 states that fr-firststrand = reversely stranded.
=> featureCounts strandSpecific = 2.
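
To keep the vocabularies side by side, the equivalences can be written as a small lookup. This is only a sketch: the fr-firststrand row comes from posts 1 and 5 above, while the fr-secondstrand and fr-unstranded rows are filled in from the usual htseq-count and featureCounts option values rather than from the quoted posts. The names `STRAND_SETTINGS` and `settings_for` are made up for illustration.

```python
# Equivalent strandedness settings across tools.
# fr-firststrand row: taken from the posts quoted above.
# Other rows: assumed from the standard htseq-count / featureCounts options.
STRAND_SETTINGS = {
    "fr-firststrand": {"htseq_count_s": "reverse", "featureCounts_strandSpecific": 2},
    "fr-secondstrand": {"htseq_count_s": "yes", "featureCounts_strandSpecific": 1},
    "fr-unstranded": {"htseq_count_s": "no", "featureCounts_strandSpecific": 0},
}


def settings_for(library_type):
    """Return the per-tool settings for a Tuxedo-style library type name."""
    try:
        return STRAND_SETTINGS[library_type]
    except KeyError:
        known = ", ".join(sorted(STRAND_SETTINGS))
        raise ValueError(f"unknown library type {library_type!r}; expected one of: {known}")


if __name__ == "__main__":
    # TruSeq Stranded mRNA is described as fr-firststrand in post 1.
    print(settings_for("fr-firststrand"))
```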

Post 2 : R2 is in the sense direction of the transcript.
Post 3 : R2 dictates the strand.
Post 4 : R1 is in the opposite direction of the coding sequence, so R2 is in the same direction.
=> R2 corresponds to the second strand synthesized, which corresponds to the original RNA fragment. This again points to featureCounts strandSpecific = 2.
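
The "R2 dictates the strand" rule can be checked against the FLAG field of aligned reads. The sketch below assumes a dUTP / fr-firststrand library and standard SAM flag bits (0x10 = read aligned to the reverse strand, 0x40 = first in pair, 0x80 = second in pair); `transcript_strand` is a hypothetical helper, not part of any tool mentioned here.

```python
# Standard SAM flag bits.
FLAG_REVERSE = 0x10  # read is aligned to the reverse strand
FLAG_READ1 = 0x40    # first read in the pair (R1)
FLAG_READ2 = 0x80    # second read in the pair (R2)


def transcript_strand(flag):
    """Infer the strand of the originating transcript from a SAM flag.

    Assumes a dUTP (fr-firststrand) library: R2 has the same orientation
    as the RNA, R1 is its reverse complement.
    """
    is_reverse = bool(flag & FLAG_REVERSE)
    if flag & FLAG_READ1:
        # R1 is antisense, so its alignment strand is flipped.
        return "+" if is_reverse else "-"
    if flag & FLAG_READ2:
        # R2 is sense, so its alignment strand is the transcript strand.
        return "-" if is_reverse else "+"
    raise ValueError("flag has neither the read1 nor the read2 bit set")


if __name__ == "__main__":
    # Common flags for properly paired reads: 99/147 form one pair, 83/163 the other.
    for flag in (99, 147, 83, 163):
        print(flag, transcript_strand(flag))
```

Both mates of a proper pair give the same answer, which is what one would expect if R1 and R2 really describe the same fragment.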

Edit (2017/11/26) : Additional info found in "Mapping RNA-seq Reads with STAR" (Curr Protoc Bioinformatics - 2016) :

For the protocols in which the 1st read is on the opposite strand to the RNA molecule (such as Illumina stranded Tru-Seq), .str1. corresponds to the (−) strand and .str2. corresponds to the (+) strand.

igor (United States) wrote, 22 months ago:

You are right. Every tool and kit has a different definition of strandedness and it's difficult to remember them. Luckily, there are only two options. Check this comparison table to help make some sense out of it: https://github.com/igordot/genomics/blob/master/notes/rna-seq-strand.md

Since I deal with a lot of kits and that information may not be easily accessible, I process RNA-seq data using both strand options every time. Based on the results, it should be obvious which is the correct strand (usually over 90% of the reads are on one strand).
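
As a rough sketch of that check: run the counting step once with each strand option, take the number of assigned reads from each run, and see which one dominates. The function name, its arguments, and the fallback to 0 (unstranded) are assumptions for illustration; the 0.9 default mirrors the "over 90%" rule of thumb above.

```python
def pick_strand_setting(assigned_s1, assigned_s2, min_fraction=0.9):
    """Choose a featureCounts strandSpecific value from two trial runs.

    assigned_s1: reads assigned to features when run with strandSpecific=1
    assigned_s2: reads assigned to features when run with strandSpecific=2
    Returns 1 or 2 if one run holds at least min_fraction of the assigned
    reads, otherwise 0 (assumption: treat the library as unstranded).
    """
    total = assigned_s1 + assigned_s2
    if total == 0:
        raise ValueError("no reads were assigned in either run")
    fraction_s2 = assigned_s2 / total
    if fraction_s2 >= min_fraction:
        return 2
    if (1 - fraction_s2) >= min_fraction:
        return 1
    return 0


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(pick_strand_setting(assigned_s1=50_000, assigned_s2=950_000))
```

A result of 0 here does not prove the library is unstranded; it only means neither option clearly wins at the chosen threshold, which is worth a closer look.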
