Question

Differentiating between stranded and unstranded RNA-seq data by mapping to genome

7.7 years ago

amy.bashir ▴ 110

Hello everyone!

I have some paired-end RNA-seq data from a collaborator. I was told that the libraries are "stranded", but not whether they are RF or FR.

I pulled out 5 pairs of reads (read 1 and read 2) from each of the four RNA-seq libraries and aligned them to the reference genome of the organism. I see that 50% of the read 1s map to the "+" strand and the other 50% to the "-" strand. The same holds for the read 2s.

Does this mean that the data is actually unstranded?

Any help would be greatly appreciated!

RNA-Seq • 3.8k views

score 1 · Answer 1 · 2017-10-24

7.7 years ago

Friederike 9.0k

I recommend running QoRTs on your data. It will tell you a lot about the quality of the data, including which strandedness protocol was used or whether the data come from an unstranded library prep.

Is it not possible to tell from the information that I provided?

7.7 years ago by amy.bashir ▴ 110

No, I cannot tell from the information you provide above. And honestly, I wouldn't want to, not based on 5 read pairs. There are established tools for this that base their verdict on thousands of reads. If you don't want to use QoRTs, maybe RSeQC tickles your fancy (see this post).

7.7 years ago by Friederike 9.0k
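
To illustrate why the 50/50 split by itself is not informative: genes lie on both genomic strands, so read 1 can land on "+" half the time in any library. What matters is the strand of read 1 relative to the strand of the gene it overlaps, which is the comparison behind RSeQC-style patterns such as "1+-". Below is a minimal sketch of that comparison in Python. The function name `strand_fractions` and the toy data are hypothetical, made up for illustration, and the sketch assumes you already have, for each pair, the mapped strand of read 1 and the annotated strand of the overlapping gene.

```python
from collections import Counter


def strand_fractions(pairs):
    """Return the fraction of pairs whose read 1 is on the same or the
    opposite strand as the overlapping gene.

    pairs: iterable of (read1_strand, gene_strand), each '+' or '-'.
    """
    counts = Counter(
        "same" if read1 == gene else "opposite" for read1, gene in pairs
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no read pairs given")
    return {key: counts[key] / total for key in ("same", "opposite")}


# Hypothetical toy data: read 1 maps to '+' for half of the pairs and to '-'
# for the other half, yet it is always opposite to the gene strand.
toy_pairs = [("+", "-"), ("-", "+")] * 500

read1_plus = sum(1 for read1, _ in toy_pairs if read1 == "+") / len(toy_pairs)
fractions = strand_fractions(toy_pairs)

print(read1_plus)  # 0.5
print(fractions)   # {'same': 0.0, 'opposite': 1.0}
```

In this toy case the genome-level split is exactly 50/50, but every read 1 is antisense to its gene, which is a stranded pattern. With only 5 pairs per library, fractions like these would be far too noisy to trust.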

Great, thank you very much for the clarification. I did try RSeQC, and got the output:

Fraction of reads explained by "1+-,1-+,2++,2--": 0.9803

From what I see online, does this correspond to an RF library?

7.7 years ago by amy.bashir ▴ 110
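
For reference, the "1+-,1-+,2++,2--" pattern (read 1 on the strand opposite to the gene, read 2 on the same strand) is the one usually labelled RF, while "1++,1--,2+-,2-+" is the FR counterpart. Here is a rough sketch of how such a report line could be turned into a label. The helper names and the 0.8 cutoff are assumptions chosen for illustration, not something RSeQC itself defines; a roughly even split between the two patterns would suggest an unstranded library.

```python
import re

# Pattern strings as they appear in the RSeQC report.
FR_PATTERN = "1++,1--,2+-,2-+"
RF_PATTERN = "1+-,1-+,2++,2--"

LINE_RE = re.compile(r'Fraction of reads explained by "([^"]+)":\s*([0-9.]+)')


def parse_fractions(report_text):
    """Map each reported pattern string to its fraction."""
    return {m.group(1): float(m.group(2)) for m in LINE_RE.finditer(report_text)}


def classify(fractions, threshold=0.8):
    """Label the library; the threshold is an arbitrary illustrative cutoff."""
    rf = fractions.get(RF_PATTERN, 0.0)
    fr = fractions.get(FR_PATTERN, 0.0)
    if rf >= threshold:
        return "RF"
    if fr >= threshold:
        return "FR"
    return "unstranded or ambiguous"


# The line quoted in the post above.
report = 'Fraction of reads explained by "1+-,1-+,2++,2--": 0.9803'
label = classify(parse_fractions(report))
print(label)  # RF
```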