Tophat2 Library Type Choice
0
0
Entering edit mode
9.5 years ago
Malia ▴ 10

Hello,

I am trying to work out the kinks in my RNAseq alignment pipeline. I am looking at total RNA isolated from Arabidopsis. The ribosomal RNA is removed with the Ribo-Zero kit, and Illumina TruSeq stranded library preparation is performed before 50 bp single-ended sequencing on a Hi-Seq 2000 instrument.

My question stems from unexpected results gained from the Bowtie2/Tophat2 alignment to the TAIR10 scaffold. According to Illumina and the Tophat manual, I should use the -fr-firststrand argument. However, I ran the alignment three different ways (unstranded, fr-firststrand, fr-secondstrand) and compared the junctions.bed file, as also suggested by the Tophat manual. "The method that produces the most junctions should be the one used."

Unfortunately, the "unstranded" method proved to have the most junctions.

Both fr-firststrand and fr-secondstrand methods yielded the same number of junctions (80023, as seen below).

...
Chr5    26963303    26963527    JUNC00080020    3   -   26963303    26963527    255,0,0 2   32,19   0,205
Chr5    26964985    26965229    JUNC00080021    2   -   26964985    26965229    255,0,0 2   19,32   0,212
Chr5    26968163    26968606    JUNC00080022    14  -   26968163    26968606    255,0,0 2   32,49   0,394
Chr5    26969624    26969961    JUNC00080023    9   -   26969624    26969961    255,0,0 2   46,47   0,290

The output from fr-unstranded surprised me, as it also included chloroplast and mitochondrial junctions, unseen in the firststrand and secondstrand output. What was more unusual is that the number of nuclear junctions was higher than the other two methods (81566).

...
Chr5    26964985    26965229    JUNC00081564    2   -   26964985    26965229    255,0,0 2   19,32   0,212
Chr5    26968163    26968606    JUNC00081565    14  -   26968163    26968606    255,0,0 2   32,49   0,394
Chr5    26969624    26969961    JUNC00081566    9   -   26969624    26969961    255,0,0 2   46,47   0,290
...
ChrC    136307  137122  JUNC00081589    1   -   136307  137122  255,0,0 2   21,29   0,786
ChrC    136674  137586  JUNC00081590    39  +   136674  137586  255,0,0 2   49,49   0,863
ChrC    146566  150110  JUNC00081591    1   +   146566  150110  255,0,0 2   40,10   0,3534
ChrM    40535   41977   JUNC00081592    96  +   40535   41977   255,0,0 2   49,49   0,1393
ChrM    42931   59160   JUNC00081593    1   -   42931   59160   255,0,0 2   9,41    0,16188
ChrM    61080   61190   JUNC00081594    1   -   61080   61190   255,0,0 2   28,22   0,88
ChrM    189891  303492  JUNC00081595    2   +   189891  303492  255,0,0 2   19,31   0,113570
ChrM    288059  289051  JUNC00081596    26  +   288059  289051  255,0,0 2   49,49   0,943

So my question: should I go with fr-firststrand as recommended by Illumina?

Or use unstranded since it yields more junctions?

Thanks for your suggestions and advice!

~Malia

Tophat2 RNA-Seq library alignment • 3.5k views
ADD COMMENT
1
Entering edit mode

It's not about how many junction reads you are getting. If you know that the library is strand specific then you should specify it for correct results.

ADD REPLY
0
Entering edit mode

Oh Sorry Ashutosh Pandey , I just saw that you also had commented similar.

ADD REPLY
0
Entering edit mode

No problem Manu. It happens with me too all the time.

ADD REPLY
0
Entering edit mode

Great, thank you all for your help!

ADD REPLY
1
Entering edit mode

The method that produces the most junctions should be the one used.

I must say that this is not good advice. What they probably mean is that the right choice usually produces the most alignments but there absolutely no reason to guess. You have the alignment then inspect them with IGV or samtools - it will tell you which reads comes from which file - and that is the answer to what protocol is used.

ADD REPLY
0
Entering edit mode

As you have written above that you sequenced stranded total RNA, then you must specify it. When I was working with Chimeric transcripts from tophat-fusion then many of junctions were false positive.

ADD REPLY

Login before adding your answer.

Traffic: 2289 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6