Question: de novo RNA seq, combining left and right reads to create assembly
3.4 years ago by
Providence College, Providence, RI
nikelle.petrillo100 wrote:

Hello all,

I am performing de novo transcriptome assembly on my paired-end Illumina reads. I am going to create the assembly using Trinity, but how do I go about concatenating my left and right reads? Do I need to merge each sample's left and right reads first and then somehow combine all of my samples? I am not familiar with the Unix commands that would do this.

Thanks for the help! Nikelle

ADD COMMENTlink modified 3.4 years ago by Damian Kao15k • written 3.4 years ago by nikelle.petrillo100

Why are you looking to concatenate/merge your R1/R2 reads? There is no point in concatenating them, and they can't be merged unless you know that the insert size will allow it (i.e. the combined length of the two reads is greater than the insert size, so the reads overlap).
As the Trinity page recommends, this is all you need to do:

Trinity --seqType fq --left reads_1.fq --right reads_2.fq --CPU 6 --max_memory 20G
ADD REPLYlink written 3.4 years ago by genomax71k

Thank you,

So, as Damian Kao said below, should I be inputting all of my left reads into that Trinity command, and then all of my right reads as well?

ADD REPLYlink written 3.4 years ago by nikelle.petrillo100

As @Damian pointed out in the example below, make sure the files are listed in the same order for --left and --right. It would also help if the reads within each pair of files are in the same order (if you did any trimming, then hopefully you used a PE-aware trimmer).
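
A rough sketch of how this could be checked before starting a run. The helper names below are made up for illustration and are not part of Trinity, and the code assumes uncompressed FASTQ files with exactly four lines per read. Equal counts do not prove the reads are in the same order, but unequal counts are a clear sign that something is off.

```shell
# Sketch: sanity-check paired FASTQ inputs before handing them to Trinity.
# Assumption: uncompressed FASTQ with exactly four lines per record.

# Print the number of reads in an uncompressed FASTQ file.
count_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Succeed only if two comma-separated lists have the same number of entries.
same_list_length() {
    left_n=$(printf '%s\n' "$1" | tr ',' '\n' | wc -l)
    right_n=$(printf '%s\n' "$2" | tr ',' '\n' | wc -l)
    [ "$left_n" -eq "$right_n" ]
}

# Succeed only if a left/right pair of files holds the same number of reads.
same_read_count() {
    [ "$(count_reads "$1")" -eq "$(count_reads "$2")" ]
}
```

For example, `same_list_length "$left" "$right" || echo "lists differ in length"` would flag a --right list that is shorter than the --left list.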

ADD REPLYlink modified 3.4 years ago • written 3.4 years ago by genomax71k

By the way, Trinity will only use PE reads as extra information for bundling reads during the Chrysalis stage (unless something has changed in the newer versions that I am not aware of). It doesn't attempt to do any scaffolding with them.

ADD REPLYlink written 3.4 years ago by Damian Kao15k

Thanks! And in the code below, is it correct that there are no spaces before or after each comma?

Nikelle

ADD REPLYlink written 3.4 years ago by nikelle.petrillo100
3.4 years ago by
Damian Kao15k
USA
Damian Kao15k wrote:

You can give Trinity a list of files separated by commas. For example:

Trinity --left left_1.fq,left_2.fq,left_3.fq --right right_1.fq,right_2.fq,right_3.fq --CPU 6 --max_memory 20G....
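
The commas must not be surrounded by spaces, because the shell splits arguments on spaces and would pass only the first piece to --left or --right. With many samples, it may be easier to build each list in the shell than to type it by hand. A minimal sketch, where `join_by_comma` and the sample file names are hypothetical and file names are assumed not to contain newlines:

```shell
# Sketch: join file names with commas and no surrounding spaces.
# join_by_comma is an illustrative helper, not a Trinity utility.

# Join all arguments into a single comma-separated string.
join_by_comma() {
    printf '%s\n' "$@" | paste -sd, -
}

# Example (hypothetical file names):
#   left=$(join_by_comma sampleA_1.fq sampleB_1.fq)
#   right=$(join_by_comma sampleA_2.fq sampleB_2.fq)
#   Trinity --seqType fq --left "$left" --right "$right" --CPU 6 --max_memory 20G
```

Listing the samples in the same order for both calls keeps the left and right lists aligned.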
ADD COMMENTlink written 3.4 years ago by Damian Kao15k

Thank you! And so this will generate one assembly using all of those left and right reads?

ADD REPLYlink written 3.4 years ago by nikelle.petrillo100

Yes, that command will allow you to use all the FASTQ files listed.

ADD REPLYlink written 3.4 years ago by Damian Kao15k

Great, thanks very much.

ADD REPLYlink written 3.4 years ago by nikelle.petrillo100

Hi Damian,

So I ran my trinity code in screen so that it could run in the background.

Trinity version: v2.0.6
-ERROR: couldn't run the network check to confirm latest Trinity software version.

Wednesday, April 27, 2016: 15:04:21     CMD: java -Xmx64m -jar /usr/local/bin/trinityrnaseq-2.0.6/util/support_scripts/ExitTester.jar 0
Wednesday, April 27, 2016: 15:04:21     CMD: java -Xmx64m -jar /usr/local/bin/trinityrnaseq-2.0.6/util/support_scripts/ExitTester.jar 1
Wednesday, April 27, 2016: 15:04:21     CMD: mkdir -p /home/richardsonlab/AMMA_transcripts/allsampletrinitysassembly
Wednesday, April 27, 2016: 15:04:21     CMD: mkdir -p /home/richardsonlab/AMMA_transcripts/allsampletrinitysassembly/chrysalis

-------------- Trinity Phase 1: Clustering of RNA-Seq Reads ---------------------

Converting input files. (in parallel)Wednesday, April 27, 2016: 15:04:21        CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/npetrill/Trimmomatic-0-3.35/pairedoutput1 >> left.fa 2> /home/npetrill/Trimmomatic-0-3.35/pairedoutput1.readcount
Wednesday, April 27, 2016: 15:04:21     CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastooNIKELLEs-MacBook-Pro:~ nikellepetrillo$ 
t.fa 2> /home/npetrill/Trimmomatic-0-3.35/R1V29pairedoutput30_1.readcount
Thread 2 terminated abnormally: Error, cmd: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastoolNIKELLEs-MacBook-Pro:~ nikellepetrillo$ 
/home/npetrill/Trimmomatic-0-3.35/pairedoutput2.readcount  died with ret 256 at /usr/local/bin/trinityNIKELLEs-MacBook-Pro:~ nikellepetrillo$ 
packet_write_wait: Connection to 172.16.45.3: Broken pipe

It looks like I'm getting back some errors... Do you know what to make of this? Is Trinity still running, and how do I check this?

ADD REPLYlink modified 3.4 years ago by genomax71k • written 3.4 years ago by nikelle.petrillo100

I think one of your input files is possibly named wrong. Does this file exist?

/home/npetrill/Trimmomatic-0-3.35/pairedoutput1
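
With this many inputs, checking each path by eye is error-prone. One way to test every entry of a comma-separated list in one go is sketched below; `report_missing` is a hypothetical helper, and it assumes the paths contain no commas or shell wildcard characters.

```shell
# Sketch: report entries of a comma-separated list that are not readable files.
# Assumption: paths contain no commas or glob characters (*, ?, [).

# Print every missing or unreadable entry; return non-zero if any was found.
report_missing() {
    status=0
    old_ifs=$IFS
    IFS=,                      # split the list on commas only
    for f in $1; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f"
            status=1
        fi
    done
    IFS=$old_ifs               # restore normal word splitting
    return "$status"
}
```

Calling it once with the --left list and once with the --right list should print nothing if every file is in place.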

The Trinity Google groups might know more about these errors: https://groups.google.com/forum/#!forum/trinityrnaseq-users

ADD REPLYlink written 3.4 years ago by Damian Kao15k

Thanks for your quick reply. Yes it does exist. Maybe I will try that group!

ADD REPLYlink written 3.4 years ago by nikelle.petrillo100

Can you post the command you actually used? Perhaps there is something wrong there.

ADD REPLYlink written 3.4 years ago by genomax71k

It's very long! But here it is:

--seqType fq --left /home/npetrill/Trimmomatic-0-3.35/pairedoutput1,/home/npetrill/Trimmomatic-0-3.35/R1V29pairedoutput30_1,/home/npetrill/Trimmomatic-0-3.35/R1V34pairedoutput30_1,/home/npetrill/Trimmomatic-0-3.35/R1V39pairedoutput30_1,/home/npetrill/Trimmomatic-0-3.35/R1V42pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR1V49/R1V49pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR1V60/R1V60pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR1V6/R1V6pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V13/R2V13pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V18/R2V18pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V1/R2V1pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V33/R2V33pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V46/R2V46pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V57/R2V57pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V59/R2V59pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V5/R2V5pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V62/R2V62pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR3V14/R3V14pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR3V16/R3V16pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR3V20/R3V20pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR3V26/R3V26pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR3V31/R3V31pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR3V54/R3V54pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR4V14/R4V14pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR4V23/R4V23pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR4V25/R4V25pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR4V54/R4V54pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR4V58/R4V58pairedoutput30_1,/home/richardsonlab/AMMA_transcript
s/30trimmoR4V6/R4V6pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR4V9/R4V9pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR5V19/R5V19pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR5V28/R5V28pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR5V33/R5V33pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR5V4/R5V4pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR5V54/R5V54pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR5V8/R5V8pairedoutput30_1 --right /home/npetrill/Trimmomatic-0-3.35/pairedoutput2,/home/npetrill/Trimmomatic-0-3.35/R1V29pairedoutput30_2,/home/npetrill/Trimmomatic-0-3.35/R1V34pairedoutput30_2,/home/npetrill/Trimmomatic-0-3.35/R1V39pairedoutput30_2,/home/npetrill/Trimmomatic-0-3.35/R1V42pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR1V49/R1V49pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR1V60/R1V60pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR1V6/R1V6pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V13/R2V13pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V18/R2V18pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V1/R2V1pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V33/R2V33pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V46/R2V46pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V57/R2V57pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V59/R2V59pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V5/R2V5pairedoutput30_2 --CPU 6 --max_memory 60G --output /home/richardsonlab/AMMA_transcripts/allsampletrinitysassembly
ADD REPLYlink modified 3.4 years ago by genomax71k • written 3.4 years ago by nikelle.petrillo100

The first step of Trinity is to convert and concatenate all your left/right input files into one large FASTA file using fastool. It is possible that Trinity first needs to know whether your input files are gzip-compressed so that it can use the proper command to read them. You might need to indicate this through the file extension, e.g. .fastq.gz for compressed files.
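
Since the files in your command have no extension, it may be worth checking their contents rather than their names. A small sketch, assuming `head`, `od` and `tr` are available; `is_gzipped` is a hypothetical helper that looks for the two gzip magic bytes (0x1f 0x8b) at the start of a file:

```shell
# Sketch: detect gzip compression from file contents instead of the extension.
# is_gzipped is an illustrative helper, not a Trinity utility.

# Succeed if the file starts with the gzip magic bytes 0x1f 0x8b.
is_gzipped() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Example (path taken from the command posted above):
#   is_gzipped /home/npetrill/Trimmomatic-0-3.35/pairedoutput1 && echo "compressed"
```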

ADD REPLYlink written 3.4 years ago by Damian Kao15k

You also seem to have omitted some parts of the error message that you had pasted in the post above. Perhaps a vital clue is missing in that text. Can you edit that post and re-paste the error output?

ADD REPLYlink written 3.4 years ago by genomax71k

Unfortunately, the error message exceeds the character limit.

I ended up terminating the run and started a new run. I am now getting a new error message:

Thursday, April 28, 2016: 11:15:27 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR3V54/R3V54pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR3V54/R3V54pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:15:41 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V14/R4V14pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V14/R4V14pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:16:25 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V23/R4V23pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V23/R4V23pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:17:18 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V25/R4V25pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V25/R4V25pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:17:57 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V54/R4V54pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V54/R4V54pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:18:50 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V58/R4V58pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V58/R4V58pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:19:48 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V6/R4V6pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V6/R4V6pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:20:43 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V9/R4V9pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V9/R4V9pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:21:34 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR5V19/R5V19pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR5V19/R5V19pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:22:26 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR5V28/R5V28pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR5V28/R5V28pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:23:10 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR5V33/R5V33pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR5V33/R5V33pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:23:58 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR5V4/R5V4pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR5V4/R5V4pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:25:00 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR5V54/R5V54pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR5V54/R5V54pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:26:07 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR5V8/R5V8pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR5V8/R5V8pairedoutput30_1.readcount
Use of uninitialized value in array dereference at /usr/local/bin/trinityrnaseq-2.0.6/Trinity line 1212.
Trinity run failed. Must investigate error above.

Unfortunately, I can't see the top part of this error message since I cannot scroll up while in screen mode.

ADD REPLYlink written 3.4 years ago by nikelle.petrillo100

Run the trinity command like this. It will capture stdout and stderr messages (stuff that scrolls off screen) to files. Then you can show us the relevant errors from those files.

  $  trinity commands  > log_file 2> err_file
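
Spelled out a little more, the same redirection could be wrapped in a small function. This is only a sketch: `run_logged` and the log file names are hypothetical, and the Trinity flags in the comment are the ones already shown earlier in this thread.

```shell
# Sketch: run any command, sending stdout to one file and stderr to another.
# Usage: run_logged LOG_FILE ERR_FILE command [args...]
run_logged() {
    log_file=$1
    err_file=$2
    shift 2
    "$@" > "$log_file" 2> "$err_file"
}

# Example (placeholder read files and log names):
#   run_logged trinity.log trinity.err Trinity --seqType fq \
#       --left reads_1.fq --right reads_2.fq --CPU 6 --max_memory 20G
#   tail -n 50 trinity.err    # show the last 50 lines of the error output
```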
ADD REPLYlink written 3.4 years ago by genomax71k

Thanks, do you know where I can find those error files?

ADD REPLYlink written 3.4 years ago by nikelle.petrillo100

They should be in the directory from where you ran the trinity command.

ADD REPLYlink written 3.4 years ago by genomax71k