Question: How should I begin to assemble my transcripts?
2 days ago, adampepper313 wrote:

Hi,

I am working on a project and I have two paired-end read files ending in .fq. I would like to assemble the transcripts, and as a first step I tried to use Trinity. But I don’t feel like I am on the right track, so any suggestions would be very much appreciated!

I am running my project on the command line.

Tags: sequencing, rna-seq, assembly

If you are working on a cluster, it would be best to alert the sysadmins that the Trinity installation is missing a required program. They should be able to fix this easily.

While the conda option mentioned by @Michael will generally work, it may fail if your home directory is not available on cluster nodes.
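Before contacting the admins, it may help to confirm exactly which programs are missing on the node where the job runs. The following is only a minimal sketch: `check_tools` is a hypothetical helper name, and the tool list in the example (Trinity and salmon, taken from the error message in this thread) is an assumption that may need extending for your installation.

```shell
#!/bin/sh
# Report which required tools are missing from PATH.
# The tool names passed in are assumptions; adapt them to your setup.

# Print each missing tool and return non-zero if any tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_tools Trinity salmon
```

Running the same check on a login node and inside a cluster job can also show whether a conda installation in your home directory is visible from the compute nodes.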

Reply written 2 days ago by genomax

Hi @genomax,

As I mentioned previously, the reads in my FASTQ files are paired-end reads from an animal species whose transcriptome has not been sequenced before, with a read length of 80 and an insert size of 300.

Would these two files require any prior modification before running them through Trinity?

Reply written 1 day ago by adampepper313
2 days ago, Michael Dondrup (Bergen, Norway) wrote:

I think you are on the right track. Running long, multi-parameter commands directly from the command line can be cumbersome, and after a while you no longer know which command actually succeeded. A hint to make your life easier: write a simple shell script wrapper for the Trinity call. That way, you also document the parameters used. Before assembling, you should do some quality control to check whether you need to trim your reads. I should stress that QC is the point to start from, not running the assembly outright. Run FastQC on your files first to see whether the quality is high and whether there is adapter contamination.
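As a rough illustration of the QC step, the sketch below wraps a FastQC call in a small function. The function name `run_fastqc` and the directory name `qc_reports` are made up for this example, and it assumes that `fastqc` is on your PATH.

```shell
#!/bin/sh
set -eu

# Run FastQC on one or more FASTQ files, writing reports to OUTDIR.
# usage: run_fastqc OUTDIR FILE...
run_fastqc() {
    outdir=$1
    shift
    if ! command -v fastqc >/dev/null 2>&1; then
        echo "fastqc not found in PATH" >&2
        return 1
    fi
    mkdir -p "$outdir"   # FastQC does not create the output directory itself
    fastqc --outdir "$outdir" "$@"
}

# Example: run_fastqc qc_reports file1.fq file2.fq
```

FastQC writes one HTML report per input file; the per-base quality and adapter content sections are the ones to look at when deciding whether trimming is needed.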

Here is a simple script to start running Trinity from, without trimming:

#!/bin/sh
set -eu

TRINITYCMD=Trinity # adapt this if Trinity is not in your PATH
# You need to adapt --max_memory and --CPU to your server.
# $1 and $2 are the left and right read files; quoting protects paths containing spaces.
"$TRINITYCMD" --seqType fq --max_memory 500G --left "$1" --right "$2" --CPU 60 --output Trinity-1
# Change the output directory each time you change something fundamental, like adding the trimmomatic option.

Save the script as run_trinity.sh, make it executable (chmod +x run_trinity.sh) and run it as:

nohup ./run_trinity.sh file1.fq file2.fq &

That way, the script will not be interrupted when you log out. The output, including any error messages, is written to nohup.out.
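The two ideas above, documenting the parameters used and changing the output directory for each run, could be combined in a wrapper along these lines. This is only a sketch: `next_outdir`, `log_and_run` and the log file name `trinity_runs.log` are hypothetical names, not part of Trinity.

```shell
#!/bin/sh
set -eu

# Print the first name of the form PREFIX-N (N = 1, 2, ...) that does not exist yet.
next_outdir() {
    prefix=$1
    n=1
    while [ -e "$prefix-$n" ]; do
        n=$((n + 1))
    done
    echo "$prefix-$n"
}

# Append a timestamped record of the command line to LOGFILE, then run the command.
log_and_run() {
    logfile=$1
    shift
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$logfile"
    "$@"
}

# Example (inside run_trinity.sh):
#   OUT=$(next_outdir Trinity)
#   log_and_run trinity_runs.log Trinity --seqType fq --max_memory 500G \
#       --left "$1" --right "$2" --CPU 60 --output "$OUT"
```

With this, each run gets a fresh Trinity-N directory and the log keeps a record of which parameters produced it.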


Hi @MichaelDondrup,

Thanks so much for your help! I did try running Trinity, but the command keeps terminating.

I’m receiving this message:

** NOTE: Latest version of Trinity is v2.11.0, and can be obtained at: website
which: no salmon in (/usr/local/trinityrnaseq-v2.11.0)

And then it just says: Trinity Trinity-v2.11.0 requires salmon to be installed.

But I’ve used Trinity before on other .fq files and it worked perfectly.

I’m just curious whether I need to use STAR and/or Cuffdiff beforehand, or whether I need to truncate my files?

Reply written 2 days ago (modified 1 day ago) by adampepper313

Could you install via conda/bioconda instead? That might be easier and would pull in the dependencies for you.

Reply written 2 days ago by Michael Dondrup

Although that would have been a much easier alternative, I must carry everything out on the command line.

Reply written 1 day ago by adampepper313

You can still do that: conda is itself a command-line tool. After installing the software via conda/bioconda, you run it from the command line as before; see https://conda.io/projects/conda/en/latest/ and https://bioconda.github.io/ for what it can do.

Caveat: In principle, I agree with genomax that a competent sysadmin should install software, and that is the best way for you to get it installed. However, the habit of having sysadmins install required packages has seemingly gone out of fashion, and everyone now maintains their own mess in their home directory.
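For the conda route, a command-line installation might look roughly like the sketch below. It assumes conda is already installed and that the bioconda packages `trinity` and `salmon` suit your needs; the environment name `trinity-env` and the `run`/`setup_env` helpers are made up for this example. By default the script only prints the command (dry run), so you can inspect it first.

```shell
#!/bin/sh
set -eu

ENV_NAME=trinity-env   # hypothetical environment name

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Create a separate environment containing Trinity and salmon from bioconda.
setup_env() {
    run conda create --yes --name "$ENV_NAME" \
        --channel conda-forge --channel bioconda trinity salmon
}

setup_env
```

Running the script with DRY_RUN=0 set in the environment performs the installation; afterwards, `conda activate trinity-env` should put both Trinity and salmon on your PATH.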

Reply written 4 hours ago by Michael Dondrup

Hi @MichaelDondrup,

I tried to run Trinity, but I am constantly receiving an update message:

** NOTE: Latest version of Trinity is v2.11.0, and can be obtained at: https://github.com/trinityrnaseq/trinityrnaseq/releases

Followed by,

Trinity Trinity-v2.11.0 requires salmon to be installed.  Get it here: https://combine-lab.github.io/salmon/  at /usr/local/bin/Trinity line 3973.

I am just unsure how to download salmon and make it available on the command line.

Reply written 1 day ago (modified 1 day ago) by adampepper313