Question: How to set wall time for the Trinity assembler?
seta (Sweden) wrote, 23 months ago:

Dear all,

I have about 350 million paired-end (PE) reads for a de novo transcriptome assembly with Trinity, on a server with 240 GB of RAM. I must submit the job through PBS, but I have no experience with such a large dataset or with PBS. Could you please tell me how to set the wall time for this job? Your help, and a copy of your own job script, would be highly appreciated.

rna-seq next-gen assembly • 651 views
Philipp Bayer (Australia/Perth/UWA) wrote, 23 months ago:

With PBS you put all the resource requests in the script header. Here's my header for the University of Queensland's PBS server on barrine; your local server may be set up slightly differently. For example, UQ has four possible values for node type: "any", "medium", "xl" and "large". You'd have to ask your sysadmins what your values are.

#!/bin/bash
#PBS -N copy
#PBS -l select=1:ncpus=1:mem=12G:NodeType=medium
#PBS -l walltime=23:00:00
#PBS -A your_group_id
trinity -some-parameters

Submit using "qsub scriptname.sh"

-N is the job name, which just makes it easier to keep track of your jobs. -l sets the resource requests: how much memory, how many CPUs, which node type. In your case the maximum mem is 240G; I don't know how much you actually need. The less you request, the more likely your job is to crash for lack of memory, but the more memory is left for others. Lastly, -l walltime is the important one for you. Here it's set to 23 hours; you will probably need more. Try setting it to something large like 100 hours, but again, your system administrators set the maximum allowed time, which also depends on your node type.

your_group_id is your group ID for billing purposes; how that works depends on your local server.
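To make the header above concrete for a Trinity run, here is a rough sketch of a small bash helper that prints a job script with the walltime, CPU count and memory filled in. Everything specific in it is an assumption for illustration: the job name, the read file names (reads_1.fq, reads_2.fq), the output directory, and the 100 hours / 16 CPUs / 200G values. The NodeType part is left out because it is site-specific. The --max_memory option is what recent Trinity releases use; older releases had a different memory option, so check `Trinity --help` for your version.

```shell
#!/bin/bash
# Sketch only: print a PBS job script for a Trinity run.
# Arguments: walltime in hours, number of CPUs, memory in GB.
# File names, job name and group ID are placeholders, not real values.
make_trinity_job() {
    local hours="$1" ncpus="$2" mem_gb="$3"
    cat <<EOF
#!/bin/bash
#PBS -N trinity_assembly
#PBS -l select=1:ncpus=${ncpus}:mem=${mem_gb}G
#PBS -l walltime=${hours}:00:00
#PBS -A your_group_id
cd "\$PBS_O_WORKDIR"
Trinity --seqType fq --left reads_1.fq --right reads_2.fq --CPU ${ncpus} --max_memory ${mem_gb}G --output trinity_out
EOF
}

# Example: 100 hours, 16 CPUs, 200G (leaves some headroom on a 240G node).
make_trinity_job 100 16 200
```

Redirect the output to a file and submit it, e.g. `make_trinity_job 100 16 200 > trinity_job.sh` followed by `qsub trinity_job.sh`. If your site caps walltime below what you ask for, the scheduler will normally reject the job at submission, so you find out quickly.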


Thanks for your example. Some people prefer to run Trinity in stages, as follows, to reduce the time per job:

stage 1: --no_run_chrysalis
stage 2: --no_run_quantifygraph
stage 3: --no_run_butterfly
stage 4: (no --no_... parameter at all)

Do you have any experience with this? For example, can it affect the resulting output?

Reply written 23 months ago by seta.

I am pretty sure Trinity checkpoints every step. So if you run out of wall time, all you have to do is resubmit the same PBS script, and Trinity will pick up right where it stopped in the previous run. It is too much hassle to run the stages separately (at least for me).

Reply written 23 months ago by arnstrm.
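The resubmit-after-timeout approach can be sketched as a small check that runs after a job ends. This is only an illustration under one loud assumption: that a finished run leaves a non-empty Trinity.fasta inside the output directory. The name and location of the final FASTA differ between Trinity versions, so adjust the path to what your version writes. The function names are made up for this sketch, and it only prints the qsub command rather than running it.

```shell
#!/bin/bash
# Sketch only: decide whether a Trinity job needs to be resubmitted.
# ASSUMPTION: a completed assembly leaves a non-empty Trinity.fasta in the
# output directory; the file name/location varies between Trinity versions.
assembly_finished() {
    local outdir="$1"
    [ -s "${outdir}/Trinity.fasta" ]
}

# Print "done" if the assembly exists, otherwise print the command to
# resubmit the same, unchanged job script (dry run: nothing is submitted).
resubmit_if_needed() {
    local outdir="$1" job_script="$2"
    if assembly_finished "$outdir"; then
        echo "done"
    else
        echo "qsub ${job_script}"
    fi
}

resubmit_if_needed trinity_out trinity_job.sh
```

The job script itself should stay exactly the same between submissions, with the same output directory, since that is where the intermediate results from the previous run live.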