Question: number of threads for cufflinks transcript assembly
2.6 years ago by
hassan giahi0 wrote:

Hello everyone, I'm using Cufflinks to build a transcript.gtf file on an Ubuntu server. I used this command to build it:

sudo ./cufflinks --max-bundle-length 10000000 -o /mnt/cuffliks_denovo/SRR500880/ -p 16 -g /mnt/Homo_sapiens.GRCh37.87.gtf -u /mnt/Maping_output/SRR500880/SRR500880Aligned.sortedByCoord.out.bam

My problem is that whether I use -p 16, -p 160, -p 2000, or omit the option entirely, the run time does not change. In other words, it makes no difference whether that argument is used, and the analysis takes a long time (about 4-8 days). What is the problem? Where is my mistake?

My instance details are:

  • The BAM file is 5 GB.
  • I used an instance with 64 GB of RAM, a 1 TB HDD, and 16 CPUs.

Thank you for your advice.

rna-seq assembly • 1.6k views
ADD COMMENTlink modified 2.6 years ago by colindaven1.9k • written 2.6 years ago by hassan giahi0

You should not run cufflinks as sudo.

ADD REPLYlink written 2.6 years ago by h.mon29k

But how can that be related to the slowness? Any further explanation is highly appreciated.

ADD REPLYlink written 2.6 years ago by lakhujanivijay4.7k

It has no relation to speed at all, but it may save you from a lot of trouble, up to and including the "reinstall the operating system" kind of trouble.

ADD REPLYlink written 2.6 years ago by h.mon29k

Also, you have --max-bundle-length 10000000. What is the rationale for that? It may be contributing to the slowdown as well.

And indeed, don't run bioinformatics code as sudo...

ADD REPLYlink written 2.6 years ago by Istvan Albert ♦♦ 82k
2.6 years ago by
WouterDeCoster42k wrote:

The manual doesn't mention a -p argument.

You should know that the old 'Tuxedo' pipeline of TopHat and Cufflinks is no longer the recommended toolset for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR or bbmap, or pseudo-alignment using kallisto or salmon.

ADD COMMENTlink modified 2.6 years ago • written 2.6 years ago by WouterDeCoster42k
2.6 years ago by
Hannover Medical School
colindaven1.9k wrote:

Cufflinks does have a -p argument; the long form is --num-threads:

cufflinks v2.2.1
linked against Boost version 104700
Usage:   cufflinks [options] <hits.sam>
General Options:
-o/--output-dir              write all output files to this directory              [ default:     ./ ]
-p/--num-threads             number of threads used during analysis                [ default:      1 ]

If you have a 16-core server, do not set the number of threads above 16. Values such as 160 or 2000 are clearly detrimental.
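As a rough sketch of that cap: the helper below limits a requested thread count to the number of cores. The name clamp_threads is made up for illustration and is not part of Cufflinks; nproc is assumed to be available (GNU coreutils, as on Ubuntu), and the paths in the commented usage line are placeholders.

```shell
# clamp_threads REQUESTED [CORES]
# Hypothetical helper: print REQUESTED, capped at CORES.
# CORES defaults to what nproc reports (assumes GNU coreutils).
clamp_threads() (
    requested=$1
    cores=${2:-$(nproc)}
    if [ "$requested" -gt "$cores" ]; then
        echo "$cores"
    else
        echo "$requested"
    fi
)

# Usage (placeholder paths; on a 16-core machine this passes -p 16):
#   cufflinks -p "$(clamp_threads 2000)" -o out_dir -g annotation.gtf sample.bam
```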

I have never used Cufflinks with more than 4-8 threads, but you can run many samples in parallel. It is generally quite a quick program, although I once had issues with some very deeply sequenced samples (>200 million reads).
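One way to run several samples side by side is xargs with -P, sketched below. The function name run_samples and the directory layout (bam/SAMPLE.bam for input, out/SAMPLE for output) are assumptions for illustration, and other options such as -g are left out for brevity. The CUFFLINKS variable is only there so the command can be swapped for echo as a dry run.

```shell
# run_samples THREADS_PER_JOB JOBS SAMPLE...
# Hypothetical helper: run one cufflinks process per sample, JOBS at a time.
# Assumes inputs at bam/SAMPLE.bam and writes to out/SAMPLE (made-up layout).
run_samples() (
    threads=$1
    jobs=$2
    shift 2
    # One sample name per line; xargs substitutes it for {} and keeps
    # up to $jobs processes running at once.
    printf '%s\n' "$@" |
        xargs -P "$jobs" -I{} \
            "${CUFFLINKS:-cufflinks}" -p "$threads" -o "out/{}" "bam/{}.bam"
)

# Usage on a 16-core machine: 4 jobs x 4 threads each.
#   run_samples 4 4 SRR500880 SRR500881 SRR500882 SRR500883
# Dry run that only prints the arguments:
#   CUFFLINKS=echo run_samples 4 4 SRR500880 SRR500881
```

Keeping jobs times threads at or below the core count avoids oversubscribing the CPUs.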

If you really need quick results, you could split the BAM by chromosome, run Cufflinks separately on each per-chromosome BAM, and then recombine the GTF files afterwards.
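The splitting step could look roughly like this with samtools. This is a sketch, not a tested pipeline: the function names mapped_chroms and split_bam are made up, and the BAM is assumed to be coordinate-sorted and indexed (samtools index) so that region queries work.

```shell
# mapped_chroms: read `samtools idxstats` output on stdin
# (columns: name, length, mapped reads, unmapped reads) and print the
# names of reference sequences that have at least one mapped read.
mapped_chroms() {
    awk -F'\t' '$1 != "*" && $3 > 0 { print $1 }'
}

# split_bam IN.bam OUTDIR
# Hypothetical helper: write OUTDIR/<chrom>.bam for every chromosome
# with mapped reads. IN.bam must be sorted and indexed.
split_bam() (
    bam=$1
    outdir=$2
    mkdir -p "$outdir"
    samtools idxstats "$bam" | mapped_chroms | while read -r chrom; do
        samtools view -b -o "$outdir/$chrom.bam" "$bam" "$chrom"
    done
)

# Usage (placeholder file names):
#   samtools index sample.bam
#   split_bam sample.bam by_chrom
#   then run cufflinks on each by_chrom/*.bam
```

When recombining, keep in mind that transcript IDs assigned in separate runs may collide, so a plain concatenation of the GTF files probably needs a distinct ID prefix per run (Cufflinks has the -L/--label option for the prefix).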

ADD COMMENTlink modified 2.6 years ago • written 2.6 years ago by colindaven1.9k