Question: number of threads for cufflinks transcript assembly
21 months ago by
IRAN
hassan giahi0 wrote:

Hello everyone, I'm using Cufflinks to build a transcript.gtf file on an Ubuntu server. I used this command to build it:

sudo ./cufflinks --max-bundle-length 10000000 -o /mnt/cuffliks_denovo/SRR500880/ -p 16 -g /mnt/Homo_sapiens.GRCh37.87.gtf -u /mnt/Maping_output/SRR500880/SRR500880Aligned.sortedByCoord.out.bam

My problem is that whether I use -p 16, -p 160, -p 2000, or omit the option entirely, the analysis time does not change. In other words, it does not matter whether the argument is used or not, and the run takes a long time (about 4-8 days). What is the problem? Where is my mistake?

My instance details are:

  • The BAM file is 5 GB.
  • I am using an instance with 64 GB RAM, a 1 TB hard disk and 16 CPUs.

Thank you for advising me
Giahi

rna-seq assembly • 1.1k views
ADD COMMENTlink modified 21 months ago by colindaven1.2k • written 21 months ago by hassan giahi0

You should not run cufflinks as sudo.

ADD REPLYlink written 21 months ago by h.mon24k

But how can that be related to slowness? Any further explanation would be highly appreciated.

ADD REPLYlink written 21 months ago by Vijay Lakhujani4.0k

No relation to speed at all, but it may save you from a lot of trouble, up to the reinstalling-the-operating-system kind of trouble.

ADD REPLYlink written 21 months ago by h.mon24k

Also, you have --max-bundle-length 10000000. What is the rationale for that? It may cause the slowdown as well.

And indeed, don't run bioinformatics code as sudo.

ADD REPLYlink written 21 months ago by Istvan Albert ♦♦ 80k
21 months ago by
Belgium
WouterDeCoster38k wrote:

The manual doesn't mention a -p argument.

You should know that the old 'Tuxedo' pipeline of TopHat and Cufflinks is no longer the "advisable" tool for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment using kallisto or salmon.

ADD COMMENTlink modified 21 months ago • written 21 months ago by WouterDeCoster38k
21 months ago by
colindaven1.2k
Hannover Medical School
colindaven1.2k wrote:

Cufflinks does have a -p argument (long form: --num-threads):

cufflinks v2.2.1
linked against Boost version 104700
Usage:   cufflinks [options] <hits.sam>
General Options:
-o/--output-dir              write all output files to this directory              [ default:     ./ ]
-p/--num-threads             number of threads used during analysis                [ default:      1 ]

If you have a 16-core server, do not set the number of threads above 16. Values such as 160 or 2000 are then clearly detrimental.
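As a minimal sketch of that advice, the requested thread count can be clamped to the number of cores before it is passed to -p. The helper name cap_threads is hypothetical (it is not part of Cufflinks), and the sketch assumes a Linux system where nproc is available.

```shell
# cap_threads REQUESTED [CORES]
# Hypothetical helper: prints REQUESTED, or CORES if REQUESTED is larger.
# CORES defaults to the value reported by nproc (assumed to be installed).
cap_threads() {
    local requested="$1"
    local cores="${2:-$(nproc)}"
    if [ "$requested" -gt "$cores" ]; then
        echo "$cores"
    else
        echo "$requested"
    fi
}
```

It could then be used as `cufflinks -p "$(cap_threads 2000)" ...`, so that an oversized request falls back to the machine's core count.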

I have never used Cufflinks with more than 4-8 threads, but you can run many samples in parallel. It is generally quite a quick program, but I once had issues with some very deeply sequenced samples (>200 million reads).
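To make the trade-off between threads per run and parallel samples concrete, here is a small sketch of the arithmetic. The helper name concurrent_jobs is hypothetical, and it only considers cores; it assumes memory and disk I/O are not the limiting factor, which may not hold for large BAM files.

```shell
# concurrent_jobs CORES THREADS_PER_JOB
# Hypothetical helper: how many runs fit on the machine at once if each
# run is given THREADS_PER_JOB threads. Always reports at least 1.
concurrent_jobs() {
    local cores="$1"
    local threads_per_job="$2"
    local n=$(( cores / threads_per_job ))
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}
```

For a 16-core server and -p 4, `concurrent_jobs 16 4` suggests four samples at a time.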

If you really need quick results, you could split the BAM by chromosome, run Cufflinks separately on each per-chromosome BAM, then recombine the GTFs afterwards.
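A rough sketch of the split-by-chromosome idea follows. It only prints the commands rather than running them, so they can be checked first. Several things here are assumptions: samtools is installed, the BAM is coordinate-sorted and indexed (samtools needs an index for region queries), the chromosome names are supplied by hand, each job gets -p 1, and the helper name per_chrom_commands is hypothetical. Options from the original command (such as -g) would have to be added, and how the per-chromosome GTFs are best recombined is left open, since transcript IDs from separate runs may clash.

```shell
# per_chrom_commands BAM OUTDIR CHROM...
# Hypothetical helper: prints one independent "extract, then assemble"
# command line per chromosome. Nothing is executed here.
per_chrom_commands() {
    local bam="$1" outdir="$2"
    shift 2
    local chrom
    for chrom in "$@"; do
        echo "samtools view -b -o $outdir/$chrom.bam $bam $chrom && cufflinks -p 1 -o $outdir/$chrom $outdir/$chrom.bam"
    done
}
```

Because each printed line is self-contained, the lines can be run concurrently once the output directory exists, for example by piping them to `xargs -P 4 -I{} sh -c '{}'`, which is one possible way to run up to four at a time.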

ADD COMMENTlink modified 21 months ago • written 21 months ago by colindaven1.2k