number of threads for cufflinks transcript assembly
6.8 years ago

Hello everyone, I'm using Cufflinks to build a transcripts.gtf file on an Ubuntu server. I used this command to build it:

sudo ./cufflinks --max-bundle-length 10000000 -o /mnt/cuffliks_denovo/SRR500880/ -p 16 -g /mnt/Homo_sapiens.GRCh37.87.gtf -u /mnt/Maping_output/SRR500880/SRR500880Aligned.sortedByCoord.out.bam

My problem is this: whether I use -p 16, -p 160, -p 2000, or omit the option entirely, the analysis time does not change. In other words, it does not seem to matter whether the argument is used or not, and the run takes a long time (about 4-8 days). What is the problem? Where is my mistake?

My instance details are:

  • The BAM file is 5 GB.
  • I used an instance with 64 GB RAM, a 1 TB hard disk, and 16 CPUs.

Thank you for advising me
Giahi


You should not run cufflinks as sudo.


But how can that be related to the slowness? Any further explanation is highly appreciated.


No relation to speed at all, but it may save you from a lot of trouble, up to the "reinstalling the operating system" kind of trouble.


Also, you have --max-bundle-length 10000000. What is the rationale for that? It may cause the slowdown as well.

And indeed, don't run bioinformatics code as sudo...

6.8 years ago

The manual doesn't mention a -p argument.

You should know that the old 'Tuxedo' pipeline of TopHat and Cufflinks is no longer the "advisable" tool for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR and bbmap, or pseudo-alignment using kallisto or salmon.
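As a rough sketch of what the suggested replacement might look like for the assembly step, the snippet below only builds a StringTie command line; it does not run anything. The helper name stringtie_cmd and the output path are hypothetical, the input paths are taken from the question, and the flags used (-G for the guide annotation, -o for the output GTF, -p for threads) are the ones I know from the StringTie usage text. Whether a given BAM is suitable as-is depends on how it was aligned, so treat this as an assumption-laden starting point.

```python
def stringtie_cmd(bam, guide_gtf, out_gtf, threads=8):
    """Build (not run) a reference-guided StringTie assembly command."""
    return ["stringtie", bam, "-G", guide_gtf, "-o", out_gtf, "-p", str(threads)]


if __name__ == "__main__":
    cmd = stringtie_cmd(
        "/mnt/Maping_output/SRR500880/SRR500880Aligned.sortedByCoord.out.bam",
        "/mnt/Homo_sapiens.GRCh37.87.gtf",
        "SRR500880.stringtie.gtf",  # hypothetical output name
    )
    print(" ".join(cmd))
```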

6.7 years ago

Cufflinks does have a -p argument; the long form is --num-threads:

cufflinks v2.2.1
linked against Boost version 104700
Usage:   cufflinks [options] <hits.sam>
General Options:
-o/--output-dir              write all output files to this directory              [ default:     ./ ]
-p/--num-threads             number of threads used during analysis                [ default:      1 ]

If you have a 16-core server, do not set the number of threads above 16. Values such as 160 or 2000 are then clearly detrimental.
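To illustrate the point about not exceeding the core count, a wrapper script could cap whatever value is requested before passing it to -p. This is only a minimal sketch: capped_threads is a hypothetical helper, not part of Cufflinks, and it assumes the count reported by os.cpu_count() is the right limit for your machine.

```python
import os


def capped_threads(requested, available=None):
    """Return a value for -p that never exceeds the number of cores."""
    if available is None:
        # os.cpu_count() may return None on unusual platforms; fall back to 1.
        available = os.cpu_count() or 1
    return max(1, min(requested, available))


if __name__ == "__main__":
    # The three values tried in the question, on an assumed 16-core machine.
    for requested in (16, 160, 2000):
        print(requested, "->", capped_threads(requested, available=16))
```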

I have never used Cufflinks with more than 4-8 threads, but you can run many samples in parallel. It is generally quite a quick program, but I once had issues with some very deeply sequenced samples (>200M reads).
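Running several samples side by side with a few threads each could be sketched as below. Everything here is an assumption for illustration: the helper names, the "<sample>.bam" input naming, the cufflinks_out/ directory, and the choice of 4 threads per job on 16 cores. Each job also needs its own memory, so the number of simultaneous jobs may have to be lower than the core count alone suggests.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def cufflinks_cmd(bam, out_dir, gtf, threads):
    """Build one Cufflinks command using the -p, -g and -o options."""
    return ["cufflinks", "-p", str(threads), "-g", gtf, "-o", out_dir, bam]


def plan(samples, gtf, total_cores=16, threads_per_job=4):
    """Return (number of simultaneous jobs, list of commands)."""
    jobs_at_once = max(1, total_cores // threads_per_job)
    cmds = [
        # Hypothetical layout: "<sample>.bam" in, "cufflinks_out/<sample>" out.
        cufflinks_cmd(f"{s}.bam", f"cufflinks_out/{s}", gtf, threads_per_job)
        for s in samples
    ]
    return jobs_at_once, cmds


def run_all(samples, gtf, **kwargs):
    """Run the planned commands, at most jobs_at_once at a time."""
    jobs_at_once, cmds = plan(samples, gtf, **kwargs)
    with ThreadPoolExecutor(max_workers=jobs_at_once) as pool:
        # Threads only wait on the external processes, so a thread pool is enough.
        return list(pool.map(lambda c: subprocess.run(c, check=True).returncode, cmds))
```

Calling plan(...) first and inspecting the commands before handing them to run_all(...) is a cheap way to check the layout without starting any jobs.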

If you really need quick results, you could split the BAM by chromosome, run Cufflinks separately on each per-chromosome BAM, then recombine the GTFs afterwards.
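The split-by-chromosome idea could be sketched as follows. The snippet only builds the command pairs and does not execute them. Assumptions: the BAM is coordinate-sorted and indexed (samtools index) so that samtools view can extract a region, the chromosome names match those in the BAM header (Ensembl GRCh37 uses names like 1, 2, X), and the helper name and output layout are hypothetical.

```python
def per_chromosome_commands(bam, gtf, out_root, chromosomes, threads=1):
    """Return a list of (extract, assemble) command pairs, one per chromosome."""
    steps = []
    for chrom in chromosomes:
        chrom_bam = f"{out_root}/{chrom}.bam"
        # Extract the reads for one chromosome into its own BAM file.
        extract = ["samtools", "view", "-b", "-o", chrom_bam, bam, chrom]
        # Assemble that chromosome on its own, writing to a separate directory.
        assemble = [
            "cufflinks", "-p", str(threads), "-g", gtf,
            "-o", f"{out_root}/{chrom}", chrom_bam,
        ]
        steps.append((extract, assemble))
    return steps


if __name__ == "__main__":
    for extract, assemble in per_chromosome_commands(
        "SRR500880Aligned.sortedByCoord.out.bam",
        "Homo_sapiens.GRCh37.87.gtf",
        "split",  # hypothetical output directory
        ["1", "2", "X"],
    ):
        print(" ".join(extract))
        print(" ".join(assemble))
```

When recombining, it is worth checking that gene and transcript IDs from the separate runs do not collide, and it may make sense to restrict the reference GTF to the matching chromosome for each run; neither step is shown here.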
