Running Time For Cuffmerge For 250 .Gtf Files
12.0 years ago
bio monkey ▴ 40

I'm trying to calculate expression levels for a set of about 250 aligned SAM files (about 2 GB each). I plan to write a script that runs cufflinks on each of them, then another script to merge all of the resulting transcripts.gtf files with cuffmerge. How long will this take, and is it likely to crash the machine (12 cores, 16 GB memory)?

Also, I want to run CuffDiff on all 250 of these as well. Would that much data cause a crash?
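The merge step I'm planning could be sketched like this; the cufflinks_out/ layout, the list-file name, and merged_asm are all just placeholders, assuming each cufflinks run writes its transcripts.gtf into its own per-sample directory:

```shell
#!/bin/bash
# Sketch of the planned merge step (hypothetical layout: one directory per
# sample under cufflinks_out/, each containing a transcripts.gtf).

build_list() {
    # cuffmerge takes a text file listing one transcripts.gtf path per line
    ls "$1"/*/transcripts.gtf > assembly_list.txt
}

if [ -d cufflinks_out ]; then
    build_list cufflinks_out
    # -p 12: use all 12 cores for the merge; -o: merged assembly output dir
    cuffmerge -p 12 -o merged_asm assembly_list.txt
fi
```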

cufflinks cuffmerge gtf bam sam
12.0 years ago

Hi,

I would think a 12-core machine should have at least 24 GB of RAM (2 GB/core), so 16 GB looks a little odd. The best answer is to just try running the job. If you have SGE (Sun Grid Engine), you can pass the parameter -l h_vmem=15G, which tells the system to kill your job if it exceeds 15 GB of RAM. Make a shell script that runs cufflinks (and the merge) on a user-supplied input file, release 4 jobs at a time (one per file), and give each cufflinks run -p 4. That way the system is used efficiently and each file uses four cores.
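A minimal sketch of such a per-file wrapper, assuming cufflinks is on the PATH (the script name run_cufflinks.sh and the output layout are my own inventions):

```shell
#!/bin/bash
# run_cufflinks.sh -- run cufflinks on one SAM file with 4 threads.
# Sketch only: cufflinks must be on PATH; output layout is an assumption.

out_dir() {
    # one output directory per sample, named after the SAM file
    echo "cufflinks_out/$(basename "$1" .sam)"
}

if [ -n "${1:-}" ]; then
    out="$(out_dir "$1")"
    mkdir -p "$out"
    # -p 4: four threads per job, as suggested above
    cufflinks -p 4 -o "$out" "$1"
fi
```

Submitted under SGE with the memory cap mentioned above, that would look something like: qsub -l h_vmem=15G run_cufflinks.sh sample001.sam (sample001.sam is a placeholder name).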

For automation, make another script that checks the output of qstat, which reports job status. The easiest approach is an if statement on the qstat output (skipping the 2 header lines): if qstat | wc -l is less than 6, schedule another job, and repeat until the last file. For me, cufflinks took about an hour and a half per file on 8 cores, so in your case I would assume 2.5-3.5 hours per file.
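The qstat-based throttle described above could be sketched like this (assumes SGE's qstat with its two header lines; run_cufflinks.sh is a hypothetical per-file wrapper script):

```shell
#!/bin/bash
# Throttle sketch: keep at most 4 cufflinks jobs in the queue at once.
MAX_JOBS=4

job_count() {
    # qstat prints 2 header lines before the job list; count the rest
    qstat 2>/dev/null | awk 'NR > 2' | wc -l
}

for sam in *.sam; do
    [ -e "$sam" ] || continue
    # wait until a slot frees up before submitting the next file
    while [ "$(job_count)" -ge "$MAX_JOBS" ]; do
        sleep 60    # poll once a minute
    done
    qsub -l h_vmem=15G run_cufflinks.sh "$sam"
done
```

This is equivalent to the qstat | wc -l < 6 check in the text (2 header lines plus fewer than 4 jobs), just wrapped in a loop over all input files.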

Cheers

