Question: STAR aligner takes a long time to generate the genome
xcalle91 (European Union) wrote 3.9 years ago:

Hi, I'm trying to use the STAR ultrafast aligner. First I need to generate a genome to align to, so, as described in the manual, I ran this command line:

    /pathToStarDir/STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDir --genomeFastaFiles /path/to/genome/*.fa --runThreadN 11

 

GenomeDir contains one fasta file for each human chromosome.

I started it yesterday morning and after 24 hours it still has not finished.

I am wondering whether I did something wrong and it is stuck at a point it will never get past. I saw a few posts describing similar problems, but none of them came up with a solution that works for me.

I am running it on a computer with 8 cores (i7, 3.60GHz) and 31.3 GB of RAM.

Thanks in advance.

rna-seq assembly • 9.9k views

What is your genome size? STAR needs a lot of memory to generate and sort the suffix array. 32GB is very little RAM for this purpose, and my guess is that the process started swapping. There are some parameters described in the documentation to lower the RAM requirements, e.g. --genomeChrBinNbits 12; see http://seqanswers.com/forums/showthread.php?t=27470
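To answer the genome size question, one rough approach is to count the sequence characters in the FASTA files. The sketch below uses a toy FASTA file in a temporary directory as a stand-in for the real chromosome files (the file name and its contents are made up for illustration); for a real run, point the glob at the directory holding the .fa files.

```shell
#!/bin/sh
# Toy FASTA standing in for the real chromosome files (hypothetical data).
tmpdir=$(mktemp -d)
printf '>chrA\nACGTACGT\nACGT\n>chrB\nGGCC\n' > "$tmpdir/toy.fa"

# Concatenate all FASTA files, drop header lines (starting with '>'),
# strip newlines, and count the remaining characters.
genome_bases=$(cat "$tmpdir"/*.fa | grep -v '^>' | tr -d '\n' | wc -c | tr -d ' ')
echo "genome size: $genome_bases bases"

rm -r "$tmpdir"
```

The human genome comes out at roughly 3 billion bases with this kind of count.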

(comment written 3.9 years ago by Michael Dondrup)
Chris Cole (Scotland) wrote 3.9 years ago:

You have two potential problems here.

1) You're setting --runThreadN 11, but your machine only has 8 cores. You may have hyperthreading, which allows 16 threads, but I don't find it all that useful. Best to stick to a maximum of --runThreadN 8.

2) The human genome requires at least 30GB of free RAM to run. Indexing may require more. You're probably running out of memory, which then spills over into swap, which is *VERY* slow. Get more RAM, be patient, or use a different aligner that requires less memory.
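Both points can be checked from the shell before starting a long run. This is a minimal sketch, assuming a Linux machine: it caps the requested thread count at the number of CPUs the system reports and prints the available RAM and the swap currently in use from /proc/meminfo. Note that nproc counts logical CPUs, so with hyperthreading it reports more than the number of physical cores; treat it as an upper bound.

```shell
#!/bin/sh
# Thread count from the original command line.
requested=11

# Number of logical CPUs; fall back to getconf where nproc is missing.
if command -v nproc >/dev/null 2>&1; then
    cpus=$(nproc)
else
    cpus=$(getconf _NPROCESSORS_ONLN)
fi

# Never ask for more threads than the machine reports.
if [ "$requested" -gt "$cpus" ]; then
    threads=$cpus
else
    threads=$requested
fi
echo "Using --runThreadN $threads"

# Linux only: /proc/meminfo reports values in kB (KiB), converted here to GiB.
if [ -r /proc/meminfo ]; then
    awk '/^MemAvailable:/ {printf "available RAM: %.1f GiB\n", $2 / 1048576}
         /^SwapTotal:/    {total = $2}
         /^SwapFree:/     {free = $2}
         END              {printf "swap in use: %.1f GiB\n", (total - free) / 1048576}' /proc/meminfo
fi
```

If the swap figure keeps growing while STAR is running, the job has most likely run out of RAM.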

msimmer92 (Uruguay) wrote 19 months ago:

I've got a question. I am in the same situation, and I don't know whether this is part of the normal process or whether I am using too few threads. My supervisor recommended that I use --runThreadN 2 (two threads). My computer is a 17-inch MacBook Pro (Mid 2009) running OS X El Capitan (v 10.11.6). Processor: 3.06 GHz Intel Core 2 Duo. Memory: 8GB 1067 MHz DDR3 (two memory slots with 4GB each, each of which accepts a 1067 MHz DDR3 memory module). Graphics: NVIDIA GeForce 9400M 256 MB. Storage: 500GB (393 GB free). How do you calculate the optimal number of threads for this?


You'll probably need more memory before you need extra threads.

STAR is only fast at the expense of using a lot of memory. If you cannot acquire enough memory, then perhaps you should look at different aligners (e.g. bowtie2).

How big is your genome? If you are working with a human genome, the author of STAR (alexdobin) says that you can get by with 16GB of RAM in sparse mode (at the expense of speed), but you would need more like 30GB using the defaults. (see http://seqanswers.com/forums/showthread.php?t=27470)
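As a rough sketch of what a sparse-mode run could look like: STAR's --genomeSAsparseD option sets the suffix array sparsity, and larger values reduce the RAM needed at the cost of mapping speed. The value 2 and the thread count below are illustrative assumptions, and the paths are the placeholders from the question. The script only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Placeholder paths copied from the question; replace with real ones.
genome_dir=/path/to/GenomeDir
fasta_glob='/path/to/genome/*.fa'

# --genomeSAsparseD 2 is an assumed example value, not a tested recommendation.
star_cmd="STAR --runMode genomeGenerate --genomeDir $genome_dir --genomeFastaFiles $fasta_glob --runThreadN 8 --genomeSAsparseD 2"

# Dry run: print the command instead of executing it.
echo "$star_cmd"
```

Keep in mind that an index built with a sparse suffix array trades alignment speed for the lower memory footprint.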

(comment written 14 months ago by Christopher Bottoms)

It was the human genome. I ended up moving to the lab's cluster, where I could use STAR without any problems. Now that you say this, I understand why it didn't work properly on my personal computer. Thank you for your input and the link!

(comment written 12 months ago by msimmer92)