Question: STAR aligner takes a long time to generate the genome index
xcalle91 (European Union) wrote, 4.8 years ago:

Hi, I'm trying to use the STAR ultrafast aligner. First I need to generate a genome index to align to, so, as described in the manual, I ran this command:

    /pathToStarDir/STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDir --genomeFastaFiles /path/to/genome/*.fa --runThreadN 11

 

GenomeDir contains one FASTA file for each human chromosome.

I started it yesterday morning, and after 24 hours it still has not finished.

I am wondering whether I did something wrong and it is stuck at a step that never ends. I saw a few posts with similar problems, but none of them came up with a solution that works for me.

I am running it on a computer with 8 cores (i7, 3.60 GHz) and 31.3 GB of RAM.

Thanks in advance.

rna-seq assembly • 12k views

What is your genome size? STAR needs a lot of memory to generate and sort the suffix array. 32 GB is very little RAM for this purpose, and my guess is that the process started swapping. The documentation describes some parameters that lower the RAM requirements, e.g. --genomeChrBinNbits 12; see http://seqanswers.com/forums/showthread.php?t=27470

Comment by Michael Dondrup, written 4.8 years ago (modified 9 months ago by RamRS)
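As a rough sketch of the suggestion in the comment above, the original command can be rerun with the extra parameter added. The function name `star_index_lowmem` and the `DRY_RUN` switch are made up for illustration and are not part of STAR; the value 12 is simply the example value quoted in the comment, not a tuned recommendation. By default the function only prints the command so it can be checked before a long run.

```sh
#!/bin/sh
# Hypothetical helper (not part of STAR): assemble a genomeGenerate command
# with --genomeChrBinNbits 12, the example value from the comment above.
# With DRY_RUN=1 (the default here) the command is printed, not executed.
star_index_lowmem() {
    genome_dir=$1
    threads=$2
    shift 2                       # remaining arguments are the FASTA files
    set -- STAR --runMode genomeGenerate \
        --genomeDir "$genome_dir" \
        --genomeFastaFiles "$@" \
        --runThreadN "$threads" \
        --genomeChrBinNbits 12
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"                 # show what would be run
    else
        "$@"                      # actually run STAR (must be on PATH)
    fi
}
```

For example, `star_index_lowmem /path/to/GenomeDir 8 /path/to/genome/*.fa` prints the full command; setting `DRY_RUN=0` in front of the call runs it.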
Chris Cole (Scotland) wrote, 4.8 years ago:

You have two potential problems here.

1) You're setting --runThreadN 11, but your machine only has 8 cores. You may have hyperthreading, which allows 16 threads, but I don't find it all that useful. Best to stick to a maximum of --runThreadN 8.

2) The human genome requires at least 30 GB of free RAM to run. Indexing may require more. You're probably running out of memory, and the process then spills into swap, which is *VERY* slow. Get more RAM, be patient, or use a different aligner that requires less memory.
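Both points can be checked before starting a long run. The following is a minimal sketch, assuming a Linux machine for the memory check; `cap_threads` and `total_ram_gb` are hypothetical helper names, not STAR options. Note that `getconf _NPROCESSORS_ONLN` reports logical CPUs, so hyperthreads are included in the count.

```sh
#!/bin/sh
# Hypothetical helpers for checking threads and memory before running STAR.

# Cap a requested thread count at the CPU count. The second argument is
# optional; if omitted, the number of logical CPUs reported by the OS is used
# (this includes hyperthreads).
cap_threads() {
    requested=$1
    available=${2:-$(getconf _NPROCESSORS_ONLN)}
    if [ "$requested" -gt "$available" ]; then
        echo "$available"
    else
        echo "$requested"
    fi
}

# Print total memory in whole GiB (rounded down), read from /proc/meminfo.
# Linux only; an alternative file can be passed as the first argument.
total_ram_gb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}
```

With these, `--runThreadN "$(cap_threads 11 8)"` would pass 8 rather than 11, and `total_ram_gb` gives a quick figure to compare against the roughly 30 GB mentioned above.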

msimmer92 (Uruguay) wrote, 2.5 years ago:

I've got a question. I am in the same situation, and I don't know whether this is part of the normal process or whether I am using too few threads. My supervisor recommended that I use --runThreadN 2 (two threads). My computer is a MacBook Pro (17-inch, Mid 2009) running OS X El Capitan (v10.11.6). Processor: 3.06 GHz Intel Core 2 Duo. Memory: 8 GB 1067 MHz DDR3 (two memory slots with 4 GB each, each of which accepts a 1067 MHz DDR3 memory module). Graphics: NVIDIA GeForce 9400M 256 MB. Storage: 500 GB (393 GB free). How do you calculate the optimal number of threads for this?


You'll probably need more memory before you need extra threads.

STAR is only fast at the expense of using a lot of memory. If you cannot acquire enough memory, then perhaps you should look at different aligners (e.g. bowtie2).

How big is your genome? If you are working with a human genome, the author of STAR (alexdobin) says that you can get by with 16GB of RAM in sparse mode (at the expense of speed), but you would need more like 30GB using the defaults. (see http://seqanswers.com/forums/showthread.php?t=27470)

Comment by Christopher Bottoms, written 2.1 years ago
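The rule of thumb in the comment above can be written down as a small decision helper. This is only a sketch: `star_memory_flags` is a hypothetical function, the 30 GB and 16 GB thresholds are the rough human-genome figures quoted in this thread rather than exact limits, and using `--genomeSAsparseD 2` for "sparse mode" is an assumption that requires a STAR version supporting that parameter.

```sh
#!/bin/sh
# Hypothetical helper: given free RAM in whole GB, print extra genomeGenerate
# flags for a human-sized genome, using the rough figures from this thread.
star_memory_flags() {
    ram_gb=$1
    if [ "$ram_gb" -ge 30 ]; then
        echo ""                       # defaults should fit; no extra flags
    elif [ "$ram_gb" -ge 16 ]; then
        # Assumed setting for sparse mode: a sparser suffix array needs
        # less RAM but makes mapping slower.
        echo "--genomeSAsparseD 2"
    else
        echo "not enough RAM for STAR on a human genome" >&2
        return 1
    fi
}
```

On the 8 GB laptop described above, the function fails with an error message, which matches the advice to move to a larger machine or to a lighter aligner.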

It was the human genome. I ended up moving to the lab's cluster, where I could use STAR without any problems. Now that you say this, I understand why it didn't work properly on my personal computer. Thank you for your input and the link!

Comment by msimmer92, written 23 months ago