Question: STAR genome index Error
sallyey20 wrote:

Hi,

I am running STAR genome indexing on my company's cluster:

STAR --runThreadN 1 --runMode genomeGenerate --genomeDir /home/id/ \
    --genomeFastaFiles /home/id/GRCm38.primary_assembly.genome.fa \
    --sjdbGTFfile /home/id/gencode.vM24.primary_assembly.annotation.gtf \
    --sjdbOverhang 100

But I keep running into the same error:

Apr 01 08:44:03 ..... started STAR run
Apr 01 08:44:03 ... starting to generate Genome files
Apr 01 08:45:09 ... starting to sort Suffix Array. This may take a long time...
Apr 01 08:45:25 ... sorting Suffix Array chunks and saving them to disk...
Apr 01 09:10:06 ... loading chunks from disk, packing SA...

EXITING because of FATAL problem while generating the suffix array
The number of indices read from chunks = 2269570266 is not equal to expected nSA=5305567000
SOLUTION: try to re-run suffix array generation, if it still does not work, report this problem to the author

Apr 01 09:10:49 ...... FATAL ERROR, exiting

After googling the problem, I thought the low thread count might be the cause, so I increased it to 4. I also allocated 36 GB of memory to the job on the cluster, but I still get the same error message. Could you help me resolve this issue? Thank you so much.

index rna-seq star • 278 views
modified 5 months ago by Biostar • written 5 months ago by sallyey20

The command looks OK, and 36 GB is enough, but try increasing the memory anyway, just to rule it out. Was there a problem downloading the genome? Did you unzip it after downloading?
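If the original .gz downloads are still around, one way to rule out a truncated or corrupt download is to test the archive and count the FASTA records in it. The following is a minimal sketch, not part of STAR; the function name `check_fasta_gz` and the demo file are made up for illustration, and the demo uses a tiny generated file so it runs anywhere.

```shell
#!/bin/sh
# check_fasta_gz FILE.gz: verify gzip integrity, then count FASTA records.
# (Hypothetical helper for illustration; not a STAR or GENCODE tool.)
check_fasta_gz() {
    # gzip -t exits non-zero if the archive is corrupt or truncated.
    gzip -t "$1" 2>/dev/null || { echo "corrupt or truncated: $1"; return 1; }
    # Every FASTA record starts with a '>' header line.
    n=$(gzip -dc "$1" | grep -c '^>')
    echo "ok: $1 ($n sequences)"
}

# Self-contained demo on a tiny made-up file.
tmp=$(mktemp -d)
printf '>chr1\nACGT\n>chr2\nGGCC\n' | gzip -c > "$tmp/demo.fa.gz"
check_fasta_gz "$tmp/demo.fa.gz"
```

For the real case, the same function would be pointed at GRCm38.primary_assembly.genome.fa.gz; a failure here would suggest re-downloading before re-running STAR.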

modified 5 months ago • written 5 months ago by piyushjo520

I did unzip the two files, and I confirmed that I can see the GTF and FASTA files in /home/id/. Now I am running the command with 8 threads. How many threads do you recommend I try? (Update: I just got the same error message with 8 threads.)

The source of the two files:

  1. Genome sequence, primary assembly (GRCm38) ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M24/GRCm38.primary_assembly.genome.fa.gz

  2. Comprehensive gene annotation ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M24/gencode.vM24.primary_assembly.annotation.gtf.gz

modified 5 months ago • written 5 months ago by sallyey20

I don't think the number of threads is the issue; more threads will just make the program run faster. I was asking about memory, not threads.

It could be that you don't have enough free disk space. Someone got the same error, probably because they had less than 100 GB of free space. See: https://github.com/alexdobin/STAR/issues/534
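A quick way to check this is to ask how much space is free on the filesystem that holds the --genomeDir, since STAR saves the suffix array chunks to disk while it works. This is only a sketch: the `free_gib` helper and the `GENOME_DIR` variable are illustrative names, and the 100 GiB threshold is the rough figure mentioned above, not a documented STAR requirement.

```shell
#!/bin/sh
# free_gib DIR: print the free space, in whole GiB, on DIR's filesystem.
# (Hypothetical helper for illustration.)
free_gib() {
    # df -P forces one line per filesystem; -k reports 1 KiB blocks.
    # Column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

GENOME_DIR="${GENOME_DIR:-.}"   # assumption: set to your --genomeDir, e.g. /home/id/
REQUIRED_GIB=100                # rough figure from this thread, not an official limit

avail=$(free_gib "$GENOME_DIR")
if [ "$avail" -lt "$REQUIRED_GIB" ]; then
    echo "WARNING: only ${avail} GiB free in ${GENOME_DIR}"
else
    echo "OK: ${avail} GiB free in ${GENOME_DIR}"
fi
```

On a cluster it is also worth checking whether a per-user quota applies to the home directory, since `df` reports the filesystem's free space rather than your own allowance.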

written 5 months ago by piyushjo520

Thanks. I already looked at that page a few hours ago. I am still rerunning my command with different amounts of memory. Fingers crossed.

written 5 months ago by sallyey20

In my experience, I have successfully generated the index on my desktop machine, which had 40 GB of RAM. Are you running it in the cloud? I hope you have enough space on the drive. That is the only suggestion that Alex (the creator of STAR) himself mentioned, and it did the trick.

written 5 months ago by piyushjo520