STAR Genome index Error
6 months ago
Prasanna • 0

I tried to run a STAR command for RNA-seq, but I got the following error.

/home/pshekar/RNAseq/STAR-2.7.11a/source/STAR --runMode genomeGenerate \
> --genomeDir GRCh38.79.chrom1 \
> --genomeFastaFiles genome/Homo_sapiens.GRCh38.dna.chromosome.1.fa \
> --sjdbGTFfile gtf/Homo_sapiens.GRCh38.79.chrom1.gtf \
> --sjdbOverhang 62
        /home/pshekar/RNAseq/STAR-2.7.11a/source/STAR --runMode genomeGenerate --genomeDir GRCh38.79.chrom1 --genomeFastaFiles genome/Homo_sapiens.GRCh38.dna.chromosome.1.fa --sjdbGTFfile gtf/Homo_sapiens.GRCh38.79.chrom1.gtf --sjdbOverhang 62
*****!!!!! WARNING: --genomeSAindexNbases 14 is too large for the genome size=248956422, which may cause seg-fault at the mapping step. Re-run genome generation with recommended --genomeSAindexNbases 12*****
Oct 03 18:27:25 ... starting to sort Suffix Array. This may take a long time...
Oct 03 18:27:26 ... sorting Suffix Array chunks and saving them to disk...
Killed

How much memory does your machine have?


Agreed, it's a memory problem. I'm wondering whether using the appropriate --genomeSAindexNbases value would help in this case.


From the documentation:

For small genomes, the parameter --genomeSAindexNbases must be scaled down, with a typical value of min(14, log2(GenomeLength)/2 - 1). For example, for 1 megaBase genome, this is equal to 9, for 100 kiloBase genome, this is equal to 7.

Using the size provided in the error, the value this person should be using is 13.


It's actually 12. The creators have said on a forum to take the floor of log2(GenomeLength)/2 - 1, which isn't well represented in the documentation. Here log2(248956422)/2 - 1 ≈ 12.95, so the floor is 12.

https://github.com/alexdobin/STAR/issues/972
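
For anyone who wants to check the arithmetic, here is a minimal sketch of the floor-based rule discussed above. The function name sa_index_nbases is made up for illustration and is not part of STAR; it only assumes that awk is available.

```shell
# Hypothetical helper (not part of STAR): suggested --genomeSAindexNbases
# for a genome of N bases, computed as min(14, floor(log2(N)/2 - 1)).
sa_index_nbases() {
    awk -v n="$1" 'BEGIN {
        v = int(log(n) / log(2) / 2 - 1)   # int() truncates, which is floor for positive values
        if (v > 14) v = 14                 # cap at 14, the value STAR warned about for this genome
        print v
    }'
}

# Chromosome 1 length taken from the warning message in the question.
sa_index_nbases 248956422
```

For the chromosome 1 length in the warning this prints 12, which matches the value STAR itself recommends in its message.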

STAR --runMode genomeGenerate \
--genomeDir GRCh38.79.chrom1 \
--genomeFastaFiles genome/Homo_sapiens.GRCh38.dna.chromosome.1.fa \
--sjdbGTFfile gtf/Homo_sapiens.GRCh38.79.chrom1.gtf \
--sjdbOverhang 62

This is the code I am using. What should I modify to prevent the error?


Add the flag --genomeSAindexNbases 12.

[pshekar@login002 RNAseq]$ /home/pshekar/RNAseq/STAR-2.7.11a/source/STAR --runMode genomeGenerate \
> --genomeDir GRCh38.79.chrom1 \
> --genomeFastaFiles genome/Homo_sapiens.GRCh38.dna.chromosome.1.fa \
> --sjdbGTFfile gtf/Homo_sapiens.GRCh38.79.chrom1.gtf \
> --genomeSAindexNbases 12
        /home/pshekar/RNAseq/STAR-2.7.11a/source/STAR --runMode genomeGenerate --genomeDir GRCh38.79.chrom1 --genomeFastaFiles genome/Homo_sapiens.GRCh38.dna.chromosome.1.fa --sjdbGTFfile gtf/Homo_sapiens.GRCh38.79.chrom1.gtf --genomeSAindexNbases 12
        STAR version: 2.7.11a   compiled: 2023-10-03T18:15:28-04:00 login002.palmetto.clemson.edu:/home/pshekar/RNAseq/STAR-2.7.11a/source
Oct 04 10:56:22 ..... started STAR run
Oct 04 10:56:22 ... starting to generate Genome files
Oct 04 10:56:25 ..... processing annotations GTF
Oct 04 10:56:26 ... starting to sort Suffix Array. This may take a long time...
Oct 04 10:56:27 ... sorting Suffix Array chunks and saving them to disk...
Killed
[pshekar@login002 RNAseq]$ --sjdbOverhang 62

Why did you add it in the middle instead of at the end? Run this:

STAR --runMode genomeGenerate \
--genomeDir GRCh38.79.chrom1 \
--genomeFastaFiles genome/Homo_sapiens.GRCh38.dna.chromosome.1.fa \
--sjdbGTFfile gtf/Homo_sapiens.GRCh38.79.chrom1.gtf \
--sjdbOverhang 62 \
--genomeSAindexNbases 12
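
To illustrate why the trailing backslashes matter, here is a small sketch. count_args is a made-up stand-in for STAR that only reports how many arguments it received.

```shell
# Hypothetical stand-in for STAR: prints the number of arguments it was given.
count_args() { echo "$#"; }

# Every line except the last ends in a backslash, so the shell joins the
# lines into one command and all four arguments reach it.
count_args --sjdbOverhang 62 \
    --genomeSAindexNbases 12
```

If the backslash after 62 were missing, the shell would run the command with only the first two arguments and then try to execute "--genomeSAindexNbases 12" as a separate command. That is what happened to "--sjdbOverhang 62" in the session pasted above.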

Could you answer my question?


I have no idea. I am new to bioinformatics, sorry.


Ask the system administrator. From your prompt, it looks like you're on a login node. DO NOT RUN high-memory or long-running jobs on a login node. Contact your sysadmin about requesting a high-memory node that allows long-running jobs.


Can you tell me how to find it?


I am using my college's cluster computer


Are you using a compute node or just the login node? If you're using a compute node, how much RAM did you request for allocation?
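
One rough way to answer both questions from the command line is sketched below. The function looks_like_login_node is hypothetical, and the "login" hostname prefix is only a guess based on the login002 prompt shown earlier in this thread; your cluster may name its nodes differently.

```shell
# Hypothetical heuristic: treat hostnames that start with "login" as login nodes.
looks_like_login_node() {
    case "$1" in
        login*) return 0 ;;
        *)      return 1 ;;
    esac
}

if looks_like_login_node "$(uname -n)"; then
    echo "This looks like a login node; run the job through the scheduler instead."
fi

# Total and available RAM on the current machine (free is Linux-specific).
if command -v free >/dev/null 2>&1; then
    free -h
fi
```

On a shared cluster the figure reported by free describes the whole node, not necessarily what a single user is allowed to use, so local support remains the authoritative source.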


I am using it through an online cluster.


What does that even mean? What is an "online cluster"?


It still shows the error.

/home/pshekar/RNAseq/STAR-2.7.11a/source/STAR --runMode genomeGenerate \
--genomeDir GRCh38.79.chrom1 \
--genomeFastaFiles genome/Homo_sapiens.GRCh38.dna.chromosome.1.fa \
--sjdbGTFfile gtf/Homo_sapiens.GRCh38.79.chrom1.gtf \
--genomeSAindexNbases 12 \
--sjdbOverhang 62

Do you have write permissions to the genomeDir specified above? It appears that your process is killed right after it tries to write to that disk location.


I didn't set write permissions for those. How do I do that?


If you execute touch GRCh38.79.chrom1/testfile, do you get an error? If you do, then you don't have write permissions to the directory.

This may be a good point to stop and familiarize yourself with the Unix command line if you are not familiar with Unix file permissions. Simply issuing commands without understanding the logic is not a good way of doing analysis.
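
As an alternative to the touch test that leaves no file behind, here is a minimal sketch. The function name can_write_dir is made up for illustration, and the directory name is the one from the question.

```shell
# Hypothetical helper: succeeds only if the path is an existing directory
# that the current user may write to.
can_write_dir() {
    [ -d "$1" ] && [ -w "$1" ]
}

genome_dir=GRCh38.79.chrom1   # the --genomeDir value from the question
if can_write_dir "$genome_dir"; then
    echo "$genome_dir is writable"
    df -h "$genome_dir"       # free space on the filesystem that will hold the index
else
    echo "$genome_dir is missing or not writable"
fi
```

The df line is there because a full disk or an exhausted quota can also make writes fail even when the permissions are fine.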


Thanks. There is no error.


Can you please suggest how I can modify the code? Thanks!


The code itself looks fine. If you are trying to run this on a cluster without going through the job scheduler, then it is possible that the admins have set something up to kill jobs that are run outside of the scheduler. You will need to ask your local support staff if you are not familiar with how the cluster is set up.

Generally, admins frown upon people running analysis jobs on login nodes (which is where you seem to be running this from). Jobs are meant to be run via a job scheduler (SLURM, SGE, etc.).
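
As a rough sketch of what going through a scheduler could look like, the snippet below writes a batch script for SLURM. It assumes the cluster actually uses SLURM and that STAR is on the PATH inside the job; the file name star_index.sbatch and the CPU, memory, and time values are placeholders, so check the right settings with local support.

```shell
# Write a SLURM batch script; the quoted 'EOF' keeps the text literal.
cat > star_index.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=star_index
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G
#SBATCH --time=02:00:00

# Placeholder resources above; the STAR options are the ones from this thread.
STAR --runMode genomeGenerate \
    --runThreadN "$SLURM_CPUS_PER_TASK" \
    --genomeDir GRCh38.79.chrom1 \
    --genomeFastaFiles genome/Homo_sapiens.GRCh38.dna.chromosome.1.fa \
    --sjdbGTFfile gtf/Homo_sapiens.GRCh38.79.chrom1.gtf \
    --sjdbOverhang 62 \
    --genomeSAindexNbases 12
EOF

# Submit from the login node with: sbatch star_index.sbatch
```

On a cluster that uses a different scheduler (PBS, SGE), the directive lines and the submit command differ, but the STAR command itself stays the same.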


Thank you very much


Your code had a backslash instead of a forward slash (GRCh38.79.chrom1\testfile - I fixed it) - the touch test will not yield useful results with that.
