Question: STAR Create Custom Reference Genome
b.verhagen wrote (8 days ago):

Dear all,

I want to use STAR to align my reads to a small reference genome (~12,500 UTRs). Although I have created a .fa file of my library, STAR is not able to generate a genome because it cannot read the file. Has anyone else experienced the same problem, or does anyone have a solution?

Best, Bram Verhagen

Code:

./STAR --runMode genomeGenerate --genomeDir home/hub_tanenbaum/bverhagen/star-genome/ --genomeFastaFiles home/hub_tanenbaum/bverhagen/star-genome/sequence.fa
rna-seq star • 107 views
modified 7 days ago • written 8 days ago by b.verhagen

As you are building an index with thousands of sequences, you will have to play with the value of --genomeChrBinNbits parameter - check STAR manual on how to set a proper value.

written 8 days ago by h.mon
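
To make the comment above concrete: the STAR manual suggests scaling --genomeChrBinNbits as min(18, log2(max(GenomeLength/NumberOfReferences, ReadLength))). The sketch below computes that value from a FASTA file. The function name suggest_chr_bin_nbits, the default read length of 100, and rounding to the nearest integer are assumptions made for illustration, not something STAR provides.

```shell
#!/bin/bash
# Sketch: suggest a --genomeChrBinNbits value for a FASTA with many short sequences.
# Rule of thumb from the STAR manual:
#   min(18, log2(max(GenomeLength / NumberOfReferences, ReadLength)))
# Rounding to the nearest integer is an assumption of this sketch.
suggest_chr_bin_nbits() {
    local fasta="$1"
    local read_length="${2:-100}"   # assumed default; pass your real read length
    awk -v rl="$read_length" '
        /^>/ { nseq++; next }                         # header line: one more reference
        { gsub(/[ \t\r]/, ""); total += length($0) }  # sequence line: count its bases
        END {
            if (nseq == 0) { print "no FASTA records found" > "/dev/stderr"; exit 1 }
            avg = total / nseq                        # GenomeLength / NumberOfReferences
            if (avg < rl) avg = rl                    # max(..., ReadLength)
            bits = int(log(avg) / log(2) + 0.5)       # log2, rounded to nearest
            if (bits > 18) bits = 18                  # cap at the STAR default of 18
            print bits
        }' "$fasta"
}
```

The result could then be passed on the command line, for example --genomeChrBinNbits "$(suggest_chr_bin_nbits /hpc/hub_tanenbaum/Bram/sequence.fa 100)".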

Now I wrote the following script (script.sh):

#!/bin/bash
/home/hub_tanenbaum/bverhagen /STAR/source/STAR --runMode genomeGenerate --genomeDir /hpc/hub_tanenbaum/Bram/ --genomeFastaFiles /hpc/hub_tanenbaum/Bram/sequence.fa

And submitted it with the following command:

qsub -q all.q -l h_vmem=30G -S /bin/bash -cwd -o TEST  script.sh

I get the following error, although a folder with the name STARtemp is made:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
/opt/sge/default/spool/n0085/job_scripts/9137340: line 2: 89103 Aborted                 (core dumped) /home/hub_tanenbaum/bverhagen/STAR/source/STAR --runMode genomeGenerate --genomeDir /hpc/hub_tanenbaum/Bram --genomeFastaFiles /hpc/hub_tanenbaum/Bram/sequence.fa --runThreadN 8

Moreover, if I change the script to:

#!/bin/bash
$STAR --runMode genomeGenerate --genomeDir /hpc/hub_tanenbaum/Bram/ --genomeFastaFiles /hpc/hub_tanenbaum/Bram/sequence.fa

With $STAR being: /home/hub_tanenbaum/bverhagen /STAR/source/STAR

I receive an error stating that the path is not found. Do I need to specify the path in another way? In the script?

modified 7 days ago by genomax • written 7 days ago by b.verhagen

Please use the code button (101010) to format code, commands and such.

Also, do not post questions in the space reserved for answers.

written 7 days ago by h.mon

How did you acquire STAR? Did you download the executable directly or compile it from source? If you got the executable, did you get the one appropriate for your operating system?

written 7 days ago by genomax

With $STAR being: /home/hub_tanenbaum/bverhagen /STAR/source/STAR

I receive an error stating that the path is not found. Do I need to specify the path in another way? In the script?

  1. Did you set the variable up as STAR=/home/hub_tanenbaum/bverhagen/STAR/source/STAR? It is probably not a great idea to give a variable the same name as the program executable.
  2. As posted, there is a space between bverhagen and the following /STAR: /home/hub_tanenbaum/bverhagen /STAR/source/STAR
  3. You are running STAR with 8 threads, but there is no corresponding request for cores in the job scheduler.
modified 7 days ago • written 7 days ago by genomax
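
The three points above could be handled along the following lines. This is only a sketch: STAR_BIN, check_star_bin and build_star_cmd are hypothetical names, and it assumes a Grid Engine setup in which NSLOTS is set for jobs submitted with a parallel environment (the parallel environment name is site-specific and not known from this thread).

```shell
#!/bin/bash
# Sketch only: check_star_bin and build_star_cmd are hypothetical helper names.

# Succeeds only if the path contains no whitespace and is an executable regular file.
check_star_bin() {
    local path="$1"
    if [[ "$path" =~ [[:space:]] ]]; then
        echo "path contains whitespace: '$path'" >&2
        return 1
    fi
    if [[ ! -f "$path" || ! -x "$path" ]]; then
        echo "not an executable file: '$path'" >&2
        return 1
    fi
}

# Fills the global array STAR_CMD with a genomeGenerate command line.
# The thread count follows NSLOTS (assumed to be set by the scheduler) and
# falls back to 1, so STAR does not use more cores than were requested.
build_star_cmd() {
    local star_bin="$1" genome_dir="$2" fasta="$3"
    STAR_CMD=("$star_bin" --runMode genomeGenerate
              --genomeDir "$genome_dir"
              --genomeFastaFiles "$fasta"
              --runThreadN "${NSLOTS:-1}")
}
```

In script.sh this could be used as STAR_BIN=/home/hub_tanenbaum/bverhagen/STAR/source/STAR, followed by check_star_bin "$STAR_BIN" && build_star_cmd "$STAR_BIN" /hpc/hub_tanenbaum/Bram /hpc/hub_tanenbaum/Bram/sequence.fa && "${STAR_CMD[@]}". Keeping the command in an array and quoting every expansion avoids word splitting if a path ever does contain a space.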
genomax wrote (8 days ago):

You are likely missing a / before home/hub_tanenbaum/bverhagen/star-genome/ (in both places), unless the path is relative to wherever you are running this from and there is a home directory in there.

written 8 days ago by genomax
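
A quick way to see the difference between the two forms is to print how the shell resolves each path and check that the file is readable before calling STAR. The helpers below (resolve_from_cwd and require_readable) are hypothetical names used only for this sketch.

```shell
#!/bin/bash
# Sketch only: resolve_from_cwd and require_readable are hypothetical helper names.

# Prints the location a path refers to: absolute paths are printed unchanged,
# relative paths are interpreted from the current working directory.
resolve_from_cwd() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s\n' "$PWD/$1" ;;
    esac
}

# Fails with a message naming the resolved location if the file cannot be read.
require_readable() {
    local abs
    abs=$(resolve_from_cwd "$1")
    if [[ ! -r "$abs" ]]; then
        echo "cannot read: $abs" >&2
        return 1
    fi
}
```

Running require_readable home/hub_tanenbaum/bverhagen/star-genome/sequence.fa from a directory that has no home subdirectory would report the resolved location, which makes the missing leading / easy to spot.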

Thanks,

However after changing this, I obtain the following:

Dec 04 14:43:26 ... starting to generate Genome files
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted

written 8 days ago by b.verhagen

See my comment above. Also, how much RAM do you have? And what is the size of the reference?

modified 8 days ago • written 8 days ago by h.mon

I am running the script on a HPC server using a mobile terminal (5GB RAM on server). The library is not very big though, roughly 8000 sequences.

written 8 days ago by b.verhagen

If you got that error, then allocate more RAM for the job. Use 20G if possible. Sometimes it is hard to predict what STAR may need. Take into consideration @h.mon's comment on your original question.

written 8 days ago by genomax
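
As a rough illustration of why the bin size matters for memory: according to the STAR manual, each reference sequence occupies a whole number of bins of 2^genomeChrBinNbits bases. The sketch below computes the resulting padded genome length. The function name padded_genome_bases is hypothetical, and it makes the simplifying assumption that all sequences have the same length. The padded length is only one component of what STAR allocates, so this is not a RAM estimate.

```shell
#!/bin/bash
# Sketch only: padded_genome_bases is a hypothetical helper name.
# Assumes every sequence has the same length (a simplification).

# Prints the padded genome length in bases when each of nseq sequences
# occupies a whole number of bins of 2^bits bases.
padded_genome_bases() {
    local nseq="$1" seq_len="$2" bits="$3"
    local bin=$(( 1 << bits ))                         # bin size in bases
    local bins_per_seq=$(( (seq_len + bin - 1) / bin ))  # round up to whole bins
    echo $(( nseq * bins_per_seq * bin ))
}
```

For the roughly 12,500 sequences mentioned in the question, and an assumed length of 1000 bases each, padded_genome_bases 12500 1000 18 prints 3276800000 with the default of 18 bits, while padded_genome_bases 12500 1000 10 prints 12800000.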