STAR for soybean RNA-seq alignment
6.9 years ago
valopes ▴ 30

Hi all

This is my first time working with STAR and I need to generate the genome index. However, I only have a GFF3 file instead of a GTF. My GFF3 file contains gene, mRNA, CDS, five_prime_UTR and three_prime_UTR annotations. Also, every time I try to run it, I exceed the available RAM.

I am trying this command:

STAR --runMode genomeGenerate     \
          --genomeDir ./genomeDir       \
          --genomeFastaFiles genomeDir/Gmax_275_v2.0.fa  \
          --runThreadN 8      \
          --limitGenomeGenerateRAM 32000000000 \
          --genomeChrBinNbits 10
          --sjdbGTFfile genomeDir/Gmax_275_Wm82.a2.v1.gene.gff3   \
          --sjdbGTFtagExonParentGene ID  \
          --sjdbGTFfeatureExon CDS  \
          --sjdbOverhang 99

Am I doing it right? It does not seem to be working for me.

Thank you all


Am I correct in assuming that the soybean genome is really, really large? How much RAM do you have?


The soybean genome has 20 chromosomes and an estimated size of 1,115 Mbp.
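
As a rough check, the rules of thumb from the STAR manual can be applied to these numbers: `--genomeSAindexNbases` as min(14, log2(GenomeLength)/2 - 1) and `--genomeChrBinNbits` as min(18, log2(max(GenomeLength/NumberOfReferences, ReadLength))). The sketch below evaluates both with awk. The function names are made up for illustration, the read length of 100 is an assumption (inferred from `--sjdbOverhang 99`), and 20 references assumes the FASTA holds only the chromosomes, which may not be true if unplaced scaffolds are included.

```shell
#!/usr/bin/env bash
# Rules of thumb from the STAR manual, evaluated with awk.
# Function names are illustrative and not part of STAR.

# --genomeSAindexNbases: min(14, log2(GenomeLength)/2 - 1), rounded down
sa_index_nbases() {
    awk -v len="$1" 'BEGIN {
        n = int(log(len) / log(2) / 2 - 1)
        print (n < 14 ? n : 14)
    }'
}

# --genomeChrBinNbits:
#   min(18, log2(max(GenomeLength/NumberOfReferences, ReadLength))), rounded down
chr_bin_nbits() {
    awk -v len="$1" -v nref="$2" -v rl="$3" 'BEGIN {
        x = len / nref
        if (x < rl) x = rl
        n = int(log(x) / log(2))
        print (n < 18 ? n : 18)
    }'
}

GENOME_LENGTH=1115000000   # 1,115 Mbp, as stated above
N_REFS=20                  # assumption: chromosomes only, no extra scaffolds
READ_LENGTH=100            # assumption: taken from --sjdbOverhang 99

echo "genomeSAindexNbases: $(sa_index_nbases "$GENOME_LENGTH")"
echo "genomeChrBinNbits:   $(chr_bin_nbits "$GENOME_LENGTH" "$N_REFS" "$READ_LENGTH")"
```

With these inputs both values come out at the STAR defaults (14 and 18). A larger sequence count, for example from many unplaced scaffolds, would lower the second value.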

6.9 years ago
Michael 54k

Despite the name, the sjdbGTFfile parameter accepts GFF3; we have tried this without problems. RAM usage was high with the default settings, though, if I remember correctly. We needed about 300 GB to generate the index for our 670 Mb genome with 33k scaffolds. Generating an index for the Salmon genome required almost 1 TB. There are parameter settings that should allow for much lower RAM usage. See: http://seqanswers.com/forums/showthread.php?t=27470
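
A dry-run sketch of how the command from the question might be adapted along these lines follows. In the posted command, the line `--genomeChrBinNbits 10` has no trailing backslash, so the shell ends the command there and the `--sjdb...` options are never passed to STAR; building the arguments in a bash array avoids that class of mistake. For GFF3 annotations the STAR manual recommends `--sjdbGTFtagExonParentTranscript Parent`. `--sjdbGTFfeatureExon CDS` is kept from the question because the file reportedly has no exon features, which presumably means junctions inside UTRs are not captured. `--genomeSAsparseD 2` is an untested assumption: a sparser suffix array should lower RAM use at the cost of slower mapping. `--genomeChrBinNbits` is left at its default here. The `DRY_RUN` variable is a made-up convenience, not a STAR feature.

```shell
#!/usr/bin/env bash
# Dry-run sketch: assemble a STAR genomeGenerate command and print it.
# Paths come from the question; option values are assumptions, not tested settings.

star_cmd=(
    STAR --runMode genomeGenerate
    --runThreadN 8
    --genomeDir ./genomeDir
    --genomeFastaFiles genomeDir/Gmax_275_v2.0.fa
    --limitGenomeGenerateRAM 32000000000
    # Sparser suffix array: lower RAM, slower mapping (assumed value).
    --genomeSAsparseD 2
    --sjdbGTFfile genomeDir/Gmax_275_Wm82.a2.v1.gene.gff3
    # GFF3 links child features to transcripts via the Parent attribute.
    --sjdbGTFtagExonParentTranscript Parent
    # Kept from the question: CDS features stand in for missing exon features.
    --sjdbGTFfeatureExon CDS
    --sjdbOverhang 99
)

# DRY_RUN is a hypothetical switch: print by default, run only when set to 0.
if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s ' "${star_cmd[@]}"
    printf '\n'
else
    "${star_cmd[@]}"
fi
```

Running the script as-is only prints the assembled command; setting `DRY_RUN=0` executes it, which requires STAR on the `PATH`.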
