Question: STAR index doesn't finish
11 months ago, alvarocentron91 wrote:

Hello, I'm having problems indexing my genome with STAR. I used the following command:

STAR --runThreadN 5 --runMode genomeGenerate --genomeDir STAR_index --genomeFastaFiles p3_t237631_Ust_maydi_v2GB.scaf.fa --sjdbGTFfile p3_t237631_Ust_maydi_v2GB.gtf --sjdbOverhang 88 --limitGenomeGenerateRAM 80000000000

I have tried 10, 20 and 40 cores and RAM limits of 40, 50, 60, 70 and 80 GB, and I always get the same problem: the program reaches "finished successfully" but never exits.

Here is the end of the Log.out:

STAR version=STAR_2.6.0c
STAR compilation time,server,dir=vie sep 7 11:05:55 CEST 2018 nodo00.ladon.local:/apps/ebuild/.local/easybuild/build/STAR/2.6.0c/foss-2018a/STAR-2.6.0c/source
    -------------------------------
##### Final effective command line:
/apps/ebuild/.local/easybuild/software/STAR/2.6.0c-foss-2018a/bin/STAR   --runMode genomeGenerate   --runThreadN 5   --genomeDir STAR_index   --genomeFastaFiles p3_t237631_Ust_maydi_v2GB.scaf.fa      --limitGenomeGenerateRAM 80000000000   --sjdbGTFfile p3_t237631_Ust_maydi_v2GB.gtf   --sjdbOverhang 88

##### Final parameters after user input--------------------------------:
versionSTAR                       20201
versionGenome                     20101   20200   
parametersFiles                   -   
sysShell                          -
runMode                           genomeGenerate
runThreadN                        5
runDirPerm                        User_RWX
runRNGseed                        777
genomeDir                         STAR_index
genomeLoad                        NoSharedMemory
genomeFastaFiles                  p3_t237631_Ust_maydi_v2GB.scaf.fa   
genomeChainFiles                  -   
genomeSAindexNbases               14
genomeChrBinNbits                 18
genomeSAsparseD                   1
genomeSuffixLengthMax             18446744073709551615
genomeFileSizes                   0   
genomeConsensusFile               -
readFilesType                     Fastx   
readFilesIn                       Read1   Read2   
readFilesPrefix                   -
readFilesCommand                  -   
readMatesLengthsIn                NotEqual
readMapNumber                     18446744073709551615
readNameSeparator                 /   
inputBAMfile                      -
bamRemoveDuplicatesType           -
bamRemoveDuplicatesMate2basesN    0
limitGenomeGenerateRAM            80000000000
limitIObufferSize                 150000000
limitOutSAMoneReadBytes           100000
limitOutSJcollapsed               1000000
limitOutSJoneRead                 1000
limitBAMsortRAM                   0
limitSjdbInsertNsj                1000000
outFileNamePrefix                 ./
outTmpDir                         -
outTmpKeep                        None
outStd                            Log
outReadsUnmapped                  None
outQSconversionAdd                0
outMultimapperOrder               Old_2.4
outSAMtype                        SAM   
outSAMmode                        Full
outSAMstrandField                 None
outSAMattributes                  Standard   
outSAMunmapped                    None   
outSAMorder                       Paired
outSAMprimaryFlag                 OneBestScore
outSAMreadID                      Standard
outSAMmapqUnique                  255
outSAMflagOR                      0
outSAMflagAND                     65535
outSAMattrRGline                  -   
outSAMheaderHD                    -   
outSAMheaderPG                    -   
outSAMheaderCommentFile           -
outBAMcompression                 1
outBAMsortingThreadN              0
outBAMsortingBinsN                50
outSAMfilter                      None   
outSAMmultNmax                    18446744073709551615
outSAMattrIHstart                 1
outSAMtlen                        1
outSJfilterReads                  All
outSJfilterCountUniqueMin         3   1   1   1   
outSJfilterCountTotalMin          3   1   1   1   
outSJfilterOverhangMin            30   12   12   12   
outSJfilterDistToOtherSJmin       10   0   5   10   
outSJfilterIntronMaxVsReadN       50000   100000   200000   
outWigType                        None   
outWigStrand                      Stranded   
outWigReferencesPrefix            -
outWigNorm                        RPM   
outFilterType                     Normal
outFilterMultimapNmax             10
outFilterMultimapScoreRange       1
outFilterScoreMin                 0
outFilterScoreMinOverLread        0.66
outFilterMatchNmin                0
outFilterMatchNminOverLread       0.66
outFilterMismatchNmax             10
outFilterMismatchNoverLmax        0.3
outFilterMismatchNoverReadLmax    1
outFilterIntronMotifs             None
outFilterIntronStrands            RemoveInconsistentStrands
clip5pNbases                      0   
clip3pNbases                      0   
clip3pAfterAdapterNbases          0   
clip3pAdapterSeq                  -   
clip3pAdapterMMp                  0.1   
winBinNbits                       16
winAnchorDistNbins                9
winFlankNbins                     4
winAnchorMultimapNmax             50
winReadCoverageRelativeMin        0.5
winReadCoverageBasesMin           0
scoreGap                          0
scoreGapNoncan                    -8
scoreGapGCAG                      -4
scoreGapATAC                      -8
scoreStitchSJshift                1
scoreGenomicLengthLog2scale       -0.25
scoreDelBase                      -2
scoreDelOpen                      -2
scoreInsOpen                      -2
scoreInsBase                      -2
seedSearchLmax                    0
seedSearchStartLmax               50
seedSearchStartLmaxOverLread      1
seedPerReadNmax                   1000
seedPerWindowNmax                 50
seedNoneLociPerWindow             10
seedMultimapNmax                  10000
seedSplitMin                      12
alignIntronMin                    21
alignIntronMax                    0
alignMatesGapMax                  0
alignTranscriptsPerReadNmax       10000
alignSJoverhangMin                5
alignSJDBoverhangMin              3
alignSJstitchMismatchNmax         0   -1   0   0   
alignSplicedMateMapLmin           0
alignSplicedMateMapLminOverLmate    0.66
alignWindowsPerReadNmax           10000
alignTranscriptsPerWindowNmax     100
alignEndsType                     Local
alignSoftClipAtReferenceEnds      Yes
alignEndsProtrude                 0   ConcordantPair   
alignInsertionFlush               None
peOverlapNbasesMin                0
peOverlapMMp                      0.1
chimSegmentMin                    0
chimScoreMin                      0
chimScoreDropMax                  20
chimScoreSeparation               10
chimScoreJunctionNonGTAG          -1
chimMainSegmentMultNmax           10
chimJunctionOverhangMin           20
chimOutType                       Junctions   
chimFilter                        banGenomicN   
chimSegmentReadGapMax             0
chimMultimapNmax                  0
chimMultimapScoreRange            1
chimNonchimScoreDropMin           20
sjdbFileChrStartEnd               -   
sjdbGTFfile                       p3_t237631_Ust_maydi_v2GB.gtf
sjdbGTFchrPrefix                  -
sjdbGTFfeatureExon                exon
sjdbGTFtagExonParentTranscript    transcript_id
sjdbGTFtagExonParentGene          gene_id
sjdbOverhang                      88
sjdbScore                         2
sjdbInsertSave                    Basic
varVCFfile                        -
waspOutputMode                    None
quantMode                         -   
quantTranscriptomeBAMcompression    1
quantTranscriptomeBan             IndelSoftclipSingleend
twopass1readsN                    18446744073709551615
twopassMode                       None
----------------------------------------


EXITING because of fatal ERROR: could noSep 19 12:48:03 ... starting to generate Genome files
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 0  "Um_chr01" chrStart: 0
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 1  "Um_chr02" chrStart: 2621440
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 2  "Um_chr03" chrStart: 4718592
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 3  "Um_chr04" chrStart: 6553600
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 4  "Um_chr05" chrStart: 7602176
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 5  "Um_chr06" chrStart: 9175040
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 6  "Um_chr07" chrStart: 10223616
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 7  "Um_chr08" chrStart: 11272192
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 8  "Um_chr09" chrStart: 12320768
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 9  "Um_chr10" chrStart: 13107200
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 10  "Um_chr11" chrStart: 13893632
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 11  "Um_chr12" chrStart: 14680064
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 12  "Um_chr13" chrStart: 15466496
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 13  "Um_chr14" chrStart: 16252928
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 14  "Um_chr15" chrStart: 17039360
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 15  "Um_chr16" chrStart: 17825792
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 16  "Um_chr17" chrStart: 18612224
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 17  "Um_chr18" chrStart: 19398656
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 18  "Um_chr19" chrStart: 20185088
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 19  "Um_chr20" chrStart: 20971520
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 20  "Um_chr21" chrStart: 21495808
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 21  "Um_chr22" chrStart: 22020096
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 22  "Um_chr23" chrStart: 22544384
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 23  "Um_scaf_contig_1.256" chrStart: 23068672
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 24  "Um_scaf_contig_1.264" chrStart: 23330816
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 25  "Um_scaf_contig_1.265" chrStart: 23592960
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 26  "Um_scaf_contig_1.271" chrStart: 23855104
Number of SA indices: 39283312
Sep 19 12:48:03 ... starting to sort Suffix Array. This may take a long time...
Number of chunks: 5;   chunks size limit: 78566624 bytes
Sep 19 12:48:04 ... sorting Suffix Array chunks and saving them to disk...
Writing 2698344 bytes into STAR_index/SA_4 ; empty space on disk = 16819015385088 bytes ... done
Writing 77908240 bytes into STAR_index/SA_3 ; empty space on disk = 16818816155648 bytes ...Writing 77601568 bytes into STAR_index/SA_0 ; empty space on disk = 16818884313088 bytes ...WritinWriting 77984464 bytes into STAR_index/SA_2 ; empty space on disk = 16818749046784 bytWritinWriting 78073880 bytes into STAR_index/SA_1 ; empty space on disk = 16818749046784 byt done
 done
 done
 done
Sep 19 12:48:53 ... loading chunks from disk, packing SA...
Sep 19 12:48:56 ... finished generating suffix array
Sep 19 12:48:56 ... generating Suffix Array index
Sep 19 12:49:03 ... completed Suffix Array index
Sep 19 12:49:03 ..... processing annotations GTF
Processing pGe.sjdbGTFfile=p3_t237631_Ust_maydi_v2GB.gtf, found:
        6786 transcripts
        9745 exons (non-collapsed)
        2951 collapsed junctions
Sep 19 12:49:03 ..... finished GTF processing
Sep 19 12:49:03   Loaded database junctions from the GTF file: p3_t237631_Ust_maydi_v2GB.gtf: 2951 total junctions

WARNING: long repeat for junction # 1350 : Um_chr06 1022586 1023311; left shift = 255; right shift = 2
Sep 19 12:49:03   Finished preparing junctions
Sep 19 12:49:03 ..... inserting junctions into the genome indices
Sep 19 12:49:04   Finished SA search: number of new junctions=2951, old junctions=0
Sep 19 12:49:05   Finished sorting SA indicesL nInd=1038370
Sep 19 12:49:05   Finished inserting junction indices
Sep 19 12:49:10   Finished SAi
Sep 19 12:49:10 ..... finished inserting junctions into genome
Sep 19 12:49:10 ... writing Genome to disk ...
Writing 24639575 bytes into STAR_index/Genome ; empty space on disk = 16818882215936 bytes ... done
SA size in bytes: 166326942
Sep 19 12:49:11 ... writing Suffix Array to disk ...
Writing 166326942 bytes into STAR_index/SA ; empty space on disk = 16819014336512 bytes ... done
Sep 19 12:49:13 ... writing SAindex to disk
Writing 8 bytes into STAR_index/SAindex ; empty space on disk = 16820313522176 bytes ... done
Writing 120 bytes into STAR_index/SAindex ; empty space on disk = 16820313522176 bytes ... done
Writing 1565873491 bytes into STAR_index/SAindex ; empty space on disk = 16820313522176 bytes ... done
Sep 19 12:49:36 ..... finished successfully
DONE: Genome generation, EXITING

I canceled the run and I have the files generated but I don't know if I can trust them :/

Thank you
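
One rough way to judge whether the generated files are complete is to compare the byte counts that Log.out reports ("Writing N bytes into PATH ;") with the sizes of the files on disk. The sketch below assumes that log format, as seen in the excerpt above. The function names `expected_bytes` and `check_index_file` are hypothetical helpers, not part of STAR. Summing several writes to the same path (as happens for SAindex) is also an assumption. A matching size only suggests that the write completed; it does not prove the index is valid.

```shell
#!/bin/sh
# Sum the byte counts that the log reports for one output path.
# $1 = log file, $2 = path exactly as it appears in the log.
# grep -o pulls out each "Writing N bytes into PATH ;" fragment, even when
# several threads interleaved their messages on a single line.
expected_bytes() {
    grep -o 'Writing [0-9][0-9]* bytes into [^ ]* ;' "$1" |
        awk -v p="$2" '$5 == p { total += $2 } END { printf "%.0f\n", total }'
}

# Compare the logged size with the size of the file on disk.
# Prints OK or MISMATCH and returns non-zero on a mismatch.
check_index_file() {
    want=$(expected_bytes "$1" "$2")
    have=$(wc -c < "$2" | tr -d ' ')
    if [ "$want" = "$have" ]; then
        echo "OK $2 $have"
    else
        echo "MISMATCH $2 expected=$want actual=$have"
        return 1
    fi
}

# Example calls (paths as they appear in the log above):
# check_index_file Log.out STAR_index/Genome
# check_index_file Log.out STAR_index/SA
```

The calls are left commented out so the block can be sourced without a Log.out present; run them from the directory in which STAR was started.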

Tags: rna-seq, star

Can you separate the stdout and stderr output into two files instead of writing them to log.out? Do something like

STAR command 2> log.error > log.output

There is some indication of something going wrong but the output is not clearly captured.

EXITING because of fatal ERROR: could noSep 19 12:48:03
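
As a minimal illustration of that redirection, the sketch below uses a stand-in function instead of STAR itself. The names `demo_tool` and `workdir` are hypothetical and exist only for the demo; the `2>` and `>` operators are the same ones used in the command above.

```shell
#!/bin/sh
# Stand-in for STAR (hypothetical): writes one line to stdout and one to stderr.
demo_tool() {
    echo "progress message on stdout"
    echo "error message on stderr" >&2
}

# Scratch directory (hypothetical) so the demo does not touch real log files.
workdir=$(mktemp -d)

# Same pattern as above: stderr goes to one file, stdout to another.
demo_tool 2> "$workdir/log.error" > "$workdir/log.output"

# Show what ended up in each file.
echo "stdout file: $(cat "$workdir/log.output")"
echo "stderr file: $(cat "$workdir/log.error")"
```

The shell creates both files before the command starts, so log.error exists (possibly empty) even if the program writes nothing to stderr.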

modified 11 months ago • written 11 months ago by genomax (70k)

I'm not sure how to do that. However, I re-ran the program with the RAM increased to 90 GB, and now the message

EXITING because of fatal ERROR: could noSep 19 12:48:03

no longer appears.

There is only one warning message:

WARNING: long repeat for junction # 1350 : Um_chr06 1022586 1023311; left shift = 255; right shift = 2

But the problem is still the same: the program doesn't exit even though the last 2 lines of the log are:

Sep 19 15:53:53 ..... finished successfully
DONE: Genome generation, EXITING

It has been stuck there for almost 30 min and I don't know whether I should stop it. (All the files have been generated.)

written 11 months ago by alvarocentron91 (10)

Try running the command this way:

STAR --runThreadN 5 --runMode genomeGenerate --genomeDir STAR_index --genomeFastaFiles p3_t237631_Ust_maydi_v2GB.scaf.fa --sjdbGTFfile p3_t237631_Ust_maydi_v2GB.gtf --sjdbOverhang 88 --limitGenomeGenerateRAM 80000000000 2> log.error > log.output

Then show us what the log.error and log.output files contain.

Are you running the command directly on the command line or are you using a job scheduler of some sort?

modified 11 months ago • written 11 months ago by genomax (70k)

No log.error file is generated, and the "Log files" subsection of STAR's manual gives no indication of how to generate one; it only mentions Log.out and Log.progress.out (which has not been generated).

I'm running the command directly in a Slurm job allocation.

The thing is that one week ago I didn't have this problem when I generated the index for the maize genome. I may try generating the index with HISAT2 to find out whether the problem is caused by my data.

written 11 months ago by alvarocentron91 (10)

Is it possible that even 80 GB of RAM is not enough? Depending on your genome, STAR indexes can be very large, and creating them might require even more.

written 11 months ago by Martombo (2.5k)

Based on the file names it appears to be a fungal genome, so 80 GB should more than likely be enough.

written 11 months ago by genomax (70k)