Running STAR
8 weeks ago
Chris ▴ 20

Hi bioinformaticians,

How can I verify that the index files I created are correct?

When I start running alignment with:

STAR --runThreadN 8 \
--genomeDir /home/doanc2/hg38/hg38_index \


I got this error:

EXITING because of FATAL ERROR: could not open genome file /home/doanc2/hg38/hg38_index//genomeParameters.txt
SOLUTION: check that the path to genome files, specified in --genomeDir is correct and the files are present, and have user read permsissions


Thank you so much!


What is the output of ls /home/doanc2/hg38/hg38_index/ and what was the command line to create the index?


The output of ls /hg38_index/

chrLength.txt      chrStart.txt      geneInfo.tab  SA_1   SA_2  SA_5  SA_8                      transcriptInfo.tab
chrNameLength.txt  exonGeTrInfo.tab  Log.out       SA_10  SA_3  SA_6  SA_9
chrName.txt        exonInfo.tab      SA_0          SA_11  SA_4  SA_7  sjdbList.fromGTF.out.tab


Command line to create the Index:

STAR --runThreadN 8 \
--runMode genomeGenerate \
--genomeDir /home/doanc2/hg38/hg38_index \
--genomeFastaFiles /home/doanc2/hg38/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa \
--sjdbGTFfile /home/doanc2/hg38/Homo_sapiens.GRCh38.107.gtf \
--sjdbOverhang 99


Thank you!


It seems incomplete. Did any errors come up, or did the indexing job get killed? Any log messages or errors?


You are right. The indexing job got killed. But it had still created 27 GB of files, so I tried to align. I am rerunning the indexing so I can tell you the exact error.


Yes, rerun. Indexing is all-or-nothing; you cannot do anything with an incomplete index.
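
A quick way to spot an incomplete index before aligning is to check for the files STAR needs at startup. The sketch below assumes a finished index contains genomeParameters.txt (the file named in the error above) plus Genome, SA, and SAindex, and that leftover SA_<n> chunk files point to an interrupted suffix-array sort. The exact file set may differ between STAR versions, and the helper name check_star_index is made up for illustration.

```shell
# Hypothetical helper: report files missing from a STAR genome directory.
# The file list is an assumption based on typical STAR indexes,
# not an official manifest.
check_star_index() {
    dir=$1
    incomplete=0
    for f in genomeParameters.txt Genome SA SAindex; do
        # -s is true only if the file exists and is not empty
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f"
            incomplete=1
        fi
    done
    # Leftover SA_<n> chunks suggest suffix-array sorting never finished.
    for chunk in "$dir"/SA_[0-9]*; do
        if [ -e "$chunk" ]; then
            echo "leftover temporary chunk found, e.g. $chunk"
            incomplete=1
            break
        fi
    done
    return "$incomplete"
}
```

For example, `check_star_index /home/doanc2/hg38/hg38_index && echo "index looks complete"` prints the success message only when none of the checks fail.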


The last time I ran it, I got this:

client_loop: send disconnect: Broken pipe


after it had run for quite a long time. So I guess my inputs were incorrect.


How was this run, and on which machine? A broken pipe usually means the terminal session was disconnected from the remote host. And no, if the inputs were wrong it would not even start building the index; STAR is smart enough to check that up front.
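
Since a dropped SSH connection can take down whatever was started from that terminal, one common workaround is to launch the command detached from the session. Below is a minimal sketch using nohup; the helper name run_detached and the log file name are illustrative, not part of STAR or of any cluster setup.

```shell
# Hypothetical helper: run a command so it keeps going after an SSH disconnect.
# Usage: run_detached LOGFILE COMMAND [ARGS...]
run_detached() {
    log=$1
    shift
    # nohup makes the command ignore the hangup signal; stdin is taken from
    # /dev/null and all output goes to the log, so nothing depends on the terminal.
    nohup "$@" > "$log" 2>&1 < /dev/null &
    echo "started PID $! (output in $log)"
}
```

A call such as `run_detached star_index.log STAR --runThreadN 8 --runMode genomeGenerate ...` (with the remaining options as in the indexing command above) would then survive a closed terminal. On a cluster this should still be started on a compute node, not the head node.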


This was run on a cluster using ssh. I am still waiting for the rerun:

Aug 01 10:03:54 ..... started STAR run
Aug 01 10:03:54 ... starting to generate Genome files
Aug 01 10:05:27 ..... processing annotations GTF
Aug 01 10:06:57 ... starting to sort Suffix Array. This may take a long time...
Aug 01 10:08:03 ... sorting Suffix Array chunks and saving them to disk...

Thank you so much!


Either submit it to the cluster scheduler, if one exists, or at least run it inside something like screen. These logs look OK; just wait until it finishes. Make sure, though, that you are running this on a dedicated cluster node and not on the head node.
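
As a rough illustration of the scheduler route, the snippet below writes a batch script for a Grid Engine style scheduler. The job name, log file name, and especially the parallel environment name (smp) are assumptions that vary from site to site, and memory requests are left out because they are site-specific too. The STAR options are copied from the indexing command earlier in this thread.

```shell
# Write a hypothetical Grid Engine job script; adjust the "#$" lines for your site.
cat > star_index_job.sh <<'EOF'
#!/bin/bash
# Job name, run from the submission directory, merge stderr into stdout,
# write output to a log file, and request 8 slots.
# The parallel environment name "smp" is an assumption; ask your admins.
#$ -N star_index
#$ -cwd
#$ -j y
#$ -o star_index.log
#$ -pe smp 8

STAR --runThreadN 8 \
     --runMode genomeGenerate \
     --genomeDir /home/doanc2/hg38/hg38_index \
     --genomeFastaFiles /home/doanc2/hg38/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa \
     --sjdbGTFfile /home/doanc2/hg38/Homo_sapiens.GRCh38.107.gtf \
     --sjdbOverhang 99
EOF

# Submit from a login node with: qsub star_index_job.sh
```

A job submitted this way runs without an attached terminal, so a dropped SSH connection should not affect it.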


Aug 01 13:02:12 ..... started STAR run
Aug 01 13:02:12 ... starting to generate Genome files
Aug 01 13:03:46 ..... processing annotations GTF
Aug 01 13:05:17 ... starting to sort Suffix Array. This may take a long time...
Aug 01 13:06:22 ... sorting Suffix Array chunks and saving them to disk...
client_loop: send disconnect: Broken pipe

I made sure the script doesn't run on a head node by submitting it to a grid engine. It ran for about 120 minutes, though I think it should take about 30, and then stopped with the error above.


Hi ATpoint,

STAR has run for almost a day. This seems incorrect.

job-ID  prior    name  user    state  submit/start at      queue         slots  ja-task-ID
101045  0.60500  star  doanc2  r      08/01/2022 17:32:17  all.q@fenn07  8