Hello,
My institute installed a new HPC cluster for bioinformatics and I am installing everything from scratch. Some people are still using STAR version 10a, but I installed the latest (2.7.11b). I then created the index with the same STAR I had just downloaded, so everything is compatible within the new setup. This is what my reference folder looks like:
ls
chrLength.txt SA_105 SA_120 SA_135 SA_150 SA_165 SA_180 SA_195 SA_210 SA_225 SA_240 SA_255 SA_270 SA_285 SA_300 SA_315 SA_75 SA_90
chrNameLength.txt SA_106 SA_121 SA_136 SA_151 SA_166 SA_181 SA_196 SA_211 SA_226 SA_241 SA_256 SA_271 SA_286 SA_301 SA_316 SA_76 SA_91
chrName.txt SA_107 SA_122 SA_137 SA_152 SA_167 SA_182 SA_197 SA_212 SA_227 SA_242 SA_257 SA_272 SA_287 SA_302 SA_317 SA_77 SA_92
chrStart.txt SA_108 SA_123 SA_138 SA_153 SA_168 SA_183 SA_198 SA_213 SA_228 SA_243 SA_258 SA_273 SA_288 SA_303 SA_318 SA_78 SA_93
exonGeTrInfo.tab SA_109 SA_124 SA_139 SA_154 SA_169 SA_184 SA_199 SA_214 SA_229 SA_244 SA_259 SA_274 SA_289 SA_304 SA_319 SA_79 SA_94
exonInfo.tab SA_110 SA_125 SA_140 SA_155 SA_170 SA_185 SA_200 SA_215 SA_230 SA_245 SA_260 SA_275 SA_290 SA_305 SA_65 SA_80 SA_95
gencode.v41.primary_assembly.annotation.gtf SA_111 SA_126 SA_141 SA_156 SA_171 SA_186 SA_201 SA_216 SA_231 SA_246 SA_261 SA_276 SA_291 SA_306 SA_66 SA_81 SA_96
geneInfo.tab SA_112 SA_127 SA_142 SA_157 SA_172 SA_187 SA_202 SA_217 SA_232 SA_247 SA_262 SA_277 SA_292 SA_307 SA_67 SA_82 SA_97
GRCh38.primary_assembly.genome.fa SA_113 SA_128 SA_143 SA_158 SA_173 SA_188 SA_203 SA_218 SA_233 SA_248 SA_263 SA_278 SA_293 SA_308 SA_68 SA_83 SA_98
Log.out SA_114 SA_129 SA_144 SA_159 SA_174 SA_189 SA_204 SA_219 SA_234 SA_249 SA_264 SA_279 SA_294 SA_309 SA_69 SA_84 SA_99
SA_100 SA_115 SA_130 SA_145 SA_160 SA_175 SA_190 SA_205 SA_220 SA_235 SA_250 SA_265 SA_280 SA_295 SA_310 SA_70 SA_85 sjdbList.fromGTF.out.tab
SA_101 SA_116 SA_131 SA_146 SA_161 SA_176 SA_191 SA_206 SA_221 SA_236 SA_251 SA_266 SA_281 SA_296 SA_311 SA_71 SA_86 _STARtmp
SA_102 SA_117 SA_132 SA_147 SA_162 SA_177 SA_192 SA_207 SA_222 SA_237 SA_252 SA_267 SA_282 SA_297 SA_312 SA_72 SA_87 transcriptInfo.tab
SA_103 SA_118 SA_133 SA_148 SA_163 SA_178 SA_193 SA_208 SA_223 SA_238 SA_253 SA_268 SA_283 SA_298 SA_313 SA_73 SA_88
SA_104 SA_119 SA_134 SA_149 SA_164 SA_179 SA_194 SA_209 SA_224 SA_239 SA_254 SA_269 SA_284 SA_299 SA_314 SA_74 SA_89
When I tried to run the alignment, I got the error message:
"EXITING because of FATAL ERROR: could not open genome file /lhome/scu003/_TEST/Mel_test/Ref//genomeParameters.txt SOLUTION: check that the path to genome files, specified in --genomeDir is correct and the files are present, and have user read permissions"
But genomeParameters.txt is from older versions of STAR, right? It shouldn't be requested by the latest one. Or am I missing something? Also, the path to the genomeDir is set correctly and the permissions are OK, so it's not that (but, again, that file is not in the Ref folder, as you can see). I mention that this is a new HPC and that I am setting everything up from scratch in case there is something else I have to modify that I don't know about.
Thank you!
P.S.: the files I used to generate the index were these:
# wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_41/gencode.v41.primary_assembly.annotation.gtf.gz
# wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_41/GRCh38.primary_assembly.genome.fa.gz
And this is the line for generating the index:
STAR --runThreadN 760 --runMode genomeGenerate --genomeDir /home/sc003/Ref --genomeFastaFiles ${FASTA} --sjdbGTFfile ${GTF} --sjdbOverhang 100
and this is the line for the alignment:
STAR --runThreadN 760 --genomeDir /home/scu003/Ref --readFilesIn /home/scu003/XX_R1.fastq //home/scu003/XX_R2.fastq --bamRemoveDuplicatesType UniqueIdentical --outFileNamePrefix /home/scu003/Mapping/XX --outTmpDir /home/scu003/Mapping/startmp/XX --outSAMtype BAM SortedByCoordinate --quantMode GeneCounts
I'm not sure your index generation process ran to completion - the SA needs to be a single file, not a bunch of what look like temporary files.
Oh, then definitely there was some issue there... I will see if there is some hint on the output logs of the index generation, then! Thanks for the observation!
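For anyone hitting the same error, here is a rough sketch of a quick completeness check for a genome directory. The function name check_star_index is made up for illustration, and the file list is an assumption based on what a finished STAR index normally contains (genomeParameters.txt, Genome, SA, SAindex and the chr*.txt files). Leftover SA_<number> chunks or a _STARtmp folder, as in the listing above, are treated as a sign that genomeGenerate was interrupted.

```shell
# Hypothetical helper: returns 0 if the STAR genome directory looks complete,
# 1 otherwise. The list of expected files is an assumption, not an official check.
check_star_index() {
    dir="$1"
    incomplete=0
    for f in genomeParameters.txt Genome SA SAindex chrName.txt chrNameLength.txt; do
        # -s is true only if the file exists and is not empty
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $f"
            incomplete=1
        fi
    done
    # SA_<number> chunks and _STARtmp suggest an interrupted index build
    if ls "$dir"/SA_* >/dev/null 2>&1 || [ -d "$dir/_STARtmp" ]; then
        echo "leftover temporary files found"
        incomplete=1
    fi
    return "$incomplete"
}
```

Usage would be something like `check_star_index /home/scu003/Ref || echo "rebuild the index"` before submitting any alignment job.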
I checked further and there was an issue with SLURM: it was configured by another admin to kill processes after 10 minutes (which of course is too little for processes such as indexing and aligning), so the jobs were timing out. I am sure this is behind the issue. Since the error was a bit cryptic, it took me some time to get there. I also googled and read some misleading information that "the new version of STAR produces those SA files", so I did not realize that was a hint, but it was just badly worded. Thank you!
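In case it helps someone else, a minimal sketch of a job script that requests an explicit time limit instead of relying on the cluster default. The job name, the 4 hours, 16 CPUs and 48G of memory are placeholder values rather than recommendations, and /path/to/Ref is a stand-in path. FASTA and GTF are assumed to be set as in the indexing command above. A request longer than the partition maximum will still be refused or left pending, which is a matter for the cluster admin.

```shell
#!/bin/bash
#SBATCH --job-name=star_index     # placeholder name
#SBATCH --time=04:00:00           # requested wall time (placeholder value)
#SBATCH --cpus-per-task=16        # placeholder value
#SBATCH --mem=48G                 # placeholder value

# SLURM sets SLURM_CPUS_PER_TASK when --cpus-per-task is given;
# fall back to 16 if the script is run outside SLURM.
THREADS="${SLURM_CPUS_PER_TASK:-16}"

# /path/to/Ref is a stand-in; FASTA and GTF must point to the uncompressed files.
STAR --runThreadN "$THREADS" --runMode genomeGenerate \
     --genomeDir /path/to/Ref \
     --genomeFastaFiles "${FASTA}" \
     --sjdbGTFfile "${GTF}" \
     --sjdbOverhang 100
```

To see the limits in play, `sinfo -o "%P %l"` lists the time limit per partition, and `sacct -j <jobid> --format=JobID,State,Elapsed,Timelimit` shows State TIMEOUT for a job that was killed at its limit.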
Did it solve your problem?