I am currently trying to build a HISAT2 index for a very large genome (8.3 Gbp) and have provided the exons and splice sites, as recommended in the HISAT2 manual. As you can imagine, this takes up a lot of memory, and after a long time the build is still running; it says it is at its 7th generation. My question is: how many generations does the index builder normally go through? Are we almost there, or is it time to abort the build?
Would it be faster or more convenient to build the index without providing the exon and splice site data, and how useful would that index still be for downstream transcriptomics analysis?
Thanks for any clarification, Nienke