When (not to) pre-build a STAR index
4.7 years ago
jhanks1981 ▴ 10

I am about to lose my institutional HPC access and want to port some of my projects to a commercial cloud. To keep costs down, I want to pay only while I am using it, and not for more resources than I need. Ideally, I would rent a high-memory machine for a short time to build genome indexes for STAR (which requires 64 GB of memory, far more than any of my subsequent jobs should need). I might also need to give other people pre-built virtual machines that already contain the genome indexes.

However, people seem to recommend against using pre-built indexes; at least, that was the overwhelming consensus in this thread: Pre made STAR Index?

So my question is: how do I know when I need to build a new index? Every time I create a new installation? Whenever I change versions of STAR? What kinds of problems can arise from pre-building a genome index?

STAR genome-index • 1.2k views
4.7 years ago
Thibault D. ▴ 700

My rule, unless the developers clearly state otherwise: I build one index per genome version/patch and per version of the software (major or minor release).

There is no software I know well enough to be 100% sure that the latest commit did not change the index, even by a single, possibly very important, byte.

For STAR 2.7.x, if you are upgrading from 2.6.x, a new index build is required, and the release notes are clear about that:

Important: the genome index has to be re-generated with the latest 2.7.0x release.

The authors did not mention any index changes in the latest version. Rebuilding is probably unnecessary, but I would rebuild the index anyway.
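If you do ship pre-built indexes, one way to apply the "one index per genome version and per software version" rule is to store a small record next to each index and check it before aligning. The following is only a rough sketch of that idea: the `index_manifest.json` file and the helper names are my own invention, not something STAR creates or reads, and the exact-match policy is just the conservative rule described above.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical sidecar file kept next to the index; STAR itself does not use it.
MANIFEST_NAME = "index_manifest.json"


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(index_dir, star_version, fasta_path):
    """Record which STAR version and which genome FASTA produced the index."""
    manifest = {
        "star_version": star_version,
        "fasta_sha256": sha256_of(fasta_path),
    }
    Path(index_dir, MANIFEST_NAME).write_text(json.dumps(manifest, indent=2))
    return manifest


def needs_rebuild(index_dir, star_version, fasta_path):
    """Conservative check: rebuild unless both version and genome match exactly."""
    manifest_file = Path(index_dir, MANIFEST_NAME)
    if not manifest_file.exists():
        return True  # unknown provenance, so do not trust the index
    manifest = json.loads(manifest_file.read_text())
    return (
        manifest.get("star_version") != star_version
        or manifest.get("fasta_sha256") != sha256_of(fasta_path)
    )
```

You would call `write_manifest` once, right after generating the index on the high-memory machine, and `needs_rebuild` on each new installation. The version string could be taken from the output of `STAR --version`. Requiring an exact version match is stricter than the release notes demand, but it avoids having to know which releases changed the index format.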
