Hi everyone, I’m trying to better understand the practical differences between generating a STAR genome index manually versus generating it through the nf-core/rnaseq pipeline.
Most STAR tutorials and forum posts emphasize that large genomes (e.g., human) don’t necessarily require extremely high RAM if parameters such as --limitGenomeGenerateRAM, --genomeChrBinNbits, or --genomeSAindexNbases are adjusted correctly.
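For reference, this is roughly the kind of manual index build I have in mind. Paths and numeric values below are placeholders, not recommendations:

```shell
# Placeholder paths/values; the STAR flags are real, the numbers are just examples.
# --limitGenomeGenerateRAM is specified in bytes (here ~32 GB).
# --genomeSAindexNbases 14 is the default; it is usually lowered only for small genomes.
# --genomeChrBinNbits matters mainly for assemblies with many scaffolds.
STAR --runMode genomeGenerate \
     --runThreadN 8 \
     --genomeDir star_index/ \
     --genomeFastaFiles genome.fa \
     --sjdbGTFfile annotation.gtf \
     --sjdbOverhang 99 \
     --limitGenomeGenerateRAM 32000000000 \
     --genomeSAindexNbases 14
```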
However, nf-core abstracts these settings away, and I’m not fully clear on how much control the user has over STAR parameters during index generation inside the workflow.
So I’d like to ask the community:
Does nf-core override or restrict STAR’s index-generation parameters? For example, can you pass optimized values for --limitGenomeGenerateRAM, --genomeSAindexNbases, etc., or does it rely mostly on defaults?

In your experience, is there any performance or efficiency advantage to building the index within nf-core compared to building it manually with STAR? Or is it generally better to generate the index directly with STAR, especially when you want full control?
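For context, my current (possibly wrong) understanding is that nf-core DSL2 pipelines let you override module arguments through ext.args in a custom config passed with -c. Something like the sketch below, where the process selector is my guess and would need to be checked against the pipeline’s conf/modules.config:

```groovy
// custom.config — pass with: nextflow run nf-core/rnaseq -c custom.config ...
process {
    // Selector name is an assumption; verify the exact process path
    // in the pipeline's conf/modules.config.
    withName: '.*:STAR_GENOMEGENERATE' {
        ext.args = '--limitGenomeGenerateRAM 32000000000 --genomeSAindexNbases 14'
        memory   = '32 GB'
    }
}
```

Is this the intended mechanism for tuning index generation, or are some of these parameters fixed by the pipeline?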
Are there cases where nf-core’s index building is actually slower or more resource-intensive due to conservative defaults? I’ve seen some reports of unusually long runtimes.
Finally, is there any practical downside to simply providing nf-core with a manually generated STAR index instead of letting the workflow generate its own?
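Concretely, I mean something like the following (paths are placeholders; I believe nf-core/rnaseq accepts a --star_index parameter, and that a manually built index has to match the STAR version inside the pipeline’s container, but please correct me if either assumption is off):

```shell
# Placeholder paths; profile and other options depend on your setup.
nextflow run nf-core/rnaseq \
    --input samplesheet.csv \
    --outdir results/ \
    --fasta genome.fa \
    --gtf annotation.gtf \
    --star_index /path/to/star_index \
    -profile docker
```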
I’m not debating workflows, just trying to understand how much flexibility exists and whether manual index generation is still preferred when dealing with large genomes or limited compute environments.
Thanks!