Question

Nextflow STAR alignment process only aligning one fastq file when 6 are being passed in

6 weeks ago

smarcotte11 ▴ 10

Hello!

I am currently writing a Nextflow RNA-Seq alignment pipeline. I am new to Nextflow and am trying to learn how data is passed between channels in a workflow. My TRIMMOMATIC process runs perfectly, and TRIMMOMATIC.out.fastq.view() shows the correct output (the paths to six trimmed fastq files). However, once my STAR process starts, it processes only one of the trimmed fastq files, not all six. Can someone explain how to solve this, and the concept behind it?

process STAR {

    publishDir params.aligned_bams, mode: 'copy'

    input:
    path trimmed_fqs
    path star_idx

    output:
    path "${trimmed_fqs.simpleName}.Aligned.sortedByCoord.out.bam", emit: bam

    script:
    """
    STAR --runThreadN 4 --genomeDir $star_idx --readFilesIn $trimmed_fqs --outSAMtype BAM SortedByCoordinate --outFileNamePrefix "${trimmed_fqs.simpleName}."
    """
}

workflow {
    // run fastqc
    fastqs = channel.fromPath(params.fastq_dir)
    FASTQC(fastqs)

    // run index_fa
    genome_fa = channel.fromPath(params.genome_fa)
    genome_gtf = channel.fromPath(params.genome_gtf)
    INDEX_FA(genome_fa, genome_gtf)

    // run trimmomatic
    TRIMMOMATIC(fastqs)

    // run alignment
    STAR(TRIMMOMATIC.out.fastq, INDEX_FA.out.star_index)

    // index bams
    INDEX_BAM(STAR.out.bam)

    // generate the counts matrix
    FEATURECOUNTS(STAR.out.bam.collect())
}

Nextflow RNA-seq • 659 views

Can you please show us the output of

TRIMMOMATIC.out.view()

and the hidden file '.command.sh' generated by the STAR process in its working directory?

ADD REPLY • link 6 weeks ago by Pierre Lindenbaum 166k

score 3 · Answer 1 · 2025-08-06

6 weeks ago

Corentin ▴ 660

Hi,

That's probably because INDEX_FA.out.star_index is a queue channel holding a single item. A process runs once for each complete set of inputs it can take from its input channels, so the single index is "consumed" together with the first FASTQ file, and STAR is never started for the other five. I know it can be quite confusing; details can be found in the Nextflow documentation on channels: https://www.nextflow.io/docs/latest/channel.html

One solution would be to "force" the channel to be a value channel instead, since a value channel can be read any number of times without being consumed. For example, you can use the collect operator:

index_ch = INDEX_FA.out.star_index.collect()
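
To see the behaviour in isolation, here is a minimal sketch with a toy process in place of STAR. The process name ALIGN, the sample names, and the 'star_index' string are made up for illustration, and the inputs are plain values rather than files so that it runs without any data:

```nextflow
// Toy stand-in for STAR (hypothetical): pairs each sample with the index
process ALIGN {
    input:
    val sample
    val index

    output:
    stdout

    script:
    """
    echo "aligning $sample against $index"
    """
}

workflow {
    samples = channel.of('s1', 's2', 's3')     // queue channel, three items
    index_queue = channel.of('star_index')     // queue channel, one item

    // Passing index_queue directly would start ALIGN only once, because
    // the single index item is used up by the first sample.
    // collect() gathers the items into a list and returns a value channel,
    // which can be read again for every sample.
    index_value = index_queue.collect()

    ALIGN(samples, index_value).view()
}
```

With the value channel, ALIGN should start once per sample. In your pipeline the equivalent call would be STAR(TRIMMOMATIC.out.fastq, index_ch).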

This solved my issue! Thank you!!

ADD REPLY • link 5 weeks ago by smarcotte11 ▴ 10

score 2 · Answer 2 · 2025-08-07

5 weeks ago

colindaven 7.9k

It might also be your input channel code.

I use a glob such as *.fastq.gz to select the input files, like this:

fastq_path = params.fastq + "/*.fastq.gz"        
input_fastq_ch = Channel.fromPath(fastq_path, checkIfExists: true)
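
A quick way to check what the input channel actually emits is to view it before wiring it into any process. This is only a sketch: the default value 'data' for params.fastq is a placeholder, so point it at your own directory:

```nextflow
// Placeholder default (hypothetical); override with --fastq <dir>
params.fastq = 'data'

workflow {
    input_fastq_ch = channel.fromPath("${params.fastq}/*.fastq.gz", checkIfExists: true)

    // One line per file the glob matched
    input_fastq_ch.view { fq -> "found: ${fq.name}" }

    // Total number of items the channel emits
    input_fastq_ch.count().view { n -> "total files: $n" }
}
```

If the total is not the number of fastq files you expect, the problem is in the input channel rather than in STAR.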
