Nextflow STAR alignment process only aligning one fastq file when 6 are being passed in
6 weeks ago
smarcotte11 ▴ 10

Hello!

I am currently writing a Nextflow RNA-Seq alignment pipeline. I am new to Nextflow and am trying to learn how data is passed between channels in a workflow. My TRIMMOMATIC process runs perfectly, and TRIMMOMATIC.out.fastq.view() shows the correct output (the paths to six trimmed FASTQ files). However, once my STAR process starts running, it processes only one of the trimmed FASTQ files, not all six. Can someone explain how to solve this, and the concept behind it?

process STAR {

    publishDir params.aligned_bams, mode: 'copy'

    input:
    path trimmed_fqs
    path star_idx

    output:
    path "${trimmed_fqs.simpleName}.Aligned.sortedByCoord.out.bam", emit: bam

    script:
    """
    STAR --runThreadN 4 --genomeDir $star_idx --readFilesIn $trimmed_fqs --outSAMtype BAM SortedByCoordinate --outFileNamePrefix "${trimmed_fqs.simpleName}."
    """
}

workflow {
    // run fastqc
    fastqs = channel.fromPath(params.fastq_dir)
    FASTQC(fastqs)

    // run index_fa
    genome_fa = channel.fromPath(params.genome_fa)
    genome_gtf = channel.fromPath(params.genome_gtf)
    INDEX_FA(genome_fa, genome_gtf)

    // run trimmomatic
    TRIMMOMATIC(fastqs)

    // run alignment
    STAR(TRIMMOMATIC.out.fastq, INDEX_FA.out.star_index)

    // index bams
    INDEX_BAM(STAR.out.bam)

    // generate the counts matrix
    FEATURECOUNTS(STAR.out.bam.collect())
}
Nextflow RNA-seq • 659 views

Can you please show us the output of

TRIMMOMATIC.out.view()

and the hidden file '.command.sh' generated by the STAR process in its working directory?

6 weeks ago
Corentin ▴ 660

Hi,

That's probably because your INDEX_FA.out.star_index is a queue channel with a single value, so it is "consumed" when it is paired with the first FASTQ file in your STAR process. A process runs one task per set of inputs and stops as soon as any of its queue channels runs out of items, so the remaining five FASTQ files are never aligned. I know it can be quite confusing; details can be found in the Nextflow documentation on channels: https://www.nextflow.io/docs/latest/channel.html
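To make the channel behaviour concrete, here is a minimal toy sketch. The process name PAIR and the values 's1', 's2', 's3' and 'idx' are made up for illustration and are not part of the pipeline above.

```nextflow
// Toy process: pairs each sample name with an index name.
process PAIR {
    input:
    val sample
    val index

    output:
    stdout

    script:
    """
    echo "${sample} paired with ${index}"
    """
}

workflow {
    samples = channel.of('s1', 's2', 's3')

    // Queue channel holding one item: with this line instead of the
    // value channel below, PAIR would run a single task, because the
    // index channel is empty after the first sample has been paired.
    // index_ch = channel.of('idx')

    // Value channel: it can be read any number of times, so PAIR runs
    // one task per sample (three tasks in total).
    index_ch = channel.value('idx')

    PAIR(samples, index_ch)
    PAIR.out.view()
}
```

Swapping the commented-out line in for the value channel should reproduce the "only one file is processed" symptom on a small scale.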

One solution would be to "force" your channel to be a value channel instead, since a value channel can be read any number of times without being consumed. For example, you can use the collect operator:

index_ch = INDEX_FA.out.star_index.collect()
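Wired into the workflow from the question, this could look roughly like the fragment below. It is only a sketch: it assumes the FASTQC, INDEX_FA, TRIMMOMATIC and STAR processes are defined as in the original script, and it leaves out the downstream steps.

```nextflow
workflow {
    fastqs = channel.fromPath(params.fastq_dir)

    genome_fa  = channel.fromPath(params.genome_fa)
    genome_gtf = channel.fromPath(params.genome_gtf)
    INDEX_FA(genome_fa, genome_gtf)

    TRIMMOMATIC(fastqs)

    // collect() gathers everything the channel emits into a list and
    // returns a value channel, so the same index is reused for every
    // trimmed FASTQ file instead of being consumed by the first one.
    index_ch = INDEX_FA.out.star_index.collect()

    // One STAR task per trimmed FASTQ file, each with the same index.
    STAR(TRIMMOMATIC.out.fastq, index_ch)
}
```

The first operator would be another option here, as it also returns a value channel.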


This solved my issue! Thank you!!

5 weeks ago

It might also be your input channel code.

I use a *.fastq.gz glob pattern, as shown here, to select the input files:

fastq_path = params.fastq + "/*.fastq.gz"        
input_fastq_ch = Channel.fromPath(fastq_path, checkIfExists: true)
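A quick way to check what a channel built this way actually emits is to print its items and count them. The sketch below reuses params.fastq from the snippet above, which is this answer's own parameter name rather than the one used in the question.

```nextflow
workflow {
    fastq_path = params.fastq + "/*.fastq.gz"
    input_fastq_ch = channel.fromPath(fastq_path, checkIfExists: true)

    // Print one line per emitted file; six FASTQ files should give six lines.
    input_fastq_ch.view { fq -> "FASTQ: ${fq.name}" }

    // count() emits a single value: the number of items in the channel.
    input_fastq_ch.count().view { n -> "Total files: ${n}" }
}
```

If only one line is printed, the channel holds a single item (for example a directory path rather than the files inside it), and any downstream process will run only once.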