Hi,
I was hoping someone with Nextflow experience could help me with this issue.
My script is a run-of-the-mill HISAT2/StringTie Nextflow pipeline. The problem is that the final process takes only one file, seemingly at random, from the previous channel, and the script then finishes without any errors. Here is the portion of the code showing where the inputs to the final process originate:
#!/usr/bin/env nextflow

params.genome = "Reference/chr22.fa"
genome_fasta = files( params.genome )

params.annot = "Annotation/chr22.gtf"
Channel
    .fromPath( params.annot )
    .into { gtf1; gtf2; gtf3 }

params.reads = "trimmed_reads/*_r{1,2}.trimmed.fastq.gz"
Channel
    .fromFilePairs( params.reads )
    .set { read_ch }
......
extract exons, splice sites, align with hisat2, pipe to bam
......
process Sort_Index_Bams {
    publishDir "BAMS/", mode:'copy'

    input:
    set val(key), file(bam) from hisat_bams

    output:
    set val(key), file("${key}.bam") into hisat_bams1
    file "${key}.bam.bai" into indexed

    script:
    def avail_mem = task.memory == null ? '' : "-m ${task.memory.toBytes() / task.cpus}"
    """
    samtools sort \\
        $bam \\
        -@ ${task.cpus} $avail_mem \\
        -o ${key}.bam
    samtools index ${key}.bam
    """
}

hisat_bams1.into { hisat_bams2; hisat_bams3 }
process Assemble_Transcripts {
    publishDir "Assembly/", mode:'copy'

    input:
    set val(key), file(bam) from hisat_bams2
    file(gtf) from gtf2

    output:
    file("${key}.gtf") into hisat_transcripts

    script:
    """
    stringtie \\
        ${bam} \\
        -G ${gtf} \\
        -l ${key} \\
        -o ${key}.gtf \\
        -p ${task.cpus}
    """
}
I have tried the following:
- Altered the input of the final process as such: set val(key), file(bam) from hisat_bams2.collect(). This returned the error "Input tuple does not match input set cardinality declared by process Assemble_Transcripts".
- Set the input GTF file to a value channel as described in this Stack Overflow post. The user there reported the same issue; however, it did not solve my problem.
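For context on the value-channel approach: a queue channel that holds a single item (such as gtf2 above) is used up by the first task, so a process pairing it with a multi-item channel runs only once. A minimal sketch of the value-channel variant, in the same DSL1 syntax, might look like the following. The name gtf_value is hypothetical and not part of the original script, and this post does not confirm whether this resolves the behaviour described here.

```groovy
// Hypothetical name: gtf_value does not appear in the original script.
// file() returns a plain file object instead of a queue channel, so it is
// not consumed and can be reused by every task of the process.
gtf_value = file( params.annot )

process Assemble_Transcripts {
    publishDir "Assembly/", mode:'copy'

    input:
    set val(key), file(bam) from hisat_bams2
    file(gtf) from gtf_value    // same GTF supplied to each BAM

    output:
    file("${key}.gtf") into hisat_transcripts

    script:
    """
    stringtie ${bam} -G ${gtf} -l ${key} -o ${key}.gtf -p ${task.cpus}
    """
}
```

An alternative with the same intent would be Channel.fromPath( params.annot ).first(), which also yields a value channel.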
Any suggestions would be greatly, greatly appreciated.
Regards,
Barry
It often helps when you add a tag:
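The snippet that accompanied this remark is not shown here. As a rough sketch of what a tag directive looks like in DSL1, assuming the sample key from the question is used as the label:

```groovy
process Assemble_Transcripts {
    // tag labels each task in the execution log with its sample key,
    // which makes it easy to see how many samples were actually processed
    tag "$key"

    input:
    set val(key), file(bam) from hisat_bams2
    file(gtf) from gtf2

    output:
    file("${key}.gtf") into hisat_transcripts

    script:
    """
    stringtie ${bam} -G ${gtf} -l ${key} -o ${key}.gtf -p ${task.cpus}
    """
}
```

With a tag in place, the log shows one entry per sample, so a process that runs for only one BAM file is immediately visible.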
I usually just write:
when I need things like the bam index:
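The accompanying code is also missing here. One common way to keep a BAM file and its index together in DSL1 is to emit both in the same tuple. The sketch below assumes that is what was meant; the channel name sorted_indexed_bams is hypothetical.

```groovy
process Sort_Index_Bams {
    publishDir "BAMS/", mode:'copy'

    input:
    set val(key), file(bam) from hisat_bams

    output:
    // BAM and index travel together under the same key
    // (sorted_indexed_bams is a hypothetical channel name)
    set val(key), file("${key}.bam"), file("${key}.bam.bai") into sorted_indexed_bams

    script:
    """
    samtools sort $bam -@ ${task.cpus} -o ${key}.bam
    samtools index ${key}.bam
    """
}
```

A downstream process would then declare its input as set val(key), file(bam), file(bai) from sorted_indexed_bams, so the index is staged next to the BAM it belongs to.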