Question

Nextflow: transform multiple outputs of one process into paired outputs and use them as input to the next process


10 months ago

MolGeek ▴ 50

Hi everyone,

I am learning to build workflows with Nextflow and want to make an ATAC-seq workflow. I have 2 sets of paired-end ATAC-seq data.

First I perform adapter trimming with Trim Galore. Its output consists of the two trimmed FASTQ files for each set.

I then want to use those trimmed FASTQ files as input to bowtie2, but I haven't found a way to transform pairedTrimmedCh into a paired-file channel so that I can run the alignment process.

Any help?

Thanks in advance!

#!/usr/bin/env nextflow

params.reads = "./*{1,2}.fastq"
params.outdir = "bams"
params.INDEX = "path_to_index"
params.cpus = 10

log.info """\

    A T A C S E Q - N F   P I P E L I N E
    ===================================
    Genome: ${params.INDEX}
    reads        : ${params.reads}
    outdir       : ${params.outdir}
    """
    .stripIndent(true)



process trimReads {

    publishDir "$params.outdir/", mode: 'copy'
    input:
    tuple val(sampleid), path(reads)

    output:
    path "./trimmed/" 

    script:
    """
TrimGalore-0.6.7/trim_galore --cores 7 --paired --no_report_file ${reads[0]} ${reads[1]}  -o ./trimmed/

    """
}


process alignment {
    publishDir "$params.outdir/", mode: 'copy'

    input:
    tuple val(sampleid), path(reads)

    output:
    path "${sampleid}.mm10.sorted.bam", emit: bams
    path "${sampleid}.mm10.sorted.bam.bai"

    script:
    """
    bowtie2 --local -X 2000 -p ${params.cpus} -x ${params.INDEX} -1 ${trimmed_reads[0]} -2 ${trimmed_reads[1]} | samtools view -b -h -S -q 10 -f 0x2 | samtools sort -@ ${params.cpus}  > ${sampleid}.mm10.sorted.bam
    samtools index ${sampleid}.mm10.sorted.bam
    """
}

workflow {
    // Create a channel with fastqs. If paired-end, use .fromFilePairs
    Channel
        .fromFilePairs(params.reads, checkIfExists: true)
        .set { read_ch }

    pairedTrimmedCh = trimReads(read_ch).groupTuple()

    align_ch = alignment(pairedTrimmedCh)
}

Nextflow • 1.1k views

ADD COMMENT • link 3 months ago by MolGeek ▴ 50


The way you provide the index will cause a problem. For a solution, see "bowtie2 Mapping Using Pre-built Index in Nextflow".

ADD REPLY • link 10 months ago by ATpoint 82k


I am aware of it. I provide the index as a variable such as

params.INDEX = "path_to_Bowtie2Index/genome"

and then

 ${params.INDEX}

ADD REPLY • link 10 months ago by MolGeek ▴ 50


I assume that is not going to work. You cannot stage basenames. You can stage the folder containing the index files, but you then need some trick to find the files in it, as I described in that other thread.
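
One common way to do this is sketched below. It is not necessarily the approach from the linked thread. The parameter name `params.index_dir` and the `.sam` output are assumptions made for illustration, and the `find` pattern assumes a small bowtie2 index (`.bt2` files; large indexes use `.bt2l`).

```nextflow
// Hypothetical parameter: the folder that holds genome.*.bt2, not the basename.
params.index_dir = "path_to_Bowtie2Index"

process alignment {
    input:
    tuple val(sampleid), path(reads)
    path index_dir   // the whole folder is staged (symlinked) into the work dir

    output:
    path "${sampleid}.sam"

    script:
    """
    # Recover the index basename by locating one index file and stripping its suffix.
    # -L makes find follow the symlink that Nextflow creates for the staged folder.
    INDEX=\$(find -L ${index_dir} -name "*.rev.1.bt2" | sed 's/\\.rev\\.1\\.bt2\$//')
    bowtie2 -x \$INDEX -1 ${reads[0]} -2 ${reads[1]} -S ${sampleid}.sam
    """
}

workflow {
    read_ch = Channel.fromFilePairs(params.reads, checkIfExists: true)
    // file(...) turns the folder into a single value reused for every sample.
    alignment(read_ch, file(params.index_dir))
}
```

Passing the basename as a plain string, as in `-x ${params.INDEX}`, can still run when the path is absolute and visible from wherever the task executes. Staging the folder matters mainly for containers and remote executors.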

ADD REPLY • link 10 months ago by ATpoint 82k


It works! :)

ADD REPLY • link 10 months ago by MolGeek ▴ 50

score 1 · Answer 1 · 2023-07-04

Not tested. In trimReads I usually do something like the following (I'm not sure about the files generated by trim_galore, so please check this):

(...)
output:
   path("fastqs.tsv"),emit:output
script:
"""
(...)
find \${PWD}/trimmed -type f -name "*_val_[12].fq*" | sort | paste - - | awk 'BEGIN {printf("sample\tR1\tR2\\n");} {printf("${sampleid}\t%s\\n",\$0);}' > fastqs.tsv
"""

then read the output and use it in alignment

alignment( pairedTrimmedCh.output.splitCsv(sep:'\t',header:true) )
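
To see what `splitCsv` hands to the next process, here is a minimal sketch. It writes a throwaway TSV (the file name and the paths inside it are made up for the demonstration) and prints each parsed row. With `header: true`, every row should arrive as a map keyed by the column names.

```nextflow
workflow {
    // Hypothetical demo file with the same three columns as fastqs.tsv.
    def tsv = file('demo_fastqs.tsv')
    tsv.text = "sample\tR1\tR2\nS1\t/data/S1_val_1.fq\t/data/S1_val_2.fq\n"

    Channel
        .of(tsv)
        .splitCsv(sep: '\t', header: true)
        // Each item is one data row; fields are looked up by header name.
        .view { row -> "sample=${row.sample} R1=${row.R1} R2=${row.R2}" }
}
```

Because the rows carry the file locations as plain strings, the alignment process receives them as `val` and the files are not staged by Nextflow.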

in alignment:

    input:
    val(row)

    output:
    path "${row.sample}.mm10.sorted.bam", emit: bams
    path "${row.sample}.mm10.sorted.bam.bai"

    script:
    """
    bowtie2 --local -X 2000 -p ${params.cpus} -x ${params.INDEX} -1 ${row.R1} -2 ${row.R2} | samtools view -b -h -S -q 10 -f 0x2 | samtools sort -@ ${params.cpus}  > ${row.sample}.mm10.sorted.bam
    samtools index ${row.sample}.mm10.sorted.bam
    """
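
As an alternative to the TSV route, the sample ID can stay attached to its trimmed files by emitting a tuple from trimReads. This is an untested sketch under a few assumptions: `trim_galore` is on the PATH, paired-end mode names its outputs `*_val_1.fq` and `*_val_2.fq` (with `.gz` for gzipped input), and the bowtie2 and samtools commands are simplified relative to the question (no `-q 10 -f 0x2` filter, no indexing). The input names `r1` and `r2` are chosen for illustration.

```nextflow
params.reads = "./*{1,2}.fastq"
params.INDEX = "path_to_index"

process trimReads {
    input:
    tuple val(sampleid), path(reads)

    output:
    // One tuple per sample: the ID plus each mate as its own element,
    // so the R1/R2 order never depends on how a glob is sorted.
    tuple val(sampleid), path("trimmed/*_val_1.fq*"), path("trimmed/*_val_2.fq*")

    script:
    """
    trim_galore --paired --no_report_file -o trimmed ${reads[0]} ${reads[1]}
    """
}

process alignment {
    input:
    tuple val(sampleid), path(r1), path(r2)

    output:
    path "${sampleid}.sorted.bam", emit: bams

    script:
    """
    bowtie2 -p ${task.cpus} -x ${params.INDEX} -1 ${r1} -2 ${r2} \\
        | samtools sort -o ${sampleid}.sorted.bam -
    """
}

workflow {
    read_ch = Channel.fromFilePairs(params.reads, checkIfExists: true)
    trimReads(read_ch)
    // trimReads.out already has the (id, R1, R2) shape alignment expects,
    // so no groupTuple() or other reshaping is needed.
    alignment(trimReads.out)
}
```

With this layout each sample flows through as one item, and Nextflow stages the trimmed files into the alignment task itself instead of relying on absolute paths recorded in a TSV.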