How to copy and rename files in Nextflow
3 months ago

Hi everyone. I've been trying to copy and rename files created by a pipeline executed in Nextflow. I have implemented a qiime2 workflow and obtained the taxonomy (.qza) and biom files (see image below). When I run `qiime tools export`, qiime2 creates a directory containing the tsv and/or biom file. The entire pipeline is executed with container = quay.io/qiime2/core:2023.7 and runOptions = -u $(id -u):$(id -g).

enter image description here

The process which exports the taxonomy is the following:

process taxonomy{
  publishDir params.outdir, mode:'copy'

  input:
  path "table-denoised.qza"
  path "vsearch-taxonomyITS.qza"
  path "blast-taxonomyITS.qza"
  path "sklearn-taxonomyITS.qza"

  output:
  path "feature-table", emit: feature_table
  path "vsearch-taxonomy", emit: vsearch_taxonomy
  path "blast-taxonomy", emit: blast_taxonomy
  path "sklearn-taxonomy", emit: sklearn_taxonomy

  script:
  """
  qiime tools export \
  --input-path table-denoised.qza \
  --output-path feature-table

  qiime tools export \
  --input-path vsearch-taxonomyITS.qza \
  --output-path vsearch-taxonomy

  qiime tools export \
  --input-path blast-taxonomyITS.qza \
  --output-path blast-taxonomy

  qiime tools export \
  --input-path sklearn-taxonomyITS.qza \
  --output-path sklearn-taxonomy
  """
}

Now I need to copy and rename the tsv files. I've tried:

process replace_header{
  publishDir params.outdir, mode:'copy'

  input:
  path "vsearch-taxonomy/taxonomy.tsv"

  output:
  path "taxonomy.tsv"

  script:
  """
  cp vsearch-taxonomy/taxonomy.tsv taxonomy.tsv
  """
}

Any idea?

Please validate or comment on the answers to your previous questions: "How to install non conda software" and "Age nodes in R".

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.

3 months ago

Solution:

1) In the process taxonomy, declare each output as dir/filename after the path qualifier, so the process emits the exported files themselves rather than the directories:

process taxonomy{
  publishDir params.outdir, mode:'copy'

  input:
  path "table-denoised.qza"
  path "sklearn-taxonomyITS.qza"
  path "vsearch-taxonomyITS.qza"
  path "blast-taxonomyITS.qza"

  output:
  path ("feature-table/feature-table.biom"), emit: feature_table_biom
  path ("sklearn-taxonomy/taxonomy.tsv"), emit: sklearn_taxonomy_tsv
  path ("vsearch-taxonomy/taxonomy.tsv"), emit: vsearch_taxonomy_tsv
  path ("blast-taxonomy/taxonomy.tsv"), emit: blast_taxonomy_tsv

  script:
  """
  qiime tools export \
  --input-path table-denoised.qza \
  --output-path feature-table

  qiime tools export \
  --input-path sklearn-taxonomyITS.qza \
  --output-path sklearn-taxonomy

  qiime tools export \
  --input-path vsearch-taxonomyITS.qza \
  --output-path vsearch-taxonomy

  qiime tools export \
  --input-path blast-taxonomyITS.qza \
  --output-path blast-taxonomy
  """
}

2) In the process replace_header (named REPLACE_HEADER below), declare the new file names in the output block. In the script block, give the full path of each file to be edited and redirect the output to a new file.

process REPLACE_HEADER {
  publishDir params.outdir, mode:'copy'

  input:
  path "vsearch-taxonomy/taxonomy.tsv"
  path "sklearn-taxonomy/taxonomy.tsv"
  path "blast-taxonomy/taxonomy.tsv"

  output:
  path ("vsearch_taxonomy.tsv"), emit: file_vsearch_taxonomy_tsv
  path ("sklearn_taxonomy.tsv"), emit: file_sklearn_taxonomy_tsv
  path ("blast_taxonomy.tsv"), emit: file_blast_taxonomy_tsv

  script:
  """
  sed 's/Feature ID/#otu-id/g' vsearch-taxonomy/taxonomy.tsv > vsearch_taxonomy.tsv
  sed 's/Feature ID/#otu-id/g' sklearn-taxonomy/taxonomy.tsv > sklearn_taxonomy.tsv
  sed 's/Feature ID/#otu-id/g' blast-taxonomy/taxonomy.tsv > blast_taxonomy.tsv
  """
}
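
For completeness, here is a minimal sketch of how the two processes above might be connected in a DSL2 workflow block. The `params.*_qza` names and the use of `Channel.fromPath` are assumptions made for illustration; in the real pipeline the .qza artifacts would presumably come from upstream QIIME 2 processes. The three taxonomy.tsv files share the same file name, which is why REPLACE_HEADER stages each one under its own sub-directory.

```nextflow
workflow {
    // Hypothetical parameters pointing at the .qza artifacts; replace these
    // with the outputs of the upstream processes in the real pipeline.
    table_qza   = Channel.fromPath(params.table_qza)
    sklearn_qza = Channel.fromPath(params.sklearn_qza)
    vsearch_qza = Channel.fromPath(params.vsearch_qza)
    blast_qza   = Channel.fromPath(params.blast_qza)

    // Arguments are matched to the input declarations by position.
    taxonomy(table_qza, sklearn_qza, vsearch_qza, blast_qza)

    // Pass the named outputs on, in the order REPLACE_HEADER declares its inputs.
    REPLACE_HEADER(
        taxonomy.out.vsearch_taxonomy_tsv,
        taxonomy.out.sklearn_taxonomy_tsv,
        taxonomy.out.blast_taxonomy_tsv
    )
}
```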
3 months ago

Unless I'm wrong, your first process could declare the following output:

  path ("vsearch-taxonomy/taxonomy.tsv"), emit: vsearch_taxonomy_tsv

Then you want to rename or move this output. This is probably NOT what you really want to do. Just use the named output `vsearch_taxonomy_tsv` and reuse it wherever it is needed.

You can always rename it at the end when publishing (`publishDir`).

Thanks for replying, Pierre!

I've tried your suggestion, but when I try to copy the tsv file (or do anything else with it) I get `No such file or directory`.

Every `qiime tools export` call in the process taxonomy produces a folder (named by the `--output-path` parameter) with the tsv file inside it, as you can see below. There's no way to save the file directly in the current folder.

enter image description here
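
One way around the export folder, sketched here under assumptions rather than taken from the thread, is to copy the file out of the folder inside the same task that runs the export and declare the renamed file as the output. The process name `EXPORT_VSEARCH_TAXONOMY` and the output file name are hypothetical; only the vsearch artifact is shown.

```nextflow
// Hypothetical process: export one artifact and rename the result in the same task.
process EXPORT_VSEARCH_TAXONOMY {
  publishDir params.outdir, mode:'copy'

  input:
  path "vsearch-taxonomyITS.qza"

  output:
  path "vsearch_taxonomy.tsv", emit: vsearch_taxonomy_tsv

  script:
  """
  qiime tools export \
  --input-path vsearch-taxonomyITS.qza \
  --output-path vsearch-taxonomy

  # The export writes taxonomy.tsv inside the output folder;
  # copy it into the task directory under the new name.
  cp vsearch-taxonomy/taxonomy.tsv vsearch_taxonomy.tsv
  """
}
```

Because the copy happens in the task's working directory, the relative path vsearch-taxonomy/taxonomy.tsv exists when `cp` runs, and downstream processes receive a plain file rather than a folder.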

3 months ago
pbioinf ▴ 70

Use the `saveAs` option of the `publishDir` directive to save the output file under a specific name.

https://www.nextflow.io/docs/latest/process.html#publishdir

Untested:

publishDir params.outdir, mode:'copy', saveAs {it -> 'taxonomy.tsv'}

Thanks for replying. I got the error:

N E X T F L O W ~ version 23.10.0
Launching `../github/its_pipeline/main.nf` [soggy_wright] DSL2 - revision: 2b94ede702
ERROR ~ Script compilation error

  • file : /home/mapa/github/its_pipeline/main.nf
  • cause: The current parameter list already contains a parameter of the name it @ line 390, column 49.
    s.outdir, mode:'copy', saveAs {it -> 'ta
                                  ^

1 error

-- Check '.nextflow.log' file for details
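
The directive as posted differs from the form shown in the Nextflow documentation in two ways: `saveAs` is a named option and needs a colon, and the documented examples name the closure parameter (for example `filename`) instead of declaring `it` explicitly. Which of the two triggers this compile error is not confirmed here. The following is an untested sketch of the documented form, to be placed inside the process taxonomy. It assumes the closure receives each output path relative to the task directory (for example vsearch-taxonomy/taxonomy.tsv), and the renaming rule itself is hypothetical.

```nextflow
// Inside the process block. Returning one constant name for every file would
// make the three taxonomy.tsv outputs overwrite each other, so the name is
// derived from the relative path instead:
//   vsearch-taxonomy/taxonomy.tsv -> vsearch_taxonomy.tsv
publishDir params.outdir, mode:'copy', saveAs: { filename ->
    filename.replace('-taxonomy/taxonomy.tsv', '_taxonomy.tsv')
}
```

Paths that do not contain the replaced text, such as feature-table/feature-table.biom, come back from `replace` unchanged and are published under their original relative path.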
