Error while testing Nextflow with --genome GRCh38
4 months ago
Ankit ▴ 370

Hi, I get the following error while testing the Nextflow RNA-seq pipeline.

Error executing process > 'NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:STAR_ALIGN (WT_REP1)'
Caused by:
 java.lang.UnsupportedOperationException

My test command:

nextflow run nf-core/rnaseq -r 3.8.1 --genome GRCh38 -profile test,singularity --outdir /mnt/hom/

Without the --genome GRCh38 option it worked fine.

I would appreciate any help.

Thanks

nextflow error java • 883 views

Dedicated resources for nextflow queries:

https://nf-co.re/join/slack (the help and rnaseq channels in your case)

&

https://gitter.im/nextflow-io/nextflow


Nextflow's community is now on Slack as well, you can join here: https://www.nextflow.io/slack-invite.html


There should be a hidden file, .nextflow.log, in your working directory. Show us the lines around the error (the Java stack trace).
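One way to pull the relevant part of the log is a small grep wrapper. This is only a sketch: the function name nf_log_context and its defaults are made up for illustration, and it assumes the log sits in the directory the pipeline was launched from.

```shell
# Print every line of a log that matches a pattern, with surrounding context.
# $1 = log file, $2 = pattern (default: Exception), $3 = context lines (default: 20)
# nf_log_context is a hypothetical helper name, not part of Nextflow.
nf_log_context() {
    if [ ! -r "$1" ]; then
        echo "cannot read log file: $1" >&2
        return 1
    fi
    # -n prefixes line numbers; -B/-A print lines before/after each match.
    grep -n -B "${3:-20}" -A "${3:-20}" -e "${2:-Exception}" "$1"
}

# Example (run from the launch directory):
#   nf_log_context .nextflow.log UnsupportedOperationException 30
```

The "Caused by:" lines and the stack frames that follow them are usually the part worth pasting into a reply.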

4 months ago
Harshil ▴ 50

What version of Nextflow are you using?

Also, as Matthias mentioned, when you use -profile test you shouldn't use the --genome parameter, because the reference files come bundled with the test data for the pipeline, as indicated in the main README here.
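A pre-flight check along these lines could catch the combination before anything is launched. It is a sketch only: check_profile_genome is a hypothetical helper, not something shipped with Nextflow or nf-core, and it only looks for a profile named exactly "test".

```shell
# Return non-zero if the test profile is combined with --genome.
# Pass the same arguments you intend to give to nextflow.
# check_profile_genome is a hypothetical helper name.
check_profile_genome() {
    has_test=0
    has_genome=0
    prev=""
    for arg in "$@"; do
        case "$arg" in
            --genome|--genome=*) has_genome=1 ;;
        esac
        if [ "$prev" = "-profile" ]; then
            # Profiles are comma-separated, e.g. "test,singularity".
            case ",$arg," in
                *,test,*) has_test=1 ;;
            esac
        fi
        prev="$arg"
    done
    if [ "$has_test" -eq 1 ] && [ "$has_genome" -eq 1 ]; then
        echo "warning: -profile test bundles its own reference; drop --genome" >&2
        return 1
    fi
    return 0
}

# Example:
#   check_profile_genome run nf-core/rnaseq -r 3.8.1 --genome GRCh38 -profile test,singularity
```

It prints the warning and returns 1 for the command in the question, and returns 0 once either the test profile or --genome is removed.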

Once you have tested that the pipeline works, I would use the command and reference files he recommends here for the GRCh38 genome on your own data.


Now it is clearer to me. Thanks.

I am trying it on my data.

I am using the latest one, 3.8.1.


Sorry, I meant the version of Nextflow, not the pipeline.


N E X T F L O W ~ version 22.04.0

4 months ago

I suppose this happens because you are actually providing two reference genomes at the same time. When you specify the test profile, it loads a small test dataset from GitHub to run the pipeline on: the test data for the RNA-seq pipeline is a down-sampled yeast dataset and reference genome (iGenomes S. cerevisiae R64-1-1 Ensembl release).

Indeed, that error message is not really helpful, but I suspect it is the expected outcome when you load two reference genomes. I suppose that somewhere in the pipeline an operation is performed that is only allowed on strings; since you specified two reference genomes, the respective variables are now lists, and the java.lang.UnsupportedOperationException is thrown.


Hi, thanks for the information. But then how should I use only one genome of my choice? How will the syntax change?


Hi Ankit

You actually did the right thing. Whatever is specified on the command line overrides what is specified in the configuration files and the profile.

But in your case, I'm assuming since you specified --genome GRCh38 that you want to use the igenomes files from this particular genome.

Can you try adding --igenomes_ignore=false?

I tried to reproduce your error but couldn't. Please let us know if it works.


Well, if I launch the pipeline as Ankit did, I get that weird mixture of true GRCh38 and yeast reference files:

  Reference genome options
  genome                    : GRCh38
  fasta                     : https://github.com/nf-core/test-datasets/raw/rnaseq/reference/genome.fasta
  gtf                       : https://github.com/nf-core/test-datasets/raw/rnaseq/reference/genes.gtf.gz
  gff                       : https://github.com/nf-core/test-datasets/raw/rnaseq/reference/genes.gff.gz
  gene_bed                  : /sw/data/igenomes//Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.bed
  transcript_fasta          : https://github.com/nf-core/test-datasets/raw/rnaseq/reference/transcriptome.fasta
  additional_fasta          : https://github.com/nf-core/test-datasets/raw/rnaseq/reference/gfp.fa.gz
  star_index                : /sw/data/igenomes//Homo_sapiens/NCBI/GRCh38/Sequence/STARIndex/
  hisat2_index              : https://github.com/nf-core/test-datasets/raw/rnaseq/reference/hisat2.tar.gz
  rsem_index                : https://github.com/nf-core/test-datasets/raw/rnaseq/reference/rsem.tar.gz
  salmon_index              : https://github.com/nf-core/test-datasets/raw/rnaseq/reference/salmon.tar.gz
  save_reference            : true
  igenomes_base             : /sw/data/igenomes/

That is a prebuilt STAR index for a human genome, but a gene annotation and Salmon index for yeast. I stand by my assessment that this likely explains the error raised.
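The mixture is easier to spot when each reference parameter is labelled by where its value comes from. The following is a rough sketch, assuming the parameter summary has been saved to a text file in the "key : value" layout shown above; the function name classify_reference_sources and the labels are invented for illustration.

```shell
# Label each reference parameter by the source of its value.
# Reads a "key : value" parameter summary on stdin.
# classify_reference_sources is a hypothetical helper name.
classify_reference_sources() {
    awk '
        {
            # Split on the first " : " so that URLs containing ":" stay intact.
            idx = index($0, " : ")
            if (idx == 0) next
            key = substr($0, 1, idx - 1)
            val = substr($0, idx + 3)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            # The base directory is a setting, not a reference file.
            if (key == "igenomes_base") next
            if (val ~ /nf-core\/test-datasets/) src = "test-profile"
            else if (val ~ /igenomes/)          src = "igenomes"
            else next
            print key "\t" src
        }
    '
}

# Example:
#   classify_reference_sources < params_summary.txt
```

For the summary above, star_index and gene_bed would be labelled igenomes while fasta, gtf and the other indices would be labelled test-profile, which is exactly the human/yeast mismatch described.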

As far as the pipeline testing is concerned:

If nextflow run nf-core/rnaseq -r 3.8.1 -profile test,singularity --outdir /mnt/hom/ works, then the pipeline and Nextflow are functional, and you are ready to process your actual samples!

Just omit the test profile and instead specify your real sample sheet as input:

nextflow run nf-core/rnaseq -r 3.8.1 -profile singularity --outdir /mnt/hom/ --genome GRCh38 --input /path/to/your/samplesheet.csv

Also mind that the iGenomes reference for GRCh38 is quite outdated, so it may be better to download your own reference files and use those:

wget -L ftp://ftp.ensembl.org/pub/release-106/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz
wget -L ftp://ftp.ensembl.org/pub/release-106/gtf/homo_sapiens/Homo_sapiens.GRCh38.106.gtf.gz

nextflow run nf-core/rnaseq \
    --fasta Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz \
    --gtf Homo_sapiens.GRCh38.106.gtf.gz \
    --remove_ribo_rna \
    --save_reference \
    --outdir /mnt/hom/ \
    -profile singularity \
    -r 3.8.1 \
    --input /path/to/your/samplesheet.csv
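Since the two reference files are large downloads, it may be worth confirming that they arrived intact before starting a long run. A minimal sketch, assuming gzip is available; check_gzip_files is a hypothetical helper name.

```shell
# Report whether each gzip file passes an integrity test.
# Returns non-zero if any file is missing or corrupt.
# check_gzip_files is a hypothetical helper name.
check_gzip_files() {
    rc=0
    for f in "$@"; do
        # gzip -t reads the whole archive and verifies its checksum.
        if gzip -t "$f" 2>/dev/null; then
            echo "OK       $f"
        else
            echo "FAILED   $f"
            rc=1
        fi
    done
    return "$rc"
}

# Example:
#   check_gzip_files Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz Homo_sapiens.GRCh38.106.gtf.gz
```

A FAILED line usually means a truncated download, in which case fetching the file again is the simplest fix.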

Ankit, I hope that helps! PS: If you are running the pipeline on a shared compute system like a university cluster, you might also want to check if there is already a config profile for this available in the nf-core configs.


Thanks Matthias!

Very useful suggestion.

I will test your suggested commands on my data.


Hi, I tried your suggestion.

nextflow -log run2.log run nf-core/rnaseq -r 3.8.1 -c add.config --igenomes_ignore=false --genome GRCh38 -profile test,singularity  --outdir /mnt/hom/

But I get the same error again:

Staging foreign file: s3://ngi-igenomes/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/STARIndex

Error executing process > 'NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:STAR_ALIGN (RAP1_UNINDUCED_REP2)'

Caused by: java.lang.UnsupportedOperationException

No idea how to resolve it.
