Error while running nf-core/rnaseq pipeline
16 months ago

I ran the following command:

nextflow run nf-core/rnaseq \
    -r 3.10.1 \
    --input samplesheet.csv \
    --outdir output \
    --fasta chr22_with_ERCC92.fa \
    -profile docker \
    --gtf chr22_with_ERCC92.gtf \
    --max_memory 3.7GB \
    --max_cpus 4

The error I got is as follows:

-[nf-core/rnaseq] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:TRIMGALORE (CONTROL_REP3)'

Caused by:
  Process `NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:TRIMGALORE (CONTROL_REP3)` terminated with an error exit status (255)

Command executed:

  [ ! -f  CONTROL_REP3_1.fastq.gz ] && ln -s HBR_Rep3_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz CONTROL_REP3_1.fastq.gz
  [ ! -f  CONTROL_REP3_2.fastq.gz ] && ln -s HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq.gz CONTROL_REP3_2.fastq.gz
  trim_galore \
      --fastqc_args '-t 4' \
      --cores 1 \
      --paired \
      --gzip \
      CONTROL_REP3_1.fastq.gz \
      CONTROL_REP3_2.fastq.gz

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:TRIMGALORE":
      trimgalore: $(echo $(trim_galore --version 2>&1) | sed 's/^.*version //; s/Last.*$//')
      cutadapt: $(cutadapt --version)
  END_VERSIONS

Command exit status:
  255

Command output:
  (empty)

Command error:
  16    2       0.0     1       0 2
  21    2       0.0     1       0 2
  24    1       0.0     1       0 1
  25    1       0.0     1       0 1
  26    1       0.0     1       0 1
  27    1       0.0     1       0 1
  28    2       0.0     1       0 2
  30    2       0.0     1       0 2
  31    2       0.0     1       0 2
  36    4       0.0     1       0 4
  37    2       0.0     1       0 2
  39    1       0.0     1       0 1
  42    2       0.0     1       0 2
  44    1       0.0     1       0 1
  51    2       0.0     1       0 2
  53    1       0.0     1       0 1
  56    2       0.0     1       0 2
  58    2       0.0     1       0 2
  62    2       0.0     1       0 2
  66    1       0.0     1       0 1
  67    1       0.0     1       0 1
  69    1       0.0     1       0 1
  71    1       0.0     1       0 1
  77    1       0.0     1       0 1
  78    2       0.0     1       0 2
  83    3       0.0     1       0 3
  85    3       0.0     1       0 3
  89    1       0.0     1       0 1
  92    1       0.0     1       0 1
  93    1       0.0     1       0 1
  94    1       0.0     1       0 1
  98    1       0.0     1       0 1
  99    1       0.0     1       0 1
  100   2       0.0     1       0 2

  RUN STATISTICS FOR INPUT FILE: CONTROL_REP3_2.fastq.gz
  =============================================
  118571 sequences processed in total
  The length threshold of paired-end sequences gets evaluated later on (in the validation step)

  Validate paired-end files CONTROL_REP3_1_trimmed.fq.gz and CONTROL_REP3_2_trimmed.fq.gz
  file_1: CONTROL_REP3_1_trimmed.fq.gz, file_2: CONTROL_REP3_2_trimmed.fq.gz


  >>>>> Now validing the length of the 2 paired-end infiles: CONTROL_REP3_1_trimmed.fq.gz and CONTROL_REP3_2_trimmed.fq.gz <<<<<
  Writing validated paired-end Read 1 reads to CONTROL_REP3_1_val_1.fq.gz
  Writing validated paired-end Read 2 reads to CONTROL_REP3_2_val_2.fq.gz

  Read 2 output is truncated at sequence count: 118571, please check your paired-end input files! Terminating...

Work dir:
  /home/eesha/work/8f/4b24724187219cec48cb42dc0e3d82

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

 -- Check '.nextflow.log' file for details

What is the workaround for this?

nf-core RNA-seq • 2.2k views

The message is quite clear to me:

Read 2 output is truncated at sequence count....

Check your FASTQ files:

gunzip -t CONTROL_REP3_1.fastq.gz && echo "OK"
gunzip -t CONTROL_REP3_2.fastq.gz && echo "OK"
gunzip -c CONTROL_REP3_1.fastq.gz | wc -l
gunzip -c CONTROL_REP3_2.fastq.gz | wc -l

I'm sorry, but I don't understand what exactly to do. These are the names of my fastq files:

HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz
HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq.gz
HBR_Rep2_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz
HBR_Rep2_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq.gz
HBR_Rep3_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz
HBR_Rep3_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq.gz


The log message says that the FastQ file is truncated - cut short. This suggests that your input FastQ files are corrupted, likely by a broken download or similar.

Pierre is suggesting that you check that your FastQ files are valid gzip files, then count the lines in each one to make sure that the pairs correspond (the read1 and read2 files of a pair should have identical line counts).
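The two checks described above can be sketched as a small shell function. This is only an illustration and not part of the original reply: the function name `check_pair` is hypothetical, and it assumes standard four-line FASTQ records, so that the number of reads is the line count divided by 4.

```shell
#!/usr/bin/env bash
# check_pair R1 R2: hypothetical helper, not part of nf-core or Trim Galore.
# Verifies that both files are intact gzip archives, then compares read counts.
check_pair() {
    local r1="$1" r2="$2" f n1 n2
    for f in "$r1" "$r2"; do
        # gzip -t exits non-zero if the archive is truncated or corrupt
        if ! gzip -t "$f" 2>/dev/null; then
            echo "CORRUPT: $f"
            return 1
        fi
    done
    # Assumes 4 lines per FASTQ record (no wrapped sequence lines)
    n1=$(( $(gzip -dc "$r1" | wc -l) / 4 ))
    n2=$(( $(gzip -dc "$r2" | wc -l) / 4 ))
    if [ "$n1" -ne "$n2" ]; then
        echo "MISMATCH: $r1 has $n1 reads, $r2 has $n2 reads"
        return 1
    fi
    echo "OK: $n1 read pairs"
}
```

It would be called once per sample with the two files that are supposed to belong together, for example `check_pair sample.read1.fastq.gz sample.read2.fastq.gz` (placeholder file names).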


Thank you for the explanation! I will look into it.


This is what I observed when checking the FastQ files:

for F in HBR_Rep*_ERCC-Mix2_Build37-ErccTranscripts-chr22.read*.fastq.gz ; do echo "Testing ${F}" && gunzip -t "${F}" && echo "OK zip ${F}" && gunzip -c "${F}" | wc -l ; done

Testing HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz
OK zip HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz
474284
Testing HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq.gz
OK zip HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq.gz
474284
Testing HBR_Rep2_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz
OK zip HBR_Rep2_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz
579304
Testing HBR_Rep2_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq.gz
OK zip HBR_Rep2_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq.gz
579304
Testing HBR_Rep3_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz
OK zip HBR_Rep3_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz
519144
Testing HBR_Rep3_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq.gz
OK zip HBR_Rep3_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq.gz
519144

Does this mean there is no error in the input FastQ files, since the Read 1 and Read 2 line counts are the same for every replicate?

for F in HBR_Rep*_ERCC-Mix2_Build37-ErccTranscripts-chr22.read*.fastq.gz ; do echo "Testing ${F}" && gunzip -t ${F} && echo "OK zip ${F}" && gunzip -c "${F}" | wc -l ; done

Thanks a lot. Will try it out!


Hi Pierre, I used the command you posted and got the same line counts for read 1 and read 2 in each replicate. But the pipeline still gives the error (as posted in the comment above). Could you please help me with a workaround?
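One further check follows from the log in the question, though nobody in the thread confirms it as the cause. The failing task linked CONTROL_REP3_1.fastq.gz to an HBR_Rep3 read1 file but CONTROL_REP3_2.fastq.gz to an HBR_Rep1 read2 file. The HBR_Rep1 line count of 474284 corresponds to 474284 / 4 = 118571 reads, which is the count at which Trim Galore reported truncation, while HBR_Rep3 has 519144 / 4 = 129786 reads. That pattern would be consistent with a samplesheet row that pairs files from different replicates. Below is a minimal sketch for checking each row of samplesheet.csv. It assumes the column order sample,fastq_1,fastq_2,... with one header line, Unix line endings, no quoted fields, and four-line FASTQ records. The function name `check_samplesheet` is hypothetical.

```shell
#!/usr/bin/env bash
# check_samplesheet CSV: hypothetical helper, not part of nf-core/rnaseq.
# Assumes columns sample,fastq_1,fastq_2,... with one header line,
# Unix line endings, and no quoted or comma-containing fields.
check_samplesheet() {
    local csv="$1" status=0 _header sample r1 r2 rest n1 n2
    {
        read -r _header || true   # skip the header line
        # "|| [ -n ... ]" also handles a last row without a trailing newline
        while IFS=, read -r sample r1 r2 rest || [ -n "$sample" ]; do
            # Single-end or blank row: nothing to compare
            if [ -z "$r2" ]; then continue; fi
            if [ ! -r "$r1" ] || [ ! -r "$r2" ]; then
                echo "MISSING $sample: cannot read $r1 or $r2"
                status=1
                continue
            fi
            # Reads = lines / 4, assuming unwrapped FASTQ records
            n1=$(( $(gzip -dc "$r1" | wc -l) / 4 ))
            n2=$(( $(gzip -dc "$r2" | wc -l) / 4 ))
            if [ "$n1" -eq "$n2" ]; then
                echo "OK $sample: $n1 read pairs"
            else
                echo "MISMATCH $sample: $r1 has $n1 reads, $r2 has $n2 reads"
                status=1
            fi
        done
    } < "$csv"
    return "$status"
}
```

Running `check_samplesheet samplesheet.csv` from the directory the FastQ paths are relative to would print one line per sample. If a row is reported as a mismatch, correcting that row and rerunning the pipeline with `-resume` would be the natural next step.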
