I have downloaded Illumina reads from SRA (SRR769347). I want to run a de novo assembly with SPAdes but I am getting an error and I am not sure how to fix it.
Here are all the steps I've performed:
I ran fastq-dump --split-files on the SRA file to extract the two files of paired reads, then used sed to append /1 and /2 to the read names in the first and second file respectively. I therefore have two files: file1.fastq and file2.fastq.
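The exact sed command is not given here, so the following is only a sketch of how the renaming step could be done. It uses awk rather than sed, assumes every record is exactly four lines (as fastq-dump writes them), and the helper name add_mate_suffix is hypothetical. Only the header line of each record is changed, which matches the example reads shown below.

```shell
# add_mate_suffix is a hypothetical helper: it appends a suffix such as /1 or /2
# to the read ID (first field) of every FASTQ header line.
# Assumes strict 4-line records, so header lines are lines 1, 5, 9, ...
add_mate_suffix() {
    awk -v suffix="$1" 'NR % 4 == 1 { $1 = $1 suffix } { print }'
}

# Possible usage, assuming fastq-dump --split-files wrote
# SRR769347_1.fastq and SRR769347_2.fastq:
# add_mate_suffix /1 < SRR769347_1.fastq > file1.fastq
# add_mate_suffix /2 < SRR769347_2.fastq > file2.fastq
```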
A read from file1.fastq looks like this:
@SRR769347.1/1 1 length=101
+SRR769347.1 1 length=101
A read from file2.fastq looks like this:
@SRR769347.1/2 1 length=101
+SRR769347.1 1 length=101
I then use these files as input to SPAdes 3.1.0 (the version available on our cluster), together with PacBio reads for the assembly, as follows:
spades.py -1 file1.fastq -2 file2.fastq --pacbio pacbio.fastq -o SPAdes_output
Invariably I am getting the following error:
== Error == file not found: file2.fastq (right reads, library number: 1, library type: paired-end)
I also tried another fastq-dump option, --split3, with which the reads are correctly labelled /1 and /2, but strangely each read appears in multiple copies, and SPAdes gives me the same error.
Any help would be great!
Try downloading the fastq files from ftp://ftp.ddbj.nig.ac.jp/ddbj_database/dra/fastq/SRA068/SRA068445/SRX247326/
Also use full path to fastq files, e.g.
Thanks, it was indeed a path issue. It works now! Sorry about that!
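For anyone hitting the same "file not found" error, a minimal sketch of a pre-flight check is given here. It confirms that each input exists and converts it to an absolute path before it is passed to spades.py. The helper name abs_path is hypothetical and not part of SPAdes; the file names are the ones from the question.

```shell
# abs_path is a hypothetical helper: print the absolute path of an existing
# file, or report the missing file and the current directory and return 1.
abs_path() {
    local f="$1"
    if [ ! -f "$f" ]; then
        echo "missing input: $f (current directory: $PWD)" >&2
        return 1
    fi
    # Resolve the containing directory, then re-attach the file name.
    echo "$(cd "$(dirname "$f")" && pwd -P)/$(basename "$f")"
}

# Possible usage; stop before launching SPAdes if any input is missing:
# r1=$(abs_path file1.fastq)  || exit 1
# r2=$(abs_path file2.fastq)  || exit 1
# pb=$(abs_path pacbio.fastq) || exit 1
# spades.py -1 "$r1" -2 "$r2" --pacbio "$pb" -o SPAdes_output
```

Resolving the paths first makes the command independent of the directory it is launched from, which matters when a job scheduler starts it from somewhere else on the cluster.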