Question: Difficulty using SPAdes - Error relating to access and specificity
gfrsims wrote, 2.4 years ago:

Hello,

Apologies if this post is unjustified, but I am struggling to use the SPAdes assembly software.

I have 18 single reads from which I am trying to build contigs using SPAdes.

Unfortunately I have very limited experience of coding and the command line, and am finding it very difficult.

I have successfully installed the program and the test run completes correctly. However, I have encountered errors when trying to run it on my own data (see below):

Gareths-MacBook-Pro:bin Gareth$ python spades.py --careful -o SPAdes_out -s Macintosh HD/Spades/input/input.fas

== Error == Please specify option (e.g. -1, -2, -s, etc) for the following paths: HD/Spades/input/input.fas

Any advice would be greatly appreciated

Written 2.4 years ago by gfrsims (modified 2.4 years ago by h.mon)

Read the manual carefully: http://spades.bioinf.spbau.ru/release3.10.1/manual.html#sec3.1

Example:

spades.py -k 21,33,55,77 -t 12 -m 50 --careful -o outputdir -1 read_1.fq.gz -2 read_2.fq.gz
Reply written 2.4 years ago by shenwei356 (modified 2.4 years ago)

I have 18 single reads for which I am trying to build contigs from using SPAdes.

Can you clarify what you mean by that? What format are these reads in? Using SPAdes for that small a number of reads may be either overkill or the wrong program for the problem at hand.

Reply written 2.4 years ago by genomax

Apologies, perhaps my initial post was somewhat hasty. I will try harder to outline what I am trying to achieve.

I have been set a task where I have 18 - 20bp reads I need to assemble into a sequence. Whilst the task was envisioned to be performed cut-and-stick style, I have taken the opportunity to expand my knowledge and attempt it via software.

I have solved the issues with my input upon closer inspection of the manual (thanks, shenwei356). I now face a different problem: my input is in FASTA format and the software appears to require FASTQ.

Gareths-MacBook-Pro:spades1 Gareth$ python ~/SPAdes-3.10.1-Darwin/bin/spades.py --careful -s input.fa -o spadesout1
== Error ==  to run read error correction, reads should be in FASTQ format (.fq, .fastq, .bam, .fq.gz, .fastq.gz, .bam.gz are supported): /Users/Gareth/spades1/input.fa (single reads, library number: 1, library type: single)
In case you have troubles running SPAdes, you can write to spades.support@cab.spbu.ru
Please provide us with params.txt and spades.log files from the output directory.
Gareths-MacBook-Pro:spades1 Gareth$ python ~/SPAdes-3.10.1-Darwin/bin/spades.py --careful -s input.fas -o spadesout1
== Error ==  file not found: /Users/Gareth/spades1/input.fas (single reads, library number: 1, library type: single)
In case you have troubles running SPAdes, you can write to spades.support@cab.spbu.ru
Please provide us with params.txt and spades.log files from the output directory.

Should I convert the file to FASTQ, using a uniform quality value (e.g. ~) for every base, or is there a way to input FASTA files into SPAdes, as the manual suggests but does not explain?
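The FASTA-to-FASTQ conversion asked about here could be sketched in Python as follows. This is only a minimal sketch, assuming a plain-text FASTA input; the function name `fasta_to_fastq` and the default quality character are illustrative choices and not part of SPAdes. In the common Phred+33 encoding, `~` is the highest possible score (93), so a more moderate placeholder such as `I` (Q40) is assumed here. Placeholder qualities carry no real information, so any step that relies on quality values would be working from made-up numbers.

```python
def fasta_to_fastq(fasta_text, qual_char="I"):
    """Convert FASTA text to FASTQ text, giving every base the same quality.

    qual_char is a placeholder quality symbol (an assumption, not measured data).
    """
    records = []              # collected (name, sequence) pairs
    name, chunks = None, []   # header and sequence lines of the current record
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue          # skip blank lines
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)   # sequences may be wrapped over several lines
    if name is not None:
        records.append((name, "".join(chunks)))

    # One four-line FASTQ record per read: @name, sequence, +, qualities.
    return "".join(
        f"@{name}\n{seq}\n+\n{qual_char * len(seq)}\n" for name, seq in records
    )


if __name__ == "__main__":
    example = ">read1\nACGTACGTAC\nGTACGTACGT\n>read2\nTTGACCA\n"
    print(fasta_to_fastq(example), end="")
```

To convert a file, the text could be read from `input.fa` and the result written to a file ending in `.fastq`, since the error message above lists the extensions SPAdes accepts for error correction.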

I really appreciate your reply

G

Reply written 2.4 years ago by gfrsims (modified 2.4 years ago by genomax)

Did you try running without --careful?

Reply written 2.4 years ago by genomax

I have just tried it, with the following error:

Gareths-MacBook-Pro:spades1 Gareth$ python ~/SPAdes-3.10.1-Darwin/bin/spades.py -s input.fa -o spadesout2


== Error ==  to run read error correction, reads should be in FASTQ format (.fq, .fastq, .bam, .fq.gz, .fastq.gz, .bam.gz are supported): /Users/Gareth/spades1/input.fa (single reads, library number: 1, library type: single)
Reply written 2.4 years ago by gfrsims

So I tried assembly only (--only-assembler) to avoid the above error, and received the following error instead:

Gareths-MacBook-Pro:spades1 Gareth$ python ~/SPAdes-3.10.1-Darwin/bin/spades.py -s input.fa -o spadesout2 --only-assembler
Command line: /Users/Gareth/SPAdes-3.10.1-Darwin/bin/spades.py  -s  /Users/Gareth/spades1/input.fa  -o  /Users/Gareth/spades1/spadesout2    --only-assembler

OMITTED SECTION OF OUTPUT AS TOO LONG FOR CHARACTER LIMIT

== Error ==  system call for: "['/Users/Gareth/SPAdes-3.10.1-Darwin/bin/spades', '/Users/Gareth/spades1/spadesout2/K21/configs/config.info']" finished abnormally, err code: -6

So I tried a FASTQ file made with a generic value (~) for the read quality:

Gareths-MacBook-Pro:spades1 Gareth$ python ~/SPAdes-3.10.1-Darwin/bin/spades.py -s input.fa -o spadesout2 --only-assembler
Command line: /Users/Gareth/SPAdes-3.10.1-Darwin/bin/spades.py  -s  /Users/Gareth/spades1/input.fa  -o  /Users/Gareth/spades1/spadesout2    --only-assembler

OMITTED SECTION OF OUTPUT AS TOO LONG FOR CHARACTER LIMIT

== Error ==  system call for: "['/Users/Gareth/SPAdes-3.10.1-Darwin/bin/hammer', '/Users/Gareth/spades1/spadesout1/corrected/configs/config.info']" finished abnormally, err code: -8

Any advice would be very welcome! Thank you for taking a look!

G

Reply written 2.4 years ago by gfrsims (modified 2.4 years ago)
h.mon (Brazil) wrote, 2.4 years ago:

I do not think SPAdes is designed for the very short reads (18-22bp) you have. Anyway, you may try with -k 11,15,19, but I am not even sure it accepts such short kmers.
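As a quick sanity check before choosing `-k` values, a few lines of Python can filter a candidate list against the read lengths. This is a sketch under stated assumptions: the helper name `usable_kmer_sizes` is hypothetical, the "odd and less than 128" rule is taken from the SPAdes manual's description of `-k`, and requiring k to be strictly shorter than the shortest read is a conservative choice made here, not a documented SPAdes rule.

```python
def usable_kmer_sizes(candidates, read_lengths, max_k=127):
    """Return the candidate k values that are odd, at most max_k,
    and strictly shorter than the shortest read (ascending order)."""
    shortest = min(read_lengths)
    return [
        k for k in sorted(candidates)
        if k % 2 == 1 and k <= max_k and k < shortest
    ]


if __name__ == "__main__":
    # Hypothetical read lengths in the 18-20 bp range discussed in this thread.
    print(usable_kmer_sizes([11, 15, 19, 21], [20, 18, 20]))
```

With reads as short as 18 bp, this check keeps only 11 and 15 from the list above. It does not tell you whether SPAdes will actually run with such small values.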

Also, from your original post, I think you have folders with spaces in their names. Save yourself a lot of trouble by NEVER using spaces in file and folder names.
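To see why the space matters, Python's standard `shlex` module can be used to show how a POSIX-style shell would split the original command into arguments. This is an illustration only; it assumes the shell in use follows the usual whitespace-splitting rules, and the command strings are copied from the original post.

```python
import shlex

path = "Macintosh HD/Spades/input/input.fas"

# The command as originally typed: the space splits the path into two arguments.
unquoted = "python spades.py --careful -o SPAdes_out -s " + path
print(shlex.split(unquoted)[-2:])

# Quoting the path keeps it together as a single argument.
quoted = "python spades.py --careful -o SPAdes_out -s " + shlex.quote(path)
print(quoted)
print(shlex.split(quoted)[-1])
```

In the unquoted form, `-s` receives only `Macintosh`, and `HD/Spades/input/input.fas` is left over as a stray path, which matches the path named in the error message from the original post.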

Written 2.4 years ago by h.mon (modified 2.4 years ago)

Thank you for your advice. Noted that using SPACES in file names is dumb; I will avoid doing this again.

I have tried the k-mer lengths specified above, to no avail. They produce the following error message:

== Error ==  system call for: "['/Users/Gareth/SPAdes-3.10.1-Darwin/bin/spades', '/Users/Gareth/spades1/spadesout2/K19/configs/config.info']" finished abnormally, err code: -6

I appreciate the issue may well be the small size of my reads and total sequence length. Could you recommend an alternative piece of freeware that would be able to assemble the reads into a sequence, using either the k-mer or the greedy method?
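For a data set this small, the greedy method mentioned here can be sketched in a few lines of Python: repeatedly merge the pair of reads with the longest suffix-to-prefix overlap. This is a toy illustration, not the algorithm of any published assembler. The function names, the `min_len` threshold and the example reads are all made up for the sketch, and it ignores sequencing errors, reverse complements and repeats.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.
    Returns 0 if no overlap of at least min_len exists."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Greedily merge reads by longest overlap; returns the remaining contigs."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no remaining pair overlaps enough to merge
        merged = reads[best_i] + reads[best_j][best_len:]
        # Replace the merged pair (and anything now contained in it) with the merge.
        reads = [
            r for k, r in enumerate(reads)
            if k not in (best_i, best_j) and r not in merged
        ] + [merged]
    return reads


if __name__ == "__main__":
    # Made-up reads cut from the made-up sequence ATGGCGTACCTTAGGCA.
    print(greedy_assemble(["CCTTAGGCA", "ATGGCGTA", "CGTACCTT"]))
```

If more than one contig comes back, no remaining pair overlapped by at least `min_len` bases. Lowering that threshold makes merges easier but also makes chance overlaps between unrelated reads more likely.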

Many thanks

G

Reply written 2.4 years ago by gfrsims

What kind of genome is this and what is the expected size? Can you comment on why you have reads that are this short? This may be an exercise in futility if the library was not properly made/sequenced.

You could give tadpole.sh from BBMap suite a try instead of SPAdes.

Reply written 2.4 years ago by genomax (modified 2.4 years ago)

Try SSAKE or SHARCGS, two assemblers developed for very short reads.

Reply written 2.4 years ago by h.mon