I want to assemble a reptile genome with SPAdes, since it gave me no problems installing, unlike many other programs (e.g. MaSuRCA, Velvet, SOAPdenovo, etc.). I'm wondering if a diploid genome of 1.4 Gb is too large for this program.
I have read papers that assembled genomes of almost 3 Gb with SPAdes (if I remember the reference I will post it), so the size itself should not be a problem. However, I have personally had problems with complex genomes (with large, short and tandem repeats): they produce very fragmented and redundant assemblies. In addition, to assemble a 50 Mb diploid genome from 80 million paired reads I needed about 40-50 GB of RAM with default parameters, and increasing the k-mer size made the assembly impossible.
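To put those numbers in perspective, here is a rough sketch of the usual depth-of-coverage arithmetic (total sequenced bases divided by genome size). The read length of 150 bp is only an assumption for illustration, since I did not state it above, and the function name is made up for this example.

```python
def estimate_coverage(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Approximate depth of coverage: total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return n_reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    GENOME_SIZE_BP = 50_000_000   # the 50 Mb genome mentioned above
    READ_LENGTH_BP = 150          # ASSUMPTION: read length is not given in the post

    # "80 million paired reads" read as 80 million individual reads
    print(estimate_coverage(80_000_000, READ_LENGTH_BP, GENOME_SIZE_BP))

    # ... or read as 80 million pairs, i.e. 160 million individual reads
    print(estimate_coverage(160_000_000, READ_LENGTH_BP, GENOME_SIZE_BP))
```

The same function can be pointed at a 1.4 Gb genome to see how much data a given depth would imply before committing to an assembler.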
From the SPAdes manual:
Note, that SPAdes was initially designed for small genomes. It was tested on bacterial (both single-cell MDA and standard isolates), fungal and other small genomes. SPAdes is not intended for larger genomes (e.g. mammalian size genomes). For such purposes you can use it at your own risk.
As Buffo noted, it is possible to use SPAdes with large genomes, and I have done so myself. It was hit or miss, though: very often it would fail from using too much memory, or SPAdes would exit with some error. Again as Buffo noted, complex genomes or lower-quality data can hugely increase memory usage, rendering SPAdes impractical.
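If you do try it, it helps to set the memory limit and k-mer sizes explicitly instead of relying on the defaults. Below is a minimal sketch of a Python helper that builds a paired-end `spades.py` command line using the `-1`, `-2`, `-o`, `-m` (memory limit in GB), `-t` (threads) and `-k` (comma-separated k-mer sizes) options. The helper name, file names and the numbers in the example are placeholders I made up, not recommendations for your data.

```python
import shlex


def build_spades_command(reads_1, reads_2, out_dir,
                         memory_gb=250, threads=16, kmers=None):
    """Return the argument list for a paired-end SPAdes run (hypothetical helper)."""
    cmd = [
        "spades.py",
        "-1", reads_1,          # forward reads
        "-2", reads_2,          # reverse reads
        "-o", out_dir,          # output directory
        "-m", str(memory_gb),   # memory limit in GB; SPAdes stops if it is reached
        "-t", str(threads),     # number of threads
    ]
    if kmers:
        # SPAdes expects odd k-mer sizes below 128
        for k in kmers:
            if k % 2 == 0 or k >= 128:
                raise ValueError(f"invalid k-mer size: {k}")
        cmd += ["-k", ",".join(str(k) for k in kmers)]
    return cmd


if __name__ == "__main__":
    # Placeholder file names and resources, for illustration only
    cmd = build_spades_command("reads_1.fastq.gz", "reads_2.fastq.gz",
                               "spades_out", memory_gb=500, threads=32,
                               kmers=[21, 33, 55])
    print(shlex.join(cmd))
    # To actually launch it: subprocess.run(cmd, check=True)
```

Building the command as a list keeps it easy to log and to pass to `subprocess.run` without shell quoting problems. Whether a given memory limit is enough for a 1.4 Gb genome is something you would only find out by running it.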
Regarding installation problems, (mini)conda may be of great help.