I recently ran SPAdes on two samples and encountered an error code of -6.
I noticed that one of the samples was large and the other had far fewer reads. I suspect the error comes from the smaller dataset not having enough reads.
Number of FASTQ records in the smaller sample:
sample1.cleaned.merged.fq.gz 2166
sample1.cleaned.unmerged1.fq.gz 9355
sample1.cleaned.unmerged2.fq.gz 9355
Number of FASTQ records in the larger sample:
sample2.cleaned.merged.fq.gz 341864
sample2.cleaned.unmerged1.fq.gz 3009684
sample2.cleaned.unmerged2.fq.gz 3009684
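For reference, counts like these can be reproduced with a short Python sketch. It assumes standard four-line FASTQ records (no wrapped sequence lines), and the function name count_fastq_records is my own, not part of SPAdes.

```python
import gzip


def count_fastq_records(path):
    """Count records in a gzipped FASTQ file.

    Assumption: every record spans exactly four lines
    (header, sequence, '+', qualities).
    """
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        # A remainder usually means a truncated or non-standard file.
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4
```

Calling count_fastq_records("sample1.cleaned.merged.fq.gz") returns the record count for that file as an integer.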
Note that these are actually two metagenomics datasets, but since the --meta flag isn't implemented for more than one paired-end library, I left that parameter out of the command line call.
I ran SPAdes as follows:
spades.py \
  -t 36 \
  --only-assembler \
  -o spades_output \
  -k 21,33,55,77,99 \
  -m 200 \
  --pe1-1 sample1.cleaned.unmerged1.fq.gz \
  --pe1-2 sample1.cleaned.unmerged2.fq.gz \
  --pe1-s sample1.cleaned.merged.fq.gz \
  --pe2-1 sample2.cleaned.unmerged1.fq.gz \
  --pe2-2 sample2.cleaned.unmerged2.fq.gz \
  --pe2-s sample2.cleaned.merged.fq.gz

System information:
SPAdes version: 3.13.0
Python version: 3.7.2
OS: Linux-4.14.101-75.76.amzn1.x86_64-x86_64-with-debian-9.6
1) Do you have a recommended minimum number of reads for a sample coming from a metagenomics source? That is, should I only attempt to assemble samples above a certain size?
2) These two samples actually came from the same physical source, which is why I wanted to assemble them together. I assumed that together, the samples would have enough reads to warrant attempting an assembly. Does SPAdes treat the two samples separately? Would it be more appropriate to concatenate the input files and treat the two samples as one sample?
3) If I concatenated these two sequencing samples, I could also use the --meta flag. Would you expect this to improve my chances of a good assembly?
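For reference, the concatenation I have in mind in questions 2 and 3 could be sketched as follows. It relies on the fact that gzip files can be joined byte-for-byte into a valid multi-member gzip file. The function name and the combined.* output names are hypothetical, and the inputs must be listed in the same sample order for the unmerged1 and unmerged2 files so that read pairs stay in sync.

```python
import shutil


def concatenate_gzip(inputs, output):
    """Join gzip files byte-for-byte into one multi-member gzip file.

    No decompression is needed; the inputs are copied in the order given.
    """
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


# Hypothetical usage; keep the sample order identical for both mates:
# concatenate_gzip(["sample1.cleaned.unmerged1.fq.gz",
#                   "sample2.cleaned.unmerged1.fq.gz"],
#                  "combined.unmerged1.fq.gz")
# concatenate_gzip(["sample1.cleaned.unmerged2.fq.gz",
#                   "sample2.cleaned.unmerged2.fq.gz"],
#                  "combined.unmerged2.fq.gz")
```

The combined files would then be passed to SPAdes as a single paired-end library. Whether that is preferable to two separate libraries is exactly what I am unsure about.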
If anyone has any been-there-done-that experience, it would be appreciated!