I am attempting to perform de novo assembly of sunflower with Supernova 2.0.0.
I am having difficulty getting it to finish within the wallclock limits of the resources I am using: 48 hours on SDSC Comet (64 cores, 1.4TB memory) and 72 hours on Savio here at UCB (16 cores, 512GB memory).
I have typically not included --maxreads in my scripts, on the assumption that using all of the reads produces the best-quality assembly, but that is not realistic given my wallclock limits. One question is whether I should limit the number of reads to the figure the sequencing company gives in its report. Our sequences are from a HiSeq X, and the report says there are 261M reads. Is a "read" as counted by the sequencing company the same thing as a read as counted by --maxreads in Supernova?
Also, do you set --localcores and --localmem? Or do you just let the program use whatever resources are available on the node?
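To make that question concrete, here is a rough sketch of the kind of invocation I mean, pinned to the Savio node (16 cores, 512GB). The run id and FASTQ path are placeholders, and the 480GB figure (leaving some headroom below the node total) is my own guess, not a documented recommendation.

```python
import shlex

def supernova_cmd(run_id, fastqs, cores, mem_gb, maxreads=None):
    """Build a `supernova run` command line as a list of arguments."""
    cmd = [
        "supernova", "run",
        f"--id={run_id}",
        f"--fastqs={fastqs}",
        f"--localcores={cores}",   # cap on cores Supernova may use
        f"--localmem={mem_gb}",    # cap on memory, in GB
    ]
    if maxreads is not None:
        cmd.append(f"--maxreads={maxreads}")
    return cmd

# Placeholder id and path; 480 GB is an assumed headroom choice on a 512GB node.
savio = supernova_cmd("sunflower_asm", "/path/to/fastqs", cores=16, mem_gb=480)
print(shlex.join(savio))
```

The alternative I am asking about is simply leaving out the last two flags and letting Supernova detect the node's resources itself.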
I should also add that the genome is quite large, about 3.6 Gb, and that I expect some heterozygosity.
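For reference, here is the back-of-the-envelope raw coverage I get from those numbers, which is why the definition of a "read" matters to me. The 150 bp read length is an assumption on my part (it is not stated in the report figures above), and the calculation ignores any trimming, so treat it as a rough sketch only.

```python
def raw_coverage(n_reads, read_len_bp, genome_size_bp):
    """Raw sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * read_len_bp / genome_size_bp

GENOME_BP = 3.6e9      # sunflower genome size from this post
READ_LEN_BP = 150      # ASSUMPTION: read length, not given in the report figures
REPORTED_READS = 261e6 # figure from the sequencing company's report

# Case 1: the 261M figure counts individual reads.
cov_individual = raw_coverage(REPORTED_READS, READ_LEN_BP, GENOME_BP)

# Case 2: the 261M figure counts read pairs (two reads per pair).
cov_pairs = raw_coverage(2 * REPORTED_READS, READ_LEN_BP, GENOME_BP)

print(f"if 261M are individual reads: {cov_individual:.1f}x")
print(f"if 261M are read pairs:       {cov_pairs:.1f}x")
```

Under that assumed read length the two interpretations differ by a factor of two in depth, so the value I would pass to --maxreads depends on which one the report means.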