Question

Canu 2.1.1 Storage issue at correction/1-overlapper

3.2 years ago

arteen.marashi • 0

Hello,

I am running Canu 2.1.1 to assemble PacBio Sequel II reads for an insect with an estimated genome size of 1.9 Gbp, with all jobs submitted through SLURM. I first ran Canu with default parameters but ran out of storage space (20T), so I looked at the Canu FAQ page and added the options recommended under "My assembly is running out of space, is too slow?".

Namely, these options:

corMhapFilterThreshold=0.0000000002 corMhapOptions="--threshold 0.80 --num-hashes 512 --num-min-matches 3 --ordered-sketch-size 1000 --ordered-kmer-size 14 --min-olap-length 2000 --repeat-idf-scale 50" mhapMemory=60g mhapBlockSize=500 ovlMerDistinct=0.975

The canu command I ran was:

canu -p veletis -d output \
  corMhapFilterThreshold=0.0000000002 \
  corMhapOptions="--threshold 0.80 --num-hashes 512 --num-min-matches 3 --ordered-sketch-size 1000 --ordered-kmer-size 14 --min-olap-length 2000 --repeat-idf-scale 50" \
  mhapMemory=60g mhapBlockSize=500 ovlMerDistinct=0.975 \
  gridOptions="--time=15:00:00" \
  genomeSize=1.96g \
  -pacbio ~/projects/def-bsincla7/arteen/veletis/raw_reads/pacbio/fastq/longread_c1.fastq ~/projects/def-bsincla7/arteen/veletis/raw_reads/pacbio/fastq/longread_c2.fastq

I let this run, and it is now about halfway through the correction/1-overlapper jobs and already using 15T of the 20T I have available. Most of that space is taken up by the *.ovb files in correction/1-overlapper/results.
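As a rough back-of-the-envelope check on where this is heading, here is a minimal Python sketch of a linear projection. It assumes the remaining overlapper jobs write about as much output as the finished ones, which may not hold in practice. The function names and the 1-of-2 job ratio (standing in for "about halfway") are illustrative only and are not part of Canu.

```python
from pathlib import Path

TB = 10**12  # decimal terabytes; the "T" figures in this post may be TiB instead


def ovb_bytes(results_dir):
    """Sum the sizes of all *.ovb files directly inside results_dir."""
    return sum(p.stat().st_size for p in Path(results_dir).glob("*.ovb"))


def projected_total(used_bytes, jobs_done, jobs_total):
    """Linearly extrapolate the final size from the jobs finished so far.

    Assumption: unfinished jobs produce, on average, as much output
    as the finished ones.
    """
    if jobs_done <= 0 or jobs_total < jobs_done:
        raise ValueError("need 0 < jobs_done <= jobs_total")
    return used_bytes * jobs_total / jobs_done


if __name__ == "__main__":
    # Figures from this post: 15T used at roughly the halfway point.
    # On a real run, used_bytes could instead come from
    # ovb_bytes("output/correction/1-overlapper/results").
    estimate = projected_total(15 * TB, jobs_done=1, jobs_total=2)
    print(f"projected overlap output: {estimate / TB:.0f} TB")
```

Under that assumption, 15T at the halfway point projects to roughly 30T for this stage, well beyond the 20T I have.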

Does anyone know of any other options or tweaks I could use to reduce the storage footprint? I hope that makes sense; I can clarify anything if necessary.

Thanks for the help!

Assembly genome • 600 views