Question: How do the memory requirements of ABySS scale?
2.4 years ago by
lieven.sterck8.0k
VIB, Ghent, Belgium
lieven.sterck8.0k wrote:

I have been running a few test runs with the ABySS assembler on a subset of my input data (to speed things up) in order to optimise the k-mer size. I'm now wondering whether I can guesstimate the memory requirements of my full run from the memory usage of these trial runs.

More specifically, does the memory scale with the input file size or rather with the genome size? I was thinking that, e.g., doubling my input size will not double the memory used, since the number of distinct k-mers to keep in memory most likely plateaus.

Any help or other user experiences are much appreciated.

thx

abyss assembly • 871 views
modified 2.4 years ago by benv710 • written 2.4 years ago by lieven.sterck8.0k
2.4 years ago by
benv710
Canada
benv710 wrote:

@lieven.sterck,

You have the right idea with respect to distinct k-mers -- the memory usage of ABySS is linear w.r.t. the number of distinct k-mers in the input reads.
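
Because the relationship is linear, a trial run can be scaled to a rough estimate for the full run. The following is only a minimal sketch, not an ABySS feature: the function name `extrapolate_memory` is hypothetical, and it assumes memory is directly proportional to the distinct k-mer count with no fixed overhead, which will hold only approximately in practice.

```python
def extrapolate_memory(trial_mem_gb, trial_kmers, full_kmers):
    """Scale the memory seen in a trial run to a full run.

    Assumption (not from ABySS itself): memory is directly proportional
    to the number of distinct k-mers, i.e. a line through the origin.
    """
    if trial_kmers <= 0:
        raise ValueError("trial_kmers must be positive")
    return trial_mem_gb * full_kmers / trial_kmers


if __name__ == "__main__":
    # Hypothetical numbers: a trial run that used 100 GB for 25 billion
    # distinct k-mers, and a full data set with 37 billion distinct k-mers.
    print(extrapolate_memory(100.0, 25e9, 37e9))
```

With two or more trial runs, fitting a straight line with an intercept would also capture any fixed overhead, at the cost of needing more measurements.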

The number of distinct k-mers in the data set depends jointly on: (i) genome size, (ii) sequencing error rate (sequencing errors create unique k-mers), and (iii) read coverage.
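
To see why these three factors combine into sub-linear growth, here is a back-of-the-envelope model. It is a rough sketch under stated assumptions, not how ABySS or any k-mer counter works: genomic k-mers saturate following a Poisson coverage model, every sequencing error creates up to `min(k, read_len - k + 1)` novel k-mers, and repeats and heterozygosity are ignored. The function name and parameters are hypothetical.

```python
import math


def estimate_distinct_kmers(genome_size, coverage, read_len, k, error_rate):
    """Rough estimate of distinct k-mers in a read set (illustrative model)."""
    # Effective k-mer coverage is lower than base coverage because each
    # read of length read_len only yields read_len - k + 1 k-mers.
    kmer_cov = coverage * (read_len - k + 1) / read_len
    # Genomic k-mers: fraction seen at least once under a Poisson model.
    # This term plateaus at genome_size as coverage grows.
    genomic = genome_size * (1.0 - math.exp(-kmer_cov))
    # Erroneous k-mers: assumed upper bound, each error spoils every
    # k-mer overlapping it. This term keeps growing with coverage.
    n_errors = error_rate * coverage * genome_size
    erroneous = n_errors * min(k, read_len - k + 1)
    return genomic + erroneous


if __name__ == "__main__":
    # Hypothetical 1 Gbp genome, 150 bp reads, k = 85, 0.5% error rate.
    for cov in (5, 50):
        print(cov, estimate_distinct_kmers(1e9, cov, 150, 85, 0.005))
```

In this model the genomic term stops growing once the genome is well covered, so any further growth in distinct k-mers comes from sequencing errors.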

You can determine the number of distinct k-mers in a data set by using a k-mer counter tool. I recommend "ntCard" from our own lab because it is quite fast.
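
For a quick sanity check on a very small subset, distinct k-mers can also be counted exactly in plain Python. This is only an illustrative sketch and no substitute for a dedicated counter on real data, since it holds every k-mer in memory. It assumes that a k-mer and its reverse complement count as one (canonical k-mers) and that k-mers containing non-ACGT characters are skipped; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_VALID = set("ACGT")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def count_distinct_kmers(reads, k):
    """Exact count of distinct canonical k-mers in an iterable of read strings."""
    seen = set()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-mers with ambiguous bases such as N.
            if set(kmer) <= _VALID:
                seen.add(canonical(kmer))
    return len(seen)


if __name__ == "__main__":
    print(count_distinct_kmers(["ACGTAC", "GTACGT"], 3))
```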

ABySS does not currently have a feature to estimate memory requirements before running the assembly, unfortunately. It is mostly a trial-and-error affair at the moment.

If you find that you do not have adequate RAM to assemble your target genome, the ABySS Bloom filter assembly mode is worth a look (see the README).

modified 2.4 years ago • written 2.4 years ago by benv710

@benv

I started my full run, and with a 10-fold increase in input data I observe only a 2-fold increase in memory usage. So this seems to confirm our reasoning :-)

thx, L.

written 2.4 years ago by lieven.sterck8.0k

Perhaps even more informative: I went from 25 billion k-mers to 37 billion (k = 85), which indeed roughly corresponds to the 2-fold memory increase.

written 2.4 years ago by lieven.sterck8.0k