Question: compiling kmergenie 1.7038. Is it possible?
torsten wrote (9 months ago):

I have downloaded kmergenie 1.7038 and attempted to compile it on (1) Ubuntu 14.01, (2) a cluster which I think is based on SUSE Linux, and (3) Mac OS X (10.10.5). The compilation instructions are very simple ("make"), but the build has failed on all three platforms. The failures seem to occur while linking the bundled ntcard software. On Ubuntu, the long string of 'undefined reference' and 'access beyond end' errors concludes with:

/usr/bin/ld: ntcard-ntcard.o: access beyond end of merged section (36032) 
/usr/bin/ld: ntcard-ntcard.o(.debug_info+0x69e5): reloc against `.debug_str': error 2
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
make[2]: *** [ntcard] Error 1

On the surface, the errors differ from platform to platform. To check ntcard itself, I downloaded and compiled that application separately, which produced no errors.

I would be grateful for any suggestions of how to get this to compile. Thanks!

kmer kmergenie compilation • 560 views
written 9 months ago by torsten • modified 8 months ago by Rayan Chikhi

I don't believe that KmerGenie has a solid theoretical ground for its claims.

KmerGenie estimates the best k-mer length for genome de novo assembly. Given a set of reads, KmerGenie first computes the k-mer abundance histogram for many values of k. Then, for each value of k, it predicts the number of distinct genomic k-mers in the dataset, and returns the k-mer length which maximizes this number. Experiments show that KmerGenie's choices lead to assemblies that are close to the best possible over all k-mer lengths.

Why? I don't know. It doesn't make any sense to me.

However, BBMap has a tool called TadWrapper that will rapidly do assemblies at various kmer lengths and tell you which assembly actually had the best contiguity. You can use it like this:

tadwrapper.sh in=reads.fq out=contigs_k%.fa k=31,62,93,124 bisect expand

Will that tell you the exact optimal kmer length for the assembler that you eventually plan to use? No, that's impossible; the only way to do that is to assemble at multiple kmer lengths with the actual assembler you will use. But it will give you a very close approximation, since it actually does an assembly at each of those kmer lengths.

If you do want to follow KmerGenie's approach and find out which kmer length yields the maximal number of unique kmers, you can do that with BBMap's "kmercountmulti.sh" tool, which is extremely fast. But I don't recommend that.
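For illustration only, the final selection step of that approach (pick the k with the most distinct kmers) can be sketched in shell. The two-column "k count" input format and the function name best_k are assumptions made for this sketch; this is not the actual output format of kmercountmulti.sh, and it does not reproduce KmerGenie's statistical model for estimating genomic kmers.

```shell
#!/bin/sh
# Hypothetical helper: read lines of "k distinct_kmer_count" on stdin
# and print the k whose count is largest (the first one wins on ties).
# The input format is an assumption, not the output of any real tool.
best_k() {
    awk 'NR == 1 || $2 + 0 > max { max = $2 + 0; k = $1 }
         END { if (NR > 0) print k }'
}
```

You would feed it a small table that you assemble yourself from whatever counts you have, for example `best_k < counts.txt`, where counts.txt is a placeholder file name.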

BBMap is already compiled, so you just unzip it and it will work as long as you have Java installed.

written 9 months ago by Brian Bushnell • modified 9 months ago

Thank you for your suggestions, Brian. I will definitely look into it. /T

written 9 months ago by torsten

Hi Brian,

The theoretical foundations of kmergenie can be found in Section 2 of our article. Please feel free to email us if any detail there was unclear. The article was published in 2013, but I continue to believe that its theoretical grounds still hold for past and current Illumina single-k genome assemblies ;)

Rayan

written 8 months ago by Rayan Chikhi • modified 8 months ago
Wede wrote (9 months ago):

Hi, I had the same problem. Remove the ntcard directory inside the kmergenie directory, then make a git clone of ntCard:

git clone https://github.com/bcgsc/ntCard.git

Then, in ntCard/:

./autogen.sh
./configure
make

Retry 'make' in kmergenie/
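
The steps above could be wrapped into one small script, roughly as sketched here. The function names and the DRY_RUN switch are made up for this sketch, and it assumes the clone is made inside the kmergenie directory. Whether the kmergenie makefile picks up a checkout named ntCard rather than ntcard is not something this thread confirms, so treat it as a starting point rather than a tested recipe.

```shell
#!/bin/sh
# Sketch of the workaround: swap the bundled ntcard for a fresh upstream
# checkout, build it, then retry the kmergenie build.
# Assumptions: helper names, the DRY_RUN switch, and cloning inside the
# kmergenie source directory given as the first argument.

# Print a command, then execute it unless DRY_RUN is non-empty.
run() {
    printf '+ %s\n' "$*"
    if [ -z "${DRY_RUN:-}" ]; then
        "$@"
    fi
}

rebuild_with_fresh_ntcard() {
    (
        cd "$1" || exit 1
        # Remove the bundled copy that fails to link.
        run rm -rf ntcard || exit 1
        run git clone https://github.com/bcgsc/ntCard.git || exit 1
        # Build ntCard inside its own checkout.
        run sh -c 'cd ntCard && ./autogen.sh && ./configure && make' || exit 1
        # Retry the top-level kmergenie build.
        run make
    )
}
```

Call it with the path to your unpacked kmergenie source tree, e.g. `rebuild_with_fresh_ntcard /path/to/kmergenie`. Setting DRY_RUN=1 first only prints the commands without running them.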

written 9 months ago by Wede
Rayan Chikhi (France, Lille, CNRS) wrote (8 months ago):

Hi, kmergenie has been updated to version 1.7039, which hopefully resolves this compilation issue. Please let me know if it doesn't (kmergenie@cse.psu.edu).

written 8 months ago by Rayan Chikhi • modified 8 months ago