Question: kmergenie generates histo files but no histograms!
3.0 years ago by
abujamel.t0
Canada
abujamel.t0 wrote:

I am using kmergenie version 1.6473 to estimate the optimal k-mer size for my assembly. The program runs without errors, but it only generates .histo files and no histogram plots.

Here is the command I used:

kmergenie unligned_R1.fastq unligned_R2.fastq -o kmer -s 2 -l 13 -t 32

 

Linear estimation: ~51 M distinct 71-mers are in the reads
K-mer sampling: 1/15
| processing                                                                                         |
[going to estimate histograms for values of k: 121 119 117 115 113 111 109 107 105 103 101 99 97 95 93 91 89 87 85 83 81 79 77 75 73 71 69 67 65 63 61 59 57 55 53 51 49 47 45 43 41 39 37 35 33 31 29 27 25 23 21 19 17 15 13
-----------------------------------------------------------------------------------------------------------------------Total time Wallclock  628.354 s

The result was only a list of .histo files, from kmer-k13.histo to kmer-k121.histo, but no graphs!

Could anyone help me solve this problem?

Cheers,

TJ

kmergenie histogram • 2.0k views
ADD COMMENTlink modified 2.5 years ago by marcela.uliano30 • written 3.0 years ago by abujamel.t0

Hi, can you post the list of files that it generated, as well as the first few lines of one of the .histo files?

Also, did it output any error in stderr?

ADD REPLYlink written 3.0 years ago by Rayan Chikhi1.2k

Hi Rayan,

The file list is as follows:

kmer-k13.histo
kmer-k15.histo
kmer-k17.histo
kmer-k19.histo
kmer-k21.histo
kmer-k23.histo
kmer-k25.histo
kmer-k27.histo
kmer-k29.histo
kmer-k31.histo
kmer-k33.histo
kmer-k35.histo
kmer-k37.histo
kmer-k39.histo
kmer-k41.histo
kmer-k43.histo
kmer-k45.histo
kmer-k47.histo
kmer-k49.histo
kmer-k51.histo
kmer-k53.histo
kmer-k55.histo
kmer-k57.histo
kmer-k59.histo
kmer-k61.histo
kmer-k63.histo
kmer-k65.histo
kmer-k67.histo
kmer-k69.histo
kmer-k71.histo
kmer-k73.histo
kmer-k75.histo
kmer-k77.histo
kmer-k79.histo
kmer-k81.histo
kmer-k83.histo
kmer-k85.histo
kmer-k87.histo
kmer-k89.histo
kmer-k91.histo
kmer-k93.histo
kmer-k95.histo
kmer-k97.histo
kmer-k99.histo
kmer-k101.histo
kmer-k103.histo
kmer-k105.histo
kmer-k107.histo
kmer-k109.histo
kmer-k111.histo
kmer-k113.histo
kmer-k115.histo
kmer-k117.histo
kmer-k119.histo
kmer-k121.histo

 

kmer-k121.histo shows:

1    1863
2    0
3    0
4    0
5    0
6    0
7    0
8    0
9    0
10    0
11    0
12    0
13    0
14    0
15    0
16    0
17    0

There was no error.

TJ

ADD REPLYlink written 3.0 years ago by abujamel.t0

Thanks. Is this an actual sequencing dataset? There shouldn't be so few k-mers that appear once, twice, three times, etc. (that is what the .histo files record: for each abundance, the number of k-mers seen that many times).

Or is this a simulated dataset? Kmergenie only works with real data, or with data simulated according to a realistic sequencing scenario.
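
To check a .histo file for this symptom, a minimal Python sketch follows. It assumes the format seen in the kmer-k121.histo excerpt above: two whitespace-separated integer columns, abundance and number of k-mers with that abundance. The function names are illustrative and not part of kmergenie.

```python
def parse_histo(text):
    """Parse .histo text into a list of (abundance, count) pairs."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        rows.append((int(fields[0]), int(fields[1])))
    return rows


def summarize(rows):
    """Return (total k-mers counted, number of abundance bins with a non-zero count)."""
    total = sum(count for _, count in rows)
    nonzero_bins = sum(1 for _, count in rows if count > 0)
    return total, nonzero_bins


# First lines of the kmer-k121.histo excerpt quoted in this thread.
sample = "1    1863\n2    0\n3    0\n"
print(summarize(parse_histo(sample)))
```

A histogram in which only the abundance-1 bin is populated, as in that excerpt, is the "too few k-mers" pattern described here.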

ADD REPLYlink modified 3.0 years ago • written 3.0 years ago by Rayan Chikhi1.2k

It is an actual metagenomic sequencing dataset, generated on an Illumina HiSeq 2500.

TJ

 

ADD REPLYlink written 3.0 years ago by abujamel.t0

Thanks. Histogram generation should work on that data. (Note, however, that Kmergenie is not designed for metagenomic data).

Let's see if the fault comes from your system or from that specific dataset. Could you try a simple (small, for quick execution) genomic dataset, using the same parameters "-o kmer -s 2 -l 13 -t 32", and see if kmergenie completes successfully on your system?

ADD REPLYlink written 3.0 years ago by Rayan Chikhi1.2k

Thanks Rayan for your reply,

I tried sample paired-end FASTQ files with 750K reads and a read length of 36. Same problem.

TJ

ADD REPLYlink written 3.0 years ago by abujamel.t0

Alright, so something is wrong with either the system or the software. I checked that command line; it works fine on my computer with a sample dataset.

What is the output of the "make check" command in the kmergenie folder? Does it print an error?

What's your system? (desktop, cluster?)

Could you try running kmergenie without any option except "-o kmer" on that sample data?

ADD REPLYlink written 3.0 years ago by Rayan Chikhi1.2k

This is the output of "make check"

scripts/test_install
Testing presence of specialk....
OK
Testing presence of Rscript....
R scripting front-end version 3.1.2 (2014-10-31)
OK
Testing basic Rscript functionality....
Rscript --no-init-file -e 'rnorm(1)'
[1] "rnorm(1)"
OK
Testing a simple KmerGenie example....
initial estimate of genomic kmers gaussian mean, sd, error proportion, shape: 3 1.4826 0.9596929 0
p$u.v: 4.114784
abundance    ratio_of_erroneous_over_correct_kmers
1   23228317
2   0.000002223243
3   0.000000007846881
4   0.0000000008400604
cutoff: 1
sum probs good 1.001097cutoff 1
non-repeated genomic distinct kmers:  42
repeated genomic distinct kmers:  0
sum of absolute differences of fit: 2.145231
42
Test successful if the number 42 was printed the line above. KmerGenie is ready, type `./kmergenie`.

My system is a Dell workstation running Biolinux 8 (Ubuntu 14.04)

I tried running the command as you specified. The same thing happened, but it tested only two k values, 21 and 31 (only .histo files, no graphs).

TJ

 

ADD REPLYlink written 3.0 years ago by abujamel.t0

Thanks for this information. Your install looks normal; I'm puzzled.

One more debugging step I can think of: what is the output of the following command, executed in the folder where your .histo files reside (replace XXX with the path to the kmergenie folder)?

XXX/scripts/decide kmer

(kmer is the string given to the -o parameter)

This should run the second phase of Kmergenie manually and print some debug information in stdout.
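
The same manual step can be wrapped in a script so that the full stdout and stderr are captured for pasting back here. This is a rough sketch only: the `scripts/decide <prefix>` invocation comes from the command above, while the function names, the `<prefix>-k*.histo` file pattern (inferred from the file list in this thread) and the example install path are assumptions.

```python
import glob
import os
import subprocess


def build_decide_command(kmergenie_dir, prefix):
    """Return the argv list for the manual second phase: <dir>/scripts/decide <prefix>."""
    script = os.path.join(os.path.abspath(kmergenie_dir), "scripts", "decide")
    return [script, prefix]


def run_decide(kmergenie_dir, prefix, workdir="."):
    """Run scripts/decide in the folder holding the .histo files; return (code, stdout, stderr)."""
    # Assumed naming scheme: kmer-k13.histo, kmer-k15.histo, ... for prefix "kmer".
    histo_files = glob.glob(os.path.join(workdir, prefix + "-k*.histo"))
    if not histo_files:
        raise FileNotFoundError(f"no {prefix}-k*.histo files found in {workdir}")
    result = subprocess.run(
        build_decide_command(kmergenie_dir, prefix),
        cwd=workdir,          # decide is run from the folder with the .histo files
        capture_output=True,  # keep stdout and stderr for the bug report
        text=True,
    )
    return result.returncode, result.stdout, result.stderr
```

For example, `run_decide("/path/to/kmergenie", "kmer")` (hypothetical install path) returns the exit code together with both output streams.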

ADD REPLYlink modified 3.0 years ago • written 3.0 years ago by Rayan Chikhi1.2k

Hi Rayan,

It worked! It generated the graphs from the .histo files.

So what do you think the issue is?

TJ

ADD REPLYlink written 3.0 years ago by abujamel.t0

For some reason the "decide" script doesn't seem to be getting executed. That is very strange, as the histogram creation program is correctly executed, and both programs are called the same way in the kmergenie main program. To be honest, I do not know why this happens, and I haven't seen it with any other user. Can you please paste the full output of the decide command? Maybe I'll find something unusual.

ADD REPLYlink modified 3.0 years ago • written 3.0 years ago by Rayan Chikhi1.2k
2.8 years ago by
mtollis0
United States
mtollis0 wrote:

Was this ever resolved? I have the same issue, thanks.

ADD COMMENTlink written 2.8 years ago by mtollis0

I haven't heard back from TJ. Could you please execute the following command, and paste the full output, so that I may gain some insight into the problem?

[path_to_kmergenie]/scripts/decide [prefix]

where [prefix] is the string given to the -o parameter.

ADD REPLYlink modified 2.8 years ago • written 2.8 years ago by Rayan Chikhi1.2k

I get a list of .histo files but no PDFs when I disown kmergenie and log out (it also does not proceed to the second round). When I only disown it and keep the terminal alive, it runs fine. I don't know if this provides any insight. Cheers, John
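
If the second phase only fails once the terminal is gone, one thing to try is launching the run fully detached from the terminal, with all output sent to a log file. This is untested against this problem, and nothing in this thread confirms it as a fix. A minimal Python sketch; `launch_detached`, `reads.fastq` and `kmergenie.log` are hypothetical names:

```python
import subprocess


def launch_detached(cmd, log_path):
    """Start cmd in its own session with stdin/stdout/stderr detached from the terminal."""
    log = open(log_path, "w")
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,   # never read from the terminal
        stdout=log,                 # stdout goes to the log file
        stderr=subprocess.STDOUT,   # stderr goes to the same log file
        start_new_session=True,     # POSIX: run in a new session (setsid)
    )
    log.close()  # the child process keeps its own copy of the file descriptor
    return proc


# Hypothetical usage:
# launch_detached(["kmergenie", "reads.fastq", "-o", "kmer"], "kmergenie.log")
```

The log file then shows whether the run reached the decide step after logout, which would at least narrow down where it stops.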

ADD REPLYlink written 15 months ago by John@brc0
2.5 years ago by
European Union
marcela.uliano30 wrote:

Hi Rayan! You are very attentive, thank you.

I got a similar issue. The program ran fine and made the .histo files, PDFs and everything, but could not predict the best k.

So I ran ./decide as you said and it worked. Interesting!

The message:

could not fit kmerGenie.histograms-k91.histo
table of predicted num. of genomic k-mers: kmerGenie.histograms.dat
recommended coverage cut-off for best k: 1
best k: 61

 

That's it! Thank you!
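
The summary lines printed by decide, as quoted in the message above, can also be extracted programmatically, for example to pass the best k on to an assembler. A minimal sketch, assuming the line formats are exactly as quoted here (other kmergenie versions may print them differently); `parse_decide_output` is an illustrative name:

```python
import re


def parse_decide_output(text):
    """Extract best k, the recommended cut-off and any unfitted histograms from decide's output."""
    best_k = re.search(r"^best k:\s*(\d+)", text, flags=re.MULTILINE)
    cutoff = re.search(
        r"^recommended coverage cut-off for best k:\s*(\d+)", text, flags=re.MULTILINE
    )
    unfit = re.findall(r"^could not fit (\S+)", text, flags=re.MULTILINE)
    return {
        "best_k": int(best_k.group(1)) if best_k else None,
        "cutoff": int(cutoff.group(1)) if cutoff else None,
        "could_not_fit": unfit,
    }


message = (
    "could not fit kmerGenie.histograms-k91.histo\n"
    "table of predicted num. of genomic k-mers: kmerGenie.histograms.dat\n"
    "recommended coverage cut-off for best k: 1\n"
    "best k: 61\n"
)
print(parse_decide_output(message))
```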

 

 

 

 

ADD COMMENTlink written 2.5 years ago by marcela.uliano30