Question: kmergenie [OSError: [Errno 2] No such file or directory]
wukai199010 (United States) wrote, 3.2 years ago:

Hi,

When I run the command: kmergenie list_files.txt -t 16

The contents of list_files.txt are:

/share/solid/kwu/my_projects/ipag_pj029/data/raw_data/ipagpj029hmc002.1_1.fastq
/share/solid/kwu/my_projects/ipag_pj029/data/raw_data/ipagpj029hmc002.1_2.fastq
/share/solid/kwu/my_projects/ipag_pj029/data/raw_data/ipagpj029hmc002.2_1.fastq
/share/solid/kwu/my_projects/ipag_pj029/data/raw_data/ipagpj029hmc002.2_2.fastq

 

I get the following error:

 

Caught exception in fit_histogram worker thread (histfile = histograms-k81.histo):
Traceback (most recent call last):
  File "/share/work/lhuang/my_apps/kmergenie-1.6741/scripts/decide", line 62, in fit_histogram
    rc, stdout, stderr = run(command)
  File "/share/work/lhuang/my_apps/kmergenie-1.6741/scripts/decide", line 43, in run
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/home/kwu/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/home/kwu/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Caught exception in fit_histogram worker thread (histfile = histograms-k71.histo):
Traceback (most recent call last):
  File "/share/work/lhuang/my_apps/kmergenie-1.6741/scripts/decide", line 62, in fit_histogram
    rc, stdout, stderr = run(command)
  File "/share/work/lhuang/my_apps/kmergenie-1.6741/scripts/decide", line 43, in run
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/home/kwu/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/home/kwu/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Caught exception in fit_histogram worker thread (histfile = histograms-k41.histo):
Traceback (most recent call last):
  File "/share/work/lhuang/my_apps/kmergenie-1.6741/scripts/decide", line 62, in fit_histogram
    rc, stdout, stderr = run(command)
  File "/share/work/lhuang/my_apps/kmergenie-1.6741/scripts/decide", line 43, in run
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/home/kwu/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/home/kwu/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Caught exception in fit_histogram worker thread (histfile = histograms-k61.histo):
Traceback (most recent call last):
  File "/share/work/lhuang/my_apps/kmergenie-1.6741/scripts/decide", line 62, in fit_histogram
    rc, stdout, stderr = run(command)
  File "/share/work/lhuang/my_apps/kmergenie-1.6741/scripts/decide", line 43, in run
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/home/kwu/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/home/kwu/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
Execution of decide failed (return code 1)

 

I can't work out the cause from the error message. Can you tell me what is wrong?

Also, can kmergenie be used to estimate heterozygosity?

All comments are welcome. Thanks.

 


Hi, can you post the output of the "make check" command please?

written 3.2 years ago by Rayan Chikhi

Nice, Thank you!

written 3.2 years ago by wukai199010

Rayan Chikhi (France, Lille, CNRS) wrote, 3.2 years ago:

Looks like the "make check" test helped you solve it ;)

Also, kmergenie cannot directly estimate heterozygosity precisely, although looking at kmer histograms for low values of k can visually show you the proportion of heterozygous kmers vs homozygous ones.

written 3.2 years ago by Rayan Chikhi

OK, thanks! That was very helpful.

Yes, when I ran "make check", I found that the cause was that my R version is too old.
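For anyone who hits the same traceback: an OSError with [Errno 2] raised from inside subprocess.Popen usually means that the program being launched could not be found, not that an input file is missing. The sketch below reproduces that failure mode and checks for a program on PATH before running it. It is written for Python 3 (the traceback above comes from Python 2.7), the helper names find_executable and run_or_none are made up for illustration, and the idea that the fitting step launches Rscript is an assumption based on the R version remark, not something confirmed from the kmergenie source.

```python
import errno
import shutil
import subprocess


def find_executable(name):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def run_or_none(command):
    """Run `command` (a list of arguments).

    Returns (returncode, stdout, stderr), or None when the executable itself
    cannot be found, which is the [Errno 2] case seen in the traceback.
    """
    try:
        process = subprocess.Popen(
            command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            return None  # the program to launch does not exist
        raise
    stdout, stderr = process.communicate()
    return process.returncode, stdout, stderr


if __name__ == "__main__":
    # "Rscript" is an assumed dependency here; substitute the tool you expect.
    print("Rscript found at:", find_executable("Rscript"))
```

If find_executable returns None for a tool the pipeline depends on, fixing the installation or PATH is the first thing to try; running "make check", as suggested above, tests the whole setup.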

To estimate heterozygosity, I am using BGI's "gce".

But today, when I ran kmergenie again with the "--diploid" option, it did not work; the result was "Could not predict a best k value".

Also, the genome size estimates from kmergenie and from BGI's "SOAPec" (using the best k predicted by kmergenie) are very different, by about a factor of 3.

written 3.2 years ago by wukai199010 (modified 3.2 years ago)

Please post the Kmergenie HTML report if possible (either the one with --diploid or without; it doesn't matter much) and I can take a look at it.

written 3.2 years ago by Rayan Chikhi

Sorry, I don't know how to post the HTML result file to you. Can I send it by e-mail?

written 3.2 years ago by wukai199010

Yes sure, kmergenie@cse.psu.edu

written 3.2 years ago by Rayan Chikhi (modified 3.2 years ago)

Thanks very much for sending me the histograms.

Based on them, it appears that the dataset is quite low-coverage (below 30x), and therefore any assembly of it will be of poor quality. I recommend that you sequence more data.
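As a rough illustration of what "30x" means, sequencing depth can be approximated as total sequenced bases divided by genome size, and the expected kmer coverage shrinks as k approaches the read length. The sketch below uses these standard approximations; the function names and all the numbers in the example are hypothetical and are not taken from this dataset.

```python
def read_coverage(num_reads, read_length, genome_size):
    """Approximate sequencing depth: total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def kmer_coverage(coverage, read_length, k):
    """Expected kmer coverage for a given base coverage.

    A read of length L contains L - k + 1 kmers, so kmer coverage is the
    base coverage scaled by (L - k + 1) / L.
    """
    return coverage * (read_length - k + 1) / read_length


if __name__ == "__main__":
    # Hypothetical example: 30 million reads of 100 bp on a 100 Mbp genome.
    depth = read_coverage(30_000_000, 100, 100_000_000)
    print("base coverage:", depth)
    for k in (21, 41, 61, 81):
        print("k =", k, "kmer coverage:", kmer_coverage(depth, 100, k))
```

This is one reason the genomic peak moves toward the error kmers at large k: with 100 bp reads, kmer coverage at k = 81 is only a fifth of the base coverage.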

Using the kmergenie reports, it is easy to detect low coverage: the separation between erroneous kmers and genomic kmers is almost nonexistent in all histograms, even for low k values (see the Kmergenie paper for histograms with a clear separation). Also, any peak of heterozygous kmers would be undetectable in this situation, hence it is not surprising that kmergenie with the --diploid option failed to fit the histograms.

Kmergenie was still able to fit the model (red curve) to the haploid histograms, and the fit appears reasonable. The estimated assembly size should be more or less accurate; however, please note that it is not the genome size. It is the size of an assembly (with repeated and heterozygous regions collapsed into single regions), and is thus lower than the true genome size. It's easy to verify this prediction by assembling the dataset with any assembler.
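The separation described above can also be checked programmatically. The sketch below is a simple heuristic, not kmergenie's own fitting method: it looks for a valley after the initial slope of error kmers and a peak after that valley. It assumes the histogram is already loaded as a list where counts[i] is the number of distinct kmers seen i + 1 times; the function name and the example numbers are hypothetical, and real histograms are noisy enough that some smoothing would probably be needed first.

```python
def find_valley_and_peak(counts):
    """Locate the error/genomic separation in a kmer abundance histogram.

    counts[i] is the number of distinct kmers observed (i + 1) times.
    Returns (valley_abundance, peak_abundance), or None when the histogram
    never rises again after the initial error slope (no visible separation).
    """
    # Walk down the initial slope of low-abundance (mostly erroneous) kmers.
    i = 0
    while i + 1 < len(counts) and counts[i + 1] < counts[i]:
        i += 1
    if i + 1 >= len(counts):
        return None  # monotonically decreasing: no genomic peak
    valley = i
    # Take the highest bin at or after the valley as the genomic peak.
    peak = max(range(valley, len(counts)), key=lambda j: counts[j])
    if peak == valley:
        return None  # flat after the valley: still no distinct peak
    return valley + 1, peak + 1


if __name__ == "__main__":
    well_covered = [1000, 200, 50, 20, 40, 90, 150, 120, 60, 10]
    low_coverage = [1000, 500, 200, 100, 50]
    print(find_valley_and_peak(well_covered))
    print(find_valley_and_peak(low_coverage))
```

A None result corresponds to the situation described here, where erroneous and genomic kmers cannot be told apart in the histogram.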

written 3.2 years ago by Rayan Chikhi (modified 3.2 years ago)