Question

Execution of decide failed (return code 0)

10.7 years ago

Anaïs Vittu • 0

Hello,

I am trying to run kmergenie on paired-end reads. So I created a file listing the 2 read files and ran kmergenie as shown below.

anais$ ls -l *.fq
-rw-r--r--  1 anais  staff  326796250 22 oct 11:58 C0CYDACXX_1.fq
-rw-r--r--  1 anais  staff  326796250 22 oct 11:58 C0CYDACXX_2.fq

anais$ ls -l *.fq > filesToAssemble.txt

anais$ kmergenie filesToAssemble.txt
running histogram estimation
File filesToAssemble.txt starts with character "f", hence is interpreted as a list of file names
Reading 0 read files
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

KmerGenie correctly interpreted my input as a list of file names, but it was then unable to read the 2 files.

So I decided to pass one FASTQ file directly, and it did not recognize it as a FASTQ file.

anais$ kmergenie C0CYDACXX_1.fq
running histogram estimation
File C0CYDACXX_1.fq starts with character "C", hence is interpreted as a list of file names
Reading 0 read files
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

Does anyone have a suggestion about what is going on?

The make check was successful.

anais$ make check
scripts/test_install
Testing presence of specialk....
OK
Testing presence of Rscript....
R scripting front-end version 2.15.1 (2012-06-22)
OK
Testing basic Rscript functionality....
Rscript --no-init-file -e 'rnorm(1)'
[1] "rnorm(1)"
OK
Testing a simple KmerGenie example....
initial estimate of genomic kmers gaussian mean, sd, error proportion, shape: 3 1.4826 0.9596929 0
p$u.v: 4.114784
abundance    ratio_of_erroneous_over_correct_kmers
1   23228317
2   0.000002223243
3   0.000000007846881
4   0.0000000008400604
cutoff: 1
sum probs good 1.001097cutoff 1
non-repeated genomic distinct kmers:  42
repeated genomic distinct kmers:  0
sum of absolute differences of fit: 2.145231
42
Test successful if the number 42 was printed the line above. KmerGenie is ready, type `./kmergenie`.

Thank you for helping me!

Anaïs

KmerGenie • 3.9k views

updated 3.4 years ago by Ram • written 10.7 years ago by Anaïs Vittu

Ram · Answer 1 · 2014-11-05

10.7 years ago

Rayan Chikhi ★ 1.6k

Hello,

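ls -l (lowercase L) won't work to create a list of files, because it writes the long listing (permissions, owner, size, date) on every line. Try ls -1 (the digit one), which prints one file name per line (I know, it's subtle!)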
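To make the difference concrete, here is a minimal sketch that can be run in a scratch directory. The file names reads_1.fq and reads_2.fq are hypothetical stand-ins, and the printf variant is only an alternative that avoids ls; it is not something KmerGenie requires.

```shell
# Work in a scratch directory with two empty stand-in files (hypothetical names).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch reads_1.fq reads_2.fq

# Long format: each line carries permissions, owner, size and date before the name.
ls -l *.fq > long_listing.txt

# One name per line: this is the shape a list of file names needs.
ls -1 *.fq > filesToAssemble.txt

# printf with a glob gives the same result without involving ls at all.
printf '%s\n' *.fq > filesToAssemble_printf.txt

cat filesToAssemble.txt
```

Comparing long_listing.txt with filesToAssemble.txt shows why the first one cannot be read as a list of paths.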

Regarding the other issue, when inputting a file directly, can you paste the first few lines of C0CYDACXX_1.fq?

Kmergenie seems to think that the first character is C, and not >/@ as expected in fasta/fastq
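A quick way to check what the first character of a file really is, sketched below. The helper first_char_kind and the sample files are hypothetical and only illustrate the >/@ convention mentioned above; this is not KmerGenie's actual detection code.

```shell
# Hypothetical helper: classify a file from its first byte.
first_char_kind() {
    first=$(head -c 1 "$1")
    case "$first" in
        "@") echo "fastq" ;;
        ">") echo "fasta" ;;
        *)   echo "other (first character: '$first')" ;;
    esac
}

# Demo on small stand-in files.
tmp=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' > "$tmp/sample.fq"
printf '>seq1\nACGT\n' > "$tmp/sample.fa"
printf 'sample.fq\n' > "$tmp/list.txt"

first_char_kind "$tmp/sample.fq"
first_char_kind "$tmp/sample.fa"
first_char_kind "$tmp/list.txt"
```

On a real FASTQ file, running head -c 1 on it should print @.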

updated 3.4 years ago by Ram • written 10.7 years ago by Rayan Chikhi

OK, they are very similar!

anais$ ls -1 *.fq
C0CYDACXX_1.fq
C0CYDACXX_2.fq

anais$ ls -1 *.fq > filesToAssemble.txt

anais$ kmergenie filesToAssemble.txt
running histogram estimation
File filesToAssemble.txt starts with character "f", hence is interpreted as a list of file names
Reading 0 read files
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

anais$ head C0CYDACXX_1.fq
@HWI-ST999:54:C0CYDACXX:3:1101:1110:2025 1:N:0:
NGGTAACATATTCCTACGGAAATGTTTCTAGATTATTTTGCACCTTTTTAGTAAAGTGCAATGAAATCGTTCCGATGTGCTATTATGCTAATTTTGTCATGGAG
+
#1:A:B?BFBBFCFFFFF:GIAHDFFHIEEF4CBBFFEI>GE3???FFI@@G?@<*??D<8B=@FEFGFIEFEEA?4;;>?BDD@@@@;AAADDBA;@ABB@A<
@HWI-ST999:54:C0CYDACXX:3:1101:1150:2032 1:N:0:
NAGTAAGTAATGTAAGCCATAAAAAACAACAGTAGTGGTAGGCTTCTGATGTACAGCTGTGCGGTTAATCACCACAAAAAGCTATTGCTTCTTTCTTATCTCAA
+
#1=DDFFDHGHHHJJJJJIJJIJJJJJJJJJJJJEGGIGGIJJGIJIGCIIGGHIIIIJJJJJIHIHHHHHFFFFACCDDDDDDDDCDDDDDDDDDDDDDDDDC
@HWI-ST999:54:C0CYDACXX:3:1101:1203:2036 1:N:0:
NAGCAATAAATAACGACTTACATGATCGATTTGATTTACACCAAAATGTTCACCTTCCATTTCCGTTCAGCTCTAAGGTTTATGACCCTTCATTCACTTTATTC

The first character is a "@" and not a ">"; I think that is why.

updated 3.4 years ago by Ram • written 10.7 years ago by Anaïs Vittu

Thanks for the follow-up. This is strange. Can you please paste the output of the commands uname -a and cat /etc/*-release?

updated 3.4 years ago by Ram • written 10.7 years ago by Rayan Chikhi

anais$ uname -a
Darwin upr9022-127-Anais.local 13.4.0 Darwin Kernel Version 13.4.0: Sun Aug 17 19:50:11 PDT 2014; root:xnu-2422.115.4~1/RELEASE_X86_64 x86_64
anais$ cat /etc/*-release
cat: /etc/*-release: No such file or directory

The version of KmerGenie that I am using is 1.6741.
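Since /etc/*-release does not exist on macOS, a small sketch like the following reports the OS details on either system. The function name os_info is hypothetical; sw_vers is the standard macOS command for the product name and version.

```shell
# Hypothetical helper: print kernel and OS release details on Linux or macOS.
os_info() {
    uname -a
    if ls /etc/*-release >/dev/null 2>&1; then
        cat /etc/*-release      # most Linux distributions
    elif command -v sw_vers >/dev/null 2>&1; then
        sw_vers                 # macOS: product name, version, build
    else
        uname -sr               # generic fallback
    fi
}

os_info
```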

updated 3.4 years ago by Ram • written 10.7 years ago by Anaïs Vittu

Thanks. Please give me some time, I'll try on a Mac machine to see if I can somehow reproduce the problem.

updated 3.4 years ago by Ram • written 10.7 years ago by Rayan Chikhi

Hello,

I tried on a Linux machine.

$ ls -1 *.fq > filesForKmergenie.txt

$ kmergenie filesForKmergenie.txt
running histogram estimation
File filesForKmergenie.txt starts with character "C", hence is interpreted as a list of file names
Reading 3 read files
error opening file:
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

It did not work. It found 3 files in my list, but there are only 2.

I tried the FASTQ file directly and it worked!

$ kmergenie C0CYDACXX_1.fq
running histogram estimation
Linear estimation: ~53 M distinct 61-mers are in the reads
K-mer sampling: 1/10
| processing                                                                                         |
[going to estimate histograms for values of k: 101 91 81 71 61 51 41 31 21
-----------------------------------------------------------------------------------------------------------------------Total time Wallclock  55.9048 s
fitting model to histograms to estimate best k
fitting histogram for k = 101
fitting histogram for k = 21
fitting histogram for k = 31
fitting histogram for k = 41
fitting histogram for k = 51
fitting histogram for k = 61
fitting histogram for k = 71
fitting histogram for k = 81
fitting histogram for k = 91
estimation of the best k so far: 21
refining estimation around [15; 27], with a step of 2
running histogram estimation
Linear estimation: ~95 M distinct 24-mers are in the reads
K-mer sampling: 1/18
| processing                                                                                         |
[going to estimate histograms for values of k: 27 25 23 21 19 17 15
-----------------------------------------------------------------------------------------------------------------------Total time Wallclock  44.1746 s
fitting model to histograms to estimate best k
fitting histogram for k = 101
fitting histogram for k = 15
fitting histogram for k = 17
fitting histogram for k = 19
fitting histogram for k = 21
fitting histogram for k = 23
fitting histogram for k = 25
fitting histogram for k = 27
fitting histogram for k = 31
fitting histogram for k = 41
fitting histogram for k = 51
fitting histogram for k = 61
fitting histogram for k = 71
fitting histogram for k = 81
fitting histogram for k = 91
table of predicted num. of genomic k-mers: histograms.dat
best k: 27

I really do not understand why.

I used kmergenie version 1.5579.

$ uname -a
Linux hpc-login 2.6.32-279.19.1.el6.x86_64 #1 SMP Tue Dec 18 17:22:54 CST 2012 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/*-release
Scientific Linux release 6.3 (Carbon)
Scientific Linux release 6.3 (Carbon)

updated 3.4 years ago by Ram • written 10.7 years ago by Anaïs Vittu

Nice that you have a Linux machine handy. Regarding that problem, it seems that you have 3 lines in the .txt file, one of them is empty I suppose? If so, that's the problem. Could it be that your shell adds an empty line after each command automatically?

Regardless, I'll work on input cleaning of .txt files.
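In the meantime, the empty-line case can be checked and cleaned by hand. This is a minimal sketch on a stand-in copy of the list; the .clean.txt output name is hypothetical.

```shell
tmp=$(mktemp -d)
# Stand-in list with a trailing empty line, as in the failing case.
printf 'C0CYDACXX_1.fq\nC0CYDACXX_2.fq\n\n' > "$tmp/filesForKmergenie.txt"

# Count all lines versus non-blank lines; a mismatch means blank lines are present.
total=$(wc -l < "$tmp/filesForKmergenie.txt" | tr -d ' ')
nonblank=$(grep -c '[^[:space:]]' "$tmp/filesForKmergenie.txt")
echo "total lines: $total, non-blank lines: $nonblank"

# Keep only lines that contain at least one non-whitespace character.
grep '[^[:space:]]' "$tmp/filesForKmergenie.txt" > "$tmp/filesForKmergenie.clean.txt"
cat "$tmp/filesForKmergenie.clean.txt"
```

If the two counts differ, passing the cleaned list to kmergenie instead of the original avoids the extra, empty file name.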

updated 3.4 years ago by Ram • written 10.7 years ago by Rayan Chikhi

Indeed, there was a "\n" in the third line. I removed it and Kmergenie worked well.

But that is on a Linux machine with version 1.5579. I will install the latest version, 1.6741, to try it.

In addition, I checked something else on my Mac:

anais$ kmergenie C0CYDACXX_1.fq
running histogram estimation
File C0CYDACXX_1.fq starts with character "C", hence is interpreted as a list of file names
Reading 0 read files
Invalid smallest (15) and largest kmer (1) sizes
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

anais$ cp C0CYDACXX_1.fq toto

anais$ kmergenie toto
running histogram estimation
File toto starts with character "t", hence is interpreted as a list of file names
Reading 0 read files
Invalid smallest (15) and largest kmer (1) sizes
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

anais$ cp C0CYDACXX_1.fq @_1.fq

anais$ kmergenie \@_1.fq
running histogram estimation
Linear estimation: ~53 M distinct 61-mers are in the reads
K-mer sampling: 1/10
| processing                                                                                         |
[going to estimate histograms for values of k: 101 91 81 71 61 51 41 31 21
-----------------------------------------------------------------------------------------------------------------------Total time Wallclock  56.8612 s
fitting model to histograms to estimate best k
fitting histogram for k = 101
fitting histogram for k = 21
fitting histogram for k = 31
fitting histogram for k = 41
fitting histogram for k = 51
fitting histogram for k = 61
fitting histogram for k = 71
fitting histogram for k = 81
fitting histogram for k = 91
estimation of the best k so far: 21
refining estimation around [15; 27], with a step of 2
running histogram estimation
Linear estimation: ~95 M distinct 24-mers are in the reads
K-mer sampling: 1/18
| processing                                                                                         |
[going to estimate histograms for values of k: 27 25 23 21 19 17 15
-----------------------------------------------------------------------------------------------------------------------Total time Wallclock  38.4895 s
fitting model to histograms to estimate best k
fitting histogram for k = 101
fitting histogram for k = 15
fitting histogram for k = 17
fitting histogram for k = 19
fitting histogram for k = 21
fitting histogram for k = 23
fitting histogram for k = 25
fitting histogram for k = 27
fitting histogram for k = 31
fitting histogram for k = 41
fitting histogram for k = 51
fitting histogram for k = 61
fitting histogram for k = 71
fitting histogram for k = 81
fitting histogram for k = 91
table of predicted num. of genomic k-mers: histograms.dat
best k: 27

OK, so when I change the name of my input file, it works. But not every new name works (for example, toto).

And when I created the list file with the new names of my 2 FASTQ files, it did not work. I obtained the same results with version 1.5579.
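In the Mac logs above, the character shown in the "starts with character" message matches the first character of the file name (C, t, f), while the content of the FASTQ file starts with @. The sketch below prints both characters side by side so they can be compared. The helper compare_first_chars and the stand-in files are hypothetical, and this is only a diagnostic, not a confirmed explanation of what KmerGenie does.

```shell
# Hypothetical diagnostic: first character of a file's name and of its content.
compare_first_chars() {
    name=$(basename "$1")
    name_char=$(printf '%s' "$name" | cut -c 1)
    content_char=$(head -c 1 "$1")
    echo "$name: name starts with '$name_char', content starts with '$content_char'"
}

# Stand-in files mimicking the renaming test (same content, different names).
tmp=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' > "$tmp/toto"
cp "$tmp/toto" "$tmp/@_1.fq"

compare_first_chars "$tmp/toto"
compare_first_chars "$tmp/@_1.fq"
```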

updated 3.4 years ago by Ram • written 10.7 years ago by Anaïs Vittu