Question: Execution of decide failed (return code 0)
Anaïs Vittu (European Union) wrote, 3.0 years ago:

Hello,

I am trying to run kmergenie on paired-end reads. So I created a file with the 2 files and ran kmergenie like below.

anais$ ls -l *.fq
-rw-r--r--  1 anais  staff  326796250 22 oct 11:58 C0CYDACXX_1.fq
-rw-r--r--  1 anais  staff  326796250 22 oct 11:58 C0CYDACXX_2.fq
anais$ ls -l *.fq > filesToAssemble.txt
anais$ kmergenie filesToAssemble.txt
running histogram estimation
File filesToAssemble.txt starts with character "f", hence is interpreted as a list of file names
Reading 0 read files
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

KmerGenie correctly interpreted my input as a list of file names, but it was then unable to read the 2 files.

So I decided to pass a single FASTQ file directly, and it was not recognized as a FASTQ file.

anais$ kmergenie C0CYDACXX_1.fq
running histogram estimation
File C0CYDACXX_1.fq starts with character "C", hence is interpreted as a list of file names
Reading 0 read files
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

Does anyone have a suggestion about what is going on?

The make check was successful.

anais$ make check
scripts/test_install
Testing presence of specialk....
OK
Testing presence of Rscript....
R scripting front-end version 2.15.1 (2012-06-22)
OK
Testing basic Rscript functionality....
Rscript --no-init-file -e 'rnorm(1)'
[1] "rnorm(1)"
OK
Testing a simple KmerGenie example....
initial estimate of genomic kmers gaussian mean, sd, error proportion, shape: 3 1.4826 0.9596929 0
p$u.v: 4.114784
abundance    ratio_of_erroneous_over_correct_kmers
1   23228317
2   0.000002223243
3   0.000000007846881
4   0.0000000008400604
cutoff: 1
sum probs good 1.001097cutoff 1
non-repeated genomic distinct kmers:  42
repeated genomic distinct kmers:  0
sum of absolute differences of fit: 2.145231
42
Test successful if the number 42 was printed the line above. KmerGenie is ready, type `./kmergenie`.

 

Thank you for helping me!

Anaïs

modified 3.0 years ago • written 3.0 years ago by Anaïs Vittu

Rayan Chikhi (France, Lille, CNRS) wrote, 3.0 years ago:

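Hello,

"ls -l" (lowercase letter L) won't work to create a list of files, because it prints a long listing with permissions, size and date on every line. Try "ls -1" (the digit one), which prints one file name per line. (I know, it's subtle!)

Regarding the other issue, when inputting a file directly, can you paste the first few lines of C0CYDACXX_1.fq?

KmerGenie seems to think that the first character is "C", and not ">" or "@" as expected in FASTA/FASTQ.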
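
For readers who want to see the difference between the two ls invocations mentioned above, here is a minimal sketch. It uses a scratch directory and empty stand-in files (both are assumptions for illustration, not the real reads) and does not call KmerGenie at all.

```shell
# Hypothetical scratch directory with two empty stand-in read files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch C0CYDACXX_1.fq C0CYDACXX_2.fq

# Letter l: long listing, each line starts with permissions, owner, size, date.
ls -l *.fq > long_listing.txt

# Digit 1: one bare file name per line, which is what a list file needs.
ls -1 *.fq > filesToAssemble.txt

cat filesToAssemble.txt
```

The long listing is not usable as a list of file names because each line begins with the permission string rather than a path.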

modified 3.0 years ago • written 3.0 years ago by Rayan Chikhi

Anaïs Vittu (European Union) wrote, 3.0 years ago:

OK, they are very similar!

anais$ ls -1 *.fq
C0CYDACXX_1.fq
C0CYDACXX_2.fq

anais$ ls -1 *.fq > filesToAssemble.txt

anais$ kmergenie filesToAssemble.txt
running histogram estimation
File filesToAssemble.txt starts with character "f", hence is interpreted as a list of file names
Reading 0 read files
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

 

anais$ head C0CYDACXX_1.fq
@HWI-ST999:54:C0CYDACXX:3:1101:1110:2025 1:N:0:
NGGTAACATATTCCTACGGAAATGTTTCTAGATTATTTTGCACCTTTTTAGTAAAGTGCAATGAAATCGTTCCGATGTGCTATTATGCTAATTTTGTCATGGAG
+
#1:A:B?BFBBFCFFFFF:GIAHDFFHIEEF4CBBFFEI>GE3???FFI@@G?@<*??D<8B=@FEFGFIEFEEA?4;;>?BDD@@@@;AAADDBA;@ABB@A<
@HWI-ST999:54:C0CYDACXX:3:1101:1150:2032 1:N:0:
NAGTAAGTAATGTAAGCCATAAAAAACAACAGTAGTGGTAGGCTTCTGATGTACAGCTGTGCGGTTAATCACCACAAAAAGCTATTGCTTCTTTCTTATCTCAA
+
#1=DDFFDHGHHHJJJJJIJJIJJJJJJJJJJJJEGGIGGIJJGIJIGCIIGGHIIIIJJJJJIHIHHHHHFFFFACCDDDDDDDDCDDDDDDDDDDDDDDDDC
@HWI-ST999:54:C0CYDACXX:3:1101:1203:2036 1:N:0:
NAGCAATAAATAACGACTTACATGATCGATTTGATTTACACCAAAATGTTCACCTTCCATTTCCGTTCAGCTCTAAGGTTTATGACCCTTCATTCACTTTATTC

The first character is a "@" and not a ">"; I think that is the reason.
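
As a rough illustration of the kind of first-character test discussed here, the sketch below classifies a file by its first byte. The function name guess_format and the stand-in files are hypothetical, and this is not KmerGenie's actual code, only an assumption about what such a check might look like.

```shell
# Hypothetical helper (not KmerGenie's code): classify a file by its first byte.
guess_format() {
    first=$(head -c 1 "$1")
    case "$first" in
        ">") echo "fasta" ;;
        "@") echo "fastq" ;;
        *)   echo "list of file names" ;;
    esac
}

# Stand-in inputs: a one-read FASTQ file and a one-line list file.
workdir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' > "$workdir/reads.fq"
printf 'reads.fq\n' > "$workdir/list.txt"

guess_format "$workdir/reads.fq"   # prints: fastq
guess_format "$workdir/list.txt"   # prints: list of file names
```

Running head -c 1 on the real file is a quick way to confirm what a tool should be seeing as the first character.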

written 3.0 years ago by Anaïs Vittu

Thanks for the follow-up. This is strange. Can you please paste the output of the commands "uname -a" and "cat /etc/*-release"?

modified 3.0 years ago • written 3.0 years ago by Rayan Chikhi
anais$ uname -a
Darwin upr9022-127-Anais.local 13.4.0 Darwin Kernel Version 13.4.0: Sun Aug 17 19:50:11 PDT 2014; root:xnu-2422.115.4~1/RELEASE_X86_64 x86_64
anais$ cat /etc/*-release
cat: /etc/*-release: No such file or directory

The version of KmerGenie that I am using is 1.6741.

written 3.0 years ago by Anaïs Vittu

Thanks. Please give me some time, I'll try on a Mac machine to see if I can somehow reproduce the problem.

modified 3.0 years ago • written 3.0 years ago by Rayan Chikhi

Hello,

I tried on a Linux machine.

$ ls -1 *.fq > filesForKmergenie.txt

$ kmergenie filesForKmergenie.txt
running histogram estimation
File filesForKmergenie.txt starts with character "C", hence is interpreted as a list of file names
Reading 3 read files
error opening file:
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

It did not work: it found 3 entries in my list file, but there are only 2 read files.

I tried the FASTQ file directly and it worked!

$ kmergenie C0CYDACXX_1.fq
running histogram estimation
Linear estimation: ~53 M distinct 61-mers are in the reads
K-mer sampling: 1/10
| processing                                                                                         |
[going to estimate histograms for values of k: 101 91 81 71 61 51 41 31 21
-----------------------------------------------------------------------------------------------------------------------Total time Wallclock  55.9048 s
fitting model to histograms to estimate best k
fitting histogram for k = 101
fitting histogram for k = 21
fitting histogram for k = 31
fitting histogram for k = 41
fitting histogram for k = 51
fitting histogram for k = 61
fitting histogram for k = 71
fitting histogram for k = 81
fitting histogram for k = 91
estimation of the best k so far: 21
refining estimation around [15; 27], with a step of 2
running histogram estimation
Linear estimation: ~95 M distinct 24-mers are in the reads
K-mer sampling: 1/18
| processing                                                                                         |
[going to estimate histograms for values of k: 27 25 23 21 19 17 15
-----------------------------------------------------------------------------------------------------------------------Total time Wallclock  44.1746 s
fitting model to histograms to estimate best k
fitting histogram for k = 101
fitting histogram for k = 15
fitting histogram for k = 17
fitting histogram for k = 19
fitting histogram for k = 21
fitting histogram for k = 23
fitting histogram for k = 25
fitting histogram for k = 27
fitting histogram for k = 31
fitting histogram for k = 41
fitting histogram for k = 51
fitting histogram for k = 61
fitting histogram for k = 71
fitting histogram for k = 81
fitting histogram for k = 91
table of predicted num. of genomic k-mers: histograms.dat
best k: 27

I really do not understand why.

I used kmergenie version 1.5579.

$ uname -a
Linux hpc-login 2.6.32-279.19.1.el6.x86_64 #1 SMP Tue Dec 18 17:22:54 CST 2012 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/*-release
Scientific Linux release 6.3 (Carbon)
Scientific Linux release 6.3 (Carbon)
written 3.0 years ago by Anaïs Vittu

Nice that you have a Linux machine handy. Regarding that problem, it seems that you have 3 lines in the .txt file, one of which is empty, I suppose? If so, that's the problem. Could it be that your shell adds an empty line after each command automatically?

Regardless, I'll work on input cleaning of .txt files.
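
Until such cleaning exists in the tool, a list file can be checked and cleaned by hand. The following is a minimal sketch on a stand-in list with a trailing empty line; the scratch directory and the .clean.txt output name are assumptions made for the example.

```shell
workdir=$(mktemp -d)

# Stand-in list file: two names followed by one empty line.
printf 'C0CYDACXX_1.fq\nC0CYDACXX_2.fq\n\n' > "$workdir/filesForKmergenie.txt"

# Count lines: 3 here, although only 2 of them name a file.
wc -l < "$workdir/filesForKmergenie.txt"

# Keep only lines that contain at least one non-blank character.
grep -v '^[[:space:]]*$' "$workdir/filesForKmergenie.txt" \
    > "$workdir/filesForKmergenie.clean.txt"

# Count again: 2 lines remain.
wc -l < "$workdir/filesForKmergenie.clean.txt"
```

Comparing the line count with the number of read files is a cheap sanity check before passing the list to any tool.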

written 3.0 years ago by Rayan Chikhi

Indeed, there was a "\n" in the third line. I removed it and KmerGenie worked well.

But that was on a Linux machine with version 1.5579. I will install the latest version, 1.6741, and try again.

Separately, I checked something else on my Mac:

anais$ kmergenie C0CYDACXX_1.fq
running histogram estimation
File C0CYDACXX_1.fq starts with character "C", hence is interpreted as a list of file names
Reading 0 read files
Invalid smallest (15) and largest kmer (1) sizes
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)
anais$ cp C0CYDACXX_1.fq toto
anais$ kmergenie toto
running histogram estimation
File toto starts with character "t", hence is interpreted as a list of file names
Reading 0 read files
Invalid smallest (15) and largest kmer (1) sizes
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

anais$ cp C0CYDACXX_1.fq @_1.fq

anais$ kmergenie \@_1.fq
running histogram estimation
Linear estimation: ~53 M distinct 61-mers are in the reads
K-mer sampling: 1/10
| processing                                                                                         |
[going to estimate histograms for values of k: 101 91 81 71 61 51 41 31 21
-----------------------------------------------------------------------------------------------------------------------Total time Wallclock  56.8612 s
fitting model to histograms to estimate best k
fitting histogram for k = 101
fitting histogram for k = 21
fitting histogram for k = 31
fitting histogram for k = 41
fitting histogram for k = 51
fitting histogram for k = 61
fitting histogram for k = 71
fitting histogram for k = 81
fitting histogram for k = 91
estimation of the best k so far: 21
refining estimation around [15; 27], with a step of 2
running histogram estimation
Linear estimation: ~95 M distinct 24-mers are in the reads
K-mer sampling: 1/18
| processing                                                                                         |
[going to estimate histograms for values of k: 27 25 23 21 19 17 15
-----------------------------------------------------------------------------------------------------------------------Total time Wallclock  38.4895 s
fitting model to histograms to estimate best k
fitting histogram for k = 101
fitting histogram for k = 15
fitting histogram for k = 17
fitting histogram for k = 19
fitting histogram for k = 21
fitting histogram for k = 23
fitting histogram for k = 25
fitting histogram for k = 27
fitting histogram for k = 31
fitting histogram for k = 41
fitting histogram for k = 51
fitting histogram for k = 61
fitting histogram for k = 71
fitting histogram for k = 81
fitting histogram for k = 91
table of predicted num. of genomic k-mers: histograms.dat
best k: 27

OK, so when I change the name of my input file, it works, but not every new name works (for example, toto does not).
And when I created the list file with the renamed versions of my 2 FASTQ files, it did not work. I obtained the same results with version 1.5579.
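
In the Mac runs pasted above, the character quoted in the message matches the first letter of each file name ("f", "C", "t"), and the copy named @_1.fq is the one that worked. That is only a hypothesis drawn from the pasted output, not something confirmed in this thread. A small diagnostic sketch that prints both characters side by side, using stand-in files with the same names in a scratch directory:

```shell
workdir=$(mktemp -d)

# Same FASTQ-like content under three names taken from the runs above.
for name in "C0CYDACXX_1.fq" "toto" "@_1.fq"; do
    printf '@read1\nACGT\n+\nIIII\n' > "$workdir/$name"
done

# Print the first character of the name next to the first byte of the content.
for path in "$workdir"/*; do
    name=$(basename "$path")
    name_first=$(printf '%s' "$name" | cut -c 1)
    content_first=$(head -c 1 "$path")
    printf '%s\tname starts with %s\tcontent starts with %s\n' \
        "$name" "$name_first" "$content_first"
done
```

If the character a tool reports tracks the first column rather than the second, that would point at the name, not the content, being inspected.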

 

modified 3.0 years ago • written 3.0 years ago by Anaïs Vittu