I have some k-mers that neither Jellyfish nor DSK was able to count. Is that a bug in the programs?
Here is what I ran for Jellyfish. First, I checked that the k-mer is in the input file:
grep TTACATAACACCCATTGTGGCGGCTGCAAGT ABCD.fq | wc
41 41 5166
So the 31-mer is in the sample file, on 41 lines. Then I counted and dumped the k-mers:
jellyfish count -m 31 -s 100M -C ABCD.fq -o 1.jf
jellyfish dump 1.jf -c > 1.txt
grep TTACATAACACCCATTGTGGCGGCTGCAAGT 1.txt
This returns nothing.
Then I tried DSK:
dsk -kmer-size 31 -abundance-min 0 -file ABCD.fq -out-dir ABCD -out ABCD.h5
dsk2ascii -file ABCD.h5 -out ABCD.txt
grep TTACATAACACCCATTGTGGCGGCTGCAAGT *.txt
This still returns nothing.
As you can see, the 31-mer occurs many times in the input file (ABCD.fq), so it should not have been filtered out for low abundance. I don't see a problem with the input file: it came from a simulator and is a valid FASTQ file. Why were both programs unable to count my k-mer?
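One thing that may be worth checking: the -C flag in the Jellyfish command turns on canonical counting, so each k-mer is stored as the lexicographically smaller of itself and its reverse complement. As far as I understand, DSK also reports canonical k-mers, but that is an assumption I have not verified here. If so, the dumps would contain the reverse complement rather than the string I grepped for. A minimal sketch for computing the canonical form (the variable names are only illustrative, and it assumes `rev` and `tr` are available):

```shell
kmer=TTACATAACACCCATTGTGGCGGCTGCAAGT

# Reverse complement: reverse the string, then swap A<->T and C<->G.
rc=$(printf '%s\n' "$kmer" | rev | tr ACGT TGCA)

# Canonical form: the lexicographically smaller of the k-mer and its
# reverse complement (LC_ALL=C forces plain byte-order comparison).
canonical=$(printf '%s\n%s\n' "$kmer" "$rc" | LC_ALL=C sort | head -n 1)

echo "$canonical"
# prints ACTTGCAGCCGCCACAATGGGTGTTATGTAA
```

Grepping 1.txt and ABCD.txt for the printed string instead of the original k-mer would show whether this is the explanation.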