Question: k-mer tools - probability based models
sam (United States) wrote, 4.9 years ago:

I have recently been looking at different k-mer tools (e.g., Jellyfish). They all perform well, though their computational costs differ. However, most of them are pure counting tools. I'm interested in a tool that finds k-mers that occur more often than expected, i.e., a more probability-based approach. Has anyone worked with or seen a tool that generates k-mer counts together with a background distribution?

Tags: k-mers, rna-seq
edrezen (France) wrote, 4.9 years ago:

You can use DSK from the GATB project. It is a k-mer counter that also provides a histogram of k-mer abundance (see the README file for more information). For instance:

dsk -file myreads.fa -kmer-size 31

This produces an HDF5 file, from which you can extract the k-mer histogram with the following command (the h5dump tool is provided with DSK):

h5dump -y -d dsk/histogram myreads.h5 | grep '[0-9]' | grep -v '[A-Z].*' | paste - -

You can also plot it directly with gnuplot:

h5dump -y -d dsk/histogram myreads.h5 | grep '[0-9]' | grep -v '[A-Z].*' | paste - - | gnuplot -p -e 'plot [][0:100] "-" with lines'
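
If you would rather use the histogram as an empirical background than just plot it, here is a minimal Python sketch. It assumes the extracted histogram is plain text with one whitespace-separated "abundance number_of_kmers" pair per line, which is what the paste step should give, but check this against your own output. The function names parse_histogram and tail_fraction are made up for this example and are not part of DSK.

```python
def parse_histogram(lines):
    """Parse 'abundance n_kmers' pairs; lines that do not hold two integers are skipped."""
    hist = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue
        try:
            hist.append((int(fields[0]), int(fields[1])))
        except ValueError:
            continue
    return hist


def tail_fraction(hist, threshold):
    """Fraction of distinct k-mers whose abundance is >= threshold."""
    total = sum(n_kmers for _, n_kmers in hist)
    if total == 0:
        return 0.0
    above = sum(n_kmers for abundance, n_kmers in hist if abundance >= threshold)
    return above / total
```

For example, after saving the extracted pairs to a file, tail_fraction(parse_histogram(open(path)), 50) would give the share of distinct k-mers seen at least 50 times (path is a placeholder for your own file). That is only an empirical tail of the abundance distribution, not a statistical test.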

There is also a tool, 'dsk2ascii', that writes the list of (k-mer, count) pairs in a human-readable format, so you can do further processing on it.
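
As one possible kind of processing, the sketch below compares each observed count with the count expected under a very simple background: independent bases, with base frequencies estimated from the counted k-mers themselves. This model is my own assumption for illustration, not something DSK provides, and the input format (one whitespace-separated "KMER count" pair per line) is also assumed, so check it against your dsk2ascii output. If the counter merges a k-mer with its reverse complement, the expectation would need adjusting accordingly.

```python
import math
from collections import Counter


def parse_kmer_counts(lines):
    """Read 'KMER count' lines; skip malformed lines, non-ACGT k-mers and zero counts."""
    counts = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2 or not fields[1].isdigit():
            continue
        kmer, count = fields[0].upper(), int(fields[1])
        if count == 0 or set(kmer) - set("ACGT"):
            continue
        counts[kmer] = counts.get(kmer, 0) + count
    return counts


def base_frequencies(counts):
    """Estimate base frequencies from the k-mers, weighted by their counts."""
    base_totals = Counter()
    for kmer, count in counts.items():
        for base in kmer:
            base_totals[base] += count
    total = sum(base_totals.values())
    if total == 0:
        return {}
    return {base: base_totals[base] / total for base in "ACGT"}


def enrichment(counts):
    """Observed / expected ratio per k-mer under the independent-base model (assumed)."""
    freqs = base_frequencies(counts)
    n_occurrences = sum(counts.values())
    ratios = {}
    for kmer, observed in counts.items():
        expected = n_occurrences * math.prod(freqs[base] for base in kmer)
        ratios[kmer] = observed / expected
    return ratios
```

Sorting the result of enrichment(parse_kmer_counts(open(path))) by ratio puts the most over-represented k-mers first (path is again a placeholder). A higher-order Markov background or a proper significance test would be more realistic; this only illustrates the observed-versus-expected idea.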
