3.0 years ago
MarVi
Hello everyone,
I would like to ask if anyone knows an efficient way to quantify the k-mer overlap between two jellyfish databases. Both jf databases are large, and both were generated with the same k-mer length.
So far, I dump one database and query each k-mer against the other, saving the output, which I then read to count the non-zero queries, but this is taking forever!
Thank you in advance for your advice!
MarVi
Not sure what exactly you are trying to do, but if you are comparing genomic or metagenomic datasets then take a look at sourmash (LINK). A tutorial is available.
I am sorry if I wasn't clear.
I have RNA-seq data from two different conditions. I used jellyfish to count the occurrences of all 33-mers in each condition, so I now have two jellyfish k-mer count files in binary format (these are the jellyfish databases I am referring to).
What I need is to compare the two conditions by quantifying how many 33-mers they share and how many are unique to each, using these jellyfish databases.
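One possible way to avoid the per-k-mer queries, offered as a rough sketch rather than a tested pipeline: dump both databases to text once, sort each dump, and walk the two sorted files in parallel (a merge-join). This assumes each dump has one `KMER COUNT` pair per line, as produced by `jellyfish dump -c`, and that both files are sorted in plain byte order (for example with `LC_ALL=C sort`). The function names and file names below are made up for illustration.

```python
from typing import Iterable, Iterator, Tuple


def kmers(lines: Iterable[str]) -> Iterator[str]:
    """Yield the k-mer (first column) of each 'KMER COUNT' line, skipping blank lines."""
    for line in lines:
        fields = line.split()
        if fields:
            yield fields[0]


def overlap_counts(sorted_a: Iterable[str], sorted_b: Iterable[str]) -> Tuple[int, int, int]:
    """Merge-join two sorted k-mer streams.

    Both inputs must be sorted in byte order and contain each k-mer once.
    Returns (shared, only_in_a, only_in_b).
    """
    it_a, it_b = kmers(sorted_a), kmers(sorted_b)
    a, b = next(it_a, None), next(it_b, None)
    shared = only_a = only_b = 0
    while a is not None and b is not None:
        if a == b:
            shared += 1
            a, b = next(it_a, None), next(it_b, None)
        elif a < b:
            only_a += 1
            a = next(it_a, None)
        else:
            only_b += 1
            b = next(it_b, None)
    # Whatever is left in either stream is unique to that stream.
    while a is not None:
        only_a += 1
        a = next(it_a, None)
    while b is not None:
        only_b += 1
        b = next(it_b, None)
    return shared, only_a, only_b


if __name__ == "__main__":
    # Tiny in-memory demo; real input would be two open file handles,
    # e.g. open("cond1.sorted.txt") and open("cond2.sorted.txt") (hypothetical names).
    demo_a = ["AAA 3", "ACG 1", "TTT 2"]
    demo_b = ["ACG 5", "CCC 1", "TTT 1"]
    print(overlap_counts(demo_a, demo_b))
```

Each file is read once, line by line, so memory use stays small even for large dumps; the cost moves to the one-off sort. The comparison is only meaningful if both databases were counted with the same strand setting (both canonical or both not).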