Question: How to extract reads that match k-mer profiles from a collection of sequences?
4 days ago
O.rka50 wrote:

Let's say you had 10 draft genome assemblies of a particular genus from different sources, with 100 contigs altogether.

Are there any tools that allow you to use this "database" of assemblies to then grab any reads that have even a remotely similar k-mer usage to the "database"?

I know about kneaddata, but that maps reads to a very specific reference sequence. I'm looking for a way to extract reads that have similar k-mer usage.

Is there a tool that I can use to do this?

mash, sourmash, or a tool from the BBMap suite can help you with the classification, but I am not sure whether they have functionality to extract those reads.

4 days ago
United States
genomax57k wrote:

cookiecutter seems to do what you need. You will need to test it to confirm.
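If it helps to see the idea spelled out, here is a minimal Python sketch of k-mer based read extraction. It is not how cookiecutter or any of the tools above work internally; it is just an illustration under some assumptions: sequences are already loaded as plain strings, "similar k-mer usage" is simplified to "fraction of a read's k-mers that also occur in the assemblies", and the function names, `k`, and `min_fraction` are all made up for this example.

```python
from typing import Iterable, Iterator, Set, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_VALID_BASES = frozenset("ACGT")


def canonical_kmers(seq: str, k: int) -> Iterator[str]:
    """Yield canonical k-mers (the smaller of a k-mer and its reverse complement)."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip k-mers containing ambiguous bases such as N.
        if not _VALID_BASES.issuperset(kmer):
            continue
        revcomp = kmer.translate(_COMPLEMENT)[::-1]
        yield min(kmer, revcomp)


def build_reference_kmers(contigs: Iterable[str], k: int) -> Set[str]:
    """Collect every canonical k-mer seen in the assembly "database"."""
    reference: Set[str] = set()
    for contig in contigs:
        reference.update(canonical_kmers(contig, k))
    return reference


def shared_fraction(read: str, reference: Set[str], k: int) -> float:
    """Fraction of the read's k-mers that are present in the reference set."""
    kmers = list(canonical_kmers(read, k))
    if not kmers:
        return 0.0
    return sum(kmer in reference for kmer in kmers) / len(kmers)


def extract_reads(
    reads: Iterable[Tuple[str, str]],
    reference: Set[str],
    k: int,
    min_fraction: float,
) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs whose shared k-mer fraction meets the threshold."""
    for name, seq in reads:
        if shared_fraction(seq, reference, k) >= min_fraction:
            yield name, seq


if __name__ == "__main__":
    # Toy data only; real input would come from FASTA/FASTQ files.
    contigs = ["ACGTACGTTGCAAGCT"]
    reads = [("r1", "ACGTACGTTG"), ("r2", "GGGGGGGGGG")]
    reference = build_reference_kmers(contigs, k=5)
    for name, seq in extract_reads(reads, reference, k=5, min_fraction=0.5):
        print(name, seq)
```

Lowering `min_fraction` makes the filter more permissive, which is the knob you would turn to catch "remotely similar" reads. A Python set of k-mers will not scale to large read sets or long k, so treat this as a way to reason about thresholds rather than a replacement for a dedicated tool.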

