Question: How to extract reads that match k-mer profiles from a collection of sequences?
9 weeks ago, O.rka (70) wrote:

Let's say you had 10 draft-genome assemblies from different sources, all from a particular genus, with 100 contigs altogether.

Are there any tools that allow you to use this "database" of assemblies to then grab any reads that have even a remotely similar k-mer usage to the "database"?

I know about kneaddata, but that maps reads to a very specific reference sequence. I'm looking for a way to extract reads that have similar k-mer usage.

Is there a tool that I can use to do this?

sequencing • 197 views
modified 9 weeks ago by genomax (59k) • written 9 weeks ago by O.rka (70)

mash (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0997-x ), sourmash ( https://sourmash.readthedocs.io/en/latest/ ) or bbsketch.sh from the BBMap suite (https://sourceforge.net/projects/bbmap/ ) can help you with the classification, but I am not sure whether they have functionality to extract those reads.
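To make the idea behind these sketching tools concrete, here is a minimal pure-Python sketch of a bottom-s MinHash Jaccard estimate. It is only an illustration of the general technique, not the code or API of mash, sourmash or bbsketch.sh. The function names (`kmer_hashes`, `sketch`, `jaccard_estimate`), the choice of hash and the toy parameters are all assumptions made for this example, and reverse complements are ignored for brevity.

```python
import hashlib
from typing import List, Set


def kmer_hashes(seq: str, k: int) -> Set[int]:
    """Hash every k-mer of seq to a 64-bit integer (hypothetical helper)."""
    seq = seq.upper()
    return {
        int.from_bytes(
            hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest(), "big"
        )
        for i in range(len(seq) - k + 1)
    }


def sketch(seq: str, k: int, s: int) -> List[int]:
    """Bottom-s sketch: the s smallest k-mer hashes of the sequence."""
    return sorted(kmer_hashes(seq, k))[:s]


def jaccard_estimate(a: List[int], b: List[int], s: int) -> float:
    """Estimate Jaccard similarity from two bottom-s sketches.

    Take the s smallest hashes of the merged sketches and count how many
    of them occur in both sketches.
    """
    merged = sorted(set(a) | set(b))[:s]
    shared = set(a) & set(b)
    if not merged:
        return 0.0
    return sum(h in shared for h in merged) / len(merged)


# Toy sequences, assumed for illustration only.
ref = "ACGTACGTTAGCCGATAGCTAGGCT"
same = jaccard_estimate(sketch(ref, 5, 10), sketch(ref, 5, 10), 10)
different = jaccard_estimate(sketch("AAAAAAAAAA", 4, 10), sketch("CCCCCCCCCC", 4, 10), 10)
```

A sketch like this compares whole sequence sets cheaply, which is why it suits classification. Pulling out individual reads needs a per-read test on top of it.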

modified 9 weeks ago • written 9 weeks ago by genomax (59k)

https://github.com/will-rowe/hulk/ ?

written 9 weeks ago by shenwei356 (4.3k)
9 weeks ago, genomax (59k, United States) wrote:

Cookiecutter (https://github.com/ad3002/Cookiecutter ) seems to do what you need, but you will need to test it to confirm.
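For reference, the general idea of pulling reads by k-mer content can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Cookiecutter or any other tool is implemented: build a set of canonical k-mers from the reference contigs, then keep each read whose fraction of k-mers found in that set reaches a threshold. All names (`build_index`, `matching_fraction`, `extract_reads`), the value of `k` and the `min_fraction` cutoff are hypothetical, and a plain Python set will not scale to large genomes.

```python
from typing import Iterable, Iterator, Set, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield canonical k-mers, skipping any that contain non-ACGT characters."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield canonical(kmer)


def build_index(contigs: Iterable[str], k: int) -> Set[str]:
    """Collect all canonical k-mers from the reference contigs."""
    index: Set[str] = set()
    for contig in contigs:
        index.update(kmers(contig, k))
    return index


def matching_fraction(read: str, index: Set[str], k: int) -> float:
    """Fraction of the read's k-mers that are present in the index."""
    total = hits = 0
    for kmer in kmers(read, k):
        total += 1
        hits += kmer in index
    return hits / total if total else 0.0


def extract_reads(
    reads: Iterable[Tuple[str, str]], index: Set[str], k: int, min_fraction: float
) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) for reads at or above the matching threshold."""
    for name, seq in reads:
        if matching_fraction(seq, index, k) >= min_fraction:
            yield name, seq


# Toy data, assumed for illustration only.
contigs = ["ACGTACGTTAGCCGATAGCTAGGCT"]
reads = [
    ("r1", "CGTTAGCCGATAGC"),  # exact substring of the contig
    ("r2", "TTTTTTTTTTTTTT"),  # shares no k-mers with the contig
    ("r3", "GCTATCGGCTAACG"),  # reverse complement of r1
]
index = build_index(contigs, k=5)
kept = [name for name, _ in extract_reads(reads, index, k=5, min_fraction=0.5)]
```

Lowering `min_fraction` makes the filter more permissive, which is the knob to turn if you want reads that are only remotely similar to the reference set.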

written 9 weeks ago by genomax (59k)

Is that more for extracting adapters, or can it be extended to entire-genome k-mer profiles?

written 8 weeks ago by O.rka (70)