Extracting k-mer counts from multiple genome sequence files
17 months ago
yas ▴ 20

Good day everyone, I am new here.

So, I have downloaded 8 complete genome FASTA files for 8 strains of Bacillus subtilis spp.

My aim is to do classification based on their k-mer abundance profiles.

I am wondering, are there any tools that I can use to generate and extract the k-mer counts for each of the 8 genome files in a single output?

k-mer • 1.2k views
17 months ago

Two very popular tools specifically designed for k-mer counting:

  1. KMC3 [tool] [paper]

  2. Jellyfish [tool] [paper]

Other k-mer counting tools are listed in the benchmark study of k-mer counting methods.
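For a handful of bacterial genomes and a small k, the "one table for all files" part can also be sketched in plain Python. This is only an illustration of what the merged output might look like, not how KMC3 or Jellyfish work internally; the function names (`read_fasta`, `count_kmers`, `write_count_table`) are made up for this example, and counting canonical k-mers (a k-mer and its reverse complement treated as one) is an assumption you may not want for your classification.

```python
from collections import Counter
import csv

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_DNA = frozenset("ACGT")


def read_fasta(path):
    """Yield one upper-cased sequence string per FASTA record in `path`."""
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if chunks:
                    yield "".join(chunks)
                chunks = []
            elif line:
                chunks.append(line.upper())
    if chunks:
        yield "".join(chunks)


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(sequences, k):
    """Count canonical k-mers over an iterable of sequences.

    K-mers containing anything other than A, C, G, T (e.g. N) are skipped.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if _DNA.issuperset(kmer):
                counts[canonical(kmer)] += 1
    return counts


def write_count_table(profiles, out):
    """Write a tab-separated table: one row per k-mer, one column per genome.

    `profiles` maps a genome name to its Counter; absent k-mers are written as 0.
    """
    names = list(profiles)
    kmers = sorted(set().union(*profiles.values()))
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["kmer"] + names)
    for kmer in kmers:
        writer.writerow([kmer] + [profiles[name][kmer] for name in names])
```

To use it, build `profiles` as a dict such as `{path: count_kmers(read_fasta(path), 4) for path in paths}` and pass it to `write_count_table` together with an open file or `sys.stdout`. For large k or many genomes the dedicated counters above will be far faster and lighter on memory.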


Thank you!

17 months ago
5heikki 11k

Mash is an excellent tool for this kind of thing. It's far more sophisticated than simple k-mer abundance counting.
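For context on how this differs from a count table: Mash compares genomes through MinHash sketches, i.e. it keeps only the smallest hash values of each genome's k-mers and estimates similarity from those. The snippet below is a rough sketch of that idea, not Mash's implementation: `hashlib.blake2b` stands in for Mash's own hash function, canonical k-mers are not handled, and all function names are invented for the example. Note that a sketch carries no abundance information, so it does not replace a k-mer count profile.

```python
import hashlib
import math


def kmer_hashes(seq, k):
    """Return the set of 64-bit hashes of all k-mers in `seq`."""
    return {
        int.from_bytes(
            hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest(), "big"
        )
        for i in range(len(seq) - k + 1)
    }


def sketch(seq, k, s):
    """Bottom-s sketch: the s smallest k-mer hashes of `seq`, sorted."""
    return sorted(kmer_hashes(seq, k))[:s]


def jaccard_estimate(sketch_a, sketch_b, s):
    """Estimate the Jaccard index from two bottom-s sketches.

    Take the s smallest hashes of the merged sketches and report the
    fraction of them that occur in both inputs.
    """
    merged = sorted(set(sketch_a) | set(sketch_b))[:s]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(j, k):
    """Mash distance D = -(1/k) * ln(2j / (1 + j)); defined as 1 when j is 0."""
    if j == 0:
        return 1.0
    return -math.log(2 * j / (1 + j)) / k
```

Two identical sequences give a Jaccard estimate of 1 and a distance of 0; sequences sharing no k-mers give a distance of 1.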


Thank you so much for the answer. But I need to generate the k-mer abundance profiles.

17 months ago

You could use kmercountmulti.sh from the BBTools Suite if you are specifically interested in k-mers.


Thanks for the suggestion.
