Selecting sequences from a multi-fasta with certain kmers
7.9 years ago
michberr8 • 0

Hi,

I have a fasta file with about 30 9-mers. I also have some metagenomes in fasta format, each about 15 GB (~100 million reads, 125 bp). I would like to use my kmer file to filter out sequences from my metagenomes that only have a match to one of the kmers.

There's a lot of kmer counting software out there like jellyfish, tallymer, meryl, but as far as I can tell, none of these have the utility to select or filter sequences based on the presence of kmers.

Does anyone know of software that would do this efficiently?

Thanks

kmer fasta sequence • 2.3k views

BBduk from BBMap should be able to do this.


"that only have a match to one of the kmers."

do you mean

ONE hit match => discard
SAME kmer found twice => keep
TWO different kmer => keep

?


Sorry, I meant one or more matches to my set of 30 kmers.
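
To make the intended behaviour concrete (keep a read if it contains at least one kmer from the set), here is a minimal Python sketch. It is not how BBduk or any of the tools named in this thread work internally, just a plain set-lookup filter. The function names (`read_fasta`, `load_kmers`, `has_kmer`, `filter_reads`) are made up for illustration, and it assumes exact, case-insensitive matches on the forward strand only, so reverse complements are not checked. Pure Python will also be slow on ~100 million reads; this only illustrates the logic.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def load_kmers(handle):
    """Collect the kmer sequences from a FASTA handle into an upper-case set."""
    return {seq.upper() for _, seq in read_fasta(handle)}


def has_kmer(seq, kmers, k):
    """Return True if any length-k window of seq is in the kmer set."""
    seq = seq.upper()
    return any(seq[i:i + k] in kmers for i in range(len(seq) - k + 1))


def filter_reads(reads_handle, kmers, k, out_handle):
    """Write reads with one or more kmer matches; return how many were kept."""
    kept = 0
    for header, seq in read_fasta(reads_handle):
        if has_kmer(seq, kmers, k):
            out_handle.write(f">{header}\n{seq}\n")
            kept += 1
    return kept


if __name__ == "__main__":
    # Tiny in-memory demo; real use would pass open file handles instead.
    kmers = load_kmers(io.StringIO(">k1\nACGTACGTA\n"))
    reads = io.StringIO(">read1\nGGACGTACGTAGG\n>read2\nGGGGGGGGGGGG\n")
    out = io.StringIO()
    n_kept = filter_reads(reads, kmers, 9, out)
    print(n_kept, "read(s) kept")
    print(out.getvalue(), end="")
```

Because the kmers are held in a set, each window lookup is constant time, and the reads are streamed one at a time, so memory use stays small even for a 15 GB file.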
