Shortest Kmer Size That Is Unique In The Genome
12.0 years ago
Abhi ★ 1.6k

Hey Guys

I can't recall the math for finding the kmer length that will be unique, given a genome size. How does this change for a metagenome? I guess there is no real way to estimate it for metagenomes, since the genomic space is quite large, right?

Thanks! -Abhi

genome • 4.2k views

A kmer that's (length_of_genome / 2) + 1 nt long is guaranteed to be unique ;-)


No, it is not. Simple example: "AAAA" as the genome. I do agree, though, that this example is a bit far-fetched :-)
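The counterexample can be checked in a few lines. This is only a sketch: the helper name `kmer_counts` is made up for illustration, and it counts overlapping kmers on the forward strand only.

```python
from collections import Counter


def kmer_counts(genome: str, k: int) -> Counter:
    """Count every overlapping kmer of length k (forward strand only)."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


genome = "AAAA"
k = len(genome) // 2 + 1  # the proposed "guaranteed unique" length, here 3
counts = kmer_counts(genome, k)

# "AAA" starts at position 0 and at position 1, so it is seen twice.
print(k, dict(counts))  # 3 {'AAA': 2}
```

Two windows of that length always overlap, but overlapping windows can still spell the same kmer when the sequence is a repeat.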


well played ... touché ;-)


Hehe... I should add that I am after the smallest kmer size.

12.0 years ago
Gjain 5.8k

Hi Abhi,

I think this paper will answer your question:

A new method to compute K-mer frequencies and its application to annotate large repetitive plant genomes

12.0 years ago

I think there is no such formula beyond what Steve said.

At best, you can estimate the probability of observing a certain kmer, assuming the genome or data was randomly sampled from a certain base distribution.
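As a rough sketch of that kind of estimate: assume every base is drawn independently from fixed base frequencies (a strong simplification, since real genomes are repetitive and not i.i.d.). The function names and the example frequencies below are illustrative assumptions, not something taken from a particular tool.

```python
import math


def kmer_probability(kmer: str, base_freqs: dict) -> float:
    """Probability that a fixed position starts with `kmer`,
    assuming independent bases with the given frequencies."""
    return math.prod(base_freqs[base] for base in kmer)


def prob_at_least_one_hit(kmer: str, base_freqs: dict, genome_length: int) -> float:
    """Approximate chance that `kmer` occurs at least once on one strand.

    Treats the start positions as independent trials, which overlapping
    windows are not, so this is only an approximation.
    """
    positions = genome_length - len(kmer) + 1
    p = kmer_probability(kmer, base_freqs)
    # 1 - (1 - p)^positions, written with log1p/expm1 so tiny p does not round away.
    return -math.expm1(positions * math.log1p(-p))


# Hypothetical uniform base composition, for illustration only.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(kmer_probability("ACGT", uniform))  # 0.00390625, i.e. 1/256
print(prob_at_least_one_hit("ACGTACGTACGTACGTACGT", uniform, 5_000_000))
```

Swapping in skewed frequencies (for example a GC-rich composition) shows how the estimate changes for different kmers of the same length.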


I agree, Istvan. To rephrase: given a genome size, I am looking for a kmer size that you would expect to map to the genome by chance with a very low probability, say 10^-3 or lower. I did find some text about this but haven't had a chance to go through it in detail. Once I am done I will post about it.
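One way to put a number on this, as a minimal sketch: assume a random genome with uniform, independent bases, so a given kmer is expected to occur about (G - k + 1) / 4^k times by chance on one strand. When that expected count is small it is close to the probability of a chance hit. The function name `min_kmer_length` and the `strands` parameter are made up for illustration; real genomes contain repeats, so treat the result as a lower bound rather than a guarantee.

```python
def min_kmer_length(genome_length: int,
                    max_expected_hits: float = 1e-3,
                    strands: int = 1) -> int:
    """Smallest k whose expected number of chance occurrences in a
    uniform random genome is at most `max_expected_hits`."""
    for k in range(1, genome_length + 1):
        positions = genome_length - k + 1
        expected_hits = strands * positions / 4 ** k
        if expected_hits <= max_expected_hits:
            return k
    raise ValueError("no kmer length satisfies the threshold")


# A 3 Gbp genome with a 10^-3 threshold, forward strand only.
print(min_kmer_length(3_000_000_000, 1e-3))             # 21
# Counting both strands doubles the number of candidate positions.
print(min_kmer_length(3_000_000_000, 1e-3, strands=2))  # 22
```

For a metagenome the same arithmetic applies if `genome_length` is replaced by the total sequence space, but that total is usually unknown, which is the difficulty raised in the question.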
