Closed: Kmer Frequency Distribution And Genome Complexity
10.1 years ago
Rohit ★ 1.5k

I have a question about k-mer frequency distributions and genome complexity.

I have k-mer distribution numbers (from KmerGenie) for a newly sequenced lizard genome. Lizards are known to be highly polymorphic and to have highly repetitive genomes. The distribution shows peaks at several k-mer frequencies, for instance 19, 25, 29, 37, 49, 59, 69, 79, 83 and 89. I do not understand these multiple peaks and sub-peaks, including sub-peaks at half the k-mer frequency of a bigger peak.

I have read that a k-mer frequency peak below 20, at about 17, indicates bacterial contamination, and that a sub-peak at half the value of a main peak indicates polymorphism.

http://arxiv.org/pdf/1308.2012.pdf - Heterozygosity and half peak

https://groups.google.com/forum/#!topic/bgi-soap/xKS39Nz4SCE - Polymorphism with multiple peaks
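
To make the half-peak idea concrete, here is a minimal Python sketch that lists which of the peaks above sit at roughly half the frequency of a larger peak. The function name `half_peak_pairs` and the 5% tolerance are my own assumptions for illustration, not something taken from KmerGenie or the linked references, and a matching pair is only a hint worth inspecting in the histogram, not evidence of heterozygosity on its own.

```python
def half_peak_pairs(peaks, tol=0.05):
    """Return (small, large) pairs where small is within tol of large / 2.

    peaks: k-mer frequencies at which the histogram shows a peak.
    tol:   relative tolerance around the half value (assumed, arbitrary).
    """
    ordered = sorted(peaks)
    pairs = []
    for i, small in enumerate(ordered):
        for large in ordered[i + 1:]:
            half = large / 2
            # Keep the pair if the smaller peak lies close to half the larger one.
            if abs(small - half) <= tol * half:
                pairs.append((small, large))
    return pairs


# Peak positions quoted in the question.
observed = [19, 25, 29, 37, 49, 59, 69, 79, 83, 89]
for small, large in half_peak_pairs(observed):
    print(f"{small} is close to half of {large}")
```

Widening or narrowing `tol` changes which pairs are reported, so the result should be read alongside the actual histogram rather than on its own.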

But I cannot make sense of the pattern I have right now. Does it mean there is a chance of contamination? Or is it that the genome is heterozygous, polymorphic and repetitive? Or is there something I am missing?

ngs genome • 2.3k views
This thread is not open. No new answers may be added