Appropriate k-mer size for mappability mask
2.5 years ago

MSMC requires a mappability mask. For non-model organisms, however, you will likely have to generate the mask yourself. I am working with three-spined sticklebacks, and the sequencing for each individual comprises three libraries: 100bp reads with 140bp and 300bp insert sizes, and 50bp reads with a 3kb insert size.

The program SNPable was designed with single-end reads in mind, so deciding which k-mer size to use is difficult. A guide I read used 250-mers for a single paired-end library, though it did not state the read length or the insert size.
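
For intuition about why the choice of k matters, here is a toy sketch in Python. It is not SNPable itself: it only counts exact k-mer matches within a made-up sequence, whereas a real mask is built by aligning k-mers back to the reference. The function name `unique_kmer_fraction` and the sequence `toy` are hypothetical, made up for illustration.

```python
from collections import Counter

def unique_kmer_fraction(seq, k):
    """Fraction of k-mer start positions whose k-mer occurs exactly once in seq.

    Exact matching only; this is an illustrative simplification, not SNPable's method.
    """
    n = len(seq) - k + 1
    if n <= 0:
        return 0.0
    kmers = [seq[i:i + k] for i in range(n)]
    counts = Counter(kmers)
    return sum(1 for km in kmers if counts[km] == 1) / n

# Made-up toy "genome": unique flanks around two copies of a 6 bp repeat.
toy = "ACGT" + "TTAGGC" + "GATC" + "TTAGGC" + "CATG"

for k in (4, 6, 8):
    print(k, round(unique_kmer_fraction(toy, k), 2))
```

In this toy case, once k exceeds the repeat length every k-mer is unique, which is the basic reason longer k-mers mark more of a genome as mappable.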

My question is simple: what do I need to consider when deciding which k-mer size to use? The mate-pair library makes this particularly difficult, or so I have been led to believe. Any help would be greatly appreciated.

