The question of using genome mappability has been around in my team for a long time, but so far we have made no concrete effort to use it in our analyses.
The mappability of a genomic region refers to the ability of a DNA stretch produced by a sequencing experiment to map back unambiguously to its location on the reference genome (see this paper from Derrien et al.).
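To make the definition concrete, here is a toy sketch of one way such a score can be formalized: each position gets 1 divided by the number of times the k-mer starting there occurs in the sequence. This is a simplification I am assuming for illustration (exact matches only, forward strand only, a made-up sequence); the function name `kmer_mappability` is hypothetical and this is not the code behind the published tracks.

```python
from collections import Counter


def kmer_mappability(seq, k):
    """Toy per-position score: 1 / (exact occurrences of the k-mer starting there).

    Assumptions: exact matches only, forward strand only, and the whole
    'genome' is the single string `seq`.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    return [1.0 / counts[kmer] for kmer in kmers]


# Made-up sequence in which the 4-mer "ACGT" occurs twice.
toy_scores = kmer_mappability("ACGTACGTTT", 4)
print(toy_scores)
```

A k-mer that occurs once scores 1 (unique), while repeated k-mers score lower, which is why longer reads generally give higher scores.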
On the UCSC website, you can find several 'Mappability' tracks labelled Alignability, Uniqueness or Blacklisted Regions (notably for several read lengths).
What I would like to know is:
Do you, in practice today, use this notion to fine-tune your analyses (SNP/CNV calling, coverage, ...)?
EDIT: Answering 'NO', with an explanation of the difficulties you encountered or the reason why you do not use it, is a good answer as well :)
Concretely, mappability (or uniqueness) is a score between 0 and 1 for each base of the reference genome. I am also wondering to what extent it can be used to define 'larger' (and thus more usable in practice) mappable regions. Setting a hard threshold on this score could result in a very fragmented genome.
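To illustrate the fragmentation problem, here is a minimal sketch of what I have in mind: threshold the per-base scores into intervals, then bridge short low-scoring gaps and drop tiny regions. The function names, the parameters `threshold`, `max_gap` and `min_len`, and the scores are all hypothetical values chosen for the example, not a recommended setting.

```python
def mappable_regions(scores, threshold=0.5, max_gap=0, min_len=1):
    """Return 0-based, half-open (start, end) intervals with score >= threshold.

    Gaps of at most `max_gap` bases between intervals are bridged, and
    regions shorter than `min_len` bases are dropped afterwards.
    """
    # Step 1: hard threshold, collecting runs of consecutive passing bases.
    regions = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            regions.append([start, i])
            start = None
    if start is not None:
        regions.append([start, len(scores)])

    # Step 2: bridge small gaps so the result is less fragmented.
    merged = []
    for start, end in regions:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1][1] = end
        else:
            merged.append([start, end])

    # Step 3: discard regions too short to be useful.
    return [(s, e) for s, e in merged if e - s >= min_len]


def to_bed(chrom, regions):
    """Format intervals as tab-separated BED lines (chrom, start, end)."""
    return "\n".join(f"{chrom}\t{s}\t{e}" for s, e in regions)


# Made-up per-base scores for a 10 bp stretch.
scores = [1, 1, 0.2, 1, 1, 1, 0, 0, 0, 1]
strict = mappable_regions(scores, threshold=1.0)
relaxed = mappable_regions(scores, threshold=1.0, max_gap=1, min_len=3)
print(to_bed("chr1", strict))
print(to_bed("chr1", relaxed))
```

On this toy input the strict threshold yields three small pieces, while bridging one-base gaps and requiring at least 3 bp yields a single region. The trade-off is that bridged regions include a few poorly mappable bases. On real BED files, I believe `bedtools merge` with its `-d` option performs a similar gap-bridging step.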
BONUS: If someone has a BED file (not WIG) of the mappable genome for read length = 100, I would be very interested.
Thanks in advance.