Fine-Tune NGS Downstream Analyses Using Genome Mappability?
12.1 years ago
toni ★ 2.2k

Hi all,

The question of using genome mappability has been around in my team for a long time, but so far we have made no concrete effort to use it in our analyses.

The mappability of a genomic region is the ability of a read originating from that region to map back unambiguously to its location on the reference genome (see this paper from Derrien et al.).

On the UCSC website, you can find some 'Mappability' tracks tagged as Alignability, Uniqueness or Blacklisted Regions, notably for several read lengths.

What I would like to know is:

Do you, in practice today, use this notion to fine-tune your analyses (SNP/CNV calling, coverage, ...)?

EDIT: Answering 'NO', with an explanation of the difficulties you encountered or the reason why you do not use it, is a good answer as well :)

Concretely, the mappability (or uniqueness) is a score between 0 and 1 for each base along the reference genome. I am also wondering to what extent it can be used to define 'larger' (and therefore more usable in practice) mappable regions. Setting a hard threshold on this score could result in a very fragmented genome.
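
To make the fragmentation concern concrete, here is a minimal Python sketch of one possible approach: keep bedGraph intervals whose score passes a threshold, then merge neighbours separated by a small gap. The function name `mappable_regions` and the parameters `min_score` and `max_gap` are hypothetical, and the sketch assumes the input is sorted by chromosome and start, with higher scores meaning more mappable.

```python
from typing import Iterable, Iterator, Tuple

# bedGraph record: chrom, start, end (0-based, half-open), score
Interval = Tuple[str, int, int, float]


def mappable_regions(
    intervals: Iterable[Interval],
    min_score: float = 1.0,  # hypothetical threshold, tune for your data
    max_gap: int = 0,        # hypothetical: largest gap (in bases) to bridge
) -> Iterator[Tuple[str, int, int]]:
    """Merge sorted bedGraph intervals with score >= min_score into regions.

    Two passing intervals on the same chromosome are joined when the gap
    between them is at most max_gap bases.
    """
    current = None
    for chrom, start, end, score in intervals:
        if score < min_score:
            continue  # drop low-mappability stretches
        if current and current[0] == chrom and start - current[2] <= max_gap:
            # Close enough to the open region: extend it.
            current = (chrom, current[1], max(current[2], end))
        else:
            if current:
                yield current
            current = (chrom, start, end)
    if current:
        yield current


# Toy input, made up for illustration only.
toy = [
    ("chr1", 0, 10, 1.0),
    ("chr1", 10, 15, 0.5),
    ("chr1", 15, 30, 1.0),
    ("chr1", 100, 120, 1.0),
    ("chr2", 0, 5, 1.0),
]
strict = list(mappable_regions(toy, min_score=1.0, max_gap=0))
bridged = list(mappable_regions(toy, min_score=1.0, max_gap=5))
```

With `max_gap=0` the toy genome is cut at every low-scoring stretch; with `max_gap=5` the first two passing intervals are joined into one region. The trade-off is that bridged regions then contain a few poorly mappable bases, so the gap size is a judgment call.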

BONUS: If someone has a BED file (not WIG) of the mappable genome for read length = 100, I would be very interested.

Thanks in advance.

genome next-gen sequencing mapping analysis • 3.8k views

I would really like to know about this as well: "Do you, in practice today, use this notion to fine tune your analyses (SNP / CNV calling, coverage ... ) ?" Can anyone please answer this?

12.1 years ago
Pascal ★ 1.5k

Very interesting question. Regarding the bonus question, why not simply download it from this UCSC data download page (maybe the file wgEncodeCrgMapabilityAlign100mer.bw.gz)? Then use bigWigToBedGraph to convert it to bedGraph format.
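
For what it's worth, the download-then-convert step could be scripted roughly like this in Python. This is only a sketch: it assumes the file has already been downloaded, that it is gzip-compressed as the `.gz` suffix suggests, and that the `bigWigToBedGraph` binary is on your PATH. The helper names (`gunzip`, `bigwig_to_bedgraph_cmd`, `convert`) are made up for this example; the only tool usage relied on is `bigWigToBedGraph in.bigWig out.bedGraph`.

```python
import gzip
import shutil
import subprocess
from pathlib import Path


def gunzip(src, dst):
    """Decompress a gzip file src into dst."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


def bigwig_to_bedgraph_cmd(bigwig, bedgraph, tool="bigWigToBedGraph"):
    """Build the command line: bigWigToBedGraph in.bigWig out.bedGraph."""
    return [tool, str(bigwig), str(bedgraph)]


def convert(gz_path, workdir="."):
    """Decompress a .bw.gz file and convert it to bedGraph.

    Requires the UCSC bigWigToBedGraph binary to be installed.
    Returns the path of the bedGraph file that was written.
    """
    workdir = Path(workdir)
    bigwig = workdir / Path(gz_path).stem       # "x.bw.gz" -> "x.bw"
    bedgraph = bigwig.with_suffix(".bedGraph")  # "x.bw" -> "x.bedGraph"
    gunzip(gz_path, bigwig)
    subprocess.run(bigwig_to_bedgraph_cmd(bigwig, bedgraph), check=True)
    return bedgraph
```

Calling `convert("wgEncodeCrgMapabilityAlign100mer.bw.gz")` would then leave a four-column bedGraph file (chrom, start, end, score) next to the decompressed bigWig.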


FYI, you can download bigWigToBedGraph from http://hgdownload.cse.ucsc.edu/admin/exe/


Thank you Pascal, I am going to give this a try.


Dear Pascal, I am a bit confused: for instance, chr1 starts with 'chr1 0 14 0.00277778', yet there are 10000 N's at the beginning of the chr1 sequence, so mappability should equal 0 at these positions. Any idea why? An offset or something? Thanks


I did not notice at first that you pointed me to an hg18 link. With hg19, the bedGraph file looks better.
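
In case it helps someone else catch this kind of assembly mismatch early, here is a small Python sketch of the sanity check I effectively did by eye: measure the run of N's at the start of the chromosome sequence, then look for bedGraph intervals with a non-zero score inside that run. The function names are made up for this example, and the "matching" track below is toy data invented for illustration; only the 'chr1 0 14 0.00277778' record comes from the file discussed above.

```python
def leading_n_run(sequence):
    """Return the number of N/n bases at the start of a sequence."""
    count = 0
    for base in sequence:
        if base not in "Nn":
            break
        count += 1
    return count


def nonzero_in_window(intervals, chrom, start, end):
    """Return bedGraph intervals with score > 0 overlapping [start, end).

    intervals: iterable of (chrom, start, end, score), 0-based half-open.
    """
    return [
        iv for iv in intervals
        if iv[0] == chrom and iv[1] < end and iv[2] > start and iv[3] > 0
    ]


# Stand-in for the chr1 sequence: 10000 N's followed by a few real bases.
chr1_seq = "N" * 10000 + "TAACCC"
n_len = leading_n_run(chr1_seq)

# First record of the track I downloaded by mistake.
mismatched = [("chr1", 0, 14, 0.00277778)]
# Toy track (invented values) that is consistent with the N run.
matching = [("chr1", 0, 10000, 0.0), ("chr1", 10000, 10014, 0.0028)]

suspicious = nonzero_in_window(mismatched, "chr1", 0, n_len)
clean = nonzero_in_window(matching, "chr1", 0, n_len)
```

Any hit in `suspicious` means the track claims some mappability inside a stretch of N's, which is a good hint that the track and the FASTA do not come from the same assembly.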

12.1 years ago
Ian 6.0k

If there is no mappability data for your genome/read length, you may find ProMap useful: it generates mappability profiles for use with PICS (an R-based ChIP-seq peak caller by the Gottardo lab).

This BioStars question may also help.
