Question: Fine-Tune NGS Downstream Analyses Using Genome Mappability?
toni (Lyon) wrote, 8.1 years ago:

Hi all,

The question of using genome mappability has been around in my team for a long time, but so far we have made no concrete effort to use it in our analyses.

The mappability of a genomic region refers to the ability of a DNA stretch from that region, when produced by a sequencing experiment, to map back unambiguously to its location on the reference genome (see this paper from Derrien et al.).

On the UCSC website, you can find several 'Mappability' tracks tagged as Alignability, Uniqueness or Blacklisted Regions, notably for varying read lengths.

What I would like to know is:

Do you, in practice today, use this notion to fine-tune your analyses (SNP / CNV calling, coverage ...)?

EDIT: Answering 'NO', with an explanation of the difficulties you encountered or the reason why you do not use it, is a good answer as well :)

Concretely, mappability (or uniqueness) is a score between 0 and 1 for each base along the reference genome. I am also wondering to what extent it can be used to define larger (and therefore more usable in practice) mappable regions. Applying a hard threshold to this score could result in a very fragmented genome.
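To make the thresholding concern concrete, here is a minimal Python sketch, not an established method. It assumes the scores are available as sorted four-column bedGraph lines (chrom, start, end, score). The function name `mappable_regions` and the parameters `min_score`, `max_gap` and `min_length` are hypothetical, chosen only for illustration. The idea is to keep intervals at or above the threshold, bridge short gaps between them, and drop regions that remain too small. Note that bridging a gap pulls a few low-mappability bases into the region, which may or may not be acceptable for a given analysis.

```python
from typing import Iterable, Iterator, Optional, Tuple

Region = Tuple[str, int, int]  # (chrom, start, end), 0-based half-open as in BED


def mappable_regions(
    bedgraph_lines: Iterable[str],
    min_score: float = 1.0,   # hypothetical threshold on the mappability score
    max_gap: int = 0,         # hypothetical: bridge gaps of at most this many bases
    min_length: int = 1,      # hypothetical: discard regions shorter than this
) -> Iterator[Region]:
    """Yield merged regions whose score is >= min_score.

    Assumes the input is sorted by chromosome and start position.
    """
    current: Optional[Region] = None
    for line in bedgraph_lines:
        line = line.strip()
        # Skip blank lines and bedGraph header/comment lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start_s, end_s, score_s = line.split()[:4]
        start, end = int(start_s), int(end_s)
        if float(score_s) < min_score:
            continue
        if (
            current is not None
            and current[0] == chrom
            and start - current[2] <= max_gap
        ):
            # Close enough to the previous kept interval: extend it.
            current = (chrom, current[1], max(current[2], end))
        else:
            if current is not None and current[2] - current[1] >= min_length:
                yield current
            current = (chrom, start, end)
    if current is not None and current[2] - current[1] >= min_length:
        yield current


if __name__ == "__main__":
    # Made-up example lines, not real track data.
    example = [
        "chr1\t0\t10000\t0.0",
        "chr1\t10000\t10500\t1.0",
        "chr1\t10500\t10520\t0.5",
        "chr1\t10520\t11000\t1.0",
        "chr2\t0\t30\t1.0",
    ]
    for chrom, start, end in mappable_regions(example, max_gap=50, min_length=100):
        print(f"{chrom}\t{start}\t{end}")  # one BED line per merged region
```

With the made-up lines above, `max_gap=50` and `min_length=100` give a single region `chr1 10000 11000`, whereas `max_gap=0` splits it into `chr1 10000 10500` and `chr1 10520 11000`. That is the fragmentation effect I am worried about.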

BONUS: If someone has a BED file (not WIG) of the mappable genome for read length = 100, I would be very interested.

Thanks in advance.

modified 8.1 years ago by Pascal • written 8.1 years ago by toni

I would really like to know the answer to "Do you, in practice today, use this notion to fine-tune your analyses (SNP / CNV calling, coverage ...)?" Can anyone please answer this?

written 8.1 years ago by Vikas Bansal
Pascal (Barcelona) wrote, 8.1 years ago:

Very interesting question. Regarding the bonus question, why not simply download it from this UCSC data download page (maybe the file wgEncodeCrgMapabilityAlign100mer.bw.gz)? You can then use bigWigToBedGraph to convert it to bedGraph format.
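A minimal Python sketch of that conversion, assuming the file has already been downloaded locally and that the UCSC `bigWigToBedGraph` binary is on your PATH. The helper names (`gunzip`, `bigwig_to_bedgraph_cmd`, `convert`) are made up for illustration. The file name is the one guessed above and may not be the right one for your assembly. The tool is invoked as `bigWigToBedGraph in.bigWig out.bedGraph`.

```python
import gzip
import shutil
import subprocess
from pathlib import Path
from typing import List


def gunzip(src: Path, dst: Path) -> None:
    """Decompress a .gz file to dst."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


def bigwig_to_bedgraph_cmd(bigwig: Path, bedgraph: Path,
                           tool: str = "bigWigToBedGraph") -> List[str]:
    """Build the command line: bigWigToBedGraph in.bigWig out.bedGraph."""
    return [tool, str(bigwig), str(bedgraph)]


def convert(gz_path: Path, tool: str = "bigWigToBedGraph") -> Path:
    """Decompress a .bw.gz file and convert it to bedGraph next to it."""
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} not found on PATH")
    bigwig = gz_path.with_suffix("")             # strips the trailing .gz
    bedgraph = bigwig.with_suffix(".bedGraph")   # replaces .bw with .bedGraph
    gunzip(gz_path, bigwig)
    subprocess.run(bigwig_to_bedgraph_cmd(bigwig, bedgraph, tool), check=True)
    return bedgraph


if __name__ == "__main__":
    # Assumed local file name; adjust to whatever you actually downloaded.
    print(convert(Path("wgEncodeCrgMapabilityAlign100mer.bw.gz")))
```

The resulting bedGraph has four columns (chrom, start, end, score), so it can be filtered on the score column to obtain a BED file of mappable regions.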

written 8.1 years ago by Pascal

FYI, you can download bigWigToBedGraph from http://hgdownload.cse.ucsc.edu/admin/exe/

written 8.1 years ago by Pascal

Thank you Pascal, I am going to give this a try.

written 8.1 years ago by toni

Dear Pascal, I am a bit confused: for instance, chr1 starts with the line 'chr1 0 14 0.00277778', yet there are 10000 N's at the beginning of the chr1 sequence, so mappability should equal 0 at these positions. Any idea why? An offset or something? Thanks.

written 8.0 years ago by toni

I did not notice at first that you pointed me to an hg18 link. With hg19, the bedGraph file looks better.

written 8.0 years ago by toni
Ian (University of Manchester, UK) wrote, 8.1 years ago:

If there is no mappability data for your genome or read length, you may find ProMap useful: it generates mappability profiles for use with PICS (an R-based ChIP-seq peak caller by the Gottardo lab).

This BioStars question may also help.

modified 6 months ago by RamRS • written 8.1 years ago by Ian