Question: Where to download blacklisted regions?
3
4.5 years ago by
saravanakumar99250 wrote:

Where can one download the blacklisted regions from the UCSC ENCODE data?

chip-seq blacklist • 8.5k views
ADD COMMENTlink modified 11 months ago by igor11k • written 4.5 years ago by saravanakumar99250
10
11 months ago by
igor11k
United States
igor11k wrote:

For those who still land on this question, there is now an updated version (v2) of the blacklists available here: https://github.com/Boyle-Lab/Blacklist/tree/master/lists

These blacklists are described in The ENCODE Blacklist: Identification of Problematic Regions of the Genome (June 2019):

Here, we define the ENCODE blacklist- a comprehensive set of regions in the human, mouse, worm, and fly genomes that have anomalous, unstructured, or high signal in next-generation sequencing experiments independent of cell line or experiment.

ADD COMMENTlink modified 8 months ago • written 11 months ago by igor11k
7
3.5 years ago by
Friederike5.8k
United States
Friederike5.8k wrote:

Just as an update for those in need of more recent blacklist regions (ce10, mm10, hg38): Anshul Kundaje supplies them here.

ADD COMMENTlink written 3.5 years ago by Friederike5.8k
1

This list was reported to contain overlaps. I looked into it and found the overlapping pair: chr16 34586660 34587100 and chr16 34587060 34587660.

Does anyone have an updated or fixed version?

Thanks,

Weiyan

ADD REPLYlink written 2.3 years ago by weiyanjia200830

I'm afraid I don't understand the issue. Can you elaborate?

ADD REPLYlink written 2.3 years ago by Friederike5.8k
1

When I use this file as a blacklist with deepTools bamCompare, it reports that the list contains overlaps: chr16 34586660 34587100 and chr16 34587060 34587660.

My understanding is that the second range overlaps the first, since its start (34587060) is less than the end of the first range (34587100).
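
As a rough illustration of the check described here, the following Python sketch flags overlapping pairs in a list of BED-style intervals. The function name `find_overlaps` and the chr1 interval are made up for the example, and the half-open coordinate assumption (an interval ending at 200 does not overlap one starting at 200) is mine, not something the tool's error message states.

```python
from collections import defaultdict


def find_overlaps(intervals):
    """Return pairs of overlapping (chrom, start, end) intervals.

    Assumes BED-style half-open coordinates: two intervals on the same
    chromosome overlap when the later one starts before the earlier one ends.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    overlaps = []
    for chrom, spans in by_chrom.items():
        spans.sort()  # order by start, then end
        prev_start, prev_end = spans[0]
        for start, end in spans[1:]:
            if start < prev_end:
                overlaps.append(((chrom, prev_start, prev_end), (chrom, start, end)))
            # Keep comparing against the interval that reaches furthest right.
            if end > prev_end:
                prev_start, prev_end = start, end
    return overlaps


regions = [
    ("chr16", 34586660, 34587100),  # pair reported in this thread
    ("chr16", 34587060, 34587660),
    ("chr1", 100, 200),             # hypothetical non-overlapping interval
]
for first, second in find_overlaps(regions):
    print(first, "overlaps", second)
```

With the chr16 pair above, the second interval starts at 34587060, which is before the first one ends at 34587100, so the pair is reported.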

ADD REPLYlink written 2.3 years ago by weiyanjia200830
2

You could use bedtools merge -i blacklist.bed to merge those overlapping regions.
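
Note that bedtools merge expects its input to be sorted by chromosome and then by start position. For anyone who wants to see what the merge does, or who cannot install bedtools, here is a minimal pure-Python sketch of the same idea. The function name `merge_intervals` is hypothetical, and this is an approximation of the default behaviour (overlapping and book-ended intervals are combined), not a replacement for the tool.

```python
def merge_intervals(intervals):
    """Merge overlapping or book-ended (chrom, start, end) intervals.

    Sorting is done here, so the input may be in any order. Chromosome names
    are sorted as plain strings (so "chr10" comes before "chr2").
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous interval instead of starting a new one.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


regions = [
    ("chr16", 34586660, 34587100),
    ("chr16", 34587060, 34587660),
]
print(merge_intervals(regions))
# prints [('chr16', 34586660, 34587660)]
```

The two chr16 entries collapse into a single region spanning both, which is what removes the overlap complaint.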

ADD REPLYlink written 2.2 years ago by Friederike5.8k

Thank you very much! It works now.

ADD REPLYlink written 22 months ago by weiyanjia200830

I also downloaded the hg38 blacklist from Anshul Kundaje's repository and found not only the overlap @weiyanjia2008 mentioned, but also chr20 31067930 31069060 and chr20 31069000 31069280. These two regions should also be merged. Merging all of them (the chr16 pair you identified and the chr20 pair I identified) solved the problem for me. I have just emailed Anshul Kundaje to report the issue, in case he has not noticed it yet, so that it can be fixed and the next person who downloads the file does not face the same problem.

ADD REPLYlink modified 16 months ago • written 16 months ago by msimmer92250

I have been working on these things lately. Here is some more information.

ADD REPLYlink written 3.5 years ago by venu6.6k

Hi @Friederike and @venu, why are the hg38 and hg19 lists different? EDIT: The hg38 list seems to be smaller, probably because many regions have been fixed in the hg38 assembly. Does the hg38 blacklist also contain mitochondrial homologs? I believe the homologs remain whether it is hg19 or hg38. Any insights on this? Thanks!

ADD REPLYlink modified 2.8 years ago • written 2.8 years ago by abhishekniroula750

I have no deep insight into the specific differences between hg38 and hg19, but hg38 is generally considered to contain more "bait" sequences (meant to scavenge reads that probably fell into some of the blacklisted regions before). I recommend contacting Anshul Kundaje directly (and ideally sharing his response here).

ADD REPLYlink written 2.8 years ago by Friederike5.8k
1
4.5 years ago by
Devon Ryan96k
Freiburg, Germany
Devon Ryan96k wrote:

There's not yet a blacklist available for each species or even each version of mouse/human. You can get the mm9 blacklist here. The equivalent for hg19 is here. There's no equivalent for hg38 and I'm not sure that lifting things over will work, though you could certainly try. We have an mm10 version of this, but I don't know that it's publicly available.

ADD COMMENTlink written 4.5 years ago by Devon Ryan96k
1
13 months ago by
geocarvalho140
Brazil/Recife
geocarvalho140 wrote:

If someone wants the SV exclude list, 10x Genomics has a topic about the SV Calling Filter File.

ADD COMMENTlink modified 19 days ago • written 13 months ago by geocarvalho140