Where to download blacklisted regions?
8.8 years ago

Where can one download the blacklisted regions from the UCSC ENCODE data?

ChIP-Seq blacklist • 25k views
5.2 years ago
igor 13k

For those who still land on this question, there is now an updated version (v2) of the blacklists available here: https://github.com/Boyle-Lab/Blacklist/tree/master/lists

These blacklists are described in The ENCODE Blacklist: Identification of Problematic Regions of the Genome (June 2019):

Here, we define the ENCODE blacklist- a comprehensive set of regions in the human, mouse, worm, and fly genomes that have anomalous, unstructured, or high signal in next-generation sequencing experiments independent of cell line or experiment.

7.7 years ago

Just as an update for those in need of more recent blacklist regions (ce10, mm10, hg38): Anshul Kundaje supplies them here.


It was reported to have overlaps. I looked into the list and found the overlap:

chr16 34586660 34587100
chr16 34587060 34587660

Does anyone have an updated or fixed version?

Thanks,

Weiyan


I'm afraid I don't understand the issue. Can you elaborate?


When I use this file as a blacklist to run deepTools bamCompare, it reports that the list has overlaps:

chr16 34586660 34587100
chr16 34587060 34587660

My understanding is that the second range overlaps the first, because its start (34587060) is less than the end of the first (34587100).
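
To check a whole blacklist for this kind of overlap rather than spotting it by eye, something like the sketch below could be used. It is only an illustration, not what deepTools does internally: the function name find_overlaps is made up for this example, and intervals are assumed to be (chrom, start, end) tuples with BED-style half-open coordinates, so two regions that merely touch are not counted as overlapping.

```python
from collections import defaultdict


def find_overlaps(intervals):
    """Return (earlier, later) pairs of intervals that overlap on the same chromosome.

    `intervals` is an iterable of (chrom, start, end) tuples. Coordinates are
    treated as half-open, as in BED, so start == previous end is NOT an overlap.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    overlaps = []
    for chrom, spans in by_chrom.items():
        spans.sort()
        # Track the earlier interval that reaches furthest to the right, so
        # that intervals nested inside a long region are also reported.
        widest = spans[0]
        for span in spans[1:]:
            if span[0] < widest[1]:
                overlaps.append(((chrom, *widest), (chrom, *span)))
            if span[1] > widest[1]:
                widest = span
    return overlaps


# The two hg38 regions quoted above.
regions = [
    ("chr16", 34586660, 34587100),
    ("chr16", 34587060, 34587660),
]
for earlier, later in find_overlaps(regions):
    print("overlap:", earlier, later)
```

Each reported pair is an interval together with the earlier interval it runs into, which is enough to see which lines of the file need fixing.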


You could use bedtools merge -i blacklist.bed to merge those overlapping regions. Note that bedtools merge expects the input to be sorted by chromosome and then by start position.
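
If bedtools is not at hand, the same idea can be sketched in a few lines of Python. This is not the bedtools implementation, just a minimal illustration of interval merging; merge_intervals is a hypothetical helper name, and it assumes (chrom, start, end) tuples. Like the default behaviour of bedtools merge, it also joins regions that touch end to start.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent (chrom, start, end) intervals.

    The input does not need to be sorted; it is sorted here by chromosome
    name (plain string order) and then by start position.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        last = merged[-1] if merged else None
        if last is not None and last[0] == chrom and start <= last[2]:
            # Overlaps or touches the previous region: extend it if needed.
            last[2] = max(last[2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(region) for region in merged]


# The overlapping chr16 pair discussed in this thread, plus an unrelated region.
blacklist = [
    ("chr16", 34587060, 34587660),
    ("chr16", 34586660, 34587100),
    ("chr1", 1000, 2000),
]
for chrom, start, end in merge_intervals(blacklist):
    print(chrom, start, end, sep="\t")
```

Because chromosome names are compared as plain strings, the output order is chr1, chr10, chr2 and so on, which may differ from the order in the original file.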


Thank you very much! It works now.


I also downloaded the hg38 blacklist from Anshul Kundaje's repository and found not only the overlap @weiyanjia2008 mentioned, but also this one:

chr20 31067930 31069060
chr20 31069000 31069280

These two regions should also be merged. Merging all of them (the chr16 regions you identified and the chr20 regions I identified) solved the problem for me. I just emailed Anshul Kundaje to report this issue, in case he hasn't noticed it yet, so that it can be fixed and the next person who downloads the file doesn't face the same problem.


I have been working on these things lately. Here is some more information.


Hi @Friederike and @venu, why are the hg38 and hg19 lists different? EDIT: The hg38 list seems to be smaller, probably because many regions have been fixed in the hg38 assembly. Does the hg38 blacklist also contain mitochondrial homologs? I believe the homologs remain in both hg19 and hg38. Any insights on this? Thanks!


I have no deep insights into the specific differences between hg38 and hg19, but hg38 is generally considered to contain more "bait" sequences (meant to scavenge away reads that probably fell into some of the blacklisted regions before). I recommend contacting Anshul Kundaje directly (and ideally sharing his response here).


Hi, can you comment on this issue: https://github.com/Boyle-Lab/Blacklist/issues/11

I really don't understand how the two files differ and which ones should be ultimately used.

2.4 years ago
ATpoint 85k

From "stable" repositories in 2022:

The general ENCODE blacklists at https://github.com/Boyle-Lab/Blacklist/raw/master/lists/

The ATAC-seq blacklists from the Buenrostro lab (mitochondrial homologs in the genome) at https://github.com/buenrostrolab/mitoblacklist/tree/master/peaks

8.8 years ago

There's not yet a blacklist available for each species or even each version of mouse/human. You can get the mm9 blacklist here. The equivalent for hg19 is here. There's no equivalent for hg38 and I'm not sure that lifting things over will work, though you could certainly try. We have an mm10 version of this, but I don't know that it's publicly available.

5.4 years ago
geocarvalho ▴ 370

If someone wants the SV exclude list, 10x Genomics has a topic about the SV Calling Filter File.
