Is filtering of blacklisted regions in ChIP-seq still needed? (let's talk about the state of the art)
3.9 years ago
msimmer92 ▴ 300

Hello, I am analysing ChIP-seq data and I saw the following post by @Devon Ryan (made 1 year, 7 months ago) https://bioinformatics.stackexchange.com/questions/458/when-to-account-for-the-blacklisted-genomic-regions-in-chip-seq-data-analyses/459#459?newreg=dca76bad61c443d7b4f0b1abd1487878 saying that, with the latest genome assemblies, one has fewer problems with blacklisted regions, since those regions have been reduced.

I would like to know, then, what the current state of the art is. Should I remove them or not? (By the way, I was planning to use deepTools to do it, but if it's not really necessary anymore I won't.) Thank you!

Link to blacklists. A simple bedtools intersect -v -a your_regions.bed -b blacklist.bed will do.

@ATpoint Do the BED files need to be sorted or otherwise prepared before doing that? Thanks!

No, bedtools intersect does not require sorted input by default. Sorting is only needed if you use the -sorted option.
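To make explicit what the bedtools intersect -v filter mentioned above does, here is a minimal pure-Python sketch of the same idea: keep only the regions that have no overlap with any blacklist interval. It assumes tab-separated BED input with 0-based, half-open coordinates; the function names and the toy coordinates are hypothetical and only for illustration, not real blacklist entries. For real data, bedtools itself is the better choice.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED lines into (chrom, start, end, original_line) tuples."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        records.append((fields[0], int(fields[1]), int(fields[2]), line))
    return records


def remove_blacklisted(regions, blacklist):
    """Return the region lines that overlap no blacklist interval.

    Input order does not matter: every region is checked against all
    blacklist intervals on the same chromosome.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, _ in blacklist:
        by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end, line in regions:
        # Half-open intervals overlap when each starts before the other ends;
        # intervals that merely touch end-to-start do not overlap.
        overlaps = any(
            start < b_end and b_start < end
            for b_start, b_end in by_chrom.get(chrom, ())
        )
        if not overlaps:
            kept.append(line)
    return kept


# Toy data (hypothetical coordinates, deliberately unsorted).
regions = parse_bed([
    "chr2\t100\t200\tpeak3",
    "chr1\t500\t600\tpeak2",
    "chr1\t100\t200\tpeak1",
])
blacklist = parse_bed([
    "chr2\t200\t400",
    "chr1\t150\t300",
])

kept = remove_blacklisted(regions, blacklist)
for line in kept:
    print(line)
```

In this toy example peak1 is dropped because it overlaps the chr1 blacklist interval, while peak3 is kept because it only touches the chr2 interval without overlapping it. The brute-force scan is fine for a handful of regions but scales poorly, which is one reason to prefer bedtools in a real pipeline.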

I would suggest removing them. I am not sure whether you have a blacklist for GRCh38, but if you are using GRCh37/hg19, you should remove them.

3.9 years ago

It's still considered best practice to remove these regions. For genomes like GRCh38, the blacklisted regions are largely composed of things like major satellite repeats, which are primarily located in hard-masked telomeric and pericentromeric regions. Given that, these regions will still show aberrantly high signal in all of your samples (thereby skewing normalization and often adding meaningless peaks).

Thank you for your fast reply! I will include it in my pipeline, then. Thank you also for the explanation about the location of the repeats, and therefore why it is good to filter them anyway, even if you use the latest version of the assembly. (P.S.: I didn't clarify, but I'm working mostly with the mouse and human genomes.) Follow-up question: do you have any thoughts on whether it is better to filter these regions with deepTools or with bedtools intersect (as another person commented here)?

It doesn't much matter whether you filter with deepTools or bedtools intersect; you should get essentially the same results either way.