DNA copy number blacklist for excluding common unreliable regions
9 weeks ago

Can anyone recommend reliable sources of DNA copy number blacklists for excluding unreliable regions of the human genome before copy number analysis?

blacklist copy number • 2.5k views

Please include the target species you are working on. The availability of these kinds of resources varies a lot depending on how established your model system is. If you work on a non-model system, you might have to compile this list yourself, or filter results accordingly. For example, immune genes like MHC are particularly difficult to get an accurate copy number for without careful experimental design.


Human genome, ideally hg38

19 days ago
Kevin Blighe ★ 90k

Hi,

For human hg38 copy number analysis, the go-to exclusion sets focus on mappability issues, repeats, and artifacts. Here's what I'd recommend; stick to these well-maintained ones:

  • ENCODE/DAC Blacklist (unified hg38): Comprehensive for high-signal artifact regions. Download BED: ENCODE portal. Also available via Boyle Lab GitHub.

  • Duke Excluded Regions (lifted to hg38): Filters out low-mappability areas from ENCODE pilots. Grab from UCSC (hg19 base, liftOver to hg38 via UCSC tool): wgEncodeDukeMapabilityRegionsExcludable.

  • Unified Blacklist (Stuart Lab): Merges ENCODE + Duke for hg38, great for CNV pipelines. Direct BED: stuartlab.org.

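Concatenate and sort the BED files, combine them with bedtools merge, then use bedtools intersect to flag or drop the bins that overlap the merged set. A recent 2025 review confirms ENCODE's still the gold standard, but test overlaps for your data. Avoid over-filtering segmental duplications (segdups) if your assay handles them.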
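If it helps to see the merge-then-intersect logic spelled out, here is a minimal pure-Python sketch of it. It is not a replacement for bedtools, just an illustration under stated assumptions: intervals are 0-based half-open BED-style tuples, the function names (`merge_intervals`, `excluded_fraction`, `filter_bins`) and the `max_fraction` threshold are hypothetical, and the coordinates are made-up toy values rather than real blacklist regions.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or book-ended (chrom, start, end) intervals per chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    merged = {}
    for chrom, spans in by_chrom.items():
        spans.sort()
        out = [list(spans[0])]
        for start, end in spans[1:]:
            if start <= out[-1][1]:
                # Overlaps or touches the previous interval: extend it.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = [tuple(span) for span in out]
    return merged


def excluded_fraction(bin_, merged):
    """Fraction of a (chrom, start, end) bin covered by merged exclusion regions."""
    chrom, start, end = bin_
    covered = 0
    for s, e in merged.get(chrom, []):
        covered += max(0, min(end, e) - max(start, s))
    return covered / (end - start)


def filter_bins(bins, merged, max_fraction=0.0):
    """Keep bins whose excluded fraction does not exceed max_fraction (hypothetical threshold)."""
    return [b for b in bins if excluded_fraction(b, merged) <= max_fraction]


# Toy exclusion sets standing in for two downloaded BED files (made-up coordinates).
set_a = [("chr1", 100, 200), ("chr1", 150, 300)]
set_b = [("chr1", 300, 350), ("chr2", 0, 50)]
merged = merge_intervals(set_a + set_b)

# Toy copy number bins.
bins = [("chr1", 0, 100), ("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 100, 200)]
kept = filter_bins(bins, merged)
```

Raising `max_fraction` above 0 keeps bins that are only partly covered, which is one way to check how much of your data a given exclusion set would remove before committing to it.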

Kevin
