Blacklisted regions for hg38?
7.4 years ago

Will blacklisted region annotations for hg38 be made?

There are some (to me) cryptic remarks in the 2015 UCSC paper:

One of the biggest innovations in the GRCh38 assembly is the replacement of megabase-sized gaps in human centromere regions with satellite sequence reference models. These models are generated using second-order Markov models of local ordering and frequency of repeat variants through an analysis of Sanger reads from the HuRef sequencing project (25). In the absence of these sequences, roughly 3% of the human genome represented by alpha satellite DNA is often misassigned to sites in the reference assembly, inflating enrichment peak signals and accounting for current blacklisted regions (or regions typically masked for sequence-based analysis).

To accommodate next-generation sequencing read alignment pipelines, the GRCh38 assembly offers an analysis set in which several regions have been masked to improve read mapping. To avoid false mapping of reads, duplicate copies of centromeric arrays on several chromosomes have been hard-masked (represented as a string of 'N' characters). The two pseudoautosomal regions on chromosome Y have also been hard-masked, and the Epstein-Barr virus sequence has been added as a decoy to attract contamination in samples. Two versions of the analysis set are available on the UCSC Genome Browser downloads page: one without the alternate chromosomes from this assembly and one that includes them.
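
To make "hard-masked" concrete: a masked region is simply a run of 'N' characters in the FASTA sequence, so it can be located by scanning for such runs. The following is a minimal sketch, not something from the paper; the function name `find_n_runs` and the toy sequence are hypothetical, and a real chromosome would first have to be read from the analysis-set FASTA file.

```python
import re

def find_n_runs(sequence, min_length=1):
    """Return (start, end) pairs, 0-based and half-open, for each run of
    N/n characters that is at least min_length bases long."""
    runs = []
    for match in re.finditer(r"[Nn]+", sequence):
        if match.end() - match.start() >= min_length:
            runs.append((match.start(), match.end()))
    return runs

# Toy sequence for illustration only (not real genome data).
toy = "ACGTNNNNACGNNT"
print(find_n_runs(toy))                # all runs of N
print(find_n_runs(toy, min_length=3))  # only runs of 3 or more N
```

Note that reads cannot align inside such runs, which is a different mechanism from a blacklist: a blacklist flags regions where reads do align but the signal is not trustworthy.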

Does this mean that a blacklist annotation is not needed for hg38?

blacklist • 4.1k views

If by "blacklisted regions" you mean the ones produced by ENCODE, no, they are different. The first paragraph just means that GRCh38 contains computer-generated artificial sequence. [25] is great work, but I am always concerned about putting these sequences in the reference genome.


Thanks. I'll stick with hg19 for now, then, so that I can remove reads falling into the blacklisted regions.
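
A minimal sketch of that filtering step, assuming the blacklist is available as BED-style lines (chrom, start, end; 0-based, half-open) and the reads have already been reduced to (chrom, start, end) tuples. The function names and all coordinates below are hypothetical and for illustration only; in practice a dedicated tool such as bedtools would normally do this directly on BED or BAM files.

```python
from bisect import bisect_right
from collections import defaultdict

def load_blacklist(bed_lines):
    """Parse BED-style lines into sorted, merged intervals per chromosome."""
    by_chrom = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and BED header lines
        chrom, start, end = line.split()[:3]
        by_chrom[chrom].append((int(start), int(end)))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [intervals[0]]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:  # overlapping or touching: extend
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged

def overlaps_blacklist(blacklist, chrom, start, end):
    """True if the half-open interval [start, end) overlaps a blacklisted region."""
    intervals = blacklist.get(chrom)
    if not intervals:
        return False
    # Last merged interval that begins before the read ends.
    i = bisect_right(intervals, (end - 1, float("inf"))) - 1
    return i >= 0 and intervals[i][1] > start

def filter_reads(reads, blacklist):
    """Keep only reads that do not overlap any blacklisted region."""
    return [r for r in reads if not overlaps_blacklist(blacklist, r[0], r[1], r[2])]

# Made-up coordinates, not taken from any real blacklist.
blacklist = load_blacklist(["chr1\t100\t200", "chr1\t150\t300", "chr2\t50\t60"])
reads = [("chr1", 0, 100), ("chr1", 90, 101), ("chr1", 300, 350)]
print(filter_reads(reads, blacklist))
```

Merging the intervals first means each chromosome's list is sorted and disjoint, so a single binary search per read is enough to decide whether it overlaps.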


Note that there are fewer blacklisted regions on hg38, as it is a much better assembly. Nonetheless, until someone does a systematic study of bad regions on hg38, hg19 is perhaps still the better choice.


Feel free to add that as an answer so that the question can be closed (or whatever the term is).
