Error-prone human genomic sequence data from Illumina sequencing platform
3.7 years ago
rbpdee ▴ 50

I am curious whether there is a publicly available BED file or other kind of track that marks sequencing error-prone sequences or regions in the human genome. To be precise, I am looking for DNA regions that tend to show insertions or deletions due to sequencing errors from DNA pol slippage, e.g., stretches of As or Gs (homopolymeric regions).

Alternatively, a list of sequences that are difficult to sequence with the Illumina Sequencing platform could work too!

genome igv deletions insertions • 937 views
3.7 years ago
donfreed ★ 1.6k

You might try the GA4GH genome stratifications: https://github.com/genome-in-a-bottle/genome-stratifications

The resource includes many regions (as BED files) that are difficult to sequence with short-read technology (low-mappability, low-complexity, etc.). In particular, the homopolymer and simple repeat regions in the LowComplexity folder will be enriched for "sequencing errors from DNA pol slippage".
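
If you only need a quick homopolymer track for a small region or a custom reference, a rough sketch like the one below may be enough. It scans a sequence for single-base runs and reports them as 0-based, half-open BED intervals. The function name `homopolymer_bed` and the `min_len=7` cutoff are illustrative assumptions, not something taken from the stratification resource, so pick a threshold that suits your data.

```python
import re


def homopolymer_bed(chrom, seq, min_len=7):
    """Yield (chrom, start, end, name) tuples for homopolymer runs.

    Coordinates are 0-based, half-open, as in BED.
    min_len is an arbitrary illustrative cutoff (assumption).
    """
    if min_len < 1:
        raise ValueError("min_len must be >= 1")
    # One base followed by at least (min_len - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    for match in pattern.finditer(seq.upper()):
        run_len = match.end() - match.start()
        yield (chrom, match.start(), match.end(), f"{match.group(1)}x{run_len}")


if __name__ == "__main__":
    # Toy sequence for illustration only.
    toy = "ACGT" + "A" * 8 + "GGC" + "G" * 7 + "T"
    for record in homopolymer_bed("chr1", toy):
        print("\t".join(str(field) for field in record))
```

Runs containing N or other ambiguity codes are skipped by this pattern. For whole-genome work the prebuilt BED files are the better choice, since they also cover simple repeats and low-mappability regions.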


Thanks donfreed! I will look into it.
