I am preparing a list of regions of the genome that are likely to include CNVs. To do so, I am excluding assembly gaps, regions with poor mappability, and repeat regions as reported by UCSC. I know from the literature that regions near centromeres/telomeres, and regions with very low or very high GC content, are also worth excluding. My questions are: a) what does "near" a centromere/telomere mean in practice, i.e. how large a flanking distance should be excluded? b) what are meaningful lower and upper thresholds for GC content? and finally, c) are there any other features I should be aware of?
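To make the setup concrete, here is a minimal sketch of the kind of filtering I mean, in plain Python. The function names and the toy coordinates are only illustrative assumptions, not my actual pipeline; in practice a tool such as bedtools would do the interval subtraction on BED files. No GC threshold is hard-coded, since that is exactly what I am asking about.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract_intervals(region, excluded):
    """Return the parts of `region` (start, end) not covered by `excluded`."""
    start, end = region
    kept = []
    cursor = start
    for ex_start, ex_end in merge_intervals(excluded):
        if ex_end <= cursor:
            continue  # excluded interval lies entirely before the cursor
        if ex_start >= end:
            break  # excluded interval lies entirely after the region
        if ex_start > cursor:
            kept.append((cursor, ex_start))
        cursor = max(cursor, ex_end)
    if cursor < end:
        kept.append((cursor, end))
    return kept


def gc_fraction(seq):
    """GC fraction over called bases (A/C/G/T); None if there are none."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return None
    return (seq.count("G") + seq.count("C")) / called


# Toy example (hypothetical coordinates): a 100 bp window with gaps,
# repeats, or low-mappability stretches to remove.
candidate = (0, 100)
excluded = [(10, 20), (15, 30), (90, 120)]
usable = subtract_intervals(candidate, excluded)
```

The remaining sub-intervals in `usable` would then be filtered further by `gc_fraction` once suitable lower and upper bounds are known.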
Thank you very much!