I have called CNVs from my WGS data and want to do some QC. To do this, I plan to exclude any segment for which more than 50% of its length overlaps regions in the lists below, but I have some doubts. Can you please help me?
. Immunoglobulin regions
. Extreme GC content (>90% or <10%): Do these thresholds make sense? My CNVs were called with the GC option in Control-FREEC, so I don't know whether this filter is really necessary. Any thoughts?
. Mappability: Should I use the uniqueness or the alignability definition for this? If I choose uniqueness, should I simply filter out all regions with uniqueness < 1? If I use alignability, which threshold would you use?
. RepeatMasker
. Common CNVs: I am currently using the DGV. Is this advisable?
. Is there any other list you would recommend for cleaning my data?
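To make the 50% rule concrete, here is a minimal Python sketch of the filter I have in mind. The function names (merge_intervals, covered_fraction, filter_segments) and the toy coordinates are made up for illustration. It assumes half-open, BED-style intervals and merges the excluded regions first, so that overlapping annotations are not counted twice.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered_fraction(seg_start, seg_end, merged):
    """Fraction of the segment's length covered by the merged intervals."""
    length = seg_end - seg_start
    if length <= 0:
        return 0.0
    covered = sum(
        max(0, min(seg_end, end) - max(seg_start, start))
        for start, end in merged
    )
    return covered / length


def filter_segments(segments, excluded, max_fraction=0.5):
    """Keep segments whose covered fraction does not exceed max_fraction.

    segments and excluded are lists of (chrom, start, end) tuples.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in excluded:
        by_chrom[chrom].append((start, end))
    merged = {chrom: merge_intervals(iv) for chrom, iv in by_chrom.items()}
    return [
        seg
        for seg in segments
        if covered_fraction(seg[1], seg[2], merged.get(seg[0], [])) <= max_fraction
    ]


# Toy example (hypothetical coordinates, not real annotations)
segments = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 0, 100)]
excluded = [("chr1", 90, 140), ("chr1", 130, 160), ("chr1", 390, 500)]

kept = filter_segments(segments, excluded)
print(kept)  # [('chr1', 300, 400), ('chr2', 0, 100)]
```

In this toy case the first segment is 60% covered and is dropped, while the second is only 10% covered and is kept. The sketch scans every excluded interval on the chromosome for each segment, which is fine for a quick check but would presumably need an interval index for genome-wide lists.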
Thanks a lot