When filtering CNVs/SVs, what percentage of overlap is most widely used to filter out a CNV/SV segment after comparison with DGV and dbVar?
Also, DGV contains a few segments that are annotated with both gain and loss. When filtering our CNV data, how should we filter our variants (individual CNV loss and CNV gain calls) against DGV variants that have both gains and losses?
Just to let you know, CNVs and SVs are different. A CNV will typically be much smaller than an SV, and filtering CNVs against SVs detected in healthy individuals in DGV and dbVar is not the right approach. If you have CNVs from your data and do not know whether they are germline or somatic, you can look in the ExAC database and remove the CNVs that overlap with yours. Ideally, the overlap cut-off should be chosen based on your biological question and on how many genes usually fall within that CNV. You can also check in browsers like ADVISER. For finding lethal SVs in your data, or for removing SVs that are seen in normal individuals, DGV or dbVar is fine. Either way, you have to annotate the CNVs or SVs and prioritize them.
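To make the overlap comparison concrete, here is a minimal Python sketch of a reciprocal-overlap check between your calls and database entries. Everything in it is an assumption for illustration: the `Segment` class, the function names, the half-open coordinates, the 0.5 default threshold (a placeholder, not a recommended cut-off), and the choice to let a database entry annotated as `"gain+loss"` match either a gain or a loss call. Adjust all of these to your own biological question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # 0-based, exclusive (assumed convention)
    kind: str   # "gain", "loss", or "gain+loss" (hypothetical labels)


def reciprocal_overlap(a: Segment, b: Segment) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if disjoint."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def type_compatible(query: Segment, reference: Segment) -> bool:
    """Assumption: a reference with both gain and loss matches either type."""
    if reference.kind == "gain+loss":
        return True
    return query.kind == reference.kind


def is_known(query: Segment, references: list, min_overlap: float = 0.5) -> bool:
    """True if any type-compatible reference meets the reciprocal threshold."""
    return any(
        type_compatible(query, ref)
        and reciprocal_overlap(query, ref) >= min_overlap
        for ref in references
    )


# Example: a loss call compared against one mixed entry and one gain entry.
call = Segment("chr1", 1000, 2000, "loss")
database = [
    Segment("chr1", 1200, 2200, "gain+loss"),
    Segment("chr1", 900, 2100, "gain"),
]
print(is_known(call, database))
```

Using the reciprocal fraction (the minimum of both directions) keeps a very large database segment from "covering" a small call, or the reverse. If you would rather require a strict type match, drop the `"gain+loss"` branch in `type_compatible`.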