Are you using Affymetrix or Illumina array data? Since so many CNVs are ostensibly caused by segmental duplication, your question is valid. In practice, we have found that the most important steps in filtering your output are:

1) merging long CNV calls that are interrupted by a minority of markers,
2) dropping telomeric regions,
3) dropping centromeric regions,
4) filtering out samples with excessively high LRR standard deviations,
5) filtering out samples with an excessive number of CNV calls,
6) removing CNVs below some threshold number of probes or length (e.g. 10 probes or 10 kb, depending on your chip).

Segmental duplications have not proved too much of an obstacle for us, but if you think they could be distorting your data, take your BED file of called CNVs and check how much of it overlaps the segmental duplications track in the UCSC Genome Browser. My coordinates for the hg18 centromeric and telomeric regions are below. Sorry for the length.
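As a rough illustration of steps 2 through 6 above, here is a minimal Python sketch. It is not the pipeline we actually run. The `Cnv` record, the `filter_calls` function, and the default cutoffs for `max_lrr_sd` and `max_calls` are all hypothetical placeholders that you would need to tune for your chip and calling software. It also assumes a call must pass both the probe-count and the length threshold. Step 1 (merging split calls) is left out because it depends on how your caller reports markers.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Cnv:
    """One CNV call (hypothetical record layout, BED-style coordinates)."""
    sample: str
    chrom: str
    start: int      # 0-based, inclusive
    end: int        # exclusive
    n_probes: int


def overlaps(cnv, regions):
    """Return True if the call overlaps any (chrom, start, end) region."""
    return any(cnv.chrom == chrom and cnv.start < end and start < cnv.end
               for chrom, start, end in regions)


def filter_calls(calls, excluded_regions, lrr_sd,
                 max_lrr_sd=0.3, max_calls=100,
                 min_probes=10, min_length=10_000):
    """Apply sample-level and call-level filters.

    excluded_regions: centromeric/telomeric (chrom, start, end) tuples.
    lrr_sd: dict mapping sample ID to its LRR standard deviation.
    The default cutoffs are placeholders, not recommendations.
    """
    # Steps 4 and 5: keep samples with acceptable LRR SD and call count.
    # Samples missing from lrr_sd are dropped, since they cannot be checked.
    n_calls = Counter(c.sample for c in calls)
    good_samples = {s for s, sd in lrr_sd.items()
                    if sd <= max_lrr_sd and n_calls[s] <= max_calls}

    # Steps 2, 3 and 6: drop calls in excluded regions and calls that
    # are too short in probes or in base pairs.
    return [c for c in calls
            if c.sample in good_samples
            and c.n_probes >= min_probes
            and c.end - c.start >= min_length
            and not overlaps(c, excluded_regions)]
```

You would call it with your parsed calls, the excluded regions, and a per-sample LRR SD table, for example `filter_calls(calls, excluded_regions, lrr_sd, max_calls=50)`.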