CNV and pseudogenes
I'm using ExomeDepth to detect CNVs, and I'm providing a BED file covering all of our targeted gene panels.

We know there is a processed pseudogene in one region, and I am wondering whether this would confound CNV calling in other genes, for example by throwing off the ‘average’ coverage?

Is it worth excluding such regions?

I think it depends on your read length and the size of the homology between your gene and the pseudogene. If you have NGS data, you can load it and inspect it manually, with IGV for example.
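If you do decide to try excluding such regions, here is a minimal Python sketch of one way to split a target list into kept and excluded intervals before handing the BED file to the CNV caller. It is not part of ExomeDepth; the function names and the coordinates are made up for illustration, and it assumes you already have the pseudogene-homologous intervals in BED-style (0-based, half-open) coordinates.

```python
from typing import List, Tuple

# (chrom, start, end) in BED-style 0-based, half-open coordinates
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals are on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def split_targets(
    targets: List[Interval], homologous: List[Interval]
) -> Tuple[List[Interval], List[Interval]]:
    """Split targets into (kept, excluded) by overlap with homologous regions."""
    kept, excluded = [], []
    for target in targets:
        if any(overlaps(target, region) for region in homologous):
            excluded.append(target)
        else:
            kept.append(target)
    return kept, excluded


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    panel = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 100, 200)]
    pseudogene_homology = [("chr1", 350, 500)]
    kept, excluded = split_targets(panel, pseudogene_homology)
    print("kept:", kept)
    print("excluded:", excluded)
```

Keeping the excluded list, rather than silently dropping those targets, makes it easy to note which exons were not assessed for CNVs.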

Even if you did exclude the region from the analysis, how could you be sure that the original reads themselves did not derive [in part] from the pseudogene?

Unfortunately, some regions of the genome are impossible to analyse faithfully with short-read NGS technology. If your group is studying a particular gene of interest, I would suggest a non-NGS protocol for studying it, or at least a targeted NGS protocol where you know that the primers target unique sequence around your genes of interest.

True, I agree, but we are in a clinical setting rather than research, so it's about generating data per panel. I guess it's just a matter of finding an appropriate way of dealing with this in a diagnostic environment.

If it's a clinical setting, then you need to ensure the fidelity of the test even more. I would not trust any NGS exome copy number tool for use in a clinical test. PCR or some other technique is the way to go. Just my opinion, though, having worked in this area.

Of course, we confirm everything by MLPA before reporting, but we are also trying to build a reliable bioinformatics pipeline.

What was your choice, in the end?

Well, we are still doing MLPA. We cannot rely on an unvalidated CNV pipeline.

