Question

CNV calling on targeted sequencing. Filter based on distance to target?

2.0 years ago

RNG_Daemon ▴ 20

I did some CNV calling on a targeted sequencing run. I was provided with a panel that contains a "SNP backbone", meaning 2000 targets distributed more or less evenly over the genome. I did the calling using CNVkit and PureCN. One step of the calling process is segmentation, where adjacent targets are merged into one segment. This can lead to a region (segment) being called lost/deleted even though it is supported by only two targets that are millions of base pairs apart. In the image below, the "Panel" shows the location of the targets. The "CNV_Freq" shows a large deletion spanning the centromere, which doesn't really make sense: the target regions flanking it are deleted, so the whole span is called deleted as well.

result of CNV calling for one chromosome

I am aware that there is no real way around this issue, but would it make sense to filter calls, for example gene losses, based on their distance to the nearest target? The further away, the less reliable?

I will admit that neighboring targets are rarely "volatile", with one going up, the next going down, and the one after that going up again.
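
To make the proposed filter concrete, here is a minimal Python sketch that computes the distance from a called region (for example a gene) to the nearest panel target on the same chromosome. The function name, the half-open interval convention, and the 1 Mb cutoff are assumptions for illustration only; none of this is part of CNVkit or PureCN.

```python
from bisect import bisect_left


def distance_to_nearest_target(targets, start, end):
    """Return the gap in bp between [start, end) and the nearest target.

    `targets` is a sorted list of non-overlapping (start, end) half-open
    intervals on ONE chromosome. Returns 0 if the region overlaps or
    touches a target, and None if there are no targets at all.
    """
    starts = [s for s, _ in targets]
    # Index of the first target that starts at or after the region's end.
    i = bisect_left(starts, end)
    best = None
    if i < len(targets):
        # Nearest target to the right of the region.
        best = targets[i][0] - end
    if i > 0:
        # Nearest target to the left; a negative gap means overlap.
        _, left_end = targets[i - 1]
        left_gap = max(0, start - left_end)
        best = left_gap if best is None else min(best, left_gap)
    return best


# Hypothetical example: two backbone targets about 5 Mb apart.
targets = [(100, 200), (5_000_000, 5_000_100)]
gene = (2_000_000, 2_000_500)
MAX_DIST = 1_000_000  # assumed cutoff, would need tuning per panel

dist = distance_to_nearest_target(targets, *gene)
print(dist, dist <= MAX_DIST)  # prints "1999800 False"
```

A call whose distance exceeds the cutoff could then be flagged as weakly supported rather than dropped outright, since the cutoff itself is arbitrary.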

rna-seq sequencing targeted cnv • 578 views

score 1 · Answer 1 · 2022-04-15

2.0 years ago

markus.riester ▴ 550

I think the easiest and probably cleanest way is to treat these large gaps as missing data. That’s what I do in the PureCN internal normalization and PSCBS segmentation. You essentially force breakpoints at the beginning and end of the gaps so that the segment means don’t influence each other.
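
The idea of forcing breakpoints at large gaps can be sketched in a few lines of Python. This is not PureCN's or PSCBS's actual code, just an illustration under assumptions: the function name, the `max_gap` parameter, and the 1 Mb value are hypothetical. Targets are split into blocks wherever two consecutive targets are further apart than `max_gap`, and each block would then be segmented on its own so that no segment can span a gap.

```python
def split_at_gaps(targets, max_gap):
    """Split sorted (start, end) targets of one chromosome into blocks.

    A new block starts whenever the distance between the end of the
    previous target and the start of the next one exceeds `max_gap`.
    The gap itself belongs to no block, i.e. it is treated as missing data.
    """
    blocks = []
    current = []
    prev_end = None
    for start, end in targets:
        if prev_end is not None and start - prev_end > max_gap:
            # Forced breakpoint: close the block before the gap.
            blocks.append(current)
            current = []
        current.append((start, end))
        prev_end = end
    if current:
        blocks.append(current)
    return blocks


# Hypothetical example: two target clusters separated by roughly 6 Mb.
targets = [(0, 100), (1_000, 1_100), (6_000_000, 6_000_100), (6_001_000, 6_001_100)]
blocks = split_at_gaps(targets, max_gap=1_000_000)
print(len(blocks))  # prints 2
```

With this layout a deletion on both sides of the centromere would be reported as two separate segments, and the uncovered region in between would simply have no call.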
