Question: CNV Calling in a Region with Homozygous SNPs
romsen wrote, 7.9 years ago:

I genotyped 100 samples for CNVs with TaqMan assays and found 6 samples with a heterozygous duplication. I used two independent assays that target the beginning and the end of the CNV, according to DGV. So I'm "quite" sure that these duplications are real gains.

We also have SNP data (LRR and BAF values) from a genome-wide SNP array for this region, but it contains only 17 SNPs. That is not many, but it is sufficient for PennCNV to call CNVs. However, we could reproduce the qPCR results in only 4 of the 6 samples.

Therefore I plotted the BAF and LRR values against SNP position and noticed several things:

In the 4 samples in which both methods (qPCR and PennCNV) called a gain, many SNPs are heterozygous (8 to 11 of 17). Thus the BAF values cluster around 4 points (AAA, AAB, ABB, BBB) and the LRR values increase.

In the 2 samples in which qPCR showed a duplication but PennCNV did not, nearly all SNPs are homozygous (15 to 16 of 17). This is why all SNPs have BAF values around 0 or 1 (AAA or BBB). The surprising part is that the LRR does not increase, although it should in the case of a duplication.
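To make the BAF patterns in the two observations above concrete, here is a minimal Python sketch of the idealized BAF value for each genotype at a given copy number. The function name `expected_baf_clusters` is made up for illustration, and real array BAF values are noisy rather than exact fractions.

```python
from fractions import Fraction

def expected_baf_clusters(copy_number):
    """Return {genotype: ideal BAF} for every genotype at this copy number.

    Idealized model (an assumption): BAF = number of B alleles / copy number.
    """
    clusters = {}
    for n_b in range(copy_number + 1):
        genotype = "A" * (copy_number - n_b) + "B" * n_b
        clusters[genotype] = Fraction(n_b, copy_number)
    return clusters

# Compare a normal diploid region (2 copies) with a duplication (3 copies).
for cn in (2, 3):
    print(cn, {g: str(b) for g, b in expected_baf_clusters(cn).items()})
```

Under this model a homozygous SNP sits at 0 or 1 whether the region has two or three copies, so in a mostly homozygous sample the BAF carries almost no information about the duplication and the call has to rest on the LRR.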

Now my question: could the lack of an LRR increase be due to the homozygous SNPs? The data was exported from BeadStudio, so I could imagine that the internal LRR calculation [LRR = log2(Robserved/Rexpected)] failed because of threshold errors or interpolation failures. (More than 60% of the 100 samples carry mainly homozygous SNPs.)

(Rexpected is computed by linear interpolation of the canonical genotype clusters (Peiffer et al. 2006).)
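As a rough illustration of the formula above, the sketch below interpolates Rexpected linearly between canonical cluster positions and then takes the log2 ratio. It assumes the interpolation runs over the theta (allelic angle) coordinate of the AA, AB and BB clusters. The cluster values and function names are made up, and this is not BeadStudio's actual code.

```python
import math

def expected_r(theta, clusters):
    """Linearly interpolate R_expected at a sample's theta.

    clusters: (theta, R) pairs for the canonical AA, AB and BB clusters.
    Outside the outermost clusters, R is clamped to the nearest cluster.
    """
    clusters = sorted(clusters)
    if theta <= clusters[0][0]:
        return clusters[0][1]
    if theta >= clusters[-1][0]:
        return clusters[-1][1]
    for (t0, r0), (t1, r1) in zip(clusters, clusters[1:]):
        if t0 <= theta <= t1:
            return r0 + (r1 - r0) * (theta - t0) / (t1 - t0)

def log_r_ratio(r_observed, theta, clusters):
    """LRR = log2(R_observed / R_expected)."""
    return math.log2(r_observed / expected_r(theta, clusters))

# Hypothetical canonical clusters for one SNP: (theta, R) for AA, AB, BB.
clusters = [(0.05, 1.0), (0.5, 1.2), (0.95, 1.0)]

# A homozygous AA SNP with 3 copies, idealized as 1.5x the 2-copy intensity.
print(log_r_ratio(1.5, 0.05, clusters))
```

In this idealized picture a homozygous SNP in a duplicated region would still show a positive LRR, because the total intensity rises regardless of genotype. In practice the observed shift is often smaller than the theoretical log2(3/2).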

Thanks

Matt Shirley (Cambridge, MA) wrote, 7.9 years ago:

If you are calculating your genotype clusters from 17 SNPs, I would not trust any of your data. You should use the entire array for cluster generation, and then subset the BAF and LRR values afterwards.
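The "subset afterwards" step could look roughly like the Python sketch below, which filters an already exported whole-array table down to one region. The column names (`Name`, `Chr`, `Position`, `Log R Ratio`, `B Allele Freq`), the coordinates and the values are assumptions for illustration; adjust them to whatever your export actually contains.

```python
import csv
import io

def subset_region(rows, chrom, start, end):
    """Keep rows whose SNP lies within chrom:start-end (inclusive)."""
    return [
        row for row in rows
        if row["Chr"] == chrom and start <= int(row["Position"]) <= end
    ]

# Tiny made-up export standing in for the whole-array table.
example = io.StringIO(
    "Name\tChr\tPosition\tLog R Ratio\tB Allele Freq\n"
    "rs1\t1\t1000\t0.02\t0.01\n"
    "rs2\t1\t2000\t0.35\t0.33\n"
    "rs3\t1\t2500\t0.31\t0.99\n"
    "rs4\t2\t2000\t-0.01\t0.50\n"
)
rows = list(csv.DictReader(example, delimiter="\t"))

# Clusters and LRR/BAF were computed on the full array; only now restrict to the region.
region = subset_region(rows, "1", 1500, 3000)
print([r["Name"] for r in region])
```

The point of doing it in this order is that the LRR and BAF values are already final when the rows are filtered, so the small number of SNPs in the region does not affect how they were computed.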


Reply from romsen, 7.9 years ago:

Sorry, I'm a complete rookie in this field. But if I understand you correctly ("entire array for cluster generation"), I think I have done that. I used the BAF and LRR values that BeadStudio exported as final results, for the complete array and not only the 17 SNPs.