I am validating CNV calls using microarrays. For now I use only the CNV probes, but adding the SNP probes would greatly increase the resolution. The arrays are CytoScan 750K and CytoScan HD.
How can I utilize the SNP probes? I have log-transformed intensities for the SNPs, usually two per genomic position. Am I right that I can convert them to an overall intensity using log(exp(x) + exp(y)), where x and y denote the log intensities of the two allelic variants? Will the overall intensity be the same whether both x and y hybridized, or only y at two copies, or only x at two copies?
I use non-parametric tests, so small differences in intensity should not spoil everything.
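For illustration, here is a minimal sketch of the summation asked about above. It assumes the per-allele values are log2 intensities; if they are natural logs, log(exp(x) + exp(y)) is the equivalent form. The function name `total_log2_intensity` is made up for this example and is not part of any array software.

```python
import math

def total_log2_intensity(log2_a, log2_b):
    """Return log2(2**log2_a + 2**log2_b), computed without overflow.

    Assumption: both inputs are log2-scale intensities of the two
    allele probes at the same SNP position.
    """
    hi, lo = max(log2_a, log2_b), min(log2_a, log2_b)
    # Factor out the larger term so the exponent is never positive.
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))

# Two equal allele signals give a total one log2 unit above either one.
print(total_log2_intensity(10.0, 10.0))  # 11.0

# When one allele dominates, the total stays close to the larger value.
print(total_log2_intensity(10.0, 2.0))
```

Whether the total comes out the same for AA, AB and BB genotypes depends on how similarly the two allele probes respond, which this sketch does not address; a rank-based test is at least insensitive to small offsets of that kind.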
If you go back to the CEL files, then Aroma can calculate copy number, utilising both CN and SNP probes.
If you have just the signal intensities as log2 ratios, then you can perform Circular Binary Segmentation (CBS) using DNAcopy. The test data used in the package manual comprise 2 samples with log2 signal ratios.
Yep, this could be an option, but what I am doing is a bit more complicated :( It is an IRS test, http://software.broadinstitute.org/software/genomestrip/org_broadinstitute_sv_annotation_IntensityRankSumAnnotator.html so I am trying to validate CNVs without actually calling them on the arrays. For that I need the raw intensity. For now I have the files processed with the CytoScape software, so they have some useful fields such as weighted log2 intensity. I was thinking that maybe I could use these for the SNP probes, but I am not sure how to merge the intensities from the SNP probes into one :(
I have neither used nor heard of that program from the Broad Institute, unfortunately... Also, how did you use CytoScape? - for processing the array data?
Yes, I use processed (normalised) array data from CytoScape, so in general I expect a lot of the noise to have been removed by their Bayesian shrinkage estimator. The IRS method is really good: it validates CNV sites using just a Wilcoxon test, without calling CNVs on the arrays. That matters because the reliable resolution of most arrays is around 50 kb, while CNVs may be much shorter. With IRS it is possible to infer the false discovery rate for CNVs of any size (though obviously not the power of detection) without segmenting or calling CNVs in the array data; CNV calls from another source, e.g. NGS data, are still mandatory. The open problem is how the SNP probes can be used in the IRS test: there are 1-3 SNP probes per site, and how to merge them into one intensity is the question.
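As a rough sketch of the idea described above (and not the Genome STRiP implementation), one option is to collapse the 1-3 probe values at a site into a single number and then compare samples with and without the putative CNV using a rank-sum statistic. Taking the median per site is an assumption made here for illustration, and `site_intensity`, `midranks` and `rank_sum_z` are hypothetical helper names. The z-score uses the normal approximation without a tie correction.

```python
import math
from statistics import median

def site_intensity(probe_log2_values):
    """Collapse the probe values at one site into one value (median; an assumption)."""
    return median(probe_log2_values)

def midranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def rank_sum_z(group_a, group_b):
    """Wilcoxon rank-sum z-score for group_a versus group_b (normal approximation)."""
    n1, n2 = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    w = sum(ranks[:n1])  # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (w - mean_w) / sd_w

# Made-up numbers: per-sample site intensities for putative deletion carriers
# and for the remaining samples.
carriers = [site_intensity(p) for p in ([9.1, 9.3], [9.0], [9.2, 9.4, 9.1])]
others = [site_intensity(p) for p in ([10.0, 10.2], [10.1], [9.9, 10.3, 10.0])]
print(rank_sum_z(carriers, others))  # negative when carriers have lower intensity
```

A strongly negative z-score for a predicted deletion (or a positive one for a duplication) would support the call; with only a handful of samples an exact test would be more appropriate than the normal approximation.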