Data Cleanup Prior To CNV Calling
8.6 years ago
Robert Sicko ▴ 630


We are running into a bottleneck in a study that aims to identify copy number variants in several separate groups of subjects. We are genotyping with Omni2.5-8 chips (~2.3 million markers) and doing both the analysis and the cleanup in GenomeStudio. The data cleanup, prior to CNV calling, is taking weeks to perform.

I am following the cleanup procedures described in “technote_infinium_genotyping_data_analysis”: essentially, sorting on various metrics and zeroing SNPs that performed poorly. Following cleanup, I exclude the failed SNPs and save the project with those SNPs excluded; the clean project is then used for CNV calling with multiple algorithms. I’ve written some C++ programs to speed up annotating the CNV calls.
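To make the "sort on a metric, zero the poor performers" step concrete, here is a rough sketch of the same idea as a script run on an exported SNP table. Everything specific in it is an assumption for illustration: that the table is exported as tab-delimited text, the column names (`Name`, `Call Freq`, `GenTrain Score`, `Cluster Sep`), and the cutoff values, which are placeholders and not recommendations from the tech note.

```python
import csv
import io

# Hypothetical column names and cutoffs; adjust them to match the actual
# export and whatever thresholds you settle on for your own data.
MIN_THRESHOLDS = {
    "Call Freq": 0.98,
    "GenTrain Score": 0.70,
    "Cluster Sep": 0.30,
}


def find_failed_snps(table_file, min_thresholds=MIN_THRESHOLDS):
    """Return names of SNPs whose value falls below any minimum threshold."""
    failed = []
    reader = csv.DictReader(table_file, delimiter="\t")
    for row in reader:
        for column, minimum in min_thresholds.items():
            if float(row[column]) < minimum:
                failed.append(row["Name"])
                break  # one failing metric is enough to flag the SNP
    return failed


# Tiny made-up table standing in for an exported SNP table.
example = io.StringIO(
    "Name\tCall Freq\tGenTrain Score\tCluster Sep\n"
    "snp_a\t0.999\t0.85\t0.60\n"
    "snp_b\t0.950\t0.88\t0.70\n"
    "snp_c\t0.995\t0.40\t0.20\n"
)
print(find_failed_snps(example))
```

A list of flagged SNP names like this would only replace the bulk threshold-based part of the cleanup; SNPs near the cutoffs would presumably still need a manual look at their cluster plots.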

The literature mentions data cleanup prior to CNV calling only briefly, if at all, so I can't tell whether cleanup normally takes weeks and simply goes unmentioned because it is so mundane and standard, or whether I am missing something that would make life a lot easier.

Thanks, Bob

edit: added microarray tag

cnv illumina copynumber microarray • 1.8k views

Surely someone has experience with cleaning up data prior to CNV calling, right? I'm not complaining; if it does indeed take this much time, so be it. I just want to make sure I'm not missing something. Thanks.

