I'm trying to do some CNV calling from Illumina Omni2.5 array data using PennCNV.
My array data comes from a few different 'versions' of the Omni2.5 array (some from 2011, some from 2013 and some from 2014), so the SNP names have changed between versions as dbSNP has been updated. I've tried combining the 2011 samples with the later samples in GenomeStudio to get everything consistent, but that just drops the sample call rates from 99% to 88%. I'm unsure whether this drop is due to inherent differences in the data or because the newer array versions have more targets. I figured I could get by if I renamed the SNPs to Chr:Position, giving me a consistent identifier across the different array versions.
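To make the Chr:Position idea concrete, here is a minimal sketch in Python. The function names and the example rows are made up for illustration, and it assumes each array version's SNP list can be read as (name, chromosome, position) tuples that are all on the same genome build. It also reports positions shared by more than one probe, since those would collide under this naming scheme.

```python
from collections import defaultdict


def chr_pos_key(chrom, position):
    """Build a 'Chr:Position' identifier, dropping any 'chr' prefix."""
    chrom = str(chrom).strip()
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    return f"{chrom}:{int(position)}"


def build_rename_map(manifest_rows):
    """Map each original SNP name to its Chr:Position key.

    manifest_rows: iterable of (snp_name, chrom, position) tuples.
    Returns (rename, collisions), where collisions lists keys that
    more than one SNP name maps to.
    """
    rename = {}
    seen = defaultdict(list)
    for name, chrom, pos in manifest_rows:
        key = chr_pos_key(chrom, pos)
        rename[name] = key
        seen[key].append(name)
    collisions = {k: v for k, v in seen.items() if len(v) > 1}
    return rename, collisions


# Made-up rows standing in for two array versions (not real manifest data).
rows_2011 = [("rs123", "1", 1000), ("kgp456", "chr2", "2000")]
rows_2014 = [("rs999", "1", 1000), ("rs456", "2", 2000)]

map_2011, collisions_2011 = build_rename_map(rows_2011)
map_2014, collisions_2014 = build_rename_map(rows_2014)

# SNPs present in both versions once names are replaced by position.
shared = set(map_2011.values()) & set(map_2014.values())
```

This only gives a consistent identifier if every version is annotated against the same genome build, which is exactly where the hg18 data causes trouble.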
PennCNV requires a population frequency of B allele (PFB) file created from a few hundred population samples. We don't have a large group of population samples, so I thought I'd use the 1000 Genomes Project (1000 GP) EUR Omni2.5 data. After several days of manhandling the data through GenomeStudio, I got as far as making the PFB file and realised that the 1000 GP data uses hg18 coordinates. UCSC's liftOver was my first port of call, but it drops ~1000 SNP locations that have been deleted between genome versions, so the output no longer lines up with the input and I can't match the supplementary columns back to the lifted-over coordinates (which also throws my idea of using Chr:Position instead of rsIDs out the window).
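One possible way around the matching problem, sketched in Python under some assumptions: liftOver takes BED input and carries any columns after the first three through unchanged, so putting the SNP ID in the fourth (name) column should let the supplementary columns be joined back by ID rather than by row order. The function names, SNP IDs, positions and PFB values below are invented for illustration, and the "lifted" lines are simulated rather than real liftOver output.

```python
def to_bed_lines(snps):
    """Turn (snp_id, chrom, pos_1based) tuples into BED4 lines.

    BED is 0-based and half-open, so a single base at 1-based
    position p becomes start = p - 1, end = p.
    """
    lines = []
    for snp_id, chrom, pos in snps:
        chrom = str(chrom)
        if not chrom.startswith("chr"):
            chrom = "chr" + chrom
        pos = int(pos)
        lines.append(f"{chrom}\t{pos - 1}\t{pos}\t{snp_id}")
    return lines


def parse_lifted(bed_lines):
    """Read lifted BED4 lines into {snp_id: (chrom, pos_1based)}."""
    lifted = {}
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, _start, end, snp_id = line.split("\t")[:4]
        if chrom.startswith("chr"):
            chrom = chrom[3:]
        lifted[snp_id] = (chrom, int(end))
    return lifted


def rejoin(lifted, extras):
    """Attach supplementary values (e.g. PFB) to lifted SNPs by ID."""
    return [
        (snp_id, chrom, pos, extras[snp_id])
        for snp_id, (chrom, pos) in lifted.items()
        if snp_id in extras
    ]


# Made-up hg18 records and PFB values, for illustration only.
hg18_snps = [("rs1", "1", 1000), ("rs2", "1", 2000), ("rs3", "2", 3000)]
pfb = {"rs1": 0.25, "rs2": 0.5, "rs3": 0.75}

bed_in = to_bed_lines(hg18_snps)  # these lines would be written out for liftOver

# Simulated liftOver output in which rs2 failed to map and was dropped.
simulated_lifted = ["chr1\t1099\t1100\trs1", "chr2\t3499\t3500\trs3"]

rows = rejoin(parse_lifted(simulated_lifted), pfb)
```

Because the join is keyed on the SNP ID, dropped SNPs simply vanish from the result instead of shifting every later row.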
So my questions are:
- How do I get 1000 GP Omni2.5 data from hg18 to hg19 coordinates?
and related to this:
- How do I handle multiple versions of Omni2.5 array data? Using a newer SNP manifest file in GenomeStudio does not do the trick.
I've started a conversation with Illumina tech support and our rep about this, but so far their response is:
"We realise the problems associated with using chips from different versions within a project and generally recommend where possible to use a single chip version per project."
(which of course is great in a perfect world, but this is science)
If I get a solution from Illumina I'll post it below.
Thanks muchly for any thoughts / ideas!